Sample Header Ad - 728x90

pgbench: generating a random padded string

0 votes
1 answer
473 views
I have the IMDb database in PostgreSQL 16, and I am trying to generate random strings for performance benchmark. The relevant table is title_basics with the primary key column tconst, VARCHAR(9). tconst has values in the form of 'tt0000000', where the zeroes represent numbers, e.g., 'tt0000001' and 'tt9999999'. I want to write an .sql file which can be run with pgbench, which generates random, padded numbers appended after the string 'tt', e.g.: SELECT * FROM title_basics WHERE tconst = 'tt0000001'; With simple SQL, this would be trivial, but since pgbench supports only a small number of scalar functions, it seems difficult. I have also explored the use of SQL in the pgbench script, but constructs such as this do not seem to be supported:
\set random_number (SELECT random());
or:
\set random_number :randint(0, 9999999)
\set padding_length 7 - length(:'random_number')
\set primary_key 'tt' || repeat('0', :'padding_length') || :'random_number'
I also considered generating the numbers with SQL, but this effectively prevents PostgreSQL from using the index on tconst. I do not want to create an additional index to satisfy the query, as that would be problematic for other reasons.
SELECT * 
FROM title_basics 
WHERE tconst = 'tt' || LPAD((FLOOR(RANDOM()*(9916880 - 1 + 1)) + 1)::text, 7, '0');
Edit: I found a somewhat functional solution of generating the random value in a CTE. This does not prevent PostgreSQL from using the index on tconst. However, I would be happy to see a more pgbench-oriented approach.
WITH a AS (
SELECT 'tt' || LPAD((FLOOR(RANDOM()*(9916880 - 1 + 1)) + 1)::text, 7, '0') AS tconst_random
)
SELECT * 
FROM title_basics, a 
WHERE tconst = a.tconst_random;
Asked by Fuiba (5 rep)
May 16, 2023, 09:13 AM
Last activity: May 17, 2023, 07:20 AM