
Why does ORDER BY significantly slow down my query with a computed score of trigram similarity and null field?

1 vote
1 answer
52 views
I'm working on optimizing a query in PostgreSQL, and I've encountered a performance issue when using the ORDER BY clause. The query is intended to search profiles based on a similarity match to a name (for example: 'john') and then order the results by a computed score. The score is a combination of word similarity and whether the profile has an avatar. Here's the query:

    SELECT uuid, type, byline, display_name, username, avatar,
           ( word_similarity('john', search_text)
             + CASE WHEN avatar != '' THEN 1 ELSE 0 END ) AS combined_score
    FROM test_mv_all_profiles
    WHERE 'john' <% search_text
    [...]

    Sort (cost=35130.07..35158.41 rows=11335 width=52) (actual time=8092.502..8092.565 rows=100 loops=1)
      Sort Key: ((word_similarity('john'::text, search_text) + (CASE WHEN ((avatar)::text <> ''::text) THEN 1 ELSE 0 END)::double precision)) DESC
      Sort Method: top-N heapsort Memory: 51kB
      Buffers: shared hit=66811
      -> Bitmap Heap Scan on test_mv_all_profiles (cost=187.84..34696.86 rows=11335 width=52) (actual time=69.060..8052.737 rows=90765 loops=1)
           Recheck Cond: ('john'::text <% search_text)
           [...]
           -> Bitmap Index Scan on test_idx_mv_social_profile_search_text_trigram_idx_gin (cost=0.00..185.01 rows=11335 width=0) (actual time=58.323..58.323 rows=91483 loops=1)
                Index Cond: ('john'::text <% search_text)
    [...]

    Index Scan using test_idx_mv_social_profile_search_text_trigram_idx on test_mv_all_profiles (cost=0.42..44444.13 rows=11335 width=52) (actual time=0.506..4.417 rows=100 loops=1)
      Index Cond: ('john'::text <% search_text)
      Rows Removed by Index Recheck: 1
      Buffers: shared hit=311
    Planning time: 0.118 ms
    Execution time: 4.482 ms

My questions:
- Why does the ORDER BY clause slow down the query so much?
- Is there a way to optimize this query while keeping the ORDER BY clause? Would adding an index on the computed score help, and if so, how should I approach that?

Additional Information: The table test_mv_all_profiles is a materialized view with around 11M rows.
We are using a rather old version of Postgres (9.6), so some newer features are not available to us for the time being. The search_text field is a concatenation of multiple columns (like username, first name, and last_name). I already have a trigram index on search_text for the similarity search. I'm looking for advice on how to maintain performance while still being able to sort by the combined score. Any insights or recommendations would be greatly appreciated!
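For context on the numbers above: in the sorted plan the Bitmap Heap Scan returns 90,765 rows and accounts for almost all of the roughly 8 seconds, because every matching row has to be fetched and scored before the top 100 can be picked. The unsorted plan stops after the first 100 index matches. One possible workaround, shown here only as a rough sketch, is to cap the candidate set in a subquery and sort just those rows. It assumes the original statement ended with ORDER BY combined_score DESC LIMIT 100 (inferred from the Sort Key and the 100-row top-N heapsort), and the cap of 1000 is a hypothetical value, not something from the post.

```sql
-- Sketch only: trades exactness for speed.
-- Assumption: the original query ended with
--   ORDER BY combined_score DESC LIMIT 100
-- The candidate cap (1000) is a made-up number to tune.
SELECT uuid, type, byline, display_name, username, avatar,
       ( word_similarity('john', search_text)
         + CASE WHEN avatar != '' THEN 1 ELSE 0 END ) AS combined_score
FROM (
    -- The inner LIMIT lets the scan stop early instead of
    -- visiting every row that matches the trigram condition.
    SELECT uuid, type, byline, display_name, username, avatar, search_text
    FROM test_mv_all_profiles
    WHERE 'john' <% search_text
    LIMIT 1000
) AS candidates
ORDER BY combined_score DESC   -- sorts at most 1000 rows
LIMIT 100;
```

The inner LIMIT takes whichever matching rows the scan yields first, so the result is the best 100 of an arbitrary 1000 candidates rather than the true top 100. Whether that is acceptable depends on how much ranking accuracy the search needs, and it is worth checking with EXPLAIN (ANALYZE, BUFFERS) that the planner actually stops the inner scan early.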
Asked by Sheila Loekito (11 rep)
Aug 28, 2024, 10:11 PM
Last activity: Aug 29, 2024, 10:29 AM