Sample Header Ad - 728x90

Sharding a Materialized View

0 votes
1 answer
233 views
I am a Postgres 15.3 AWS RDS instance and have a database that centers around one table, called posts that has 20 other relations to build the full object. The full query is about a second altogether and there are indexes on all the join keys. So in order to make reads fast we have a materialized view, mv_posts that represents the full table with ~20 joins and subselects and everything the front end needs (see below). Here is the math that scares me, I have about 5000 rows and building the materialized view takes 2050 ms. This is not scalable. I am considering the following implementation. Create a new materialized view mv_new_posts that keeps new posts created that day, and join it 1-1 to the mv_posts materialized view in a normal view and use that to query the posts object. I would then have to truncate mv_new_posts and regenerate mv_posts at midnight every night. This feels ugly but would scale for everything except when a posts row is updated, which would require regenerating the base view in some form. I have considered the following other solutions: - Partition Tables: The issue is that I dont have the row volume for this. I have not benchmarked it but do not beleive that with my small number of rows it would work. - pg_ivm but the restrictions on outer (i.e. left) joins and aggregation functions make this impossible. - Rolling my own, i.e. creating a table and triggers so that a single insert/update on a dependent table results in a single row update in the denormalized table that would replace the materialized view. This I have done on a smaller scale but do not feel would be worth the work here. In particular it would be a maintenance nightmare. The materialized view definition is below, in case there is an obvious optimization I have missed. CREATE MATERIALIZED VIEW mv_posts AS( SELECT p.*, ( setweight(to_tsvector('english', COALESCE(p.title, '')), 'A') || setweight(to_tsvector('english', COALESCE(p.description, '')), 'B') || setweight(to_tsvector('english', COALESCE(pc.email, '')), 'A') || setweight(to_tsvector('english', COALESCE(d.name, '')), 'B') ) AS search_document, CASE WHEN pt.slug = 'ping' THEN True ELSE False END AS ping, ARRAY_REMOVE(vis.all_user_ids, NULL) AS all_visibility_user_ids, ARRAY_REMOVE(vis.user_ids, NULL) AS visibility_user_ids, ARRAY_REMOVE(vis.group_ids, NULL) AS visibility_group_ids, ARRAY_REMOVE(vis.org_ids, NULL) AS visibility_org_ids, CASE WHEN hm.id IS NULL THEN NULL ELSE ROW_TO_JSON(hm.*) END AS headline_file, COALESCE(JSON_AGG(DISTINCT JSON_BUILD_OBJECT( 'id', pm.id, 'file_name', pm.file_name, 'content_type', pm.content_type, 'name', pm.name, 'orphan', pm.orphan, 'folder', pm.folder, 'extension', pm.extension, 'ordinal', pmj.ordinal)::JSONB ) FILTER ( WHERE NOT (pm.id::TEXT = 'null') ), '[]'::JSON ) AS file, COALESCE(JSON_AGG(DISTINCT pmj.file_id) FILTER ( WHERE NOT (pmj.file_id IS NULL) ), '[]'::JSON) AS file_ids, COALESCE(JSON_AGG(DISTINCT pu) FILTER ( WHERE NOT (pu.id::TEXT = 'null') ), '[]'::JSON) AS users, COALESCE(JSON_AGG(DISTINCT pu.email) FILTER ( WHERE NOT (pu.id::TEXT = 'null') ), '[]'::JSON) AS user_emails, pt.id AS post_type_id, row_to_json(pt.*) AS post_type, row_to_json(pit.*) AS incident_type, row_to_json(pc.*) AS creator, d.id AS org_id, JSON_BUILD_OBJECT( 'id', d.id, 'name', d.name, 'admin_id', d.admin_id,'group_id', d.group_id, 'creator_id', d.creator_id, 'law_enforcement_partner', d.law_enforcement_partner, 'slug', d.slug, 'state', d.state, 'location', JSON_BUILD_OBJECT( 'id', dl.id, 'address', dl.address, 'city', dl.city, 'state_province', dl.state_province, 'postal_code', dl.postal_code, 'geojson', ST_AsGeoJSON(dl.geom)::JSON ) ) AS org, ARRAY_REMOVE(ARRAY_AGG(pp.user_id),NULL) AS users_saved, COALESCE( JSON_AGG( DISTINCT JSON_BUILD_OBJECT( 'id', v.id, 'make', v.make, 'year', v.year, 'state', v.state, 'model', v.model, 'color', v.color, 'license_plate', v.license_plate, 'vin', v.vin, 'description', v.description, 'file', ( SELECT COALESCE(JSON_AGG(vm.*) FILTER ( WHERE NOT (vm.id::TEXT = 'null') ), '[]'::JSON) FROM file vm JOIN car_file_join vmj ON vmj.file_id = vm.id WHERE vmj.car_id=v.id ) )::JSONB ) FILTER ( WHERE NOT (v.id::TEXT = 'null') ), '[]'::JSON) AS cars,COALESCE( JSON_AGG( DISTINCT JSON_BUILD_OBJECT( 'id', psn.id, 'first_name', psn.first_name, 'last_name', psn.last_name, 'aliases', psn.aliases, 'description', psn.description, 'hair', psn.hair, 'build', psn.build, 'sex_id', psn.sex_id, 'age', psn.age, 'eye_color', psn.eye_color, 'height', psn.height, 'weight', psn.weight, 'race', psn.race, 'identifying_characteristics', psn.identifying_characteristics, 'country', psn.country, 'guy_type', row_to_json(psnt.*), 'location', row_to_json(psnl.*), 'file', ( SELECT COALESCE(JSON_AGG(psnm.*) FILTER ( WHERE NOT (psnm.id::TEXT = 'null') ), '[]'::JSON) FROM file psnm JOIN guy_file_join psnmj ON psnmj.file_id = psnm.id WHERE psnmj.guy_id=psn.id ) )::JSONB ) FILTER ( WHERE NOT (psn.id::TEXT = 'null') ), '[]'::JSON) AS guys,COALESCE( JSON_AGG( DISTINCT JSON_BUILD_OBJECT( 'id', l.id, 'address', l.address, 'city', l.city, 'state_province', l.state_province, 'postal_code', l.postal_code, 'geojson', ST_AsGeoJSON(l.geom)::JSON, 'location_string', l.location_string )::JSONB ) FILTER ( WHERE NOT (l.id::TEXT = 'null') ), '[]'::JSON) AS locations, COALESCE( ARRAY_AGG(DISTINCT l.location_string) FILTER ( WHERE l.location_string IS NOT NULL ), '{}'::TEXT[] )AS location_strings, ST_UNION( l.geom )::geography AS geom FROM posts p INNER JOIN visibility vis ON vis.post_id = p.id LEFT JOIN file hm ON hm.id = p.headline_file_id LEFT JOIN post_file_join pmj ON pmj.post_id = p.id LEFT JOIN file pm ON pm.id=pmj.file_id LEFT JOIN auth_users AS pc ON p.creator_id=pc.id LEFT JOIN post_user_join puj ON puj.post_id=p.id LEFT JOIN auth_users pu ON pu.id=puj.user_id LEFT JOIN types pt ON pt.id=p.type_id LEFT JOIN incident_types pit ON pit.id = p.incident_type_id LEFT JOIN post_location_join plj ON plj.post_id = p.id LEFT JOIN locations l ON plj.location_id = l.id LEFT JOIN post_guy_join ppj ON ppj.post_id = p.id LEFT JOIN guys psn ON psn.id = ppj.guy_id LEFT JOIN guy_types psnt ON psnt.id = ppj.guy_type_id LEFT JOIN locations psnl ON psnl.id= psn.current_location_id LEFT JOIN post_car_join pvj ON pvj.post_id=p.id LEFT JOIN cars v ON v.id = pvj.car_id LEFT JOIN pinned_posts pp ON pp.post_id = p.id LEFT JOIN orgs d ON pc.org_id = d.id LEFT JOIN locations dl ON d.primary_location_id = dl.id WHERE p.active GROUP BY p.id, vis.all_user_ids, vis.user_ids, vis.org_ids, vis.group_ids, hm.id, pc.id, pt.id, pit.id, d.id, dl.id);
Asked by Frank Conry (123 rep)
Jan 28, 2024, 01:07 PM
Last activity: Jun 12, 2025, 04:09 AM