Fetching 88 records via DISTINCT from a table of 1.5 million rows is taking 3 seconds

0 votes

1 answer

121 views

There is a table that is populated from an online service and updated every week. It lives in a separate (second) database. The table is about 300 MB in size (1.5 million rows) and is loaded by a cron job that downloads the CSV data every week. [This CSV file](https://pricing.us-east-1.amazonaws.com/offers/v1.0/aws/AmazonEC2/current/index.csv) is 3 GB in size (4 million rows); rows and columns are filtered to retain only a subset of the data. The order of the columns *may* change, and it has in the past, so we can't even predict the layout from the first row. Even the header names may change. So it's a total dump that is filtered by some hard-set header-row names. In my Django view, which is DRF-powered, the API URL takes 3 seconds (0:00:03.263031) to execute.
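For context, the name-based column filtering described above could look roughly like the following. This is only a minimal sketch: the function name `filter_rows`, the default column names, and the `keep` predicate are hypothetical and not taken from the actual cron script. It looks columns up by header name, so a change in column order does not matter, and it fails loudly if an expected header disappears.

```python
import csv
import io

# Hypothetical column names for illustration; the real feed's headers may differ.
WANTED_COLUMNS = ["Region Code", "Instance Type", "PricePerUnit"]


def filter_rows(lines, wanted=WANTED_COLUMNS, keep=lambda row: True):
    """Yield dicts holding only the wanted columns, looked up by header name.

    `lines` is any iterable of text lines (an open file, a StringIO, ...).
    `keep` is a predicate deciding which rows to retain.
    """
    reader = csv.DictReader(lines)
    # Detect renamed or dropped headers up front instead of loading bad data.
    missing = [c for c in wanted if c not in (reader.fieldnames or [])]
    if missing:
        raise KeyError(f"columns missing from CSV header: {missing}")
    for row in reader:
        if keep(row):
            yield {c: row[c] for c in wanted}


# Usage with a tiny in-memory sample whose column order differs from `wanted`.
sample = io.StringIO("b,a,c\n2,1,3\n5,4,6\n")
rows = list(filter_rows(sample, wanted=["a", "b"]))
```

Because `csv.DictReader` streams the file row by row, this approach should also cope with a multi-gigabyte download without loading it all into memory.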

SELECT DISTINCT Region AS Region
FROM my-table-name
WHERE flag = 'condition'
LIMIT 0, 100

The thing is, if this were the only endpoint fetched on the page, it wouldn't be a problem. But the page has many endpoints, each fetched when the user clicks on an INPUT element, and 3 seconds per trigger is way too long. What else can I do to optimize the table or the query?
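One idea that may be worth testing is a composite index on `(flag, Region)`, so the DISTINCT query can be answered from the index alone. This is an assumption, since the existing indexes on the table are not shown here. The sketch below uses an in-memory SQLite database purely as a stand-in for the real database; the table name `prices`, the index name `idx_flag_region`, and the sample rows are hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for the real database (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE prices (id INTEGER PRIMARY KEY, flag TEXT, region TEXT)"
)
conn.executemany(
    "INSERT INTO prices (flag, region) VALUES (?, ?)",
    [
        ("condition", "us-east-1"),
        ("condition", "us-east-1"),
        ("condition", "eu-west-1"),
        ("other", "ap-south-1"),
    ],
)

# Composite index: the filter column first, then the column being de-duplicated.
conn.execute("CREATE INDEX idx_flag_region ON prices (flag, region)")

query = "SELECT DISTINCT region FROM prices WHERE flag = ? LIMIT 100"
regions = [row[0] for row in conn.execute(query, ("condition",))]

# Ask the planner how it intends to run the query; the last field is the detail text.
plan = conn.execute("EXPLAIN QUERY PLAN " + query, ("condition",)).fetchall()
for step in plan:
    print(step[-1])
```

On the real table, the equivalent check would be to create the same kind of index and compare the output of `EXPLAIN` before and after. Whether it helps depends on how selective `flag` is and on the database engine.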

Asked by anjanesh (279 rep)

Mar 23, 2023, 11:42 AM
Last activity: Mar 23, 2023, 12:01 PM
