Why does a query on 4 times more data take 300 times more processing?
To score the price preference of our individual customers, we run the query below.
Basically it is a weighted average of a product score. The score depends on the product group and the packaging type. The weight depends on the product type and the customer type.
SELECT SUBQRY.customerNo,
       SUM(WGT.weight * SUBQRY.avg_score) / SUM(WGT.weight) AS score
FROM
    (SELECT SALE.customerNo,
            SALE.ProductType,
            SALE.CustomerType,
            AVG(SCORE.score) AS avg_score
     FROM mySchema.Retail_sales AS SALE
     LEFT JOIN mySchema.product_scores AS SCORE
       ON (SALE.ProductGroup = SCORE.ProductGroup AND
           SALE.Packaging = SCORE.Packaging)
     GROUP BY SALE.customerNo,
              SALE.ProductType,
              SALE.CustomerType
    ) AS SUBQRY
LEFT JOIN mySchema.product_SOW AS WGT
  ON (SUBQRY.CustomerType = WGT.CustomerType AND
      SUBQRY.ProductType = WGT.ProductType)
GROUP BY SUBQRY.customerNo
ORDER BY SUBQRY.customerNo
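Written as a formula, the query is meant to compute roughly the following. This is a sketch under the assumption that mySchema.product_SOW holds at most one weight per (CustomerType, ProductType) pair. The symbols are introduced here for illustration only: $G_c$ is the set of (ProductType, CustomerType) combinations $(p, k)$ occurring for customer $c$, $w_{k,p}$ is the matching weight, and $\bar{s}_{c,p,k}$ is the average score (avg_score) the subquery returns for that group.

```latex
\[
\mathrm{score}(c) =
  \frac{\sum_{(p,k) \in G_c} w_{k,p} \, \bar{s}_{c,p,k}}
       {\sum_{(p,k) \in G_c} w_{k,p}}
\]
```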
We have two retail chains, one with 4 times as many sales and product groups _(16k vs 4k)_ as the other.
How can I explain that the query for the larger chain takes 300 times longer to execute, and how can I fix that?
PS: We currently have no indexes on the tables within mySchema. Creating an index on mySchema.product_scores.ProductGroup hardly helps.
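For reference, the joins in the query match on two columns each, so indexes covering both columns of each join condition would look roughly like this. It is a minimal sketch, assuming the DBMS accepts standard CREATE INDEX syntax; the index names are hypothetical, and nothing in the figures above shows whether such indexes would change the timing.

```sql
-- Hypothetical index names; the column lists mirror the two ON clauses of the query.
-- Covers SALE.ProductGroup = SCORE.ProductGroup AND SALE.Packaging = SCORE.Packaging
CREATE INDEX ix_product_scores_group_packaging
    ON mySchema.product_scores (ProductGroup, Packaging);

-- Covers SUBQRY.CustomerType = WGT.CustomerType AND SUBQRY.ProductType = WGT.ProductType
CREATE INDEX ix_product_sow_custtype_prodtype
    ON mySchema.product_SOW (CustomerType, ProductType);
```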
Asked by Dirk Horsten
(85 rep)
Jul 19, 2016, 02:17 PM
Last activity: Jul 19, 2016, 02:46 PM