Sample Header Ad - 728x90

Ranking and use of cube by in an SQL query?

1 vote
0 answers
134 views
I am a the MSc student who does not understand this following SQL dilemma and want to be certain from an Academic standpoint. The entire “Foodmart” database can be downloaded from: http://didawiki.cli.di.unipi.it/lib/exe/fetch.php/bdd-infuma/foodmart_mysql.sql.zip In short, is my SQL syntax valid in interpreting the question correctly? I put the questions in reverse order from 4.3a 4.3b 4.2 and 4.1 so they decrease with difficulty. Question 4.3 Produce a report with information about the A) profit and ratioP of each product family, store country, year and gender; the ratioP is between profit and the store sales in the gender category (Male or Female) B) the profit of each subset of the above combination My syntax is: -- 4.3a select PC.product_family , ST.store_country , TBD.the_year as year , C.gender , (sum(store_sales) - sum(store_cost)) as total_profit , ((sum(store_sales) - sum(store_cost)) / sum(store_sales)) as perc_profit from sales_fact S left join product P ON S.product_id = P.product_id left join product_class PC ON P.product_class_id = PC.product_class_id left join store ST ON S.store_id = st.store_id left join time_by_day TBD ON S.time_id = TBD.time_id left join customer C ON S.customer_id = C.customer_id group by PC.product_family, ST.store_country, TBD.the_year, C.gender ; -- 4.3b select PC.product_family , ST.store_country , ST.store_city , TBD.the_year as year , TBD.the_month as month , TBD.the_day as day , C.gender , (sum(store_sales) - sum(store_cost)) as total_profit from sales_fact S left join product P ON S.product_id = P.product_id left join product_class PC ON P.product_class_id = PC.product_class_id left join store ST ON S.store_id = st.store_id left join time_by_day TBD ON S.time_id = TBD.time_id left join customer C ON S.customer_id = C.customer_id group by PC.product_family, ST.store_country, ST.store_city, TBD.the_year, TBD.the_month, TBD.the_day, C.gender ; select PC.product_family , ST.store_city , TBD.the_year as year , sum(store_sales) as total_sales , (sum(store_sales) - sum(store_cost)) as total_profit , ((sum(store_sales) - sum(store_cost)) / sum(store_sales)) as perc_profit from sales_fact S left join product P ON S.product_id = P.product_id left join product_class PC ON P.product_class_id = PC.product_class_id left join store ST ON S.store_id = ST.store_id left join time_by_day TBD ON S.time_id = TBD.time_id group by PC.product_family, ST.store_city, TBD.the_year ; Could there be any merit, necessity, or efficiency in using cube by or roll-up functions in this query? A university Professor suggests: You do not need cube by. The queries you mention have nothing to do with cubes or rollup They are simply group by and many combinations of the intermediate results. "They are simply group by and many combinations of the intermediate results." "With group by having and aggregate functions like max sum etc. can achieve what is asked. You just need to create a strategy (plan) on how to compute it and then do the select project join queries that do that." “With group by having and aggregate functions like max sum etc. you can achieve what is asked. You just need to create a strategy (plan) on how to compute it and then do the select project join queries that do that.” “It is not something done it 2 min … It needs some thinking (but that does not make it hard).” I have other questions below but am grateful for your reply or anyone else you deem suitable! Here are the other 2 questions on the same database with my syntax: 4.2 For every product family and customer country and gender return the customer with the highest store sales of the product family. If there are two or more customers with the same store sales, pick up any. -- 4.2 select distinct C.lname, C.fname, PC.product_family, C.country, C.gender, MAX(S.store_sales) from sales_fact S left join product P on S.product_id = P.product_id c left join product_class PC on P.product_class_id = PC.product_class_id left join customer C on S.customer_id = C.customer_id group by PC.product_family, C.country, C.gender ; The university Professor suggests only grouping by year?: you group by (product family, customer country, gender, customer) in the select clause you put product family, customer country, gender, customer, sum(sakes) Now you know the sales of each customer. Then you use that table, to identify the max sales and select the tuples that have such a value. Possible error? Is there also for example the requirement of the use of a ranking or anything else for that matter? 4.1 Return a table where you list product family, store city and year, with the corresponding total sales and profit, and the percentage of profit over the total sales of the year. -- 4.1 select PC.product_family , ST.store_city , TBD.the_year as year , sum(store_sales) as total_sales , (sum(store_sales) - sum(store_cost)) as total_profit , ((sum(store_sales) - sum(store_cost)) / sum(store_sales)) as perc_profit from sales_fact S left join product P on S.product_id = P.product_id left join product_class PC on P.product_class_id = PC.product_class_id left join store ST on S.store_id = ST.store_id left join time_by_day TBD on S.time_id = TBD.time_id group by PC.product_family, ST.store_city, TBD.the_year ; you simply group by year and then you do in the select clause sum(sales), sum(profit), sum(sales)/sum(profits) The from clause should have the join of all the tables containing the needed information That is all. I do not only group by year as I see also PC.product_family, ST.store_city in my syntax. Possible error? Could there also, for example, in the Select be the error of the inclusion of attributes not present in the group by? Many thanks people for this help, Tim
Asked by Timothy (11 rep)
Mar 26, 2021, 03:13 PM
Last activity: Mar 26, 2021, 04:51 PM