Ranking and use of cube by in an SQL query?

1 vote
0 answers
134 views
                          I am a the MSc student who does not understand this following SQL dilemma and want to be certain from an Academic standpoint.

The entire “Foodmart” database can be downloaded from: http://didawiki.cli.di.unipi.it/lib/exe/fetch.php/bdd-infuma/foodmart_mysql.sql.zip 



In short, is my SQL syntax valid in interpreting the question correctly? I put the questions in reverse order from 4.3a 4.3b 4.2 and 4.1 so they decrease with difficulty.

Question 4.3 Produce a report with information about the 

A) profit and ratioP of each product family, store country, year and gender; the ratioP is between profit and the store sales in the gender category (Male or Female)

B) the profit of each subset of the above combination



My syntax is:

    -- 4.3a
    select  PC.product_family , ST.store_country , TBD.the_year as year ,
            C.gender , (sum(store_sales) - sum(store_cost)) as total_profit ,
                ((sum(store_sales) - sum(store_cost)) / sum(store_sales)) as perc_profit
        from  sales_fact S
        left join  product P  ON S.product_id = P.product_id
        left join  product_class PC  ON P.product_class_id = PC.product_class_id
        left join  store ST  ON S.store_id = st.store_id
        left join  time_by_day TBD  ON S.time_id = TBD.time_id
        left join  customer C  ON S.customer_id = C.customer_id
        group by  PC.product_family, ST.store_country, TBD.the_year,
            C.gender ;
    -- 4.3b 
    select  PC.product_family , ST.store_country , ST.store_city ,
            TBD.the_year as year , TBD.the_month as month , TBD.the_day as day ,
            C.gender , (sum(store_sales) - sum(store_cost)) as total_profit
        from  sales_fact S
        left join  product P  ON S.product_id = P.product_id
        left join  product_class PC  ON P.product_class_id = PC.product_class_id
        left join  store ST  ON S.store_id = st.store_id
        left join  time_by_day TBD  ON S.time_id = TBD.time_id
        left join  customer C  ON S.customer_id = C.customer_id
        group by  PC.product_family, ST.store_country, ST.store_city,
            TBD.the_year, TBD.the_month, TBD.the_day, C.gender ;

    select  PC.product_family , ST.store_city , TBD.the_year as year ,
            sum(store_sales) as total_sales , (sum(store_sales) - sum(store_cost)) as total_profit ,
                ((sum(store_sales) - sum(store_cost)) / sum(store_sales)) as perc_profit
        from  sales_fact S
        left join  product P  ON S.product_id = P.product_id
        left join  product_class PC  ON P.product_class_id = PC.product_class_id
        left join  store ST  ON S.store_id = ST.store_id
        left join  time_by_day TBD  ON S.time_id = TBD.time_id
        group by  PC.product_family, ST.store_city, TBD.the_year ;


Could there be any merit, necessity, or efficiency in using cube by or roll-up functions in this query?  

A university Professor suggests:

You do not need cube by. The queries you mention have nothing to do with cubes or rollup 
They are simply group by and many combinations of the intermediate results.

"They are simply group by and many combinations of the intermediate results."

 "With group by having and aggregate functions like max sum etc. can achieve what is asked. You just need to create a strategy (plan) on how to compute it and then do the select project join queries that do that."

“With group by having and aggregate functions like max sum etc. you can achieve what is asked. You just need to create a strategy (plan) on how to compute it and then do the select project join queries that do that.”

“It is not something done it 2 min … It needs some thinking (but that does not make it hard).”


I have other questions below but am grateful for your reply or anyone else you deem suitable!

Here are the other 2 questions on the same database with my syntax:

4.2 For every product family and customer country and gender return the customer with the highest store sales of the product family. If there are two or more customers with the same store sales, pick up any.

-- 4.2

select distinct
C.lname,
    C.fname,
    PC.product_family,
    C.country,
    C.gender,
MAX(S.store_sales)
from
    sales_fact  S
    left join product P on S.product_id = P.product_id
c    left join product_class PC on P.product_class_id = PC.product_class_id
    left join customer C on S.customer_id = C.customer_id
group by
    PC.product_family, C.country, C.gender
;

The university Professor suggests only grouping by year?:

you group by (product family, customer country, gender, customer) 
in the select clause you put 
product family, customer country, gender, customer, sum(sakes)

Now you know the sales of each customer. 

Then you use that table, to identify the max sales and select the tuples that have such a value.

Possible error?

Is there also for example the requirement of the use of a ranking or anything else for that matter?



4.1 Return a table where you list product family, store city and year, with the corresponding total sales 

and profit, and the percentage of profit over the total sales of the year.



-- 4.1
select
PC.product_family
    , ST.store_city
    , TBD.the_year as year
    , sum(store_sales) as total_sales
    , (sum(store_sales) - sum(store_cost)) as total_profit
    , ((sum(store_sales) - sum(store_cost)) / sum(store_sales)) as perc_profit
from
sales_fact  S
    left join product P on S.product_id = P.product_id
    left join product_class PC on P.product_class_id = PC.product_class_id
    left join store ST on S.store_id = ST.store_id
    left join time_by_day TBD on S.time_id = TBD.time_id
group by
PC.product_family, ST.store_city, TBD.the_year

;


you simply group by year and then you do 
in the select clause 
sum(sales), sum(profit), sum(sales)/sum(profits)

The from clause should have the join of all the tables containing the needed information

That is all.

I do not only group by year as I see also PC.product_family, ST.store_city in my syntax. Possible error?



Could there also, for example, in the Select be the error of the inclusion of attributes not present in the group by?



Many thanks people for this help,

Tim
                        
Asked by Timothy (11 rep)
Mar 26, 2021, 03:13 PM
Last activity: Mar 26, 2021, 04:51 PM
Ranking and use of cube by in an SQL query?

Related Questions