Database Administrators

Q&A for database professionals who wish to improve their database skills

Latest Questions

1 votes

2 answers

182 views

MYSQL Calculate ranking of row that is not yet inserted

I've got a table similar to this data: | id | username | value | |-----------|----------|-------| | 1 | bob | 46 | | 483 | alice | 90 | | 176 | sue | 3001 | | 82 | harry | 0 | | 10493 | bill | 327 | I have this query that returns me the ranking of a user based on their `id` SELECT username, value, r...

                                  I've got a table similar to this data:


    |     id    | username | value |
    |-----------|----------|-------|
    | 1         | bob      | 46    |
    | 483       | alice    | 90    |
    | 176       | sue      | 3001  |
    | 82        | harry    | 0     |
    | 10493     | bill     | 327   |

I have this query that returns me the ranking of a user based on their id

    SELECT 
		username, 
		value, 
		rank from 
		(
			SELECT 
			tp.username, 
			tp.value,
			tp.id, 
			@curRank := @curRank + 1 AS rank
			FROM table tp, 
			(SELECT @curRank := 0) r
			ORDER BY tp.value DESC
		)
	as temp
	WHERE id = 483;

So for the above query, I would get a ranking returned of **4** for the id 483.

Let's say I want to insert the following row into the table:

    |   id   | username | value |
    |---------------------------|
    |    2   |  sally   | 2000  |

Is there any way to know what rank this row _will_ have after it is inserted, **without** actually inserting it?

I.e. sally would have a rank of 2 from the above query if inserted.

The reason I am curious if this is possible is that I'm attempting to insert the row into the database and only have this one transaction, rather than having to insert the row, then re-run the rank query.

Thanks!
                                

follmer (113 rep)

Dec 27, 2019, 05:42 AM • Last activity: Jun 30, 2025, 05:02 PM

2 votes

1 answers

952 views

DENSE_RANK with distinct columns after filter

sql-server rank

I find myself needing to get the most recent N things accessed by each user, and I have a log table of all accesses. Something like ```CREATE TABLE [AccessLog] ([UserId] UNIQUEIDENTIFIER, [ThingId] UNIQUEIDENTIFIER, [AccessDate] DATETIME)``` So I wrote ``` SELECT [UserId], [ThingId], [SequenceNumber...

I find myself needing to get the most recent N things accessed by each user, and I have a log table of all accesses. Something like

TABLE [AccessLog] ([UserId] UNIQUEIDENTIFIER, [ThingId] UNIQUEIDENTIFIER, [AccessDate] DATETIME)

So I wrote

SELECT [UserId], [ThingId], [SequenceNumber]
  FROM (SELECT [UserId], [ThingId], DENSE_RANK()
          OVER (PARTITION BY [UserId] ORDER BY [AccessDate] DESC) SequenceNumber
          FROM [AccessLog]
     ) [AccessLog]
 WHERE [SequenceNumber] < @N

This doesn't work because if the same user accessed the same thing repeatedly we get the same [ThingId] back up to N times rather than N distinct [ThingIds] back for each user. I'm baffled how to actually do it. The table is so large that if I have to write the N cubed join like it's 2000 again (join table back on itself once to exclude duplicate copies and a second time to generate the count) it will be faster to suck in the entire table and process it in application code. It's almost like I want

FOR EACH [UserId] IN (SELECT [UserId] IN [User])
    SELECT TOP (@N) [UserId], [ThingId]
      FROM [AccessLog]
     ORDER BY [AccessDate] DESC

But that performance is also so bad I'd to better reading the table into application code again.

Joshua (426 rep)

Jun 7, 2023, 11:53 PM • Last activity: Jun 8, 2023, 04:00 AM

-3 votes

2 answers

94 views

Rank over discontinuous blocks of rows

sql-server gaps-and-islands rank

I have the following table Logs: | Id| Char| |:--|:----| |1 | a | |2 | a | |3 | a | |4 | b | |5 | b | |6 | a | |7 | a | |8 | c | I want to get something like: | Id| Char|Rank| |:--|:----|:---| |1 | a |1 | |2 | a |2 | |3 | a |3 | |4 | b |1 | |5 | b |2 | |6 | a |1 | |7 | a |2 | |8 | c |1 | A MS-SQL-Se...

                                  I have the following table Logs:
    
| Id| Char|
|:--|:----|
|1  | a   |
|2  | a   |
|3  | a   | 
|4  | b   | 
|5  | b   |
|6  | a   |
|7  | a   |  
|8  | c   |

I want to get something like:

| Id| Char|Rank|
|:--|:----|:---|
|1  | a   |1   |
|2  | a   |2   |
|3  | a   |3   | 
|4  | b   |1   | 
|5  | b   |2   |
|6  | a   |1   |
|7  | a   |2   |  
|8  | c   |1   |

A MS-SQL-Server query is desirable
        
                                

Wasim Ali (1 rep)

May 2, 2023, 05:04 PM • Last activity: May 3, 2023, 10:49 AM

1 votes

1 answers

841 views

Fetch 10 paginated ranked posts of each category

postgresql rank

Posts table as p: ``` | id | cid | score | ... -------------------- | 1 | 1 | 3 | ... ``` Categories Table as c: ``` | id |... -------------------- | 1 |... ``` What i need? Reterive paginated ranked posts in each category. What i did do: ```SQL SELECT DENSE_RANK() OVER ( PARTITION BY c.id ORDER BY...

Posts table as p:

| id | cid | score | ...
--------------------
|  1 |  1  |   3   | ...

Categories Table as c:

| id |...
--------------------
|  1 |...

What i need? Reterive paginated ranked posts in each category. What i did do:

SELECT
	DENSE_RANK() OVER (
    PARTITION BY c.id
    ORDER BY p."score" DESC NULLS LAST) AS ranking,
    p.id,
    count(*) OVER() AS full_count
FROM p								
INNER JOIN c ON c.id=p.cid
offset 20 limit 10

This query will return whole records and affect offset/limit on it. But i need to affect offset/limit on each category, separated.

user3690884 (13 rep)

May 8, 2019, 10:33 AM • Last activity: Feb 23, 2023, 09:03 AM

-3 votes

1 answers

41 views

Sql ranking query

rank

I have a requirement to make tier based on rank Example first 5 ranks tier 1 Next 5 ranks tier 2 Next 3 ranks tier 3 Next 5 ranks tier 3 Kindly let me know if any help in the sql query for above requirement Thanks in advance

                                  I have a requirement to make tier based on rank
Example first 5 ranks tier 1
Next 5 ranks tier 2
Next 3 ranks tier 3
Next 5 ranks tier 3

Kindly let me know if any help in the sql query for above requirement 
Thanks in advance 
                                

HEMANTH KUMAR REDDY (1 rep)

Jan 29, 2023, 07:39 PM • Last activity: Jan 30, 2023, 07:21 AM

7 votes

3 answers

1795 views

A more intelligent ntile

sql-server postgresql window-functions rank

When using the `ntile()` window function, the major issue is that it arbitrarily groups into roughly equal parts regardless of the actual value. For example with the following query: select id,title,price, row_number() over(order by price) as row_number, rank() over(order by price) as rank, count(*)...

                                  When using the ntile() window function, the major issue is that it arbitrarily groups into roughly equal parts regardless of the actual value.

For example with the following query:

	select
		id,title,price,
		row_number() over(order by price) as row_number,
		rank() over(order by price) as rank,
		count(*) over(order by price) as count,
		dense_rank() over(order by price) as dense_rank,
		ntile(10) over(order by price) as decile
	from paintings
	order by price;

I will get 10 groups of about the same size, with the strong likelihood that paintings with the same price will end up in different bins.

For Example:



	┌────┬────────────────────────────────────────────┬───────┬────────────┬──────┬───────┬────────────┬────────┐
	│ id │ title                                      │ price │ row_number │ rank │ count │ dense_rank │ decile │
	╞════╪════════════════════════════════════════════╪═══════╪════════════╪══════╪═══════╪════════════╪════════╡
	│ 11 │ Eyes in the Heat                           │ 10    │ 1          │ 1    │ 1     │ 1          │ 1      │
	│ 19 │ Deux fillettes, fond jaune et rouge        │ 11    │ 2          │ 2    │ 2     │ 2          │ 1      │
	│ 17 │ Flowers in a Pitcher                       │ 12    │ 3          │ 3    │ 6     │ 3          │ 1      │
	│ 5  │ Composition with Red, Yellow and Blue      │ 12    │ 4          │ 3    │ 6     │ 3          │ 2      │
	│ 18 │ La lecon de musique (The Music Lesson)     │ 12    │ 5          │ 3    │ 6     │ 3          │ 2      │
	│ 9  │ The Adoration of the Magi                  │ 12    │ 6          │ 3    │ 6     │ 3          │ 2      │
	│ 29 │ Self-Portrait                              │ 14    │ 7          │ 7    │ 10    │ 4          │ 3      │
	│ 25 │ Symphony in White, No. 1: The White Girl   │ 14    │ 8          │ 7    │ 10    │ 4          │ 3      │
	│ 30 │ The Anatomy Lecture of Dr. Nicolaes Tulp   │ 14    │ 9          │ 7    │ 10    │ 4          │ 3      │
	│ 20 │ Les repasseuses (Women Ironing)            │ 14    │ 10         │ 7    │ 10    │ 4          │ 4      │
	│ 1  │ The Birth of Venus                         │ 15    │ 11         │ 11   │ 14    │ 5          │ 4      │
	│ 12 │ Femme se promenant dans une foret exotique │ 15    │ 12         │ 11   │ 14    │ 5          │ 4      │
	│ 24 │ Portrait of the Painter’s Mother           │ 15    │ 13         │ 11   │ 14    │ 5          │ 5      │
	│ 28 │ Jeunes filles au piano                     │ 15    │ 14         │ 11   │ 14    │ 5          │ 5      │
	│ 7  │ Portrait de l artiste (Self-portrait)      │ 16    │ 15         │ 15   │ 17    │ 6          │ 5      │
	│ 3  │ The Last Supper                            │ 16    │ 16         │ 15   │ 17    │ 6          │ 6      │
	│ 13 │ Combat of a Tiger and a Buffalo            │ 16    │ 17         │ 15   │ 17    │ 6          │ 6      │
	│ 4  │ The Creation of Man                        │ 17    │ 18         │ 18   │ 19    │ 7          │ 6      │
	│ 22 │ Le Chemin de Fer                           │ 17    │ 19         │ 18   │ 19    │ 7          │ 7      │
	│ 6  │ Femmes de Tahiti [Sur la plage]            │ 18    │ 20         │ 20   │ 24    │ 8          │ 7      │
	│ 21 │ Le Bar aux Folies-Berg                     │ 18    │ 21         │ 20   │ 24    │ 8          │ 7      │
	│ 26 │ Lady at the Piano                          │ 18    │ 22         │ 20   │ 24    │ 8          │ 8      │
	│ 15 │ Remembrance of a Garden                    │ 18    │ 23         │ 20   │ 24    │ 8          │ 8      │
	│ 16 │ 1914                                       │ 18    │ 24         │ 20   │ 24    │ 8          │ 8      │
	│ 14 │ Ancient Sound, Abstract on Black           │ 19    │ 25         │ 25   │ 28    │ 9          │ 9      │
	│ 8  │ The Large Turf                             │ 19    │ 26         │ 25   │ 28    │ 9          │ 9      │
	│ 23 │ On the Beach                               │ 19    │ 27         │ 25   │ 28    │ 9          │ 9      │
	│ 2  │ Portrait of Mona Lisa                      │ 19    │ 28         │ 25   │ 28    │ 9          │ 10     │
	│ 27 │ On the Terrace                             │ 20    │ 29         │ 29   │ 30    │ 10         │ 10     │
	│ 10 │ The She-Wolf                               │ 20    │ 30         │ 29   │ 30    │ 10         │ 10     │
	└────┴────────────────────────────────────────────┴───────┴────────────┴──────┴───────┴────────────┴────────┘

Note that there are four items with price 12, but two of them are in the have decile 1, and two of them decile 2. I would like to keep these items together, and I’m not fussed about which decile.

I have included other window functions to make the comparison.

It seems that ntile() uses the row_number() only and bases the cuttoffs on that. It would be more fair if it used either the rank() or count(*) function as items with the same price would end up in the same bin.

This is the behaviour for both PostgreSQL and SQL Server, and, presumably, for the rest.

The question is, is there a way to achieve this?
                                

Manngo (3145 rep)

Aug 22, 2019, 05:58 AM • Last activity: Dec 22, 2022, 06:15 AM

2 votes

1 answers

908 views

Is there a better SQL query to find ranking of a single row in database?

postgresql rank

I would like to obtain the ranking of a user's points in their country. For example, 23rd highest points in the US. After some Googling, I found that the `RANK` function would be the best solution. Here is my SQL query: ```sql SELECT points_ranking FROM ( SELECT user_country, user_id, user_points, R...

I would like to obtain the ranking of a user's points in their country. For example, 23rd highest points in the US. After some Googling, I found that the RANK function would be the best solution. Here is my SQL query:

SELECT points_ranking FROM 

	(
		SELECT
			user_country,
			user_id,
			user_points,
			RANK () OVER ( 
				ORDER BY user_points DESC
			) points_ranking
		
		FROM data.user
		WHERE
			user_country iLIKE 'usa'
	) as subquery
	
WHERE
	user_id = 3

However, I'm unsure if this is highly inefficient or if there's a better way. I ask this because I'm concerned the inner query would be ranking every single row in the table, just to select one.

nreh (123 rep)

Nov 29, 2022, 02:10 AM • Last activity: Nov 29, 2022, 05:14 AM

2 votes

1 answers

613 views

Select rank of single row

mysql mysql-8.0 rank

``` CREATE TABLE Player ( id BIGINT NOT NULL PRIMARY KEY, xp BIGINT NOT NULL ); INSERT INTO Player (id, xp) VALUES (1, 100); INSERT INTO Player (id, xp) VALUES (2, 150); INSERT INTO Player (id, xp) VALUES (3, 250); INSERT INTO Player (id, xp) VALUES (4, 125); INSERT INTO Player (id, xp) VALUES (5, 5...

CREATE TABLE Player (
  id BIGINT NOT NULL PRIMARY KEY,
  xp BIGINT NOT NULL
);

INSERT INTO Player (id, xp) VALUES (1, 100);
INSERT INTO Player (id, xp) VALUES (2, 150);
INSERT INTO Player (id, xp) VALUES (3, 250);
INSERT INTO Player (id, xp) VALUES (4, 125);
INSERT INTO Player (id, xp) VALUES (5, 500);

This query:

SELECT id, xp, RANK() over (ORDER BY xp DESC) 'rank' FROM Player;

Produces these results:

+----+-----+------+
| id | xp  | rank |
+----+-----+------+
|  5 | 500 |    1 |
|  3 | 250 |    2 |
|  2 | 150 |    3 |
|  4 | 125 |    4 |
|  1 | 100 |    5 |
+----+-----+------+

But when I add WHERE id = '.....', the rank comes back as 1. I know that this is because it's now ranking over just one row, so it'll always be 1. But how would I select the correct rank just for a single row? Is the RANK() function still appropriate for this? I don't mind about duplicate scores.

Lucien (123 rep)

Oct 22, 2022, 10:53 AM • Last activity: Oct 23, 2022, 01:02 PM

0 votes

0 answers

63 views

student scores ranking or positioning on mysqli

mysql rank

I have a MySQL table (named "scores") where I save student scores and I want to get each student Rank using total score column which I successfully did. Below is my code: ```sql SELECT studentRegNo, subject_Id, total, FIND_IN_SET( total, ( SELECT GROUP_CONCAT( total ORDER BY total DESC ) FROM scores...

I have a MySQL table (named "scores") where I save student scores and I want to get each student Rank using total score column which I successfully did. Below is my code:

SELECT studentRegNo, 
       subject_Id, 
       total, 
       FIND_IN_SET( total, 
                    ( SELECT GROUP_CONCAT( total ORDER BY total DESC ) 
                      FROM scores ) ) AS rank
FROM scores;

But I want the rank of each student separately based of each subject_Id Like this: total| studentRegNo| subject_Id| Rank -|-|-|- 86 | STM/2022/1 | 3 | 1 84 | STR/2021/3 | 3 | 2 98 | STM/2022/1 | 7 | 1 85 | STR/2021/3 | 7 | 2 85 | STM/2022/4 | 7 | 2 70 | STR/2021/5 | 7 | 4

EZE JOHN EMEKA (1 rep)

Jul 6, 2022, 07:34 AM • Last activity: Jul 6, 2022, 11:21 AM

1 votes

2 answers

719 views

Getting rank of row

mysql window-functions rank

I am using `mysql Ver 14.14 Distrib 5.7.21, for Linux (x86_64)`. I am having a table referral: ``` CREATE TABLE `referrals` ( `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT, `referred_by` bigint(20) unsigned DEFAULT NULL, `referral_token` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL, `emai...

I am using mysql Ver 14.14 Distrib 5.7.21, for Linux (x86_64). I am having a table referral:

CREATE TABLE referrals (
  id bigint(20) unsigned NOT NULL AUTO_INCREMENT,
  referred_by bigint(20) unsigned DEFAULT NULL,
  referral_token varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
  email varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
  created_at timestamp NULL DEFAULT NULL,
  updated_at timestamp NULL DEFAULT NULL,
  PRIMARY KEY (id),
  UNIQUE KEY referrals_email_unique (email)
);

In the table you basically insert the person and who referred it. I have created the following leaderboard:

select
		referred_by,
		count(referred_by) as referred_by_count,
			@curRank := @curRank + 1 AS rank
from
		referrals r,
	(
	SELECT
		@curRank := 0) o
group by
		referred_by
order by
		referred_by_count DESC;

This gives me: | referred_by | referred_by_count | rank | | ----------- | ----------------- | ---- | | 10 | 3 | 3 | | 2 | 2 | 1 | | 3 | 2 | 2 | | 13 | 2 | 5 | | 11 | 2 | 6 | | 15 | 1 | 8 | | 12 | 1 | 9 | | 4 | 1 | 4 | | 9 | 1 | 7 | However, I would like to get: | referred_by | referred_by_count | rank | | ----------- | ----------------- | ---- | | 10 | 3 | 1 | | 2 | 2 | 2 | | 3 | 2 | 3 | | 13 | 2 | 4 | | 11 | 2 | 5 | | 15 | 1 | 6 | | 12 | 1 | 7 | | 4 | 1 | 8 | | 9 | 1 | 9 | [View on DB Fiddle](https://www.db-fiddle.com/f/o42zo6XiJZzEbrrVDfgQR4/3) I created the following dbfiddle using the faker php plugin. I tried using the mysql RANK() function. However, I am struggeling how to correctly use it. I appreciate your replies!

Carol.Kar (135 rep)

Dec 25, 2021, 07:30 PM • Last activity: Dec 26, 2021, 01:03 AM

1 votes

1 answers

194 views

Atomic updates to ordering in a table for a waiting list application

postgresql database-design rank

I am trying to implement a gamified waiting list. This waiting list is for eventual access to an application. A user on the waiting list can perform certain actions which may move them up in the waiting list. The table would look something like this (open to suggestions here as well): # table waitinglist | user_id | position | | ------- | -------- | | 12 | 0 | | 3 | 1 | | 15 | 2 | | 10 | 3 | ... and so on. This waiting list could potentially (hopefully) have millions of entries. When a user performs an action that changes their order in the waiting list, I want to be able to update that order atomically so that if someone else does something to change their order at a similar time that they don't conflict. We are using Postgres for our database, but I'm open to other (opensource) ideas as well. In the example table, if user #3 did something that moves them to position 0, it means that user #12 would be moved to position 1. We could do a query like:

BEGIN;
UPDATE waitinglist AS w SET
    position = d.position
FROM (VALUES
    (3, 0),
    (12, 1)  
) AS d(user_id, position) 
WHERE d.user_id = w.user_id;
COMMIT;

However, if user #15 also moved up a position at nearly the same time, there might be conflict. Are there some strategies I can use to make sure that the updates don't conflict?

synic (115 rep)

Dec 1, 2021, 10:13 PM • Last activity: Dec 1, 2021, 11:35 PM

0 votes

1 answers

1100 views

Substitute DISTINCT ON for query transformation (Postgresql to Redshift)

query redshift distinct rank

I need to transform my Postgresql query to Redshift and I working with the following data, here is an excerpt: [![enter image description here][1]][1] [1]: https://i.sstatic.net/9YSPR.png In my final output, I would like only to keep rows with the highest rank (in green). And I was using the followi...

I need to transform my Postgresql query to Redshift and I working with the following data, here is an excerpt:

In my final output, I would like only to keep rows with the highest rank (in green). And I was using the following query, which works to get the output, but I have to transform it into Redshift because the visualization tool is only using Redshift and Redshift doesn't recognize DISTINCT ON.

select distinct on customer_id
id_created,
customer_id,
call_start,	
status_prior,	
status_after,	
rank() over (partition by customer_id order by call_start) as rank
order by customer_id, rank desc;

ColRow (43 rep)

Jun 30, 2020, 08:50 PM • Last activity: Sep 14, 2021, 10:04 AM

1 votes

1 answers

42 views

How can I group together rankings by number of users?

rank google-bigquery

I have a table which includes the following: ``` ID evar event_date ranking 1 landing 2019-01-01 1 1 content 2019-01-02 2 1 homepage 2019-01-03 3 2 support 2019-02-01 1 2 content 2019-02-02 2 3 content 2019-01-23 1 3 support 2019-01-24 2 ``` etc. I am trying to map customer journeys, so I've ranked...

I have a table which includes the following:

ID    evar     event_date   ranking  
1    landing   2019-01-01      1   
1    content   2019-01-02      2  
1    homepage  2019-01-03      3  
2    support   2019-02-01      1  
2    content   2019-02-02      2  
3    content   2019-01-23      1  
3    support   2019-01-24      2

etc. I am trying to map customer journeys, so I've ranked the pages that were visited by date. I need to find the top journeys on the website - so it needs to be at this level, but aggregated by the number of users. For example: ID 1 made the journey landing > content > homepage; I would like to see who else made that same journey (without me manually checking) How can I write such a query?

markjones0121 (11 rep)

Dec 31, 2019, 11:43 AM • Last activity: Sep 3, 2021, 11:18 AM

39 votes

10 answers

162187 views

Get the rank of a user in a score table

mysql rank

I have a very simple MySQL table where I save highscores. It looks like that: Id Name Score So far so good. The question is: How do I get what's a users rank? For example, I have a users `Name` or `Id` and want to get his rank, where all rows are ordinal ordered descending for the `Score`. An Exampl...

                                  I have a very simple MySQL table where I save highscores. It looks like that:

    Id     Name     Score

So far so good. The question is: How do I get what's a users rank?
For example, I have a users Name or Id and want to get his rank, where all rows are ordinal ordered descending for the Score.

An Example

    Id	Name	Score
    1	Ida    	100
    2	Boo	    58
    3	Lala	88
    4	Bash	102
    5	Assem	99

In this very case, Assem's rank would be 3, because he got the 3rd highest score.

The query should return one row, which contains (only) the required Rank.

Michael (595 rep)

Feb 22, 2012, 08:48 PM • Last activity: Jul 24, 2021, 07:21 AM

1 votes

0 answers

134 views

Ranking and use of cube by in an SQL query?

sql-server mysql subquery rank cube

I am a the MSc student who does not understand this following SQL dilemma and want to be certain from an Academic standpoint. The entire “Foodmart” database can be downloaded from: http://didawiki.cli.di.unipi.it/lib/exe/fetch.php/bdd-infuma/foodmart_mysql.sql.zip In short, is my SQL syntax valid in...

                                  I am a the MSc student who does not understand this following SQL dilemma and want to be certain from an Academic standpoint.

The entire “Foodmart” database can be downloaded from: http://didawiki.cli.di.unipi.it/lib/exe/fetch.php/bdd-infuma/foodmart_mysql.sql.zip 



In short, is my SQL syntax valid in interpreting the question correctly? I put the questions in reverse order from 4.3a 4.3b 4.2 and 4.1 so they decrease with difficulty.

Question 4.3 Produce a report with information about the 

A) profit and ratioP of each product family, store country, year and gender; the ratioP is between profit and the store sales in the gender category (Male or Female)

B) the profit of each subset of the above combination



My syntax is:

    -- 4.3a
    select  PC.product_family , ST.store_country , TBD.the_year as year ,
            C.gender , (sum(store_sales) - sum(store_cost)) as total_profit ,
                ((sum(store_sales) - sum(store_cost)) / sum(store_sales)) as perc_profit
        from  sales_fact S
        left join  product P  ON S.product_id = P.product_id
        left join  product_class PC  ON P.product_class_id = PC.product_class_id
        left join  store ST  ON S.store_id = st.store_id
        left join  time_by_day TBD  ON S.time_id = TBD.time_id
        left join  customer C  ON S.customer_id = C.customer_id
        group by  PC.product_family, ST.store_country, TBD.the_year,
            C.gender ;
    -- 4.3b 
    select  PC.product_family , ST.store_country , ST.store_city ,
            TBD.the_year as year , TBD.the_month as month , TBD.the_day as day ,
            C.gender , (sum(store_sales) - sum(store_cost)) as total_profit
        from  sales_fact S
        left join  product P  ON S.product_id = P.product_id
        left join  product_class PC  ON P.product_class_id = PC.product_class_id
        left join  store ST  ON S.store_id = st.store_id
        left join  time_by_day TBD  ON S.time_id = TBD.time_id
        left join  customer C  ON S.customer_id = C.customer_id
        group by  PC.product_family, ST.store_country, ST.store_city,
            TBD.the_year, TBD.the_month, TBD.the_day, C.gender ;

    select  PC.product_family , ST.store_city , TBD.the_year as year ,
            sum(store_sales) as total_sales , (sum(store_sales) - sum(store_cost)) as total_profit ,
                ((sum(store_sales) - sum(store_cost)) / sum(store_sales)) as perc_profit
        from  sales_fact S
        left join  product P  ON S.product_id = P.product_id
        left join  product_class PC  ON P.product_class_id = PC.product_class_id
        left join  store ST  ON S.store_id = ST.store_id
        left join  time_by_day TBD  ON S.time_id = TBD.time_id
        group by  PC.product_family, ST.store_city, TBD.the_year ;


Could there be any merit, necessity, or efficiency in using cube by or roll-up functions in this query?  

A university Professor suggests:

You do not need cube by. The queries you mention have nothing to do with cubes or rollup 
They are simply group by and many combinations of the intermediate results.

"They are simply group by and many combinations of the intermediate results."

 "With group by having and aggregate functions like max sum etc. can achieve what is asked. You just need to create a strategy (plan) on how to compute it and then do the select project join queries that do that."

“With group by having and aggregate functions like max sum etc. you can achieve what is asked. You just need to create a strategy (plan) on how to compute it and then do the select project join queries that do that.”

“It is not something done it 2 min … It needs some thinking (but that does not make it hard).”


I have other questions below but am grateful for your reply or anyone else you deem suitable!

Here are the other 2 questions on the same database with my syntax:

4.2 For every product family and customer country and gender return the customer with the highest store sales of the product family. If there are two or more customers with the same store sales, pick up any.

-- 4.2

select distinct
C.lname,
    C.fname,
    PC.product_family,
    C.country,
    C.gender,
MAX(S.store_sales)
from
    sales_fact  S
    left join product P on S.product_id = P.product_id
c    left join product_class PC on P.product_class_id = PC.product_class_id
    left join customer C on S.customer_id = C.customer_id
group by
    PC.product_family, C.country, C.gender
;

The university Professor suggests only grouping by year?:

you group by (product family, customer country, gender, customer) 
in the select clause you put 
product family, customer country, gender, customer, sum(sakes)

Now you know the sales of each customer. 

Then you use that table, to identify the max sales and select the tuples that have such a value.

Possible error?

Is there also for example the requirement of the use of a ranking or anything else for that matter?



4.1 Return a table where you list product family, store city and year, with the corresponding total sales 

and profit, and the percentage of profit over the total sales of the year.



-- 4.1
select
PC.product_family
    , ST.store_city
    , TBD.the_year as year
    , sum(store_sales) as total_sales
    , (sum(store_sales) - sum(store_cost)) as total_profit
    , ((sum(store_sales) - sum(store_cost)) / sum(store_sales)) as perc_profit
from
sales_fact  S
    left join product P on S.product_id = P.product_id
    left join product_class PC on P.product_class_id = PC.product_class_id
    left join store ST on S.store_id = ST.store_id
    left join time_by_day TBD on S.time_id = TBD.time_id
group by
PC.product_family, ST.store_city, TBD.the_year

;


you simply group by year and then you do 
in the select clause 
sum(sales), sum(profit), sum(sales)/sum(profits)

The from clause should have the join of all the tables containing the needed information

That is all.

I do not only group by year as I see also PC.product_family, ST.store_city in my syntax. Possible error?



Could there also, for example, in the Select be the error of the inclusion of attributes not present in the group by?



Many thanks people for this help,

Tim
                                

Timothy (11 rep)

Mar 26, 2021, 03:13 PM • Last activity: Mar 26, 2021, 04:51 PM

0 votes

0 answers

25 views

Field with Top Ranking Field Name

rank hadoop hive hiveql

Let's imagine a table structured like this: | Bucket | Red | Blue | Green | | ------ | --- |----- | ----- | | First | 1 |3 |4 | | Second | 6 |5 |2 | What I'm trying to achieve is based on the values within each bucket, I'd like to generate another set of fields with the highest ranking, second highe...

                                  Let's imagine a table structured like this:

| Bucket | Red | Blue | Green |
| ------ | --- |----- | ----- |
| First  | 1   |3     |4      |
| Second | 6   |5     |2      |

What I'm trying to achieve is based on the values within each bucket, I'd like to generate another set of fields with the highest ranking, second highest ranking, and third highest ranking colors (assume there are more than three colors as well). We are limiting to top 3.

Essentially, what my final output should look like is this:

| Bucket | Red | Blue | Green | Rank 1 | Rank 2 | Rank 3 |
| ------ | --- |----- | ----- | ------ | ------ | ------ |
| First  | 1   |3     |4      | Green  | Blue   | Red    |
| Second | 6   |5     |2      | Red    | Blue   | Green  |

Hoping this isn't a redundant question.
                                

Franco Buhay (1 rep)

Feb 10, 2021, 07:37 PM

0 votes

1 answers

62 views

Normalization using DENSE_RANK and LAST_INSERT_ID

mysql normalization mysql-8.0 auto-increment rank

**[db-fiddle][1]** I have created a table called `orders` (orderId, customersId) and a table called `orderDetails` (orderId, productId). My goal is to denormalize a table called `temp` which is like ``` scanId, customersId, productId 1,1,1 1,1,2 1,2,3 2,3,4 2,1,5 2,1,6 ``` I want to insert a new `or...

**db-fiddle ** I have created a table called orders (orderId, customersId) and a table called orderDetails (orderId, productId). My goal is to denormalize a table called temp which is like

scanId, customersId, productId
1,1,1
1,1,2
1,2,3
2,3,4
2,1,5
2,1,6

I want to insert a new order for every distinct scanId,customersId, and then for every order insert a productId into orderDetails. So given temp you would see 4 orders created, with the 1st and 4th order containing 2 productIds each. I am used to SQL Server where we can utilize the OUTPUT clause of an INSERT to tie identities back to an initial INSERT. However MySQL doesn't have anything like that. Following advice from https://stackoverflow.com/questions/7333524/how-can-i-insert-many-rows-into-a-mysql-table-and-return-the-new-ids I came up with the following:

INSERT INTO orders(customersId)
SELECT customersId
FROM temp
GROUP BY scanId, customersId
ORDER BY scanId, customersId;

INSERT INTO orderDetails(orderId, productId)
SELECT 
	DENSE_RANK() OVER(
		ORDER BY scanId, customersId
	) + LAST_INSERT_ID() -1, productId
FROM temp;

The trick being DENSE_RANK() OVER( ORDER BY scanId, customersId ) + LAST_INSERT_ID() -1 constructs the orderId. Will this always work? Specifically, will MySQL always assign autoincrement ids to inserts based on the ORDER BY and then utilzing a windowed function on the partition and order always tie back to the inserted id?

ParoX (163 rep)

Jan 9, 2021, 05:15 AM • Last activity: Jan 9, 2021, 03:34 PM

0 votes

1 answers

139 views

Creating a table with Rank ordering in Oracle

oracle order-by rank

I am trying to create a table in Oracle as follow: create table hops.t_distribution AS (select UID, CONTACT_EMAIL, PRODUCT_ID, rank() over(Partition by UID order by to_number(PRODUCT_ID)) as PRIORITY_ORDER from hops.staging_test); hops.staging_test is basically the same table without the PRIORITY_OR...

                                  I am trying to create a table in Oracle as follow:

    create table hops.t_distribution AS
    (select 
    UID, 
    CONTACT_EMAIL, 
    PRODUCT_ID, 
    rank() over(Partition by UID order by to_number(PRODUCT_ID)) as PRIORITY_ORDER 
    from hops.staging_test);

hops.staging_test is basically the same table without the PRIORITY_ORDER column. 

However, I am getting an ORA error because you can't order by in a create table statement. Is there another clever way to do this? 

Thank you!
                                

JoD (1 rep)

Dec 10, 2020, 04:59 PM • Last activity: Dec 10, 2020, 09:27 PM

1 votes

2 answers

103 views

Find the greatest difference between each unique record with different timestamps

postgresql order-by greatest-n-per-group window-functions rank

                                  I have a PostgreSQL table with the following schema and data:

    CREATE TABLE IF NOT EXISTS T(
         id uuid PRIMARY KEY,
         username varchar(15),
         person varchar(10),
         tweets int,
         followers int,
         following int,
         likes int,
         created_at date) 
    ;
     
     id                                   | username     | person     | tweets | followers | following | likes | created_at
     :----------------------------------- | :----------- | :--------- | -----: | --------: | --------: | ----: | :---------
     3fa34100-d688-4051-a687-ec49d05e7212 | renok        | null       |    110 |         6 |         0 |     0 | 2020-10-10
     bab9ceb9-2770-49ea-8489-77e5d763a223 | Lydia_C      | test user2 |    515 |      1301 |      1852 |  1677 | 2020-10-10
     4649077a-9188-4821-a1ec-3b38608ea44a | Kingston_Sav | null       |   2730 |      1087 |      1082 |  1339 | 2020-10-10
     eef80836-e140-4adc-9598-8b612ab1825b | TP_s         | null       |   1835 |       998 |       956 |  1832 | 2020-10-10
     fd3ff8c7-0994-40b6-abe0-915368ab9ae5 | DKSnr4       | null       |    580 |       268 |       705 |   703 | 2020-10-10
     3fa34100-d688-4051-a687-ec49d05e7312 | renok        | null       |    119 |         6 |         0 |     0 | 2020-10-12
     bab9ceb9-2770-49ea-8489-77e5d763a224 | Lydia_C      | test user2 |    516 |      1301 |      1852 |  1687 | 2020-10-12
     4649077a-9188-4821-a1ec-3b38608ea44B | Kingston_Sav | null       |   2737 |      1090 |      1084 |  1342 | 2020-10-12
     eef80836-e140-4adc-9598-8b612ae1835c | TP_s         | null       |   1833 |       998 |       957 |  1837 | 2020-10-12
     fd3ff8c7-0994-40b6-abe0-915368ab7ab5 | DKSnr4       | null       |    570 |       268 |       700 |   703 | 2020-10-12

I intend to get the biggest difference between the most recent date and the next most recent date for each unique username and the find the username with the largest margin (difference) for example..In the above table the most recent date is 2020-10-12 and the next most recent date is 2020-10-10.

So I want to get something like this 

     id                                   | username     | person | tweets | followers | following | likes | created_at | prev_followers | gain
     :----------------------------------- | :----------- | :----- | -----: | --------: | --------: | ----: | :--------- | -------------: | ---:
     4649077a-9188-4821-a1ec-3b38608ea44a | Kingston_Sav | null   |   2737 |      1090 |      1084 |  1342 | 2020-10-12 |           1087 |    3
                                

basebandit (13 rep)

Oct 29, 2020, 09:02 AM • Last activity: Nov 2, 2020, 07:31 AM

0 votes

1 answers

184 views

Create a temp table that consolidates start and end dates plus generates a uniqueid for each consolidated record

sql-server sql-server-2008-r2 stored-procedures view rank

I have a table called `projecthistory`, it is actually a view so I could take advantage of the `Rank` function to get a unique record number. From this view I need to create a temp table that has a `uniqueid` field which would be calculated bases on these rules: 1. Rule 1 - The next sequential recor...

                                  
I have a table called projecthistory,
it is actually a view so I could take advantage of the Rank function to get a unique record number. 

From this view I need to create a temp table that has a uniqueid field which would be calculated bases on these rules:

 1. Rule 1 - The next sequential record for the id changes from one
    project to another or

 2. Rule 2 - If the id and project match for the next sequential record
    , but the days between the end date of the current record and start
    date of the next record for the is more than 7 days.

It would also consolidate the start and end dates appropriately for each group.  

I believe this would be best done in a stored procedure that can call the data when needed, but I'm fairly new to stored procedures. 

Here is an example of the table data, as well as the desired output

Projecthistory

    ╔════════╦══════╦═════════╦═══════════╦═══════════╗
    ║ Record ║  Id  ║ Project ║   Start   ║    End    ║
    ╠════════╬══════╬═════════╬═══════════╬═══════════╣
    ║      1 ║ 1111 ║ A       ║ 1/1/2016  ║ 1/7/2016  ║
    ║      2 ║ 1111 ║ A       ║ 1/10/2016 ║ 1/20/2016 ║
    ║      3 ║ 1111 ║ B       ║ 1/21/2016 ║ 1/25/2016 ║
    ║      4 ║ 1111 ║ A       ║ 1/26/2016 ║ NULL      ║
    ║      5 ║ 2222 ║ A       ║ 1/1/2016  ║ 2/1/2016  ║
    ║      6 ║ 2222 ║ B       ║ 2/2/2016  ║ NULL      ║
    ║      7 ║ 3333 ║ D       ║ 1/15/2016 ║ 2/1/2016  ║
    ║      8 ║ 4444 ║ B       ║ 2/10/2016 ║ NULL      ║
    ║      9 ║ 5555 ║ A       ║ 2/1/2016  ║ 2/5/2016  ║
    ║     10 ║ 5555 ║ A       ║ 2/20/2016 ║ 2/27/2016 ║
    ║     11 ║ 5555 ║ A       ║ 2/28/2016 ║ 3/10/2016 ║
    ║     12 ║ 5555 ║ A       ║ 3/11/2016 ║ NULL      ║
    ╚════════╩══════╩═════════╩═══════════╩═══════════╝

Desired output

    ╔════════╦══════╦═════════╦══════════════╦════════════╗
    ║ unique ║  Id  ║ Project ║ ProjectStart ║ ProjectEnd ║
    ╠════════╬══════╬═════════╬══════════════╬════════════╣
    ║ 1111A1 ║ 1111 ║ A       ║ 1/1/2016     ║ 1/20/2016  ║
    ║ 1111B1 ║ 1111 ║ B       ║ 1/21/2016    ║ 1/25/2016  ║
    ║ 1111A2 ║ 1111 ║ A       ║ 1/26/2016    ║ NULL       ║
    ║ 2222A1 ║ 2222 ║ A       ║ 1/1/2016     ║ 2/1/2016   ║
    ║ 2222B1 ║ 2222 ║ B       ║ 2/2/2016     ║ NULL       ║
    ║ 3333D1 ║ 3333 ║ D       ║ 1/15/2016    ║ 2/1/2016   ║
    ║ 4444B1 ║ 4444 ║ B       ║ 2/10/2016    ║ NULL       ║
    ║ 5555A1 ║ 5555 ║ A       ║ 2/1/2016     ║ 2/5/2016   ║
    ║ 5555A2 ║ 5555 ║ A       ║ 2/20/2016    ║ Null       ║
    ╚════════╩══════╩═════════╩══════════════╩════════════╝


Here are some notes about the desired output and some of
more difficult scenarios in this data.

Take note that records 1-2 became 1111A1, 
record 3 becomes 1111b1,
record 4 became the second instance of that project for id 1111 so its uniqueid became 1111A2, be careful this doesn’t get lumped
in with records 1&2. 

Record 9 become 55555A1 and because of the date gap
between records 9&10, record 10-12 became uniqueid 5555A2

I look forward to any solutions, feedback or questions. I’m
truly grateful in advance.
                                

Josh (11 rep)

Mar 9, 2016, 08:13 PM • Last activity: Aug 29, 2020, 10:04 AM

Showing page 1 of 20 total questions