Database Administrators
Q&A for database professionals who wish to improve their database skills
Latest Questions
1
votes
2
answers
182
views
MYSQL Calculate ranking of row that is not yet inserted
I've got a table similar to this data: | id | username | value | |-----------|----------|-------| | 1 | bob | 46 | | 483 | alice | 90 | | 176 | sue | 3001 | | 82 | harry | 0 | | 10493 | bill | 327 | I have this query that returns me the ranking of a user based on their `id` SELECT username, value, r...
I've got a table similar to this data:
| id | username | value |
|-----------|----------|-------|
| 1 | bob | 46 |
| 483 | alice | 90 |
| 176 | sue | 3001 |
| 82 | harry | 0 |
| 10493 | bill | 327 |
I have this query that returns me the ranking of a user based on their
id
SELECT
username,
value,
rank from
(
SELECT
tp.username,
tp.value,
tp.id,
@curRank := @curRank + 1 AS rank
FROM table tp,
(SELECT @curRank := 0) r
ORDER BY tp.value DESC
)
as temp
WHERE id = 483;
So for the above query, I would get a ranking returned of **4** for the id
483.
Let's say I want to insert the following row into the table:
| id | username | value |
|---------------------------|
| 2 | sally | 2000 |
Is there any way to know what rank
this row _will_ have after it is inserted, **without** actually inserting it?
I.e. sally
would have a rank of 2 from the above query if inserted.
The reason I am curious if this is possible is that I'm attempting to insert the row into the database and only have this one transaction, rather than having to insert the row, then re-run the rank query.
Thanks!
follmer
(113 rep)
Dec 27, 2019, 05:42 AM
• Last activity: Jun 30, 2025, 05:02 PM
2
votes
1
answers
952
views
DENSE_RANK with distinct columns after filter
I find myself needing to get the most recent N things accessed by each user, and I have a log table of all accesses. Something like ```CREATE TABLE [AccessLog] ([UserId] UNIQUEIDENTIFIER, [ThingId] UNIQUEIDENTIFIER, [AccessDate] DATETIME)``` So I wrote ``` SELECT [UserId], [ThingId], [SequenceNumber...
I find myself needing to get the most recent N things accessed by each user, and I have a log table of all accesses. Something like
TABLE [AccessLog] ([UserId] UNIQUEIDENTIFIER, [ThingId] UNIQUEIDENTIFIER, [AccessDate] DATETIME)
So I wrote
SELECT [UserId], [ThingId], [SequenceNumber]
FROM (SELECT [UserId], [ThingId], DENSE_RANK()
OVER (PARTITION BY [UserId] ORDER BY [AccessDate] DESC) SequenceNumber
FROM [AccessLog]
) [AccessLog]
WHERE [SequenceNumber] < @N
This doesn't work because if the same user accessed the same thing repeatedly we get the same [ThingId]
back up to N times rather than N distinct [ThingIds]
back for each user.
I'm baffled how to actually do it. The table is so large that if I have to write the N cubed join like it's 2000 again (join table back on itself once to exclude duplicate copies and a second time to generate the count) it will be faster to suck in the entire table and process it in application code.
It's almost like I want
FOR EACH [UserId] IN (SELECT [UserId] IN [User])
SELECT TOP (@N) [UserId], [ThingId]
FROM [AccessLog]
ORDER BY [AccessDate] DESC
But that performance is also so bad I'd to better reading the table into application code again.
Joshua
(426 rep)
Jun 7, 2023, 11:53 PM
• Last activity: Jun 8, 2023, 04:00 AM
-3
votes
2
answers
94
views
Rank over discontinuous blocks of rows
I have the following table Logs: | Id| Char| |:--|:----| |1 | a | |2 | a | |3 | a | |4 | b | |5 | b | |6 | a | |7 | a | |8 | c | I want to get something like: | Id| Char|Rank| |:--|:----|:---| |1 | a |1 | |2 | a |2 | |3 | a |3 | |4 | b |1 | |5 | b |2 | |6 | a |1 | |7 | a |2 | |8 | c |1 | A MS-SQL-Se...
I have the following table Logs:
| Id| Char|
|:--|:----|
|1 | a |
|2 | a |
|3 | a |
|4 | b |
|5 | b |
|6 | a |
|7 | a |
|8 | c |
I want to get something like:
| Id| Char|Rank|
|:--|:----|:---|
|1 | a |1 |
|2 | a |2 |
|3 | a |3 |
|4 | b |1 |
|5 | b |2 |
|6 | a |1 |
|7 | a |2 |
|8 | c |1 |
A MS-SQL-Server query is desirable
Wasim Ali
(1 rep)
May 2, 2023, 05:04 PM
• Last activity: May 3, 2023, 10:49 AM
1
votes
1
answers
841
views
Fetch 10 paginated ranked posts of each category
Posts table as p: ``` | id | cid | score | ... -------------------- | 1 | 1 | 3 | ... ``` Categories Table as c: ``` | id |... -------------------- | 1 |... ``` What i need? Reterive paginated ranked posts in each category. What i did do: ```SQL SELECT DENSE_RANK() OVER ( PARTITION BY c.id ORDER BY...
Posts table as p:
| id | cid | score | ...
--------------------
| 1 | 1 | 3 | ...
Categories Table as c:
| id |...
--------------------
| 1 |...
What i need? Reterive paginated ranked posts in each category.
What i did do:
SELECT
DENSE_RANK() OVER (
PARTITION BY c.id
ORDER BY p."score" DESC NULLS LAST) AS ranking,
p.id,
count(*) OVER() AS full_count
FROM p
INNER JOIN c ON c.id=p.cid
offset 20 limit 10
This query will return whole records and affect offset/limit on it. But i need to affect offset/limit on each category, separated.
user3690884
(13 rep)
May 8, 2019, 10:33 AM
• Last activity: Feb 23, 2023, 09:03 AM
-3
votes
1
answers
41
views
Sql ranking query
I have a requirement to make tier based on rank Example first 5 ranks tier 1 Next 5 ranks tier 2 Next 3 ranks tier 3 Next 5 ranks tier 3 Kindly let me know if any help in the sql query for above requirement Thanks in advance
I have a requirement to make tier based on rank
Example first 5 ranks tier 1
Next 5 ranks tier 2
Next 3 ranks tier 3
Next 5 ranks tier 3
Kindly let me know if any help in the sql query for above requirement
Thanks in advance
HEMANTH KUMAR REDDY
(1 rep)
Jan 29, 2023, 07:39 PM
• Last activity: Jan 30, 2023, 07:21 AM
7
votes
3
answers
1795
views
A more intelligent ntile
When using the `ntile()` window function, the major issue is that it arbitrarily groups into roughly equal parts regardless of the actual value. For example with the following query: select id,title,price, row_number() over(order by price) as row_number, rank() over(order by price) as rank, count(*)...
When using the
ntile()
window function, the major issue is that it arbitrarily groups into roughly equal parts regardless of the actual value.
For example with the following query:
select
id,title,price,
row_number() over(order by price) as row_number,
rank() over(order by price) as rank,
count(*) over(order by price) as count,
dense_rank() over(order by price) as dense_rank,
ntile(10) over(order by price) as decile
from paintings
order by price;
I will get 10 groups of about the same size, with the strong likelihood that paintings with the same price will end up in different bins.
For Example:
┌────┬────────────────────────────────────────────┬───────┬────────────┬──────┬───────┬────────────┬────────┐
│ id │ title │ price │ row_number │ rank │ count │ dense_rank │ decile │
╞════╪════════════════════════════════════════════╪═══════╪════════════╪══════╪═══════╪════════════╪════════╡
│ 11 │ Eyes in the Heat │ 10 │ 1 │ 1 │ 1 │ 1 │ 1 │
│ 19 │ Deux fillettes, fond jaune et rouge │ 11 │ 2 │ 2 │ 2 │ 2 │ 1 │
│ 17 │ Flowers in a Pitcher │ 12 │ 3 │ 3 │ 6 │ 3 │ 1 │
│ 5 │ Composition with Red, Yellow and Blue │ 12 │ 4 │ 3 │ 6 │ 3 │ 2 │
│ 18 │ La lecon de musique (The Music Lesson) │ 12 │ 5 │ 3 │ 6 │ 3 │ 2 │
│ 9 │ The Adoration of the Magi │ 12 │ 6 │ 3 │ 6 │ 3 │ 2 │
│ 29 │ Self-Portrait │ 14 │ 7 │ 7 │ 10 │ 4 │ 3 │
│ 25 │ Symphony in White, No. 1: The White Girl │ 14 │ 8 │ 7 │ 10 │ 4 │ 3 │
│ 30 │ The Anatomy Lecture of Dr. Nicolaes Tulp │ 14 │ 9 │ 7 │ 10 │ 4 │ 3 │
│ 20 │ Les repasseuses (Women Ironing) │ 14 │ 10 │ 7 │ 10 │ 4 │ 4 │
│ 1 │ The Birth of Venus │ 15 │ 11 │ 11 │ 14 │ 5 │ 4 │
│ 12 │ Femme se promenant dans une foret exotique │ 15 │ 12 │ 11 │ 14 │ 5 │ 4 │
│ 24 │ Portrait of the Painter’s Mother │ 15 │ 13 │ 11 │ 14 │ 5 │ 5 │
│ 28 │ Jeunes filles au piano │ 15 │ 14 │ 11 │ 14 │ 5 │ 5 │
│ 7 │ Portrait de l artiste (Self-portrait) │ 16 │ 15 │ 15 │ 17 │ 6 │ 5 │
│ 3 │ The Last Supper │ 16 │ 16 │ 15 │ 17 │ 6 │ 6 │
│ 13 │ Combat of a Tiger and a Buffalo │ 16 │ 17 │ 15 │ 17 │ 6 │ 6 │
│ 4 │ The Creation of Man │ 17 │ 18 │ 18 │ 19 │ 7 │ 6 │
│ 22 │ Le Chemin de Fer │ 17 │ 19 │ 18 │ 19 │ 7 │ 7 │
│ 6 │ Femmes de Tahiti [Sur la plage] │ 18 │ 20 │ 20 │ 24 │ 8 │ 7 │
│ 21 │ Le Bar aux Folies-Berg │ 18 │ 21 │ 20 │ 24 │ 8 │ 7 │
│ 26 │ Lady at the Piano │ 18 │ 22 │ 20 │ 24 │ 8 │ 8 │
│ 15 │ Remembrance of a Garden │ 18 │ 23 │ 20 │ 24 │ 8 │ 8 │
│ 16 │ 1914 │ 18 │ 24 │ 20 │ 24 │ 8 │ 8 │
│ 14 │ Ancient Sound, Abstract on Black │ 19 │ 25 │ 25 │ 28 │ 9 │ 9 │
│ 8 │ The Large Turf │ 19 │ 26 │ 25 │ 28 │ 9 │ 9 │
│ 23 │ On the Beach │ 19 │ 27 │ 25 │ 28 │ 9 │ 9 │
│ 2 │ Portrait of Mona Lisa │ 19 │ 28 │ 25 │ 28 │ 9 │ 10 │
│ 27 │ On the Terrace │ 20 │ 29 │ 29 │ 30 │ 10 │ 10 │
│ 10 │ The She-Wolf │ 20 │ 30 │ 29 │ 30 │ 10 │ 10 │
└────┴────────────────────────────────────────────┴───────┴────────────┴──────┴───────┴────────────┴────────┘
Note that there are four items with price 12
, but two of them are in the have decile 1, and two of them decile 2. I would like to keep these items together, and I’m not fussed about which decile.
I have included other window functions to make the comparison.
It seems that ntile()
uses the row_number()
only and bases the cuttoffs on that. It would be more fair if it used either the rank()
or count(*)
function as items with the same price would end up in the same bin.
This is the behaviour for both PostgreSQL and SQL Server, and, presumably, for the rest.
The question is, is there a way to achieve this?
Manngo
(3145 rep)
Aug 22, 2019, 05:58 AM
• Last activity: Dec 22, 2022, 06:15 AM
2
votes
1
answers
908
views
Is there a better SQL query to find ranking of a single row in database?
I would like to obtain the ranking of a user's points in their country. For example, 23rd highest points in the US. After some Googling, I found that the `RANK` function would be the best solution. Here is my SQL query: ```sql SELECT points_ranking FROM ( SELECT user_country, user_id, user_points, R...
I would like to obtain the ranking of a user's points in their country. For example, 23rd highest points in the US.
After some Googling, I found that the
RANK
function would be the best solution. Here is my SQL query:
SELECT points_ranking FROM
(
SELECT
user_country,
user_id,
user_points,
RANK () OVER (
ORDER BY user_points DESC
) points_ranking
FROM data.user
WHERE
user_country iLIKE 'usa'
) as subquery
WHERE
user_id = 3
However, I'm unsure if this is highly inefficient or if there's a better way. I ask this because I'm concerned the inner query would be ranking every single row in the table, just to select one.
nreh
(123 rep)
Nov 29, 2022, 02:10 AM
• Last activity: Nov 29, 2022, 05:14 AM
2
votes
1
answers
613
views
Select rank of single row
``` CREATE TABLE Player ( id BIGINT NOT NULL PRIMARY KEY, xp BIGINT NOT NULL ); INSERT INTO Player (id, xp) VALUES (1, 100); INSERT INTO Player (id, xp) VALUES (2, 150); INSERT INTO Player (id, xp) VALUES (3, 250); INSERT INTO Player (id, xp) VALUES (4, 125); INSERT INTO Player (id, xp) VALUES (5, 5...
CREATE TABLE Player (
id BIGINT NOT NULL PRIMARY KEY,
xp BIGINT NOT NULL
);
INSERT INTO Player (id, xp) VALUES (1, 100);
INSERT INTO Player (id, xp) VALUES (2, 150);
INSERT INTO Player (id, xp) VALUES (3, 250);
INSERT INTO Player (id, xp) VALUES (4, 125);
INSERT INTO Player (id, xp) VALUES (5, 500);
This query:
SELECT id, xp, RANK() over (ORDER BY xp DESC) 'rank' FROM Player;
Produces these results:
+----+-----+------+
| id | xp | rank |
+----+-----+------+
| 5 | 500 | 1 |
| 3 | 250 | 2 |
| 2 | 150 | 3 |
| 4 | 125 | 4 |
| 1 | 100 | 5 |
+----+-----+------+
But when I add WHERE id = '.....', the rank comes back as 1
.
I know that this is because it's now ranking over just one row, so it'll always be 1
.
But how would I select the correct rank just for a single row? Is the RANK() function still appropriate for this?
I don't mind about duplicate scores.
Lucien
(123 rep)
Oct 22, 2022, 10:53 AM
• Last activity: Oct 23, 2022, 01:02 PM
0
votes
0
answers
63
views
student scores ranking or positioning on mysqli
I have a MySQL table (named "scores") where I save student scores and I want to get each student Rank using total score column which I successfully did. Below is my code: ```sql SELECT studentRegNo, subject_Id, total, FIND_IN_SET( total, ( SELECT GROUP_CONCAT( total ORDER BY total DESC ) FROM scores...
I have a MySQL table (named "scores") where I save student scores and I want to get each student Rank using total score column which I successfully did.
Below is my code:
SELECT studentRegNo,
subject_Id,
total,
FIND_IN_SET( total,
( SELECT GROUP_CONCAT( total ORDER BY total DESC )
FROM scores ) ) AS rank
FROM scores;
But I want the rank of each student separately based of each subject_Id
Like this:
total| studentRegNo| subject_Id| Rank
-|-|-|-
86 | STM/2022/1 | 3 | 1
84 | STR/2021/3 | 3 | 2
98 | STM/2022/1 | 7 | 1
85 | STR/2021/3 | 7 | 2
85 | STM/2022/4 | 7 | 2
70 | STR/2021/5 | 7 | 4
EZE JOHN EMEKA
(1 rep)
Jul 6, 2022, 07:34 AM
• Last activity: Jul 6, 2022, 11:21 AM
1
votes
2
answers
719
views
Getting rank of row
I am using `mysql Ver 14.14 Distrib 5.7.21, for Linux (x86_64)`. I am having a table referral: ``` CREATE TABLE `referrals` ( `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT, `referred_by` bigint(20) unsigned DEFAULT NULL, `referral_token` varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL, `emai...
I am using
mysql Ver 14.14 Distrib 5.7.21, for Linux (x86_64)
.
I am having a table referral:
CREATE TABLE referrals
(
id
bigint(20) unsigned NOT NULL AUTO_INCREMENT,
referred_by
bigint(20) unsigned DEFAULT NULL,
referral_token
varchar(255) COLLATE utf8mb4_unicode_ci DEFAULT NULL,
email
varchar(255) COLLATE utf8mb4_unicode_ci NOT NULL,
created_at
timestamp NULL DEFAULT NULL,
updated_at
timestamp NULL DEFAULT NULL,
PRIMARY KEY (id
),
UNIQUE KEY referrals_email_unique
(email
)
);
In the table you basically insert the person and who referred it.
I have created the following leaderboard:
select
referred_by,
count(referred_by) as referred_by_count,
@curRank := @curRank + 1 AS rank
from
referrals r,
(
SELECT
@curRank := 0) o
group by
referred_by
order by
referred_by_count DESC;
This gives me:
| referred_by | referred_by_count | rank |
| ----------- | ----------------- | ---- |
| 10 | 3 | 3 |
| 2 | 2 | 1 |
| 3 | 2 | 2 |
| 13 | 2 | 5 |
| 11 | 2 | 6 |
| 15 | 1 | 8 |
| 12 | 1 | 9 |
| 4 | 1 | 4 |
| 9 | 1 | 7 |
However, I would like to get:
| referred_by | referred_by_count | rank |
| ----------- | ----------------- | ---- |
| 10 | 3 | 1 |
| 2 | 2 | 2 |
| 3 | 2 | 3 |
| 13 | 2 | 4 |
| 11 | 2 | 5 |
| 15 | 1 | 6 |
| 12 | 1 | 7 |
| 4 | 1 | 8 |
| 9 | 1 | 9 |
[View on DB Fiddle](https://www.db-fiddle.com/f/o42zo6XiJZzEbrrVDfgQR4/3)
I created the following dbfiddle using the faker php plugin.
I tried using the mysql RANK()
function. However, I am struggeling how to correctly use it.
I appreciate your replies!
Carol.Kar
(135 rep)
Dec 25, 2021, 07:30 PM
• Last activity: Dec 26, 2021, 01:03 AM
1
votes
1
answers
194
views
Atomic updates to ordering in a table for a waiting list application
I am trying to implement a gamified waiting list. This waiting list is for eventual access to an application. A user on the waiting list can perform certain actions which may move them up in the waiting list. The table would look something like this (open to suggestions here as well): `# table waiti...
I am trying to implement a gamified waiting list. This waiting list is for eventual access to an application. A user on the waiting list can perform certain actions which may move them up in the waiting list. The table would look something like this (open to suggestions here as well):
# table waitinglist
| user_id | position |
| ------- | -------- |
| 12 | 0 |
| 3 | 1 |
| 15 | 2 |
| 10 | 3 |
... and so on.
This waiting list could potentially (hopefully) have millions of entries.
When a user performs an action that changes their order in the waiting list, I want to be able to update that order atomically so that if someone else does something to change their order at a similar time that they don't conflict.
We are using Postgres for our database, but I'm open to other (opensource) ideas as well.
In the example table, if user #3 did something that moves them to position 0, it means that user #12 would be moved to position 1. We could do a query like:
BEGIN;
UPDATE waitinglist AS w SET
position = d.position
FROM (VALUES
(3, 0),
(12, 1)
) AS d(user_id, position)
WHERE d.user_id = w.user_id;
COMMIT;
However, if user #15 also moved up a position at nearly the same time, there might be conflict.
Are there some strategies I can use to make sure that the updates don't conflict?
synic
(115 rep)
Dec 1, 2021, 10:13 PM
• Last activity: Dec 1, 2021, 11:35 PM
0
votes
1
answers
1100
views
Substitute DISTINCT ON for query transformation (Postgresql to Redshift)
I need to transform my Postgresql query to Redshift and I working with the following data, here is an excerpt: [![enter image description here][1]][1] [1]: https://i.sstatic.net/9YSPR.png In my final output, I would like only to keep rows with the highest rank (in green). And I was using the followi...
I need to transform my Postgresql query to Redshift and I working with the following data, here is an excerpt:
In my final output, I would like only to keep rows with the highest rank (in green). And I was using the following query, which works to get the output, but I have to transform it into Redshift because the visualization tool is only using Redshift and Redshift doesn't recognize DISTINCT ON.

select distinct on customer_id
id_created,
customer_id,
call_start,
status_prior,
status_after,
rank() over (partition by customer_id order by call_start) as rank
order by customer_id, rank desc;
ColRow
(43 rep)
Jun 30, 2020, 08:50 PM
• Last activity: Sep 14, 2021, 10:04 AM
1
votes
1
answers
42
views
How can I group together rankings by number of users?
I have a table which includes the following: ``` ID evar event_date ranking 1 landing 2019-01-01 1 1 content 2019-01-02 2 1 homepage 2019-01-03 3 2 support 2019-02-01 1 2 content 2019-02-02 2 3 content 2019-01-23 1 3 support 2019-01-24 2 ``` etc. I am trying to map customer journeys, so I've ranked...
I have a table which includes the following:
ID evar event_date ranking
1 landing 2019-01-01 1
1 content 2019-01-02 2
1 homepage 2019-01-03 3
2 support 2019-02-01 1
2 content 2019-02-02 2
3 content 2019-01-23 1
3 support 2019-01-24 2
etc.
I am trying to map customer journeys, so I've ranked the pages that were visited by date.
I need to find the top journeys on the website - so it needs to be at this level, but aggregated by the number of users.
For example: ID 1 made the journey landing
> content
> homepage
;
I would like to see who else made that same journey (without me manually checking)
How can I write such a query?
markjones0121
(11 rep)
Dec 31, 2019, 11:43 AM
• Last activity: Sep 3, 2021, 11:18 AM
39
votes
10
answers
162187
views
Get the rank of a user in a score table
I have a very simple MySQL table where I save highscores. It looks like that: Id Name Score So far so good. The question is: How do I get what's a users rank? For example, I have a users `Name` or `Id` and want to get his rank, where all rows are ordinal ordered descending for the `Score`. An Exampl...
I have a very simple MySQL table where I save highscores. It looks like that:
Id Name Score
So far so good. The question is: How do I get what's a users rank?
For example, I have a users
Name
or Id
and want to get his rank, where all rows are ordinal ordered descending for the Score
.
An Example
Id Name Score
1 Ida 100
2 Boo 58
3 Lala 88
4 Bash 102
5 Assem 99
In this very case, Assem
's rank would be 3, because he got the 3rd highest score.
The query should return one row, which contains (only) the required Rank.
Michael
(595 rep)
Feb 22, 2012, 08:48 PM
• Last activity: Jul 24, 2021, 07:21 AM
1
votes
0
answers
134
views
Ranking and use of cube by in an SQL query?
I am a the MSc student who does not understand this following SQL dilemma and want to be certain from an Academic standpoint. The entire “Foodmart” database can be downloaded from: http://didawiki.cli.di.unipi.it/lib/exe/fetch.php/bdd-infuma/foodmart_mysql.sql.zip In short, is my SQL syntax valid in...
I am a the MSc student who does not understand this following SQL dilemma and want to be certain from an Academic standpoint.
The entire “Foodmart” database can be downloaded from: http://didawiki.cli.di.unipi.it/lib/exe/fetch.php/bdd-infuma/foodmart_mysql.sql.zip
In short, is my SQL syntax valid in interpreting the question correctly? I put the questions in reverse order from 4.3a 4.3b 4.2 and 4.1 so they decrease with difficulty.
Question 4.3 Produce a report with information about the
A) profit and ratioP of each product family, store country, year and gender; the ratioP is between profit and the store sales in the gender category (Male or Female)
B) the profit of each subset of the above combination
My syntax is:
-- 4.3a
select PC.product_family , ST.store_country , TBD.the_year as year ,
C.gender , (sum(store_sales) - sum(store_cost)) as total_profit ,
((sum(store_sales) - sum(store_cost)) / sum(store_sales)) as perc_profit
from sales_fact S
left join product P ON S.product_id = P.product_id
left join product_class PC ON P.product_class_id = PC.product_class_id
left join store ST ON S.store_id = st.store_id
left join time_by_day TBD ON S.time_id = TBD.time_id
left join customer C ON S.customer_id = C.customer_id
group by PC.product_family, ST.store_country, TBD.the_year,
C.gender ;
-- 4.3b
select PC.product_family , ST.store_country , ST.store_city ,
TBD.the_year as year , TBD.the_month as month , TBD.the_day as day ,
C.gender , (sum(store_sales) - sum(store_cost)) as total_profit
from sales_fact S
left join product P ON S.product_id = P.product_id
left join product_class PC ON P.product_class_id = PC.product_class_id
left join store ST ON S.store_id = st.store_id
left join time_by_day TBD ON S.time_id = TBD.time_id
left join customer C ON S.customer_id = C.customer_id
group by PC.product_family, ST.store_country, ST.store_city,
TBD.the_year, TBD.the_month, TBD.the_day, C.gender ;
select PC.product_family , ST.store_city , TBD.the_year as year ,
sum(store_sales) as total_sales , (sum(store_sales) - sum(store_cost)) as total_profit ,
((sum(store_sales) - sum(store_cost)) / sum(store_sales)) as perc_profit
from sales_fact S
left join product P ON S.product_id = P.product_id
left join product_class PC ON P.product_class_id = PC.product_class_id
left join store ST ON S.store_id = ST.store_id
left join time_by_day TBD ON S.time_id = TBD.time_id
group by PC.product_family, ST.store_city, TBD.the_year ;
Could there be any merit, necessity, or efficiency in using cube by or roll-up functions in this query?
A university Professor suggests:
You do not need cube by. The queries you mention have nothing to do with cubes or rollup
They are simply group by and many combinations of the intermediate results.
"They are simply group by and many combinations of the intermediate results."
"With group by having and aggregate functions like max sum etc. can achieve what is asked. You just need to create a strategy (plan) on how to compute it and then do the select project join queries that do that."
“With group by having and aggregate functions like max sum etc. you can achieve what is asked. You just need to create a strategy (plan) on how to compute it and then do the select project join queries that do that.”
“It is not something done it 2 min … It needs some thinking (but that does not make it hard).”
I have other questions below but am grateful for your reply or anyone else you deem suitable!
Here are the other 2 questions on the same database with my syntax:
4.2 For every product family and customer country and gender return the customer with the highest store sales of the product family. If there are two or more customers with the same store sales, pick up any.
-- 4.2
select distinct
C.lname,
C.fname,
PC.product_family,
C.country,
C.gender,
MAX(S.store_sales)
from
sales_fact S
left join product P on S.product_id = P.product_id
c left join product_class PC on P.product_class_id = PC.product_class_id
left join customer C on S.customer_id = C.customer_id
group by
PC.product_family, C.country, C.gender
;
The university Professor suggests only grouping by year?:
you group by (product family, customer country, gender, customer)
in the select clause you put
product family, customer country, gender, customer, sum(sakes)
Now you know the sales of each customer.
Then you use that table, to identify the max sales and select the tuples that have such a value.
Possible error?
Is there also for example the requirement of the use of a ranking or anything else for that matter?
4.1 Return a table where you list product family, store city and year, with the corresponding total sales
and profit, and the percentage of profit over the total sales of the year.
-- 4.1
select
PC.product_family
, ST.store_city
, TBD.the_year as year
, sum(store_sales) as total_sales
, (sum(store_sales) - sum(store_cost)) as total_profit
, ((sum(store_sales) - sum(store_cost)) / sum(store_sales)) as perc_profit
from
sales_fact S
left join product P on S.product_id = P.product_id
left join product_class PC on P.product_class_id = PC.product_class_id
left join store ST on S.store_id = ST.store_id
left join time_by_day TBD on S.time_id = TBD.time_id
group by
PC.product_family, ST.store_city, TBD.the_year
;
you simply group by year and then you do
in the select clause
sum(sales), sum(profit), sum(sales)/sum(profits)
The from clause should have the join of all the tables containing the needed information
That is all.
I do not only group by year as I see also PC.product_family, ST.store_city in my syntax. Possible error?
Could there also, for example, in the Select be the error of the inclusion of attributes not present in the group by?
Many thanks people for this help,
Tim
Timothy
(11 rep)
Mar 26, 2021, 03:13 PM
• Last activity: Mar 26, 2021, 04:51 PM
0
votes
0
answers
25
views
Field with Top Ranking Field Name
Let's imagine a table structured like this: | Bucket | Red | Blue | Green | | ------ | --- |----- | ----- | | First | 1 |3 |4 | | Second | 6 |5 |2 | What I'm trying to achieve is based on the values within each bucket, I'd like to generate another set of fields with the highest ranking, second highe...
Let's imagine a table structured like this:
| Bucket | Red | Blue | Green |
| ------ | --- |----- | ----- |
| First | 1 |3 |4 |
| Second | 6 |5 |2 |
What I'm trying to achieve is based on the values within each bucket, I'd like to generate another set of fields with the highest ranking, second highest ranking, and third highest ranking colors (assume there are more than three colors as well). We are limiting to top 3.
Essentially, what my final output should look like is this:
| Bucket | Red | Blue | Green | Rank 1 | Rank 2 | Rank 3 |
| ------ | --- |----- | ----- | ------ | ------ | ------ |
| First | 1 |3 |4 | Green | Blue | Red |
| Second | 6 |5 |2 | Red | Blue | Green |
Hoping this isn't a redundant question.
Franco Buhay
(1 rep)
Feb 10, 2021, 07:37 PM
0
votes
1
answers
62
views
Normalization using DENSE_RANK and LAST_INSERT_ID
**[db-fiddle][1]** I have created a table called `orders` (orderId, customersId) and a table called `orderDetails` (orderId, productId). My goal is to denormalize a table called `temp` which is like ``` scanId, customersId, productId 1,1,1 1,1,2 1,2,3 2,3,4 2,1,5 2,1,6 ``` I want to insert a new `or...
**db-fiddle **
I have created a table called
orders
(orderId, customersId) and a table called orderDetails
(orderId, productId). My goal is to denormalize a table called temp
which is like
scanId, customersId, productId
1,1,1
1,1,2
1,2,3
2,3,4
2,1,5
2,1,6
I want to insert a new order
for every distinct scanId,customersId, and then for every order insert a productId into orderDetails
. So given temp
you would see 4 orders created, with the 1st and 4th order containing 2 productIds each.
I am used to SQL Server where we can utilize the OUTPUT clause of an INSERT to tie identities back to an initial INSERT. However MySQL doesn't have anything like that. Following advice from https://stackoverflow.com/questions/7333524/how-can-i-insert-many-rows-into-a-mysql-table-and-return-the-new-ids
I came up with the following:
INSERT INTO orders(customersId)
SELECT customersId
FROM temp
GROUP BY scanId, customersId
ORDER BY scanId, customersId;
INSERT INTO orderDetails(orderId, productId)
SELECT
DENSE_RANK() OVER(
ORDER BY scanId, customersId
) + LAST_INSERT_ID() -1, productId
FROM temp;
The trick being DENSE_RANK() OVER( ORDER BY scanId, customersId ) + LAST_INSERT_ID() -1
constructs the orderId.
Will this always work?
Specifically, will MySQL always assign autoincrement ids to inserts based on the ORDER BY and then utilzing a windowed function on the partition and order always tie back to the inserted id?
ParoX
(163 rep)
Jan 9, 2021, 05:15 AM
• Last activity: Jan 9, 2021, 03:34 PM
0
votes
1
answers
139
views
Creating a table with Rank ordering in Oracle
I am trying to create a table in Oracle as follow: create table hops.t_distribution AS (select UID, CONTACT_EMAIL, PRODUCT_ID, rank() over(Partition by UID order by to_number(PRODUCT_ID)) as PRIORITY_ORDER from hops.staging_test); hops.staging_test is basically the same table without the PRIORITY_OR...
I am trying to create a table in Oracle as follow:
create table hops.t_distribution AS
(select
UID,
CONTACT_EMAIL,
PRODUCT_ID,
rank() over(Partition by UID order by to_number(PRODUCT_ID)) as PRIORITY_ORDER
from hops.staging_test);
hops.staging_test is basically the same table without the PRIORITY_ORDER column.
However, I am getting an ORA error because you can't order by in a create table statement. Is there another clever way to do this?
Thank you!
JoD
(1 rep)
Dec 10, 2020, 04:59 PM
• Last activity: Dec 10, 2020, 09:27 PM
1
votes
2
answers
103
views
Find the greatest difference between each unique record with different timestamps
I have a PostgreSQL table with the following schema and data: CREATE TABLE IF NOT EXISTS T( id uuid PRIMARY KEY, username varchar(15), person varchar(10), tweets int, followers int, following int, likes int, created_at date) ; id | username | person | tweets | followers | following | likes | created...
I have a PostgreSQL table with the following schema and data:
CREATE TABLE IF NOT EXISTS T(
id uuid PRIMARY KEY,
username varchar(15),
person varchar(10),
tweets int,
followers int,
following int,
likes int,
created_at date)
;
id | username | person | tweets | followers | following | likes | created_at
:----------------------------------- | :----------- | :--------- | -----: | --------: | --------: | ----: | :---------
3fa34100-d688-4051-a687-ec49d05e7212 | renok | null | 110 | 6 | 0 | 0 | 2020-10-10
bab9ceb9-2770-49ea-8489-77e5d763a223 | Lydia_C | test user2 | 515 | 1301 | 1852 | 1677 | 2020-10-10
4649077a-9188-4821-a1ec-3b38608ea44a | Kingston_Sav | null | 2730 | 1087 | 1082 | 1339 | 2020-10-10
eef80836-e140-4adc-9598-8b612ab1825b | TP_s | null | 1835 | 998 | 956 | 1832 | 2020-10-10
fd3ff8c7-0994-40b6-abe0-915368ab9ae5 | DKSnr4 | null | 580 | 268 | 705 | 703 | 2020-10-10
3fa34100-d688-4051-a687-ec49d05e7312 | renok | null | 119 | 6 | 0 | 0 | 2020-10-12
bab9ceb9-2770-49ea-8489-77e5d763a224 | Lydia_C | test user2 | 516 | 1301 | 1852 | 1687 | 2020-10-12
4649077a-9188-4821-a1ec-3b38608ea44B | Kingston_Sav | null | 2737 | 1090 | 1084 | 1342 | 2020-10-12
eef80836-e140-4adc-9598-8b612ae1835c | TP_s | null | 1833 | 998 | 957 | 1837 | 2020-10-12
fd3ff8c7-0994-40b6-abe0-915368ab7ab5 | DKSnr4 | null | 570 | 268 | 700 | 703 | 2020-10-12
I intend to get the biggest difference between the most recent date and the next most recent date for each unique username and the find the username with the largest margin (difference) for example..In the above table the most recent date is
2020-10-12
and the next most recent date is 2020-10-10
.
So I want to get something like this
id | username | person | tweets | followers | following | likes | created_at | prev_followers | gain
:----------------------------------- | :----------- | :----- | -----: | --------: | --------: | ----: | :--------- | -------------: | ---:
4649077a-9188-4821-a1ec-3b38608ea44a | Kingston_Sav | null | 2737 | 1090 | 1084 | 1342 | 2020-10-12 | 1087 | 3
basebandit
(13 rep)
Oct 29, 2020, 09:02 AM
• Last activity: Nov 2, 2020, 07:31 AM
0
votes
1
answers
184
views
Create a temp table that consolidates start and end dates plus generates a uniqueid for each consolidated record
I have a table called `projecthistory`, it is actually a view so I could take advantage of the `Rank` function to get a unique record number. From this view I need to create a temp table that has a `uniqueid` field which would be calculated bases on these rules: 1. Rule 1 - The next sequential recor...
I have a table called
projecthistory
,
it is actually a view so I could take advantage of the Rank
function to get a unique record number.
From this view I need to create a temp table that has a uniqueid
field which would be calculated bases on these rules:
1. Rule 1 - The next sequential record for the id changes from one
project to another or
2. Rule 2 - If the id and project match for the next sequential record
, but the days between the end date of the current record and start
date of the next record for the is more than 7 days.
It would also consolidate the start and end dates appropriately for each group.
I believe this would be best done in a stored procedure that can call the data when needed, but I'm fairly new to stored procedures.
Here is an example of the table data, as well as the desired output
Projecthistory
╔════════╦══════╦═════════╦═══════════╦═══════════╗
║ Record ║ Id ║ Project ║ Start ║ End ║
╠════════╬══════╬═════════╬═══════════╬═══════════╣
║ 1 ║ 1111 ║ A ║ 1/1/2016 ║ 1/7/2016 ║
║ 2 ║ 1111 ║ A ║ 1/10/2016 ║ 1/20/2016 ║
║ 3 ║ 1111 ║ B ║ 1/21/2016 ║ 1/25/2016 ║
║ 4 ║ 1111 ║ A ║ 1/26/2016 ║ NULL ║
║ 5 ║ 2222 ║ A ║ 1/1/2016 ║ 2/1/2016 ║
║ 6 ║ 2222 ║ B ║ 2/2/2016 ║ NULL ║
║ 7 ║ 3333 ║ D ║ 1/15/2016 ║ 2/1/2016 ║
║ 8 ║ 4444 ║ B ║ 2/10/2016 ║ NULL ║
║ 9 ║ 5555 ║ A ║ 2/1/2016 ║ 2/5/2016 ║
║ 10 ║ 5555 ║ A ║ 2/20/2016 ║ 2/27/2016 ║
║ 11 ║ 5555 ║ A ║ 2/28/2016 ║ 3/10/2016 ║
║ 12 ║ 5555 ║ A ║ 3/11/2016 ║ NULL ║
╚════════╩══════╩═════════╩═══════════╩═══════════╝
Desired output
╔════════╦══════╦═════════╦══════════════╦════════════╗
║ unique ║ Id ║ Project ║ ProjectStart ║ ProjectEnd ║
╠════════╬══════╬═════════╬══════════════╬════════════╣
║ 1111A1 ║ 1111 ║ A ║ 1/1/2016 ║ 1/20/2016 ║
║ 1111B1 ║ 1111 ║ B ║ 1/21/2016 ║ 1/25/2016 ║
║ 1111A2 ║ 1111 ║ A ║ 1/26/2016 ║ NULL ║
║ 2222A1 ║ 2222 ║ A ║ 1/1/2016 ║ 2/1/2016 ║
║ 2222B1 ║ 2222 ║ B ║ 2/2/2016 ║ NULL ║
║ 3333D1 ║ 3333 ║ D ║ 1/15/2016 ║ 2/1/2016 ║
║ 4444B1 ║ 4444 ║ B ║ 2/10/2016 ║ NULL ║
║ 5555A1 ║ 5555 ║ A ║ 2/1/2016 ║ 2/5/2016 ║
║ 5555A2 ║ 5555 ║ A ║ 2/20/2016 ║ Null ║
╚════════╩══════╩═════════╩══════════════╩════════════╝
Here are some notes about the desired output and some of
more difficult scenarios in this data.
Take note that records 1-2 became 1111A1
,
record 3 becomes 1111b1
,
record 4 became the second instance of that project for id 1111 so its uniqueid
became 1111A2
, be careful this doesn’t get lumped
in with records 1&2.
Record 9 become 55555A1
and because of the date gap
between records 9&10, record 10-12 became uniqueid
5555A2
I look forward to any solutions, feedback or questions. I’m
truly grateful in advance.
Josh
(11 rep)
Mar 9, 2016, 08:13 PM
• Last activity: Aug 29, 2020, 10:04 AM
Showing page 1 of 20 total questions