
Database Administrators

Q&A for database professionals who wish to improve their database skills

Latest Questions

2 votes
1 answer
2256 views
Having a subquery somehow do the same group by as the outer query
I'm pretty decent with SQL, but it's not my main area of expertise. A colleague asked me a stumper of a question the other day, and I was wondering if you could provide assistance. Basically, we're doing an overall count and grouping those counts by day, but we'd also like to subquery those counts to get a count of a specific type. This is what I have so far, which is not quite right:

select d.start_date, count(a.status),
       ( select count(checked_ins) as checked_ins
         from ( select distinct(a.id) as checked_ins
                from schema1.employee_se_assignment a, schema2.se d
                where a.se_id = d.id
                  and a.status not in ('unassigned')
                  and d.customer_name in ('Test Client')
                  and a.status = 'checked_in' ) src )
from schema1.employee_se_assignment a, schema2.se d
where a.se_id = d.id
  and a.status not in ('unassigned')
  and d.customer_name in ('Test Client')
  and a.status = 'checked_in'
group by d.start_date
order by d.start_date

That yields the results shown in the original post's screenshot, which is almost what he wants, but it's clearly yielding the total of all rows of a certain status when we want it broken down by day.
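One common fix, sketched here as an assumption rather than the poster's actual solution: a conditional aggregate produces the per-day checked_in count inside the same GROUP BY, so the correlated subquery (and the duplicated status filter) can go away.

-- A sketch: total rows per day plus a per-day conditional count. Assumes
-- the "a.status = 'checked_in'" filter was meant only for the sub-count.
select d.start_date,
       count(a.status) as total,
       count(case when a.status = 'checked_in' then 1 end) as checked_ins
from schema1.employee_se_assignment a
join schema2.se d on a.se_id = d.id
where a.status not in ('unassigned')
  and d.customer_name in ('Test Client')
group by d.start_date
order by d.start_date;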
die Wolken (121 rep)
May 20, 2019, 09:47 AM • Last activity: Aug 4, 2025, 07:05 PM
0 votes
1 answer
1035 views
Detect duplicate entries with a Left Join, returning the first value and the number of duplicate entries
Hi! I have a case with two tables that can have a lot of matching entries, and the fetch time increases a lot. The tables look like this:

Table A: Employees
-------------------------
| Name   | ID | Account |
-------------------------
| Nicole | 01 | 12345   |
| Alexis | 02 | 67890   |
-------------------------

And Table B: BankAccounts
--------------------------
| Name   | ID | Account |
--------------------------
| Nicole | 01 | 12345   |
| Nicole | 01 | 67890   |  -- duplicate account
| Alexis | 02 | 67890   |  -- duplicate account
--------------------------

I want to do this with a LEFT JOIN on a table that can have more than 450,000 different entries.

Result Table C:
Column_A = does the account number exist in another register?
Column_B = if (NumberOfMatches > 1) -- this means the account was found for another user, and I want to get the first value of all possible matches

-----------------------------------------------------------------------------
| Name   | ID | Account | Column_A | NumberOfMatches | Column_B     | BadID |
-----------------------------------------------------------------------------
| Nicole | 01 | 12345   | No       | 1               | Nicole (OK)  | null  |
| Alexis | 02 | 67890   | Yes      | 2               | Nicole (BAD) | 01    |
-----------------------------------------------------------------------------

Thanks and regards! Note: sorry for my English, I'm learning :p
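A minimal sketch of one way to tame the row explosion (my suggestion, not from the post): pre-aggregate BankAccounts per account so the LEFT JOIN matches at most one row; MIN() stands in for "first", which is otherwise undefined without an ordering column.

-- Sketch: one aggregated row per account, then a single LEFT JOIN.
SELECT e.Name, e.ID, e.Account,
       CASE WHEN b.matches > 1 THEN 'Yes' ELSE 'No' END AS Column_A,
       b.matches AS NumberOfMatches,
       b.first_name AS Column_B,
       CASE WHEN b.matches > 1 THEN b.first_id END AS BadID
FROM Employees e
LEFT JOIN (SELECT Account,
                  COUNT(*) AS matches,
                  MIN(Name) AS first_name,  -- "first" approximated as MIN
                  MIN(ID)   AS first_id
           FROM BankAccounts
           GROUP BY Account) b ON b.Account = e.Account;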
user186910
Feb 20, 2020, 05:38 AM • Last activity: Aug 1, 2025, 04:06 AM
0 votes
1 answer
2580 views
PostgreSQL error: "subquery uses ungrouped column"
I have a query with subqueries that use only elements from the GROUP BY clause:
SELECT 
		DATE(cm.created_at) as "date", 
		COUNT(1) as total_message,
		room_id,
		(SELECT seller_id FROM chat_room cr WHERE cr.id = cm.room_id LIMIT 1) seller_id,
		(SELECT count(1) FROM "order" o WHERE o.room_id = cm.room_id) total_order,
		(SELECT sum(total_amount) FROM "order" o WHERE o.room_id = cm.room_id and DATE(o.created_at) = date(cm.created_at) )  total_order_amount
	FROM chat_message cm
	GROUP BY DATE(cm.created_at), cm.room_id
	ORDER BY DATE(cm.created_at)
Error:

ERROR: subquery uses ungrouped column "cm.created_at" from outer query

This part of the query causes the error:

(SELECT sum(total_amount) FROM "order" o WHERE o.room_id = cm.room_id and DATE(o.created_at) = date(cm.created_at)) total_order_amount

How can I solve this?
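One standard workaround, offered as a sketch rather than the definitive answer: aggregate "order" in a derived table keyed by room and day, so no scalar subquery has to reference the ungrouped cm.created_at (the other two subqueries from the question are elided for brevity).

-- Sketch: join a pre-grouped "order" aggregate on (room_id, day).
SELECT DATE(cm.created_at) AS "date",
       COUNT(1) AS total_message,
       cm.room_id,
       o.total_order_amount
FROM chat_message cm
LEFT JOIN (SELECT room_id,
                  DATE(created_at) AS order_date,
                  SUM(total_amount) AS total_order_amount
           FROM "order"
           GROUP BY room_id, DATE(created_at)) o
       ON o.room_id = cm.room_id
      AND o.order_date = DATE(cm.created_at)
GROUP BY DATE(cm.created_at), cm.room_id, o.total_order_amount
ORDER BY DATE(cm.created_at);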
yozawiratama (111 rep)
Apr 15, 2020, 03:43 AM • Last activity: Jul 31, 2025, 03:00 AM
0 votes
2 answers
335 views
Select Sum from two joined tables
There are these structures:

CREATE TABLE invoices (
  id int(10) unsigned NOT NULL AUTO_INCREMENT,
  date date NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

INSERT INTO invoices VALUES (1,'2018-09-22');

CREATE TABLE products (
  id int(10) unsigned NOT NULL AUTO_INCREMENT,
  invoice_id int(10) unsigned NOT NULL,
  amount decimal(10,2) unsigned NOT NULL,
  quantity smallint(5) unsigned NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

INSERT INTO products VALUES (1,1,150.00,2),(2,1,60.00,3),(3,1,50.00,1);

CREATE TABLE payments (
  id int(10) unsigned NOT NULL AUTO_INCREMENT,
  invoice_id int(10) unsigned NOT NULL,
  amount decimal(10,2) unsigned NOT NULL,
  date date NOT NULL,
  PRIMARY KEY (id)
) ENGINE=InnoDB;

INSERT INTO payments VALUES (1,1,400.00,'2018-09-23'),(2,1,80.00,'2018-09-23');

I have this query:

select i.id,
       sum(pr.amount * pr.quantity) as productAmount,
       sum(pm.amount) as paymentAmount
from invoices as i
left join products as pr on pr.invoice_id = i.id
left join payments as pm on pm.invoice_id = i.id
group by i.id

and it gives this result:

+----+---------------+---------------+
| id | productAmount | paymentAmount |
+----+---------------+---------------+
|  1 |       1060.00 |       1440.00 |
+----+---------------+---------------+
1 row in set (0,00 sec)

However, I want to get the following result:

+----+---------------+---------------+
| id | productAmount | paymentAmount |
+----+---------------+---------------+
|  1 |        530.00 |        480.00 |
+----+---------------+---------------+
1 row in set (0,00 sec)

I want the sum of product amounts and the sum of payment amounts grouped by invoices.id. What should the query be in this case?
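The doubled figures come from the two one-to-many joins multiplying rows (3 product rows times 2 payment rows per invoice). A sketch of the usual fix, aggregating each child table before joining:

-- Sketch: pre-aggregate per invoice so rows are never multiplied.
select i.id,
       pr.productAmount,
       pm.paymentAmount
from invoices as i
left join (select invoice_id, sum(amount * quantity) as productAmount
           from products
           group by invoice_id) as pr on pr.invoice_id = i.id
left join (select invoice_id, sum(amount) as paymentAmount
           from payments
           group by invoice_id) as pm on pm.invoice_id = i.id;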
abdulmanov.ilmir (101 rep)
Sep 24, 2018, 06:58 AM • Last activity: Jul 17, 2025, 09:08 AM
0 votes
1 answer
170 views
Select records and related entity count
Given the following select that counts related entities for every record:

SELECT MAIN.*, SUB.app_count
FROM my_table_view MAIN
LEFT JOIN (SELECT MAIN2.p_key, count(DISTINCT T3.app_id) AS app_count
           FROM my_table MAIN2
           JOIN T1 ...
           LEFT JOIN T2 ...
           LEFT JOIN T3 ...
           GROUP BY MAIN2.p_key) SUB ON MAIN.p_key = SUB.p_key
WHERE ...

Trying to simplify it, I came up with the following:

SELECT MAIN.p_key, count(DISTINCT T3.app_id) AS app_count -- select MAIN.* errors out: missing GROUP BY
FROM my_table_view MAIN
LEFT JOIN T1 ON ...
LEFT JOIN T2 ON ...
LEFT JOIN T3 ON ...
WHERE ...
GROUP BY MAIN.p_key

How can I select all data from the MAIN table without adding a bunch of GROUP BY columns? I would also appreciate performance hints.

### Edit

I think this is related: https://dba.stackexchange.com/questions/195104/postgres-group-by-id-works-on-table-directly-but-not-on-identical-view
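A side note that may explain the error, shown with hypothetical tables: PostgreSQL allows selecting all columns of a table once its primary key appears in GROUP BY, but that functional-dependency inference does not extend through a view, which is why select MAIN.* fails against my_table_view (see the linked question above).

-- Minimal demo of the functional-dependency rule (hypothetical tables).
CREATE TABLE app (p_key int PRIMARY KEY, name text, owner text);
CREATE TABLE hit (id serial PRIMARY KEY, app_key int REFERENCES app (p_key));

SELECT a.*, count(h.id) AS hit_count  -- legal: a.p_key is grouped
FROM app a
LEFT JOIN hit h ON h.app_key = a.p_key
GROUP BY a.p_key;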
Bax (313 rep)
Jun 30, 2020, 05:13 AM • Last activity: Jul 6, 2025, 05:01 PM
0 votes
1 answer
190 views
Check for existing matches to find the field they're grouped by
So I have a table of values associated with different sessions via a field called sessionid; the table has three other fields: itemcode, itemcost, and itemeffect. What I currently do is get a standard set of items (rows) from another table that contains the defaults for each country, and then store them in this session table with a GUID in the sessionid field. The user can then modify the items for their session only.

What I want to do instead is search for an existing session that has the same items and respective values, to avoid session repetition, as the processing later in the application is expensive (reports etc.). A match would return the sessionid; otherwise I will stash the new session and items for repeated use.

I'm struggling to find a solution on Stack Exchange (I'm working off [this q/a](https://dba.stackexchange.com/questions/72641/checking-whether-two-tables-have-identical-content-in-postgresql)). I've realised EXISTS and UNION won't work because I don't know the sessionid, and my user's session GUID won't match (hence the question to begin with). I'm thinking of something involving GROUP BY sessionid, then the world's longest SQL query (or LINQ madness) with a check for each item and item field, and additionally a check that no other (unmatched) items exist in matching sessions; in theory there will only ever be one match or none!

Here's my pseudo-code:

useritems = getCountryDefaultItems();
var existingid = (
    select grpresult
    from (
        select item
        from db.sessionints
        group by item.sessionid
        where item in useritems (ignore sessionid)
    ) g
    where g.count == useritems.count
).first().sessionid;

I've had a go, and I believe it should be a FULL OUTER JOIN, so I hear, but SQL Fiddle says no, so I played with left + right + inner joins. SQL Fiddle is having errors with MSSQL schemas today, so I've used MySQL for now, which doesn't support FULL OUTER JOIN on SQL Fiddle:

SELECT COUNT(1), a.SessionID
FROM sessionitems a
RIGHT JOIN useritems b -- should be FULL OUTER JOIN?
USING (SessionID, InterventionCode, Included, IntLvl, Uptake Baseline Smokers, RelativeEffect, Cost, IntType, Baseline, Current, Alternative, UpPopulation, EffPopulation, Name)
WHERE (a.InterventionCode IS NULL OR b.InterventionCode IS NULL)
  AND (a.Included IS NULL OR b.Included IS NULL)
  AND (a.IntLvl IS NULL OR b.IntLvl IS NULL)
  AND (a.Uptake Baseline Smokers IS NULL OR b.Uptake Baseline Smokers IS NULL)
  AND (a.RelativeEffect IS NULL OR b.RelativeEffect IS NULL)
  AND (a.Cost IS NULL OR b.Cost IS NULL)
  AND (a.IntType IS NULL OR b.IntType IS NULL)
  AND (a.Baseline IS NULL OR b.Baseline IS NULL)
  AND (a.Current IS NULL OR b.Current IS NULL)
  AND (a.Alternative IS NULL OR b.Alternative IS NULL)
  AND (a.UpPopulation IS NULL OR b.UpPopulation IS NULL)
  AND (a.EffPopulation IS NULL OR b.EffPopulation IS NULL)
  AND (a.Name IS NULL OR b.Name IS NULL)
GROUP BY a.SessionID;

I can get a count of exactly matching rows when there's an exact match, but the id comes back null, and if I remove the COUNT(1) and return * it returns only one row (is this a restriction on results in nested select queries?). That might be okay, as the session is a match, but I'm worried I have overlooked something and don't appreciate where the query will fall down...
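For what it's worth, a sketch of the relational-division shape this usually takes, assuming useritems holds the user's candidate rows and no session contains duplicate item rows; only two of the compared fields are spelled out here.

-- Sketch: a session matches when every joined row agrees on the item
-- fields and both sides have the same number of rows.
SELECT s.SessionID
FROM sessionitems s
JOIN useritems u
  ON u.InterventionCode = s.InterventionCode
 AND u.Cost = s.Cost  -- ...repeat for the remaining item fields
GROUP BY s.SessionID
HAVING COUNT(*) = (SELECT COUNT(*) FROM useritems)
   AND COUNT(*) = (SELECT COUNT(*)
                   FROM sessionitems si
                   WHERE si.SessionID = s.SessionID);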
Tyeth (101 rep)
Sep 18, 2016, 12:52 PM • Last activity: Jun 28, 2025, 02:00 AM
2 votes
1 answer
99 views
Ignoring trailing spaces when grouping by varchar type in Redshift
I created a table in Redshift and entered data as below.
create table public.temp (
  id int,
  name varchar(20)
);

insert into public.temp (id, name) values (1, 'bob'), (2, 'bob  ');
And when counting based on the name column, it returns ('bob', 2).
select name, count(1) from public.temp group by name;

Result: ('bob', 2);
When grouping by a varchar column, Redshift seems to ignore trailing spaces. So we ran another test.
select t, count(1)
from (
  select 'a' as t
  union all 
  select 'a  ' as t
) as a
group by t;
Surprisingly, it returns the result including the trailing spaces.
('a', 1), ('a  ', 1)
What is the difference between the two query execution methods and why do I get these results?
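No explanation of the engine internals here, but one hedged workaround: add the stored length to the grouping key, so values differing only in trailing spaces land in different groups.

-- Sketch: LEN() separates 'bob' from 'bob  ' even when the comparison
-- semantics treat trailing spaces as insignificant.
select name, len(name) as name_len, count(1)
from public.temp
group by name, len(name);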
임승현Seunghyun Lim (21 rep)
Jun 23, 2025, 02:08 AM • Last activity: Jun 23, 2025, 08:05 AM
1 vote
1 answer
219 views
Why does Redshift ignore length when grouping by VARCHAR of only spaces?
Imagine I have a column name with n rows, where the nth entry equals REPEAT(' ', n). For SELECT DISTINCT name FROM table I would expect to receive n rows, because each name is unique by definition. For dialects like BigQuery and Snowflake, I do receive this. However, for Redshift I only receive one row. Why is this?
Tanner Clary (11 rep)
Jan 29, 2024, 07:01 PM • Last activity: Jun 22, 2025, 03:03 PM
2 votes
1 answer
199 views
MySQL GROUP BY on a join of two non-indexable derived tables
Working with MySQL 5.7. I've got a query that joins two derived tables and groups on one of the columns. The query without the grouping runs pretty fast: ~0.5 seconds, returning 15K rows (in production I would expect that to possibly hit 100K+ rows). When I group this query, it really kills the performance, resulting in a query that takes 10x longer.

I have tried to reorganize the query to push the GROUP BY logic to an earlier step; logically that doesn't work. I've also tried creating a temp table to hold the query's results: the query takes 0.5 seconds to run, but inserting into the temp table takes ~5 seconds (and I'm assuming that's exactly what is happening with the original query, given what EXPLAIN says). I've also tried modifying the session settings for the join and sort buffers, but that doesn't seem to change anything at all.

Does anybody have any advice on what else I could try, or what I have tried that I haven't tried "correctly"? Do I need to do something else besides

SET SESSION sort_buffer_size = 1024*1024*4;
SET SESSION join_buffer_size = 1024*1024*4;

to get those settings to work? They don't seem to do anything at all to my query performance. I've adjusted those values all over the place, from 1024*4 to 1024*1024*10.

Also, I'm a bit miffed as to why it takes ~5 seconds to add 15K rows to a temp table. I've tried the MEMORY engine, adding indexes, etc.; it always takes ~5 seconds.

Here is a very simple boiled-down version of the query that may help get the gist of what I'm doing:

select group_concat(storeID), fileID
from (
    select ugfileToStores.storeID, ugfileToStores.fileID
    from (
        select usergroupID, storeID
        from tableOne
        join (
            select fileID, storeID
            from tableTwo
        ) tableTwo on tableOne.storeID = tableTwo.storeID
    ) ugfileToStores
) fileToStores
group by fileID

Without the group by at the end and the group_concat, the query runs in ~0.5 seconds; with them, it runs in ~5 seconds.
Here is an example of the actual query:

SELECT sql_no_cache group_concat(usergroup0.storeid), filebucket0.filebucketid
FROM (
    SELECT en.id AS usergroupid, st.storeid AS storeid, 1 AS assignmentcount
    FROM entity en
    CROSS JOIN (
        SELECT storeid
        FROM masterentity_attachedstores
        WHERE masterentityid = 156825) st
    WHERE en.id IN ('156830')
) usergroup0
INNER JOIN (
    SELECT maxout.filebucketid, maxout.storeid
    FROM (
        SELECT filebucketid, storeid, entityid
        FROM (
            SELECT stb.id AS filebucketid, AS.storeid AS storeid, 1 AS assignmentcount
            FROM masterentity_attachedstores AS
            JOIN entity en ON AS.storeid = en.id
            JOIN filebucket AS stb
            WHERE en.isdeleted = 0
              AND AS.masterentityid = 156825
              AND stb.id IN (55,40)
            UNION ALL
            SELECT stb.id AS filebucketid, stb_ret_s.id AS storeid, count(stb_ret_a.id) AS assignmentcount
            FROM assignment AS stb_ret_a
            JOIN filebucket AS stb ON stb.id = stb_ret_a.associatedobjectid AND stb.id IN (69,50,68)
            INNER JOIN entity AS stb_ret_e ON stb_ret_e.id = stb_ret_a.assignmentvalue
                AND stb_ret_e.classname = 'Retailer'
                AND stb_ret_a.assignmentsubtype IN ('RetailerID')
            JOIN store stb_ret_s ON stb_ret_s.retailerid = stb_ret_e.id
            WHERE stb_ret_a.associatedobjectclass = 'FileBucket'
              AND stb_ret_a.isdeleted = 0
              AND stb_ret_a.assignmentsubtype IN ('RetailerID')
              AND stb_ret_e.isdeleted = 0
            GROUP BY filebucketid, storeid
            UNION ALL
            SELECT filebucket.id AS filebucketid, stb.storeid AS storeid, 1 AS assignmentcount
            FROM filebucket
            CROSS JOIN (
                SELECT maxout.repid, maxout.storeid
                FROM (
                    SELECT repid, storeid, entityid
                    FROM (
                        SELECT mp.id AS repid, mp_cf_csv.entityid AS storeid, count(mp_cf_a.id) AS assignmentcount
                        FROM assignment AS mp_cf_a
                        JOIN rep AS mp ON mp.id = mp_cf_a.associatedobjectid
                        JOIN customfieldvalue AS mp_cf_csv
                            ON mp_cf_csv.value = REPLACE(REPLACE(substring_index(substring_index(mp_cf_a.assignmentvalue, ',"', -1), ':', -1), '"',''), '}','')
                            AND mp_cf_csv.customfieldid = REPLACE(substring_index(substring_index(mp_cf_a.assignmentvalue, ',', 1), ':', -1), '"','')
                        JOIN entity AS mp_cf_e ON mp_cf_e.id = mp_cf_csv.entityid
                        WHERE mp_cf_a.associatedobjectid IN (7400,7825,7780,7700)
                          AND mp_cf_a.associatedobjectclass = 'Rep'
                          AND mp_cf_a.isdeleted = 0
                          AND mp_cf_a.assignmentsubtype IN ('CustomFieldValue')
                          AND mp_cf_e.isdeleted = 0
                        GROUP BY repid, storeid
                        UNION ALL
                        SELECT mp.id AS repid, mp_ret_s.id AS storeid, count(mp_ret_a.id) AS assignmentcount
                        FROM assignment AS mp_ret_a
                        JOIN rep AS mp ON mp.id = mp_ret_a.associatedobjectid
                        JOIN store AS mp_ret_s ON mp_ret_s.retailerid = mp_ret_a.assignmentvalue
                            AND mp_ret_a.assignmentsubtype IN ('RetailerID')
                        JOIN entity AS mp_ret_e ON mp_ret_e.id = mp_ret_s.id
                        WHERE mp_ret_a.associatedobjectid IN (7700,7400,7780,7825)
                          AND mp_ret_a.associatedobjectclass = 'Rep'
                          AND mp_ret_a.isdeleted = 0
                          AND mp_ret_a.assignmentsubtype IN ('RetailerID')
                          AND mp_ret_e.isdeleted = 0
                        GROUP BY repid, storeid) orouttie
                    JOIN masterentity_attachedstores AS ON orouttie.storeid = AS.storeid AND AS.masterentityid = 156825
                    JOIN (
                        SELECT associatedobjectid, ownerobjectid
                        FROM assignment
                        WHERE associatedobjectid IN (7400,7700,7780,7825)
                          AND associatedobjectclass='Rep'
                        GROUP BY associatedobjectid) creatorassignment
                        ON creatorassignment.associatedobjectid = orouttie.repid
                    JOIN entityuser creatorentity ON creatorentity.id = creatorassignment.ownerobjectid
                ) maxout
                JOIN asshelper0660552001475295480164_out outcreator
                    ON maxout.storeid = outcreator.storeid AND outcreator.usergroupid = maxout.entityid
            ) stb
            WHERE id IN (60,55,50)
            UNION ALL
            SELECT stb.id AS filebucketid, AS.storeid AS storeid, 1 AS assignmentcount
            FROM masterentity_attachedstores AS
            JOIN entity en ON AS.storeid = en.id
            JOIN filebucket AS stb
            WHERE en.isdeleted = 0
              AND AS.masterentityid = 156825
              AND stb.id IN (40)) orouttie
        JOIN masterentity_attachedstores AS ON orouttie.storeid = AS.storeid AND AS.masterentityid = 156825
        JOIN (
            SELECT associatedobjectid, ownerobjectid
            FROM assignment
            WHERE associatedobjectid IN (40,50,55,60,65,67,68,69)
              AND associatedobjectclass='FileBucket'
            GROUP BY associatedobjectid) creatorassignment
            ON creatorassignment.associatedobjectid = orouttie.filebucketid
        JOIN entityuser creatorentity ON creatorentity.id = creatorassignment.ownerobjectid
    ) maxout
    JOIN asshelper0777657001475295479813_out outcreator
        ON maxout.storeid = outcreator.storeid AND outcreator.usergroupid = maxout.entityid
) filebucket0 ON filebucket0.storeid = usergroup0.storeid
GROUP BY filebucket0.filebucketid
Josh (159 rep)
Oct 1, 2016, 04:05 PM • Last activity: Jun 22, 2025, 11:08 AM
0 votes
2 answers
212 views
Selecting last value to be entered each month
I am looking to pull a closing balance from the database for each month. I have tried:

SELECT CloseBal As 'Balance', MONTHNAME(DateTime) as 'Month', DateTime
FROM Table1
WHERE MAX(DateTime)
Group By Month

I am getting the error "invalid use of grouping function". What would be the best way to achieve this?
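A sketch of the usual greatest-row-per-group rewrite (my reading of the intent; the inner grouping uses year and month, since MONTHNAME alone would merge the same month across years):

-- Sketch: join each row back to its month's latest DateTime.
SELECT t.CloseBal AS Balance,
       MONTHNAME(t.DateTime) AS `Month`,
       t.DateTime
FROM Table1 t
JOIN (SELECT MAX(DateTime) AS last_dt
      FROM Table1
      GROUP BY YEAR(DateTime), MONTH(DateTime)) m
  ON t.DateTime = m.last_dt;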
Paulmcf1987 (43 rep)
Feb 3, 2023, 04:05 PM • Last activity: Jun 17, 2025, 03:06 PM
3 votes
1 answer
150 views
Merge two queries into one, involving aggregation
I have these two queries in Postgres. One is:
SELECT _id, created_at
FROM pedidos
WHERE _id = '123abc';
Works fine, returns:
{
    "_id": "123abc",
    "created_at": "Sun Jun 08 2025 10:28:23 GMT-0300 (Brasilia Standard Time)",
  }
The other:
SELECT i.id, i.objeto
FROM pedidos, jsonb_array_elements(objeto) WITH ORDINALITY i(objeto, id)
WHERE _id = '123abc';
Also works fine, returns:
[
        {
            "id": "1",
            "objeto": {
                "peso": 76,
                "valor": 1290,
                "altura": 1001,
                "largura": 211,
                "comprimento": 298
            }
        },
        {
            "id": "2",
            "objeto": {
                "peso": 77,
                "valor": 1291,
                "altura": 1002,
                "largura": 212,
                "comprimento": 299
            }
        }
  ]
I can run both separately and merge to obtain:
{
    "_id": "123abc",
    "created_at": "Sun Jun 08 2025 10:28:23 GMT-0300 (Brasilia Standard Time)",
    "objetos": [
        {
            "id": "1",
            "objeto": {
                "peso": 76,
                "valor": 1290,
                "altura": 1001,
                "largura": 211,
                "comprimento": 298
            }
        },
        {
            "id": "2",
            "objeto": {
                "peso": 77,
                "valor": 1291,
                "altura": 1002,
                "largura": 212,
                "comprimento": 299
            }
        }
    ]
  }
Is it possible to merge the two queries into one? I tried INNER JOIN but that returned: > i._id not found
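One possible single-query shape, sketched under the assumption that the array order should follow the ordinality: unnest, rebuild each element, and aggregate with jsonb_agg while grouping on the order's columns.

-- Sketch: jsonb_agg folds the per-element rows back into one array.
SELECT p._id,
       p.created_at,
       jsonb_agg(jsonb_build_object('id', i.id, 'objeto', i.objeto)
                 ORDER BY i.id) AS objetos
FROM pedidos p,
     jsonb_array_elements(p.objeto) WITH ORDINALITY i(objeto, id)
WHERE p._id = '123abc'
GROUP BY p._id, p.created_at;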
flourigh (145 rep)
Jun 8, 2025, 03:07 PM • Last activity: Jun 11, 2025, 09:45 AM
0 votes
2 answers
451 views
postgresql cumulative counts in date range
I'm trying to get the cumulative count of rows (by group_id) between two dates that represent a time period where the row was active. I have a table like this:
group_id |   id   | type | start_date |  end_date
----------+--------+------+------------+------------
       33 | 119435 | AAA  | 2013-05-21 | 2014-05-19
       33 |  15144 | AAA  | 2013-05-21 | 2015-05-18
       33 |  29393 | AAA  | 2013-05-21 | 2016-05-23
       33 | 119437 | AAA  | 2013-05-21 | 2017-05-15
       33 |  62380 | AAA  | 2013-05-21 | 2099-12-31
       33 | 119436 | AAA  | 2013-05-21 | 2099-12-31
       33 |  27346 | AAA  | 2013-05-21 | 2099-12-31
       33 |  28529 | AAA  | 2014-05-20 | 2099-12-31
       33 | 221576 | AAA  | 2015-05-19 | 2099-12-31
       33 | 253893 | AAA  | 2016-05-24 | 2099-12-31
       33 | 251589 | AAA  | 2017-05-16 | 2099-12-31
       33 | 285245 | AAA  | 2019-01-24 | 2099-12-31
       34 | 253893 | AAA  | 2016-05-24 | 2099-12-31
       34 | 251589 | AAA  | 2017-05-16 | 2099-12-31
       34 | 285245 | AAA  | 2019-01-24 | 2099-12-31
       34 | 285246 | AAA  | 2019-05-31 | 2099-12-31
... and I need to get active counts for each of those date ranges like this:
group_id | start_date |  end_date  | active
----------+------------+------------+--------
       33 | 2013-05-21 | 2014-05-19 |      7
       33 | 2013-05-21 | 2015-05-18 |      8
       33 | 2013-05-21 | 2016-05-23 |      9
       33 | 2013-05-21 | 2017-05-15 |     10
       33 | 2013-05-21 | 2099-12-31 |     12
       33 | 2013-05-21 | 2099-12-31 |     12
       33 | 2013-05-21 | 2099-12-31 |     12
       33 | 2014-05-20 | 2099-12-31 |     11
       33 | 2015-05-19 | 2099-12-31 |     10
       33 | 2016-05-24 | 2099-12-31 |      9
       33 | 2017-05-16 | 2099-12-31 |      8
       33 | 2019-01-24 | 2099-12-31 |      8
       34 | 2016-05-24 | 2099-12-31 |      1
       34 | 2017-05-16 | 2099-12-31 |      2
       34 | 2019-01-24 | 2099-12-31 |      3
       34 | 2019-05-31 | 2099-12-31 |      4
I've tried various combinations of LAG and LEAD, with and without CTEs, but cannot come up with a solution. Is there a way to do this in a single query? If not a single query, perhaps a combination of queries in a UDF?

**UPDATE**

Per @jjanes's comment below, I believe my source table is set up incorrectly. I think I should create the source table like this instead:
group_id |   id   | type | start_date |  end_date
----------+--------+------+------------+------------
  ... (skipped group 33) ...
       34 | 253893 | AAA  | 2016-05-24 | 2017-05-15
       34 | 253893 | AAA  | 2017-05-16 | 2019-01-23
       34 | 253893 | AAA  | 2019-01-24 | 2019-05-30
       34 | 253893 | AAA  | 2019-05-31 | 2099-12-31
       34 | 251589 | AAA  | 2017-05-16 | 2019-01-23
       34 | 251589 | AAA  | 2019-01-24 | 2019-05-30
       34 | 251589 | AAA  | 2019-05-31 | 2099-12-31
       34 | 285245 | AAA  | 2019-01-24 | 2019-05-30
       34 | 285245 | AAA  | 2019-05-31 | 2099-12-31
       34 | 285246 | AAA  | 2019-05-31 | 2099-12-31
With that change in the source data, the outcome of actives (showing only group 34 here) would be like this:
group_id | start_date |  end_date  | active
----------+------------+------------+--------
       34 | 2016-05-24 | 2017-05-15 |      1
       34 | 2017-05-16 | 2019-01-23 |      2
       34 | 2019-01-24 | 2019-05-30 |      3
       34 | 2019-05-31 | 2099-12-31 |      4
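One interpretation, sketched against the group 34 figures above (mytable is a stand-in name): for each row, count the rows of the same group whose interval covers that row's start_date.

-- Sketch: rows of the same group active on each row's start_date.
SELECT t.group_id, t.start_date, t.end_date,
       (SELECT count(*)
        FROM mytable x
        WHERE x.group_id = t.group_id
          AND x.start_date <= t.start_date
          AND x.end_date >= t.start_date) AS active
FROM mytable t
ORDER BY t.group_id, t.start_date;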
jacaetevha (1 rep)
Mar 21, 2020, 08:20 PM • Last activity: Jun 9, 2025, 09:01 AM
1 vote
2 answers
240 views
How to order by COUNT(*) of unique entries in GROUP BY?
I search through a tag map as:

SELECT ArticleID, COUNT(*) AS C
FROM WordMap
WHERE WordID IN (
    SELECT WordID FROM Words WHERE Word IN ('word1', 'word2')
)
GROUP BY ArticleID
ORDER BY C DESC

I want to get the article IDs in which all words (here word1 and word2) appear first. The above query orders by the raw number of occurrences (e.g., word1 alone might appear 100 times).

NOTE: the WordMap table is huge, and this is the reason I went for a subquery rather than a JOIN.
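A sketch of one common answer (my suggestion): sort first by how many distinct searched words each article matched, so articles containing every word come first, then by raw frequency.

-- Sketch: full matches sort ahead of partial matches.
SELECT ArticleID,
       COUNT(DISTINCT WordID) AS words_matched,
       COUNT(*) AS occurrences
FROM WordMap
WHERE WordID IN (SELECT WordID FROM Words WHERE Word IN ('word1', 'word2'))
GROUP BY ArticleID
ORDER BY words_matched DESC, occurrences DESC;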
Googlebot (4551 rep)
Oct 9, 2020, 07:28 AM • Last activity: Jun 8, 2025, 03:09 PM
7 votes
4 answers
45457 views
Use column alias in GROUP BY
I have a query where I concatenate two columns. Why does the alias **not** work in the GROUP BY clause, when it does work in the ORDER BY clause? How can I write the select correctly?

SELECT KOS_VER_ID AS "Vertrag"
     , WHR_ISO_3_CODE AS "Waehrung"
     , KOS_KOA_ST_KZN || ' - ' || ST_LANGBEZ_EN as "Kostenart"
     , SUM (KOS_BETRAG) AS "Summe pro KOA"
FROM KOSTEN
   , WAEHRUNG
   , SCHLUESSELTABELLE
WHERE KOSTEN.KOS_VERHI_WHR_ID = WAEHRUNG.WHR_ISO_ID
  AND KOSTEN.KOS_KOA_ST_KZN = SCHLUESSELTABELLE.ST_ID
  AND KOS_VER_ID in (2509, 2510, 2511)
GROUP BY KOS_VER_ID
       , WHR_ISO_3_CODE
       , KOS_KOA_ST_KZN || ' - ' || ST_LANGBEZ_EN
ORDER BY "Vertrag"
       , "Kostenart";
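The usual explanation is evaluation order: GROUP BY is processed before the SELECT list, so the alias does not exist yet, while ORDER BY runs afterwards and can see it. A sketch of the inline-view workaround:

-- Sketch: define the aliases in an inline view, then group on them.
SELECT "Vertrag", "Waehrung", "Kostenart", SUM(betrag) AS "Summe pro KOA"
FROM (SELECT KOS_VER_ID AS "Vertrag",
             WHR_ISO_3_CODE AS "Waehrung",
             KOS_KOA_ST_KZN || ' - ' || ST_LANGBEZ_EN AS "Kostenart",
             KOS_BETRAG AS betrag
      FROM KOSTEN, WAEHRUNG, SCHLUESSELTABELLE
      WHERE KOSTEN.KOS_VERHI_WHR_ID = WAEHRUNG.WHR_ISO_ID
        AND KOSTEN.KOS_KOA_ST_KZN = SCHLUESSELTABELLE.ST_ID
        AND KOS_VER_ID IN (2509, 2510, 2511))
GROUP BY "Vertrag", "Waehrung", "Kostenart"
ORDER BY "Vertrag", "Kostenart";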
Konrad (71 rep)
Jun 19, 2015, 07:45 AM • Last activity: Jun 3, 2025, 11:59 AM
1 vote
1 answer
276 views
Postgresql group by for many columns
I have this query:
SELECT "Movie"."title", 
       "Movie"."id", 
	   "Movie"."publicationDate", 
	   "Movie"."authorID", 
	   "Author"."photoURL" as "authorPhoto", 
	   concat("Author"."name", ' ', "Author"."surname") as "authorName",
	   "Movie"."seriesID", 
	   "Series"."name" as "seriesName", 
	   "Series"."logoURL" as "seriesLogo", 
	   "Movie"."languageID" as "language", 
	   "Movie"."ratingUp", 
	   "Movie"."ratingDown", 
	   "userWatched"."watchedDate", 
	   CASE WHEN "UserMovieFavourited"."userID" = '6F9BD058-853F-4B50-92D6-9170118A2727' 
	        THEN TRUE 
			ELSE FALSE 
			END as "favourited", 
	   CASE WHEN "UserMovieWatchlist"."userID" = '6F9BD058-853F-4B50-92D6-9170118A2727' 
	        THEN TRUE 
			ELSE FALSE 
			END as "onWatchlist",
       CASE WHEN "userWatched"."userID" = '6F9BD058-853F-4B50-92D6-9170118A2727' AND "userWatched"."watchedDate" IS NOT NULL 
	        THEN TRUE 
			ELSE FALSE 
			END as "watched",
			count(DISTINCT "overallWatched"."watchedDate") as "viewCount" 
FROM "Movie" 
LEFT JOIN "Series" ON "Series"."id" = "Movie"."seriesID" 
LEFT JOIN "Author" ON "Author"."id" = "Movie"."authorID" 
LEFT JOIN "UserMovieFavourited" ON "UserMovieFavourited"."movieID" = "Movie"."id" 
								AND "UserMovieFavourited"."userID" = '6F9BD058-853F-4B50-92D6-9170118A2727' 
LEFT JOIN "UserMovieWatchlist" ON "UserMovieWatchlist"."movieID" = "Movie"."id" 
							   AND "UserMovieWatchlist"."userID" = '6F9BD058-853F-4B50-92D6-9170118A2727' 
LEFT JOIN "UserMovieWatchProgress" as "userWatched" ON "userWatched"."movieID" = "Movie"."id" 
													AND "userWatched"."userID" = '6F9BD058-853F-4B50-92D6-9170118A2727' 
LEFT JOIN "UserMovieWatchProgress" as "overallWatched" ON "overallWatched"."movieID" = "Movie"."id" 
GROUP BY "Movie"."id", 
		 "Author"."photoURL", 
		 "Author"."name", 
		 "Author"."surname", 
		 "Series"."name", 
		 "Series"."logoURL", 
		 "userWatched"."watchedDate", 
		 "UserMovieFavourited"."userID", 
		 "UserMovieWatchlist"."userID", 
		 "userWatched"."userID"
In MySQL I needed only one grouping column, Movie.id. For PostgreSQL I had to add 10 columns to make this query work, as shown above. Is this natural for PostgreSQL, or should this query be refactored to remove the need for so many grouping columns?
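A small self-contained demonstration, with hypothetical tables, of the PostgreSQL rule that usually shortens such lists: once a table's primary key is in GROUP BY, its remaining columns may be selected without being listed; the rule applies per table whose key is grouped.

-- Sketch: author.name and author.photo_url need not be grouped once
-- author.id (the primary key) is.
CREATE TABLE author (id int PRIMARY KEY, name text, photo_url text);
CREATE TABLE movie (id int PRIMARY KEY, title text,
                    author_id int REFERENCES author (id));

SELECT m.id, m.title, a.name, a.photo_url, count(*) AS cnt
FROM movie m
LEFT JOIN author a ON a.id = m.author_id
GROUP BY m.id, a.id;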
Paweł Madej (123 rep)
Feb 11, 2020, 06:08 PM • Last activity: May 23, 2025, 11:04 AM
0 votes
1 answer
268 views
Keep one row per ID while keeping previous order (from ORDER BY)
I'm currently working on a big query and am stuck at the very end. I'm working on MySQL 5.7.

ID // SubID // Criteria 1 // Criteria 2 // ... // Criteria n
1  // 2     // x // y // ... // z
1  // 3     // x // y // ... // z
1  // 1     // x // y // ... // z
2  // 1     // x // y // ... // z
3  // 1     // x // y // ... // z

I need, for each ID, the best (ID, SubID) pair, based on the existing order, which comes from sorting on criteria 1 to n. I had a look at several solutions, but:

- GROUP BY picks any (ID, SubID) pair arbitrarily and doesn't take the existing order into account
- GROUP BY + ORDER BY is not sufficient, as I need to sort on complex criteria
- I can't recover the SubID info if I use DISTINCT

I have the feeling that the solution is fairly simple, but I'm really stuck and I can't find similar problems/solutions on the Internet (maybe I just need some sleep ;) ). Just to make it clear: I need to keep one row per ID, in the same order as before, and to return at least ID // SubID (but it would be perfect if it could return all the criteria as well). Thank you in advance for your help!

**EDIT**

Ok, so to add a little bit of context here: my criteria are either boolean or float values that I computed in a subquery. It looks like:

SELECT *
FROM (SELECT *,
             CASE (formula) THEN 1 ELSE 0 as criteria_1,
             CASE (formula) THEN 1 ELSE 0 as criteria_2,
             ABS(formula) as criteria_3,
             SQRT(formula) as criteria_4
      FROM subquery
      WHERE criteria_1 = 0
      ORDER BY criteria_2, criteria_3 DESC, criteria_4, value_z)
"Remove (ID, SubID) duplicates while keeping the previous order"

I want to replace the last line (the one between quotes) with something, but GROUP BY ID, SubID just randomly picks any (ID, SubID) value.
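A sketch of the pre-window-function idiom that may apply here (MySQL 5.7): keep a row only if no other row with the same ID sorts ahead of it. ranked stands for the derived table from the edit; since 5.7 cannot reference one derived table twice, it would have to be materialized first, for example as a temporary table, and the negated column emulates the DESC criterion.

-- Sketch: anti-join against any better-sorting row for the same ID.
SELECT t.*
FROM ranked t
LEFT JOIN ranked b
       ON b.ID = t.ID
      AND (b.criteria_2, -b.criteria_3, b.criteria_4) <
          (t.criteria_2, -t.criteria_3, t.criteria_4)
WHERE b.ID IS NULL;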
SaintMark (9 rep)
Jun 20, 2016, 01:26 AM • Last activity: May 19, 2025, 11:04 PM
0 votes
1 answer
255 views
Getting sum result for last year divided quarterly
I am trying to get financial data grouped by each quarter of last year and also grouped by the last three weeks. Also, if it is possible, I want to have the type as a separate column, with the type as the header. This is the query I am using right now, which gives the result for last month:

Select to_char(add_months(last_day(TO_DATE(&DTE,'YYYYMMDD')),-1), 'MM') || '-' ||
       to_char(add_months(last_day(TO_DATE(&DTE,'YYYYMMDD')),-1), 'YYYY') AS Month,
       case when (ap.asset = 2 or AP.ASSET_TYPE not like 'CFTD%' and AP.ASSET_TYPE not like 'CASH%' and AP.ASSET_TYPE != 'FTL') then 'AUM'
            when (AP.ASSET_TYPE like 'CFTD%') then 'Depostis'
            when (AP.ASSET_TYPE like 'CASH%') then 'Cash'
            when (AP.ASSET_TYPE = 'FTL') then 'Loan'
       end as Type,
       sum(ABS(AP.Eval_Market_Value)) as Total
from (select p.account, p.open_date as Open_Date, ac.description as RM, s.*
      FROM k$portfolio p, k$client k
      LEFT JOIN k$client_role cr ON cr.client = k.id AND cr.role = 136
      LEFT JOIN k$actors ac ON cr.actor = ac.id,
      table(gtw$reporting.Statement(p.id, 2, trunc(add_months(last_day(TO_DATE(&DTE,'YYYYMMDD')),-1)), trunc(add_months(last_day(TO_DATE(&DTE,'YYYYMMDD')),-1)), 1, 1002, 1, 'USD')) s
      WHERE s.line_type = 'P'
        and k.id = p.client
        and p.id = s.portfolio
        and p.portfolio_type = 'C'
        and p.status = 1
     ) ap
group by to_char(add_months(last_day(TO_DATE(&DTE,'YYYYMMDD')),-1), 'MM') || '-' ||
         to_char(add_months(last_day(TO_DATE(&DTE,'YYYYMMDD')),-1), 'YYYY'),
         case when (ap.asset = 2 or AP.ASSET_TYPE not like 'CFTD%' and AP.ASSET_TYPE not like 'CASH%' and AP.ASSET_TYPE != 'FTL') then 'AUM'
              when (AP.ASSET_TYPE like 'CFTD%') then 'Depostis'
              when (AP.ASSET_TYPE like 'CASH%') then 'Cash'
              when (AP.ASSET_TYPE = 'FTL') then 'Loan'
         end

(The original post attached screenshots of the current output, the desired result, and the final result.)
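On the quarterly grouping specifically, a hedged sketch (the table and column names are stand-ins for the statement rows above): TRUNC(date_col, 'Q') buckets rows into calendar quarters.

-- Sketch: one total per quarter for the last year of data.
SELECT TO_CHAR(TRUNC(eval_date, 'Q'), 'YYYY-"Q"Q') AS quarter,
       SUM(ABS(eval_market_value)) AS total
FROM statement_rows
WHERE eval_date >= ADD_MONTHS(TRUNC(SYSDATE, 'Q'), -12)
GROUP BY TRUNC(eval_date, 'Q')
ORDER BY TRUNC(eval_date, 'Q');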
Yousef (13 rep)
Feb 1, 2015, 07:43 AM • Last activity: May 18, 2025, 11:06 PM
2 votes
1 answer
57 views
How to group by with similar group_name in sql
How can I perform a GROUP BY in *SQL* when the group_name values are similar but not exactly the same? In my dataset, the group_name values may differ slightly (e.g., "Apple Inc.", "AAPL", "Apple"), but conceptually they refer to the same entity. The similarity might not be obvious or consistent, so I might need to define a custom rule or function like is_similar() to cluster them.

For simple cases, I can extract a common pattern using regex or string functions (e.g., strip suffixes, lowercase, take prefixes). But how should I handle more complex scenarios, like fuzzy or semantic similarity?

Case:

group_name     | val
---------------|-----
'Apple Inc.'   | 100
'AAPL'         |  50
'Apple'        |  30
'Microsoft'    |  80
'MSFT'         |  70

What I want to achieve:

new_group_name  | total_val
----------------|----------
'Apple'         |       180
'Microsoft'     |       150

What are the best approaches to achieve this in *SQL*? And how would I write a query like this:

SELECT some_characteristic(group_name) AS new_group_name,
       SUM(val)
FROM tb1
GROUP BY new_group_name;
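A sketch of the mapping-table approach (my suggestion): fuzzy or semantic matching can populate the alias table offline, while the GROUP BY itself stays exact and indexable.

-- Sketch: an explicit alias table; unmapped names fall back to themselves.
CREATE TABLE name_alias (raw_name varchar(100) PRIMARY KEY,
                         canonical varchar(100) NOT NULL);
INSERT INTO name_alias VALUES
  ('Apple Inc.', 'Apple'), ('AAPL', 'Apple'), ('Apple', 'Apple'),
  ('Microsoft', 'Microsoft'), ('MSFT', 'Microsoft');

SELECT COALESCE(a.canonical, t.group_name) AS new_group_name,
       SUM(t.val) AS total_val
FROM tb1 t
LEFT JOIN name_alias a ON a.raw_name = t.group_name
GROUP BY COALESCE(a.canonical, t.group_name);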
Ahamad (1 rep)
May 14, 2025, 08:59 AM • Last activity: May 15, 2025, 05:31 AM
0 votes
1 answer
284 views
MySQL - Group By - How does it determine which row to return from joined table
I am getting desired but inexplicable behavior on a particular query, and wanted to get more information before pushing it up. I have 3 tables: **foo**, **bar**, and **bar_image**. Paring back all distractors, the fundamental query looks like this:

SELECT
    f.id AS foo_id,
    b.id AS bar_id,
    COALESCE(bi.image_path, b.legacy_image_path) AS image_path
FROM foo AS f
INNER JOIN bar AS b ON b.foo_id = f.id
LEFT OUTER JOIN bar_image AS bi ON bi.bar_id = b.id AND bi.deleted_at > f.created_at
GROUP BY f.id

And the sample sets:

; The following query will result in image_path = a.png due to no matching rows in bar_image
foo: id: 1, created_at: 1000
bar: id: 1, legacy_image_path: a.png
bar_image: {empty}

; New foo row created, after image replacement
; The above query will return:
;   foo: 1, image_path: a.png
;   foo: 2, image_path: b.png
foo: id: 2, created_at: 2000
bar: id: 1, legacy_image_path: b.png
bar_image: id: 1, image_path: a.png, deleted_at: 1500

; And another iteration for the sake of thoroughness
; Another foo row created, after another image replacement
; The above query will return:
;   foo: 1, image_path: a.png
;   foo: 2, image_path: b.png
;   foo: 3, image_path: c.png
foo: id: 3, created_at: 3000
bar: id: 1, legacy_image_path: c.png
bar_image: id: 2, image_path: b.png, deleted_at: 2500

As you can see, with the GROUP BY on the foo table and the given join to the bar_image table, each foo correctly displays the proper image_path, which is desired, but I cannot explain why this occurs. When running the query after all the new rows were inserted, foo:1 would match two rows from the bar_image table, but the GROUP BY truncates them to a single row. How does MySQL know to select the closest value, vs. some other (seemingly random) row?
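For reference, a sketch that removes the nondeterminism instead of relying on it, assuming "closest" means the smallest deleted_at still after created_at:

-- Sketch: pick the qualifying bar_image row explicitly.
SELECT f.id AS foo_id,
       b.id AS bar_id,
       COALESCE((SELECT bi.image_path
                 FROM bar_image bi
                 WHERE bi.bar_id = b.id
                   AND bi.deleted_at > f.created_at
                 ORDER BY bi.deleted_at
                 LIMIT 1),
                b.legacy_image_path) AS image_path
FROM foo f
JOIN bar b ON b.foo_id = f.id;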
Mike Purcell (549 rep)
Mar 11, 2015, 10:17 PM • Last activity: May 12, 2025, 06:04 PM
0 votes
1 answer
663 views
Creating a parent table from child relationships
I have a child relationship table:

CREATE TABLE ItemRelationships (
  Item1 int(11) unsigned NOT NULL,
  Item2 int(11) unsigned NOT NULL,
  INDEX(Item2),
  FOREIGN KEY(Item1) REFERENCES Items(ItemID) ON DELETE CASCADE,
  FOREIGN KEY(Item2) REFERENCES Items(ItemID) ON DELETE CASCADE,
  PRIMARY KEY(Item1,Item2)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE utf8_general_ci

for a list of items:

CREATE TABLE Items (
  ItemID int(11) unsigned NOT NULL AUTO_INCREMENT,
  Title varchar(255),
  ParentID int(11) unsigned,
  INDEX(ParentID),
  PRIMARY KEY(ItemID)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE utf8_general_ci

If there is a relationship between Item1 and Item2, they both belong to the same parent. But I do not have the Parent yet. How can I create unique ParentIDs from the relationship table as

CREATE TABLE Parents (
  ParentID int(11) unsigned NOT NULL AUTO_INCREMENT,
  PRIMARY KEY(ParentID)
) ENGINE=InnoDB DEFAULT CHARSET=utf8 COLLATE utf8_general_ci

and then update ParentID in the table Items?
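A hedged sketch, assuming MySQL 8+ for recursive CTEs (the schema above suggests an older version, which would need an iterative equivalent): treat the relationships as undirected edges, compute each item's connected component, and use the smallest reachable ItemID as a component key; one Parents row per distinct key can then backfill Items.ParentID.

-- Sketch: undirected closure, then the minimum ItemID per component.
WITH RECURSIVE edges (a, b) AS (
  SELECT Item1, Item2 FROM ItemRelationships
  UNION
  SELECT Item2, Item1 FROM ItemRelationships
),
reach (ItemID, Linked) AS (
  SELECT a, b FROM edges
  UNION
  SELECT r.ItemID, e.b
  FROM reach r JOIN edges e ON e.a = r.Linked
)
SELECT ItemID, LEAST(ItemID, MIN(Linked)) AS component_key
FROM reach
GROUP BY ItemID;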
Googlebot (4551 rep)
Jan 7, 2022, 03:50 PM • Last activity: May 5, 2025, 04:02 PM