We have a table with a date field and ids. A simple query like this returns the correct result:
SELECT count(DISTINCT id_field) AS distinct_count FROM schema1.table1 WHERE date_part('month', date_field ) = 11 ;
distinct_count
----------------------
5645202
However, when trying to count distinct ids grouped by month we get incorrect results:
SELECT date_part('month',date_field ), count(DISTINCT id_field) AS distinct_count
FROM schema1.table1 GROUP BY 1 ORDER BY 1 desc;
date_part | distinct_count
-----------+----------------------
12 | 5637167
11 | 5645426
10 | 5705702
9 | 5633101
8 | 5619553
7 | 5636407
6 | 5598244
5 | 5658568
4 | 5591066
3 | 5595882
2 | 5646399
1 | 5584825
(12 rows)
Month 11 has 5,645,426 IDs counted (incorrect) vs 5,645,202 (correct). Could this be an issue with the DB engine itself?
Asked by Alex Polkhovsky
(101 rep)
Oct 28, 2022, 03:13 PM