Database Administrators
Q&A for database professionals who wish to improve their database skills
Latest Questions
0 votes · 2 answers · 175 views
DSBulk count returns more rows than unloaded in CSV files
I'm running `dsbulk unload` on one table that has a single primary key field and no clustering key. At the end, the console shows something like this:
```
total | failed | rows/s | p50ms | p99ms | p999ms
174,971,236 | 0 | 1,946,689 | 148.95 | 285.21 | 400.56
```
but when I count the number of lines across all the CSV files I get ~170M, about 5M fewer. My main confusion is the difference between the CLI output and the line count in the CSV files.
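(One thing worth checking when comparing these two numbers: a plain `wc -l` counts physical lines, but a CSV row whose quoted field contains a newline spans several physical lines while still being a single record. A minimal sketch of a CSV-aware count, assuming the unloaded files match a hypothetical `out/*.csv` pattern and use standard CSV quoting:)

```python
import csv
import glob

def count_csv_records(pattern):
    """Count logical CSV records (not physical lines) across files.

    A record containing a quoted newline spans multiple physical
    lines, so this total can differ from what `wc -l` reports.
    """
    total = 0
    for path in glob.glob(pattern):
        # newline="" lets the csv module handle line endings itself
        with open(path, newline="") as f:
            total += sum(1 for _ in csv.reader(f))
    return total

# Hypothetical usage; compare against `cat out/*.csv | wc -l`:
# print(count_csv_records("out/*.csv"))
```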
I also ran `dsbulk count`, and it shows the same result as the unload: ~175M.
My general question is: why does this happen? What is the explanation? What is the real size of my table, and how can I debug this?
I already ran `dsbulk count` with `--dsbulk.engine.maxConcurrentQueries 1` and `--datastax-java-driver.basic.request.consistency LOCAL_QUORUM`, and the number is the same, ~175M.
Viktor Tsymbaliuk
(25 rep)
Jun 12, 2024, 11:58 AM
• Last activity: Jul 29, 2025, 01:03 AM