
Is jq internal sort slower than GNU sort?

While filtering through [this json file](https://iptv-org.github.io/iptv/channels.json) I ran a [benchmark](https://github.com/sharkdp/hyperfine) and found that using jq's built-in `sort` and `unique` is actually **25% slower** than `sort --unique`!

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `jq "[.[].category] \| sort \| unique" channels.json` | 172.0 ± 2.6 | 167.8 | 176.8 | 1.25 ± 0.06 |
| `jq "[.[].category \| select((. != null) and (. != \"XXX\"))] \| sort \| unique" channels.json` | 151.9 ± 4.1 | 146.5 | 163.9 | 1.11 ± 0.06 |
| `jq ".[].category" channels.json \| sort -u` | 137.2 ± 6.6 | 131.8 | 156.6 | 1.00 |

Summary
  'jq ".[].category" channels.json | sort -u' ran
    1.11 ± 0.06 times faster than 'jq "[.[].category | select((. != null) and (. != \"XXX\"))] | sort | unique" channels.json'
    1.25 ± 0.06 times faster than 'jq "[.[].category] | sort | unique" channels.json'
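
For reference, the Relative figures can be reproduced by dividing each mean by the fastest mean. A minimal sketch using `awk`; the helper name `relative` is made up for illustration, and the error margins that hyperfine reports are not recomputed here:

```shell
# Hypothetical helper: ratio of two mean times, rounded to two decimals.
relative() {
    awk -v slow="$1" -v fast="$2" 'BEGIN { printf "%.2f\n", slow / fast }'
}

relative 172.0 137.2   # jq sort | unique          vs  jq | sort -u
relative 151.9 137.2   # jq select | sort | unique  vs  jq | sort -u
```

This prints 1.25 and 1.11, matching the Relative column above.
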
test command:
hyperfine --warmup 3 \
    'jq "[.[].category] | sort | unique" channels.json'  \
    'jq "[.[].category | select((. != null) and (. != \"XXX\"))] | sort | unique" channels.json' \
    'jq ".[].category" channels.json | sort -u'
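
One thing that may be skewing this first benchmark: according to the jq manual, `unique` already returns its result in sorted order, so `sort | unique` likely sorts the array twice. A quick sketch to check that both filters agree; the inline `sample` data is a made-up stand-in for channels.json:

```shell
# Made-up sample standing in for channels.json (an assumption, not the real data).
sample='[{"category":"News"},{"category":null},{"category":"Music"},{"category":"News"}]'

# Filter as used in the benchmark: explicit sort, then unique.
with_sort=$(printf '%s' "$sample" | jq -c '[.[].category] | sort | unique')

# Same filter without the explicit sort.
without_sort=$(printf '%s' "$sample" | jq -c '[.[].category] | unique')

if [ "$with_sort" = "$without_sort" ]; then
    echo "same result: $without_sort"
else
    echo "results differ"
fi
```

If the two match, `jq "[.[].category] | unique" channels.json` would be the fairer candidate to put up against `sort -u` in the hyperfine run. Whether that closes the gap is not measured here.
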
If we only test sort (without uniqueness), again jq is **9% slower** than sort:

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| `jq "[.[].category] \| sort" channels.json` | 133.9 ± 1.6 | 131.1 | 138.2 | 1.09 ± 0.02 |
| `jq ".[].category" channels.json \| sort` | 123.0 ± 1.3 | 120.5 | 125.7 | 1.00 |

Summary
  'jq ".[].category" channels.json | sort' ran
    1.09 ± 0.02 times faster than 'jq "[.[].category] | sort" channels.json'
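
A factor that could affect the GNU sort side of this comparison: outside the C locale, `sort` compares lines using locale collation rules, which is generally slower than plain byte comparison and can order lines differently. The post does not say which locale was active, so this is an assumption to check rather than a diagnosis. A small sketch with made-up input showing the byte-order behaviour:

```shell
# Made-up input; LC_ALL=C forces plain byte-order comparison.
c_sorted=$(printf 'b\nA\na\nB\n' | LC_ALL=C sort | paste -sd, -)
echo "$c_sorted"   # prints A,B,a,b
```

Benchmarking `jq ".[].category" channels.json | LC_ALL=C sort` alongside the existing commands would show whether collation matters for this data.
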
versions:
jq-1.5-1-a5b5cbe
sort (GNU coreutils) 8.28
I expected jq's built-in functions to be faster than piping into an external program, which has to be spawned as a separate process. Am I using jq poorly?

**Update:** I just repeated this experiment on a host with flash storage, an Arm CPU, and these versions:
jq-1.6
sort (GNU coreutils) 8.32
result:
Benchmark #1: jq "[.[].category] | sort" channels.json
  Time (mean ± σ):     587.8 ms ±   3.9 ms    [User: 539.5 ms, System: 44.2 ms]
  Range (min … max):   582.8 ms … 594.2 ms    10 runs
 
Benchmark #2: jq ".[].category" channels.json | sort
  Time (mean ± σ):     606.0 ms ±   8.6 ms    [User: 569.5 ms, System: 49.0 ms]
  Range (min … max):   589.6 ms … 616.2 ms    10 runs
 
Summary
  'jq "[.[].category] | sort" channels.json' ran
    1.03 ± 0.02 times faster than 'jq ".[].category" channels.json | sort'
Now jq sort runs 3% faster than GNU sort :D
Asked by Zeta.Investigator (1190 rep)
Jun 26, 2021, 08:39 AM
Last activity: Jun 26, 2021, 08:28 PM