
Unix & Linux Stack Exchange

Q&A for users of Linux, FreeBSD and other Unix-like operating systems

Latest Questions

22 votes
5 answers
44096 views
Compress a large number of large files fast
I have about 200 GB of log data generated daily, distributed among about 150 different log files. I have a script that moves the files to a temporary location and does a tar-bz2 on the temporary directory. I get good results as 200 GB of logs are compressed to about 12-15 GB. The problem is that it takes forever to compress the files. The cron job runs at 2:30 AM daily and continues to run till 5:00-6:00 PM. Is there a way to improve the speed of the compression and complete the job faster? Any ideas? Don't worry about other processes and all; the location where the compression happens is on a NAS, and I can mount the NAS on a dedicated VM and run the compression script from there. Here is the output of top for reference:
top - 15:53:50 up 1093 days, 6:36, 1 user, load average: 1.00, 1.05, 1.07
Tasks: 101 total, 3 running, 98 sleeping, 0 stopped, 0 zombie
Cpu(s): 25.1%us, 0.7%sy, 0.0%ni, 74.1%id, 0.0%wa, 0.0%hi, 0.1%si, 0.1%st
Mem: 8388608k total, 8334844k used, 53764k free, 9800k buffers
Swap: 12550136k total, 488k used, 12549648k free, 4936168k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
7086 appmon 18 0 13256 7880 440 R 96.7 0.1 791:16.83 bzip2
7085 appmon 18 0 19452 1148 856 S 0.0 0.0 1:45.41 tar cjvf /nwk_storelogs/compressed_logs/compressed_logs_2016_30_04.tar.bz2 /nwk_storelogs/temp/ASPEN-GC-32459:nkp-aspn-1014.log /nwk_stor
30756 appmon 15 0 85952 1944 1000 S 0.0 0.0 0:00.00 sshd: appmon@pts/0
30757 appmon 15 0 64884 1816 1032 S 0.0 0.0 0:00.01 -tcsh
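For scale, one commonly suggested direction (not part of the question) is to replace the single-threaded bzip2 that top shows pinning one core with a parallel compressor such as pigz or pbzip2. A minimal sketch, reusing the paths visible in the top output (the output filename is illustrative):

```bash
# Sketch: stream the staged logs through pigz, which uses all CPU cores.
# pbzip2 could be substituted to keep the .bz2 format, at some speed cost.
tar -cf - /nwk_storelogs/temp \
  | pigz -p "$(nproc)" \
  > "/nwk_storelogs/compressed_logs/compressed_logs_$(date +%F).tar.gz"
```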
anu (362 rep)
May 4, 2016, 11:00 PM • Last activity: Jun 17, 2025, 09:12 AM
3 votes
1 answers
1285 views
How to pipe output of command to two separate commands and store outputs
I have a really long command that runs over a huge file and I am forced to run it twice which doubles the time it takes to run. This is what I am doing at the moment:
x=$(command | sort -u)
y=$(command | sort -n)
I was wondering whether there is any way to redirect the output of command to both sort -u and sort -n and store the output of each into separate variables or files, like I did above with x and y. I tried to use tee to do the following, but no luck:
command | tee >(sort -n > x.txt) >(sort -u > y.txt)
I tried to redirect output to text files but it just printed it to standard output instead. Any tips or ideas?
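A minimal sketch of the usual repair for that tee pipeline (not from the question; it assumes bash for the process substitutions): the copy tee forwards to its own stdout is what ends up on the terminal, so discard it and read the files back afterwards.

```bash
# Sketch: fan the output out to both sorts, silence tee's stdout copy,
# then load the results into variables.
command | tee >(sort -n > x.txt) >(sort -u > y.txt) > /dev/null
# The process substitutions may still be flushing when the pipeline returns;
# depending on the bash version a brief wait may be needed before reading.
x=$(cat x.txt)
y=$(cat y.txt)
```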
markovv.sim (35 rep)
Oct 27, 2020, 07:44 PM • Last activity: Feb 25, 2025, 06:09 PM
0 votes
1 answers
77 views
Could use docker for app isolation?
I am studying the use of Docker in a big-scale project that is actually deployed in production. I have never used Docker before, but from what I have read, it consists of a new layer called a "container engine" that gives you the opportunity to deploy many applications that are independent of each other and use the resources of the host. In the case that I am working on, the machines where our app is deployed can have different OSes and architectures (Windows, Linux, ARM, Debian, etc.), but they don't have any VM running, just the OS and the applications that we deploy. These machines can have 4-5 applications running on the same system, each of them with different dependencies. We have already had some problems with that: for example with file descriptors, where one app was taking over the log writing of another app, generating erroneous logs and crashing. These apps communicate with other parts of the machines via TCP/IP sockets and use gRPC, QPID and SFTP to communicate with other elements of the environment (external servers, own libraries, etc.). **I don't know if the use of these protocols would complicate the adoption of Docker in our system.** Talking with my workmates, they told me that it is not worth it as it would not bring any optimisation or benefit, but I don't think so. I've been reading that by using containers we get OS independence, making the app work on different OSes using a Docker image, plus library independence and therefore isolation between apps.
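A minimal sketch of the isolation being described, with two hypothetical apps packaged as images app-a and app-b (image names, ports and log paths are illustrative): each container carries its own dependencies and log directory while sharing the host kernel, and plain TCP/gRPC traffic only needs its ports published.

```bash
# Sketch: run two independently packaged apps side by side on one host.
docker run -d --name app-a -v /var/log/app-a:/var/log/app -p 5001:5001 app-a:latest
docker run -d --name app-b -v /var/log/app-b:/var/log/app -p 5002:5002 app-b:latest
```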
ShadowFurtive (13 rep)
Jul 4, 2024, 09:00 AM • Last activity: Jul 4, 2024, 09:28 AM
30 votes
4 answers
81406 views
How to compile without optimizations -O0 using CMake
I am using Scientific Linux (SL). I am trying to compile a project that uses a bunch of C++ (.cpp) files. In the directory user/project/Build, I enter make to compile and link all the .cpp files. I then have to go to user/run/ and type ./run.sh values.txt. To debug with GDB, I have to go to user/run and type gdb ../project/Build/bin/Project, and to run, I enter run -Project INPUT/inputfile.txt. However, when I try to print out the value of a variable using p variablename, I get the message s1 = <value optimized out>. I have done some research online, and it seems I need to compile without optimizations using -O0 to resolve this. But where do I enter that? In the CMakeLists? If so, which CMakeLists? The one in project/Build or project/src/project?
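For context, the conventional CMake knobs are the build type and the per-configuration flags, set either on the configure command line or in the top-level CMakeLists.txt. A sketch, assuming an out-of-source build whose source root sits one directory above Build:

```bash
# Sketch: reconfigure the existing build directory for debugging.
# The Debug build type defaults to -g; the extra flag makes -O0 explicit.
cd user/project/Build
cmake -DCMAKE_BUILD_TYPE=Debug -DCMAKE_CXX_FLAGS_DEBUG="-O0 -g" ..
make
```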
user4352158 (471 rep)
Feb 28, 2015, 10:08 PM • Last activity: Jun 24, 2024, 07:12 AM
0 votes
0 answers
248 views
PostgreSQL: indexes and partitions
I have a PostgreSQL database and I noticed a weird behaviour while working with indexes and partitions. The engine version is 10.21. Now, I have a table with this structure:
guid varchar(50) PK
guid_a varchar(50)
data text
part_key varchar(2)
There are other columns but they are irrelevant. The query I have to run on this table looks like this:
select * from mytable where guid_a = 'jxxxxx-xxxxxxx' and data like '%7263628%';
Let me explain a few things: the column guid_a contains a code that identifies a person in the format 'jxxxx-xxxxxxx', where the 'x' are numbers. The first two digits go from 00 to 99, so, for example:
j01xxx-xxxxxx
j02xxx-xxxxxx
...
j99xxx-xxxxxx
I created an index on this column and then I also created an index using the trgm module on the data column. Launching the query, I get a giant improvement in performance. Everything's good until now. I also decided to use partitions (the table has **6.4 million records**) and I created 99 partitions (by list) on the column part_key, which contains only the first two digits of the guid_a value. I obtained 99 partitions with an average of 65 thousand rows each. Each partition has the same indexes I talked about before. This improved the performance again. Obviously the query has another condition on part_key, so that the engine knows which partition it should query. Now the weird stuff. I removed the trgm index on the table without the partitions and, surprise surprise: it's faster. Even faster than the partitioned table. Even after removing the trgm indexes on the partitioned table. What I noticed in the explain is that the query on the non-partitioned table forces the engine to go for an index scan only (shouldn't it then also make another scan for the second condition on the data column?). On the partitioned table, on the other hand, it goes for a bitmap index scan, then it does a heap scan and then an append. This apparently costs more than indexing all 6.4 million rows. I made different tests with different values but got the same results. **Performance** (on average):
11 ms on the partitioned table
9 ms on the non-partitioned table with one index only, on guid_a
20 ms on the non-partitioned table with two indexes, the second on the data column using trgm
What's going on here?
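For reference, a sketch of the kind of setup being described, using the table and column names from the question but an assumed database name, wrapped in psql since the exact DDL isn't shown:

```bash
# Sketch: the two indexes described for the non-partitioned table,
# plus an EXPLAIN to compare the plans the question mentions.
psql -d mydb -c "CREATE EXTENSION IF NOT EXISTS pg_trgm;"
psql -d mydb -c "CREATE INDEX ON mytable (guid_a);"
psql -d mydb -c "CREATE INDEX ON mytable USING gin (data gin_trgm_ops);"
psql -d mydb -c "EXPLAIN (ANALYZE, BUFFERS)
  SELECT * FROM mytable WHERE guid_a = 'jxxxxx-xxxxxxx' AND data LIKE '%7263628%';"
```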
Federico Loro (1 rep)
Jan 20, 2023, 07:03 PM
1 votes
1 answers
191 views
Speed up grep usage inside bash script
I am currently working on a bash script that is supposed to process large log files from one of my programs. When I first started, the script took around 15 seconds to complete, which wasn't bad, but I wanted to improve it. I implemented a queue with mkfifo and reduced the parse time to 6 seconds. I wanted to ask whether there is any way to improve the parsing speed of the script further. The current version of the script:
#!/usr/bin/env bash
# $1 is server log file
# $2 is client logs file directory

declare -A orders_array

fifo=$HOME/.fifoDate-$$
mkfifo $fifo
# Queue for time conversion
exec 5> >(exec stdbuf -o0 date -f - +%s%3N >$fifo)
exec 6 >(exec stdbuf -o0 grep -oP '[0-9a-f]*-[0-9a-f]*-[0-9a-f]*-[0-9a-f]*-[0-9a-f]*' >$fifo)
exec 8&5 "${line:1:26}"
    read -t 1 -u6 converted_time
    orders_array[$order_id]=$converted_time
done &7 "$line"
    read -t 1 -u8 id
    echo >&5 "${line:1:26}"
    read -t 1 -u6 converted_time
    time_diff="$(($converted_time - orders_array[$id]))"
    echo "$id -> $time_diff ms"
done  GatewayCommon::States::Executed]
[2022-12-07 07:36:18.209567] [MarketOrderTransitionsa4ec2abf-059f-4452-b503-ae58da2ce1ff] [info] [log_action] [(lambda at ../subprojects/market_session/private_include/MarketSession/MarketOrderTransitions.hpp:57:25) for event: MarketMessages::OrderExecuted]
[2022-12-07 07:36:18.209574] [MarketOrderTransitionsa4ec2abf-059f-4452-b503-ae58da2ce1ff] [info] [log_process_event] [boost::sml::v1_1_0::back::on_entry]
The id is in square brackets after MarketOrderTransitions (a4ec2abf-059f-4452-b503-ae58da2ce1ff). Client log:
[2022-12-07 07:38:47.545433] [twap_algohawk] [info] [] [Event received (OrderExecuted): {"MessageType":"MarketMessages::OrderExecuted","averagePrice":"49.900000","counterPartyIds":{"activeId":"dIh5wYd/S4ChqMQSKMxEgQ**","executionId":"2295","inactiveId":"","orderId":"3dOKjIoURqm8JjWERtInkw**"},"cumulativeQuantity":"1200.000000","executedPrice":"49.900000","executedQuantity":"1200.000000","executionStatus":"Executed","instrument":[["Symbol","5"],["Isin","5"],["SecurityIDSource","4"],["Mic","MARS"]],"lastFillMarket":"MARS","leavesQuantity":"0.000000","marketSendTime":"07:38:31.972000000","orderId":"a4ec2abf-059f-4452-b503-ae58da2ce1ff","orderPrice":"49.900000","orderQuantity":"1200.000000","propagationData":[],"reportId":"Qx2k73f7QqCqcT0LTEJIXQ**","side":"Buy","sideDetails":"Unknown","transactionTime":"00:00:00.000000000"}]
The id in the client log is inside the orderId tag (there are 2 of them and I use the second one). The wanted output is:
98ddcfca-d838-4e49-8f10-b9f780a27470 -> 854 ms
5a266ca4-67c6-4482-9068-788a3520b2f3 -> 18 ms
2e8d28de-eac0-4776-85ab-c75d9719b7c6 -> 58950 ms
409034eb-4e55-4e39-901a-eba770d497c0 -> 56172 ms
5b1dc7e8-fae0-43d2-86ea-d3df4dbe810b -> 52505 ms
5249ac24-39d2-40f5-8adf-dcf0410aebb5 -> 17446 ms
bef18cb3-8cef-4d8a-b244-47fed82f21ea -> 1691 ms
7c53c950-23fd-497e-a011-c07363d5fe02 -> 18194 ms
In particular, I am only concerned about the "order executed" messages in the log files.
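As a small aside on the conversion step the script leans on (a sketch using a timestamp from the sample logs above): GNU date can turn a bracketed timestamp into epoch milliseconds directly, and `date -f -` is the batch mode that the fifo/stdbuf plumbing feeds line by line.

```bash
# Sketch: convert one log timestamp to epoch milliseconds with GNU date.
date -d "2022-12-07 07:36:18.209567" +%s%3N
```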
Dzamba (11 rep)
Dec 12, 2022, 09:30 AM • Last activity: Dec 13, 2022, 03:32 PM
0 votes
2 answers
186 views
How to resize images on sequential functions?
I am trying to:
- change the format of my images,
- then resize their height and width by 40%,
- then optimize their quality to at most 35% of the source and the total size of the image to 35% of the original.
Here is my code:
find . -name '*.png' -exec mogrify -format jpg {} + && find . -name '*.{jpeg,jpg}' -exec convert -resize 40% _resized.jpg {} + && find ./*.{jpeg,jpg} -exec jpegoptim -m 35% --size=35% {} \;
The resizing (line 2) seems to fail. When I look at the image properties I get the same image dimensions. I expect that the new image:
- is resized
- has the original name + the word "resized" at the end of the name
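A minimal sketch of one way to get the resize-plus-suffix behaviour (not the asker's command; it assumes ImageMagick's convert): convert wants the input file and an explicit output name, which the bare -exec form above does not give it.

```bash
# Sketch: resize each JPEG to 40% and write a sibling file with a _resized suffix.
find . -type f \( -name '*.jpg' -o -name '*.jpeg' \) -exec sh -c '
  for f; do
    convert "$f" -resize 40% "${f%.*}_resized.jpg"
  done
' sh {} +
```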
Diagathe Josué (543 rep)
Apr 22, 2020, 10:11 PM • Last activity: Nov 12, 2022, 02:40 PM
0 votes
1 answers
353 views
Parse huge amounts of files efficiently
I have a folder that holds hundreds of thousands of files called hp-temps.txt. (There are also tons of subfolders.) The content of these files looks like this, for example:
Sensor   Location              Temp       Threshold
------   --------              ----       ---------
#1        PROCESSOR_ZONE       15C/59F    62C/143F 
#2        CPU#1                10C/50F    73C/163F 
#3        I/O_ZONE             25C/77F    68C/154F 
#4        CPU#2                32C/89F    73C/163F 
#5        POWER_SUPPLY_BAY     9C/48F     55C/131F
I need to parse through all the files and find the highest entry for the temperature in the #1 line. I have a working script, but it takes a very long time, and I was wondering if there is any way to improve it. Since I'm rather new to shell scripting, I imagine this code of mine is really inefficient:
#!/bin/bash
highestTemp=0
temps=$(find "$1" -name hp-temps.txt -exec cat {} + | grep 'PROCESSOR' | cut -c 32-33)
for t in $temps
do
  if [ $t -gt $highestTemp ]; then
    highestTemp=$t
  fi
done
**EDIT:** There has already been a very efficient answer, but I forgot to mention that I don't only need the biggest value. I would like to be able to loop through all the files, since I'd like to output the directory of the file and the temperature whenever a higher value is detected. So the output could look like this, for example:
New MAX: 22 in /path/to/file/hp-temps.txt
New MAX: 24 in /another/path/hp-temps.txt
New MAX: 29 in /some/more/path/hp-temps.txt
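A minimal sketch of how that running maximum could be tracked in a single pass (not the asker's script; it assumes the field layout shown in the sample file and a find/awk toolchain):

```bash
#!/bin/bash
# Sketch: scan every hp-temps.txt once, track the running maximum of the
# PROCESSOR_ZONE temperature and report the file each time it increases.
find "$1" -name hp-temps.txt -exec awk '
  /PROCESSOR_ZONE/ {
    t = $3                 # e.g. "15C/59F"
    sub(/C\/.*/, "", t)    # keep only the Celsius number
    if (t + 0 > max) { max = t + 0; print "New MAX: " max " in " FILENAME }
  }
' {} +
```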
Lumnezia (111 rep)
Sep 18, 2022, 01:32 PM • Last activity: Sep 19, 2022, 05:49 AM
2 votes
2 answers
1354 views
Set usb flash drive as non rotational drive
I'm trying to optimize the I/O schedulers and to use a proper (different) scheduler for rotational and for non-rotational drives. When I run:
cat /sys/block/sd*/queue/rotational
I get:
1 <-- for sda
1 <-- for sdb
although sdb is the USB flash drive and it shouldn't be rotational.
$ udevadm info -a -n /dev/sda | grep queue
ATTRS{queue_depth}=="31"
ATTRS{queue_ramp_up_period}=="120000"
ATTRS{queue_type}=="simple"
$ udevadm info -a -n /dev/sdb | grep queue
ATTRS{queue_depth}=="1"
ATTRS{queue_type}=="none"
so there is no such attribute as:
ATTR{queue/rotational}=="0" or ...=="1"
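For what it's worth, a sketch of the two usual ways to flip that flag (an assumption about the setup, not taken from the question): the sysfs attribute is writable at runtime, and a udev rule can reapply it when the stick is plugged in.

```bash
# Sketch: mark sdb as non-rotational for the current boot only.
echo 0 | sudo tee /sys/block/sdb/queue/rotational

# Sketch of a persistent variant, e.g. /etc/udev/rules.d/60-usb-nonrot.rules
# (the match on removable media is illustrative):
#   ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{removable}=="1", ATTR{queue/rotational}="0"
```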
user252842
Apr 21, 2018, 12:37 PM • Last activity: Jul 17, 2022, 03:48 PM
1 votes
1 answers
3859 views
Build the Linux kernel without gcc optimization
I followed one of the many tutorials found in Google results to build and debug the Linux kernel with gcc and kgdb/gdb, and I ended up discovering that it was all a waste of time, since I can't compile the kernel without gcc optimization, with neither -O0 nor -Og. There's no config option for removing optimization. And last but not least, Linus said years ago that he is against debugging. That said, kgdb must exist for some reason. I was wondering if there is a way to get rid of variables/arguments being "**optimized out**" and let the debugger step through the code sequentially instead of jumping around.
Joe Smith (11 rep)
Aug 27, 2020, 01:05 PM • Last activity: Jun 26, 2022, 02:02 PM
5 votes
3 answers
17700 views
Allocating Swap Space with KVM
Consider the following scenario: a host with 2 GiB runs a few guests using KVM.  Each guest does usually not need much memory; they are given 256 MiB each and run services that mostly twiddle their thumbs.  However, occasionally the guests need more memory.  Right now, each guest has little RAM but its own swap space.  I noticed that a small portion of swap is used.  I never had problems with that configuration, but just out of curiosity: What is the optimal swap allocation strategy? 1. Assign each guest its own swap space from their respective disks, and assign the guests only little memory from the host.  (This is what I am doing now.) 2. Assign the host a larger amount of swap space and none to the guests, and assign more memory to the guests. Would memory ballooning help to improve memory performance?
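On the ballooning point, a small sketch (hypothetical domain name, assuming the guests are managed through libvirt/virsh): the balloon is the mechanism that would let option 2 hand a guest extra memory only while it needs it.

```bash
# Sketch: inspect and adjust a guest's memory balloon at runtime.
virsh dommemstat guest1            # current balloon/actual figures for the guest
virsh setmem guest1 512M --live    # shrink or grow the balloon while it runs
```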
countermode (7764 rep)
Jun 17, 2014, 01:01 AM • Last activity: May 17, 2022, 02:13 AM
0 votes
1 answers
832 views
CLI tool that compress the given image, whatever file type the image is (png, jpg, gif, webp, svg)?
I know that there are many tools to optimize an image:
- pngquant
- optipng
- jpegoptim
- gifsicle
- exiftool
- etc.
but they are all specific to a certain file type. Is there a single command-line tool that, whatever image type it is passed, applies the right compression? Something similar to what https://compressor.io does, but as a CLI. With "optimize" I mean reducing the size of the overall file while keeping it visually nearly identical (thanks @Philippos).
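In the absence of such a tool, a minimal sketch of a wrapper that dispatches on the file extension to the per-format tools already listed (an illustration, not an existing utility; the quality settings are arbitrary):

```bash
#!/bin/sh
# Sketch: route each file to a format-specific optimizer based on its extension.
for f in "$@"; do
  case "$(printf '%s' "${f##*.}" | tr '[:upper:]' '[:lower:]')" in
    png)      optipng "$f" ;;
    jpg|jpeg) jpegoptim --max=85 "$f" ;;
    gif)      gifsicle -O3 --batch "$f" ;;
    *)        echo "no optimizer configured for: $f" >&2 ;;
  esac
done
```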
nulll (235 rep)
May 2, 2022, 10:58 AM • Last activity: May 2, 2022, 03:44 PM
1 votes
0 answers
309 views
Pipewire using 2x processes when idling
I recently had my BT headphones stutter and cut out and realized that during CPU intensive processing, pipewire is dropping frames. My overall goal, then, is to streamline processing to make that situation happen rarely or never. With that in mind, I noticed today that pipewire has quite a bit of processing going on, even while no audio is occurring: htop showing pipewire processes Combined, these processes are taking over 10% of one CPU. My question is twofold: 1. Is it normal to have 2 of each process running for pipewire (pipewire, pipewire-pulse, and pipewire-gnome-session)? If not, how can I reduce this to 1 each? 2. Why are these processes even taking any CPU when there is no system audio streaming anywhere? Is there a way to reduce CPU usage while idling?
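As a starting point for both sub-questions, a small sketch (assuming PipeWire runs as systemd user services, which is the common packaging): htop shows threads as separate rows by default, so apparent duplicates may be one daemon's threads rather than two instances.

```bash
# Sketch: see how many PipeWire user services actually exist vs. threads.
systemctl --user list-units 'pipewire*' 'wireplumber*'
ps -eLf | grep -E '[p]ipewire|[w]ireplumber'   # -L lists threads per process
pw-top                                         # live node/stream load, if installed
```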
Duane J (133 rep)
Dec 8, 2021, 07:28 PM
0 votes
1 answers
32 views
Timing desktop environment initialization
Is there a way to time desktop environment initialization and perhaps identify delaying candidates?
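One approach worth sketching (it assumes a systemd-based distro where the desktop session is started through systemd user units, as recent GNOME does; it is not universal):

```bash
# Sketch: overall boot timing plus per-unit timing for the user session.
systemd-analyze                        # firmware/loader/kernel/userspace totals
systemd-analyze blame                  # slowest system units
systemd-analyze --user blame           # slowest user-session units (DE components)
systemd-analyze --user critical-chain  # what the session's longest chain waits on
```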
Sterling Butters (117 rep)
Aug 25, 2020, 08:55 PM • Last activity: Nov 2, 2021, 10:29 AM
1 votes
0 answers
430 views
Can't compile linux kernel with -Og/-O0 option for debugging purpoces
I have custom hardware running embedded Linux (OpenWrt) like a charm. The CPU is an IMX6ULL (ARMv7), so it is supported by J-Link for debugging over the JTAG interface. Starting the GDB server and stepping through the Linux kernel shows a lot of optimized out messages, because the kernel is compiled with the KBUILD_CFLAGS += -O2 -fno-reorder-blocks -fno-tree-ch $(EXTRA_OPTIMIZATION) flags. So I am trying to compile it with -O0, which gives me the following output:
$ make -j64 V=s all
  CHK     include/config/kernel.release
  CHK     include/generated/uapi/linux/version.h
  CC      scripts/mod/empty.o
  ....
  AR      built-in.o
  LD      vmlinux.o
  MODPOST vmlinux.o
  WARNING: modpost: Found 4 section mismatch(es).
  To see full details build your kernel with:
  'make CONFIG_DEBUG_SECTION_MISMATCH=y'
  arm-openwrt-linux-muslgnueabi-ld: arch/arm/kernel/setup.o: in function `setup_arch':
/opt/eclipse/imx6ull-openwrt/build_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/linux-imx6ull_cortexa7/linux-4.14.199/arch/arm/kernel/setup.c:1134: undefined reference to `psci_smp_ops'
  arm-openwrt-linux-muslgnueabi-ld: /opt/eclipse/imx6ull-openwrt/build_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/linux-imx6ull_cortexa7/linux-4.14.199/arch/arm/kernel/setup.c:1134: undefined reference to `psci_smp_ops'
  arm-openwrt-linux-muslgnueabi-ld: kernel/panic.o: in function `__xchg':
/opt/eclipse/imx6ull-openwrt/build_dir/target-arm_cortex-a7+neon-vfpv4_musl_eabi/linux-imx6ull_cortexa7/linux-4.14.199/./arch/arm/include/asm/cmpxchg.h:110: undefined reference to `__bad_xchg'
  arm-openwrt-linux-muslgnueabi-ld: kernel/exit.o: in function `__xchg':
Checked for "WARNING: modpost: Found x section mismatch(es)." here. It seems that the resulting binary takes more space than configured by some setting. The vmlinux size built with the -O2 option is 39 MB. Using -O1 gives me a 37 MB image, so I hope there is enough space in my DDR3 RAM (128 MB) to fit an even bigger image compiled with the -O0 configuration. So I am wondering about a way to provide more space for the sections. Could someone please point me to the place where I can do that? I have limited knowledge of the Linux kernel, so I was unable to find any linker script used for this.
user3583807 (111 rep)
Oct 22, 2021, 12:06 PM
1 votes
2 answers
682 views
How to Build Latest HandBrake on linux with FDO (PGO) + LTO?
Passing CFLAGS and CXXFLAGS to a HandBrake build for the latest version (v1.3.3 at the time of this writing) will work until you add -flto, which will **FAIL** the whole build. How can HandBrake be built with the LTO option -flto and, as a stretch goal, with FDO as well (feedback-directed optimisation, aka FDO, aka PGO)? Most of the codecs within HandBrake contain hand-coded assembly, so many assert that the compiler optimisation gains would not amount to much. I would like to test and challenge that assertion!
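For orientation only, the generic GCC flag flow for LTO plus FDO is sketched below; whether HandBrake's build system propagates these flags into all of its bundled codec sub-builds is exactly the open question, so treat this as an assumption rather than a recipe.

```bash
# Sketch: two-pass FDO + LTO with plain GCC (not HandBrake's configure wiring).
export CFLAGS="-O3 -flto -fprofile-generate"
export CXXFLAGS="$CFLAGS"
export LDFLAGS="-flto -fprofile-generate"
# ... build, then run a representative transcode to produce .gcda profiles ...
export CFLAGS="-O3 -flto -fprofile-use -fprofile-correction"
export CXXFLAGS="$CFLAGS"
export LDFLAGS="-flto -fprofile-use"
# ... clean and rebuild with the profiles applied ...
```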
DanglingPointer (262 rep)
Jun 23, 2021, 12:56 AM • Last activity: Aug 2, 2021, 12:52 AM
0 votes
0 answers
198 views
Why integer division is faster than bitwise shift in shell?
I'm comparing performance of bash and dash (default sh in Xubuntu 18.04). - I expect sh to be faster than bash - I expect bitwise shift to be faster than division operator. However, I'm getting inconsistencies:
λ hyperfine --export-markdown a.md -w 3 ./*
Benchmark #1: ./calc-div.bash
  Time (mean ± σ):      2.550 s ±  0.033 s    [User: 2.482 s, System: 0.068 s]
  Range (min … max):    2.497 s …  2.595 s    10 runs

Benchmark #2: ./calc-div.sh
  Time (mean ± σ):      2.063 s ±  0.016 s    [User: 2.063 s, System: 0.000 s]
  Range (min … max):    2.043 s …  2.100 s    10 runs

Benchmark #3: ./calc-shift.bash
  Time (mean ± σ):      3.312 s ±  0.034 s    [User: 3.255 s, System: 0.057 s]
  Range (min … max):    3.274 s …  3.385 s    10 runs

Benchmark #4: ./calc-shift.sh
  Time (mean ± σ):      2.087 s ±  0.046 s    [User: 2.086 s, System: 0.001 s]
  Range (min … max):    2.058 s …  2.211 s    10 runs

Summary
  './calc-div.sh' ran
    1.01 ± 0.02 times faster than './calc-shift.sh'
    1.24 ± 0.02 times faster than './calc-div.bash'
    1.61 ± 0.02 times faster than './calc-shift.bash'
| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| ./calc-div.bash | 2.550 ± 0.033 | 2.497 | 2.595 | 1.24 ± 0.02 |
| ./calc-div.sh | 2.063 ± 0.016 | 2.043 | 2.100 | 1.00 |
| ./calc-shift.bash | 3.312 ± 0.034 | 3.274 | 3.385 | 1.61 ± 0.02 |
| ./calc-shift.sh | 2.087 ± 0.046 | 2.058 | 2.211 | 1.01 ± 0.02 |

Here are the scripts I tested: calc-div.bash
#!/usr/bin/env bash

for i in {1..1000000}; do
    _=$(( i / 1024 ))
done
calc-div.sh
#!/usr/bin/env sh

i=1
while [ $i -le 1000000 ]; do
    _=$(( i / 1024 ))
    i=$(( i + 1 ))
done
calc-shift.bash
#!/usr/bin/env bash

for i in {1..1000000}; do
    _=$(( i >> 10 ))
done
calc-shift.sh
#!/usr/bin/env sh

i=1
while [ $i -le 1000000 ]; do
    _=$(( i >> 10 ))
    i=$(( i + 1 ))
done
This difference is more visible for 5000000:

| Command | Mean [s] | Min [s] | Max [s] | Relative |
|:---|---:|---:|---:|---:|
| ./calc-div.bash | 13.333 ± 0.202 | 12.870 | 13.584 | 1.23 ± 0.02 |
| ./calc-div.sh | 10.830 ± 0.119 | 10.750 | 11.150 | 1.00 |
| ./calc-shift.bash | 17.361 ± 0.357 | 16.995 | 18.283 | 1.60 ± 0.04 |
| ./calc-shift.sh | 11.226 ± 0.351 | 10.834 | 11.958 | 1.04 ± 0.03 |
Summary
  './calc-div.sh' ran
    1.04 ± 0.03 times faster than './calc-shift.sh'
    1.23 ± 0.02 times faster than './calc-div.bash'
    1.60 ± 0.04 times faster than './calc-shift.bash'
As you can see, for both bash and dash, the division operator is faster than the equivalent bitwise shift to the right.
Zeta.Investigator (1190 rep)
Jun 30, 2021, 02:58 PM • Last activity: Jun 30, 2021, 03:13 PM
1 votes
0 answers
627 views
Is jq internal sort slower than GNU sort?
While filtering through [this json file](https://iptv-org.github.io/iptv/channels.json) I did a [benchmark](https://github.com/sharkdp/hyperfine) and found out utilizing jq's internal sort and unique method is actually **25% slower** than sort --unique!

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| jq "[.[].category] \| sort \| unique" channels.json | 172.0 ± 2.6 | 167.8 | 176.8 | 1.25 ± 0.06 |
| jq "[.[].category \| select((. != null) and (. != \"XXX\"))] \| sort \| unique" channels.json | 151.9 ± 4.1 | 146.5 | 163.9 | 1.11 ± 0.06 |
| jq ".[].category" channels.json \| sort -u | 137.2 ± 6.6 | 131.8 | 156.6 | 1.00 |
Summary
  'jq ".[].category" channels.json | sort -u' ran
    1.11 ± 0.06 times faster than 'jq "[.[].category | select((. != null) and (. != \"XXX\"))] | sort | unique" channels.json'
    1.25 ± 0.06 times faster than 'jq "[.[].category] | sort | unique" channels.json'
test command:
hyperfine --warmup 3 \
    'jq "[.[].category] | sort | unique" channels.json'  \
    'jq "[.[].category | select((. != null) and (. != \"XXX\"))] | sort | unique" channels.json' \
    'jq ".[].category" channels.json | sort -u'
If we only test sort (without uniqueness), again jq is **9% slower** than sort:

| Command | Mean [ms] | Min [ms] | Max [ms] | Relative |
|:---|---:|---:|---:|---:|
| jq "[.[].category] \| sort" channels.json | 133.9 ± 1.6 | 131.1 | 138.2 | 1.09 ± 0.02 |
| jq ".[].category" channels.json \| sort | 123.0 ± 1.3 | 120.5 | 125.7 | 1.00 |
Summary
  'jq ".[].category" channels.json | sort' ran
    1.09 ± 0.02 times faster than 'jq "[.[].category] | sort" channels.json'
versions:
jq-1.5-1-a5b5cbe
sort (GNU coreutils) 8.28
I expected using jq's internal functions to result in faster processing than piping into an external app, which itself has to be spawned. Am I using jq poorly? **update** I just repeated this experiment on a host with flash storage, an Arm CPU and these versions:
jq-1.6
sort (GNU coreutils) 8.32
result:
Benchmark #1: jq "[.[].category] | sort" channels.json
  Time (mean ± σ):     587.8 ms ±   3.9 ms    [User: 539.5 ms, System: 44.2 ms]
  Range (min … max):   582.8 ms … 594.2 ms    10 runs
 
Benchmark #2: jq ".[].category" channels.json | sort
  Time (mean ± σ):     606.0 ms ±   8.6 ms    [User: 569.5 ms, System: 49.0 ms]
  Range (min … max):   589.6 ms … 616.2 ms    10 runs
 
Summary
  'jq "[.[].category] | sort" channels.json' ran
    1.03 ± 0.02 times faster than 'jq ".[].category" channels.json | sort'
Now jq sort runs 3% faster than GNU sort :D
Zeta.Investigator (1190 rep)
Jun 26, 2021, 08:39 AM • Last activity: Jun 26, 2021, 08:28 PM
1 votes
0 answers
56 views
How can I frequently update a file without harming my disk?
I've got an i3 install on a laptop with an SSD. Currently I have it configured to save the WM layout on many different events. The tool that I'm using to do this is built on Python, and I'm just running it with an ampersand through the i3 config. However, I'm concerned that this will hurt the longevity of my disk. I understand a bit about the "virtual filesystem", but I'm not really sure how that applies here. Should I be concerned about my disk? And if so, how can I change my setup to avoid that, while still being able to frequently update this table (which is stored as multiple JSON files)? Thanks!
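One common mitigation, sketched as an assumption about the setup rather than something from the question: keep the frequently rewritten JSON files on tmpfs (RAM-backed, so the SSD is untouched) and flush them to disk on a slow schedule or at logout.

```bash
# Sketch: write layout JSON to tmpfs, sync it to the SSD only occasionally.
mkdir -p /dev/shm/i3-layouts                                  # RAM-backed scratch directory
# ... point the layout-saving tool at /dev/shm/i3-layouts ...
rsync -a /dev/shm/i3-layouts/ "$HOME/.config/i3-layouts/"     # run from a timer or logout hook
```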
wknr (11 rep)
May 17, 2021, 02:52 PM • Last activity: May 18, 2021, 11:12 AM
-1 votes
2 answers
46 views
Optimize time when creating archive
Currently I am using the following command to create an archive with files older than 7 days:
find /var/tunningLog/ -type f -mtime +7 -print0 | tar -czf "/var/tunningLog/$(date '+%Y-%m-%d').tar.gz" --null -T - && echo "OK" || echo "NOK"
But it is taking too long (currently /var/tunningLog/ holds 49G). Is there any way to speed up the process or to improve the command? Thx
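If the bottleneck is the single-threaded gzip inside tar (a guess, not established in the question), a sketch of a parallel variant that keeps the same find/--null/-T - file selection but compresses with pigz:

```bash
# Sketch: same file selection, compression spread across all cores via pigz.
find /var/tunningLog/ -type f -mtime +7 -print0 \
  | tar --null -T - -cf - \
  | pigz -p "$(nproc)" > "/var/tunningLog/$(date '+%Y-%m-%d').tar.gz" \
  && echo "OK" || echo "NOK"
```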
dejanualex (369 rep)
Jan 14, 2021, 02:56 PM • Last activity: Jan 28, 2021, 02:54 PM