Compress a large number of large files fast

22 votes
5 answers
44096 views
I have about 200 GB of log data generated daily, distributed among about 150 different log files. I have a script that moves the files to a temporary location and does a tar-bz2 on the temporary directory. I get good results: the 200 GB of logs compress to about 12-15 GB.

The problem is that it takes forever to compress the files. The cron job runs daily at 2:30 AM and keeps running until 5:00-6:00 PM. Is there a way to improve the speed of the compression and complete the job faster? Any ideas?

Don't worry about other processes; the location where the compression happens is on a NAS, and I can mount the NAS on a dedicated VM and run the compression script from there.

Here is the output of top for reference:

    top - 15:53:50 up 1093 days, 6:36, 1 user, load average: 1.00, 1.05, 1.07
    Tasks: 101 total, 3 running, 98 sleeping, 0 stopped, 0 zombie
    Cpu(s): 25.1%us, 0.7%sy, 0.0%ni, 74.1%id, 0.0%wa, 0.0%hi, 0.1%si, 0.1%st
    Mem:   8388608k total, 8334844k used,    53764k free,    9800k buffers
    Swap: 12550136k total,     488k used, 12549648k free, 4936168k cached

      PID USER   PR NI  VIRT  RES  SHR S %CPU %MEM     TIME+ COMMAND
     7086 appmon 18  0 13256 7880  440 R 96.7  0.1 791:16.83 bzip2
     7085 appmon 18  0 19452 1148  856 S  0.0  0.0   1:45.41 tar cjvf /nwk_storelogs/compressed_logs/compressed_logs_2016_30_04.tar.bz2 /nwk_storelogs/temp/ASPEN-GC-32459:nkp-aspn-1014.log /nwk_stor
    30756 appmon 15  0 85952 1944 1000 S  0.0  0.0   0:00.00 sshd: appmon@pts/0
    30757 appmon 15  0 64884 1816 1032 S  0.0  0.0   0:00.01 -tcsh
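For reference, the script in question is probably something roughly like the sketch below. The /nwk_storelogs paths and archive name come from the tar command visible in the top output; the source log directory and the date format are assumptions, not the actual script. Note that bzip2 is single-threaded, which matches one core pegged at ~97% while the box sits ~74% idle.

    #!/bin/sh
    # Rough sketch of the workflow described above. Paths under /nwk_storelogs
    # are taken from the top output; /var/log/app and the date format are guesses.

    TEMP_DIR=/nwk_storelogs/temp
    OUT_DIR=/nwk_storelogs/compressed_logs
    STAMP=$(date +%Y_%d_%m)   # would yield names like compressed_logs_2016_30_04

    # Move the ~150 daily log files to the temporary location
    mv /var/log/app/*.log "$TEMP_DIR"/        # hypothetical source directory

    # Single tar + bzip2 pass over the temporary directory; bzip2 uses one core
    tar cjvf "$OUT_DIR/compressed_logs_${STAMP}.tar.bz2" "$TEMP_DIR"/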
Asked by anu (362 rep)
May 4, 2016, 11:00 PM
Last activity: Jun 17, 2025, 09:12 AM