
Unix & Linux Stack Exchange

Q&A for users of Linux, FreeBSD and other Unix-like operating systems

Latest Questions

1 votes
1 answers
183 views
Understanding `pgpgin` in `/proc/vmstat` as I/O Counters: Relationship with I/O Bandwidth Measurements
Hi Kernel I/O Experts,

I have a question regarding the `pgpgin` and `pgpgout` counters in `/proc/vmstat`, specifically focusing on `pgpgin`. I've been exploring performance monitoring tools like `vmstat` and `iotop`, which are very practical command-line tools for observing I/O performance. Upon examining their code, I noticed that these tools report "Current I/O" using the `pgpgin` and `pgpgout` counters instead of directly reading current I/O statistics from block devices, as tools like `fio` do.

First Question: I am trying to understand **the exact relationship between the pgpgin and pgpgout counters and actual I/O bandwidth. Why do these tools rely on paging-related counters to represent I/O activity?** How exactly are `pgpgin` and `pgpgout` updated, and what system components are responsible for these updates? In short, could you explain when and why these counters reflect disk I/O operations?

Second Question (Edge Case): **Are there specific scenarios or edge cases where pgpgin bandwidth does not accurately correspond to actual disk bandwidth?** While benchmarking SSD read performance using `fio` with io_uring polling, I observed that the I/O bandwidth reported by `fio` (and SSD stats) is significantly lower than the bandwidth indicated by `pgpgin`. This discrepancy led me to investigate how `pgpgin` reflects I/O activity (thus the first question above). I have confirmed that this mismatch is consistent and not due to transient system noise.

Any insights into these counters, their update mechanisms, and their relationship with real I/O performance would be greatly appreciated. Thank you!
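To make the comparison in the second question concrete, here is a minimal sketch of how the two counters could be sampled side by side. It assumes that `pgpgin` in `/proc/vmstat` is exported in KiB and that the sixth field of a `/proc/diskstats` line is the number of 512-byte sectors read; the device name `nvme0n1` and all function names are hypothetical, chosen only for illustration.

```python
import time


def parse_vmstat(text):
    """Return the contents of /proc/vmstat as a {name: int} dict."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def diskstats_read_kib(text, device):
    """Return KiB read by one device, from the contents of /proc/diskstats."""
    for line in text.splitlines():
        fields = line.split()
        # Fields: major, minor, name, reads completed, reads merged, sectors read, ...
        if len(fields) >= 6 and fields[2] == device:
            return int(fields[5]) * 512 // 1024
    raise KeyError(device)


def sample_once(device):
    """Read both counters once; returns (pgpgin_kib, device_read_kib)."""
    with open("/proc/vmstat") as handle:
        pgpgin = parse_vmstat(handle.read())["pgpgin"]
    with open("/proc/diskstats") as handle:
        device_kib = diskstats_read_kib(handle.read(), device)
    return pgpgin, device_kib


def compare_rates(device="nvme0n1", seconds=1.0):
    """Return (pgpgin KiB/s, device read KiB/s) over one interval."""
    first = sample_once(device)
    time.sleep(seconds)
    second = sample_once(device)
    return tuple((b - a) / seconds for a, b in zip(first, second))
```

Calling `compare_rates()` while the benchmark runs would show whether the two rates diverge on a given system; it does not explain why they diverge.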
JGL (161 rep)
Sep 12, 2024, 09:56 AM • Last activity: Sep 12, 2024, 11:45 AM
0 votes
1 answers
124 views
How to identify the cause of a process reading from disk at rates above 400 Mb/s
I am managing some virtual machines in Azure and a few times a week, at apparently random times, some of them start **I/O reading at speeds above 400 Mb/s**. This occurs for one machine at a time, not simultaneously. These machines use SSDs as hard drives, but those read speeds seem abnormal.

Additionally, the machines experiencing this activity **become unreachable via SSH after a few minutes**.

I am currently using **iotop** and writing its output to a log file, so that after rebooting the stuck machine I can go through it and try to identify the process or processes causing the trouble. I am also using crontab to run it every minute. Below is the script I am currently using:

    #!/usr/bin/env bash
    OUT=/var/log/zs/io.log
    echo $(date) >> $OUT
    echo $(iotop -o -b -n 1|head -n 2) >> $OUT
    echo $(iotop -o -b -n 1|head -n 6|tail -n +4) >> $OUT

And the log file showing the I/O peak:

    Fri Jan 12 09:33:01 CET 2024
    Total DISK READ : 113.45 M/s | Total DISK WRITE : 7.04 M/s Actual DISK READ: 171.85 M/s | Actual DISK WRITE: 85.79 M/s
    3350 be/4 root 41.59 M/s 0.00 B/s ?unavailable? containerd 11744 be/4 root 112.49 M/s 0.00 B/s ?unavailable? dockerd -H fd:// --containerd=/run/containerd/containerd.sock 11925 be/4 root 1142.56 K/s 0.00 B/s ?unavailable? dockerd -H fd:// --containerd=/run/containerd/containerd.sock
    Fri Jan 12 09:58:35 CET 2024

It seems to be related to a Docker process, but I would like to know:

1. Is it possible to prevent the machine from becoming unreachable?
2. How can I track down the exact Docker container causing this trouble?

Thank you in advance.
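For the second point, one possible approach is to map a PID reported by iotop to a container through its cgroup membership. The sketch below assumes that the cgroup path in `/proc/<pid>/cgroup` embeds the full 64-hex-digit container ID, which is common for Docker but not guaranteed; the function names are hypothetical. In the log above the I/O is attributed to `dockerd` and `containerd` themselves, so this only helps when a process running inside a container shows up.

```python
import re

# A full Docker container ID is normally 64 lowercase hex characters.
_CONTAINER_ID = re.compile(r"[0-9a-f]{64}")


def container_id_from_cgroup(cgroup_text):
    """Return the first 64-hex-digit ID found in /proc/<pid>/cgroup text, or None."""
    match = _CONTAINER_ID.search(cgroup_text)
    return match.group(0) if match else None


def container_id_for_pid(pid):
    """Look up the container ID for a live PID (Linux only)."""
    with open(f"/proc/{pid}/cgroup") as handle:
        return container_id_from_cgroup(handle.read())
```

The returned ID could then be compared against the output of `docker ps --no-trunc`.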
Stacker12345 (3 rep)
Jan 12, 2024, 10:40 AM • Last activity: Jan 12, 2024, 12:02 PM
1 votes
1 answers
742 views
iotop is missing the SWAPIN and IO columns on AlmaLinux 8
I am using `iotop` to monitor disk activity, and on AlmaLinux 8 systems `iotop` does not show all fields. Here are the columns on AlmaLinux 8:
```
PID  PRIO  USER     DISK READ DISK WRITE>    COMMAND
```
Here are the columns on CentOS 6/7:
```
PID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND
```
How do I make `iotop` display SWAPIN and IO> by default? It also shows `?unavailable?` when I run the following:
[root@alma8-dev test]# iotop -botqk
18:22:09 Total DISK READ :       0.00 K/s | Total DISK WRITE :       0.00 K/s
18:22:09 Actual DISK READ:       0.00 K/s | Actual DISK WRITE:       0.00 K/s
    TIME    TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN      IO    COMMAND
18:22:11 Total DISK READ :       0.00 K/s | Total DISK WRITE :       0.00 K/s
18:22:11 Actual DISK READ:       0.00 K/s | Actual DISK WRITE:       0.00 K/s
18:22:12 Total DISK READ :       0.00 K/s | Total DISK WRITE :      64.94 K/s
18:22:12 Actual DISK READ:       0.00 K/s | Actual DISK WRITE:      64.94 K/s
b'18:22:12    2294 be/3 root        0.00 K/s   64.94 K/s ?unavailable?  d3 flusher-0 pick0 LINUX'
b'18:22:12  849204 be/4 root        0.00 K/s    0.00 K/s ?unavailable?  [kworker/1:3-events]'
Here's the kernel config:
[root@alma8-dev test]#  egrep 'CONFIG_TASK_DELAY_ACCT|CONFIG_TASK_IO_ACCOUNTING|CONFIG_TASKSTATS|CONFIG_VM_EVENT_COUNTERS' /boot/config-4.18.0-425.10.1.e_7.x86_64
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
CONFIG_TASK_IO_ACCOUNTING=y
CONFIG_VM_EVENT_COUNTERS=y
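The same check can be scripted when several machines need to be compared. This is only a sketch of the idea behind the `egrep` call above; the function name is hypothetical, and it reports build-time options only, not whether a feature is switched on at runtime.

```python
def config_values(config_text, wanted):
    """Map each wanted CONFIG_* name to its value in a kernel config, or None."""
    found = {}
    for line in config_text.splitlines():
        line = line.strip()
        # Disabled options appear as comments: "# CONFIG_FOO is not set".
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep:
            found[name] = value
    return {name: found.get(name) for name in wanted}
```

Passing the contents of the `/boot/config-*` file and the four option names would return `"y"` for each of them on the system shown above.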
supmethods (561 rep)
Mar 1, 2023, 02:15 AM • Last activity: Apr 20, 2023, 01:01 PM
1 votes
1 answers
286 views
jbd2/sda2-8 utilizing disk I/O and Xorg.0.log log
I noticed that `jbd2/sda2-8` is always using the disk, with `iotop` showing IO% between 1% and 5%. I checked the logs and noticed that Xorg.0.log is quite large. I see frequent connections and disconnections. Is there a way to reduce this?
[root@vmcloudm51 autoit]# tail -f /var/log/Xorg.0.log
[177501.285] AUDIT: Tue Feb 28 14:00:23 2023: 3444: client 19 disconnected
[177531.342] AUDIT: Tue Feb 28 14:00:53 2023: 3444: client 19 connected from local host ( uid=986 gid=1009 pid=29884 )
  Auth name: MIT-MAGIC-COOKIE-1 ID: 919
[177531.343] AUDIT: Tue Feb 28 14:00:53 2023: 3444: client 19 disconnected
[177561.398] AUDIT: Tue Feb 28 14:01:23 2023: 3444: client 19 connected from local host ( uid=986 gid=1009 pid=29884 )
  Auth name: MIT-MAGIC-COOKIE-1 ID: 919
[177561.399] AUDIT: Tue Feb 28 14:01:23 2023: 3444: client 19 disconnected
[177591.541] AUDIT: Tue Feb 28 14:01:53 2023: 3444: client 19 connected from local host ( uid=986 gid=1009 pid=29884 )
  Auth name: MIT-MAGIC-COOKIE-1 ID: 919
[177591.557] AUDIT: Tue Feb 28 14:01:53 2023: 3444: client 19 disconnected
[177621.616] AUDIT: Tue Feb 28 14:02:23 2023: 3444: client 19 connected from local host ( uid=986 gid=1009 pid=29884 )
  Auth name: MIT-MAGIC-COOKIE-1 ID: 919
[177621.617] AUDIT: Tue Feb 28 14:02:23 2023: 3444: client 19 disconnected
supmethods (561 rep)
Feb 28, 2023, 03:20 AM • Last activity: Feb 28, 2023, 05:06 PM
4 votes
2 answers
6473 views
IOTOP complains: CONFIG_TASK_DELAY_ACCT not enabled in kernel
Looking about, I see that the standard fix is to add this to the kernel boot parameters. Using systemd-boot, my arch.conf looks like this:

    title Arch Linux
    linux /vmlinuz-linux
    initrd /intel-ucode.img
    initrd /initramfs-linux.img
    options root=PARTUUID="98b3b4f7-e7f9-6f49-be81-a2ee709c7a3e" rw

How do I add CONFIG_TASK_DELAY_ACCT to the options entry? Another line? Or by using some delimiter, add it to the existing line? What value should I be setting it to?
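As a hedged illustration of the "add it to the existing line" variant: `CONFIG_TASK_DELAY_ACCT` is a build-time option, while the boot parameter documented by the kernel for enabling delay accounting is `delayacct`. The sketch below assumes that is the parameter wanted here and appends it, space-separated, to the `options` line of a loader entry; the function name is hypothetical.

```python
def add_kernel_parameter(entry_text, parameter):
    """Append a parameter to the 'options' line of a systemd-boot entry.

    Returns the new entry text. The parameter is not added twice, and an
    'options' line is created if the entry has none.
    """
    lines = entry_text.splitlines()
    for index, line in enumerate(lines):
        parts = line.split(None, 1)
        if parts and parts[0] == "options":
            existing = parts[1].split() if len(parts) > 1 else []
            if parameter not in existing:
                lines[index] = f"{line.rstrip()} {parameter}"
            break
    else:
        lines.append(f"options {parameter}")
    return "\n".join(lines) + "\n"
```

Applied to the entry above with `"delayacct"`, only the last line changes: the parameter is appended after `rw`.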
Stephen Boston (2526 rep)
May 30, 2022, 11:00 PM • Last activity: Aug 9, 2022, 09:37 PM
1 votes
1 answers
463 views
why is iotop showing "?err"
When I run `iotop`, the second column "PRIO" shows `?err` everywhere. What does this mean? I did not find any mention of err in the manual page. I am using a custom-compiled kernel. Could it be that some option is missing in my kernel? How do I find which one?

**EDIT** I have all the required options in my kernel:

    CONFIG_TASKSTATS=y
    CONFIG_TASK_DELAY_ACCT=y
    CONFIG_TASK_IO_ACCOUNTING=y
    CONFIG_VM_EVENT_COUNTERS=y
Martin Vegter (598 rep)
Apr 17, 2022, 11:44 AM • Last activity: Apr 22, 2022, 11:21 AM
5 votes
3 answers
2866 views
Why are processes blocked by I/O in case of heavy system load?
I have a workstation (2x Intel Xeon family CPUs and 128GiB of RAM) running several virtual machines. While the combined CPU usage is <30%, the load average is between 20 and 25. For example, if I execute a `tar -xzvf vm_data.tgz --directory vm4/ --strip-components=1` command, the `gzip` process spends 90% - 99% of its time blocked by I/O and the command takes forever to complete.

On the other hand, the actual reads and writes to disks are very low compared to the hardware limits of SATA 3.0 or of SSDs (I'm using a single [Kingston SA400S37960G](https://www.kingston.com/en/ssd/a400-solid-state-drive?partnum=SA400S37%2F960G) SSD). What might cause a process (gzip in my example) to wait on I/O while the actual disk reads and writes appear to be very low? My first thought was that maybe the system interrupts are very high and that's what's blocking the I/O, but according to /proc/interrupts this does not seem to be the case, as none of the counters are increasing rapidly.
Martin (8156 rep)
Sep 17, 2020, 10:12 AM • Last activity: Mar 18, 2021, 11:31 PM
2 votes
0 answers
769 views
Process shows as 100% I/O bound while producing minimal disk activity, disk util is at 100%
We are having quite a strange problem. There is a program (a cryptocurrency node, to be precise) which has a local database of all the transactions ever made. The database is huge, around 15 TB. The problem is that the program won't synchronize with the network, though it has enough peers, and knowledge about new and old blocks is not a problem.

Now the strange part: I started the same program from scratch, without that 15 TB of history, and it started syncing immediately, loading the disk by about 50% per iostat. CPU and memory utilization are negligible. Absolute figures are:

- Read speed: 5MB/s
- Write speed: 20MB/s
- iotop: 20% on average for this process

When I switch to the historical DB (15 TB), iostat shows 100% disk utilization and iotop shows multiple forked processes, the majority of them sitting at 99% I/O, but actual I/O is not happening, judging by the volume reported by iotop or iostat. Both read and write are within 1MB/s.

This is running on an MS Azure VM. Through the Azure portal we see that disk utilization is around 1% in "full" mode and writing is around 20% in "fresh" mode, so throttling by the cloud operator is not an issue either.

Now the question: how do I diagnose what exactly the program is doing with the disk? I was thinking about random I/O and tried to strace the lseek function. I got some calls in both fresh and full modes, with a much lower ratio in full mode, while I expected the opposite. What does it do in full mode then? The program has quite a bearable number of file descriptors (/proc/<pid>/fd), below 50 together with peer TCP connections.

How can it be, in general, that both iostat and iotop show 100% utilization with no actual consumption of I/O bandwidth? We even had a call with an engineer from Microsoft, who said that iostat may not be accurate, especially with SSDs. That might be, but when it says util is 100%, iotop confirms it, and the program is not doing what it is supposed to do, what is an alternative explanation?
DimaA6_ABC (121 rep)
Jan 23, 2021, 02:14 PM
-1 votes
1 answers
57 views
RAM bandwidth activity
I compile sources in `/tmp`, which is mounted in RAM. I'd like to know how fast it is! That is, I'd like to show/monitor I/O activity (MB/s) for this device. *If I wanted to monitor HD bandwidth I would use tools like `iotop`, but for RAM that doesn't work!*

---

**Long story short:** how can I monitor/show RAM bandwidth (I/O) activity?
mattia.b89 (3398 rep)
Sep 22, 2019, 12:24 PM • Last activity: Sep 29, 2019, 10:14 AM
16 votes
2 answers
8722 views
Error with command iotop on CentOS
When using `sudo iotop` (latest version `0.6-2.el7`) in a terminal in my newly installed CentOS 7.5, I get the following error message:

    Traceback (most recent call last):
      File "/sbin/iotop", line 17, in <module>
        main()
      File "/usr/lib/python2.7/site-packages/iotop/ui.py", line 620, in main
        main_loop()
      File "/usr/lib/python2.7/site-packages/iotop/ui.py", line 610, in <lambda>
        main_loop = lambda: run_iotop(options)
      File "/usr/lib/python2.7/site-packages/iotop/ui.py", line 508, in run_iotop
        return curses.wrapper(run_iotop_window, options)
      File "/usr/lib64/python2.7/curses/wrapper.py", line 43, in wrapper
        return func(stdscr, *args, **kwds)
      File "/usr/lib/python2.7/site-packages/iotop/ui.py", line 501, in run_iotop_window
        ui.run()
      File "/usr/lib/python2.7/site-packages/iotop/ui.py", line 155, in run
        self.process_list.duration)
      File "/usr/lib/python2.7/site-packages/iotop/ui.py", line 434, in refresh_display
        lines = self.get_data()
      File "/usr/lib/python2.7/site-packages/iotop/ui.py", line 415, in get_data
        return list(map(format, processes))
      File "/usr/lib/python2.7/site-packages/iotop/ui.py", line 388, in format
        cmdline = p.get_cmdline()
      File "/usr/lib/python2.7/site-packages/iotop/data.py", line 292, in get_cmdline
        proc_status = parse_proc_pid_status(self.pid)
      File "/usr/lib/python2.7/site-packages/iotop/data.py", line 196, in parse_proc_pid_status
        key, value = line.split(':\t', 1)
    ValueError: need more than 1 value to unpack

**Any idea how to fix this problem?**
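The last frame of the traceback shows the mechanism: `line.split(':\t', 1)` returns a single-element list for any line that lacks the separator, and unpacking it into `key, value` then fails. The following is a minimal sketch of a more tolerant parser, written for illustration only; it is not iotop's code and not necessarily how the problem was fixed upstream.

```python
def parse_status(text):
    """Parse /proc/<pid>/status-style text into a dict, skipping odd lines."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            # Blank or malformed line: strict two-value unpacking would
            # raise ValueError here, so skip it instead.
            continue
        fields[key] = value.strip()
    return fields
```

Unlike `split`, `str.partition` always returns three parts, so a line without the separator can be detected and skipped rather than crashing the caller.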
F. Privé (263 rep)
May 29, 2018, 07:24 AM • Last activity: Jan 23, 2019, 04:02 AM
6 votes
3 answers
8572 views
Why does a gunzip to dd pipeline slow down at the end?
my command:

    gunzip -c serial2udp.image.gz | sudo dd of=/dev/mmcblk0 conv=fsync,notrunc status=progress bs=4M

my output:

    15930949632 bytes (16 GB, 15 GiB) copied, 1049 s, 15.2 MB/s
    0+331128 records in
    0+331128 records out
    15931539456 bytes (16 GB, 15 GiB) copied, 1995.2 s, 8.0 MB/s

the card: SanDisk Ultra 32GB MicroSDHC Class 10 UHS Memory Card Speed Up To 30MB/s
distribution: 16.0.4 xenial with xfce
kernel version: 4.13.0.37-generic

I understand that taking 17 minutes seems reasonable from what I've read. Playing with the block size doesn't really seem to make much of a difference (bs=100M still exhibits this behaviour with similar timestamps). Why do the updates hang, and why doesn't it produce a finished report for another 16 minutes? iotop tells me that mmcqd/0 is still running in the background at this point (at 99% IO), so I figure there is a cache somewhere that is holding up the final 5MB, but I thought fsync should make sure that doesn't happen. iotop shows no traffic for dd at this time either. Ctrl-C is all but useless, and I don't want to corrupt my drive after writing to it.
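One way to test the "cache somewhere" guess would be to watch how much data the kernel still holds as dirty or under writeback while dd appears to hang. This is a sketch under the assumption that the `Dirty` and `Writeback` lines of `/proc/meminfo` are the relevant indicators; the function name is hypothetical.

```python
def dirty_and_writeback_kib(meminfo_text):
    """Extract the Dirty and Writeback values (in kB) from /proc/meminfo text."""
    values = {}
    for line in meminfo_text.splitlines():
        name, sep, rest = line.partition(":")
        # Exact name match, so WritebackTmp is not picked up by mistake.
        if sep and name in ("Dirty", "Writeback"):
            values[name] = int(rest.split()[0])
    return values
```

Reading `/proc/meminfo` once per second during the final phase and printing these two values would show whether a large backlog is slowly draining to the card.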
cts (81 rep)
Sep 9, 2018, 12:03 PM • Last activity: Sep 24, 2018, 09:29 PM
2 votes
1 answers
2170 views
How to see an average of I/O of disk with iotop command?
With `iotop -o` command I can see the write and read speed of the disk per second. But it varies a lot, from 0 to high values. I'd like to see an average of it per minute or per 10 seconds. How can I do it?
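A possible approach is to run iotop in batch mode and average the per-second totals afterwards. The sketch below assumes the `Total DISK READ : ... K/s | Total DISK WRITE : ... K/s` line format that batch mode prints when `-k` is given; the function name is hypothetical.

```python
import re

# Matches the summary line printed by batch mode with -k (values in K/s).
_TOTAL = re.compile(
    r"Total DISK READ\s*:\s*([\d.]+) K/s \| Total DISK WRITE\s*:\s*([\d.]+) K/s"
)


def average_totals(lines):
    """Return (mean read K/s, mean write K/s) over all summary lines, or None."""
    reads, writes = [], []
    for line in lines:
        match = _TOTAL.search(line)
        if match:
            reads.append(float(match.group(1)))
            writes.append(float(match.group(2)))
    if not reads:
        return None
    return sum(reads) / len(reads), sum(writes) / len(writes)
```

Feeding it the output lines of `iotop -b -o -k -d 1 -n 10` would give a 10-second average; changing `-n` changes the averaging window.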
Henrique Barcelos (121 rep)
Aug 16, 2018, 08:55 PM • Last activity: Aug 16, 2018, 10:09 PM