Unix & Linux Stack Exchange

Q&A for users of Linux, FreeBSD and other Unix-like operating systems

Latest Questions

1 votes

1 answers

183 views

Understanding `pgpgin` in `/proc/vmstat` as I/O Counters: Relationship with I/O Bandwidth Measurements

Hi Kernel I/O Experts,

I have a question regarding the pgpgin and pgpgout counters in /proc/vmstat, specifically focusing on pgpgin. I’ve been exploring performance monitoring tools like vmstat and iotop, which are very practical command-line tools for observing I/O performance. Upon examining their code, I noticed that these tools report "Current I/O" using the pgpgin and pgpgout counters instead of directly reading current I/O statistics from block devices, as tools like fio do.

First Question: I am trying to understand **the exact relationship between the pgpgin and pgpgout counters and actual I/O bandwidth. Why do these tools rely on paging-related counters to represent I/O activity?** How exactly are pgpgin and pgpgout updated, and what system components are responsible for these updates? In short, could you explain when and why these counters reflect disk I/O operations?

Second Question (Edge Case): **Are there specific scenarios or edge cases where pgpgin bandwidth does not accurately correspond to actual disk bandwidth?** 

While benchmarking SSD read performance using fio with io_uring polling, I observed that the I/O bandwidth reported by fio (and SSD stats) is significantly lower than the bandwidth indicated by pgpgin. This discrepancy led me to investigate how pgpgin reflects I/O activity (thus the first question above). 

I have confirmed that this mismatch is consistent and not due to transient system noise.

Any insights into these counters, their update mechanisms, and their relationship with real I/O performance would be greatly appreciated. Thank you!
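To make the comparison in the question concrete, here is a minimal sketch of how a rate could be derived from the two counters. It assumes that /proc/vmstat exposes pgpgin and pgpgout in KiB (which is what the sectors-to-kbytes conversion in the kernel's vmstat code suggests) and that the counters are bumped when the block layer submits I/O; both points are worth verifying against your kernel version. The function names are hypothetical and are not taken from vmstat or iotop.

```python
import time


def parse_vmstat(text):
    """Parse the 'name value' lines of /proc/vmstat into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def paging_rate_kib(prev, cur, seconds):
    """Return (read, write) rates per second between two snapshots.

    Assumption: pgpgin/pgpgout are exported in KiB, so the result is KiB/s.
    """
    read = (cur["pgpgin"] - prev["pgpgin"]) / seconds
    write = (cur["pgpgout"] - prev["pgpgout"]) / seconds
    return read, write


def sample(path="/proc/vmstat", interval=1.0):
    """Take two snapshots 'interval' seconds apart and return the rates."""
    with open(path) as f:
        prev = parse_vmstat(f.read())
    time.sleep(interval)
    with open(path) as f:
        cur = parse_vmstat(f.read())
    return paging_rate_kib(prev, cur, interval)


if __name__ == "__main__":
    read_kib, write_kib = sample()
    print(f"pgpgin: {read_kib:.1f} KiB/s, pgpgout: {write_kib:.1f} KiB/s")
```

Running this next to fio and comparing the figure with the per-device statistics is one way to quantify the mismatch described above. If the counters are incremented per submitted request at each block layer, stacked devices could plausibly be counted more than once, but that is a hypothesis to check rather than a confirmed explanation.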

JGL (161 rep)

Sep 12, 2024, 09:56 AM • Last activity: Sep 12, 2024, 11:45 AM

0 votes

1 answers

124 views

How to identify the cause of a process reading from disk at rates above 400 Mb/s

docker container azure iotop

I am managing some virtual machines in Azure and a few times a week, at apparently random times, some of them start **I/O reading at speeds above 400 Mb/s**. This occurs for one machine at a time, not simultaneously.

These machines use SSDs as hard drives, but those read speeds seem abnormal.

Additionally, the machines experiencing this activity **become unreachable via SSH after a few minutes**.

I am currently running **iotop** and writing its output to a log file, so that after rebooting the stuck machine I can go through the log and try to identify the process or processes causing the trouble.

I am also using crontab to run it every minute.

Find below the current script I am using:

    #!/usr/bin/env bash
    OUT=/var/log/zs/io.log
    echo $(date) >> $OUT
    echo $(iotop -o -b -n 1|head -n 2) >> $OUT
    echo $(iotop -o -b -n 1|head -n 6|tail -n +4) >> $OUT

And the log file showing the I/O peak:

    Fri Jan 12 09:33:01 CET 2024
    Total DISK READ : 113.45 M/s | Total DISK WRITE : 7.04 M/s Actual DISK READ: 171.85 M/s | Actual DISK WRITE: 85.79 M/s
    3350 be/4 root 41.59 M/s 0.00 B/s ?unavailable? containerd 11744 be/4 root 112.49 M/s 0.00 B/s ?unavailable? dockerd -H fd:// --containerd=/run/containerd/containerd.sock 11925 be/4 root 1142.56 K/s 0.00 B/s ?unavailable? dockerd -H fd:// --containerd=/run/containerd/containerd.sock
    Fri Jan 12 09:58:35 CET 2024

It seems to be related to a Docker process, but I would like to know:

1. Is it possible to prevent the machine from becoming unreachable?
2. How to track down the exact Docker container causing this trouble?

Thank you in advance.
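For the second point, one possible approach is to map a PID reported by iotop to a container through /proc/<pid>/cgroup. The sketch below assumes that the container ID appears as a 64-character hex string after `docker/` or `docker-` in the cgroup path, which is common but depends on the cgroup driver and version. The function names are hypothetical.

```python
import re

# Assumption: the cgroup path contains "docker/<id>" (cgroup v1, cgroupfs driver)
# or "docker-<id>.scope" (systemd driver), where <id> is 64 hex characters.
_CONTAINER_ID = re.compile(r"docker[-/]([0-9a-f]{64})")


def container_id_from_cgroup(text):
    """Return the first Docker container ID found in cgroup file content, or None."""
    for line in text.splitlines():
        match = _CONTAINER_ID.search(line)
        if match:
            return match.group(1)
    return None


def container_id_for_pid(pid):
    """Read /proc/<pid>/cgroup and extract a container ID if one is present."""
    with open(f"/proc/{pid}/cgroup") as f:
        return container_id_from_cgroup(f.read())
```

Note that in the log above the heavy readers are dockerd and containerd themselves, which normally do not run inside a container cgroup, so this lookup would return None for them. It is only useful for PIDs that belong to a container's own processes.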

Stacker12345 (3 rep)

Jan 12, 2024, 10:40 AM • Last activity: Jan 12, 2024, 12:02 PM

1 votes

1 answers

742 views

iotop fields on missing SWAPIN and IO columns on AlmaLinux 8

centos alma-linux iotop

I am using iotop to monitor disk activity, and on AlmaLinux 8 systems iotop does not have all fields. Here are the columns on AlmaLinux 8:

    PID  PRIO  USER     DISK READ DISK WRITE>    COMMAND

Here are the columns on CentOS 6/7:

    PID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN     IO>    COMMAND

How do I make iotop display SWAPIN and IO> by default? It also shows ?unavailable? when I run the following:

    [root@alma8-dev test]# iotop -botqk
    18:22:09 Total DISK READ :       0.00 K/s | Total DISK WRITE :       0.00 K/s
    18:22:09 Actual DISK READ:       0.00 K/s | Actual DISK WRITE:       0.00 K/s
        TIME    TID  PRIO  USER     DISK READ  DISK WRITE  SWAPIN      IO    COMMAND
    18:22:11 Total DISK READ :       0.00 K/s | Total DISK WRITE :       0.00 K/s
    18:22:11 Actual DISK READ:       0.00 K/s | Actual DISK WRITE:       0.00 K/s
    18:22:12 Total DISK READ :       0.00 K/s | Total DISK WRITE :      64.94 K/s
    18:22:12 Actual DISK READ:       0.00 K/s | Actual DISK WRITE:      64.94 K/s
    b'18:22:12    2294 be/3 root        0.00 K/s   64.94 K/s ?unavailable?  d3 flusher-0 pick0 LINUX'
    b'18:22:12  849204 be/4 root        0.00 K/s    0.00 K/s ?unavailable?  [kworker/1:3-events]'

Here's the kernel config:

    [root@alma8-dev test]#  egrep 'CONFIG_TASK_DELAY_ACCT|CONFIG_TASK_IO_ACCOUNTING|CONFIG_TASKSTATS|CONFIG_VM_EVENT_COUNTERS' /boot/config-4.18.0-425.10.1.e_7.x86_64
    CONFIG_TASKSTATS=y
    CONFIG_TASK_DELAY_ACCT=y
    CONFIG_TASK_IO_ACCOUNTING=y
    CONFIG_VM_EVENT_COUNTERS=y

supmethods (561 rep)

Mar 1, 2023, 02:15 AM • Last activity: Apr 20, 2023, 01:01 PM

1 votes

1 answers

286 views

jbd2/sda2-8 utilizing disk I/O and Xorg.0.log log

linux centos xorg iotop

I noticed that jbd2/sda2-8 is always using the disk, with iotop showing IO% between 1% and 5%. I checked the logs and noticed that Xorg.0.log is quite large. I see frequent connections and disconnections. Is there a way to reduce this?

[root@vmcloudm51 autoit]# tail -f /var/log/Xorg.0.log
[177501.285] AUDIT: Tue Feb 28 14:00:23 2023: 3444: client 19 disconnected
[177531.342] AUDIT: Tue Feb 28 14:00:53 2023: 3444: client 19 connected from local host ( uid=986 gid=1009 pid=29884 )
  Auth name: MIT-MAGIC-COOKIE-1 ID: 919
[177531.343] AUDIT: Tue Feb 28 14:00:53 2023: 3444: client 19 disconnected
[177561.398] AUDIT: Tue Feb 28 14:01:23 2023: 3444: client 19 connected from local host ( uid=986 gid=1009 pid=29884 )
  Auth name: MIT-MAGIC-COOKIE-1 ID: 919
[177561.399] AUDIT: Tue Feb 28 14:01:23 2023: 3444: client 19 disconnected
[177591.541] AUDIT: Tue Feb 28 14:01:53 2023: 3444: client 19 connected from local host ( uid=986 gid=1009 pid=29884 )
  Auth name: MIT-MAGIC-COOKIE-1 ID: 919
[177591.557] AUDIT: Tue Feb 28 14:01:53 2023: 3444: client 19 disconnected
[177621.616] AUDIT: Tue Feb 28 14:02:23 2023: 3444: client 19 connected from local host ( uid=986 gid=1009 pid=29884 )
  Auth name: MIT-MAGIC-COOKIE-1 ID: 919
[177621.617] AUDIT: Tue Feb 28 14:02:23 2023: 3444: client 19 disconnected

supmethods (561 rep)

Feb 28, 2023, 03:20 AM • Last activity: Feb 28, 2023, 05:06 PM

4 votes

2 answers

6473 views

IOTOP complains: CONFIG_TASK_DELAY_ACCT not enabled in kernel

systemd-boot iotop

Looking about, I see that the standard fix is to add this to the kernel boot parameters.

Using systemd-boot, my arch.conf looks like this :

     title   Arch Linux
     linux   /vmlinuz-linux
     initrd  /intel-ucode.img
     initrd  /initramfs-linux.img
     options root=PARTUUID="98b3b4f7-e7f9-6f49-be81-a2ee709c7a3e" rw

How do I add CONFIG_TASK_DELAY_ACCT to the options entry? 

Another line?

Or by using some delimiter, add it to the existing line?

What value should I be setting it to?
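As a hedged illustration of the mechanics: kernel command-line parameters are space-separated, so a new one would normally be appended to the existing options line. Note that CONFIG_TASK_DELAY_ACCT is a build-time kernel option rather than a boot parameter; the boot parameter that the kernel documentation lists for enabling per-task delay accounting is `delayacct`, which takes no value. Whether that applies here depends on the kernel version, so treat it as an assumption. The helper below is hypothetical and only edits text; it is not part of systemd-boot.

```python
def add_kernel_parameter(entry_text, parameter):
    """Append 'parameter' to each 'options' line of a boot entry, if absent.

    Lines other than 'options' are returned unchanged, and existing
    indentation is preserved.
    """
    out = []
    for line in entry_text.splitlines():
        fields = line.split()
        if fields and fields[0] == "options" and parameter not in fields[1:]:
            line = line.rstrip() + " " + parameter
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    entry = (
        "title   Arch Linux\n"
        "linux   /vmlinuz-linux\n"
        "initrd  /initramfs-linux.img\n"
        'options root=PARTUUID="98b3b4f7-e7f9-6f49-be81-a2ee709c7a3e" rw\n'
    )
    # Assumption: 'delayacct' is the parameter wanted on this kernel.
    print(add_kernel_parameter(entry, "delayacct"), end="")
```

With the entry from the question, the options line would end in `rw delayacct`. Calling the function a second time leaves the text unchanged.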

Stephen Boston (2526 rep)

May 30, 2022, 11:00 PM • Last activity: Aug 9, 2022, 09:37 PM

1 votes

1 answers

463 views

why is iotop showing "?err"

shell io iotop

When I run iotop, the second column "PRIO" shows ?err everywhere.

What does this mean? I did not find any mention of err in the manual page.

I am using a custom-compiled kernel. Could it be that some option is missing in my kernel? How do I find out which one?

**EDIT**

I have all the required options in my kernel:

    CONFIG_TASKSTATS=y
    CONFIG_TASK_DELAY_ACCT=y
    CONFIG_TASK_IO_ACCOUNTING=y
    CONFIG_VM_EVENT_COUNTERS=y

Martin Vegter (598 rep)

Apr 17, 2022, 11:44 AM • Last activity: Apr 22, 2022, 11:21 AM

5 votes

3 answers

2866 views

Why are processes blocked by I/O in case of heavy system load?

io iotop

I have a workstation (2x Intel Xeon family CPUs and 128 GiB of RAM) running several virtual machines. While the combined CPU usage is below 30%, the load average is between 20 and 25. For example, if I execute a tar -xzvf vm_data.tgz --directory vm4/ --strip-components=1 command, the gzip process spends 90% to 99% of its time blocked by I/O and the command takes forever to complete.

On the other hand, the actual reads and writes to disk are very low compared to the hardware limits of SATA 3.0 or of SSDs (I'm using a single [Kingston SA400S37960G](https://www.kingston.com/en/ssd/a400-solid-state-drive?partnum=SA400S37%2F960G) SSD).

What might cause a process (gzip in my example) to wait on I/O while the actual disk reads and writes appear to be very low? My first thought was that maybe the system interrupts are very high and that's what's blocking the I/O, but according to /proc/interrupts this does not seem to be the case, as none of the counters are increasing rapidly.

Martin (8156 rep)

Sep 17, 2020, 10:12 AM • Last activity: Mar 18, 2021, 11:31 PM

2 votes

0 answers

769 views

Process shows as 100% I/O bound while producing minimal disk activity, disk util is at 100%

iostat iotop

We are having quite a strange problem. There is a program (a cryptocurrency node, to be precise) which has a local database of all the transactions ever made. The database is huge, around 15 TB. The problem is that the program won't synchronize with the network, though it has enough peers and knowledge about new and old blocks is not a problem.

Now the strange part: I started the same program from scratch, without that 15 TB of history, and it started syncing immediately, loading the disk to about 50% according to iostat. CPU and memory utilization are negligible. The absolute figures are:

 - Read speed: 5MB/s 
 - Write speed: 20MB/s
 - iotop - 20% on average for this process

When I switch to the historical DB (15 TB), iostat shows 100% disk utilization and iotop shows multiple forked processes, the majority of them sitting at 99% I/O, but actual I/O is not happening, judging by the volume reported by iotop and iostat: both reads and writes are within 1MB/s. This is running on an MS Azure VM. Through the Azure portal we see that disk utilization is around 1% in "full" mode and writing is around 20% in "fresh" mode, so throttling by the cloud operator is not an issue either.

Now the question: how do I diagnose what exactly the program is doing with the disk? I was thinking about random I/O and tried to strace the lseek function; I got some calls in both fresh and full modes, but a much lower ratio in full mode, while I expected the opposite. What does it do in full mode, then? The program has a quite bearable number of file descriptors (/proc/<pid>/fd), below 50 together with peer TCP connections. How can it be, in general, that both iostat and iotop show 100% utilization with no actual consumption of I/O bandwidth? We even had a call with an engineer from Microsoft, who said that iostat may not be accurate, especially with SSDs. That might be so, but when it says util is 100%, iotop confirms it, and the program is not doing what it is supposed to do, what is an alternative explanation?

DimaA6_ABC (121 rep)

Jan 23, 2021, 02:14 PM

-1 votes

1 answers

57 views

RAM bandwidth activity

compiling monitoring io ram iotop

I compile sources in /tmp, which is mounted in RAM.  
I'd like to know how fast it is, that is, to show/monitor I/O activity (MB/s) for this device.

*If I want to monitor HD bandwidth I would have used tools like iotop but for RAM it doesn't work!*

---

**Long story short:** how can I monitor/show RAM bandwidth (I/O) activity?

mattia.b89 (3398 rep)

Sep 22, 2019, 12:24 PM • Last activity: Sep 29, 2019, 10:14 AM

16 votes

2 answers

8722 views

Error with command iotop on CentOS

centos python iotop

When using sudo iotop (latest version 0.6-2.el7) in a terminal in my newly installed CentOS 7.5, I get the following error message:

    Traceback (most recent call last):
      File "/sbin/iotop", line 17, in <module>
        main()
      File "/usr/lib/python2.7/site-packages/iotop/ui.py", line 620, in main
        main_loop()
      File "/usr/lib/python2.7/site-packages/iotop/ui.py", line 610, in <lambda>
        main_loop = lambda: run_iotop(options)
      File "/usr/lib/python2.7/site-packages/iotop/ui.py", line 508, in run_iotop
        return curses.wrapper(run_iotop_window, options)
      File "/usr/lib64/python2.7/curses/wrapper.py", line 43, in wrapper
        return func(stdscr, *args, **kwds)
      File "/usr/lib/python2.7/site-packages/iotop/ui.py", line 501, in run_iotop_window
        ui.run()
      File "/usr/lib/python2.7/site-packages/iotop/ui.py", line 155, in run
        self.process_list.duration)
      File "/usr/lib/python2.7/site-packages/iotop/ui.py", line 434, in refresh_display
        lines = self.get_data()
      File "/usr/lib/python2.7/site-packages/iotop/ui.py", line 415, in get_data
        return list(map(format, processes))
      File "/usr/lib/python2.7/site-packages/iotop/ui.py", line 388, in format
        cmdline = p.get_cmdline()
      File "/usr/lib/python2.7/site-packages/iotop/data.py", line 292, in get_cmdline
        proc_status = parse_proc_pid_status(self.pid)
      File "/usr/lib/python2.7/site-packages/iotop/data.py", line 196, in parse_proc_pid_status
        key, value = line.split(':\t', 1)
    ValueError: need more than 1 value to unpack

**Any idea how to fix this problem?**
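The traceback ends in `line.split(':\t', 1)` inside parse_proc_pid_status, so the error would be consistent with some line of /proc/<pid>/status not containing the `':\t'` separator. The sketch below reproduces that failure mode and shows a tolerant variant. Both functions are hypothetical illustrations written for this purpose; they are not the iotop source and not necessarily the upstream fix.

```python
def parse_status_strict(text):
    """Mimic the failing pattern: every line must contain ':\\t'."""
    result = {}
    for line in text.splitlines():
        # Raises ValueError when the separator is missing, because
        # split() then returns a single element that cannot be unpacked.
        key, value = line.split(":\t", 1)
        result[key] = value.strip()
    return result


def parse_status_tolerant(text):
    """Skip lines that lack the ':\\t' separator instead of failing."""
    result = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":\t")
        if not sep:
            continue  # malformed or unexpected line: ignore it
        result[key] = value.strip()
    return result
```

Dumping /proc/<pid>/status for a few processes and looking for a line without a tab after the colon would be one way to confirm whether this is what happens on the affected system.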

                                

F. Privé (263 rep)

May 29, 2018, 07:24 AM • Last activity: Jan 23, 2019, 04:02 AM

6 votes

3 answers

8572 views

Why does a gunzip to dd pipeline slow down at the end?

linux dd gzip sd-card iotop

my command:

    gunzip -c serial2udp.image.gz |
    sudo dd of=/dev/mmcblk0 conv=fsync,notrunc status=progress bs=4M

my output:

    15930949632 bytes (16 GB, 15 GiB) copied, 1049 s, 15.2 MB/s    

    0+331128 records in
    0+331128 records out
    15931539456 bytes (16 GB, 15 GiB) copied, 1995.2 s, 8.0 MB/s

the card: SanDisk Ultra 32GB MicroSDHC Class 10 UHS Memory Card Speed Up To 30MB/s 

distribution: 16.0.4 xenial with xfce 

kernel version: 4.13.0.37-generic 

I understand that taking 17 minutes seems reasonable from what I've read. Playing with the block size doesn't really seem to make much of a difference (bs=100M still exhibits this behaviour with similar timestamps). Why do the updates hang, and why doesn't it produce a finished report for another 16 minutes?

iotop tells me that mmcqd/0 is still running in the background at this point (at 99% IO), so I figure there is a cache somewhere that is holding up the final 5MB, but I thought fsync should make sure that doesn't happen.
iotop shows no traffic for dd at this time either. Ctrl-C is all but useless, and I don't want to corrupt my drive after writing to it.

cts (81 rep)

Sep 9, 2018, 12:03 PM • Last activity: Sep 24, 2018, 09:29 PM

2 votes

1 answers

2170 views

How to see an average of I/O of disk with iotop command?

linux iotop

With iotop -o command I can see the write and read speed of the disk per second. But it varies a lot, from 0 to high values. I'd like to see an average of it per minute or per 10 seconds.

How can I do it?
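One way to get such an average is to post-process iotop's batch output. The sketch below parses the "Total DISK READ / Total DISK WRITE" summary lines and averages them over however many samples are supplied. It assumes the line format seen in iotop batch logs and that the unit suffixes are 1024-based; the function names are hypothetical.

```python
import re

# Assumption: iotop's K/M/G suffixes are 1024-based; values are normalised to KiB/s.
_UNITS = {"B/s": 1 / 1024, "K/s": 1.0, "M/s": 1024.0, "G/s": 1024.0 ** 2}

_TOTAL = re.compile(
    r"Total DISK READ\s*:\s*([\d.]+)\s*([BKMG]/s)\s*\|\s*"
    r"Total DISK WRITE\s*:\s*([\d.]+)\s*([BKMG]/s)"
)


def parse_totals(line):
    """Return (read, write) in KiB/s from a 'Total DISK READ' line, or None."""
    match = _TOTAL.search(line)
    if not match:
        return None
    read = float(match.group(1)) * _UNITS[match.group(2)]
    write = float(match.group(3)) * _UNITS[match.group(4)]
    return read, write


def average_totals(lines):
    """Average read and write rates over all summary lines found in 'lines'."""
    samples = [t for t in map(parse_totals, lines) if t is not None]
    if not samples:
        return None
    count = len(samples)
    return (
        sum(read for read, _ in samples) / count,
        sum(write for _, write in samples) / count,
    )
```

For a 10-second average, the lines could come from something like `iotop -b -o -k -d 1 -n 10`. Alternatively, since iotop reports rates over its sampling interval, running it with `-d 10` may already give the smoothed figure without any post-processing.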

Henrique Barcelos (121 rep)

Aug 16, 2018, 08:55 PM • Last activity: Aug 16, 2018, 10:09 PM

Showing page 1 of 12 total questions