Unix & Linux Stack Exchange

Q&A for users of Linux, FreeBSD and other Unix-like operating systems

Latest Questions

0 votes

0 answers

179 views

Resident memory reported significantly lower than proportional resident memory - Process Exporter and Grafana

linux memory-management prometheus-exporter

I have a process monitoring stack set up with process exporter and Grafana, using the "process profiling with treemap" dashboard, and I see some suspicious behaviour in the memory it reports.

Based on my understanding (having read [this](https://unix.stackexchange.com/questions/33381/getting-information-about-a-process-memory-usage-from-proc-pid-smaps) article and the links mentioned within it):

- RSS = private memory + shared memory
- PSS = private memory + (shared memory / number of processes sharing that memory)

This leads me to believe that RSS >= PSS at any given time. Here is what I observe:

1. The process takes 39.2GB of virtual memory: [process alloted 39.2GB virtual memory](https://i.sstatic.net/zOFaMpz5.png)
2. The process takes up 38.1GB of "proportional resident memory", which makes sense: [Process takes up 38.1GB proportional resident memory](https://i.sstatic.net/BC3QAWzu.png)
3. This is where it gets suspicious: the process takes up only 18.8GB of resident memory. [Process takes only 19.2GB resident memory](https://i.sstatic.net/ZlsbQymS.png)

Is my understanding of how RSS and PSS work correct? If yes, what could be the reasons this process is behaving like this (or being reported as such)? I suspected process exporter or Grafana might be incorrect, but no other process reports anything suspicious like this, so I'm assuming they're working as expected.

I looked at the process exporter GitHub to confirm that my understanding of the fields it reports is correct:

resident: Field rss(24) from /proc/[pid]/stat, whose doc says:

This is just the pages which count toward text, data, or stack space. This does not include pages which have not been demand-loaded in, or which are swapped out.


proportionalResident: Sum of "Pss" fields from /proc/[pid]/smaps, whose doc says:

The "proportional set size" (PSS) of a process is the count of pages it has in memory, where each page is divided by the number of processes sharing it.

No pages have been swapped out. Here are the queries used by Grafana to graph these:

- proportional resident memory: `namedprocess_namegroup_memory_bytes{instance=~"$instance", memtype="proportionalResident"} > 0`
- virtual memory: `namedprocess_namegroup_memory_bytes{instance=~"$instance", memtype="virtual"}`
- resident memory: `namedprocess_namegroup_memory_bytes{instance=~"$instance", memtype="resident"} / ignoring(memtype) namedprocess_namegroup_num_procs > 0`

All other processes behave as expected, with RSS >= PSS. Why could this process be reporting this behaviour? TIA!
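To cross-check the exporter's numbers against the kernel directly, the two fields quoted above can be read by hand. The following is a minimal Python sketch, not process exporter's implementation. The function names are made up for illustration, and it assumes the layout described in proc(5): rss is field 24 of `/proc/[pid]/stat`, counted in pages, and the `Pss:` lines of `/proc/[pid]/smaps` are in kB.

```python
import os


def rss_bytes_from_stat(stat_text, page_size):
    """Return field 24 (rss, in pages) of /proc/[pid]/stat converted to bytes."""
    # comm (field 2) may contain spaces or parentheses, so split after the last ')'.
    rest = stat_text.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 24 is rest[21].
    return int(rest[21]) * page_size


def pss_bytes_from_smaps(smaps_text):
    """Sum every 'Pss:' line of /proc/[pid]/smaps (values are in kB)."""
    total_kb = 0
    for line in smaps_text.splitlines():
        if line.startswith("Pss:"):  # skips Pss_Anon:, Pss_File:, ...
            total_kb += int(line.split()[1])
    return total_kb * 1024


def compare(pid):
    """Read both values for one PID (hypothetical helper; needs read access to smaps)."""
    with open(f"/proc/{pid}/stat") as f:
        rss = rss_bytes_from_stat(f.read(), os.sysconf("SC_PAGE_SIZE"))
    with open(f"/proc/{pid}/smaps") as f:
        pss = pss_bytes_from_smaps(f.read())
    return rss, pss
```

Running `compare(pid)` for each process in the group and comparing the results with the dashboard may help narrow things down. One thing possibly worth ruling out: the resident query above divides by `namedprocess_namegroup_num_procs`, while the proportionalResident query does not, so for a group containing more than one process the two panels are not measuring the same thing.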

Phantom (1 rep)

Oct 15, 2024, 09:34 AM

1 votes

0 answers

196 views

Getting an interface's `valid_lft` programmatically

kernel network-interface dhcp prometheus-exporter

iproute2 can give back an interface's valid_lft which corresponds to the dhcp lease time left. See a truncated example output hereafter:

1: lo:  mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
  ...
2: enp0s1:  mtu 1500 qdisc fq_codel state UP group default qlen 1000
   ...
       valid_lft 82933sec preferred_lft 82933sec
   ...

I'd like to get this programmatically, preferably from Golang, without actually calling ip address. Though I realise ip route has a JSON output, I would still much prefer to interact with, for example:

- a library which gives me access to this counter
- a path in /sys or /proc which provides access to this counter
- something else?

The goal is to expose the valid_lft counter as a metric which can be scraped by something like Prometheus. I'm amazed this is not available yet.
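One route that avoids shelling out is rtnetlink, the interface iproute2 itself uses: an address message can carry an `IFA_CACHEINFO` attribute whose payload is a `struct ifa_cacheinfo` of four 32-bit fields (`ifa_prefered`, `ifa_valid`, `cstamp`, `tstamp`), as declared in `linux/if_addr.h`. The sketch below is in Python rather than Go and only decodes that payload. Obtaining the payload (through a netlink socket or a netlink library) is left out, and the metric name `interface_valid_lft_seconds` is invented for illustration.

```python
import struct

INFINITY_LIFE_TIME = 0xFFFFFFFF  # kernel marker for an address that never expires


def decode_cacheinfo(payload):
    """Decode a struct ifa_cacheinfo payload (four native-endian u32 values)."""
    preferred, valid, cstamp, tstamp = struct.unpack("=IIII", payload[:16])
    return {
        "preferred_lft": preferred,
        "valid_lft": valid,
        "cstamp": cstamp,
        "tstamp": tstamp,
    }


def metric_line(ifname, address, info):
    """Render valid_lft as one Prometheus text-format sample (hypothetical metric name)."""
    valid = info["valid_lft"]
    value = "+Inf" if valid == INFINITY_LIFE_TIME else str(valid)
    return (
        f'interface_valid_lft_seconds{{interface="{ifname}",address="{address}"}} {value}'
    )
```

In Go, the `github.com/vishvananda/netlink` package appears to expose the same values as `ValidLft` and `PreferedLft` on its `Addr` type; that is worth confirming against the package documentation before relying on it.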

hbogert (759 rep)

Aug 17, 2023, 06:52 PM

0 votes

2 answers

1144 views

Cannot run Prometheus or Blackbox Exporter with systemctl

systemd prometheus-exporter

/usr/local/bin/blackbox_exporter --config.file=/etc/blackbox/blackbox.yml --web.listen-address=":9115"
level=info ts=2023-05-09T15:18:12.170335169Z caller=main.go:213 msg="Starting blackbox_exporter" version="(version=0.14.0, branch=HEAD, revision=bba7ef76193948a333a5868a1ab38b864f7d968a)"
level=info ts=2023-05-09T15:18:12.17114947Z caller=main.go:226 msg="Loaded config file"
level=info ts=2023-05-09T15:18:12.171355458Z caller=main.go:330 msg="Listening on address" address=:9115

When I try to run it as a service it does not:

systemctl status blackbox_exporter
blackbox_exporter.service - Blackbox Exporter
   Loaded: loaded (/etc/systemd/system/blackbox_exporter.service; disabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Tue 2023-05-09 15:20:40 GMT; 5s ago
  Process: 22483 ExecStart=/usr/local/bin/blackbox_exporter --config.file /etc/blackbox_exporter/blackbox.yml (code=exited, status=1/FAILURE)
 Main PID: 22483 (code=exited, status=1/FAILURE)

May 09 15:20:40 hostname systemd: Started Blackbox Exporter.
May 09 15:20:40 hostname systemd: blackbox_exporter.service: main process exited, code=exited, status=1/FAILURE
May 09 15:20:40 hostname systemd: Unit blackbox_exporter.service entered failed state.
May 09 15:20:40 hostname systemd: blackbox_exporter.service failed.

I'm running Prometheus on another box, not sure if that will work with the blackbox exporter and Prometheus itself on separate boxes but something I wanted to try. I thought maybe for some reason not having Prometheus locally was causing problems so I tried installing Prometheus locally using this instruction set: https://devconnected.com/how-to-setup-grafana-and-prometheus-on-linux/ Unfortunately I have the same problem. I can fire up Prometheus just fine from the command line but cannot start it using sudo systemctl start prometheus. As I saw this being an issue on some other threads I researched, I'll also mention that SELinux is disabled on this particular machine. Thoughts anyone? TIA!
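One detail visible in the output above: the command run by hand passes `--config.file=/etc/blackbox/blackbox.yml`, while the `ExecStart` line shown by `systemctl status` points at `/etc/blackbox_exporter/blackbox.yml`. Whether that is the cause of the failure is not established here, but it is cheap to rule out. A small Python sketch (the helper `flag_value` is hypothetical) that pulls the flag out of both command lines and compares them:

```python
import shlex


def flag_value(command, flag):
    """Return the value of `flag`, given as '--flag=value' or '--flag value'."""
    args = shlex.split(command)
    for i, arg in enumerate(args):
        if arg.startswith(flag + "="):
            return arg[len(flag) + 1:]
        if arg == flag and i + 1 < len(args):
            return args[i + 1]
    return None


# Both command lines are copied from the question.
manual = (
    "/usr/local/bin/blackbox_exporter --config.file=/etc/blackbox/blackbox.yml "
    '--web.listen-address=":9115"'
)
unit = "/usr/local/bin/blackbox_exporter --config.file /etc/blackbox_exporter/blackbox.yml"

manual_cfg = flag_value(manual, "--config.file")
unit_cfg = flag_value(unit, "--config.file")
if manual_cfg != unit_cfg:
    print(f"config paths differ: {manual_cfg} vs {unit_cfg}")
```

If the paths do differ, checking that the file named in the unit exists and is readable by the service user would be a reasonable next step.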

Chris Utter (1 rep)

May 9, 2023, 07:14 PM • Last activity: May 10, 2023, 05:27 PM

2 votes

0 answers

1121 views

Systemd service creation of Prometheus

systemd services oracle-linux prometheus-exporter

Oracle Linux 8

Trying to start prometheus from systemd and getting the status=203/EXEC error.  I can copy/paste the commands from /etc/systemd/system/prometheus.service and it will start without error. The prometheus user (no login) and group exist. This is my **prometheus.service:**


    [Unit]
    Description=Prometheus
    Documentation=https://prometheus.io/docs/introduction/overview/ 
    Wants=network-online.target
    After=network-online.target
    
    [Service]
    Type=simple
    Environment="GOMAXPROCS=1"
    User=prometheus
    Group=prometheus
    ExecReload=/bin/kill -HUP $MAINPID
    ExecStart=/usr/local/bin/prometheus \
      --config.file=/etc/prometheus/prometheus.yml \
      --storage.tsdb.path=/var/lib/prometheus \
      --web.console.templates=/etc/prometheus/consoles \
      --web.console.libraries=/etc/prometheus/console_libraries \
      --web.listen-address=0.0.0.0:9090 \
      --web.external-url=
    
    SyslogIdentifier=prometheus
    Restart=always
    
    [Install]
    WantedBy=multi-user.target
                                

jim feldman (31 rep)

Sep 19, 2022, 11:57 PM • Last activity: Sep 21, 2022, 04:39 PM

0 votes

1 answers

2764 views

Merge jq output into a comma separated string like

command-line json jq prometheus-exporter

I have this output and would like to convert it into a Prometheus-like format using [jq](https://stedolan.github.io/jq/). `cat /tmp/wp-plugin.txt | jq .[]`

{
  "name": "akismet",
  "status": "active",
  "update": "none",
  "version": "5.0"
}
{
  "name": "performance-lab",
  "status": "active",
  "update": "none",
  "version": "1.4.0"
}

My goal is to get output like this using the jq CLI:

wp_plugins{name="akismet",status="active",update="none",version="5.0"}0
wp_plugins{name="performance-lab",status="active",update="active",version="1.4.0"}1
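The transformation itself is small. As an illustration in Python rather than jq, and assuming the trailing value is meant to be 0 when `update` is `none` and 1 otherwise (the question does not state the rule), a sketch could look like this. Note that the Prometheus text format expects a space between the closing brace and the value.

```python
import json


def escape(value):
    """Escape backslashes, double quotes and newlines inside a label value."""
    return str(value).replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def to_prometheus(plugins, metric="wp_plugins"):
    """Turn a list of flat JSON objects into Prometheus text-format lines."""
    lines = []
    for plugin in plugins:
        labels = ",".join(f'{key}="{escape(val)}"' for key, val in plugin.items())
        # Assumption: 0 means update == "none", 1 means anything else.
        value = 0 if plugin.get("update") == "none" else 1
        lines.append(f"{metric}{{{labels}}} {value}")
    return lines


# The two objects from the question, wrapped in an array as in the input file.
sample = json.loads("""[
  {"name": "akismet", "status": "active", "update": "none", "version": "5.0"},
  {"name": "performance-lab", "status": "active", "update": "none", "version": "1.4.0"}
]""")
print("\n".join(to_prometheus(sample)))
```

With the input shown in the question both plugins have `update` set to `none`, so under this assumption both lines end in 0.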

Rostyslav Malenko (103 rep)

Aug 24, 2022, 11:35 AM • Last activity: Aug 24, 2022, 12:39 PM

0 votes

1 answers

99 views

Prometheus DiskTooManyReallocatedSectors

linux hard-disk node.js smart prometheus-exporter

I have Prometheus Alert Manager running on several Linux machines. *(https://prometheus.io/docs/alerting/latest/alertmanager/)*

One of them is reporting *2 reallocated sectors*. I got the setup-alert from here:
*https://awesome-prometheus-alerts.grep.to/rules.html* 

**1. What is my course of action?** Replace with an SSD?

**2. What is the priority ...weeks, months?**

DavidDunham (117 rep)

Jul 2, 2022, 03:08 AM • Last activity: Jul 3, 2022, 07:51 AM

1 votes

0 answers

786 views

Construction "if then else" in alert rules alertmanager(prometheus)

monitoring notifications prometheus-exporter

Right now I have two rules for "Warning" and "Critical" alerts. Is it possible to combine them somehow, so that I don't have to duplicate the rule?

      - alert: Proxysql latency check
        expr: metric1 > 1
        for: 30s
        labels:
          severity: warning
          instance: "{{ $labels.node_name }}"
          label: name-channel
        annotations:
          summary: "Info"
    
      - alert: Proxysql latency check
        expr: metric1 > 5
        for: 30s
        labels:
          severity: critical
          instance: "{{ $labels.node_name }}"
          label: name-channel
        annotations:
          summary: "Info"

An example of how I imagine it:

      - alert: Proxysql latency check
        expr: if metric1 > 1 then "Warning" else metric1 > 5 "Critical"
        for: 30s
        labels:
          severity: warning | critical
          instance: "{{ $labels.node_name }}"
          label: name-channel
        annotations:
          summary: "Info"
                                

Alexander Kolesnik (11 rep)

May 25, 2022, 10:46 AM • Last activity: May 25, 2022, 10:47 AM

0 votes

1 answers

1636 views

How to calculate used memory at Ubuntu?

ubuntu prometheus-exporter

I just launched a new Ubuntu machine. Available memory is reasonable but free memory is very small:

    cat /proc/meminfo |more
    MemTotal:        2034484 kB
    MemFree:          703496 kB
    MemAvailable:    1538076 kB
    Buffers:           80332 kB
    Cached:           829408 kB

How should I calculate used_memory? According to https://stackoverflow.com/questions/41224738/how-to-calculate-system-memory-usage-from-proc-meminfo-like-htop

    Used_mem = Total_mem - Free_mem

which does not make sense to me, because by this equation the machine has already used a lot of memory although it has not run anything yet. I feel that maybe

    Used_mem = Total_mem - Available_mem

makes more sense.

So my question is: how do I calculate used memory? Or, how do I calculate the memory which is really free to use? The free memory shown above cannot be correct.

Note: I am calculating used_memory from node_exporter metrics.
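To make the difference concrete, here is a small Python sketch that applies both formulas to the numbers above, plus a third variant that also subtracts Buffers and Cached from the total-minus-free figure (a simplification of what htop-style tools do). The function and variable names are illustrative only.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo text into a {field: value} dict (values in kB)."""
    values = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if rest.strip():
            values[key.strip()] = int(rest.split()[0])
    return values


sample = """MemTotal:        2034484 kB
MemFree:          703496 kB
MemAvailable:    1538076 kB
Buffers:           80332 kB
Cached:           829408 kB"""

m = parse_meminfo(sample)
used_vs_free = m["MemTotal"] - m["MemFree"]            # 1330988 kB
used_vs_available = m["MemTotal"] - m["MemAvailable"]  # 496408 kB
used_minus_cache = m["MemTotal"] - m["MemFree"] - m["Buffers"] - m["Cached"]  # 421248 kB

for name, kb in [
    ("Total - Free", used_vs_free),
    ("Total - Available", used_vs_available),
    ("Total - Free - Buffers - Cached", used_minus_cache),
]:
    print(f"{name}: {kb} kB")
```

The first figure counts page cache and buffers as used, which is why it looks large on an idle machine. `MemAvailable` is the kernel's own estimate of how much memory can be handed to new workloads without swapping, so with node_exporter the second formula would correspond to `node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes`.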
                                

user389955 (103 rep)

Dec 9, 2020, 10:42 PM • Last activity: Dec 10, 2020, 03:54 AM

0 votes

1 answers

2672 views

blackbox_exporter failing to launch with exit code 203/EXEC

centos systemd prometheus-exporter

I am trying to follow this guide to install and set up blackbox_exporter: https://devconnected.com/how-to-install-and-configure-blackbox-exporter-for-prometheus/

I have followed everything and can manually run the command from the systemd service and get it to run. However, when I try to run systemctl start blackbox.service and then check the status, it fails with exit code 203/EXEC.

I checked the permissions on /usr/local/bin/blackbox_exporter:

    -rwxr-xr-x. 1 blackbox blackbox 17050332 Nov 11 10:27 /usr/local/bin/blackbox_exporter

I can run the command from a terminal just fine:

    /usr/local/bin/blackbox_exporter --config.file=/etc/blackbox/blackbox.yml --web.listen-address=:9115

Here is my systemd service:

[Unit]
Description=Blackbox Exporter Service
Wants=network-online.target
After=network-online.target

[Service]
Type=simple
User=blackbox
Group=blackbox
ExecStart=/usr/local/bin/blackbox_exporter \
  --config.file=/etc/blackbox/blackbox.yml \
  --web.listen-address=":9115"

Restart=always

[Install]
WantedBy=multi-user.target

Logs from journalctl -u blackbox.service:

Apr 30 08:26:55 localhost systemd: Started Blackbox Exporter Service.
Apr 30 08:26:55 localhost systemd: blackbox.service: Main process exited, code=exited, status=203/EXEC
Apr 30 08:26:55 localhost systemd: blackbox.service: Failed with result 'exit-code'.

I am using CentOS 8. Any help would be greatly appreciated.

Tyler Radlick (3 rep)

Apr 30, 2020, 01:45 PM • Last activity: Apr 30, 2020, 03:47 PM

0 votes

2 answers

3285 views

Shut up a kernel error or avoid it by configuring Prometheus Node Exporter

ubuntu prometheus-exporter

Let me start by giving the output which I get to see in `/var/log/syslog` and with `dmesg`:

    [559151.898586] ACPI Error: SMBus/IPMI/GenericSerialBus write requires Buffer of length 66, found length 32 (20170831/exfield-427)
    [559151.911578] No Local Variables are initialized for Method [_PMM]
    [559151.911580] No Arguments are initialized for method [_PMM]
    [559151.911584] ACPI Error: Method parse/execution failed \_SB.PMI0._PMM, AE_AML_BUFFER_LIMIT (20170831/psparse-550)
    [559151.916648] ACPI Exception: AE_AML_BUFFER_LIMIT, Evaluating _PMM (20170831/power_meter-338)

Clearly the [Prometheus Node Exporter](https://github.com/prometheus/node_exporter)  is *triggering* the error, although it *doesn't seem to be the cause* for the error that gets logged.

Now what I want to achieve is - preferably - to tell the Prometheus Node Exporter to stop querying for whatever information it's attempting to query. Failing that, I'd like to silence these messages so they don't spam my log files.

How would I go about either of these options? ... or perhaps there are other options I haven't considered ...

This is happening on Ubuntu 18.04 with the packaged prometheus-node-exporter (it also happened with the 0.16 and 0.17 versions of prometheus-node-exporter which could be installed via stretch-backports - yes, on Ubuntu).
                                

0xC0000022L (16938 rep)

Nov 8, 2018, 09:13 PM • Last activity: Jan 23, 2019, 07:57 PM
