Unix & Linux Stack Exchange
Q&A for users of Linux, FreeBSD and other Unix-like operating systems
Latest Questions
0
votes
0
answers
179
views
Resident memory reported as significantly lower than proportional resident memory - Process Exporter and Grafana
I have a process monitoring stack set up with process exporter and Grafana, using the "process profiling with treemap" dashboard, and I see some suspicious behaviour in the memory it reports.
Based on my understanding (having read [this](https://unix.stackexchange.com/questions/33381/getting-information-about-a-process-memory-usage-from-proc-pid-smaps) article and the links it mentions):
RSS = Private memory + shared memory
PSS = Private memory + (shared memory / num of processes sharing said memory)
This leads me to believe that RSS >= PSS at any given time.
Here is what I observe:
1. The process takes 39.2GB of virtual memory
[process allotted 39.2GB virtual memory](https://i.sstatic.net/zOFaMpz5.png)
2. The process takes up 38.1GB of "proportional resident memory", which makes sense
[Process takes up 38.1GB proportional resident memory](https://i.sstatic.net/BC3QAWzu.png)
3. This is where it gets suspicious: the process takes up only 18.8GB of resident memory.
[Process takes only 19.2GB resident memory](https://i.sstatic.net/ZlsbQymS.png)
Is my understanding of how RSS and PSS work correct? If so, what could be the reasons this process behaves like this (or is reported as such)? I suspected that process exporter or Grafana might be wrong, but no other process shows anything suspicious like this, so I'm assuming they work as expected.
I looked at the process exporter github to confirm if my understanding of the fields reported by it is correct.
resident: Field rss(24) from /proc/[pid]/stat, whose doc says:
This is just the pages which count toward text, data, or stack space. This does not include pages which have not been demand-loaded in, or which are swapped out.
proportionalResident: Sum of "Pss" fields from /proc/[pid]/smaps, whose doc says:
The "proportional set size" (PSS) of a process is the count of pages it has in memory, where each page is divided by the number of processes sharing it.
No pages have been swapped out.
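One way to cross-check the exporter is to total the fields straight from smaps text. The following is a minimal Python sketch, assuming the usual `Name: value kB` line layout; the function name and the sample data are made up for illustration, and note that it takes RSS from smaps rather than from `/proc/[pid]/stat` as process exporter does.

```python
import re

# Matches only the exact "Rss:" and "Pss:" lines, not "Pss_Dirty:" or "SwapPss:".
FIELD_RE = re.compile(r"^(Rss|Pss):\s+(\d+)\s+kB$", re.MULTILINE)


def sum_smaps_fields(smaps_text: str) -> dict:
    """Return total Rss and Pss in kB across all mappings in the given text."""
    totals = {"Rss": 0, "Pss": 0}
    for name, value in FIELD_RE.findall(smaps_text):
        totals[name] += int(value)
    return totals


# Hypothetical sample: one private mapping (Pss == Rss) and one mapping
# shared equally by two processes (Pss is half of Rss).
SAMPLE = """\
Rss:                 100 kB
Pss:                 100 kB
Rss:                  40 kB
Pss:                  20 kB
"""

totals = sum_smaps_fields(SAMPLE)
print(totals)  # {'Rss': 140, 'Pss': 120}
```

For a live process the input would be the contents of `/proc/<pid>/smaps`, read with sufficient privileges.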
Here are the queries used by Grafana to graph these:
proportional resident memory:
namedprocess_namegroup_memory_bytes{instance=~"$instance", memtype="proportionalResident"} > 0
virtual memory:
namedprocess_namegroup_memory_bytes{instance=~"$instance", memtype="virtual"}
resident memory:
namedprocess_namegroup_memory_bytes{instance=~"$instance", memtype="resident"} / ignoring(memtype) namedprocess_namegroup_num_procs > 0
All other processes behave as expected, with RSS >= PSS. Why might this one process be reported this way?
TIA!
Phantom
(1 rep)
Oct 15, 2024, 09:34 AM
1
votes
0
answers
196
views
Getting an interface's `valid_lft` programmatically
iproute2 can give back an interface's `valid_lft`, which corresponds to the DHCP lease time left. See a truncated example output below:
1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
...
2: enp0s1: mtu 1500 qdisc fq_codel state UP group default qlen 1000
...
valid_lft 82933sec preferred_lft 82933sec
...
I'd like to get this, preferably from Golang, without actually calling `ip address`. Though I realise ip route has a JSON output, I would still much prefer to interact with, for example:
- a library which allows me access to this counter
- a path in /sys or /proc which provides access to this counter
- something else?
The goal is to expose the valid_lft counter as a metric which can be scraped by something like Prometheus. I'm amazed this is not available yet.
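Until a library or a /proc path turns up, one fallback is to parse the text output shown above. This is a minimal Python sketch (not Golang) under stated assumptions: the metric name `interface_valid_lft_seconds` and the function name are hypothetical, the `valid_lft forever` line for `lo` is added to the sample for illustration, and addresses with a `forever` lifetime are skipped.

```python
import re

# An interface header line looks like "2: enp0s1: mtu 1500 ...".
IFACE_RE = re.compile(r"^\d+:\s+([^:@\s]+)")
# A lifetime line looks like "valid_lft 82933sec preferred_lft 82933sec".
LFT_RE = re.compile(r"valid_lft\s+(\d+)sec")


def valid_lft_metrics(ip_output: str) -> list:
    """Turn `ip address` text into Prometheus-style metric lines."""
    lines = []
    iface = None
    for line in ip_output.splitlines():
        header = IFACE_RE.match(line)
        if header:
            iface = header.group(1)
            continue
        lifetime = LFT_RE.search(line)
        if lifetime and iface is not None:
            seconds = lifetime.group(1)
            # Hypothetical metric name, chosen only for this example.
            lines.append(f'interface_valid_lft_seconds{{device="{iface}"}} {seconds}')
    return lines


SAMPLE = """\
1: lo: mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    valid_lft forever preferred_lft forever
2: enp0s1: mtu 1500 qdisc fq_codel state UP group default qlen 1000
    valid_lft 82933sec preferred_lft 82933sec
"""

metrics = valid_lft_metrics(SAMPLE)
print("\n".join(metrics))
```

An interface with several addresses would produce several lines with the same `device` label, so a real exporter would need an extra label such as the address itself.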
hbogert
(759 rep)
Aug 17, 2023, 06:52 PM
0
votes
2
answers
1144
views
Cannot run Prometheus or Blackbox Exporter with systemctl
I installed the Prometheus Blackbox Exporter using this instruction set: https://devconnected.com/how-to-install-and-configure-blackbox-exporter-for-prometheus/
When I start the blackbox exporter from the command line it works fine:
/usr/local/bin/blackbox_exporter --config.file=/etc/blackbox/blackbox.yml --web.listen-address=":9115"
level=info ts=2023-05-09T15:18:12.170335169Z caller=main.go:213 msg="Starting blackbox_exporter" version="(version=0.14.0, branch=HEAD, revision=bba7ef76193948a333a5868a1ab38b864f7d968a)"
level=info ts=2023-05-09T15:18:12.17114947Z caller=main.go:226 msg="Loaded config file"
level=info ts=2023-05-09T15:18:12.171355458Z caller=main.go:330 msg="Listening on address" address=:9115
When I try to run it as a service it does not:
systemctl status blackbox_exporter
blackbox_exporter.service - Blackbox Exporter
Loaded: loaded (/etc/systemd/system/blackbox_exporter.service; disabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Tue 2023-05-09 15:20:40 GMT; 5s ago
Process: 22483 ExecStart=/usr/local/bin/blackbox_exporter --config.file /etc/blackbox_exporter/blackbox.yml (code=exited, status=1/FAILURE)
Main PID: 22483 (code=exited, status=1/FAILURE)
May 09 15:20:40 hostname systemd: Started Blackbox Exporter.
May 09 15:20:40 hostname systemd: blackbox_exporter.service: main process exited, code=exited, status=1/FAILURE
May 09 15:20:40 hostname systemd: Unit blackbox_exporter.service entered failed state.
May 09 15:20:40 hostname systemd: blackbox_exporter.service failed.
I'm running Prometheus on another box. I'm not sure whether the blackbox exporter and Prometheus will work on separate boxes, but it is something I wanted to try. I thought that not having Prometheus locally might be causing problems, so I tried installing Prometheus locally using this instruction set: https://devconnected.com/how-to-setup-grafana-and-prometheus-on-linux/
Unfortunately I have the same problem. I can fire up Prometheus just fine from the command line but cannot start it using `sudo systemctl start prometheus`. Since this came up as an issue in some other threads I researched, I'll also mention that SELinux is disabled on this particular machine. Any thoughts? TIA!
Chris Utter
(1 rep)
May 9, 2023, 07:14 PM
• Last activity: May 10, 2023, 05:27 PM
2
votes
0
answers
1121
views
Systemd service creation of Prometheus
Oracle Linux 8
Trying to start prometheus from systemd and getting the `status=203/EXEC` error. I can copy/paste the commands from `/etc/systemd/system/prometheus.service` and it will start without error. The prometheus user (no login) and group exist. This is my **prometheus.service:**
[Unit]
Description=Prometheus
Documentation=https://prometheus.io/docs/introduction/overview/
Wants=network-online.target
After=network-online.target
[Service]
Type=simple
Environment="GOMAXPROCS=1"
User=prometheus
Group=prometheus
ExecReload=/bin/kill -HUP $MAINPID
ExecStart=/usr/local/bin/prometheus \
--config.file=/etc/prometheus/prometheus.yml \
--storage.tsdb.path=/var/lib/prometheus \
--web.console.templates=/etc/prometheus/consoles \
--web.console.libraries=/etc/prometheus/console_libraries \
--web.listen-address=0.0.0.0:9090 \
--web.external-url=
SyslogIdentifier=prometheus
Restart=always
[Install]
WantedBy=multi-user.target
jim feldman
(31 rep)
Sep 19, 2022, 11:57 PM
• Last activity: Sep 21, 2022, 04:39 PM
0
votes
1
answers
2764
views
Merge jq output into a comma separated string like
I have this output and would like to convert it into a Prometheus-like format using [jq](https://stedolan.github.io/jq/).
cat /tmp/wp-plugin.txt | jq .[]
{
"name": "akismet",
"status": "active",
"update": "none",
"version": "5.0"
}
{
"name": "performance-lab",
"status": "active",
"update": "none",
"version": "1.4.0"
}
My goal is to get output like this using the jq CLI tool:
wp_plugins{name="akismet",status="active",update="none",version="5.0"}0
wp_plugins{name="performance-lab",status="active",update="active",version="1.4.0"}1
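The mapping from each object to one metric line can be sketched outside jq. This is a minimal Python sketch, not a jq filter, under stated assumptions: the function name is hypothetical, the sample value is set to 1 when `update` is anything other than `"none"` (the rule behind the 0 and 1 above is not stated), and a space is placed before the value as in the usual Prometheus text format.

```python
import json


def to_prometheus(plugins: list, metric: str = "wp_plugins") -> list:
    """Render one Prometheus-style line per plugin object."""
    lines = []
    for plugin in plugins:
        # Every key/value pair becomes a label, in the order it appears.
        labels = ",".join(f'{key}="{value}"' for key, value in plugin.items())
        # Assumed rule: 1 when an update is pending, otherwise 0.
        value = 0 if plugin.get("update") == "none" else 1
        lines.append(f"{metric}{{{labels}}} {value}")
    return lines


# The two objects from the question, wrapped in an array.
SAMPLE = """
[
  {"name": "akismet", "status": "active", "update": "none", "version": "5.0"},
  {"name": "performance-lab", "status": "active", "update": "none", "version": "1.4.0"}
]
"""

output = to_prometheus(json.loads(SAMPLE))
print("\n".join(output))
```

Label values containing double quotes or backslashes would need escaping, which this sketch does not do.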
Rostyslav Malenko
(103 rep)
Aug 24, 2022, 11:35 AM
• Last activity: Aug 24, 2022, 12:39 PM
0
votes
1
answers
99
views
Prometheus DiskTooManyReallocatedSectors
I have Prometheus Alert Manager running on several Linux machines. *(https://prometheus.io/docs/alerting/latest/alertmanager/)*
One of them is reporting *2 reallocated sectors*. I got the alert setup from here:
*https://awesome-prometheus-alerts.grep.to/rules.html*
**1. What is my course of action?** Replace with an SSD?
**2. What is the priority: weeks, months?**

DavidDunham
(117 rep)
Jul 2, 2022, 03:08 AM
• Last activity: Jul 3, 2022, 07:51 AM
1
votes
0
answers
786
views
"if then else" construct in Alertmanager (Prometheus) alert rules
Right now I have two rules, one for "Warning" and one for "Critical" alerts. Is it possible to combine them somehow, so as not to multiply entities?
- alert: Proxysql latency check
  expr: metric1 > 1
  for: 30s
  labels:
    severity: warning
    instance: "{{ $labels.node_name }}"
    label: name-channel
  annotations:
    summary: "Info"
- alert: Proxysql latency check
  expr: metric1 > 5
  for: 30s
  labels:
    severity: critical
    instance: "{{ $labels.node_name }}"
    label: name-channel
  annotations:
    summary: "Info"
An example of how I imagine it:
- alert: Proxysql latency check
  expr: if metric1 > 1 then "Warning" else metric > 5 "Critical"
  for: 30s
  labels:
    severity: warning | critical
    instance: "{{ $labels.node_name }}"
    label: name-channel
  annotations:
    summary: "Info"
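The decision the combined rule is meant to express can be written down as plain threshold logic. This is a minimal Python sketch of that intent only, not PromQL or Alertmanager syntax; the function name is hypothetical and the thresholds 1 and 5 come from the two rules above.

```python
from typing import Optional


def severity(value: float, warning: float = 1.0, critical: float = 5.0) -> Optional[str]:
    """Map a metric value to the severity the two rules would assign."""
    # Check the higher threshold first so a value above 5 is only "critical".
    if value > critical:
        return "critical"
    if value > warning:
        return "warning"
    # At or below the warning threshold no alert fires.
    return None


for sample in (0.5, 3, 6):
    print(sample, severity(sample))
```

With the two separate rules as written, a value above 5 satisfies both expressions, so both the warning and the critical alert fire; the sketch shows the non-overlapping behaviour a combined rule would presumably have.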
Alexander Kolesnik
(11 rep)
May 25, 2022, 10:46 AM
• Last activity: May 25, 2022, 10:47 AM
0
votes
1
answers
1636
views
How to calculate used memory on Ubuntu?
I just launched a new Ubuntu machine. Available memory is reasonable but free memory is very small:
cat /proc/meminfo |more
MemTotal: 2034484 kB
MemFree: 703496 kB
MemAvailable: 1538076 kB
Buffers: 80332 kB
Cached: 829408 kB
How should I calculate used_memory? According to https://stackoverflow.com/questions/41224738/how-to-calculate-system-memory-usage-from-proc-meminfo-like-htop
Used_mem = Total_mem- Free_mem
which does not make sense to me, because by this equation the machine has already used a lot of memory although it has not run anything yet. I feel that maybe
Used_mem = Total_mem- Available_mem
makes more sense.
So my question is: how should I calculate used memory? Or, how do I calculate the memory that is really free to use? The free memory shown above cannot be the right figure.
Note that I am calculating used_memory from node_exporter metrics.
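To make the difference between the two equations concrete, both can be evaluated on the figures quoted above. This is a minimal Python sketch, assuming the `Name: value kB` layout of /proc/meminfo; the function and variable names are made up for illustration.

```python
def parse_meminfo(text: str) -> dict:
    """Parse /proc/meminfo-style text into a mapping of field name to kB."""
    values = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        fields = rest.split()
        if fields:
            # The first field after the colon is the number; "kB" follows it.
            values[name.strip()] = int(fields[0])
    return values


# The figures quoted in the question.
SAMPLE = """\
MemTotal: 2034484 kB
MemFree: 703496 kB
MemAvailable: 1538076 kB
Buffers: 80332 kB
Cached: 829408 kB
"""

info = parse_meminfo(SAMPLE)
used_vs_free = info["MemTotal"] - info["MemFree"]
used_vs_available = info["MemTotal"] - info["MemAvailable"]
print(used_vs_free)       # 1330988
print(used_vs_available)  # 496408
```

On these numbers the first equation reports about 1.3 GB used and the second about 0.5 GB; the gap is mostly the Buffers and Cached memory, which the kernel can reclaim. With node_exporter the second subtraction would be `node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes`, assuming a version that exposes those metric names.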
user389955
(103 rep)
Dec 9, 2020, 10:42 PM
• Last activity: Dec 10, 2020, 03:54 AM
0
votes
1
answers
2672
views
blackbox_exporter failing to launch with exit code 203/EXEC
I am trying to follow this guide to install and setup blackbox_exporter:
https://devconnected.com/how-to-install-and-configure-blackbox-exporter-for-prometheus/
I have followed everything and can manually run the command from the systemd service and get it to run.
However, when I try to run `systemctl start blackbox.service` and then check the status, it fails with exit code 203/EXEC.
I checked the permissions on `/usr/local/bin/blackbox_exporter`:
-rwxr-xr-x. 1 blackbox blackbox 17050332 Nov 11 10:27 /usr/local/bin/blackbox_exporter
I can run the command from a terminal just fine:
/usr/local/bin/blackbox_exporter --config.file=/etc/blackbox/blackbox.yml --web.listen-address=:9115
Here is my systemd service:
[Unit]
Description=Blackbox Exporter Service
Wants=network-online.target
After=network-online.target
[Service]
Type=simple
User=blackbox
Group=blackbox
ExecStart=/usr/local/bin/blackbox_exporter \
--config.file=/etc/blackbox/blackbox.yml \
--web.listen-address=":9115"
Restart=always
[Install]
WantedBy=multi-user.target
Logs from `journalctl -u blackbox.service`:
Apr 30 08:26:55 localhost systemd: Started Blackbox Exporter Service.
Apr 30 08:26:55 localhost systemd: blackbox.service: Main process exited, code=exited, status=203/EXEC
Apr 30 08:26:55 localhost systemd: blackbox.service: Failed with result 'exit-code'.
I am using CentOS 8.
Any help would be greatly appreciated.
Tyler Radlick
(3 rep)
Apr 30, 2020, 01:45 PM
• Last activity: Apr 30, 2020, 03:47 PM
0
votes
2
answers
3285
views
Shut up a kernel error or avoid it by configuring Prometheus Node Exporter
Let me start by giving the output which I get to see in `/var/log/syslog` and with `dmesg`:
[559151.898586] ACPI Error: SMBus/IPMI/GenericSerialBus write requires Buffer of length 66, found length 32 (20170831/exfield-427)
[559151.911578] No Local Variables are initialized for Method [_PMM]
[559151.911580] No Arguments are initialized for method [_PMM]
[559151.911584] ACPI Error: Method parse/execution failed \_SB.PMI0._PMM, AE_AML_BUFFER_LIMIT (20170831/psparse-550)
[559151.916648] ACPI Exception: AE_AML_BUFFER_LIMIT, Evaluating _PMM (20170831/power_meter-338)
Clearly the [Prometheus Node Exporter](https://github.com/prometheus/node_exporter) is *triggering* the error, although it *doesn't seem to be the cause* for the error that gets logged.
Now what I want to achieve is - preferably - to tell the Prometheus Node Exporter to stop querying for whatever information it's attempting to query. Failing that, I'd like to silence these messages so they don't spam my log files.
How would I go about either of these options? ... or perhaps there are other options I haven't considered ...
This is happening on Ubuntu 18.04 with the packaged `prometheus-node-exporter` (it also happened with the 0.16 and 0.17 versions of `prometheus-node-exporter` which could be installed via `stretch-backports` - yes, on Ubuntu).
0xC0000022L
(16938 rep)
Nov 8, 2018, 09:13 PM
• Last activity: Jan 23, 2019, 07:57 PM