Unix & Linux Stack Exchange

Q&A for users of Linux, FreeBSD and other Unix-like operating systems

Latest Questions

1 votes

0 answers

47 views

Nagios agent not running

nrpe

Nagios is collecting stats from my RHEL server. I do not see that the nrpe agent nor snmp is running. How is this working?


DonDavis (11 rep)

Apr 6, 2023, 08:23 PM

0 votes

0 answers

41 views

how to monitor servers properly with nagios?

monitoring nagios nrpe

Currently, I'm using NRPE to monitor clients with Nagios, but when NRPE goes down all of that host's checks go to an UNKNOWN state, which makes it difficult to monitor critical alerts. What should I improve, and how, so that I never miss critical alerts?


user561715 (1 rep)

Feb 22, 2023, 06:19 AM

1 votes

1 answers

1156 views

How to validate NRPE config file?

ansible nagios nrpe

Nagios itself has a way to check its config file for validity, to ensure it will at least load the config without errors:

    /usr/local/nagios/bin/nagios -v /usr/local/nagios/etc/nagios.cfg

Is it possible to do the same thing for the NRPE daemon? The manual page for NRPE suggests it doesn't support that. I intend to update the NRPE config with Ansible's lineinfile module, so I want to check for validity to be sure at least I don't break the monitoring completely.
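
This is not an NRPE feature, only a minimal pre-flight sketch. It assumes the config follows the usual `name=value` and `command[name]=value` layout, and the helper name `lint_nrpe_cfg` is hypothetical. It checks the shape of each line, not whether NRPE would accept the values.

```bash
#!/bin/sh
# Hypothetical helper: rough syntax lint for an nrpe.cfg-style file.
# Assumption: every directive is "name=value" or "command[name]=value";
# blank lines and comment lines starting with '#' are ignored.
lint_nrpe_cfg() (
    file=$1
    if [ ! -r "$file" ]; then
        echo "cannot read $file" >&2
        return 2
    fi
    # grep -v selects lines that are neither blank/comment nor directive;
    # -n prefixes each offending line with its line number.
    bad=$(grep -nEv '^[[:space:]]*(#.*)?$|^[A-Za-z_][A-Za-z0-9_]*(\[[^]]+\])?=' "$file" || true)
    if [ -n "$bad" ]; then
        printf 'lines that do not look like directives:\n%s\n' "$bad" >&2
        return 1
    fi
    return 0
)
```

Ansible's lineinfile module accepts a `validate` command containing `%s`, so a small script wrapping a check like this could be run against the candidate file before it replaces the live one.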

Nikita Kipriyanov (1779 rep)

Sep 26, 2022, 05:56 AM • Last activity: Sep 26, 2022, 04:22 PM

0 votes

0 answers

1016 views

how to troubleshoot nagios check_nrpe issues?

monitoring nagios nrpe

                                  Steps done:

**On Nagios Server (CentOS 7)**

    yum install nagios nagios-plugins-all

**On Target (CentOS 7)**

    yum install nrpe nagios-plugins-all

modified `nrpe.cfg` and added master IP to `allowed_hosts`

    systemctl enable nrpe && systemctl start nrpe

Now I'm trying to add a service check with check_nrpe
so I defined host, contact, contactgroup then check_nrpe command

    nano /etc/nagios/objects/commands.cfg
    
    define command{
    	command_name check_nrpe
    	command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
    }

then added services for the nrpe check

    nano /etc/nagios/objects/services.cfg
    
    define service {
            use                             basic-new-service
            name                            check-load-service
            normal_check_interval           3
            retry_check_interval            1
            notification_interval           30
            notification_options            w,c,r,u
            check_command                   check_nrpe!check_load
            register                        0
            }
    
    define service {
            use                             check-load-service
            service_description             SYS_HostLoad
            contact_groups                  Audit
            host_name                       TGT
            }

nagios service started successfully

    systemctl enable nagios && systemctl start nagios

I can verify check_nrpe returns Ok status directly from CLI

    # /usr/lib64/nagios/plugins/check_nrpe -H TGT -c check_load
    OK - load average per CPU: 0.00, 0.00, 0.01|load1=0.000;0.150;0.300;0; load5=0.002;0.100;0.250;0; load15=0.008;0.050;0.200;0;

But on the dashboard the check_nrpe check does not succeed, apparently due to permissions (I had already set ownership to nagios:nagios for both /etc/nagios and /usr/lib64/nagios/plugins).
The dashboard shows this error for the check:

(Return code of 13 for service 'SYS_HostLoad' on host 'TGT' was out of bounds)
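
Nagios only understands plugin exit statuses 0 through 3, which is why 13 is reported as out of bounds. The sketch below is a debugging aid, not a fix for the underlying cause: a hypothetical `run_check` wrapper that runs a check and maps any status above 3 to UNKNOWN while keeping the original status in the message.

```bash
#!/bin/sh
# Hypothetical wrapper: run a check command and clamp its exit status to
# the range Nagios understands (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).
run_check() {
    "$@"
    rc=$?
    if [ "$rc" -gt 3 ]; then
        # Report the raw status so it is visible in the service output.
        echo "UNKNOWN - '$1' exited with status $rc"
        return 3
    fi
    return "$rc"
}
```

Running the same command line as the user Nagios runs as (for example with `sudo -u nagios`) is often a quicker way to see the real error than reading the dashboard.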


                                

Sollosa (1993 rep)

Sep 22, 2021, 10:40 AM • Last activity: Sep 22, 2021, 10:49 AM

0 votes

1 answers

2095 views

Unable to stop nagios nrpe server

ubuntu nagios nrpe

                                  > Note: I am referring to an old Ubuntu 14.04 ! :( Don't blame me, it's not my fault.

I have this process running.

When I kill it, something restarts it automatically.

I need to stop it!

I tried the following without success:

    # systemctl disable nrpe.service
    # systemctl stop nrpe.service
    # systemctl status nrpe.service
    nrpe.service - Nagios Remote Plugin Executor
       Loaded: loaded (/etc/systemd/system/nrpe.service; disabled)
       Active: failed (Result: exit-code) since Tue 2021-08-17 09:44:23 CEST; 14min ago
         Docs: http://www.nagios.org/documentation 
     Main PID: 21974 (code=exited, status=1/FAILURE)
       CGroup: name=dsystemd:/system/nrpe.service
    
    Aug 17 09:44:23 localhost nrpe: Starting up daemon
    Aug 17 09:44:23 localhost nrpe: Bind to port 5666 on 0.0.0.0 failed: Address already in use.

I tried also 

    # /etc/init.d/nagios-nrpe-server stop
    # /etc/init.d/nagios-nrpe-server status
    #

But in htop the process is still running


I queried the status of the service the old way

    # service --status-all
    [ - ]  nagios-nrpe-server

    # service nagios-nrpe-server stop

But the process is still here (with a different PID, so it's auto restarted)

Also

    # systemctl stop nagios-nrpe-server
    Failed to issue method call: Unit nagios-nrpe-server.service not loaded.
    # systemctl disable nagios-nrpe-server
    Failed to issue method call: No such file or directory
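
Since the process comes back with a new PID, something is respawning it, and the parent chain of the running process may show what. A minimal sketch, assuming a Linux `/proc` filesystem; the helper names `parent_pid` and `ancestry` are hypothetical.

```bash
#!/bin/sh
# Hypothetical helper: print the parent PID of process $1 (Linux /proc only).
parent_pid() {
    awk '/^PPid:/ { print $2 }' "/proc/$1/status" 2>/dev/null
}

# Hypothetical helper: print "pid name" for process $1 and each ancestor,
# stopping when the parent PID is 0 or the process has gone away.
ancestry() (
    pid=$1
    while [ -n "$pid" ] && [ "$pid" -gt 0 ]; do
        name=$(awk '/^Name:/ { print $2 }' "/proc/$pid/status" 2>/dev/null) || break
        echo "$pid $name"
        pid=$(parent_pid "$pid")
    done
)
```

Calling `ancestry` with the PID shown in htop lists the process and its ancestors, one per line. If the direct parent is a supervisor rather than init, that supervisor is the thing to stop or reconfigure.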


                                

realtebo (1035 rep)

Aug 17, 2021, 08:00 AM • Last activity: Aug 17, 2021, 08:15 AM

1 votes

2 answers

5578 views

Don't have directory '/usr/lib/x86_64-linux-gnu'. But instructions to install a software says './configure --with-ssl-lib=/usr/lib/x86_64-linux-gnu'

centos libraries nrpe

                                  I am trying to install NRPE (nagios remote plugin executor) in a CentOS system. In the configuration step, the document I have been given to refer to says,

    ./configure --enable-command-args --with-nagios-user=nagios --with-nagios-group=nagios --with-ssl=/usr/bin/openssl --with-ssl-lib=/usr/lib/x86_64-linux-gnu

But the directory /usr/lib/x86_64-linux-gnu is not present in my system. These are the contents of the /usr/lib directory:

    [root@pr2 ~]# ls /usr/lib
    lib/     lib64/   libexec/ 

    [root@pr2 ~]# ls /usr/lib/
    binfmt.d/          grub/              NetworkManager/    sysctl.d/
    cpp                kbd/               polkit-1/          systemd/
    debug/             kdump/             python2.7/         tmpfiles.d/
    dracut/            kernel/            rpm/               tuned/
    firewalld/         locale/            sendmail           udev/
    firmware/          modprobe.d/        sendmail.postfix   yum-plugins/
    games/             modules/           sendmail.sendmail  
    gcc/               modules-load.d/    sse2/ 

According to the answer here, it may be that more modern systems have this directory (although the question and answer in the link are about Ubuntu, I assumed the same is true of CentOS systems, since my installation doc for CentOS mentions this directory). So, **what do I replace the location in the aforementioned command with if I am missing that directory**?

In case it helps, this is the version of CentOS in my machine:

    [root@pr2 ~]# rpm --query centos-release
    centos-release-7-4.1708.el7.centos.x86_64
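
Rather than hard-coding a path from another distribution, the directory can be looked up. The sketch below assumes the OpenSSL shared library is named `libssl.so*`; the function name `find_ssl_libdir` and the candidate list are hypothetical.

```bash
#!/bin/sh
# Hypothetical helper: print the first directory among the arguments that
# contains a libssl.so* file, or return 1 if none of them does.
find_ssl_libdir() {
    for dir in "$@"; do
        for lib in "$dir"/libssl.so*; do
            # An unmatched glob stays literal, so test that the file exists.
            if [ -e "$lib" ]; then
                echo "$dir"
                return 0
            fi
        done
    done
    return 1
}
```

On a 64-bit CentOS system, `find_ssl_libdir /usr/lib64 /usr/lib /usr/lib/x86_64-linux-gnu` would typically print `/usr/lib64`, and that value is what `--with-ssl-lib=` should receive.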
                                

Kristada673 (133 rep)

Jan 22, 2018, 02:40 AM • Last activity: Feb 6, 2021, 03:27 PM

0 votes

1 answers

3917 views

NRPE Could not complete SSL handshake - Peer did not return a certificate

monitoring openssl ssl nagios nrpe

I am getting SSL handshake errors with NRPE after enabling SSL. check_nrpe worked perfectly fine without SSL. The allowed host is correct, and when run without SSL enabled it shows the proper version. Both are running 4.3 on CentOS Linux release 7.9.2009 (Core). I did not compile NRPE or Nagios from source; I installed them via yum.

Here are the configs I feel are important to this issue. 

here is the error I'm getting logged... It says wrong version but both are running same version of NRPE. 

I am using a real purchased wildcard cert... Same cert on both sides. Cert matches the domain name of the server.

    nrpe --version
    NRPE - Nagios Remote Plugin Executor
    Version: 4.0.3

Same version on both for openssl

    openssl version
    OpenSSL 1.0.2k-fips  26 Jan 2017

When I run ./check_nrpe -H hostname.domain.com I get 

    CHECK_NRPE: (ssl_err != 5) Error - Could not complete SSL handshake with 10.1.1.125: 1

On the other server it logs:

    Jan  5 12:48:54 nagiostest2 nrpe: Error: (ERR_get_error_line_data = 336130315), Could not complete SSL handshake with 10.1.1.64: wrong version number
    Jan  5 12:51:11 nagiostest2 nrpe: CONN_CHECK_PEER: checking if host is allowed: 10.1.1.64 port 16075
    Jan  5 12:51:11 nagiostest2 nrpe: is_an_allowed_host (AF_INET): is host >10.1.1.6410.1.1.6410.1.1.6410.1.1.64<
    Jan  5 12:51:11 nagiostest2 nrpe: is_an_allowed_host (AF_INET): host is in allowed host list!
    Jan  5 12:51:11 nagiostest2 nrpe: Error: (ERR_get_error_line_data = 336105671), Could not complete SSL handshake with 10.1.1.64: peer did not return a certificate

Here are the important portions of my nrpe.cfg:

    debug=1
    
    ssl_cipher_list=ALL:!aNULL:!eNULL:!SSLv2:!LOW:!EXP:!RC4:!MD5:@STRENGTH
    
    ssl_version=TLSv1.1+
    
    #ssl_cipher_list=ALL:!MD5:@STRENGTH
    #ssl_cipher_list=ALL:!MD5:@STRENGTH:@SECLEVEL=0
    ssl_cipher_list=ALL:!aNULL:!eNULL:!SSLv2:!LOW:!EXP:!RC4:!MD5:@STRENGTH
    
    # SSL Certificate and Private Key Files
    
    ssl_cacert_file=/etc/nagios/ssl/ca.crt
    ssl_cert_file=/etc/nagios/ssl/star.mydomain.com.crt
    ssl_privatekey_file=/etc/nagios/ssl/star.mydomain.com.key
    
    # SSL USE CLIENT CERTS
    # This options determines client certificate usage.
    # Values: 0 = Don't ask for or require client certificates (default)
    #         1 = Ask for client certificates
    #         2 = Require client certificates
    ssl_client_certs=2
    
    # Enables all SSL Logging
    ssl_logging=0xff

Thank you for any help ahead of time!

Keith Shannon (83 rep)

Jan 5, 2021, 09:09 PM • Last activity: Jan 6, 2021, 01:59 PM

1 votes

1 answers

1131 views

check_nrpe command doesn't work from nagios server since debian upgrade

debian upgrade nagios nrpe

                                  Yesterday I upgraded a server from Debian 9 to Debian 10. This server is supervised with nagios. Since the upgrade, I get an alert, status Unknown saying :

>  "Volumegroup array03-0 wasn't valid or wasn't specified with "-v
> Volumegroup", bye. false

The service is VG baie03-0 usage, its command is check_nrpe!check_vgs_array03-0. The goal of this service is to generate an alert if storage on the array is almost full.

check_nrpe command is standard :

    # 'check_NRPE' command definition
    define command{
            command_name check_nrpe
            command_line $USER1$/check_nrpe -H $HOSTADDRESS$ -c $ARG1$
            }

If I'm not mistaken, it means that I have a check_vgs_array03-0 command in my /etc/nagios/nrpe.cfg on the supervised server. Let's look at it, here it is :

    command[check_vgs_array03-0]=/usr/lib/nagios/plugins/check_vg_size -w 20 -c 10 -v array03-0

If I just type this command on the supervised server, I have no errors, it works.

    VG array03-0 OK Available space is 805 GB;| array03-0=805GB;20;10;0;19155

I got the error if, for example, I type a volumegroup name that doesn't exist.

check_vg_size plugin script goes like this :

    #!/bin/bash
    #check_vg_size
    #set -x
    # Plugin for Nagios
    # Written by M. Koettenstorfer (mko@lihas.de)
    # Some additions by J. Schoepfer (jsc@lihas.de)
    # Major changes into functions and input/output values J. Veverka (veverka.kuba@gmail.com)
    # Last Modified: 2012-11-06
    #
    # Description:
    #
    # This plugin will check howmany space in volume groups is free
    
    # Nagios return codes
    STATE_OK=0
    STATE_WARNING=1
    STATE_CRITICAL=2
    STATE_UNKNOWN=3
    STATE_DEPENDENT=4
    
    SERVICEOUTPUT=""
    SERVICEPERFDATA=""
    
    PROGNAME=$(basename $0)
    
    vgs_bin=$(/usr/bin/whereis -b -B /sbin /bin /usr/bin /usr/sbin -f vgs | awk '{ print $2 }')
    _vgs="$vgs_bin --units=g"
    
    bc_bin=$(/usr/bin/whereis -b -B /sbin /bin /usr/bin /usr/sbin -f bc | awk '{ print $2 }')
    
    exitstatus=$STATE_OK #default
    declare -a volumeGroups;
    novg=0; #number of volume groups
    allVG=false; #Will we use all volume groups we can find on system?
    inPercent=false; #Use percentage for comparison?
    
    unitsGB="GB"
    unitsPercent="%"
    units=$unitsGB
    
    ########################################################################
    ### DEFINE FUNCTIONS
    ########################################################################
    
    print_usage() {
            echo "Usage: $PROGNAME  -w  -c  -v  [-a] [-p]"
            echo "If '-a' and '-v' are specified: all volumegroups defined by -v will be ommited and the remaining groups which are found on system are checked"
            echo "If '-p' is specified: the warning and critical levels are represented as the percent space left on device"
        echo ""
    }
    
    print_help() {
            print_usage
            echo ""
            echo "This plugin will check how much space is free in volume groups"
            echo "usage: "
            exit $STATE_UNKNOWN
    }
    
    
    checkArgValidity () {
    # Check arguments for validity
            if [[ -z $critlevel || -z $warnlevel ]] # Did we get warn and crit values?
            then
                    echo "You must specify a warning and critical level"
                    print_usage
                    exitstatus=$STATE_UNKNOWN
                    exit $exitstatus
            elif [ $warnlevel -le $critlevel ] # Do the warn/crit values make sense?
            then
            if [ $inPercent != 'true' ]
            then
                echo "CRITICAL value of $critlevel GB is less than WARNING level of $warnlevel GB"
                print_usage
                exitstatus=$STATE_UNKNOWN
                exit $exitstatus
            else
                echo "CRITICAL value of $critlevel % is higher than WARNING level of $warnlevel %"
                print_usage
                exitstatus=$STATE_UNKNOWN
                exit $exitstatus
            fi
            fi
    }
    
    #Does volume group actually exist?
    volumeGroupExists () {
            local volGroup="$@"
            VGValid=$($_vgs 2>/dev/null | grep "$volGroup" | wc -l )
    
            if [[  -z "$volGroup" ||  $VGValid = 0 ]]
            then
                    echo "Volumegroup $volGroup wasn't valid or wasn't specified"
                    echo "with \"-v Volumegroup\", bye."
                    echo false
                    return 1
            else
                    #The volume group exists
                    echo true
                    return 0
            fi
    }
    
    getNumberOfVGOnSystem () {
            local novg=$($_vgs 2>/dev/null | wc -l)
            let novg--
            echo $novg
    }
    
    getAllVGOnSystem () {
            novg=$(getNumberOfVGOnSystem)
            local found=false;
            for (( i=0; i /dev/null | head -n1 | awk -v name=$columnName '
                    BEGIN{}
                            { for(i=1;i/dev/null | awk -v n=$cnFree '/[0-9]/{print $n}' | sed -e 's/[\.,\,].*//'`;
            fullspace=$_vgs $volumeName 2>/dev/null | awk -v n=$cnSize '/[0-9]/{print $n}' | sed -e 's/[\.,\,].*//';
    
            if ( $inPercent ); then
            #Convert to Percents
                    freespace="$(convertToPercent $freespace $fullspace)"
            fi
    }
    
    setExitStatus () {
            local status=$1
            local volGroup="$2"
            local formerStatus=$exitstatus
    
            if [ $status -gt $formerStatus ]
            then
                    formerStatus=$status
            fi
    
            if [ $status = $STATE_UNKNOWN ] ; then
                    SERVICEOUTPUT="${volGroup}"
                    exitstatus=$STATE_UNKNOWN
                    return
            fi
    
            if [ "$freespace" -le "$critlevel" ]
            then
                    SERVICEOUTPUT=$SERVICEOUTPUT" VG $volGroup CRITICAL Available space is $freespace $units;"
                    exitstatus=$STATE_CRITICAL
            elif [ "$freespace" -le "$warnlevel" ]
            then
                    SERVICEOUTPUT=$SERVICEOUTPUT"VG $volGroup WARNING Available space is $freespace $units;"
                    exitstatus=$STATE_WARNING
            else
                    SERVICEOUTPUT=$SERVICEOUTPUT"VG $volGroup OK Available space is $freespace $units;"
                    exitstatus=$STATE_OK
            fi
    
            SERVICEPERFDATA="$SERVICEPERFDATA $volGroup=$freespace$units;$warnlevel;$critlevel"
            if [ $inPercent != 'true' ] ; then
    
                    SERVICEPERFDATA="${SERVICEPERFDATA};0;$fullspace"
            fi
    
            if [ $formerStatus -gt $exitstatus ]
            then
                    exitstatus=$formerStatus
            fi
    }
    
    
    checkVolumeGroups () {
    checkArgValidity
            for (( i=0; i &2
                            ;;
            esac
    done
    
    checkVolumeGroups
    
    
    echo $SERVICEOUTPUT"|"$SERVICEPERFDATA
    exit $exitstatus


If I use another argument (another script) with the check_nrpe command, it works.

for example :

    root@nagiosserver:/usr/local/nagios# /usr/local/nagios/libexec/check_nrpe -H srv-supervised04 -c check_load
    OK - load average: 3.79, 2.99, 1.83|load1=3.790;25.000;30.000;0; load5=2.990;20.000;25.000;0; load15=1.830;15.000;20.000;0;

VG array03-0 does exist :

    root@srv-supervised04:/usr/lib/nagios/plugins# vgdisplay
      --- Volume group ---
      VG Name               array03-0
      System ID
      Format                lvm2
      Metadata Areas        1
      Metadata Sequence No  34
      VG Access             read/write
      VG Status             resizable
      MAX LV                0
      Cur LV                5
      Open LV               4
      Max PV                0
      Cur PV                1
      Act PV                1
      VG Size        4903887   Alloc PE / Size       4697600 /  206287 /  OgzAMF-DGbW-3t3L-Wk7k-gY1g-s6fH-zYEKad


So the VG does exist. The check_vg_size plugin works when run locally, and check_nrpe works from the Nagios server with another plugin, but check_vg_size doesn't work from the Nagios server. The error message claims that array03-0 doesn't exist, although it does. I haven't changed any of these files. The problem appeared with the Debian upgrade from 9 to 10 (during the installation, I decided to keep my modified nrpe.cfg).

Does anyone know where this could come from? The Debian version? A new bash version, maybe? An incompatibility between the Nagios server (still Debian 9) and the supervised one (Debian 10)?
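
One detail in the plugin is worth noting: `volumeGroupExists` runs `vgs` with `2>/dev/null` and counts matching lines, so a `vgs` call that fails outright looks exactly like a missing volume group. The sketch below does not identify the root cause; it only separates the two cases. The function name `vg_exists` and its return codes are hypothetical, while `--noheadings` and `-o vg_name` are standard `vgs` options.

```bash
#!/bin/sh
# Hypothetical replacement for the existence test.
# $1 = path to the vgs binary, $2 = volume group name.
# Returns 0 if the VG is listed, 1 if it is absent, 3 if vgs itself failed.
vg_exists() (
    vgs_cmd=$1
    vg=$2
    # Keep stderr so a failure of vgs is reported instead of hidden.
    if ! out=$("$vgs_cmd" --noheadings -o vg_name 2>&1); then
        echo "UNKNOWN - $vgs_cmd failed: $out"
        return 3
    fi
    # vgs pads names with spaces; compare whole fields, not substrings.
    if printf '%s\n' "$out" | awk -v vg="$vg" '$1 == vg { found = 1 } END { exit !found }'; then
        return 0
    fi
    echo "CRITICAL - volume group $vg not found"
    return 1
)
```

If the UNKNOWN branch fires only when the plugin is started by the NRPE daemon, the message it prints should say why `vgs` could not list the groups in that context.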


                                

JadenBZH (23 rep)

Sep 1, 2020, 02:37 PM • Last activity: Sep 7, 2020, 12:08 PM

1 votes

1 answers

1066 views

How can I get hostapd_cli to work under sudo on debian stretch?

debian command-line sudo hostapd nrpe

I have a bash script that runs `hostapd_cli all_sta`, and the script executes successfully from the command line under both jessie and stretch. The script also works when run under sudo on jessie but not on stretch. On stretch the command times out with the error `'STA-FIRST' command timed out`.

When I invoke hostapd_cli under strace I see that it opens a socket file under /tmp:

    bind(3, {sa_family=AF_UNIX, sun_path="/tmp/wpa_ctrl_13552-1"}, 110) = 0
    connect(3, {sa_family=AF_UNIX, sun_path="/var/run/hostapd/wlan1"}, 110) = 0

As a test I temporarily modified the script and added a line:

    echo "this is a test" >/tmp/test 2>/root/error

When the modified script runs under sudo, the file in /tmp is not created and no error is written to /tmp/error. On my system, /tmp is not a tmpfs, just a plain old directory under / on an ext3 filesystem. So root is unable to create a file under /tmp and there is ample space.

    # df -h /tmp
    Filesystem      Size  Used Avail Use% Mounted on
    /dev/sdb2       6.7G  5.1G  1.4G  80% /

And an ls -ld /tmp gives:

    # ls -ld /tmp
    drwxrwxrwt 9 root root 4096 Jul 27 23:50 /tmp/

If I can figure out why /tmp can't be written to, I believe the hostapd_cli command will work. What could be happening here?

Bob (111 rep)

Jul 28, 2017, 05:07 AM • Last activity: Aug 5, 2020, 06:56 PM

0 votes

2 answers

12726 views

Find command to check only for the last 10 minutes and not the whole folder

linux find nagios nrpe

I have an NVR system that records video surveillance footage onto a file server (Debian). I made a Nagios plugin to check whether the NVR system is recording correctly and to send me a notification when it stops. The issue is that I'm using a find command:

    find /srv/unifi-video/videos/ -name '*.ts' -mmin -10 | wc -l

and since it's checking through 400 GB+ of files, Nagios keeps timing out and sending "NRPE Socket timed out" messages.

Is it possible to make the find command search only for files created in the past 10 minutes, without scanning the whole folder?
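
If the plugin only needs to know whether any recent recording exists, find can stop at the first match instead of listing and counting every one. A minimal sketch, assuming GNU find (for `-quit`); the function name `recent_recording_exists` is hypothetical. Note that `-mmin` tests modification time, not creation time.

```bash
#!/bin/sh
# Hypothetical check: succeed as soon as one *.ts file modified within the
# last $2 minutes is found under directory $1. Assumes GNU find.
recent_recording_exists() {
    # -print -quit prints the first match and stops walking the tree.
    [ -n "$(find "$1" -name '*.ts' -mmin "-$2" -print -quit)" ]
}
```

This helps only when a recent file exists; when nothing matches, find still walks the whole tree. The walk's cost depends on the number of files and directories rather than their total size, so pointing `$1` at the narrowest directory that can contain new recordings matters more than anything else.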

Youssef Karami (1 rep)

Sep 22, 2015, 04:16 PM • Last activity: Dec 23, 2019, 06:00 PM

0 votes

0 answers

810 views

Open port 12489 for nrpe

debian nagios nrpe

I'm learning a little bit of Nagios and Unix (Debian buster), and I'm getting an error when monitoring CPU load and memory usage.

The feedback I get from the monitor is:

    connect to address 192.168.1.94 and port 12489: Connection refused
    could not fetch information from server

So I went to the client machine and ran netstat -an, and the port doesn't seem to be listening.

I tried adding some iptables rules for port 12489 on TCP. The rule shows up as "ACCEPT tcp -- anywhere anywhere tcp dpt:12489", and I then ran:

    sudo iptables-save

What am I missing? The machines are on the same local network.

Am I missing some configuration or plugin?

Thank you

Navy Seal (103 rep)

Aug 20, 2019, 10:56 PM • Last activity: Aug 20, 2019, 11:14 PM

2 votes

0 answers

988 views

shebang with /usr/bin/env and sudo

sudo freebsd nagios nrpe

I have lots of scripts; they usually start with a shebang using /usr/bin/env and the required interpreter (for example "#!/usr/bin/env perl"). This has worked fine for many years, but for some reason I don't understand, I have one script on one FreeBSD machine where this breaks under sudo:

    % cat test.pl
    #!/usr/bin/env perl
    system( "id" );
    exit 0;

Executing this as user nagios works fine. Executing this with sudo as user nagios also works fine. But, executing this through nrpe daemon from monitoring server, where the nrpe is running as user nagios and using sudo as command_prefix it exits with error code 3. Since nrpe doesn't show any reasons, but just the exit code, I have no clue why this doesn't work. (Yes, sudo seems to be configured correctly to allow /usr/bin/env and test.pl, as it works on commandline)

    nrpe: Running command: /usr/local/bin/sudo /usr/local/etc/nagios/test.pl
    nrpe: Command completed with return code 3 and output: 
    nrpe: Return Code: 3, Output: NRPE: Unable to read output

Yes, I could change the shebang string to /usr/local/bin/perl, which then works with sudo through nrpe, but the script is intended to be generic for different OS types. So, any idea what I'm missing here?
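
One possibility worth ruling out (an assumption, the logs do not confirm it): `/usr/bin/env perl` finds perl through `PATH`, and the `PATH` in effect when the NRPE daemon starts sudo may not include `/usr/local/bin`. The sketch below tests whether a program resolves under a given `PATH`; the function name `resolves_under_path` is hypothetical.

```bash
#!/bin/sh
# Hypothetical probe: can program $2 be resolved when PATH is exactly $1?
# Prints the resolved path on success, returns non-zero otherwise.
# The body runs in a subshell, so the caller's PATH is left untouched.
resolves_under_path() (
    PATH=$1
    command -v "$2"
)
```

For example, `resolves_under_path /sbin:/bin:/usr/sbin:/usr/bin perl` shows whether perl would be found with a minimal system PATH. Logging `$PATH` from a trivial NRPE command would then show what the daemon actually passes on.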

frank42 (121 rep)

May 15, 2019, 11:23 AM • Last activity: Jun 3, 2019, 08:07 PM

2 votes

1 answers

4314 views

Why does this Bash NRPE plugin not return a variable to Nagios?

shell-script sudo nagios nrpe

                                  This script I have here works locally just fine:

    #! /bin/bash
    volts=$(sudo vcgencmd measure_volts core | sed 's/volt=\([0-9\.]*\)V/\1/')
    echo -n "BCM2835 SoC Voltage is ${volts}V "
    echo "| volts=$volts;1.5;1.5;0;1.5"

However if Nagios tries to get the information it only gets "BCM2835 SoC Voltage is V" as if the variable was not defined.
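
An empty `${volts}` suggests that the `sudo vcgencmd` pipeline printed nothing on stdout when started by NRPE. A minimal sketch that separates the parsing from the command and makes an empty result detectable; the function name `parse_volts` is hypothetical, and the input format `volt=<number>V` is taken from the sed expression in the script.

```bash
#!/bin/sh
# Hypothetical parser: read a line such as "volt=1.2000V" on stdin and
# print the number. Prints nothing and returns 1 if the input does not match.
parse_volts() {
    # sed -n ... p prints only lines where the substitution succeeded;
    # grep . turns "no output" into a non-zero exit status.
    sed -n 's/^volt=\([0-9.]*\)V$/\1/p' | grep .
}
```

In the plugin, a failed `volts=$(sudo vcgencmd measure_volts core 2>&1 | parse_volts)` could then print an UNKNOWN line and exit 3 instead of reporting a blank voltage.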

Other plugins that pull their information from files do work, so I managed to work around it by writing the information to a temp file and reading it back into the variable.

    #! /bin/bash
    sudo vcgencmd measure_volts core|sed 's/volt=\([0-9\.]*\)V/\1/'>/tmp/volts
    volts=$(
                              
                            

syss (701 rep)

Aug 12, 2013, 08:56 AM • Last activity: Mar 20, 2019, 01:14 PM

1 votes

1 answers

4592 views

ERROR: CHECK_NRPE: Socket timeout after 10 seconds

iptables amazon-ec2 nagios aws nrpe

                                  getting following error :

    # /usr/local/nagios/libexec/check_nrpe -H nagios-server-ip
    CHECK_NRPE: Socket timeout after 10 seconds.

But it's working for localhost 

    # /usr/local/nagios/libexec/check_nrpe -H localhost
    NRPE v2.15

P.S. I have checked the security groups as well as iptables.

Also on Nagios server :

    # /usr/local/nagios/libexec/check_nrpe -H localhost
    NRPE v2.13
    [root@ADM-PROD-NAGIOS ec2-user]# /usr/local/nagios/libexec/check_nrpe -H monitoring-host-ip
    NRPE v2.15
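
A socket timeout, as opposed to "connection refused", often means packets are being dropped somewhere on the path. Testing the raw TCP connection takes NRPE itself out of the picture. A minimal sketch, assuming bash (for its `/dev/tcp` redirection) and the coreutils `timeout` command are available; the function name `port_open` is hypothetical.

```bash
#!/bin/sh
# Hypothetical probe: succeed if a TCP connection to host $1, port $2 can
# be opened within $3 seconds (default 5).
port_open() {
    # bash treats /dev/tcp/HOST/PORT in a redirection as a TCP connect.
    # Inside the bash -c script, $0 is the host and $1 is the port.
    timeout "${3:-5}" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$1" "$2" 2>/dev/null
}
```

Run from each side against port 5666 (the port NRPE binds in the logs earlier on this page), for example `port_open nagios-server-ip 5666`. A failure in one direction only points at a firewall or security-group rule for that direction.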

Ashish Karpe (302 rep)

Nov 25, 2015, 10:46 AM • Last activity: Apr 13, 2018, 01:07 PM

0 votes

1 answers

1504 views

Nagios: are certificates required for SSL/TLS?

ssl certificates nagios nrpe

Configuring check_nrpe and NRPE daemon with --enable-ssl generates a DH key pair. Are these DH keys enough to establish SSL connection? Or do we need certs and keys signed by a CA? Docs say certificates CAN be used for security.

                                  Configuring check_nrpe and NRPE daemon with --enable-ssl generates a DH key pair. Are these DH keys enough to establish SSL connection? Or do we need certs and keys signed by a CA? Docs say certificates CAN be used for security.
                                

pdns (275 rep)

Oct 28, 2017, 12:20 AM • Last activity: Oct 28, 2017, 06:17 AM

1 votes

0 answers

630 views

Nagios plugin fails to run a command over NRPE

nagios nrpe

                                  I have the following plugin whose status is OK on core.

    #!/usr/local/bin/bash
    
    if [ "$1" = "-w" ] && [ "$2" -lt "101" ] && [ "$3" = "-c" ] && [ "$4" -lt "101" ] ; then
      warn=$2
      crit=$4
    
      AVAILMEMPERC=$(free -m | grep mem_avail | awk '{print $7}'| tr -d %])
    
      if [ ${AVAILMEMPERC} -gt $warn ] && [ ${AVAILMEMPERC} -gt $crit ];then
        echo "OK - Available Memory = $AVAILMEMPERC% | Available memory=$AVAILMEMPERC%;$warn;$crit;0;100"
        exit 0
      elif [ ${AVAILMEMPERC} -lt $warn ] && [ ${AVAILMEMPERC} -gt $crit ]; then
        echo "WARNING - Available Memory = $AVAILMEMPERC% | Available memory=$AVAILMEMPERC%;$warn;$crit;0;100"
        exit 1
      else
        echo "CRITICAL - Available Memory = $AVAILMEMPERC% | Available memory=$AVAILMEMPERC%;$warn;$crit;0;100"
        exit 2
      fi
    else
      echo "$0 - Nagios Plugin for checking the available memory in a Linux system"
      echo ""
      echo "Usage:    $0 -w <warnlevel> -c <critlevel>"
      echo "  = warnlevel and critlevel is warning and critical value for alerts."
      echo ""
      echo "EXAMPLE:  $0 -w 10 -c 5 "
      echo "  = This will send warning alert when available memory is less than 10%, and send critical when it is less than 5%"
      echo ""
      exit 3
    fi

When I run it locally on the remote machine, it runs fine. I get the right output. But on the web GUI, I see that Nagios cannot extract the variable AVAILMEMPERC value 

For example, if I simplify the plugin to below

    #!/usr/local/bin/bash
    
    warn=$2
    crit=$4
    
    AVAIL_MEM_PERCENTAGE="$(free -m)"
    
    echo "OK - ${AVAIL_MEM_PERCENTAGE}"

The only output I see on the GUI is

    OK -

When I run it on the command line, I do get the entire free -m output.

I tried the following and it doesn't write anything. I gave 777 permissions to /tmp and the files.

    free -m > /tmp/check_avail_memory.out

Seems like a permissions issue? It runs on Nagios Core though. If I replace free with top, nagios is able to write to the file.

I downloaded free from http://people.freebsd.org/~rse/dist/freebsd-memory and, as I said, it runs fine on the remote machine. I have made sure the paths are correct on FreeBSD and that it is executable.

I couldn't find any logs relevant to this except for the plugin output.
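
One way to narrow this down is to make the plugin report what happens to the command under NRPE instead of only its stdout. The following is a minimal debugging sketch, not part of the original plugin: the helper name `run_and_report` is hypothetical, and it assumes the failure leaves a trace on stderr or in the exit status. It folds the exit code, the PATH seen by the plugin, and the combined stdout/stderr into one line so that they show up in the web GUI.

```bash
#!/usr/bin/env bash
# Debugging sketch (hypothetical helper, not from the original plugin).

# run_and_report CMD [ARGS...]
# Runs CMD, captures stdout and stderr together, and prints a single line
# with the exit code, the PATH in effect, and the captured text.
run_and_report() {
  local out rc
  out=$("$@" 2>&1) && rc=0 || rc=$?
  out=${out//$'\n'/ }   # flatten to one line so it fits the status field
  echo "DEBUG - rc=$rc PATH=$PATH output: ${out:-<empty>}"
}

# Inside the plugin, the bare call would become:
#   run_and_report free -m
# Here a deliberately missing command shows what a lookup failure looks like.
run_and_report no_such_command_for_demo
```

If the line reported through NRPE shows rc=127, the shell could not find the command in the PATH that NRPE uses, which would be worth comparing with the PATH of the interactive session where the plugin works.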


                                

pdns (275 rep)

Oct 26, 2017, 01:14 AM • Last activity: Oct 26, 2017, 12:04 PM

0 votes

1 answers

3339 views

libssl.so.6 not found for check_nrpe

linux centos nagios nrpe

I couldn't get NRPE to work, so I ran it locally and this error popped up:

    ./check_nrpe: error while loading shared libraries: libssl.so.6: cannot open shared object file: No such file or directory

After checking check_nrpe with the ldd command, I got this:

    [root@supervision lib64]# ldd /srv/eyesofnetwork/nagios-3.5.1/plugins/check_nrpe
            linux-gate.so.1 =>  (0xf7744000)
            libssl.so.6 => not found
            libcrypto.so.6 => not found
            libnsl.so.1 => /lib/libnsl.so.1 (0xf7720000)
            libc.so.6 => /lib/libc.so.6 (0xf7562000)
            /lib/ld-linux.so.2 (0xf7745000)

But libssl.so.6 and libcrypto.so.6 are provided by a package:

    [root@supervision lib64]# yum provides /usr/lib/libcrypto.so.6
        Loaded plugins: fastestmirror
        Loading mirror speeds from cached hostfile
         * base: mirror.in2p3.fr
         * epel: mirrors.ircam.fr
         * extras: mirror.in2p3.fr
         * updates: mirror.in2p3.fr
        openssl098e-0.9.8e-29.el7.centos.3.i686 : A compatibility version of a general cryptography and TLS library
        Repo        : base
        Matched from:
        Filename    : /usr/lib/libcrypto.so.6

    [root@supervision lib64]# yum provides /usr/lib/libssl.so.6
        Loaded plugins: fastestmirror
        Loading mirror speeds from cached hostfile
         * base: mirror.in2p3.fr
         * epel: mirrors.ircam.fr
         * extras: mirror.in2p3.fr
         * updates: mirror.in2p3.fr
        openssl098e-0.9.8e-29.el7.centos.3.i686 : A compatibility version of a general cryptography and TLS library
        Repo        : base
        Matched from:
        Filename    : /usr/lib/libssl.so.6

I created symbolic links:

    [root@supervision lib64]# sudo ln -s /lib64/libssl.10 /lib/libssl.so.6
    [root@supervision lib64]# sudo ln -s /lib64/libcrypto.so.10 /lib/libcrypto.so.6

But it still doesn't work when I run the ldd command. Can you help me?
                                

Hujino (1 rep)

Feb 1, 2017, 11:14 AM • Last activity: Feb 26, 2017, 08:53 PM

3 votes

0 answers

859 views

Nagios - Help with getting NRPE to work with check_fail2ban.sh

centos monitoring nagios fail2ban nrpe

I am trying to monitor fail2ban with Nagios, so I found the following check via a Google search:
http://nagios.fm4dd.com/plugins/manual/check_fail2ban.htm 

I am trying to get the check to work on a remote host, but I am unable to get it to return accurate results.  I am using Fail2ban v0.9.3 on CentOS 7, so I had to make one change to the script per the following link:
https://exchange.nagios.org/directory/Plugins/Security/Firewall-Software/check_fail2ban/details#rev-3948 

**NOTE:** *All output below is from the "Remote Server" and not my "Nagios Server".*



**The change I made (Line 108) is below:**

    jail_list=$($fail2ban_client status|grep "list" |cut -d : -f 2 |tr -d ,)


**I already gave the Nagios user & NRPE permissions per the wiki:**

    setfacl -m u:nagios:rwx /var/run/fail2ban/fail2ban.sock


**I am able to run fail2ban-client and the script as both the nagios and nrpe users:**

    [root@localhost plugins]# sudo -u nrpe fail2ban-client status
    Status
    |- Number of jail:      2
    `- Jail list:   openvpn, sshd
    
    [root@localhost plugins]# sudo -u nagios fail2ban-client status
    Status
    |- Number of jail:      2
    `- Jail list:   openvpn, sshd
    
    [root@localhost etc]# sudo -u nagios /usr/lib64/nagios/plugins/check_fail2ban.sh -w 10 -c 20
    OK: 1 banned IP(s) in 2 active jails|banned_IP=1;10;20;;
    jail openvpn blocks 1 IP(s): 76.123.218.206
    jail sshd blocks 0 IP(s):
    | openvpn=1;;;; sshd=0;;;;
    
    [root@localhost etc]# sudo -u nrpe /usr/lib64/nagios/plugins/check_fail2ban.sh -w 10 -c 20
    OK: 1 banned IP(s) in 2 active jails|banned_IP=1;10;20;;
    jail openvpn blocks 1 IP(s): 76.123.218.206
    jail sshd blocks 0 IP(s):
    | openvpn=1;;;; sshd=0;;;;



**Here is what I get when I run it locally:**

    [root@localhost plugins]# ./check_fail2ban.sh -w 10 -c 20
    OK: 1 banned IP(s) in 2 active jails|banned_IP=1;10;20;;
    jail openvpn blocks 1 IP(s): 46.133.118.236
    jail sshd blocks 0 IP(s):
    | openvpn=1;;;; sshd=0;;;;



**Here is what I get when I run it locally with NRPE:**

    [root@localhost plugins]# /usr/lib64/nagios/plugins/check_nrpe -t 60 -H 127.0.0.1 -p 5666 -c check_fail2ban -a 10 20
    OK: 0 banned IP(s) in active jails|banned_IP=0;10;20;;
    |

- *I get the same result when I run it on my Nagios Server.*



**My command is defined in my nrpe.cfg:**

    command[check_fail2ban]=/usr/lib64/nagios/plugins/check_fail2ban.sh -w $ARG1$ -c $ARG2$



**I tried some "debugging" by adding the following to my nrpe.cfg file:**

    command[check_fail2ban]=whoami
    command[check_fail2ban]=env


**"Debug" output:**

    [root@localhost plugins]# /usr/lib64/nagios/plugins/check_nrpe -t 60 -H 127.0.0.1 -p 5666 -c check_fail2ban -a 10 20
    SHELL=/sbin/nologin
    NRPE_PROGRAMVERSION=2.15
    USER=nrpe
    PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin
    PWD=/
    LANG=en_US.UTF-8
    SHLVL=1
    HOME=/var/run/nrpe
    LOGNAME=nrpe
    NRPE_SSL_OPT=
    NRPE_MULTILINESUPPORT=1
    _=/usr/bin/env


**I tried additional debugging by setting debug=1 in NRPE. Here is the output when I run the command from my Nagios Server:**

    Sep 27 12:36:46 localhost nrpe: Connection from 192.168.1.200 port 61853
    Sep 27 12:36:46 localhost nrpe: Host address is in allowed_hosts
    Sep 27 12:36:46 localhost nrpe: Handling the connection...
    Sep 27 12:36:46 localhost nrpe: Host is asking for command 'check_fail2ban' to be run...
    Sep 27 12:36:46 localhost nrpe: Running command: usr/lib64/nagios/plugins/check_fail2ban.sh -w 10 -c 20
    Sep 27 12:36:46 localhost nrpe: Command completed with return code 0 and output: OK: 0 banned IP(s) in active jails|banned_IP=0;10;20;;#012|
    Sep 27 12:36:46 localhost nrpe: Return Code: 0, Output: OK: 0 banned IP(s) in active jails|banned_IP=0;10;20;;#012|
    Sep 27 12:36:46 localhost nrpe: Connection from `bYj closed.

- *I get the same thing when I run it locally from the server with check_nrpe.*


It looks like NRPE may not be capturing all of the output from the script? Please forgive me if this is something stupid that I've missed, as I am a Windows user who does very little on Linux. Any help is greatly appreciated!


----------


***EDIT TO ANSWERS***

User4556274, I think it is enabled.  Here is the output from that command: 

    [root@localhost etc]# ls -Z /usr/lib64/nagios/plugins
    -rwxr-xr-x. root root unconfined_u:object_r:usr_t:s0   check_apc
    -rwxr-xr-x. root root unconfined_u:object_r:usr_t:s0   check_asterisk_pri.php
    -rwxr-xr-x. root root system_u:object_r:nagios_checkdisk_plugin_exec_t:s0 check_disk
    -rwxr-xr-x. root root unconfined_u:object_r:lib_t:s0   check_fail2ban.old
    -rwxr-xr-x. root root unconfined_u:object_r:lib_t:s0   check_fail2ban.sh
    -rwxr-xr-x. root root system_u:object_r:nagios_system_plugin_exec_t:s0 check_load
    -rwxr-xr-x. root root unconfined_u:object_r:lib_t:s0   check_mem.pl
    -rwxr-xr-x. root root system_u:object_r:nagios_services_plugin_exec_t:s0 check_nrpe
    -rwxr-xr-x. root root unconfined_u:object_r:usr_t:s0   check_openmanage
    -rwxr-xr-x. root root unconfined_u:object_r:lib_t:s0   check_openvpn.php
    -rwxr-xr-x. root root unconfined_u:object_r:lib_t:s0   check_openvpn_user_list
    -rwxr-xr-x. root root unconfined_u:object_r:lib_t:s0   check_openvpn_user_status
    -rwxr-xr-x. root root unconfined_u:object_r:lib_t:s0   check_openvpn_user_traffic
    -rwxr-xr-x. root root unconfined_u:object_r:lib_t:s0   check_ping
    -rwxr-xr-x. root root system_u:object_r:nagios_system_plugin_exec_t:s0 check_procs
    -rwxr-xr-x. root root system_u:object_r:nagios_system_plugin_exec_t:s0 check_swap
    -rwxr-xr-x. root root unconfined_u:object_r:usr_t:s0   check_swraid.py
    -rwxr-xr-x. root root unconfined_u:object_r:usr_t:s0   check_swraid.sh
    -rwxr-xr-x. root root system_u:object_r:nagios_system_plugin_exec_t:s0 check_users
    -rwxr-xr-x. root root system_u:object_r:bin_t:s0       negate
    -rwxr-xr-x. root root system_u:object_r:bin_t:s0       urlize
    -rwxr-xr-x. root root system_u:object_r:bin_t:s0       utils.sh


                                

TB. (31 rep)

Sep 27, 2016, 05:13 PM • Last activity: Sep 27, 2016, 05:42 PM

0 votes

1 answers

1282 views

Nagios plugins are executed from server plugins or client plugins?

nagios nrpe

I just installed Nagios on SERVER (10.20.8.106) and attached a CLIENT (10.20.10.11). I defined my host and a service for check_nrpe, and it is working.

I have the check_nrpe plugin in the plugins directory (/usr/lib64/nagios/plugins/) of both SERVER and CLIENT, and I didn't know which check_nrpe was executed.

On the SERVER:

    $ /usr/lib64/nagios/plugins/check_nrpe -H 10.20.10.11
    NRPE v2.15

On the CLIENT:

    $ /usr/lib64/nagios/plugins/check_nrpe -H 10.20.8.106
    connect to address 10.41.8.106 port 5666: No route to host
    connect to host 10.41.8.106 port 5666: No route to host

The above confirmed to me that the check_nrpe plugin in the SERVER's plugin directory was executed. So why do we have the plugins directory on the CLIENT? At first I thought the SERVER executes them from the plugin directory of the CLIENT, and that the plugins on the SERVER side were used for doing checks on the same machine. I am confused at this point.

Can anybody clarify?

0aslam0 (335 rep)

Feb 5, 2016, 01:13 PM • Last activity: Feb 5, 2016, 01:45 PM

0 votes

1 answers

682 views

Nagios Plugin for reading RMON files

nagios nrpe

May I know if Nagios is able to read and display information from .rmon and .pmon files?

Samples:

**.rmon**

    Time Stamp,RX Octs,TX Octs,RX Pkts,TX Pkts,RX Drop Events,Status,RX Undersize Pkts,Status,RX Fragments,Status,RX 64Octs,TX 64Octs,RX 65 to 127Octs,TX 65 to 127Octs,RX 128 to 255Octs,TX 128 to 255Octs,RX 256 to 511Octs,TX 256 to 511Octs,RX 512 to 1023Octs,TX 512 to 1023Octs,RX 1024 to 1518Octs,TX 1024 to 1518Octs,RX CRC Alignment Errors,Status,RX Oversize Pkts,Status,TX Oversize Pkts,RX Jabbers,RX Multicast Pkts,TX Multicast Pkts,RX Broadcast Pkts,TX Broadcast Pkts,TX Collisions,Status,RX Unknown TPID,RX Unknown VID,RX MAC Limit,RX Filter Discard,RX QoS Discard,TX Queue0 Discard,TX Queue1 Discard,TX Queue2 Discard,TX Queue3 Discard,TX Queue4 Discard,TX Queue5 Discard,TX Queue6 Discard,TX Queue7 Discard,Record Status
    00:15,69586578,421339463,525456,1172251,0,NORMAL,0,NORMAL,0,NORMAL,54916,153676,306346,354260,142676,60877,15165,87807,6265,515626,88,5,0,NORMAL,0,NORMAL,0,0,0,1798,0,14,0,NORMAL,0,0,0,0,0,0,0,0,0,0,0,0,0,MAINT
    00:30,54931226,290982247,425662,873843,0,NORMAL,0,NORMAL,0,NORMAL,49302,135761,249862,279097,113025,44823,6999,71647,6340,342510,125,5,0,NORMAL,0,NORMAL,0,0,0,1782,0,16,0,NORMAL,0,0,0,0,0,0,0,0,0,0,0,0,0,MAINT

**.pmon**

    Time Stamp,RF BBE,Status,RF ES,Status,RF SES,Status,RF SEP,Status,RF UAS,Status,RF OFS,Status,RX Level (MAX) [dBm],Status,RX Level (MIN) [dBm],Status,Record Status
    00:15,0,NORMAL,0,NORMAL,0,NORMAL,0,NORMAL,0,NORMAL,0,NORMAL,-30.2,NORMAL,-31.4,NORMAL,VALID
    00:30,0,NORMAL,0,NORMAL,0,NORMAL,0,NORMAL,0,NORMAL,0,NORMAL,-30.2,NORMAL,-30.7,NORMAL,VALID
    00:45,0,NORMAL,0,NORMAL,0,NORMAL,0,NORMAL,0,NORMAL,0,NORMAL,-30.3,NORMAL,-31.2,NORMAL,VALID

**UPDATE**

I am thinking of using NRPE to run a bash script which extracts or parses information from the files, and then displaying that information on the monitoring server. Is there a way to display the information in graph format?
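
The script idea above could look roughly like the following. This is only a sketch under assumptions: the function name `check_pmon`, the thresholds, and the `rx_min`/`rx_max` labels are hypothetical, and the column positions (14 and 16) are taken from the .pmon sample header shown earlier. It reads the most recent row, compares the minimum RX level against the thresholds, and prints a Nagios-style status line with performance data after the `|`.

```bash
#!/usr/bin/env bash
# Sketch of an NRPE-style check for a .pmon file.
# Function name, thresholds, and perfdata labels are illustrative assumptions.

# check_pmon FILE WARN CRIT
# Uses the last data row of FILE. In the sample header, column 14 is
# "RX Level (MAX) [dBm]" and column 16 is "RX Level (MIN) [dBm]".
# A lower level is treated as worse.
check_pmon() {
  local file=$1 warn=$2 crit=$3
  if [ ! -r "$file" ]; then
    echo "UNKNOWN - cannot read $file"
    return 3
  fi
  awk -F, -v warn="$warn" -v crit="$crit" '
    NR > 1 && NF >= 18 { ts = $1; rxmax = $14; rxmin = $16 }  # keep the latest row
    END {
      if (ts == "") { print "UNKNOWN - no data rows found"; exit 3 }
      state = "OK"; code = 0
      if (rxmin + 0 < warn + 0) { state = "WARNING"; code = 1 }
      if (rxmin + 0 < crit + 0) { state = "CRITICAL"; code = 2 }
      # Text after "|" is performance data, which graphing add-ons can plot.
      printf "%s - RX level min %s dBm at %s | rx_min=%s rx_max=%s\n", state, rxmin, ts, rxmin, rxmax
      exit code
    }' "$file"
}

# Demo with rows copied from the .pmon sample above.
demo=$(mktemp)
cat > "$demo" <<'EOF'
Time Stamp,RF BBE,Status,RF ES,Status,RF SES,Status,RF SEP,Status,RF UAS,Status,RF OFS,Status,RX Level (MAX) [dBm],Status,RX Level (MIN) [dBm],Status,Record Status
00:30,0,NORMAL,0,NORMAL,0,NORMAL,0,NORMAL,0,NORMAL,0,NORMAL,-30.2,NORMAL,-30.7,NORMAL,VALID
00:45,0,NORMAL,0,NORMAL,0,NORMAL,0,NORMAL,0,NORMAL,0,NORMAL,-30.3,NORMAL,-31.2,NORMAL,VALID
EOF
check_pmon "$demo" -40 -45
# prints: OK - RX level min -31.2 dBm at 00:45 | rx_min=-31.2 rx_max=-30.3
rm -f "$demo"
```

To use it as a real plugin, the demo part would be replaced by `check_pmon "$@"; exit $?`, so that the exit code reaches NRPE. As for graphs, Nagios Core passes the performance data on to whatever add-on is configured for it (PNP4Nagios is one example), and that add-on could then plot rx_min and rx_max over time.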
                                

abiieez (55 rep)

Dec 5, 2015, 09:26 AM • Last activity: Dec 9, 2015, 09:40 AM
