NRPE service checks report socket timeouts when using GroundWork Monitor and/or Nagios


You may see the error "Service Check Timed Out" in the Service Status Details section of Nagios for all of your NRPE service checks.

When you run check_nrpe manually from a command line, or test it from within GroundWork Monitor, for example:

/usr/local/groundwork/nagios/libexec/check_nrpe -t 60 -H "NRPEHOST" -c get_cpu -a "SERVER" "_Total" "80,90"

You receive the response:

CHECK_NRPE: Socket timeout after 60 seconds

Increasing the socket timeout threshold makes no difference. Neither does stopping and restarting the NRPE service on the Windows server being monitored.
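One way to tell whether the timeout comes from the daemon connection itself rather than from the get_cpu command is to call check_nrpe with no -c option, which only asks the daemon for its version string. The following is a minimal sketch of that test. The function name nrpe_daemon_answers and the 10 second timeout are my own choices for illustration, not anything from GroundWork Monitor or Nagios.

```shell
# Sketch: call check_nrpe with no -c option, which asks the daemon only for
# its version string instead of running a plugin command.
# Usage: nrpe_daemon_answers <path-to-check_nrpe> <host>
# (hypothetical helper name; the 10 second timeout is an arbitrary choice)
nrpe_daemon_answers() {
    plugin=$1
    host=$2
    # Capture stdout and stderr so the plugin's message can be shown either way.
    if output=$("$plugin" -H "$host" -t 10 2>&1); then
        echo "daemon answered: $output"
    else
        echo "daemon did not answer: $output"
        return 1
    fi
}
```

For example, nrpe_daemon_answers /usr/local/groundwork/nagios/libexec/check_nrpe "NRPEHOST". If even this bare call times out, the problem is probably not in the command arguments.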

My situation was a little unusual. After some swapping around of GroundWork Monitor servers, the allowed_hosts entry in the nrpe.cfg file on my NRPE host was set to the old IP address of a previous GroundWork Monitor server that was no longer online.

I changed it to the IP address of the current GroundWork Monitor server and NRPE started working again. The relevant part of nrpe.cfg is:

# ALLOWED HOST ADDRESSES
# This is a comma-delimited list of IP address of hosts that are allowed
# to talk to the NRPE daemon.
#
# NOTE: The daemon only does rudimentary checking of the client's IP
# address.
allowed_hosts=xxx.xxx.xxx.xxx
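To check for a stale entry like this quickly, here is a rough sketch that tests whether a given IP address appears in the allowed_hosts line. It assumes a copy of nrpe.cfg is readable from a machine with a POSIX shell (for example, copied over from the Windows host). The function name ip_in_allowed_hosts is hypothetical, and the match is an exact string comparison, so it does not understand hostnames or address ranges.

```shell
# Sketch: report whether an IP address appears in the allowed_hosts
# line of an nrpe.cfg-style file (exact string match only).
# Usage: ip_in_allowed_hosts <config-file> <ip>
# (hypothetical helper name, not part of NRPE)
ip_in_allowed_hosts() {
    cfg=$1
    ip=$2
    # Take the last uncommented allowed_hosts line, drop the key,
    # strip whitespace (including any Windows carriage return),
    # split the comma-delimited list into one entry per line,
    # then look for a whole-line fixed-string match.
    grep -E '^[[:space:]]*allowed_hosts[[:space:]]*=' "$cfg" \
        | tail -n 1 \
        | sed -e 's/^[^=]*=//' \
        | tr -d '[:space:]' \
        | tr ',' '\n' \
        | grep -Fxq -- "$ip"
}
```

Usage, with placeholder values: ip_in_allowed_hosts /path/to/nrpe.cfg 192.0.2.10 && echo "listed" || echo "not listed". Pass the current address of your GroundWork Monitor or Nagios server as the second argument.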
