I'm trying to set up a Heartbeat client on a Linux machine (CentOS Linux release 7.3.1611) which sends ICMP echo requests to roughly 1300 hosts. In the future this number will be higher. The messages generated by Heartbeat are shipped to a Logstash instance on another server.
These requests go out every 3 minutes, all at once. After the first round of requests, a large number of these messages contain an error:
"write ip4 0.0.0.0->x.x.x.x: sendto: no buffer space available"
I've tried multiple ways of mitigating this issue, including staggering the requests so that fewer are sent out per second. This caused issues with data synchronisation at the end of the pipeline (and did not completely resolve the issue), so it's not really a feasible option for me.
Researching online led me to increase the memory the system allocates to TCP/IP connections; however, these are ICMP requests, which as far as I'm aware should be separate.
I'm also unable to bounce the network interface as the machine is remote and I don't have a way of getting back in to bring it back up.
I also tested out increasing (and setting to 0) the icmp_ratelimit variable in /proc/sys/net/ipv4 but this hasn't helped either.
My question is pretty general: what could be causing this issue? Is there some sort of variable I need to tweak on the system that will allow these requests to all be sent out at once? I can't really deduce which buffer the error message is referring to.
Any help would be greatly appreciated...
P.S. if further clarification is needed I'd be happy to provide it.
**EDIT**
----------
I'm still unsure whether there is a fault elsewhere, but increasing the size of the socket buffer determined by the wmem_max, wmem_default, rmem_max and rmem_default variables in /proc/sys/net/core has fixed the issue. It's likely that the total size of the data for all the ICMP echo requests was too large to fit under the previous maximum of 208 KB, so a large number of them were dropped. That doesn't really explain why the number of dropped requests would vary each time, so maybe there is an underlying issue...
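For anyone wanting to check the same values, here is a minimal Python sketch that reads the four socket buffer limits under /proc/sys/net/core and prints them in bytes and KiB. The function name `read_core_buffer_limits` is my own and purely illustrative; it only reads the files, it does not change anything, and it reports `None` for any file it cannot read.

```python
from pathlib import Path

# Directory holding the core socket buffer limits on Linux.
CORE_DIR = Path("/proc/sys/net/core")

# The four variables mentioned above (values are in bytes).
NAMES = ("wmem_max", "wmem_default", "rmem_max", "rmem_default")


def read_core_buffer_limits(base=CORE_DIR):
    """Return {name: bytes or None} for each socket buffer limit.

    None means the file was missing or unreadable (for example on a
    non-Linux system or inside a restricted container).
    """
    limits = {}
    for name in NAMES:
        try:
            limits[name] = int((base / name).read_text().strip())
        except (OSError, ValueError):
            limits[name] = None
    return limits


if __name__ == "__main__":
    for name, value in read_core_buffer_limits().items():
        if value is None:
            print(f"{name}: unavailable")
        else:
            print(f"{name}: {value} bytes ({value / 1024:.0f} KiB)")
```

Running this before and after changing the values is a quick way to confirm that the new limits are actually in effect.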
Now my only problem is that each time I reboot the system, these variables reset to 208 KB each. How do I make these changes persistent?
Asked by P.Ackland
(41 rep)
Aug 15, 2017, 04:17 AM
Last activity: Aug 15, 2017, 12:31 PM