UDP packets are dropped during garbage collection

kishorenc · November 20, 2018, 5:57am

Whenever the Graylog server runs a garbage collection, we see UDP RcvbufErrors (visible in the output of netstat -suna). This graph shows how the RcvbufErrors spike during a GC cycle:

[Image: graph of RcvbufErrors spiking during GC cycles]
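For anyone who wants to track the same counter without eyeballing netstat output, here is a minimal sketch that reads it from /proc/net/snmp, which is where Linux exposes the UDP statistics. The function names and the sample text are my own illustration (the sample numbers are made up), and the sketch assumes a Linux host.

```python
from pathlib import Path


def parse_udp_counters(snmp_text: str) -> dict[str, int]:
    """Return the UDP counters found in /proc/net/snmp-formatted text."""
    # The file holds a header line and a value line, both prefixed "Udp:".
    # "UdpLite:" lines do not match this prefix and are skipped.
    udp_lines = [line.split() for line in snmp_text.splitlines()
                 if line.startswith("Udp:")]
    if len(udp_lines) < 2:
        raise ValueError("no Udp header/value pair found")
    header, values = udp_lines[0], udp_lines[1]
    # Drop the leading "Udp:" token on both lines before pairing them up.
    return dict(zip(header[1:], (int(v) for v in values[1:])))


def read_rcvbuf_errors(path: str = "/proc/net/snmp") -> int:
    """Read the current RcvbufErrors counter (Linux only)."""
    return parse_udp_counters(Path(path).read_text())["RcvbufErrors"]


# Illustrative sample only; these numbers are invented.
SAMPLE = (
    "Udp: InDatagrams NoPorts InErrors OutDatagrams RcvbufErrors SndbufErrors\n"
    "Udp: 1000 2 15 900 15 0\n"
)

if __name__ == "__main__":
    print(parse_udp_counters(SAMPLE)["RcvbufErrors"])
```

Sampling read_rcvbuf_errors() every few seconds and logging the difference next to the GC log timestamps should make the correlation easy to confirm.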

I tried increasing net.core.rmem_max and net.core.netdev_max_backlog, but that did not help. Is there anything we can do to ensure that GC pauses do not lead to UDP packets being dropped?
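One thing worth checking here: on Linux, net.core.rmem_max is only the ceiling for what an application may request with SO_RCVBUF. Raising it does not enlarge the buffer of a socket that never asks for more. The sketch below is not Graylog code; open_udp_socket is a hypothetical helper that shows how to compare the requested size with what the kernel actually granted.

```python
import socket


def open_udp_socket(requested_rcvbuf: int) -> tuple[socket.socket, int]:
    """Create a UDP socket, request a receive buffer size, return the effective size."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    # The kernel silently caps this request at net.core.rmem_max.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, requested_rcvbuf)
    # On Linux the value read back is double the granted size, because the
    # kernel reserves room for its own bookkeeping overhead.
    effective = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return sock, effective


if __name__ == "__main__":
    requested = 4 * 1024 * 1024
    sock, effective = open_udp_socket(requested)
    print(f"requested {requested} bytes, kernel reports {effective}")
    sock.close()
```

If the reported value stays far below the request, the sysctl limit is still in the way. If the application never makes the request at all, the socket stays at net.core.rmem_default, so the receive buffer setting of the input itself is the thing to look at.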

jan · November 20, 2018, 12:28pm

I guess it would help to reduce the garbage collection time. Did you have a reason for this hudge HEAP for Graylog?

kishorenc · November 20, 2018, 12:35pm

@jan I don’t follow what you mean by “hudge HEAP”. These are the JVM options used:

-Djava.net.preferIPv4Stack=true -Xms7519m -Xmx7519m -XX:NewRatio=1 -server -XX:+ResizeTLAB -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:+CMSClassUnloadingEnabled -XX:+UseParNewGC -XX:-OmitStackTraceInFastThrow

How can I reduce the GC time?

jan · November 20, 2018, 1:50pm

You have assigned about 7 GB of RAM to the Graylog JVM heap. Did you have a reason for that?

kishorenc · November 20, 2018, 2:51pm

Generally, for JVM applications, we tend to assign half of the total available RAM. For example, this is a 16 GB box, so we assigned roughly half of it. We’re ingesting 1,000 to 1,500 messages per second on the Graylog boxes.

Would a smaller heap result in more frequent GC, and hence shorter pauses? What is the recommended size?

jan · November 20, 2018, 3:19pm

If you have no special need for a big heap for Graylog, I would not use more than 2 GB …

kishorenc · November 21, 2018, 5:43am

@jan I tried decreasing the heap size to 2 GB. That reduced the number of RcvbufErrors but did not eliminate them. I still see the errors when garbage collection happens.

I’m aware of the limitations of UDP, but is this an expected behavior that GC pauses can cause packet loss?

jan · November 21, 2018, 8:36am

“I’m aware of the limitations of UDP, but is this an expected behavior that GC pauses can cause packet loss?”

As this is UDP, yes that can happen.
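To put rough numbers on that: while the JVM is paused nobody reads the socket, so the receive buffer has to hold everything that arrives during the pause. Here is a back-of-the-envelope sketch. The 1,500 messages per second comes from this thread; the 500-byte average message size and the 256 KiB buffer are assumptions for illustration only, and the kernel also charges per-packet overhead against the buffer, so the real capacity is lower than this estimate.

```python
def max_pause_absorbed(buffer_bytes: int, msgs_per_sec: float, avg_msg_bytes: int) -> float:
    """Seconds of stalled reading a buffer can absorb before it overflows."""
    return buffer_bytes / (msgs_per_sec * avg_msg_bytes)


def buffer_needed(pause_sec: float, msgs_per_sec: float, avg_msg_bytes: int) -> float:
    """Bytes of buffer needed to ride out a pause of the given length."""
    return pause_sec * msgs_per_sec * avg_msg_bytes


if __name__ == "__main__":
    rate = 1500          # messages per second, upper figure from this thread
    avg_size = 500       # bytes per message, ASSUMED for illustration
    buf = 256 * 1024     # receive buffer in bytes, ASSUMED for illustration

    print(f"{max_pause_absorbed(buf, rate, avg_size):.2f} s absorbed")
    print(f"{buffer_needed(2.0, rate, avg_size):.0f} bytes for a 2 s pause")
```

Under these assumptions a 256 KiB buffer covers only about a third of a second, so any pause longer than that drops packets no matter how large the sysctl limits are. Plugging in your own message size and the pause lengths from the GC log tells you how large the socket buffer would need to be.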

macko003 · November 30, 2018, 12:49pm

@kishorenc
Do you see the same UDP drops at night, when log traffic is lower?
What is your average traffic rate during the day and at night?

system · December 14, 2018, 12:49pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
