Hi,
We’ve had a few occasions over the last two weeks where one of the Graylog servers had over 9 million unprocessed messages in the journal. Restarting Graylog has usually resolved it.
At the time it happened yesterday, there were lots of messages like this:
2020-09-23T21:11:41.623+02:00 WARN [GelfChunkAggregator] Error while expiring GELF chunk entries
java.lang.NullPointerException: null
at java.util.concurrent.ConcurrentSkipListMap.remove(ConcurrentSkipListMap.java:1991) ~[?:1.8.0_162]
at java.util.concurrent.ConcurrentSkipListSet.remove(ConcurrentSkipListSet.java:259) ~[?:1.8.0_162]
at org.graylog2.inputs.codecs.GelfChunkAggregator.getAndCleanupEntry(GelfChunkAggregator.java:206) ~[graylog.jar:?]
at org.graylog2.inputs.codecs.GelfChunkAggregator.expireEntry(GelfChunkAggregator.java:195) ~[graylog.jar:?]
at org.graylog2.inputs.codecs.GelfChunkAggregator.access$200(GelfChunkAggregator.java:49) ~[graylog.jar:?]
at org.graylog2.inputs.codecs.GelfChunkAggregator$ChunkEvictionTask.run(GelfChunkAggregator.java:296) [graylog.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [?:1.8.0_162]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [?:1.8.0_162]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [?:1.8.0_162]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [?:1.8.0_162]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_162]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_162]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_162]
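If I read the top two frames correctly, the exception is thrown inside `ConcurrentSkipListSet.remove`, which does not accept null elements. Here is a minimal sketch of that JDK behaviour on its own. It is not Graylog's code: the class name `NullRemoveDemo` and the element `"chunk-1"` are made up for illustration, and it says nothing about why a null would reach that call in `GelfChunkAggregator`.

```java
import java.util.concurrent.ConcurrentSkipListSet;

// Hypothetical demo class; only the JDK collection is real.
class NullRemoveDemo {
    // Returns true if removing a null element throws NullPointerException.
    static boolean removeThrowsNpe() {
        ConcurrentSkipListSet<String> set = new ConcurrentSkipListSet<>();
        set.add("chunk-1"); // placeholder element, not a real GELF chunk id
        try {
            set.remove(null); // the set rejects null elements
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("remove(null) throws NPE: " + removeThrowsNpe());
    }
}
```

So the trace seems consistent with the eviction task trying to remove a null entry from the set, but that is only my guess from the stack trace.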
Could this be related to the full journal?
The Graylog version is 3.2.5, running on Debian Stretch.