Sudden Spike of Unprocessed Messages in Graylog Cluster

Hi,

I am currently experiencing an issue related to a high number of unprocessed messages in my Graylog cluster.

Within a very short period of time (approximately 1 minute), the number of unprocessed messages can spike to hundreds of thousands, even though all nodes are in a RUNNING state and message processing is enabled.

Environment:

  • Graylog cluster (multi-node setup)

  • All nodes status: Running, Load Balancer Alive

  • JVM heap usage appears normal (approximately 1–4 GB per node)

  • Journal messages continue to accumulate

Issue:

  • Sudden spike in unprocessed messages

  • Processing throughput is unable to keep up with incoming log volume

  • Potential delay in log visibility and alerting

Questions:

  1. What are the most common root causes for this behavior?

  2. What is the recommended approach to identify the bottleneck (e.g., CPU, disk I/O, journal, or input rate)?

  3. What tuning steps are recommended to reduce the number of unprocessed messages?

  4. Would it be more appropriate to scale horizontally (add more nodes) or optimize the existing configuration first?


Check the process and output buffer utilization at the point the build-up occurs. If only the process buffer hits 100%, the issue is likely with your pipeline rules; if both the process and output buffers are full, the issue is with writing messages to the OpenSearch cluster.
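To make that decision rule concrete, here is a minimal sketch in Python. The thresholds and numbers are illustrative assumptions; in a real cluster you would read the utilization values from the System > Nodes page or the Graylog REST API rather than hard-coding them:

```python
# Sketch: classify the bottleneck from process/output buffer utilization.
# Values are fractions between 0.0 (empty) and 1.0 (full); the 0.95
# threshold is an assumption, not a Graylog default.

def classify_bottleneck(process_util: float, output_util: float) -> str:
    """Return a rough diagnosis from buffer utilization."""
    if process_util >= 0.95 and output_util >= 0.95:
        # Both full: Graylog cannot hand messages off to OpenSearch
        # fast enough -> indexing/output side is the bottleneck.
        return "opensearch-indexing"
    if process_util >= 0.95:
        # Only the process buffer is full: extractors/pipeline rules
        # on the Graylog node are too slow.
        return "pipeline-processing"
    return "no-buffer-bottleneck"

print(classify_bottleneck(1.0, 0.2))    # pipeline-processing
print(classify_bottleneck(0.98, 0.97))  # opensearch-indexing
```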

  1. One input is sending more logs than usual. You can check on the Inputs page which input is receiving the most messages.

  2. I recommend the steps by @Wine_Merchant to find the bottleneck: if only the process buffer is full, processing on the Graylog node is slow. If the output buffer is also full, OpenSearch is the bottleneck.

  3. If you can identify the messages that cause the spike, you can decide whether they are relevant for you or not. I know of cases where a drop_message() for certain types of messages brought a big performance boost, as the message is not even written into OpenSearch.

  4. First optimize what you already have. To do so:

  • Check your parsing. Greedy Grok patterns can really burn a lot of CPU. I wrote a blog post on that (in German, but any AI translation will help): Grok Pattern und Graylog: Effizientes Log-Parsing ohne Bottlenecks - NetUSE AG
  • Check your lookups. If you do e.g. reverse DNS and the timeout is 10 s, reduce the timeout. Also check the size of your caches: if the throughput of the cache is high but the hit percentage is low, increase your cache.
  • Check your alerts: running an alert every 5 seconds that searches a heavy data stream over the data of the last week will degrade the performance of your OpenSearch.
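To illustrate the drop_message() suggestion in point 3: a minimal Graylog pipeline rule that drops a noisy message type before it is processed further or written to OpenSearch. The field names and values here are made-up examples; adapt the condition to whatever identifies your spike.

```
rule "drop noisy debug messages"
when
  has_field("application_name") AND
  to_string($message.application_name) == "chatty-app" AND
  to_string($message.level) == "DEBUG"
then
  drop_message();
end
```

Attach the rule to a stage in a pipeline connected to the affected stream; dropped messages never reach the output buffer, so this relieves both processing and indexing load.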