Graylog keeps crashing with millions of unprocessed messages

Every few weeks my server crashes. I keep deleting all the logs I find in /var/log/graylog and Elasticsearch, restarting, and then waiting for days for the server to process the millions of messages. Is there a way to automate this process so that when I go to my dashboard I don't see:
While retrieving data for this widget, the following error(s) occurred: Connection refused.

In Nodes:
The journal contains 3,091,373 unprocessed messages in 32 segments. 71 messages appended, 370 messages read in the last second.

And in Overview:

Journal utilization is too high (triggered 2 months ago)

Uncommited messages deleted from journal (triggered 3 months ago)

Nodes with too long GC pauses (triggered 6 months ago)

I normally have to delete all the log files and restart the server.

Hey @jrecho

I'm not sure why deleting the log files and your journal filling up would be connected; the journal is separate from the logs under /var/log.

This sounds like an issue with processing power for the volume of logs you are currently ingesting. It can be addressed by adding more CPUs/RAM/heap to the Graylog and OpenSearch nodes. You could also increase the number of Graylog and OpenSearch nodes in the cluster and introduce a load balancer in front of your Graylog nodes.
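If it helps, the heap for both services is set in their startup configuration. This is only a sketch: the paths assume a package install with default locations, and the 4g/8g values are placeholders, not recommendations for your hardware.

```
# Graylog server heap: on DEB installs this lives in /etc/default/graylog-server,
# on RPM installs in /etc/sysconfig/graylog-server. Only change the -Xms/-Xmx
# values; keep the other JVM flags your install already ships with.
GRAYLOG_SERVER_JAVA_OPTS="-Xms4g -Xmx4g"

# OpenSearch heap: typically /etc/opensearch/jvm.options on package installs.
# Keep -Xms and -Xmx equal and leave roughly half of the node's RAM free for
# the OS page cache that OpenSearch relies on.
-Xms8g
-Xmx8g
```

Restart graylog-server and opensearch after changing either file so the new heap sizes take effect.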

I would clear the journal, restart the system, and then observe under System / Nodes which buffer (input, process, or output) fills up first. That will give you some idea of where the resource bottleneck lies.
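In case it is useful, clearing the journal on a single-node package install usually looks like the sketch below. The journal path shown is the default message_journal_dir; check your own server.conf before deleting anything.

```
# Stop Graylog so it releases the journal, wipe it, then start again.
sudo systemctl stop graylog-server

# Default journal location on package installs; confirm message_journal_dir in
# /etc/graylog/server/server.conf before removing it.
sudo rm -rf /var/lib/graylog-server/journal/*

sudo systemctl start graylog-server
```

Everything still sitting in the journal is dropped, so treat this as a reset while you diagnose the bottleneck, not as a fix.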

How much heap is currently assigned to the Graylog and OpenSearch nodes?
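If you are not sure, a quick way to check (assuming default package paths and that OpenSearch answers on localhost:9200 without TLS/auth) is:

```
# Graylog heap: look at the -Xms/-Xmx values in the service defaults
grep GRAYLOG_SERVER_JAVA_OPTS /etc/default/graylog-server

# OpenSearch heap per node via the _cat API
curl -s "http://localhost:9200/_cat/nodes?v&h=name,heap.max,heap.percent"
```

The Graylog UI also shows each Graylog node's JVM heap under System / Nodes.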
