1. Describe your incident:
A few weeks ago I upgraded from version 5.2 to 6.x. Everything else stayed unchanged: the Graylog configuration, MongoDB, OpenSearch, and the volume of incoming messages. Immediately after the upgrade I noticed that the 3 Graylog servers process messages slightly less efficiently. They cannot keep up with the full traffic of around 50,000 msgs/sec, and the journal queues grow very large, to over 100 million messages. I don't see any related errors in the Graylog log files.
During the night the message volume is lower, and at that time the journal queues are empty because processing is fast enough.
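For a rough sense of scale, here is a small back-of-the-envelope sketch in Python. It uses only the two figures from this post (about 50,000 msgs/sec incoming and a backlog of about 100 million journal entries), treated as cluster-wide totals. The 8-hour build-up window is an assumption for illustration, not something I measured.

```python
def backlog_as_seconds_of_traffic(backlog_msgs: int, incoming_rate: float) -> float:
    """How many seconds of incoming traffic the journal backlog represents."""
    return backlog_msgs / incoming_rate


def average_deficit(backlog_msgs: int, build_up_hours: float) -> float:
    """Average shortfall in msgs/sec needed to accumulate the backlog in that time."""
    return backlog_msgs / (build_up_hours * 3600)


incoming = 50_000        # msgs/sec, from the post
backlog = 100_000_000    # journal entries, from the post
hours = 8                # ASSUMPTION: the backlog builds up over an 8-hour day

seconds = backlog_as_seconds_of_traffic(backlog, incoming)
deficit = average_deficit(backlog, hours)

print(f"backlog = {seconds:.0f} s of traffic")
print(f"average deficit = {deficit:.0f} msgs/sec ({deficit / incoming:.1%} of incoming)")
```

Under that assumption the backlog equals 2000 seconds (about 33 minutes) of full traffic, and an average shortfall of roughly 7% of the incoming rate would be enough to produce it. That matches the impression that processing is only slightly slower than before.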
2. Describe your environment:
- OS Information:
Oracle Linux 8.7
- Package Version:
Before upgrade Graylog 5.2.3, after upgrade 6.0.2
- Service logs, configurations, and environment variables:
1 load balancer in front of 3 powerful Graylog servers, each with over 50 CPUs
6 OpenSearch nodes
3. What steps have you already taken to try and solve the problem?
I experimented a bit with the buffer settings, as I saw this was changed in the documentation for 6.0. But as I understand it, if a user sets their own values for processbuffer_processors and outputbuffer_processors, those values should still take effect and work as before?
Automatically choose default number of process-buffer and output-buffer processors based on available CPU cores. graylog2-server#17450 graylog2-server#17737
Our previous settings were large, but they were chosen because they gave the best results with previous versions:
processbuffer_processors = 26
outputbuffer_processors = 16
I experimented with these values, lowering and raising them, but it seems it didn't matter much. I monitored the process buffer metrics in the GUI, such as [org.graylog2.shared.buffers.processors.ProcessBufferProcessor.incomingMessages], to see whether the changed buffer settings produce better throughput.
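To compare settings more objectively than by watching the GUI, one option is to sample the metric's cumulative count at the start and end of a fixed window and compare the resulting rates. The Python sketch below shows only that calculation. It assumes the metric exposes a cumulative message count, and the sample numbers and the "setting A" / "setting B" labels are made up for illustration. They are not measurements from my cluster.

```python
from typing import Dict, List, Tuple

Sample = Tuple[float, int]  # (time in seconds, cumulative message count)


def average_rate(samples: List[Sample]) -> float:
    """Average msgs/sec between the first and last sample of a cumulative counter."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    (t0, c0), (t1, c1) = samples[0], samples[-1]
    if t1 <= t0:
        raise ValueError("samples must be in increasing time order")
    return (c1 - c0) / (t1 - t0)


def compare_trials(trials: Dict[str, List[Sample]]) -> List[Tuple[str, float]]:
    """Return (label, rate) pairs sorted from highest to lowest throughput."""
    rates = {label: average_rate(samples) for label, samples in trials.items()}
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


# Made-up sample data for illustration only (10-minute windows).
trials = {
    "setting A": [(0.0, 0), (600.0, 9_600_000)],
    "setting B": [(0.0, 0), (600.0, 9_900_000)],
}

for label, rate in compare_trials(trials):
    print(f"{label}: {rate:.0f} msgs/sec")
```

Using the same window length and the same time of day for each trial keeps the comparison fair, since the incoming volume differs a lot between day and night.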
The workaround for now is the following: once the journal queue on one Graylog node gets too large, we instruct the LB to send messages to the other 2 nodes and give that node time to recover.
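The decision behind this workaround can be sketched as a small rule with two thresholds, so that a node is not switched in and out of the load balancer repeatedly. This is only an illustration of the logic in Python. The threshold values and the function name are hypothetical, and how the journal size is read or how the LB is instructed is left out.

```python
def next_state(in_rotation: bool, journal_entries: int,
               high: int = 100_000_000, low: int = 1_000_000) -> bool:
    """Return whether the node should receive traffic after this check.

    high and low are hypothetical thresholds. Using two thresholds
    (hysteresis) avoids flapping the node in and out of the load balancer.
    """
    if in_rotation and journal_entries >= high:
        return False    # journal too large: stop sending traffic to this node
    if not in_rotation and journal_entries <= low:
        return True     # journal has drained: send traffic again
    return in_rotation  # otherwise keep the current state


state = True
for entries in [20_000_000, 120_000_000, 60_000_000, 500_000]:
    state = next_state(state, entries)
    print(entries, "in rotation" if state else "draining")
```

With these example values the node stays in rotation at 20 million entries, is drained at 120 million, remains drained at 60 million, and returns to rotation once the journal falls to 500,000.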
4. How can the community help?
I would really prefer to stay on the 6.x version. Am I right to suspect the buffers, and are the buffer metrics the right thing to check?