Journal filling with buffers remaining empty

inventor96 · October 3, 2021, 2:53am

Description of your problem

TL;DR
The journal is filling, but the process and output buffers are empty, and the output message count has not risen above 0 since I noticed the issue.

A little background context…

I was troubleshooting an issue with our logging setup earlier today. The problem turned out to be at the source, and fixing it caused a large number of messages to flood our single (standalone) Graylog instance. The journal overfilled while the 6 process buffers and 1 output buffer were maxed out. After I addressed that, the buffers remained full, but the journal usage started going down, which I took as a good sign. I watched that for about 10 minutes; it got through about 1 million messages and just over 50% of the maxed-out journal. Then, at a seemingly random point, the process and output buffers emptied, and the journal usage and message counts started slowly climbing as new messages came in.

Description of steps you’ve taken to attempt to solve the issue

I’ve tried restarting all three services on the host (mongo, elasticsearch, graylog), restarting individual services, changing the process buffers to 4 and the output buffers to 3, and restarting the whole machine, but I get the same results every time. Nothing helpful seems to be in the log file, though I’ve only looked back as far as the last service restart.
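
For anyone retracing these steps, here is a minimal sketch for checking which buffer and journal settings are actually active before and after a change. It assumes the default package location of the config file (`/etc/graylog/server/server.conf`), which the post does not state; the helper name `show_buffer_settings` is made up for illustration.

```shell
#!/bin/sh
# Assumed default path for package installs; pass another path as $1 if yours differs.
CONF="${CONF:-/etc/graylog/server/server.conf}"

# Print the active (uncommented) buffer processor and journal settings.
show_buffer_settings() {
    grep -E '^[[:space:]]*((process|output|input)buffer_processors|message_journal_[a-z_]+)[[:space:]]*=' "${1:-$CONF}"
}
```

Calling `show_buffer_settings` with no argument reads the assumed default file. Commented-out defaults are skipped, so an empty result means Graylog is running with its built-in values.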

Environmental information

Operating system information

Rocky Linux 8.4 (was CentOS 8 when originally built)

Package versions

Graylog: 4.1.5
MongoDB: 4.2.17
Elasticsearch: 7.10.2

I haven’t included logs or configs because I have no idea what would be applicable/useful and I thought including everything would be excessive; I didn’t change anything in Graylog, ES, or MongoDB before the problem came up; and the only logs in the Graylog file seem to be related to starting up. If there’s something specific that would be helpful, please feel free to ask.

inventor96 · October 3, 2021, 3:55am

On a coworker’s recommendation, and after searching the forum for the idea, I stopped the three services, deleted all journal files (rm -rf /var/lib/graylog-server/journal/*), and then started the services again. It seems to be working now. Kind of a bummer to lose the ~4 million messages that were in the journal, but I’m assuming there wasn’t much I could do with a presumably corrupt journal. If anyone has ideas on why there were no errors about the journal, that might be helpful for the future.
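
The stop, delete, start sequence above can be sketched as a small script. The systemd unit names (`graylog-server`, `elasticsearch`, `mongod`), the stop/start ordering, and the `DRY_RUN` switch are assumptions added for illustration; only the journal path and the `rm -rf` come from the post. It defaults to a dry run that prints the commands, because the deletion discards every message still in the journal.

```shell
#!/bin/sh
# Path taken from the post; unit names below are assumed, adjust to your host.
JOURNAL_DIR="${JOURNAL_DIR:-/var/lib/graylog-server/journal}"
STOP_ORDER="${STOP_ORDER:-graylog-server elasticsearch mongod}"
START_ORDER="${START_ORDER:-mongod elasticsearch graylog-server}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print commands, 0 = execute them

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

reset_journal() {
    for svc in $STOP_ORDER; do
        run systemctl stop "$svc"
    done
    # Same deletion as in the post: everything in the journal is lost.
    run rm -rf "$JOURNAL_DIR"/*
    for svc in $START_ORDER; do
        run systemctl start "$svc"
    done
}
```

Run `reset_journal` first to review the printed commands, then `DRY_RUN=0 reset_journal` as root to execute them.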

gsmith · October 5, 2021, 1:55am

Unfortunately, I have had that happen before. The only thing I did differently afterward was increase the journal size and the volume it lives on, to give me at least a 24-hour head start on fixing problems before it happens again.
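
As a rough way to pick a journal size for a given head start, the arithmetic can be sketched as below. The message rate and average message size are hypothetical inputs you would measure yourself, and the result is only an estimate since on-disk size per message will differ somewhat. In Graylog's `server.conf` the relevant settings are `message_journal_max_size` and `message_journal_max_age`.

```shell
#!/bin/sh
# journal_size_gib MSGS_PER_SEC AVG_BYTES_PER_MSG HOURS
# Prints the estimated journal size in GiB, rounded up.
journal_size_gib() {
    bytes=$(( $1 * $2 * $3 * 3600 ))
    echo $(( (bytes + 1073741823) / 1073741824 ))
}

# Hypothetical example: 500 msg/s at about 1000 bytes each, held for 24 hours.
journal_size_gib 500 1000 24
```

The volume holding the journal needs at least that much free space, and `message_journal_max_age` has to be raised to match, otherwise older segments are dropped before the size limit is reached.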
