1. Describe your incident:
The Graylog container has pegged two CPU cores of the server since yesterday. The MongoDB and OpenSearch containers are showing normal CPU usage.
2. Describe your environment:
OS Information: Debian 12
Package Version: 6.0.2 Docker
Service logs, configurations, and environment variables:
Single-server instance with everything running in Docker
3. What steps have you already taken to try and solve the problem?
Checked logs.
4. How can the community help?
How can I tell why Graylog has suddenly started using so much CPU?
Because Graylog had the two cores pinned, the VM backup failed and caused it to be shut down. I’ve restarted the VM and things appear to be back to normal.
I’m not sure if there was just a runaway task or something else. I wasn’t seeing a ton of activity on OpenSearch so I don’t think it was organizing the indices.
Each one of those settings creates a thread on the CPU. Depending on the amount of load you have coming in, you may want to increase your CPU to 4 cores, or adjust those settings to match the number of CPU cores you have.
So when it spikes, it's because OpenSearch/Graylog is using the CPU. The settings I showed above create the threads used for processing. The one that is the heavy hitter is:
processbuffer_processors = 5
If you add up those settings, they create 10 threads, and normally you want one CPU core per thread, so that would mean 10 cores. Under normal operation you can get away with those settings, but once there is load they will start using those resources, hence the CPU spike.
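For context, those three settings and their stock defaults look like this in the Graylog server config (the values below are the defaults, so double-check what your own config or environment variables actually say):
processbuffer_processors = 5
outputbuffer_processors = 3
inputbuffer_processors = 2
# 5 + 3 + 2 = 10 processing threads, i.e. roughly 10 cores to be comfortable
On the Docker image I believe you can override these with environment variables like GRAYLOG_PROCESSBUFFER_PROCESSORS instead of editing the file.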
What do you define as a spike? There were no indications of any sort of spike, hence my confusion. The OpenSearch container wasn't using much CPU and the indices volume was showing its usual level of activity. The logs didn't show any sort of index operation being performed, nor was there an increase in the amount of logs or a change in processing pipelines. The node buffers weren't filling up, etc.
Except for the two pinned cores, there were no indications of anything out of the ordinary. That's why I think it was some sort of runaway process, but I don't know how to determine that for sure. It hasn't happened before or since.
I am using GROK patterns in my pipelines, but as I mentioned, none of those had changed recently. I've since added more patterns and pipelines and my CPU usage continues to be low.
OK, so nothing has changed with your GROK stuff and suddenly your two CPUs get pinned. The only variable I see that could have changed would be what was sent to your Graylog server, log-wise. Maybe some random log was sent to Graylog and got stuck in a pipeline, which pinned your CPUs. The reason I bring this up is that we had issues with GROK, so we reconfigured for regex instead and haven't had an issue since. Just an idea.
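Rough sketch of what I mean by swapping a pipeline rule from grok() to regex() (the rule name, pattern, and field names here are made up, adjust them to your own logs):
rule "extract source ip with regex"
when
  has_field("message")
then
  // regex() returns the capture groups in a map keyed "0", "1", ...
  let m = regex("from (\\d+\\.\\d+\\.\\d+\\.\\d+)", to_string($message.message));
  set_field("src_ip", m["0"]);
end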
Understood. Since it hasn't happened again, I'm not too concerned about what caused it. What I mainly want to know is how to diagnose it if it happens again. Figuring out what is causing load on Graylog is much less straightforward than I'd like.
What do you recommend for standard troubleshooting steps?
Checking the OpenSearch and/or Graylog logs is the only thing I can think of right now. Keep an eye on htop/top to find which service/app is consuming your CPU. A metrics server like Zabbix, or Grafana, does help. It's hard to find this type of issue. I would also look at your log shipper's logs to see if there was any clue.
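Something like this is roughly what I'd run the next time it happens (the ports and credentials below are just placeholders for a typical single-node Docker setup, adjust them to yours):
# confirm which container is actually eating the CPU
docker stats --no-stream
# hot threads on the host; container threads show up here too
top -H
# thread dump from the Graylog node, same data as the "Get thread dump"
# action on the System > Nodes page if I remember right
curl -s -u admin:yourpassword -H 'X-Requested-By: cli' http://localhost:9000/api/system/threaddump
# journal and buffer state, to see if messages are piling up
curl -s -u admin:yourpassword -H 'X-Requested-By: cli' http://localhost:9000/api/system/journal
# and on the OpenSearch side, what its threads are busy with
curl -s http://localhost:9200/_nodes/hot_threads
If the same processing thread keeps showing up stuck in the same grok/regex call across a couple of thread dumps, that's usually the smoking gun for a runaway pattern.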
That's what I did the first time, but unfortunately nothing stood out. I know it was Java consuming all the CPU in the Graylog container, but that was it.