Process and output buffer is 100% utilized

sizerus · July 9, 2018, 11:07am

Hi.
Our production environment has 2 VMs (4 CPUs and 12 GB RAM each).
Each node runs Graylog, Elasticsearch and MongoDB, all in Docker.
All settings are at their defaults except:
ES_JAVA_OPTS: -Xms4g -Xmx4g (for Elasticsearch)
GRAYLOG_OUTPUT_BATCH_SIZE: 1000 and GRAYLOG_OUTPUTBUFFER_PROCESSORS: 4 (for Graylog).
NTP is working correctly on all nodes.
Elasticsearch cluster health is green.
No pipelines or extractors are configured.
The problem is that Graylog accumulates too many uncommitted messages.

At peak I have about 5000 messages per second.
What else can I do besides increasing resources?

P.S. Sorry for my English, I'm using Google Translate.

jan · July 9, 2018, 11:13am

Increase the resources for Elasticsearch and separate Elasticsearch from Graylog.

tjb6 · July 10, 2018, 4:21am

We have a similar problem, but on a physical Dell server with 96 GB memory and 24 processors.
I've been all over the tuning guidelines to do the best I can with it, but have found that it works normally for a few days, then the message queue starts to back up, and eventually we lose messages.

Elasticsearch memory is not swappable and is set to 24g. The system is not paging, but a major warning sign is that CPU usage spikes for both Elasticsearch and Graylog.

One of my database team suggested restarting the processes to clean up the state of the JVM, and it seems to work. I think the garbage collector is starting to thrash.
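One way to test the garbage-collection suspicion before restarting is to look at the JVM section of the Elasticsearch node stats API (`GET /_nodes/stats/jvm`). The sketch below is only an illustration: the base URL, the 85% heap threshold and the helper names are assumptions, not anything taken from this setup.

```python
import json
import urllib.request


def fetch_jvm_stats(base_url="http://localhost:9200"):
    """Fetch JVM stats for all nodes. The base URL is an assumed default."""
    with urllib.request.urlopen(f"{base_url}/_nodes/stats/jvm") as resp:
        return json.load(resp)


def summarize_gc(stats):
    """Pull heap usage and old-generation GC counters out of a stats response."""
    rows = []
    for node in stats.get("nodes", {}).values():
        jvm = node["jvm"]
        old = jvm["gc"]["collectors"].get("old", {})
        rows.append({
            "name": node.get("name"),
            "heap_used_percent": jvm["mem"]["heap_used_percent"],
            "old_gc_count": old.get("collection_count"),
            "old_gc_millis": old.get("collection_time_in_millis"),
        })
    return rows


def nodes_over_heap_threshold(stats, threshold=85):
    """Names of nodes whose heap usage is at or above the (arbitrary) threshold."""
    return [
        row["name"]
        for row in summarize_gc(stats)
        if row["heap_used_percent"] >= threshold
    ]


# Usage against a live cluster (not executed here):
#   stats = fetch_jvm_stats()
#   print(summarize_gc(stats))
#   print(nodes_over_heap_threshold(stats))
```

If the old-generation count and time keep climbing between two samples while heap usage stays high, that would be consistent with the thrashing described above, though it does not prove it.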

Our system is processing on average 80000 messages per minute, and storing around 50 GB of data per day.
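As a rough sanity check, those averages can be converted into a per-second rate and an approximate stored size per message. This is only a back-of-the-envelope sketch: it assumes decimal gigabytes and that the daily storage figure covers exactly the messages counted in the rate, neither of which is stated here.

```python
# Back-of-the-envelope conversion of the averages quoted above.
# Assumptions (not from the post): 1 GB = 10**9 bytes, and the daily
# storage figure covers exactly the messages counted in the rate.

MESSAGES_PER_MINUTE = 80_000
STORED_GB_PER_DAY = 50


def per_second(messages_per_minute):
    """Average messages per second."""
    return messages_per_minute / 60


def messages_per_day(messages_per_minute):
    """Total messages in 24 hours at a constant average rate."""
    return messages_per_minute * 60 * 24


def avg_bytes_per_message(gb_per_day, messages_per_minute):
    """Approximate stored bytes per message."""
    return gb_per_day * 10**9 / messages_per_day(messages_per_minute)


if __name__ == "__main__":
    print(f"{per_second(MESSAGES_PER_MINUTE):.0f} messages/second on average")
    print(f"{messages_per_day(MESSAGES_PER_MINUTE):,} messages/day")
    size = avg_bytes_per_message(STORED_GB_PER_DAY, MESSAGES_PER_MINUTE)
    print(f"{size:.0f} bytes stored per message")
```

Under those assumptions the average works out to roughly 1300 messages per second and a little over 400 bytes stored per message.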

For forward planning, I’m assuming a second graylog node would be more helpful than separating graylog from ES, is that correct?

jochen · July 10, 2018, 6:55am

No, running Graylog and Elasticsearch on separate machines is more important. Otherwise they’ll compete for the same resources (CPU, memory, I/O bandwidth and disk cache) which leads to cache thrashing.

sizerus · July 12, 2018, 7:15am

After increasing resources on each VM (4 CPUs -> 6 CPUs, 14 GB RAM -> 16 GB RAM) and giving Elasticsearch more memory (ES_JAVA_OPTS: -Xms4g -Xmx4g -> -Xms6g -Xmx6g), everything is working fine.
As a reference for other posters: a cluster of 2 VMs (Elasticsearch + Graylog + MongoDB on each) processes about 100-120 GB of logs per day and 3000-5000 messages per second during peak hours.

Addition: I also set refresh_interval for Elasticsearch from Graylog, as well as index.refresh_interval.
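For anyone wanting to try the same thing directly against Elasticsearch, `index.refresh_interval` can be changed through the index settings API (`PUT /<index>/_settings`). The sketch below only illustrates the shape of that request. The base URL, the `graylog_*` index pattern and the `30s` value are assumptions, not the values used in this thread (which are not given), and the helper name is made up for the example.

```python
import json
import urllib.request


def build_refresh_interval_request(base_url, index_pattern, interval):
    """Build (but do not send) a PUT request that sets index.refresh_interval.

    All three arguments are illustrative; pick values that match your cluster.
    """
    body = json.dumps({"index": {"refresh_interval": interval}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{index_pattern}/_settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Usage against a live cluster (not executed here):
#   req = build_refresh_interval_request("http://localhost:9200", "graylog_*", "30s")
#   with urllib.request.urlopen(req) as resp:
#       print(resp.read().decode("utf-8"))
```

A longer interval than the Elasticsearch default of 1s means fewer refreshes during heavy indexing, at the cost of new messages taking longer to become searchable. A settings call like this only changes indices that already exist, so indices created later will not pick it up from this request.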

system · July 26, 2018, 7:15am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
