Journal utilization 99% has gone over 95%

I am having this problem regularly. When it happens, message processing stops.

My rotation strategy rotates indices by size (I also tried time-based rotation) and keeps 2 indices, but I have noticed that when Graylog needs to rotate to a third index and delete the oldest one, it gets stuck and message processing stops.

How can I improve the performance of this?

I have 32 GB of RAM: 16 GB for Elasticsearch and 8 GB for Graylog's JVM.

Last messages in Graylog's "server.log":

> 2020-04-27T16:01:51.508-03:00 WARN [KafkaJournal] Journal utilization (99.0%) has gone over 95%.
> 2020-04-27T16:02:51.459-03:00 WARN [KafkaJournal] Journal utilization (99.0%) has gone over 95%.
> 2020-04-27T16:03:21.944-03:00 INFO [AbstractIndexCountBasedRetentionStrategy] Number of indices (3) higher than limit (2). Running retention for 1 indices.
> 2020-04-27T16:03:21.947-03:00 INFO [AbstractIndexCountBasedRetentionStrategy] Running retention strategy [org.graylog2.indexer.retention.strategies.DeletionRetentionStrategy] for index <microsoft-ad_3309>
> 2020-04-27T16:03:23.730-03:00 INFO [DeletionRetentionStrategy] Finished index retention strategy [delete] for index <microsoft-ad_3309> in 1782ms.
> 2020-04-27T16:03:31.940-03:00 INFO [AbstractRotationStrategy] Deflector index <Forward: Exchange> (index set <exchange_1336>) should be rotated, Pointing deflector to new index now!
> 2020-04-27T16:03:31.941-03:00 INFO [MongoIndexSet] Cycling from <exchange_1336> to <exchange_1337>.
> 2020-04-27T16:03:31.941-03:00 INFO [MongoIndexSet] Creating target index <exchange_1337>.

@sadman: You need to check the message journal directory configuration (i.e. message_journal_dir = " ") in Graylog's "server.conf". This directory is used to store the message journal; it must be used exclusively by Graylog and must not contain any files other than the ones Graylog creates itself.
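For example, the relevant lines in "server.conf" look something like this (the path is only an illustration; package installs usually default to /var/lib/graylog-server/journal, but check what your file actually says):

message_journal_enabled = true
# must be a dedicated directory that only Graylog writes to
message_journal_dir = /var/lib/graylog-server/journal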

If you need more help, please share the message journal directory path and its disk utilization.

I hope this helps you!!!

The journal has a specific folder on the system; check "server.conf".
It's not recommended to change the folder location after the first install, but you can change the limits for the journal.
Check the docs:
First here: http://docs.graylog.org/en/2.4/pages/faq.html#what-does-journal-utilization-is-too-high-mean
Then the configuration file reference:
http://docs.graylog.org/en/2.4/pages/configuration/server.conf.html#output-batch-size
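For reference, the journal limits live in "server.conf" as well; if I remember right the shipped defaults look roughly like this (adjust to your disk space rather than copying them blindly):

message_journal_max_age = 12h
message_journal_max_size = 5gb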

Thank you very much for the contributions. I will check and post updates here.

We have Graylog with the following configurations:

ring_size = 65536 (# of messages in each buffer)
inputbuffer_ring_size = 65536

processbuffer_processors = 5
outputbuffer_processors = 3

output_batch_size = 1000
journal_age = 15 min
journal_size = 15 GB

GRAYLOG RAM = 8 GB
ELASTICSEARCH RAM = 16 GB

Only 1 node, only 1 machine.

We are receiving about 25,000 messages per second but outputting only 5,000 messages per second.
The journal is filling up quickly and we are getting the error "Journal utilization is too high", so can anyone please help me calculate how much to increase outputbuffer_processors and output_batch_size?
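Rough math on why it fills so fast (the ~1 KB average on-disk message size is only my guess for illustration):

backlog growth = 25,000 in - 5,000 out = 20,000 messages/s
at ~1 KB per message that is roughly 20 MB/s of journal growth
15 GB / 20 MB/s ≈ 750 s, i.e. the journal is full in about 12-13 minutes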

For that volume of messages it looks like your processing power, in terms of CPU cycles, is not enough.

@jan Sorry, I didn't send the complete information.
The system runs as a single node in a VM that has 24 vCPUs.
In total I have 24 CPUs and 32 GB of RAM.
I allocated 50% to Elasticsearch (16 GB RAM) and 25% to Graylog (8 GB RAM); the remaining 25% is for the OS.

hey @sadman

did you limit the cores that Elasticsearch can use in the Elasticsearch configuration? If not, Graylog and Elasticsearch are fighting for the available cores. In addition, with that ingest rate, especially during index rotation, you don't have enough computing resources and the Graylog journal is used as a buffer. You might want to raise the size of the journal and/or add more compute power.

How do I limit the cores usable by Elasticsearch and Graylog?
I have limited the JVM heap to 16 GB for Elasticsearch and 8 GB for Graylog.
As for the size of the journal, what do you recommend? It is currently at the default value.

I believe this is exactly the behavior: the journal is used beyond what it can handle and starts accumulating unprocessed messages, and this gets worse when the indices need to be rotated.

hey @sadman

Read the docs: https://www.elastic.co/guide/en/elasticsearch/reference/6.8/modules-threadpool.html#processors
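If you want to cap it, that setting goes into elasticsearch.yml; something like this (the value 8 is only an example, pick whatever you want to leave over for Graylog and the OS):

# elasticsearch.yml (Elasticsearch 6.8): size thread pools as if only 8 cores were available
processors: 8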

The journal should be sized to fit your needs. If you get paged when Elasticsearch is down and you can fix it within 4 hours, the journal should be large enough to cover that period of time. If you don't get paged and Elasticsearch can die on Friday at noon without you noticing until Monday morning, you might need a journal that can cover 3-4 days of logs. But that disk space needs to be exclusive to Graylog, because the journal will get corrupted if the configured size is not actually available, and you will lose all messages in the journal.
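A rough way to size it (the ~1 KB average message size is only an assumption; measure how fast your journal actually grows to calibrate):

journal size ≈ peak ingest rate × outage you want to survive × average message size
e.g. 25,000 msg/s × 4 h × 3,600 s/h × ~1 KB ≈ 360 GB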

I did not find the option to configure the processors in "server.conf" or in "elasticsearch.yml".
Would it be the "inputbuffer_processors" option in "server.conf"?

It is in the Graylog server.conf …

https://docs.graylog.org/en/3.3/pages/configuration/server.conf.html

Search for "buffer_pro" to find all the options for the processing, input and output buffers.

I have 24 CPUs. Do you recommend values I should move to?

I have configured it this way for now and message processing has remained stable.

hey @sadman

I would go with 2 for input, 3 for output and 16 for processing.
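In "server.conf" those values would look like this (just a sketch of the suggestion above; keep an eye on CPU usage, since Elasticsearch shares the same 24 cores):

inputbuffer_processors = 2
processbuffer_processors = 16
outputbuffer_processors = 3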
