Journal utilization 99% has gone over 95%

I am having this problem regularly. When it happens, message processing stops entirely.

My rotation strategy is by index size (I have also tried time-based rotation), keeping 2 indices. But I noticed that when Graylog needs to rotate to a third index and delete the oldest one, it gets stuck and messages stop being processed.

How can I improve the performance of this?

I have 32 GB of RAM: 16 GB for Elasticsearch and 8 GB for Graylog’s JVM.

Last messages in Graylog’s “server.log”:

> 2020-04-27T16:01:51.508-03:00 WARN [KafkaJournal] Journal utilization (99.0%) has gone over 95%.
> 2020-04-27T16:02:51.459-03:00 WARN [KafkaJournal] Journal utilization (99.0%) has gone over 95%.
> 2020-04-27T16:03:21.944-03:00 INFO [AbstractIndexCountBasedRetentionStrategy] Number of indices (3) higher than limit (2). Running retention for 1 indices.
> 2020-04-27T16:03:21.947-03:00 INFO [AbstractIndexCountBasedRetentionStrategy] Running retention strategy [org.graylog2.indexer.retention.strategies.DeletionRetentionStrategy] for index <microsoft-ad_3309>
> 2020-04-27T16:03:23.730-03:00 INFO [DeletionRetentionStrategy] Finished index retention strategy [delete] for index <microsoft-ad_3309> in 1782ms.
> 2020-04-27T16:03:31.940-03:00 INFO [AbstractRotationStrategy] Deflector index <Forward: Exchange> (index set <exchange_1336>) should be rotated, Pointing deflector to new index now!
> 2020-04-27T16:03:31.941-03:00 INFO [MongoIndexSet] Cycling from <exchange_1336> to <exchange_1337>.
> 2020-04-27T16:03:31.941-03:00 INFO [MongoIndexSet] Creating target index <exchange_1337>.

@sadman: You need to check the journal message directory configuration (i.e. message_journal_dir = " ") in Graylog’s “server.conf”. This directory is used to store the message journal; it must be used exclusively by Graylog and must not contain any files other than the ones Graylog creates itself.
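For reference, a minimal journal section of server.conf might look like this. The path and limits below are the stock package defaults, shown for illustration only; they are not taken from the poster’s setup:

```
# Directory used exclusively by the Graylog message journal
message_journal_dir = /var/lib/graylog-server/journal

# On-disk journal limits (package defaults: 12h / 5gb)
message_journal_max_age = 12h
message_journal_max_size = 5gb
```

Messages are dropped from the journal when either limit is reached, oldest first.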

If you need more help, please share the message directory path and its disk utilization.

I hope this helps you!!!

The journal has a specific folder in the system; check “server.conf”.
It’s not recommended to change the folder location after the first install, but you can change the limits for the journal.
Check the docs:
First here:
Then for the configuration file:

Thank you very much for the contributions, I will check and forward updates here.

We have Graylog with the following configurations:

ring_size = 65536 (number of messages in each buffer)

processbuffer_processors = 5
outputbuffer_processors = 3

output_batch_size = 1000
journal_age = 15 min
journal_size = 15 gb


Only 1 node, only 1 machine.

We are receiving about 25,000 messages per second but outputting only 5,000 messages per second.
The journal is filling up quickly and we get the error “Journal utilization is too high”. Can anyone please help me calculate how much to increase output_batch_size and the output processors?
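As a back-of-the-envelope check (my own arithmetic, using the rates quoted above and an assumed average message size), the backlog grows at the difference between the ingest and output rates, so a 15 GB journal fills in minutes:

```python
# Rough journal-fill estimate from the rates in this thread.
# avg_msg_bytes is an ASSUMPTION; measure your own average entry size.
ingest_rate = 25_000        # messages/s coming in
output_rate = 5_000         # messages/s written to Elasticsearch
avg_msg_bytes = 1_024       # assumed average journal entry size

backlog_rate = ingest_rate - output_rate          # msgs/s accumulating
bytes_per_sec = backlog_rate * avg_msg_bytes      # journal growth rate

journal_size_gb = 15
seconds_to_fill = journal_size_gb * 1024**3 / bytes_per_sec

print(f"Backlog grows at {backlog_rate:,} msgs/s "
      f"(~{bytes_per_sec / 1e6:.1f} MB/s)")
print(f"A {journal_size_gb} GB journal fills in ~{seconds_to_fill / 60:.0f} minutes")
```

With these numbers the journal fills in roughly a quarter of an hour, which matches the behavior described: raising output throughput matters far more than raising the journal limit alone.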

For that amount of messages it looks like your processing power, in terms of CPU cycles, is not enough.

@jan Sorry I didn’t send the complete information.
The system runs as a single node in a VM with 24 vCPUs.
In total I have 24 CPUs and 32 GB of RAM.
I allocated 50% to Elasticsearch (16 GB RAM) and 25% to Graylog (8 GB RAM); the remaining 25% goes to the OS.

hey @sadman

did you limit the cores usable by Elasticsearch in the Elasticsearch configuration? If not, Graylog and Elasticsearch fight over the available cores. In addition, with that ingest rate, especially during index rotation, you don’t have enough computing resources, and the Graylog journal is used as a buffer. You might want to raise the size of the journal and/or add additional compute power.

How do I limit the cores usable by Elasticsearch and Graylog?
I limited the JVM memory to 16 GB for Elasticsearch and 8 GB for Graylog.
As for the size of the journal, what do you recommend? It is currently at the default value.

I believe that is exactly the behavior: the journal fills beyond what can be processed, unprocessed messages start to accumulate, and it gets worse whenever the indices need to rotate.

hey @sadman

read the docs:

the journal should have a size that fits your needs. If you get paged when Elasticsearch dies and you can fix it within 4 hours, the journal should be sized to cover that period. If you do not get paged and Elasticsearch can die on Friday noon without you noticing until Monday morning, you might need a journal that can cover 3-4 days of logs. But you need to have this disk space exclusively for Graylog, because the journal will get damaged if the configured size is not available, and you will lose all messages in the journal.
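That sizing logic can be sketched in a few lines. The average message size is an assumption here; derive it from your real traffic (journal bytes written divided by messages received):

```python
# Size the journal to cover an Elasticsearch outage window.
# avg_msg_bytes is an ASSUMPTION; use your own measured average.
ingest_rate = 25_000          # messages/s (the rate from this thread)
avg_msg_bytes = 1_024         # assumed stored size per message
outage_hours = 4              # time until someone can fix Elasticsearch

needed_bytes = ingest_rate * avg_msg_bytes * outage_hours * 3600
print(f"Journal needs ~{needed_bytes / 1024**3:.0f} GB "
      f"to buffer {outage_hours} h of logs")
```

At 25k msgs/s even a 4-hour window needs hundreds of gigabytes, so a weekend-sized journal at this rate is a serious disk commitment.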

I did not find the option to configure the processors in “server.conf” or in “elasticsearch.yml”.
Would it be the “inputbuffer_processors” option in “server.conf”?

it is in the Graylog server.conf …

Search for buffer_pro to find all the options for the processing, input, and output buffers.

I have 24 CPUs. Do you recommend any values I should move to?

I configured it this way for now and message processing has stayed stable.

hey @sadman

I would go with 2 for input, 3 for output, and 16 for processing.
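In server.conf that suggestion would look like the fragment below. The sum (2 + 16 + 3 = 21) stays below the 24 vCPUs, leaving headroom for Elasticsearch and the OS:

```
inputbuffer_processors = 2
processbuffer_processors = 16
outputbuffer_processors = 3
```

Since the process buffer does the expensive extractor/pipeline work, it gets the bulk of the threads, while input and output only need a few.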

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.