Journal Utilization is too high

Good morning everyone. I found this statement when I checked my notifications this morning:

_“Journal utilization is too high and may go over the limit soon. Please verify that your Elasticsearch cluster is healthy and fast enough. You may also want to review your Graylog journal settings and set a higher limit.”_

What does it mean and how do I get it fixed? I would appreciate a quick response because I need to fix this ASAP.

Are messages still outgoing? If yes, it may be that the Elasticsearch cluster is not able to keep up, in which case you may need more CPU, disk, or memory. That could mean giving the current node(s) more resources/IO, or adding one or more nodes. If no messages are outgoing, it may be that your Elasticsearch storage is full and can’t ingest any more data, in which case you would need to expand storage for Elasticsearch or add another node and let it rebalance. There can be other reasons as well, but these are the most common I have run into.
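
If you want to check the Elasticsearch side directly, a quick way is from a shell on the box (this assumes Elasticsearch is listening on localhost:9200; adjust the host and port to your setup):

```bash
# Cluster health: "green" or "yellow" is workable, "red" means some
# indices cannot be written to
curl -s 'http://localhost:9200/_cluster/health?pretty'

# Disk usage per Elasticsearch node, as the cluster sees it
curl -s 'http://localhost:9200/_cat/allocation?v'
```

If the cluster is red or the disks are nearly full, that alone would explain the journal backing up.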

Thanks Mantil, I appreciate it. But what do you mean by “outgoing”? I can receive messages from nodes whenever I click on “Show received messages”, but each time I show the messages, the notification comes back.

Does your Graylog interface show that messages are going out? Usually there is a stat at the top showing the Graylog cluster’s incoming and outgoing messages, and also stats per node under “System > Nodes”.
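
For reference, the “limit” the notification mentions is the journal size cap in Graylog’s server.conf. On a package install the file is usually /etc/graylog/server/server.conf; the appliance keeps its configuration elsewhere, so treat the paths and values below as a sketch rather than your exact setup:

```
# Journal settings in server.conf (values shown are approximate defaults)
message_journal_enabled = true
message_journal_dir = /var/lib/graylog-server/journal
# Raise this cap if the journal keeps hitting the utilization warning
message_journal_max_size = 5gb
```

Keep in mind that raising the cap only buys time; if messages go into the journal faster than Elasticsearch can take them out, it will eventually fill up again.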

Please check for yourself.

Dunno what your log ingestion rate is, but your heap size seems pretty low. That is somewhat unrelated, but you may want to look into it. Are you using the appliance?

When healthy, what is your normal log ingestion rate?

How do I increase the heap size? What exactly does the heap size do, and what should it be increased to at most?
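
In short, the heap is the memory the Java process (Graylog server, and separately Elasticsearch) is allowed to use; if it is too small, processing slows down and the journal backs up. On a standard package install the Graylog heap is set via JVM options as below; the appliance manages this differently, so the path here is an assumption rather than the exact location on your box:

```
# /etc/default/graylog-server (Debian/Ubuntu package install)
# -Xms is the initial heap, -Xmx the maximum; they are usually kept equal.
# Example: raise both to 2 GB.
GRAYLOG_SERVER_JAVA_OPTS="-Xms2g -Xmx2g"
```

Elasticsearch has its own heap setting (ES_HEAP_SIZE or jvm.options, depending on the version); a common rule of thumb is to give it no more than half of the machine’s RAM. Restart the service after changing either.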

Is that last screenshot of your node currently, or from when it was healthy? Also, again: are you using the appliance?

Yeah, yeah… the screenshot was taken seconds before I uploaded it here.

Gotcha. The appliance, at least in our experience, seems to be great for a proof of concept but not up to the task in a production environment where you not only need to ingest messages but also dashboard and retain them over the long term. 2k+ messages seemed to fill up the default appliance config pretty quickly. I’d recommend moving up to a more robust setup at some point: multiple Graylog nodes with an accompanying Elasticsearch cluster behind them. That being said, back to your current problem: have you checked to make sure Elasticsearch hasn’t run out of storage?

How do I check that, please? And what does a more robust setup require?

The quickest way I know of is to SSH into your appliance; the `df -h` command should let you know.
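
A minimal example of what to run and what to look at (the Elasticsearch data path below is a guess for the appliance and will differ on other installs):

```bash
# Disk usage per filesystem; watch the Use% column for the volume
# that holds the Elasticsearch data directory
df -h

# Size of the Elasticsearch data directory itself
# (path is an assumption; adjust to your install)
sudo du -sh /var/opt/graylog/data/elasticsearch
```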

[screenshot of df -h output]

Looks okay. Another question: is storage struggling to keep up? What kind of storage do you have behind this? Also, what does CPU/memory load look like? Maybe someone else in the community can chime in; my experience isn’t generally with the appliance. But my guess is you are hitting some kind of hard or soft limit here, whether it be CPU or heap size for Graylog or Elasticsearch. Another place to look: are you using any expensive extractors? Regex/Grok patterns? Those can also quickly eat into system resources if you don’t keep an eye on them. I could be way off base here, but these are all things to look into. Splitting these workloads out onto separate nodes really helps in that regard.
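
If it helps, the same SSH session can answer the CPU/memory/IO questions (iostat comes from the sysstat package on most distributions):

```bash
# CPU and memory usage at a glance
top

# Free and used memory in megabytes
free -m

# Per-device I/O load, refreshed every second; consistently high %util
# or await values suggest the disk is the bottleneck
iostat -x 1
```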

Talking about extractors, Matt: I am using just one, to extract only the source IPs from the raw logs. Is there another way that would enable me to extract all fields at once from the raw logs (source and destination addresses, ports, category, outcome, different nodes) so that I can just search for any of them globally?

Thanks

Depends. What types of logs are we talking about? Web logs? Network equipment logs, etc.?
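
For example, if these turn out to be structured firewall-style logs, a single Grok extractor (or a Grok pattern in a pipeline rule) can pull several fields in one pass. The pattern and field names below are made up for illustration and would need to match your actual log format; for a hypothetical line like `192.0.2.10 198.51.100.7 52100 443 allow`, the pattern could be:

```
%{IP:src_ip} %{IP:dst_ip} %{INT:src_port} %{INT:dst_port} %{WORD:outcome}
```

If the logs are key=value formatted instead, Graylog’s key_value() pipeline function can set all the fields in one step without writing a pattern per field.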
