Disk Journal Utilization monitoring

Hi,
I have decided to monitor disk journal utilization on my Graylog server. What is the best approach? I wrote a script that gets the number of unprocessed messages. Is that a good criterion for monitoring disk journal utilization? When I connect to the Graylog API Browser I don’t see anything related to “disk journal utilization”. If there is a formula to calculate disk journal utilization, what is it?
BR

I don’t see anything related to “disk journal utilization”

You can see the details of a local disk journal under System > Nodes and then clicking the node’s hostname. There you will find details about the disk journal.

When I connect to the Graylog API Browser

Ah, sorry, my bad. You’re not going through the website, but the API.

Is that a good criterion for monitoring disk journal utilization?

Personally, I would definitely keep an eye on it, simply because I have not fully understood the nature of the journal. Is it circular, in the sense that it starts overwriting itself once a limit is reached? Will it throw away the full journal when the limit is reached? Or will it start a second, new journal? I’ll need to study this subject a bit more…

EDIT
Hrrmmm :expressionless: Googling didn’t get me very far…

Hey @Jan, I’m gonna ask a stupid question here. What happens if the local disk journal of a Graylog server reaches either (or both) the maximum size (5GB) and/or the maximum age (12h)? Does it start a second (third, fourth) journal? Is it circular? Or does it just nuke&pave?


The journal is FIFO: it simply drops the oldest records when one of the configured limits is reached.

It can be monitored via metrics:

[screenshot of the journal metrics]

Those can be requested via the API without a problem at the metrics endpoint. See how to restrict a user to just this endpoint here: http://docs.graylog.org/en/2.5/pages/faq.html#how-can-i-create-a-restricted-user-to-check-internal-graylog-metrics-in-my-monitoring-system
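Something along these lines works for pulling the journal gauges from the API. It is only a minimal sketch: the base URL, the credentials and the metric names below are assumptions, so check /system/metrics/names on your node for the exact names in your version.

```python
#!/usr/bin/env python3
"""Minimal sketch: read Graylog journal metrics via the REST API."""
import requests

GRAYLOG_API = "http://graylog.example.org:9000/api"  # assumed base URL
AUTH = ("metrics-user", "password")                  # restricted metrics user (see FAQ link above)

# Journal gauges as named on a 2.x node; verify against /system/metrics/names.
METRICS = [
    "org.graylog2.journal.size",
    "org.graylog2.journal.size-limit",
    "org.graylog2.journal.utilization-ratio",
    "org.graylog2.journal.entries-uncommitted",
]


def read_gauge(name):
    """Fetch a single metric and return its gauge value (None if the shape is unexpected)."""
    r = requests.get(
        f"{GRAYLOG_API}/system/metrics/{name}",
        auth=AUTH,
        headers={"Accept": "application/json"},
    )
    r.raise_for_status()
    body = r.json()
    # Gauges usually come back as {"value": ...}; fall back to a nested
    # {"metric": {"value": ...}} shape in case your version differs.
    return body.get("value", body.get("metric", {}).get("value"))


if __name__ == "__main__":
    values = {name: read_gauge(name) for name in METRICS}
    for name, value in values.items():
        print(f"{name}: {value}")

    # If the ratio gauge is not available, utilization can be derived
    # as journal size divided by the configured size limit.
    size = values["org.graylog2.journal.size"]
    limit = values["org.graylog2.journal.size-limit"]
    if size is not None and limit:
        print(f"derived utilization: {100.0 * size / limit:.1f}%")
```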


Super! Thanks a lot!

I will have to go over our configuration to verify these settings. The default 12h might be too short for us in some cases, because we’re not guaranteed 24x7 support. It would suck if we lost lots of logging simply because the host could not talk to Elastic for a prolonged period of time over the weekend.
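For reference, those limits are set per node in graylog.conf; here is a sketch with the defaults (the journal directory depends on how Graylog was installed):

```
# Journal settings in graylog.conf (defaults shown; raise max_age/max_size
# if the node must buffer messages through a longer Elasticsearch outage,
# and make sure the disk actually has room for the larger journal).
message_journal_enabled = true
message_journal_dir = /var/lib/graylog-server/journal
message_journal_max_age = 12h
message_journal_max_size = 5gb
```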

One more thing: monitor what you can, and you can monitor what you want. So monitor everything.
Start by checking Graylog’s system overview and node pages and try to monitor all the information you see there. After a problem or two you will notice other things you need to monitor as well.
At the moment I monitor 20-25 metrics/statuses for the Graylog service, most of them via the API (Elasticsearch and MongoDB are another story…).
The last time I added a new performance monitor was this morning.
I don’t need all of these values, but if I ever run into a problem I will be able to check their history.
Some metrics I find useful (see the sketch below): ES output process times; input, output and processed message totals plus 5-minute averages; usage; load balancer status; throughput; uncommitted messages; service memory and CPU usage; TCP and UDP connections; journal disk usage.
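If you feed those into a classic check-based monitoring system, the journal part can look roughly like this. Again only a sketch: the URL, credentials and thresholds are made up, and it assumes the utilization-ratio gauge mentioned above is a 0-1 fraction (check on your node whether it is reported as a fraction or a percentage).

```python
#!/usr/bin/env python3
"""Nagios/Icinga-style check for Graylog journal disk usage (sketch)."""
import sys

import requests

GRAYLOG_API = "http://graylog.example.org:9000/api"  # assumed base URL
AUTH = ("metrics-user", "password")                  # restricted metrics user
WARN, CRIT = 0.70, 0.90                              # example thresholds (fraction of size limit)

resp = requests.get(
    f"{GRAYLOG_API}/system/metrics/org.graylog2.journal.utilization-ratio",
    auth=AUTH,
    headers={"Accept": "application/json"},
)
resp.raise_for_status()
ratio = float(resp.json().get("value", 0.0))

message = f"journal at {ratio:.0%} of its size limit"
if ratio >= CRIT:
    print(f"CRITICAL - {message}")
    sys.exit(2)
elif ratio >= WARN:
    print(f"WARNING - {message}")
    sys.exit(1)
print(f"OK - {message}")
sys.exit(0)
```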

