I have Graylog 4.0 and would like to know how large my logs are (probably on a day-to-day basis). I was trying to create a bar chart and aggregate by number of bytes (since that’s the only “log size” metric I can see), but I don’t see an option for “bytes”. There is also no field for receivedBytes.
Basically I would just like to see the daily average size of the logs, so if a bar chart of that is not possible, is there any other way to see this?
Good question. I don’t have trend data on the size of logs per se; it’s normally measured as volume over time. What I did find were the metrics under System/Nodes → Metrics.
Not sure if that will help ya.
What I did was enable Prometheus in the Graylog config file, install Grafana, and create a dashboard for these metrics.
See more here.
This was all done on my graylog node. I’m not that great with it yet, so I’m labbing it out.
Ok thank you, I will try looking at this.
I looked at some example messages again and saw that while there was a section in the original message like “bytesreceived=100” it did not get extracted because of some strange formatting. I added a small extractor using regex to parse out that number, so now I can visualize that one input using a widget with a daily bar chart of count(bytesreceived).
Now I just need to figure out how to do it for both my inputs.
UPDATE: I found that “gl_accounted_message_size” tells you the message size, and I used
avg(gl_accounted_message_size) to create a bar chart (with data from the last 30 days). I was a little confused at first, but found out that this actually calculates the average size of a single log message for each day. I was able to use
sum(gl_accounted_message_size) to get more reasonable numbers (which turn out to be the same numbers shown under System/Overview → Outgoing Traffic).
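To illustrate the difference between the two aggregations, here is a minimal Python sketch. The sample messages and the daily_stats helper are made up for illustration; they are not part of Graylog. Both days have the same average message size, but a different total volume, which is why sum() is the one that tracks daily log size.

```python
from collections import defaultdict

# Hypothetical sample: (day, gl_accounted_message_size in bytes) per message.
messages = [
    ("2021-03-01", 200),
    ("2021-03-01", 400),
    ("2021-03-01", 300),
    ("2021-03-02", 500),
    ("2021-03-02", 100),
]


def daily_stats(rows):
    """Return {day: (sum_bytes, avg_bytes)} for per-message sizes."""
    sizes_by_day = defaultdict(list)
    for day, size in rows:
        sizes_by_day[day].append(size)
    return {
        day: (sum(sizes), sum(sizes) / len(sizes))
        for day, sizes in sizes_by_day.items()
    }


for day, (total, average) in sorted(daily_stats(messages).items()):
    # sum = volume ingested that day, avg = size of a typical single message
    print(f"{day}: sum={total} B, avg={average:.1f} B per message")
```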
Under System/Overview → Outgoing Traffic, it says “Last 30 days: 99.4 GiB”. Does that tell me how many bytes/GB the logs take up each day or is that something different? But
curl -XGET 'localhost:9200/_cat/allocation?v&pretty' returns 34.7GB under
disk.used, so I am a little confused. Additionally, when I go to System/Nodes → Metrics, the total bytes read for both my inputs add up to <1GB. Which one can actually tell me how much space is used up by all of my logs each day?
I think there are maybe two topics here: measuring the size of the logs and measuring sizes reported in the logs by the application.
Measuring the size of the logs:
Each message has a few fields which are always set but not shown by default. The field “gl_accounted_message_size” is one of those. It counts the bytes of “real log data” written to Elasticsearch for this message. There are other fields, such as gl2_source_input (the ID of the input) or gl2_source_node (the ID of the node, if you run a cluster), which are not counted as “real log data”.
The number of bytes (or rather gigabytes) is relevant for the commercial versions of Graylog. In general it offers a good rule of thumb for how much data you ingest.
The “sum” metric adds up those bytes for all matching messages in the given time range. On a bar chart over time, it gives you an idea of how much data is pushed into Elasticsearch over time.
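If you would rather ask Elasticsearch directly, the same daily sum can be expressed as a date_histogram aggregation with a sum sub-aggregation. The sketch below only builds and prints the request body. It assumes the message time is stored in a field named timestamp and that your Elasticsearch is a recent 7.x release (older 6.x releases spell calendar_interval as interval); daily_sum_query is a hypothetical helper name.

```python
import json


def daily_sum_query(days=30, size_field="gl_accounted_message_size"):
    """Build a search body that sums size_field per calendar day."""
    return {
        "size": 0,  # only the aggregation is needed, not the matching messages
        "query": {"range": {"timestamp": {"gte": f"now-{days}d/d"}}},
        "aggs": {
            "per_day": {
                # Assumption: Elasticsearch 7.x; on 6.x use "interval" instead.
                "date_histogram": {"field": "timestamp", "calendar_interval": "1d"},
                "aggs": {"bytes": {"sum": {"field": size_field}}},
            }
        },
    }


print(json.dumps(daily_sum_query(), indent=2))
```

The printed body could then be sent to the _search endpoint of your Graylog indices, for example with curl and a Content-Type: application/json header.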
Measuring size reported in the logs:
You will need a field of type integer to count the bytes of traffic (or whatever else) in your application. As I understand your post, bytesreceived is the field in question. Here is how it looks in my Graylog:
My field is named size_of_request, just put in your “bytesreceived” and you should have good chances to get it working.
Hello, thanks for your response.
I actually don’t think I need the size reported in the logs, just how much space they actually take up daily on my VM. That way I can estimate if I need to increase resources for my VM. If I use the command
curl -XGET 'localhost:9200/_cat/allocation?v&pretty', that tells me how much space I’m using on my VM, right? If I manually calculate day-to-day how much disk space was used from that command in the CLI, can I find out how much resources I need?
Well, yes and no.
The development over time will give you the answer. If you check every day and note the usage, you can see the trend. This is a kind of manual monitoring and may be suitable for a proof of concept and the like.
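As a rough sketch of that manual trend tracking, the Python below takes a handful of noted disk.used values and works out the average growth per day and how long the disk would last at that rate. The readings, the 100 GB disk size and both function names are invented for illustration.

```python
from datetime import date

# Hypothetical readings noted by hand: (day, disk.used in GB).
readings = [
    (date(2021, 3, 1), 34.0),
    (date(2021, 3, 2), 35.5),
    (date(2021, 3, 3), 35.0),
    (date(2021, 3, 5), 38.0),
]


def growth_per_day(samples):
    """Average growth in GB/day between the first and the last reading."""
    ordered = sorted(samples)
    (first_day, first_used), (last_day, last_used) = ordered[0], ordered[-1]
    elapsed = (last_day - first_day).days
    if elapsed <= 0:
        raise ValueError("need readings from at least two different days")
    return (last_used - first_used) / elapsed


def days_until_full(samples, disk_total_gb):
    """Days left at the current rate, or None if usage is not growing."""
    rate = growth_per_day(samples)
    if rate <= 0:
        return None
    latest_used = sorted(samples)[-1][1]
    return (disk_total_gb - latest_used) / rate


print(growth_per_day(readings), "GB/day")
print(days_until_full(readings, disk_total_gb=100.0), "days until full")
```

Small dips between two readings (like the one on the third day here) do not matter much, since only the first and last values are used for the trend.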
In the long run the usage is mostly stable. Which stream are your logs in? Which index set is used for this stream? Is this index set configured to rotate by time (e.g. P1D for once a day) or by number of messages? How often does the rotation happen?
Once your retention limit is reached (the rotation is “full”), Graylog will delete the oldest logs during each rotation and your disk usage will stay stable.
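To get a feeling for where that stable level sits, here is a back-of-the-envelope Python sketch. It assumes time-based rotation, a retention strategy that deletes the oldest index, and that the accounted daily volume roughly equals what ends up on disk; it ignores compression and index overhead, so treat the result as an order of magnitude only. The function name and the example numbers are hypothetical.

```python
def peak_index_size_gb(daily_ingest_gb, rotation_days, max_indices, replicas=0):
    """Rough upper bound for disk used by one index set once retention is full.

    daily_ingest_gb: average accounted volume per day
    rotation_days:   days of data in one index before it is rotated
    max_indices:     number of indices kept by the retention setting
    replicas:        replica shards per primary (each one stores a full copy)
    """
    return daily_ingest_gb * rotation_days * max_indices * (1 + replicas)


# Example: 3 GB/day, monthly rotation, 3 indices kept, no replicas.
print(peak_index_size_gb(3.0, rotation_days=30, max_indices=3))
```

Comparing this estimate with the real disk usage once retention is full gives you the factor between the two for your own setup.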
I use the default index set for all my logs, which is set to rotate by month. The retention setting is set to 3 months.
It seems like manual monitoring is the only way, since the other pieces of information on the Graylog UI itself don’t really tell me the actual disk space used each day.
(Also, silly question: does disk space here refer to the storage/memory on my VM?)
There are no silly questions.
The output of
curl -XGET 'localhost:9200/_cat/allocation?v&pretty' should give you the same sizing as your OS does.
gl_accounted_message_size will give you a different number. It does not take the extra fields into account, and it also does not know whether your Elasticsearch has any replica shards, which add disk space for each replica, and so on. In my experience, the two numbers run roughly in parallel, differing by some factor based on configuration.
So to recap, the best way is to use
curl -XGET 'localhost:9200/_cat/allocation?v&pretty' to estimate how much resources my VM uses up each day?
Yes, that will do. You can also experiment with the parameters:
curl -XGET 'https://localhost:9200/_cat/nodes?v=true&h=id,name,ip,port,version,master,diskTotal,diskUsed,diskUsedPercent&pretty'
This also works with multiple nodes. Choose whichever you prefer.
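If you want to record these numbers with a script instead of reading them off the terminal, the cat APIs can return JSON (format=json) with sizes in plain bytes (bytes=b). The Python sketch below parses that output; parse_allocation and fetch_allocation are hypothetical helper names, the sample payload is invented, and the URL assumes an unsecured Elasticsearch on localhost:9200. Note that disk.indices is the space taken by the indices themselves, while disk.used covers everything on that disk, including files that do not belong to Elasticsearch.

```python
import json
from urllib.request import urlopen


def parse_allocation(payload):
    """Return {node: (indices_bytes, used_bytes, total_bytes)} from _cat/allocation JSON."""
    result = {}
    for row in json.loads(payload):
        if row.get("disk.used") is None:
            # Rows without disk figures (e.g. unassigned shards) are skipped.
            continue
        result[row["node"]] = (
            int(row["disk.indices"]),
            int(row["disk.used"]),
            int(row["disk.total"]),
        )
    return result


def fetch_allocation(base_url="http://localhost:9200"):
    """Query a live node; base_url is an assumption about your setup."""
    with urlopen(f"{base_url}/_cat/allocation?format=json&bytes=b") as resp:
        return parse_allocation(resp.read().decode("utf-8"))


# Invented sample in the shape returned with format=json&bytes=b.
sample = json.dumps([
    {"shards": "12", "disk.indices": "1000", "disk.used": "5000",
     "disk.avail": "15000", "disk.total": "20000", "disk.percent": "25",
     "host": "127.0.0.1", "ip": "127.0.0.1", "node": "node-1"},
    {"shards": "3", "disk.indices": None, "disk.used": None,
     "disk.avail": None, "disk.total": None, "disk.percent": None,
     "host": None, "ip": None, "node": "UNASSIGNED"},
])
print(parse_allocation(sample))
```

Running fetch_allocation() once a day (from cron, for example) and appending the result to a file would give you the daily trend without manual note-taking.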
When I run
curl -XGET 'localhost:9200/_cat/allocation?v&pretty' in the CLI, I see that the
disk.used has decreased by 0.1GB. If this is the final destination of the logs, why would disk space used decrease?
I don’t know, to be honest. If you are counting every 0.1 GB, logging at scale might be the wrong topic. It could be logs in Elasticsearch being rotated and deleted, but that is unlikely, as you are rotating only once a month. It could be system logs being rotated by your Linux. It could be some cache for updates which was freed. Too many possibilities.
Ok, thanks for your help!