Index strange rotation and size differences

I set up Graylog about half a year ago. I have an index that collects logs from our FortiGates, and I want to store the logs for about half a year. This is why I set the Rotation Period to P1D (1 day) and Max number of indices to 180. When I look at it now, a few months later, I can see a total of 40 indices, but they have very different sizes. The current one has been written to for about 4 months, while the one before it was written to for only a few minutes. The current one is almost 1.8 TB, while the one before it (or, to be precise, many of those before it) is only a few GB.
Here is an image where you can see the settings and the sizes:

  • OS Information: Ubuntu Server 20.04.4 LTS
  • Package Version:
      • elasticsearch-oss 7.10.2
      • graylog-server 4.2.11-1

So my question is: did I do anything wrong in the configuration, or is this normal behavior?
From what I have configured, I expect it to create a new index every day and keep 180 of them.
Any input would be appreciated.

Thanks

This is not normal behavior. It looks like Graylog missed a rotation and stopped rotating after that. Also, your number of shards is large. Is your Elasticsearch backend a single server or several?

To force a rotation you could restart Graylog, or change the rotation strategy or the index settings to get it going.

I am using a single server for Graylog and Elasticsearch.
I’ll do a manual rotation and report back. (Restarting Graylog doesn’t trigger a rotation, by the way: the uptime of the server is 42 days and the index has been written to for 4 months.)
What number of shards would you recommend?

With a single server, and looking at your normal data volume of roughly 7.5 GB a day, one shard is fine and your environment will need fewer resources.

OK, thanks. I’ll keep that in mind, however my total data from all indexes is around 50 GB per day - not sure if this makes a difference.

Anyhow, I think it is worth mentioning that none of the indexes set to P1D have been rotated for 4 months. Furthermore, when I click on Maintenance → Rotate active write index on any of the indexes, nothing happens. Does this take some time? I have now waited about 20 minutes and am running apt-get update in the meantime; I don’t know if this helps.

Hello,

Did you check your log files (i.e., Elasticsearch, Graylog)?
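For anyone unsure where to look, here is a minimal sketch of pulling recent warnings and errors out of the Graylog log. It assumes the default log location of the graylog-server DEB package (/var/log/graylog-server/server.log); the function name recent_problems is only illustrative, and reading the file may require sudo.

```shell
# Assumption: default log path of the graylog-server DEB package.
GRAYLOG_LOG="${GRAYLOG_LOG:-/var/log/graylog-server/server.log}"

# Print the last 20 WARN/ERROR lines of a log file.
# Usage: recent_problems [path-to-log]
recent_problems() {
  grep -E 'ERROR|WARN' "${1:-$GRAYLOG_LOG}" | tail -n 20
}
```

Calling recent_problems with no argument reads the Graylog log. The same function can be pointed at the Elasticsearch log, which on a package install is usually under /var/log/elasticsearch/ and named after the cluster.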

Hi gsmith

Thanks for pointing that out - it took me some time to get to the log message (I’m still new to Ubuntu server) but there I saw a message saying:

Validation Failed: 1: this action would add [1] total shards, but this cluster currently has [1000]/[1000] maximum shards open

I then deleted some old indexes to get the number of shards down a bit, then I changed all the indexes to 1 shard instead of 4 and now it let me do manual rotations.
Will see if tonight the automatic rotation will work, if not I’ll report back.
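The error message says the cluster had reached its limit of 1000 open shards. As a rough sketch, the current shard count can be compared against that limit with the _cat/shards API. This assumes Elasticsearch answers on localhost:9200 without authentication; the function names are illustrative, and the count is approximate because _cat/shards prints one line per shard copy and may also list shards that do not count toward the limit.

```shell
# Assumption: Elasticsearch is reachable here without authentication.
ES_URL="${ES_URL:-http://localhost:9200}"

# Print the number of shards Elasticsearch lists (one line per shard copy).
count_shards() {
  curl -s "$ES_URL/_cat/shards?h=index,shard,prirep,state" |
    awk 'NF { n++ } END { print n + 0 }'
}

# Print how many more shards fit under the limit.
# Usage: shard_headroom [limit]   (default 1000, the value from the error message)
shard_headroom() {
  limit="${1:-1000}"
  echo $(( limit - $(count_shards) ))
}
```

Running shard_headroom after deleting old indexes shows how much room is left before rotation would fail again.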

Thanks

Hello,

Watch out for replicas: one replica doubles your shard count. Through the web UI I normally set my index set to 4 shards with NO replicas. Since Graylog is on a virtual machine, I have backups done through Veeam, so I don’t really need to create a replica set. This reduces the amount of resources needed for Graylog.
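To make the arithmetic concrete, here is a small sketch of how shards add up for one index set. The figures 180 indices and 4 shards come from the original post; the helper name total_shards is illustrative only.

```shell
# Shards used by one index set once it is fully populated.
# Usage: total_shards INDICES SHARDS_PER_INDEX [REPLICAS]
total_shards() {
  indices="$1"; shards="$2"; replicas="${3:-0}"
  echo $(( indices * shards * (1 + replicas) ))
}

total_shards 180 4 0   # prints 720
total_shards 180 4 1   # prints 1440
total_shards 180 1 0   # prints 180
```

With 4 shards and one replica, a single index set holding 180 indices would already need more shards than the 1000 reported in the error message, and every additional index set adds to the total.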

For example here is my default index set, I have been using these settings since 2016 :smiley:

Next, make sure a single shard doesn’t go over 50 GB, or you may run into issues.
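For scale, the 1.8 TB index from the first post, split over 4 shards, works out to roughly 450 GB per shard. A minimal sketch for listing oversized shards follows; it assumes Elasticsearch on localhost:9200 without authentication, uses the _cat/shards API with sizes reported in whole gigabytes (bytes=gb), and the function name big_shards is illustrative.

```shell
# Assumption: Elasticsearch is reachable here without authentication.
ES_URL="${ES_URL:-http://localhost:9200}"

# List shards whose store size exceeds a threshold in GB.
# Usage: big_shards [max_gb]   (default 50)
big_shards() {
  max_gb="${1:-50}"
  curl -s "$ES_URL/_cat/shards?h=index,shard,prirep,store&bytes=gb" |
    awk -v max="$max_gb" '$4 + 0 > max + 0 { print $1, $2, $3, $4 "gb" }'
}
```

An empty result from big_shards means no shard is above the threshold.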

Hope that helps

EDIT: Here are a couple of commands you can run against Elasticsearch to make sure there are no issues.

Check the health status for unassigned shards.

curl -XGET http://localhost:9200/_cluster/health?pretty | grep unassigned_shards

Check for dangling indices.

curl -X GET "http://localhost:9200/_dangling?pretty"

Hey

Thanks for the advice.

curl -XGET http://localhost:9200/_cluster/health?pretty | grep unassigned_shards
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   464  100   464    0     0   2252      0 --:--:-- --:--:-- --:--:--  2263
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,

curl -X GET "http://localhost:9200/_dangling?pretty"
{
  "_nodes" : {
    "total" : 1,
    "successful" : 1,
    "failed" : 0
  },
  "cluster_name" : "graylog",
  "dangling_indices" : [ ]
}

This is what it looks like. To an amateur like me it seems OK.

Thanks

It is. I was just checking, and thanks for showing the results :smiley:
