Retention time strategy

Hi!
We have a Graylog index that will receive about 300 million messages per 24 hours.
The total amount of data is approx. 54 GB per 24 hours. We would like to keep this searchable for 3 years.
The setup we intend to use is:
1 index with 5 shards and 1 replica.
Rotation period: P1D
Max number of indices: 1096
Index retention strategy: Delete

This means the index for each day (24 h) will hold approx. 300 million messages / 54 GB.
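
For context, here is a rough back-of-the-envelope estimate of the total storage this retention plan implies (a sketch using only the numbers above; real disk usage will vary with compression, mappings, and Elasticsearch overhead):

```python
# Back-of-the-envelope storage estimate for the proposed retention plan.
# Figures taken from the post above; actual usage depends on compression
# and index overhead.

daily_data_gb = 54      # raw data per day
retention_days = 1096   # 3 years of daily indices
replicas = 1            # 1 replica per primary shard

primary_storage_tb = daily_data_gb * retention_days / 1024
total_storage_tb = primary_storage_tb * (1 + replicas)

print(f"Primary storage: {primary_storage_tb:.1f} TB")  # ~57.8 TB
print(f"With replicas:   {total_storage_tb:.1f} TB")    # ~115.6 TB
```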

Is this setup feasible or should we do it some other way?

Grateful for any suggestions!

If it works for you, everything is fine. :wink:

More interesting than the number of primary and replica shards would be the sizing of your Elasticsearch cluster.

See also:


Well, we will see in about 1096 days or less. :slight_smile:
We have 3 Elasticsearch master nodes and 5 data nodes, and we can always add more.
We just want to avoid doing something obviously wrong in this setup.
I had a look at the blog post you linked to.
EDITED: We will create 5 shards per 24 h, and with an average of 54 GB this works out to 54 / 5 = 10.8 GB per shard (not counting replicas).

TIP: Small shards result in small segments, which increases overhead. Aim to keep the average shard size between a few GB and a few tens of GB.
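
To put that tip in numbers, a minimal check of the average shard size against the guideline (a sketch using the figures from this thread):

```python
# Check average primary shard size against the "few GB to a few
# tens of GB" guideline quoted above.

daily_data_gb = 54
primary_shards = 5

shard_size_gb = daily_data_gb / primary_shards
print(f"Average primary shard size: {shard_size_gb:.1f} GB")  # 10.8 GB

# 10.8 GB sits comfortably inside the recommended range, so the
# 5-shard layout looks reasonable for this daily volume.
```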

This seems like an OK setup?

Why 8 primary shards when you only have 5 Elasticsearch data nodes?


Sorry, 5 primary, will edit post.
