Data Retention since Graylog 6.0

Bit of an essay for you @LCE… The system is based around obtaining the ideal size for a single shard within an index set. This setting defaults to 20 GB per shard and can be altered with the options below in server.conf. A 20 GB shard size is considered ideal for search performance, but you may wish to alter it based on how much data you ingest and the resources available within the cluster. How much memory you have assigned to heap should be a consideration when calculating how many shards an individual OpenSearch node can hold. Say 16 GB is assigned as heap; the rule of thumb is roughly 20 shards per 1 GB of heap (at a 20 GB shard size), so the equation is 16 (total heap assigned in GB) x 20 (shards per GB of heap) = 320, i.e. 320 shards per OpenSearch node.
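The heap-to-shard arithmetic above can be sketched as a small helper. This is just the rule of thumb from this post expressed as code, not anything Graylog itself exposes; the function name and constant are illustrative:

```python
# Rough per-node shard capacity estimate, following the heuristic above:
# roughly 20 shards per 1 GB of JVM heap (at a 20 GB target shard size).
SHARDS_PER_GB_HEAP = 20  # rule of thumb, not a hard OpenSearch limit


def max_shards_per_node(heap_gb: int) -> int:
    """Approximate number of shards a single OpenSearch node can hold."""
    return heap_gb * SHARDS_PER_GB_HEAP


print(max_shards_per_node(16))  # 16 GB heap -> 320 shards
```

Treat the result as a ceiling to stay under, not a target to fill; real capacity also depends on query load and mapping complexity.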

time_size_optimizing_rotation_min_shard_size = 20gb
time_size_optimizing_rotation_max_shard_size = 20gb

To apply this logic to your cluster and make concrete suggestions, we would also need to know how many OpenSearch nodes are available, their current heap allocation, and your daily ingest in GB.

To give the simplest answer to your question: set the minimum lifetime to 365 days and the maximum to 375 days. This gives the system a 10-day leeway to better optimise shard size, while you will always retain at least 365 days of data.
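As a sketch of how that looks in an index set using the time-size-optimizing rotation strategy, the lifetimes are expressed as ISO 8601 periods. The field names below are from my recollection of the index set API and may differ in your version, so verify against the API browser before relying on them:

```
"rotation_strategy_class": "org.graylog2.indexer.rotation.strategies.TimeBasedSizeOptimizingStrategy",
"rotation_strategy": {
  "index_lifetime_min": "P365D",
  "index_lifetime_max": "P375D"
}
```

The same values can be set through the UI under the index set's rotation and retention configuration.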
