Data Retention since Graylog 6.0

1. Describe your incident:

I just upgraded from Graylog 5.2 to 6.0.6 and noticed that index retention is being deprecated. Now I only have Data Tiering with max days and min days in storage, but I still want to rely on daily rotation.

2. Describe your environment:

  • OS Information: Docker Compose

  • Package Version: Graylog 6.0.6, MongoDB 7.0.14, OpenSearch 2.15.0, Traefik 3.1.4 as reverse proxy

  • Service logs, configurations, and environment variables:

Environment variables for Docker Compose (graylog service):

GRAYLOG_PASSWORD_SECRET: somepasswordpepper
GRAYLOG_ROOT_PASSWORD_SHA2: some_hash
GRAYLOG_ELASTICSEARCH_HOSTS: "http://opensearch:9200"
GRAYLOG_HTTP_EXTERNAL_URI: "https://graylog.lab.lan/"
GRAYLOG_MONGODB_URI: "mongodb://mongo:27017/graylog"

3. What steps have you already taken to try and solve the problem?

I have read the documentation about Data Tiering without finding anything about retention.

4. How can the community help?

I understand Graylog wants to follow the same approach as Elasticsearch by adding data tiering, but why deprecate something like data retention?

I was relying on daily indices to rotate easily and so on.

Is there any other way to configure index rotation/retention?

Thank you !

You can still use the old-style configuration by selecting “Legacy” in the “Rotation/Retention” section of the index configuration page.
It is deprecated because the legacy retention settings don’t map well to what users are actually trying to achieve when the daily message rate varies; resource usage can also be sub-optimal, with too many and/or poorly sized shards.
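
For reference, a daily-rotation, 180-day setup like yours would look roughly like this under Legacy (strategy names as they appear in the Graylog UI; values are illustrative):

Rotation strategy: Index Time
Rotation period: P1D            (rotate once per day)
Retention strategy: Delete index
Max number of indices: 180      (roughly 180 daily indices)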

Alright, but how long will it remain available before it is removed?

On my Graylog I can only choose max days in storage and min days in storage, which is a bit confusing because there is no explicit rotation setting anymore.

Or should I set min days in storage to 1 to get a daily rotated index, and max days in storage to 180 to keep data for 180 days?

The way data tiering performs rotation is not as straightforward, but if you are after 1-day rotation you can use a 1-day gap between min and max. For example, if you want to retain data for 30 days, set min to 30 and max to 31. The reason is that Graylog uses that “leeway” to rotate in the event the index has not grown enough in size.
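
Applied to your 180-day example, the Data Tiering settings on the index set would be roughly (illustrative values):

Min days in storage: 180
Max days in storage: 181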

This is a basic diagram of the decision tree that happens. You can see the section where “Index Create Date > (Max Age - Min Age)”
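
In words, the part of the tree that matters here is roughly (my paraphrase of the diagram, not the exact implementation):

If the active index has reached the target shard size -> rotate.
Else, if Index Create Date > (Max Age - Min Age) -> rotate anyway (the “leeway” case).
Else -> keep writing to the current index.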

I was wondering the same as s0p4L1N.

We were thinking about just using the legacy option for index retention.
But we were also wondering how long this legacy option will remain available before it is removed.

Does anyone know for how much longer this option will be available?

Out of interest, is there something about the new rotation system that is not meeting your needs?

Hi Wine_Merchant,

It might meet our needs, but the new rotation system is a bit confusing for us right now. That’s why we’re considering using the legacy method.

We tried setting up a monthly rotation with an index retention period of one year, but unfortunately, we can’t seem to get it to work. I could be mistaken, but it feels like we had a lot more control and options with the legacy system.

@LCE

I think the idea was to simplify things: a system that reduces the complexity of balancing shard size vs shard count vs total retention.

You can still alter the default shard size and count, but these are now options within server.conf. Changing them mainly matters when ingesting larger amounts of data; for smaller clusters the defaults should be good.
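
As a sketch of what that looks like in server.conf (values are illustrative, not the shipped defaults):

# default shard and replica count for new indices
elasticsearch_shards = 1
elasticsearch_replicas = 0
# target shard size used by the time-size-optimizing rotation
time_size_optimizing_rotation_min_shard_size = 20gb
time_size_optimizing_rotation_max_shard_size = 20gb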

@LCE Yes, you had more control in the old system. But it only works well if you have a fairly constant ingest. When that varies - which it invariably does - selection of a constant size or time is sub-optimal. Allowing GL to dynamically determine when to rotate leads to much better resource usage and performance.

@patrickmann Thanks for the insights. The data I ingest is roughly the same size each month, so I was aiming for monthly rotations with a year of retention for each index. I now understand the system works differently, thanks to both @patrickmann and @Wine_Merchant for the explanations. I’m considering the data tiering solution—could you suggest what values for the minimum and maximum days in storage would best approximate monthly rotation and a year-long retention?

Bit of an essay for you @LCE… The system is based around obtaining the ideal size for a single shard within an index set. This defaults to 20gb per shard and can be altered with the options below in server.conf. A 20GB shard size is considered ideal for search performance, but you may wish to alter it based on how much data you ingest and the resources available within the cluster. How much memory you have assigned to heap should also be a consideration when calculating how many shards an individual OpenSearch node can hold. Say 16GB is assigned as heap; the equation would be 16 (total heap assigned in GB) x 20 (1GB of heap = 20 shards at a 20GB shard size) = 320. This means 320 shards per OpenSearch node.

time_size_optimizing_rotation_min_shard_size = 20gb
time_size_optimizing_rotation_max_shard_size = 20gb

To apply this logic to your cluster and make concrete suggestions, we would also need to know how many OpenSearch nodes are available and their current heap allocation, along with the daily ingest in GB.

To give the simplest answer to your question: set the min to 365 days and the max to 375. That gives a 10-day leeway for the system to better optimise shard size, but you will always retain at least 365 days of data.
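
In the index set’s Data Tiering section that would be (illustrative):

Min days in storage: 365
Max days in storage: 375

With the default 20gb shard target, each index then rotates roughly when it reaches that size, or after the 10-day leeway at the latest, while always keeping at least 365 days of data.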

@Wine_Merchant,

Thanks so much for the detailed explanation! I had already read up on shards before this thread, and your input really helped me understand the new data tiering. I’ll definitely give this method a try once I’ve run some numbers for my environment.
