Delete some data from Graylog

Hello,
I'm currently using Graylog v3.0.2 on CentOS 7.
Can I delete some data? I have already searched Google and this forum, but I still don’t understand.
Many explanations point to System > Indices,
and I think my config is still the default, as in the attachment.

Can anyone explain this section?

I want to set a retention period for the data, for example so that logs are deleted after 9 months.
For the installation I used this instruction.

Sorry for my bad English; I am new to Graylog. Thanks in advance.

Hi,

To delete data after a certain time, you need to use “Index Time” as the rotation strategy. Have a look at the official documentation: Index model — Graylog 3.0.2 documentation

The rotation interval you choose should also depend on how fast your indices grow: Size your shards | Elasticsearch Guide [7.13] | Elastic

But that’s more relevant for production systems than for test installations.

After you have figured out your rotation interval, you need to figure out how many indices
you need to keep to retain data for the time period you want.

Let’s assume you rotate every 7 days and want to retain 4 weeks of data (28 days):

28/7 = 4 indices

Imagine you have 4 indices and a new week starts. Now you have 5 indices, so the oldest one gets
deleted. Your oldest data is now only 3 weeks old. So if you always want to have at least the
last 4 weeks of data, you have to set your “Max number of indices” to 5.

If you want to store data for no longer than 28 days, choose 4 as the max number.
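To make the arithmetic above easier to repeat for other periods, here is a minimal Python sketch. The function name `max_indices` and the reading of "9 months" as roughly 270 days are my own assumptions for illustration; nothing here is a Graylog API, it only computes the number you would type into “Max number of indices”.

```python
import math


def max_indices(retention_days: int, rotation_days: int, at_least: bool) -> int:
    """Value for "Max number of indices" with time-based rotation.

    at_least=True  -> always keep at least retention_days of data
    at_least=False -> never keep data older than retention_days
    """
    if at_least:
        # One extra index, because right after a rotation the newest index is empty.
        return math.ceil(retention_days / rotation_days) + 1
    # Round down so the oldest data never exceeds the retention period.
    return max(1, retention_days // rotation_days)


# The example from this thread: rotate every 7 days, retain 28 days.
print(max_indices(28, 7, at_least=True))    # 5
print(max_indices(28, 7, at_least=False))   # 4

# Assumption: "9 months" taken as roughly 270 days, still rotating weekly.
print(max_indices(270, 7, at_least=True))   # 40
print(max_indices(270, 7, at_least=False))  # 38
```

Whether weekly rotation is a sensible interval for your data volume is a separate question; this only covers the counting.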

Hope that helps.

P.S.: Graylog is currently on version 4.0 (4.1 coming soon), so you might want to upgrade now to enjoy its improvements.


Hello, thank you for your reply.
So if I want to keep the data as you said (28 days), I just change the indices like this?

Before you get too far here, I want to caution you about setting large index rotation periods. When you set a large index rotation period, the net result is that you end up increasing the shard size in your index. The larger the shard size, the longer shard operations will take. You want your shard operations to be quick: the longer an operation takes, the more heap is consumed and the fewer resources you have available for things like queries. If this is a particularly active index, then having 7 days' worth of data could result in some pretty large shard sizes, which is going to impact the performance of your searches.

In general, you want to stay within Elasticsearch’s guidance for shard sizes (20-40 GB per shard) and shard count (20 shards per GB of heap available within your Elasticsearch cluster). Just looking at these settings, it seems like you may end up in a bad position down the road.
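As a rough way to check an index set against those two rules of thumb, something like the following Python sketch could help. All input numbers (20 indices, 4 shards, 0 replicas, 8 GB heap, 100 GB per index) are hypothetical placeholders, and the function names are mine; substitute the values from your own System > Indices page and Elasticsearch nodes.

```python
def total_shards(indices: int, shards_per_index: int, replicas: int) -> int:
    """Primary plus replica shards across all retained indices."""
    return indices * shards_per_index * (1 + replicas)


def shard_budget(heap_gb: float, shards_per_gb_heap: int = 20) -> int:
    """Upper bound on shard count from the '20 shards per GB of heap' guidance."""
    return int(heap_gb * shards_per_gb_heap)


def shard_size_gb(index_size_gb: float, shards_per_index: int) -> float:
    """Average primary shard size for one index."""
    return index_size_gb / shards_per_index


# Hypothetical index set: 20 indices, 4 shards each, no replicas.
used = total_shards(indices=20, shards_per_index=4, replicas=0)
budget = shard_budget(heap_gb=8)
size = shard_size_gb(index_size_gb=100, shards_per_index=4)

print(used, budget)        # 80 160
print(size)                # 25.0
print(used <= budget)      # True: shard count fits the heap guidance
print(20 <= size <= 40)    # True: shard size is inside the 20-40 GB range
```

If the shard size comes out well above the range, a shorter rotation period or more shards per index is the usual lever; if it comes out far below, a longer rotation period reduces the number of small shards.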


^^ This is great advice and taking this approach a few months ago dramatically changed the responsiveness of our Graylog setup. I recommend setting up a spreadsheet where you can play around with retention, index sizes, and shard counts. Some of our indexes get very little activity and can go weeks before they accumulate 5-10 GB. It makes no sense to have 90 indices with 1 day rotation when I could just have 3 indices with 1 month rotation. Other index sets accumulate 20-40 GB a day and so require a different strategy.
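For anyone who prefers a script to a spreadsheet, the same kind of planning can be sketched in a few lines of Python. The target of 30 GB per shard, the single shard per index, and the daily volumes are assumptions picked for illustration; the 90-day retention matches the "90 indices vs. 3 indices" comparison above.

```python
import math


def rotation_days_for(daily_gb: float, shards_per_index: int,
                      target_shard_gb: float = 30.0) -> int:
    """Days to wait before rotating so each shard lands near the target size."""
    days = (target_shard_gb * shards_per_index) / daily_gb
    return max(1, int(days))


def indices_for_retention(retention_days: int, rotation_days: int) -> int:
    """How many indices cover the retention period at this rotation interval."""
    return math.ceil(retention_days / rotation_days)


# Quiet index set (assumed 0.25 GB/day): very long rotation is fine.
print(rotation_days_for(daily_gb=0.25, shards_per_index=1))  # 120

# Busy index set (assumed 30 GB/day): rotate daily.
print(rotation_days_for(daily_gb=30, shards_per_index=1))    # 1

# 90 days of retention with daily vs. monthly rotation.
print(indices_for_retention(90, 1))    # 90
print(indices_for_retention(90, 30))   # 3
```

The suggested rotation for the quiet index set is longer than the retention period itself, which is a hint that you would cap it at something like one month, as in the post above.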


Hello,
Thank you for the reply, staff,
and thank you for the explanation.
As you know, my current rotation configuration is the default.

I just want to confirm: if I use that configuration, will the data be deleted after 20 rotations? That is, graylog_0 will be deleted and writing continues on graylog_21?

And can we also delete it manually to reduce the size?

If I’m right, I don’t mind using this config. The impact would be that my search queries get slower, right?
Sorry for my bad English.
