I have been tasked with ensuring that Graylog keeps logs for 12 months before deleting older data. What is the best method to ensure this? I am just looking for ways to make sure we have the correct amount of space allocated to achieve this.
If you have to be absolutely sure, then you need to set up backups. But if a reasonable margin of error is allowed, you can set the rotation strategy to time-based with a rotation period of 1 day, the number of indices to keep to about 380 (better to have some spares, in case you need to rotate an index manually), and the retention strategy to delete. Then just make sure you have enough disk space.
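The arithmetic behind those numbers can be sketched quickly. This is only a back-of-the-envelope estimate, and the per-index size and replica count below are assumptions you would replace with your own measurements:

```python
# Rough sizing sketch for time-based rotation with a 1-day rotation period.
# AVG_INDEX_GB and REPLICAS are assumptions -- measure your own indices.

RETENTION_DAYS = 365   # keep 12 months of logs
SPARE_INDICES = 15     # headroom for manual rotations
AVG_INDEX_GB = 5.0     # assumed size of one daily index (primary shards only)
REPLICAS = 1           # each replica stores another full copy of the data

max_indices = RETENTION_DAYS + SPARE_INDICES        # matches the ~380 above
disk_gb = max_indices * AVG_INDEX_GB * (1 + REPLICAS)

print(max_indices)  # 380
print(disk_gb)      # 3800.0
```

With these assumed figures you would plan for roughly 3.8 TB across the cluster; the real driver is your measured daily index size.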
Thank you so much jtkarvo for the response. Last question, and I know this seems rather obvious, but how do you determine if you have enough disk space for what you want to do? It seems like if you underestimate, you could run completely out of hard drive space on the system while Graylog itself thinks it has plenty of space to use. I don’t want to tank my current setup.
@Sparky for most people it is hard to calculate. That is why you should do the following:
- Plan your setup so that Elasticsearch is able to grow
- Ingest one month of data and extrapolate from that month what you would need for a year
- Monitor your disk usage
- Place Elasticsearch on LVM volumes
Keep this post in mind: https://www.elastic.co/blog/how-many-shards-should-i-have-in-my-elasticsearch-cluster
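The "ingest one month, extrapolate to a year" step above is simple arithmetic. A minimal sketch, where the observed monthly volume and the 25% safety margin are both assumed placeholders:

```python
# Extrapolate a year's disk need from one observed month of ingestion.
# one_month_gb would come from something like Elasticsearch's _cat/indices
# output; 300 GB is an assumed example value.

one_month_gb = 300.0
safety_margin = 1.25   # assumed 25% headroom for growth and segment merges

yearly_gb = one_month_gb * 12 * safety_margin
print(yearly_gb)  # 4500.0
```

The margin matters because ingest volume rarely stays flat over a year, and Elasticsearch needs free space for segment merges on top of the stored data.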
I have set up a cron task to log free disk space on the ES nodes, then made a dashboard in Graylog where I can see how the disk space is used (a graph showing free disk space as a function of time); then we have added more disk to the ES nodes whenever needed.
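A probe like the one described could be sketched as below. This is not the poster's actual script, just a minimal example; the data directory path and the cron schedule are assumptions, and shipping the printed line into Graylog (e.g. via a syslog or GELF input) is left to whatever transport you already use:

```python
#!/usr/bin/env python3
# Minimal disk-space probe to run from cron on each ES node, e.g.:
#   */15 * * * * /usr/local/bin/disk_probe.py
# (path and schedule are assumptions, not from the original post)

import shutil
import socket

def free_gb(path: str) -> float:
    """Return free space at `path`, in gigabytes."""
    return shutil.disk_usage(path).free / 1024 ** 3

if __name__ == "__main__":
    # On a real ES node you would probe the data directory, e.g.
    # /var/lib/elasticsearch; "/" is used here so the sketch runs anywhere.
    msg = "es_disk_free_gb=%.1f host=%s" % (free_gb("/"), socket.gethostname())
    print(msg)
```

A key=value line like this is easy to extract in Graylog with a key-value extractor, which is what makes the free-space-over-time graph straightforward to build.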