I have a Graylog cluster with one master node in our in-house server room and one node in a remote datacenter.
The master node also hosts Elasticsearch and MongoDB.
Everything was running fine until a few days ago, when over the weekend the disk on the master node filled up and everything stopped.
After I went through all the articles and help I could find, the Elasticsearch cluster state was still RED and Graylog kept reporting this error:
“Deflector is pointing to [graylog_7], not the newest one:[graylog_8],Re-pointing.”
In the end, Graylog was restored from a full backup taken one week earlier, so one week of logs was lost.
I would like the community's help in gathering best practices from your own experience with:
- How to make sure I don't lose any logs in future, e.g. some kind of backup of the Graylog indices.
- How to recover logs from an Elasticsearch cluster that failed because the disk was full.
- Log rotation and archiving, or some other way to preserve old logs and retrieve them later.
Please help me understand what can be done so that Graylog can rotate logs, keep and archive old logs, and recover from a failed Elasticsearch cluster.
The setup was running in a production environment, and my job is now at stake because of this incident.
Any help and suggestions would be highly appreciated.
Thanks and Regards
Ankit Sharma