Back up Graylog and recover from failures

ankit · March 29, 2017, 6:42am

I have a Graylog cluster with one master node in the in-house server room and one node in a remote datacenter.

The master node also hosts Elasticsearch and MongoDB.

All was running fine until a few days back, when over the weekend the disk of the master node filled up and everything stopped.

After going through all the articles and help available, the Elasticsearch cluster state was still RED and Graylog kept reporting this error:

“Deflector is pointing to [graylog_7], not the newest one:[graylog_8],Re-pointing.”

In the end, Graylog was restored from a full backup taken one week earlier, and the logs for that week were lost.

I would like the community's help with best practices, based on your own experience, for:

How to make sure that I don’t lose any logs in future, for example through some backup of the Graylog indices.
How to recover the logs from an Elasticsearch cluster that failed because the disk was full.
Log rotation and archiving, or some other way to preserve old logs and retrieve them later.

Please help me understand what can be done so that Graylog rotates the logs, keeps and archives old logs, and so that a failed Elasticsearch cluster can be recovered.

The setup was running in a production environment, and now my job is at stake because of this event.

Any help and suggestion would be highly appreciated.

Thanks and Regards
Ankit Sharma

jochen · March 29, 2017, 7:31am

Well, nobody can take the burden to properly monitor your production-critical systems off of you.
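As one small piece of that monitoring, here is a minimal sketch of a disk-usage check that could run from cron or be wrapped by an existing monitoring system. The function names and the 80% threshold are assumptions made for illustration, not anything Graylog ships; the path should point at the filesystem that holds the Elasticsearch data directory.

```python
import shutil


def disk_usage_percent(path):
    """Return the used share of the filesystem holding `path`, in percent."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def check_disk(path, warn_at=80.0):
    """Return an (ok, message) pair; `warn_at` is an assumed threshold."""
    percent = disk_usage_percent(path)
    ok = percent < warn_at
    state = "OK" if ok else "WARNING"
    message = f"{state}: {path} is {percent:.1f}% full (threshold {warn_at:.0f}%)"
    return ok, message


if __name__ == "__main__":
    # "/" is a placeholder; use the Elasticsearch data path on a real node.
    ok, message = check_disk("/")
    print(message)
```

The point is only to get a warning well before the disk is full, since by the time Elasticsearch stops writing the indices may already be damaged.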

This being said, Graylog Enterprise has a comprehensive story for archiving logs from the active cluster.

ankit · March 29, 2017, 7:50am

I just want to know whether Graylog, along with Elasticsearch and all the data, can be backed up and restored.

If yes, what would be the best way to do it without data loss?

Thanks and Regards
Ankit Sharma

jochen · March 29, 2017, 8:11am

Sure, just back up MongoDB and the Elasticsearch indices with the appropriate backup (or rather snapshot) tools.
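A rough sketch of what that could look like, using the Elasticsearch snapshot REST API and `mongodump`. The Elasticsearch address, the repository and snapshot names, the backup paths, the `graylog_*` index pattern, and the `graylog` database name are all assumptions for illustration; check them against your own setup. The helpers only build the requests, and `send` is an optional way to submit them.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumed Elasticsearch address


def register_repo_request(repo, location):
    """Build PUT /_snapshot/<repo> for a shared-filesystem repository."""
    body = {"type": "fs", "settings": {"location": location}}
    return "PUT", f"{ES_URL}/_snapshot/{repo}", body


def create_snapshot_request(repo, snapshot, indices="graylog_*"):
    """Build PUT /_snapshot/<repo>/<snapshot> for the given index pattern."""
    url = f"{ES_URL}/_snapshot/{repo}/{snapshot}?wait_for_completion=true"
    return "PUT", url, {"indices": indices}


def restore_snapshot_request(repo, snapshot, indices="graylog_*"):
    """Build POST /_snapshot/<repo>/<snapshot>/_restore."""
    url = f"{ES_URL}/_snapshot/{repo}/{snapshot}/_restore"
    return "POST", url, {"indices": indices}


def mongodump_command(out_dir, db="graylog"):
    """Build the mongodump argv; 'graylog' is an assumed database name."""
    return ["mongodump", "--db", db, "--out", out_dir]


def send(method, url, body):
    """Submit one of the requests above and return the decoded JSON reply."""
    request = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        method=method,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


if __name__ == "__main__":
    # Print the planned steps instead of running them.
    print(register_repo_request("graylog_backup", "/mnt/backups/es"))
    print(create_snapshot_request("graylog_backup", "snapshot_1"))
    print(" ".join(mongodump_command("/mnt/backups/mongo")))
```

Two things to keep in mind: the filesystem location has to be listed under `path.repo` in `elasticsearch.yml` on every node and be reachable from all of them, and a snapshot cannot be restored over an index that is still open, so close or delete the existing index first. The backup target should also sit on a different disk than the Elasticsearch data.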
