Does anyone know if it’s safe to snapshot a running Graylog VM under VMware?
I know I can back up the individual components (files, MongoDB, Elasticsearch), but that means quite an effort if I have to restore the server (create a new one, restore each piece while keeping my fingers crossed, etc…). Restarting from a snapshot is much easier, as long as the snapshot is consistent, of course.
Suggestions? Experience? Thoughts?
I have to agree 100%. For testing in our lab environment I do that consistently. Sometimes I have multiple snapshots on a VM when I’m testing new software. NOTE: make sure you remove them when you’re done. I found that later on, if something goes sideways, those snapshots/checkpoints can create problems. VM snapshots take up significant space, so you should limit yourself to 2 or 3 of them. Be careful using them on production VMs. A VM with one or more snapshots usually performs worse, because you are doubling IOPS and there is CPU overhead in calculating the block-level difference.
In my experience, MS Hyper-V and VMware are pretty much the same. Click create Checkpoint/Snapshot. It should be done in a minute or two. We also have software that backs up VMs once a day and then does a full backup on the weekend. If the snapshot/checkpoint is corrupt or does not work, I’m able to re-install the full virtual machine in 5+ minutes.
Hope that helps
My concern is about database consistency. I know that proper backup solutions use VMware Tools functionality or custom agents to stop/pause database activity before performing the snapshot/backup. I fear that taking a snapshot while the server is receiving thousands of messages per minute may lead to a broken database.
Is anyone doing something special to avoid that?
Which supporting software are you worried about, MongoDB or Elasticsearch? If MongoDB, it only serves as a config store for Graylog. It’s not (or at least shouldn’t be) receiving thousands of messages a second. Elasticsearch on the other hand, that does receive quite a few messages. The only point where I think you’d be REALLY concerned about data consistency/corruption with either Mongo or Elastic is if you’ve configured them in such a way that you’re going to end up with a split brain scenario.
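For anyone who does want to pause database activity right before the snapshot, here is a minimal Python sketch of the idea. It assumes an unauthenticated Elasticsearch at http://localhost:9200 and a local MongoDB reachable with mongosh; the function names are made up for illustration. It uses the Elasticsearch flush API (POST /_flush) and MongoDB's db.fsyncLock() / db.fsyncUnlock(). Graylog cannot write to MongoDB while the lock is held, so keep the window short. Treat it as a starting point, not a tested backup procedure.

```python
import subprocess
import urllib.request

# Assumption: local, unauthenticated Elasticsearch. Adjust for your setup.
ES_URL = "http://localhost:9200"


def es_flush_request(base_url=ES_URL):
    """Build a POST /_flush request, asking Elasticsearch to commit buffered index data to disk."""
    return urllib.request.Request(f"{base_url}/_flush", method="POST")


def mongo_command(js):
    """Build a mongosh command line that evaluates one JavaScript expression."""
    return ["mongosh", "--quiet", "--eval", js]


def quiesce(run=subprocess.run, send=urllib.request.urlopen):
    """Flush Elasticsearch, then block MongoDB writes until resume() is called."""
    response = send(es_flush_request())
    if response is not None:
        response.close()
    run(mongo_command("db.fsyncLock()"), check=True)


def resume(run=subprocess.run):
    """Release the MongoDB write lock taken by quiesce()."""
    run(mongo_command("db.fsyncUnlock()"), check=True)
```

Usage would be: call quiesce(), take the VMware snapshot, then call resume() (ideally in a try/finally so the lock is always released). The run and send parameters exist only so the steps can be dry-run without touching a real server.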
I personally have no problems with snapshots/checkpoints. For instance, I receive 30 GB of logs a day and I want to upgrade Graylog to a newer version. I create a snapshot/checkpoint. When that’s done I proceed with my upgrade. If everything works out, I remove the snapshot/checkpoint. So far I have had no problems. The snapshot preserves the state and data of a virtual machine at a specific point in time. This includes the virtual machine’s power state.
I didn’t think about it, but I also agree with @aaronsachs’s statement concerning a split brain scenario.
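On the split brain point, one cheap sanity check before snapshotting is to look at the output of Elasticsearch's GET /_cluster/health and confirm that the node you ask sees every node you expect. A rough Python sketch, assuming you already fetched the JSON (for example with curl); safe_to_snapshot and expected_nodes are names I made up, and a clean result is only a hint, not proof that the cluster is not partitioned:

```python
import json


def safe_to_snapshot(health_json, expected_nodes):
    """Return True if the cluster health response looks sane for taking a snapshot.

    health_json is the body returned by GET /_cluster/health.
    expected_nodes is how many Elasticsearch nodes you believe the cluster has.
    Note: a single node with replicas configured reports "yellow", so relax
    the status check if that is your normal state.
    """
    health = json.loads(health_json)
    return (
        health.get("status") == "green"
        and health.get("number_of_nodes") == expected_nodes
    )
```

If a node reports fewer nodes than you expect, sort that out before snapshotting instead of freezing the problem into the snapshot.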