Will messages be lost when changing from standalone to a cluster?

There is only one standalone node in our environment right now. I am upgrading Graylog, MongoDB and Elasticsearch because they are very old. Here is my plan:

  1. Clone the production server graylog01 to graylog02
  2. Upgrade Graylog to 3.3, MongoDB to 4.4, and Elasticsearch to 6.0 on graylog02
  3. Change the rsyslog settings on the servers to send their logs to graylog02
  4. Upgrade graylog01
  5. Join graylog01 and graylog02 into a cluster

My question is: if I do it this way, will the data be messy or lost because the two nodes are not synchronized with each other? Or can they be synchronized after the cluster is set up?

Thank you in advance!

Why are you clustering? Do you need redundancy or capacity, and on the front end or the back end?

MongoDB and Elasticsearch both recommend against building a cluster with fewer than 3 nodes. So if you are going from a standalone to a “cluster”, perhaps consider going to 3 nodes.
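The 3-node advice comes down to majority arithmetic: both MongoDB replica sets and Elasticsearch master election need a majority of voting members to keep operating. A minimal Python sketch of that arithmetic follows; the function names are made up for illustration and are not part of either product.

```python
def majority(members: int) -> int:
    """Votes needed for a strict majority among `members` voting nodes."""
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """How many nodes can fail while a majority is still reachable."""
    return members - majority(members)


# Compare cluster sizes: nodes, majority needed, failures tolerated.
for n in (1, 2, 3):
    print(n, majority(n), tolerated_failures(n))
```

A 2-node cluster needs both nodes for a majority, so it tolerates no more failures than a single node does. Three nodes is the smallest size that survives losing one.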

Based on your needs, there may be other options as well.

I am clustering for redundancy. When I was upgrading MongoDB and Elasticsearch, I found that some data was lost during the upgrade. With a cluster, I can do a rolling upgrade.

Cool… so for input redundancy, you technically only need a Graylog cluster. ES can still be a single node. I do not recommend this, but it would work.

This would work mainly because if you need to upgrade ES (which is a single node), the messages will queue up on Graylog until the upgrade is complete. So in essence, you don’t lose a message, but you cannot search for anything during that time either. If that is an acceptable scenario, then just build the Graylog cluster with a MongoDB replica set, and point it at the ES node on the back end. Optionally, put a load balancer in front of the Graylog cluster.
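The queueing behaviour described above can be pictured with a toy model. This is only an illustrative Python sketch, not Graylog's actual journal: the class name and methods are hypothetical, and it ignores any size or age limits a real on-disk journal would have.

```python
from collections import deque


class ToyJournal:
    """Toy stand-in for a buffer that holds messages while the back end is down."""

    def __init__(self):
        self.pending = deque()  # received but not yet indexed
        self.indexed = []       # written to the (pretend) search back end

    def receive(self, message, backend_up):
        # Every message is queued first; it is only indexed if the back end is up.
        self.pending.append(message)
        if backend_up:
            self.flush()

    def flush(self):
        # Drain the queue in arrival order once the back end is reachable.
        while self.pending:
            self.indexed.append(self.pending.popleft())


journal = ToyJournal()
journal.receive("m1", backend_up=True)
journal.receive("m2", backend_up=False)  # back end being upgraded
journal.receive("m3", backend_up=False)
print(journal.indexed)   # only m1 is searchable during the outage
journal.flush()          # back end is up again
print(journal.indexed)   # m2 and m3 arrive late but are not lost
```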

Obviously, any single node is not redundant if something were to happen to the hardware or system that it is running on, so if full true redundancy is something you want, you’ll need to make both redundant by clustering them.

So a better (and more forward-thinking) solution in your case, based on my understanding, is to set up a 3-node Graylog cluster backed by a MongoDB replica set, plus a 3-node Elasticsearch cluster, and then perhaps put a load balancer in front of the Graylog cluster.
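To make that layout concrete, here is a rough Python sketch that prints the cluster-related lines each Graylog node's server.conf might contain. The host names (graylog03, es01 to es03), the replica set name rs01, and the default ports are assumptions for illustration. The setting names is_master, mongodb_uri and elasticsearch_hosts are what I believe Graylog 3.x uses, so check them against the documentation for your version.

```python
# Hypothetical host names and replica set name, for illustration only.
GRAYLOG_NODES = ["graylog01", "graylog02", "graylog03"]  # also run MongoDB here
ES_NODES = ["es01", "es02", "es03"]
REPLICA_SET = "rs01"


def mongodb_uri(hosts, database="graylog", replica_set=REPLICA_SET, port=27017):
    """Build a MongoDB connection URI that lists every replica set member."""
    host_list = ",".join(f"{h}:{port}" for h in hosts)
    return f"mongodb://{host_list}/{database}?replicaSet={replica_set}"


def elasticsearch_hosts(hosts, port=9200):
    """Build a comma-separated list of Elasticsearch HTTP endpoints."""
    return ",".join(f"http://{h}:{port}" for h in hosts)


def server_conf(node, master):
    """Return the cluster-related settings for one Graylog node."""
    return "\n".join([
        # Exactly one node in the cluster should be the master.
        f"is_master = {'true' if node == master else 'false'}",
        f"mongodb_uri = {mongodb_uri(GRAYLOG_NODES)}",
        f"elasticsearch_hosts = {elasticsearch_hosts(ES_NODES)}",
    ])


for node in GRAYLOG_NODES:
    print(f"# {node}")
    print(server_conf(node, master="graylog01"))
```

Every node gets the same MongoDB and Elasticsearch settings; only the master flag differs, and it is true on one node only.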

Also, you can have the same 3 hosts simultaneously be part of both the Graylog/MongoDB cluster and the ES Cluster, but unless you are tight on resources… I’m not a fan of this design.