Multi Data Center Architecture with AWS


Hi all. It’s come time for us to set up logging in a new build we’re doing that’s spread across 2 data centers. After a lot of googling, I haven’t found a way of running a Graylog setup across 2 DCs, at least not easily, because Elasticsearch doesn’t really support this kind of setup (cross-DC clustering). Elasticsearch cross-cluster search looked promising, but I couldn’t find much info on it in relation to Graylog, and unfortunately running 1 standalone Graylog setup in each data center isn’t an option.

In our case, we have dual 1Gbps Direct Connects into AWS at each data center and a single 1Gbps dedicated link between the 2 DCs. So I was looking into a hybrid setup, where we run our Elasticsearch cluster in our VPC whilst keeping Graylog on prem. Kind of like below:

Initially a single Graylog instance in each DC would be enough for us, but it will be behind a load balancer from the start, so adding more instances should be straightforward. Graylog and MongoDB will be clustered across DCs, with a third MongoDB instance in AWS to keep an odd number of members.
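A minimal sketch of why the third MongoDB member matters, assuming one voting member per site. The site names `dc1`, `dc2` and `aws` are placeholders for illustration. A replica set needs a strict majority of voting members to elect a primary, so the arithmetic looks roughly like this:

```python
def majority(voting_members: int) -> int:
    """Votes needed to elect a primary in a replica set of this size."""
    return voting_members // 2 + 1


def survives_loss(members_per_site: dict[str, int], lost_site: str) -> bool:
    """True if the sites that remain still hold a majority of all votes."""
    total = sum(members_per_site.values())
    remaining = total - members_per_site[lost_site]
    return remaining >= majority(total)


# Layout described in the post: one member per DC plus a tie-breaker in AWS.
# Site names are hypothetical.
three_sites = {"dc1": 1, "dc2": 1, "aws": 1}

# Same thing without the AWS member, for comparison.
two_sites = {"dc1": 1, "dc2": 1}

for site in three_sites:
    print(f"three sites, lose {site}: primary possible =",
          survives_loss(three_sites, site))

print("two sites, lose dc1: primary possible =",
      survives_loss(two_sites, "dc1"))
```

With three sites, losing any single one still leaves 2 of 3 votes. With only the two DCs, losing either leaves 1 of 2, which is not a majority, so the surviving member could not become primary on its own.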

The end goal is to have resources in each DC send their logs to the local Graylog instance, which in turn sends the data to the shared Elasticsearch cluster. A user can then log into the web interface in either data center and search logs from both DCs, and we don’t have to flood the link between the data centers by sending logs from DC2 to DC1, for example. Hopefully this setup gives us better availability and scalability, since we can easily add nodes as required.

Before I get too far down the garden path, is something like this feasible, or are there better ways of doing it? Ideally, I’d like to keep it all on prem, but that doesn’t seem (easily) possible with Elasticsearch, so putting just Elasticsearch and MongoDB in AWS whilst keeping Graylog on prem seems to be the most economical way I can think of to do it… if it’s possible.

Would love some feedback :slight_smile: Thanks!

(Jan Doberstein) #2

You should measure the traffic you will have between the servers in your environment, and check whether the link is shared with all the other applications you run.
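As a rough back-of-the-envelope sketch of that measurement: the message rate, average message size and overhead factor below are made-up placeholders, not figures from this thread. Only the 1Gbps link capacity comes from the original post. Real numbers should come from measuring your own inputs.

```python
def required_mbps(msgs_per_sec: float, avg_bytes: float,
                  overhead: float = 1.5) -> float:
    """Estimated bandwidth in Mbit/s for a given log volume.

    `overhead` is an assumed multiplier for protocol and indexing
    overhead; tune it against real measurements.
    """
    return msgs_per_sec * avg_bytes * 8 * overhead / 1_000_000


def utilisation(mbps: float, link_mbps: float = 1000.0) -> float:
    """Fraction of the link consumed (1000 Mbit/s = the 1Gbps link)."""
    return mbps / link_mbps


# Hypothetical load: 5,000 messages/s at 500 bytes each.
needed = required_mbps(msgs_per_sec=5000, avg_bytes=500)
print(f"needed: {needed:.1f} Mbit/s")
print(f"share of a 1Gbps link: {utilisation(needed):.1%}")
```

If the link is shared with other applications, compare the result against the headroom that is actually left rather than the full 1Gbps.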

Cross-cluster search or some kind of federation is currently not possible with Graylog. It might be a future feature, but do not expect it within a few weeks.


Thanks Jan :grinning:

Just for an update, I got this all working. Elasticsearch is running in AWS. I used Docker for this and built a base AMI with Packer so it can run in an autoscaling group behind an ALB, which seems to work well. New nodes join the cluster automatically.
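One way to confirm that new nodes really have joined is to poll Elasticsearch’s `_cluster/health` endpoint through the ALB. This is only a sketch: the ALB URL is a placeholder, the sample payload is illustrative rather than real output, and it assumes the endpoint is reachable without authentication.

```python
import json
from urllib.request import urlopen


def fetch_health(base_url: str, timeout: float = 5.0) -> dict:
    """GET <base_url>/_cluster/health and return the parsed JSON body."""
    with urlopen(f"{base_url}/_cluster/health", timeout=timeout) as resp:
        return json.load(resp)


def cluster_ok(health: dict, expected_nodes: int) -> bool:
    """True if the cluster reports green and at least the expected node count."""
    return (health.get("status") == "green"
            and health.get("number_of_nodes", 0) >= expected_nodes)


# Illustrative payload only (not captured from a real cluster); the
# `status` and `number_of_nodes` fields are part of the health response.
sample = {"cluster_name": "graylog", "status": "green", "number_of_nodes": 3}
print(cluster_ok(sample, expected_nodes=3))

# Against a live cluster (hypothetical ALB hostname):
# print(cluster_ok(fetch_health("http://es-alb.example.internal:9200"), 3))
```

Running this after a scale-out event, with `expected_nodes` set to the new desired capacity, gives a quick yes/no on whether the autoscaled node made it into the cluster.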

I ended up moving MongoDB to AWS entirely rather than running it on prem. I also used Docker and Packer for MongoDB to create a 3-node replica set. Unfortunately, creating the replica set is still a manual step after provisioning the instances; I’ll work on automating this at a later date.

For Graylog, I’m running 4 nodes in total, 2 in each data center behind a load balancer. I didn’t end up using Docker for the Graylog component; a standard install worked just fine.

So now each DC can send logs to its local Graylog load balancer VIP for ingestion and indexing, and I can log on to any of the 4 web interfaces and search all of the logs from both DCs, which is exactly what I wanted.

For now I’m just using 3x t2.medium for Elasticsearch and 3x t2.nano for MongoDB. Next I’ll start setting up logging on all our servers and devices to load test it; these instance sizes will almost certainly increase.
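For a quick synthetic load test before real devices are wired up, something like the sketch below could push GELF messages at a Graylog VIP over UDP. It assumes a GELF UDP input has been created on the Graylog nodes (the thread doesn’t say which input types are in use), that it listens on the default GELF port 12201, and the VIP hostname and source name are placeholders.

```python
import json
import socket
import time


def build_gelf(host: str, short_message: str, level: int = 6, **extra) -> bytes:
    """Build one uncompressed GELF 1.1 message as UTF-8 JSON bytes."""
    payload = {
        "version": "1.1",
        "host": host,
        "short_message": short_message,
        "timestamp": time.time(),
        "level": level,  # syslog severity; 6 = informational
    }
    # GELF additional fields must be prefixed with an underscore.
    payload.update({f"_{key}": value for key, value in extra.items()})
    return json.dumps(payload).encode("utf-8")


def send_burst(vip: str, port: int, count: int, source: str) -> None:
    """Send `count` small GELF messages to vip:port over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for i in range(count):
            msg = build_gelf(source, f"load test message {i}", seq=i)
            sock.sendto(msg, (vip, port))


# Hypothetical usage against the DC1 load balancer VIP:
# send_burst("graylog-vip.dc1.example", 12201, 1000, "loadtest-01")
```

UDP is fire-and-forget, so the sender won’t notice dropped messages. Comparing the count sent with what shows up in a Graylog search for the source name is a simple way to see where the instances start to struggle.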

Happy to share my docker/packer/ansible/terraform files if anyone is interested in a similar setup.

(Brandon Cruz) #4

Hi sirfraz, I am looking for a similar setup. Would you be willing to share your setup files?



Sure, no worries. I’ll get them posted somewhere in the next couple of days when I’m back at the office.

(Brandon Cruz) #6

Great! Let me know when available.


Hey there, I haven’t forgotten, I promise :slight_smile:

I’ll try and get these up on GitHub this week; I just need to sanitise the configs.

Sorry for the delay.

(Brandon Cruz) #8

No worries. Thanks again for sharing!


Hey, finally got them uploaded:

You’ll need to modify them to suit your needs. The only thing I couldn’t automate (but I’m sure there is a way) was initialising the MongoDB replica set. I had to log on and do that manually after building. It seems to be running fine for the moment, but I’m sure it will need tweaking. It’s still only being tested.
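One possible way to script the replica set step, sketched in Python rather than taken from the uploaded files: connect directly to one freshly provisioned member and send the `replSetInitiate` admin command. This assumes the third-party `pymongo` driver is installed, that every `mongod` was started with the same replica set name, and that no authentication is enabled yet. The set name and hostnames are placeholders.

```python
def build_replset_config(name: str, hosts: list[str]) -> dict:
    """Build the config document passed to replSetInitiate."""
    return {
        "_id": name,
        "members": [{"_id": i, "host": h} for i, h in enumerate(hosts)],
    }


def initiate(seed_host: str, config: dict) -> None:
    """Initiate the replica set via one member; a no-op if already done."""
    # Imported here so the config helper works without pymongo installed.
    from pymongo import MongoClient
    from pymongo.errors import OperationFailure

    # directConnection=True talks to this one node even though it is not
    # yet part of an initialised replica set.
    client = MongoClient(seed_host, directConnection=True,
                         serverSelectionTimeoutMS=5000)
    try:
        client.admin.command("replSetInitiate", config)
    except OperationFailure as exc:
        # Code 23 is AlreadyInitialized in MongoDB's error code list,
        # which makes re-running the script harmless.
        if exc.code != 23:
            raise
    finally:
        client.close()


# Hypothetical set name and hosts:
config = build_replset_config(
    "rs0",
    ["mongo1.example:27017", "mongo2.example:27017", "mongo3.example:27017"],
)
print(config)
# initiate("mongo1.example:27017", config)
```

Since the instances are already provisioned with other tooling, a script like this could be run once as a final step after all three nodes are up.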

(Brandon Cruz) #10

Awesome! Thanks so much! Great work!


No worries :slight_smile: If you see anything that can be done better, let me know. This is the first time I’ve played with any of this stuff; it was hacked together from a bunch of different articles I read, plus trial and error.

