I am running 3 Elasticsearch Nodes in a Cluster with 2 Graylog Servers.
We have various web servers whose individual logs we want to send to the first Graylog node.
The second Graylog node should handle our domain controller logs, which come in at about 2000 msg/s.
We have to store this large amount of data for half a year.
When I activate the input for the DC logs on the second node, the load on the Elasticsearch cluster gets very heavy (4 CPU, 16 GB RAM, 500 GB SSD storage each).
I am therefore looking for a way to archive the DC logs. We do not need to search and filter them every day, and they should not affect the other "hot" logs on the first node.
Is there any chance to build such an environment? Do I need the Graylog Enterprise version? If yes, only on the second node? Can I store the logs directly on cheap disk space? Will they then still consume Elasticsearch cluster performance?
The architecture and tuning depend on many small parts and should be planned very carefully.
Do you have those two Graylog nodes in a cluster, using the same MongoDB and the same Elasticsearch?
Since Elasticsearch stores the log messages themselves, you need enough storage on those nodes (1.5 TB total does not look like enough for the volume you want to keep for 6 months), and you might want to look into a "hot/warm" setup where you place specific indices on specific nodes.
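A minimal sketch of such a hot/warm split, using Elasticsearch's shard allocation filtering. The attribute name `box_type` and the index pattern `dc-logs-*` are arbitrary examples, not anything from your setup:

```
# elasticsearch.yml on the fast SSD nodes
node.attr.box_type: hot

# elasticsearch.yml on the cheap-disk nodes
node.attr.box_type: warm

# Pin the DC log indices to the warm nodes (Elasticsearch REST API):
PUT dc-logs-*/_settings
{
  "index.routing.allocation.require.box_type": "warm"
}
```

With this in place the heavy, rarely-queried DC indices live on the cheap disks while the "hot" web server indices stay on SSD, so they do not compete for the same I/O.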
Think about the scale you want to store. At 2000 msg/s you accumulate roughly 31,104,000,000 messages in six months (2000 × 86,400 s/day × 180 days). A Windows event log message can be up to 32,766 characters, so the theoretical worst case is about 1 PB for the Windows event log alone; even at a more realistic few hundred bytes per message you are looking at tens of terabytes, if you store all messages.
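The back-of-the-envelope math above, as a quick sketch. The 500 B and 2,000 B averages are assumed illustrative sizes, not measured values; 32,766 is the Windows event message maximum, so it is an upper bound, not a typical size:

```python
# Storage estimate for 6 months of DC logs at a constant 2000 msg/s
# (the rate from the question). Real indexed size also depends on
# Elasticsearch overhead, replicas, and compression.
MSG_PER_SEC = 2000
SECONDS_PER_DAY = 86_400
DAYS = 180  # ~6 months

messages = MSG_PER_SEC * SECONDS_PER_DAY * DAYS
print(f"messages: {messages:,}")  # 31,104,000,000

# typical / verbose / worst-case message sizes in bytes (assumed)
for avg_bytes in (500, 2_000, 32_766):
    tb = messages * avg_bytes / 1e12
    print(f"avg {avg_bytes:>6} B/msg -> ~{tb:,.0f} TB raw")
```

Even the optimistic 500 B average lands around 16 TB raw, an order of magnitude more than the 1.5 TB the cluster currently has.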