Thanks in advance for your support. I have been using Graylog 2.x very happily at my company, where it serves as a company-wide log monitoring system on AWS. I think Graylog is a well-made and really nice program. I am also a newbie in this community.
Background
Elasticsearch has a kind of soft limit on its cluster size.
The Elasticsearch folks have said that roughly 150~200 data nodes is the practical maximum because of their gossip overhead; I am running 150 data nodes now.
So I think the best way to scale out Elasticsearch is to run multiple Elasticsearch clusters rather than adding more nodes to a single one.
But it seems that Graylog currently supports only one Elasticsearch cluster.
Question
Do you have any plans for supporting multiple Elasticsearch clusters, or for Elasticsearch’s Cross Cluster Search feature?
That said, the ‘Cross Cluster Search’ feature of Elasticsearch is still in beta and subject to change, so I think it is too unstable to rely on. So, in my opinion, the best way to support multiple Elasticsearch clusters would be for Graylog to implement its own mechanism, leveraging its MongoDB config store. It would be really great if I could use a different Elasticsearch cluster per stream.
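For reference, here is a minimal sketch of what relying on Cross Cluster Search would look like over the Elasticsearch REST API, purely illustrative and not something Graylog does today. The host names and the "logs_b" cluster alias are placeholders, and on the ES 5.x series (where CCS is still beta) the setting key is "search.remote.*", while later versions rename it to "cluster.remote.*":

```python
# Illustrative only: wiring up Cross Cluster Search by hand via the ES REST API.
# Host names and the "logs_b" cluster alias are placeholders, not real endpoints.
import requests

COORDINATOR = "http://es-cluster-a.example.internal:9200"  # assumed coordinating node

# 1) Register the second cluster as a remote, pointing at a few of its seed nodes
#    (transport port 9300). On ES 5.x the key is "search.remote.*"; newer
#    versions use "cluster.remote.*" instead.
requests.put(
    COORDINATOR + "/_cluster/settings",
    json={
        "persistent": {
            "search.remote.logs_b.seeds": [
                "es-cluster-b-node1.example.internal:9300",
                "es-cluster-b-node2.example.internal:9300",
            ]
        }
    },
)

# 2) Query local and remote indices in a single search by prefixing the remote
#    index pattern with the cluster alias.
resp = requests.get(
    COORDINATOR + "/graylog_*,logs_b:graylog_*/_search",
    json={"query": {"match": {"message": "error"}}},
)
print(resp.json()["hits"]["total"])
```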
But only kind of. May I ask how much data you have on each host, and whether you really need that data to be ‘online’ and available for search?
I ask because if you need to hold the data not for search, but to suit local rules, it might be cheaper to buy the Archiving enterprise plugin and store the indices on cheap storage.
I read the GitHub page you linked, and it does seem related to my issue. Both aim for federated search, right? But it suggests a multi-Graylog architecture, not multiple Elasticsearch clusters in one Graylog cluster.
In my case, it has always been Elasticsearch, not Graylog, that runs into scaling problems when there are tons of logs; simply scaling out has been enough for Graylog, and it works really well. So, IMHO, it would be better to consider supporting multiple Elasticsearch clusters in one Graylog cluster than a multi-Graylog architecture. This is just my suggestion, so please don’t feel bad.
So I think multi-Elasticsearch support is worth giving a shot.
Here are some details about my situation for your understanding (a rough back-of-the-envelope calculation follows the list).
~100,000 logs per second on average
~1 KiB per log on average
10 Graylog nodes
150 ES data nodes + 3 ES master nodes
4 TiB SSD per ES data node (so 4 TiB × 150 = 600 TiB in total)
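To put those figures together, here is a rough sketch of the raw numbers only; it ignores replicas and index overhead, so actual retention is shorter:

```python
# Back-of-the-envelope ingest and retention math for the figures above.
# Raw data only: replicas and Elasticsearch index overhead are not counted.
logs_per_sec = 100_000
bytes_per_log = 1024              # ~1 KiB per log
total_storage_tib = 4 * 150       # 4 TiB SSD per data node x 150 nodes = 600 TiB

tib_per_day = logs_per_sec * bytes_per_log * 86_400 / 2**40
print(f"~{tib_per_day:.1f} TiB of raw log data per day")                # ~8.0 TiB/day
print(f"~{total_storage_tib / tib_per_day:.0f} days of raw retention")  # ~75 days
```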
I am holding all of this data for real-time search, not for local rules. The Graylog system is now being used as a centralized logging system on AWS in my company, and I already have another system that archives all of these logs to S3. Thanks for the enterprise suggestion, though.