I am a beginner with Graylog and I would like to know: if I use Graylog with Elasticsearch, do I need twice as much disk space?
For example, if I handle 10 GB of log data using Graylog and Elasticsearch, will it take 20 GB of disk space or only 10 GB? (10 GB used by Graylog and 10 GB used by Elasticsearch)
The short story is that Graylog stores its own information and settings in MongoDB, but not the messages you are sending in. Once Graylog has finished processing a message, it is sent to Elasticsearch for storage and future retrieval. So the messages are stored once, in Elasticsearch, rather than once in each.
You can search the community for sizing recommendations to find information that is relevant to your installation (future installation?). Here is an example. There are many factors to take into consideration, from clustering to shards in Elasticsearch/OpenSearch, and so on. OF NOTE: If you are creating a new Graylog instance, it is likely better to start with OpenSearch, as that seems to be the current Graylog direction.
I only have a small installation, so it is hard for me to help at that level. Perhaps @gsmith has some detail to add… he has some pretty cool setups!
Graylog can handle a lot; I have seen 33,000+ messages per second. This depends on how the Graylog cluster is set up and the resources it is given. Kafka, Nginx, or another load balancer could be placed in front of a cluster. Again, this depends on the environment.
Of course SSDs will work a lot better; the question is "is it necessary?". If you have the budget, I would roll out a couple of 4 TB SSDs and RAID them. 1 TB a week is a lot, and the total depends on how long you want to keep the data: with 30 days of retention at 1 TB a week, that is roughly 4 TB a month, plus you need room for your OS. I would suggest keeping the Elasticsearch indices on their own volume.
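To make that arithmetic easy to redo with your own numbers, here is a rough sketch in Python. The function name and the `replicas` and `overhead` parameters are my own assumptions for illustration, not anything from Graylog or Elasticsearch; the real on-disk size depends on your mappings, compression, and replica settings, so treat the result as a ballpark only.

```python
def estimate_index_storage_tb(weekly_ingest_tb, retention_days,
                              replicas=0, overhead=1.0):
    """Ballpark disk space (in TB) for the Elasticsearch/OpenSearch indices.

    weekly_ingest_tb: how much log data arrives per week, in TB
    retention_days:   how long messages are kept before deletion
    replicas:         replica copies per shard (assumption: 0 on a single node)
    overhead:         fudge factor for indexing overhead or compression
                      (assumption: 1.0 means on-disk size equals raw size)
    """
    if weekly_ingest_tb < 0 or retention_days < 0 or replicas < 0:
        raise ValueError("inputs must be non-negative")
    # Raw data held at any one time: ingest rate times retention window.
    raw_tb = weekly_ingest_tb * retention_days / 7
    # Each replica is a full extra copy of the primary data.
    return raw_tb * (1 + replicas) * overhead


if __name__ == "__main__":
    # The case discussed above: 1 TB a week kept for 30 days.
    print(round(estimate_index_storage_tb(1, 30), 2))  # 4.29
```

With one replica the same case comes out at about 8.6 TB, which is why replica settings matter more for sizing than the Graylog/MongoDB side. The OS and MongoDB space come on top of this figure.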