Elasticsearch Index Size Recommendation

My setup:

  • Elasticsearch VM: 8 vCPU; 32 GB RAM; 2 TB storage
  • Graylog + MongoDB VM: 8 vCPU; 16 GB RAM; 200 GB storage

I will be managing the index rotation strategy based on index size. Does anyone have recommendations for the optimal index size? Is the default value (1 GB) adequate for a total of 2 TB of storage?

Assuming you chose size-based rotation for a reason, I would ask how much of your 2 TB you want to dedicate to this index. Will you have only one index and throw all your inputs into it (not recommended)? Or will you have multiple indices for the various inputs, and need to leave space for them?

Typically this question can be answered by asking: “What am I logging, and what retention level and searchability do I want?”

Some systems generate small or infrequent logs, so a 20 GB index could hold a week’s or a month’s worth of data. For others, 20 GB could be a day or an hour.

At 20 GB per index, with the default retention policy of 20 indices, you’ll have about 400 GB of logs (20 × 20 GB) for that one index/source.
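
Here is that arithmetic as a rough Python sketch. The retention count of 20 is what I understand the Graylog default to be, and the daily ingest figure is a made-up example, so plug in your own numbers. The function names are just for illustration.

```python
def retained_gb(index_size_gb: float, max_indices: int) -> float:
    """Upper bound on disk used by one index set at full retention."""
    return index_size_gb * max_indices


def days_per_index(index_size_gb: float, daily_ingest_gb: float) -> float:
    """Roughly how many days of logs fit in one index before it rotates."""
    return index_size_gb / daily_ingest_gb


index_size_gb = 20
max_indices = 20       # assumed default retention count
daily_ingest_gb = 5    # hypothetical ingest rate, replace with your own

total_gb = retained_gb(index_size_gb, max_indices)
per_index_days = days_per_index(index_size_gb, daily_ingest_gb)

print(f"Disk used at full retention: {total_gb} GB")
print(f"Days of logs per index: {per_index_days:.1f}")
print(f"Total retention window: {per_index_days * max_indices:.0f} days")
```

If the retention window that comes out is shorter than what you need, raise the index size or the number of indices and check that the total still fits on disk.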

I would just create an index with a size that makes sense to you and adjust it if needed. If you create an index and check it in a day and it still is not full, make a decision. You can modify most of the settings for an index after creation.
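
If you would rather check how full the indices are from a script than from the Graylog UI, something like the sketch below should work against Elasticsearch's `_cat/indices` API. The node address and the `graylog_` index prefix are assumptions about your setup, and the helper names are mine.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumed address of the Elasticsearch node


def fetch_indices(es_url: str = ES_URL) -> list:
    """Fetch per-index stats as JSON, with sizes reported in bytes."""
    url = f"{es_url}/_cat/indices?format=json&bytes=b&h=index,store.size,docs.count"
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def summarize(rows: list, prefix: str = "graylog_") -> list:
    """Return sorted (index name, size in GB) pairs for indices matching the prefix."""
    out = []
    for row in rows:
        if row["index"].startswith(prefix):
            # store.size can be null (for example on closed indices); treat it as 0
            size_bytes = int(row.get("store.size") or 0)
            out.append((row["index"], round(size_bytes / 1024**3, 2)))
    return sorted(out)
```

Running `summarize(fetch_indices())` a day after creating the index set shows how quickly each index is filling, which is usually enough to decide whether the size limit needs adjusting.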

But… to answer your question, I like the 15-20GB size range… think it has something to do with something I read somewhere at some point.

Just some tips.
The number of indices is important; each one uses memory…
The number of shards is important too… (maybe more important)
I suggest using Google to find some Elasticsearch documentation, whitepapers, or case studies. It can help a lot.
Or, if you prefer not to use Google, use the community search. We have shared a lot of docs on that topic.

We can’t answer it, because it depends on your goals and resources, so you have to test it in your environment.

Hey @H2Cyber

I would read: https://www.elastic.co/blog/how-many-shards-should-i-have-in-my-elasticsearch-cluster

This should give you some guidance on the optimal size per index/shard.

Thanks @macko003 and @jan for the tips.

After reading up on this, I think I will use the following setup on the dedicated Elasticsearch node:

  • 15 GB of the total RAM will be dedicated to JVM heap
  • 200 GB of disk space will be used for the Elasticsearch node's OS
  • 1 TB of disk space will be dedicated as storage space, in a separate /data partition
  • Index size = 12 GB
  • Number of shards per index = 1 (as this is a single-node setup)
  • Max number of indices = 50 (about 600 GB at max retention, leaving about 40% of the 1 TB data partition free)
  • Replica = 0 (as it is a single node)
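
A few lines of Python are a quick way to sanity-check the disk numbers in this plan. This is only a back-of-the-envelope sketch: it treats 1 TB as 1000 GB and ignores filesystem overhead and the fact that an index can overshoot its size limit slightly before it rotates. The function names are mine, not anything from Graylog or Elasticsearch.

```python
def disk_budget(partition_gb: float, index_size_gb: float, max_indices: int) -> dict:
    """Disk used and left free on the data partition at max retention."""
    used_gb = index_size_gb * max_indices
    free_gb = partition_gb - used_gb
    return {
        "used_gb": used_gb,
        "free_gb": free_gb,
        "free_pct": 100 * free_gb / partition_gb,
    }


def max_indices_for(partition_gb: float, index_size_gb: float, free_pct: float) -> int:
    """Largest index count that still leaves free_pct percent of the partition unused."""
    usable_gb = partition_gb * (100 - free_pct) / 100
    return int(usable_gb // index_size_gb)


plan = disk_budget(partition_gb=1000, index_size_gb=12, max_indices=50)
print(plan)

# How many 12 GB indices would fit while keeping 30% of the partition free
print(max_indices_for(partition_gb=1000, index_size_gb=12, free_pct=30))
```

With replicas set to 0 there is no multiplier to apply. On a multi-node setup the used space would need to be multiplied by (1 + number of replicas).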

Hope this makes sense. Would welcome you guys’ infinite wisdom on this setup :)
