I have a question about Elasticsearch shards using up disk space. I’m using an all-in-one Graylog server with an index called “Default index set”.
It is configured as: Shards: 4, Replicas: 0, Index rotation strategy: Index Time, Rotation period: P1D (1d, a day), Index retention strategy: Delete, Max number of indices: 90. I did some research on Elasticsearch shards, as shown below:
“The disk of a single node may be too slow to serve search requests from a single node alone.
To solve this problem, Elasticsearch provides the ability to subdivide your index into multiple pieces called shards. Each shard is a fully-functional and independent “index” that can be hosted on any node in the cluster.”
• It allows you to horizontally split/scale your content volume
• It allows you to distribute and parallelize operations across shards (potentially on multiple nodes) thus increasing performance/throughput
• You may change the number of replicas dynamically anytime, but you cannot change the number of shards after-the-fact.
“Each document is stored in a single primary shard. When you index a document, it is indexed first on the primary shard, then on all replicas of the primary shard. By default, an index has 5 primary shards. You can specify fewer or more primary shards to scale the number of documents that your index can handle. You cannot change the number of primary shards in an index, once the index is created.”
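Here is a minimal sketch of how I read the passage above: the document ID is hashed and taken modulo the number of primary shards, so a given document would land in exactly one shard. This is only my interpretation. `zlib.crc32` is a stand-in for whatever routing hash Elasticsearch really uses, and the function names are made up for illustration.

```python
import zlib
from collections import Counter


def pick_shard(doc_id: str, num_primary_shards: int) -> int:
    """Map a document ID to exactly one shard number."""
    # zlib.crc32 is only a placeholder, NOT Elasticsearch's real routing hash.
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards


def distribute(doc_ids, num_primary_shards):
    """Count how many documents land on each shard."""
    return Counter(pick_shard(d, num_primary_shards) for d in doc_ids)


if __name__ == "__main__":
    ids = [f"msg-{i}" for i in range(1000)]  # hypothetical document IDs
    counts = distribute(ids, 4)
    print(dict(counts))          # documents per shard
    print(sum(counts.values()))  # each ID is counted once, so this equals len(ids)
```

If this reading is right, the shard counts should add up to the number of documents rather than to four times that number.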
So, if I have this correct, I have an index called “Default index set” with 4 shards. Does that mean my index “Default index set” is split into 4 sections to make searching and indexing easier?
What confuses me is that the Elasticsearch documentation states that each document is stored in a single primary shard. If I have 4 primary shards on one server, does this mean that one document is stored in all 4 shards (i.e. S1, S2, S3, S4) of “Default index set”?
If so, does that mean it’s duplicated across the 4 shards, reducing free disk space?
I don’t know everything about Elasticsearch shards, but I have a rough idea. Could someone enlighten me further, please?
If I configure an index with fewer shards, say 2 instead of 4, will this reduce disk space usage?
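To check this for myself, I think I could compare the on-disk size of each shard. Below is a rough sketch, assuming the rows come from `GET _cat/shards?format=json&bytes=b`. The sample rows, index names, and the function name are made up for illustration and are not output from my server.

```python
import json
from collections import defaultdict


def store_bytes_per_index(rows):
    """Sum the on-disk size (in bytes) of the primary shards of each index."""
    totals = defaultdict(int)
    for row in rows:
        # "prirep" is "p" for a primary shard and "r" for a replica.
        # "store" can be null for shards that are not assigned.
        if row["prirep"] == "p" and row.get("store") is not None:
            totals[row["index"]] += int(row["store"])
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical sample shaped like the _cat/shards JSON output.
    sample = json.loads("""
    [
      {"index": "graylog_0", "shard": "0", "prirep": "p", "store": "100"},
      {"index": "graylog_0", "shard": "1", "prirep": "p", "store": "200"},
      {"index": "graylog_0", "shard": "2", "prirep": "p", "store": "300"},
      {"index": "graylog_0", "shard": "3", "prirep": "p", "store": "400"},
      {"index": "graylog_1", "shard": "0", "prirep": "p", "store": null}
    ]
    """)
    print(store_bytes_per_index(sample))  # {'graylog_0': 1000}
```

Comparing the total for a day's index created with 4 shards against one created with 2 shards, for a similar volume of messages, should show whether the shard count itself changes disk usage.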