Elasticsearch Shards

Hi all,
I’ve had this issue for a while now.
My Elasticsearch cluster is always yellow or red, and I always have unassigned shards.

I identify them with this command:
curl -XGET '10.26.2.243:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason' | grep UNASSIGNED

I can of course delete the affected indices manually, and the cluster goes back to green:
curl -XDELETE '10.26.2.243:9200/graylog_XXX/'

But as soon as a new index is created, the Elasticsearch cluster goes back to yellow with 4 unassigned shards. Every time a new index is rotated in, 4 new unassigned shards appear…
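
A minimal sketch for confirming that pattern: it groups the output of the `_cat/shards` call above by index and counts the UNASSIGNED rows. The function name `count_unassigned_per_index` is made up for illustration, and the column order is assumed to match the `h=index,shard,prirep,state,unassigned.reason` selection used above.

```shell
#!/bin/sh
# Hypothetical helper: reads `_cat/shards` output on stdin and prints
# "<index> <number of unassigned shards>" for each index, sorted by name.
# Assumed column order: index shard prirep state unassigned.reason
count_unassigned_per_index() {
  awk '$4 == "UNASSIGNED" { n[$1]++ } END { for (i in n) print i, n[i] }' | sort
}
```

It could be fed like this: `curl -s '10.26.2.243:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason' | count_unassigned_per_index`. If every affected index shows a count of 4, the unassigned shards line up with the 4 shards configured per index.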

I have the following rotation in place:
Index prefix: graylog
Shards: 4
Replicas: 1
Index rotation strategy: Index Size
Max index size: 1073741824 bytes (1.0GB)
Index retention strategy: Delete
Max number of indices: 200

FYI:

 200 indices with a total of 820,506,688 messages under management, current write-active index is graylog_404.

 Elasticsearch cluster is yellow. Shards: 800 active, 0 initializing, 0 relocating, 44 unassigned

Does anyone have an idea how to fix this issue?

Thanks in advance
BR
Steffen

@zoscail How many Elasticsearch servers do you have? Is any of them low on disk space? (Check the log files.)

Hi Jan,
thanks for your answer. I have just one; I am using the OVA.
Disk space is fine. I’ve mounted 300 GB for it. The settings I posted say that after 200 indices (1 GB each) the oldest one gets deleted…
The Elasticsearch logs look like this. The last entry in the current log just says:

Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[graylog_194][0]] ...]).

2017-08-24_06:52:26.57284 [INFO ][o.e.n.Node               ] [42UqLG7] starting ...
2017-08-24_06:52:26.60840 [INFO ][i.n.u.i.PlatformDependent] Your platform does not provide complete low-level API for accessing direct buffers reliably. Unless explicitly requested, heap buffer will always be preferred to avoid potential system instability.
2017-08-24_06:52:26.74540 [INFO ][o.e.t.TransportService   ] [42UqLG7] publish_address {10.26.2.243:9300}, bound_addresses {10.26.2.243:9300}
2017-08-24_06:52:26.76055 [INFO ][o.e.b.BootstrapChecks    ] [42UqLG7] bound or publishing to a non-loopback or non-link-local address, enforcing bootstrap checks
2017-08-24_06:52:36.81940 [INFO ][o.e.c.s.ClusterService   ] [42UqLG7] new_master {42UqLG7}{42UqLG7WRPSJkV47C-wdvg}{dSpir7Y_QmCnec4H2JQa8g}{10.26.2.243}{10.26.2.243:9300}, reason: zen-disco-elected-as-master ([0] nodes joined)
2017-08-24_06:52:36.84214 [INFO ][o.e.h.n.Netty4HttpServerTransport] [42UqLG7] publish_address {10.26.2.243:9200}, bound_addresses {10.26.2.243:9200}
2017-08-24_06:52:36.84354 [INFO ][o.e.n.Node               ] [42UqLG7] started
2017-08-24_06:52:39.72534 [INFO ][o.e.g.GatewayService     ] [42UqLG7] recovered [200] indices into cluster_state
2017-08-24_06:53:01.43556 [INFO ][o.e.c.r.a.AllocationService] [42UqLG7] Cluster health status changed from [RED] to [YELLOW] (reason: [shards started [[graylog_194][0]] ...]).

Hej @zoscail

if you have just one Elasticsearch server, your shard and replica settings do not make sense. Each index gets 4 primary shards plus one replica of each, 8 shards in total, but Elasticsearch does not allocate a replica to the node that holds its primary, so on a single server the 4 replicas stay unassigned.

With a single server you should set shards to 2 and replicas to 0.

I am also using the OVA, and it came with the same defaults as the original poster’s. I then got the warning that 4 shards are active and 4 are unassigned.

I found it mentioned somewhere on the net that the number of shards should mirror the number of nodes. So I changed my index to use 1 shard and now I get a warning that 1 shard is active and 1 shard is unassigned.

Someone please clarify why the OVA behaves this way out of the box and how to solve this.

You can reduce the number of replicas in the configuration of your index sets: http://docs.graylog.org/en/2.3/pages/configuration/index_model.html#index-set-configuration
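
As far as I know, the index set configuration is applied when an index is created, so indices that already exist keep their replica setting. Below is a sketch of how the replica count of existing indices could be lowered through Elasticsearch's update index settings API. It assumes the host from this thread and the `graylog_*` index pattern; the function names are hypothetical, and the request is wrapped in a function so nothing changes until you call it.

```shell
#!/bin/sh
# Assumptions: host and index pattern taken from this thread; adjust as needed.
ES_HOST="${ES_HOST:-10.26.2.243:9200}"
INDEX_PATTERN="${INDEX_PATTERN:-graylog_*}"

# Hypothetical helper: builds the JSON body for the update index settings API.
replica_settings_body() {
  printf '{"index":{"number_of_replicas":%d}}' "$1"
}

# Hypothetical helper: sends the new replica count to all matching indices.
# Not invoked here; call it explicitly, e.g. `set_replicas 0`.
set_replicas() {
  curl -sS -XPUT "${ES_HOST}/${INDEX_PATTERN}/_settings" \
    -H 'Content-Type: application/json' \
    -d "$(replica_settings_body "$1")"
}
```

Running `set_replicas 0` on a single-node setup should remove the unassigned replica shards of the existing indices; new indices still follow whatever the index set configuration says.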

Thanks. I had read that page, but I am not familiar enough with the terms used for all that info to make sense to me.

After reading it a few more times, along with the descriptions of the fields shown when editing an index set, I have now reduced my index configuration to 1 shard and no replicas. The OVA comes with a single Elasticsearch node and I don’t plan to expand this test setup, so more than 1 shard or any replication doesn’t seem to make sense.

Please correct me if I misunderstood.

If I am right, maybe the OVA should not come “misconfigured” by default or maybe this should be mentioned somewhere?

The OVA (and AMI) intentionally comes configured the way it is (with 4 shards and 1 replica).

The idea is that a YELLOW cluster health state isn’t critical and that this makes adding additional nodes much easier. If a second node is added, Elasticsearch will automatically start distributing the shards (primary and replica shards) between the available Elasticsearch nodes.

If the OVA (and AMI) were distributed with a configuration without replica shards, users would have to remember to change the index set configurations, and old indices would still have no replicas.
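
The effect described here can be sketched as simple arithmetic. Elasticsearch does not put a replica on a node that already holds a copy of the same shard, so with N data nodes at most N-1 replicas per shard can be assigned. The helper below is a hypothetical illustration, not part of Graylog or Elasticsearch, and it assumes every node is a data node with enough free disk space and no allocation filtering.

```shell
#!/bin/sh
# Hypothetical helper: estimates how many replica shards of ONE index stay
# unassigned. Arguments: <primary shards> <replicas per primary> <data nodes>
unassigned_replicas() {
  primaries=$1
  replicas=$2
  nodes=$3
  # One node holds the primary, so at most (nodes - 1) replicas can be placed.
  assignable=$(( nodes - 1 ))
  if [ "$replicas" -gt "$assignable" ]; then
    echo $(( primaries * (replicas - assignable) ))
  else
    echo 0
  fi
}

unassigned_replicas 4 1 1   # OVA defaults on one node: prints 4
unassigned_replicas 4 1 2   # same settings after adding a second node: prints 0
```

Under these assumptions, the first call matches the 4 unassigned shards per new index reported in this thread, and the second shows why the YELLOW state would clear once a second node joins.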
