Unassigned Shards and further confusion

Hello,

I recently inherited a Graylog server that was somehow broken by updates and then reverted to a snapshot. Right now I have 2 unassigned shards that I can't make sense of:

These result in errors in the web UI.

OS is Ubuntu 18.04.4 LTS
More info on related components:

ii  elasticsearch-oss                      6.8.23                                          all          Elasticsearch is a distributed RESTful search engine built for the cloud. Reference documentation can be found at https://www.elastic.co/guide/en/elasticsearch/reference/current/index.html and the 'Elasticsearch: The Definitive Guide' book can be found at https://www.elastic.co/guide/en/elasticsearch/guide/current/index.html
ii  graylog-3.1-repository                 1-1                                             all          Package to install Graylog 3.1 GPG key and repository
ii  mongo-tools                            3.6.3-0ubuntu1                                  amd64        collection of tools for administering MongoDB servers
ii  mongodb                                1:3.6.3-0ubuntu1.4                              amd64        object/document-oriented database (metapackage)
ii  mongodb-clients                        1:3.6.3-0ubuntu1.4                              amd64        object/document-oriented database (client apps)
ii  mongodb-server                         1:3.6.3-0ubuntu1.4                              all          object/document-oriented database (managed server package)
ii  mongodb-server-core                    1:3.6.3-0ubuntu1.4 
  1. I just want to make some sense of what is going on here; maybe somebody can point me in the right direction. Of course I can provide more info if needed.
     The goal is to understand what is happening and fix it.

  2. Furthermore, I would have to update this environment to a current release. Since I fear more complications doing it incrementally, is it possible to migrate the data to a new instance instead? I have already set one up to play with and do testing.

Thanks in advance for any helpful input

Hello && Welcome

It looks like your casa replica shards are unassigned, along with a couple of primary shards (in the shard listing, r = replica shard, p = primary shard).
You may not see these shard names in the Web UI under System / Indices.

It depends on whether you really need them or not. If it's just a couple, I would simply delete them.

You have a couple choices.

1. Delete the unassigned shards

curl -XDELETE http://localhost:9200/casa

(Note that this deletes the entire casa index, not just its unassigned shards.)

2. Try to fix the unassigned shards

The steps below check for and correct Elasticsearch issues caused by shards that are unassigned and/or erroring after upgrades, reboots, or reconfiguration of the Graylog server.

Check Shards

curl -XGET http://localhost:9200/_cat/shards

If errors were found, the following command will explain why they occurred:

curl -XGET http://localhost:9200/_cluster/allocation/explain?pretty

{
  "index" : "testing",
  "shard" : 0,
  "primary" : false,
  "current_state" : "unassigned",
  "unassigned_info" : {
    "reason" : "INDEX_CREATED",
    "at" : "2022-04-09T21:48:23.293Z",
    "last_allocation_status" : "no_attempt"
  },
  "can_allocate" : "no",
  "allocate_explanation" : "cannot allocate because allocation is not permitted to any of the nodes",
  "node_allocation_decisions" : [
    {
      "node_id" : "t_DVRrfNS12IMhWvlvcfCQ",
      "node_name" : "t_DVRrf",
      "transport_address" : "127.0.0.1:9300",
      "node_decision" : "no",
      "weight_ranking" : 1,
      "deciders" : [
        {
          "decider" : "same_shard",
          "decision" : "NO",
          "explanation" : "the shard cannot be allocated to the same node on which a copy of the shard already exists"
        }
      ]
    }
  ]
}

In this case, the API clearly explains why the replica shard remains unassigned: "the shard cannot be allocated to the same node on which a copy of the shard already exists".

Also…

Elasticsearch's cat shards API will tell you which shards are unassigned, and why:

curl -XGET "localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason" | grep UNASSIGNED
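For illustration, here is what that filter yields on a small sample listing (the rows below are made up, not from the cluster in this thread); counting the matches is a quick health check:

```shell
#!/bin/sh
# Sample `_cat/shards` output (illustrative rows, not from the original cluster)
shards='graylog_0 0 p STARTED
graylog_0 0 r UNASSIGNED INDEX_CREATED
casa      0 p UNASSIGNED ALLOCATION_FAILED
casa      0 r UNASSIGNED INDEX_CREATED'

# Same grep as above, plus a count of how many shards are unassigned
echo "$shards" | grep UNASSIGNED
echo "$shards" | grep -c UNASSIGNED   # prints 3
```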

As nodes join and leave the cluster, the master node reassigns shards automatically,
ensuring that multiple copies of a shard aren't assigned to the same node.
In other words, the master node will not assign a primary shard to the same node as its replica,
nor will it assign two replicas of the same shard to the same node.
A shard may linger in an unassigned state if there are not enough nodes to distribute the shards accordingly.
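The arithmetic behind that last sentence can be sketched in a few lines of shell (the node and replica counts below are hypothetical):

```shell
#!/bin/sh
# Each shard needs (replicas + 1) distinct nodes to be fully allocated,
# because a primary and its replicas may never share a node.
nodes=1      # hypothetical single-node cluster
replicas=1   # Elasticsearch's default replica count
needed=$((replicas + 1))

if [ "$nodes" -lt "$needed" ]; then
  echo "replicas will stay UNASSIGNED: need $needed nodes, have $nodes"
fi
```

If this really is a single-node cluster (the explain output above shows only one node at 127.0.0.1), another option is to set the index's number_of_replicas to 0 instead of chasing allocation errors.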

To enable shard allocation, you can update it via the cluster settings API:


curl -X PUT "localhost:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d'
{
    "transient" : {
        "cluster.routing.allocation.enable" : "all"
    }
}
'
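One caveat worth knowing: "transient" settings are lost on a full cluster restart. If you want the change to survive restarts, the same endpoint accepts "persistent" instead (a sketch of the same call, config-only change):

```shell
curl -X PUT "localhost:9200/_cluster/settings?pretty" -H 'Content-Type: application/json' -d'
{
    "persistent" : {
        "cluster.routing.allocation.enable" : "all"
    }
}
'
```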

Hope that helps.

Hello Gsmith, that was indeed quite helpful. I have to be 100% honest: I cannot explain the origin of that "casa" shard. I have one shard that got corrupted while reverting to an earlier snapshot, because the Elasticsearch nodes are on different storage than the database. But this "casa" shard does not make any sense to me…

Anyhow, thank you for pointing me in the right direction. I will check whether the corrupted shard is critical, and if it is not, I will just delete it.

The next thing would be to update to the current release, but I am a bit worried, since it has to be incremental and there is potential for further issues with each update. Is it possible to migrate the data to a new Graylog instance which is running on the current release?

Hello,

You may run into issues if the versions do not match. I have done this in the past with a MongoDB dump and an Elasticsearch restore from a snapshot. I was able to fix the resulting issues, but it took a while. It might have just been luck.

Depending on what versions of Elasticsearch and MongoDB you have, you might be able to update just Graylog, though that also depends on which version of Graylog you are running. For example, if you are running Graylog 3.0, you will need to upgrade through the 3.x releases (3.0 → 3.3) and then upgrade to 4.0. Once everything is working correctly, just yum/apt update the rest. I have found this path works well.

I was doing some research and found this old post. It's really a good read if you have time.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.