Datanode fails to allocate shard for index

1. Describe your incident:

A Graylog index randomly breaks because of an unassigned shard. I get two messages in the Graylog frontend. The first appears when I look at the broken index:

OpenSearch cluster datanode-cluster is red. Shards: 49 active, 0 initializing, 0 relocating, 1 unassigned

The second is when I look at the stream that is using the index:

OpenSearch exception [type=search_phase_execution_exception, reason=all shards failed]

I figured out how to work around the problem by calling the OpenSearch API directly with curl. First I find out which index is affected:

# curl "https://localhost:9200/_cat/shards?h=index,shard,prirep,state,unassigned.reason" -k --cert cert.crt --key cert.key

.opendistro-ism-managed-index-history-2025.08.18-000128 0 p STARTED    
.opendistro-ism-managed-index-history-2025.08.19-000129 0 p STARTED    
.opendistro-ism-managed-index-history-2025.08.16-000126 0 p STARTED    
.opendistro-ism-managed-index-history-2025.08.17-000127 0 p STARTED    
.opendistro_security                                    0 p STARTED    
.opendistro-ism-managed-index-history-2025.08.15-000125 0 p STARTED    
matrix_log_index_18                                     0 p STARTED    
matrix_log_index_17                                     0 p STARTED    
.opendistro-ism-managed-index-history-2025.08.13-000123 0 p STARTED    
.opendistro-ism-managed-index-history-2025.08.14-000124 0 p STARTED    
matrix_log_index_19                                     0 p STARTED    
.opendistro-ism-managed-index-history-2025.08.11-000121 0 p STARTED    
.opendistro-ism-managed-index-history-2025.08.12-000122 0 p STARTED    
.opendistro-ism-managed-index-history-2025.08.10-000120 0 p STARTED    
.opendistro-ism-managed-index-history-2025.08.28-000138 0 p STARTED    
.opendistro-ism-managed-index-history-2025.08.26-000136 0 p STARTED    
.opendistro-ism-managed-index-history-2025.08.24-000134 0 p STARTED    
.opendistro-ism-managed-index-history-2025.09.07-000148 0 p STARTED    
.opendistro-ism-config                                  0 p STARTED    
matrix_log_index_20                                     0 p UNASSIGNED ALLOCATION_FAILED
.opendistro-ism-managed-index-history-2025.08.22-000132 0 p STARTED    
.opendistro-ism-managed-index-history-2025.08.09-000119 0 p STARTED    
.opendistro-ism-managed-index-history-2025.09.03-000144 0 p STARTED    
.opendistro-ism-managed-index-history-2025.09.05-000146 0 p STARTED    
.opendistro-ism-managed-index-history-2025.08.20-000130 0 p STARTED    
.opendistro-ism-managed-index-history-2025.09.01-000142 0 p STARTED    
.opendistro-ism-managed-index-history-2025.08.31-000141 0 p STARTED    
.opendistro-ism-managed-index-history-2025.08.30-000140 0 p STARTED    
graylog_9                                               0 p STARTED    
.ds-gl-datanode-metrics-000003                          0 p STARTED    
.ds-gl-datanode-metrics-000004                          0 p STARTED    
graylog_8                                               0 p STARTED    
.ds-gl-datanode-metrics-000001                          0 p STARTED    
.ds-gl-datanode-metrics-000002                          0 p STARTED    
gl-system-events_6                                      0 p STARTED    
graylog_10                                              0 p STARTED    
gl-system-events_7                                      0 p STARTED    
.ds-gl-datanode-metrics-000005                          0 p STARTED    
graylog_11                                              0 p STARTED    
.opendistro-ism-managed-index-history-2025.08.29-000139 0 p STARTED    
.opendistro-ism-managed-index-history-2025.08.27-000137 0 p STARTED    
.opendistro-ism-managed-index-history-2025.08.25-000135 0 p STARTED    
.opendistro-ism-managed-index-history-2025.09.06-000147 0 p STARTED    
.opendistro-ism-managed-index-history-2025.08.23-000133 0 p STARTED    
gl-events_0                                             0 p STARTED    
.plugins-ml-config                                      0 p STARTED    
.opendistro-ism-managed-index-history-2025.08.21-000131 0 p STARTED    
.opendistro-ism-managed-index-history-2025.09.04-000145 0 p STARTED    
.opendistro-job-scheduler-lock                          0 p STARTED    
.opendistro-ism-managed-index-history-2025.09.02-000143 0 p STARTED
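With this many indices, the one broken line is easy to miss. As a rough sketch (not part of Graylog or OpenSearch), the output of the `_cat/shards` call above can be filtered for shards that are not STARTED. The names `ShardRow`, `parse_cat_shards` and `unhealthy` are made up for illustration, and the parser assumes exactly the column order requested with `h=index,shard,prirep,state,unassigned.reason` and no header row.

```python
from typing import NamedTuple


class ShardRow(NamedTuple):
    index: str
    shard: int
    prirep: str
    state: str
    reason: str  # empty when the shard is assigned


def parse_cat_shards(text: str) -> list[ShardRow]:
    """Parse whitespace-separated `_cat/shards` output (assumed column order, see above)."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Skip blank lines, truncated lines, and a header row if one is present.
        if len(fields) < 4 or not fields[1].isdigit():
            continue
        reason = fields[4] if len(fields) > 4 else ""
        rows.append(ShardRow(fields[0], int(fields[1]), fields[2], fields[3], reason))
    return rows


def unhealthy(rows: list[ShardRow]) -> list[ShardRow]:
    """Return every shard that is not in the STARTED state."""
    return [row for row in rows if row.state != "STARTED"]


# Three lines copied from the output above.
SAMPLE = """\
matrix_log_index_19                                     0 p STARTED
matrix_log_index_20                                     0 p UNASSIGNED ALLOCATION_FAILED
graylog_9                                               0 p STARTED
"""

for row in unhealthy(parse_cat_shards(SAMPLE)):
    print(f"{row.index} shard {row.shard} ({row.prirep}): {row.state} {row.reason}")
```

For the three sample lines this prints only the `matrix_log_index_20` row. To use it on real data, the curl output could be saved to a file and passed to `parse_cat_shards`.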

Then I delete the whole index that contains the unassigned shard (this removes all data in that index):

curl -X DELETE "https://localhost:9200/matrix_log_index_20" -k --cert cert.crt --key cert.key

2. Describe your environment:

I run everything with Docker Compose on a single machine, using this docker-compose.yaml:

services:
  mongodb:
    image: "mongo:6.0"  
    restart: "always"
    networks:
      - graylog
    volumes:
      - "./access_guard/mongodb_data:/data/db"
      - "./access_guard/mongodb_config:/data/configdb"  

  datanode:
    image: "graylog/graylog-datanode:6.3.2"
    hostname: "datanode"
    environment:
      GRAYLOG_DATANODE_NODE_ID_FILE: "/var/lib/graylog-datanode/node-id"
      # GRAYLOG_DATANODE_PASSWORD_SECRET and GRAYLOG_PASSWORD_SECRET MUST be the same value
      GRAYLOG_DATANODE_PASSWORD_SECRET: "{{ docker_graylog_password }}"
      GRAYLOG_DATANODE_MONGODB_URI: "mongodb://mongodb:27017/graylog"
    ulimits:
      memlock:
        hard: -1
        soft: -1
      nofile:
        soft: 65536
        hard: 65536
    networks:
      - graylog  
    volumes:
      - "./access_guard/graylog-datanode:/var/lib/graylog-datanode"
    restart: "always"

  # Graylog: https://hub.docker.com/r/graylog/graylog-enterprise
  graylog:
    image: "graylog/graylog:6.3.2"
    depends_on:
      mongodb:
        condition: "service_started"
      datanode:
        condition: "service_started"
    entrypoint: "/usr/bin/tini --  /docker-entrypoint.sh"
    environment:
      GRAYLOG_NODE_ID_FILE: "/usr/share/graylog/data/data/node-id"
      GRAYLOG_PASSWORD_SECRET: "{{ docker_graylog_password }}"
      GRAYLOG_ROOT_PASSWORD_SHA2: "{{ docker_graylog_root_pw | hash('sha256') }}"
      GRAYLOG_HTTP_BIND_ADDRESS: "0.0.0.0:9000"
      GRAYLOG_HTTP_EXTERNAL_URI: "https://url.example.com/"
      GRAYLOG_MONGODB_URI: "mongodb://mongodb:27017/graylog"
    labels:
     - "traefik.enable=true"
     - "traefik.http.routers.graylog-router.rule=Host(`url.example.com`)"
     - "traefik.http.routers.graylog-router.entrypoints=https"
     - "traefik.http.routers.graylog-router.tls.certresolver=letsencrypt"
     - "traefik.http.services.graylog-service.loadbalancer.server.port=9000"
    ports:
      - "[::]:12201-12202:12201-12202/udp" # GELF UDP - matrix
    networks:
      graylog:
        ipv6_address: {{ pub_ip6_graylog }}
      proxy:
    volumes:
      - "./access_guard/graylog_data:/usr/share/graylog/data/data"
    restart: "always"

networks:
  graylog:
    driver: "bridge"
    driver_opts:
      com.docker.network.bridge.gateway_mode_ipv6: routed
    enable_ipv6: true
    ipam:    
      config:
        - subnet: {{ pub_ip6_subnet }}
          gateway: {{ pub_ip6_gateway }}
  proxy:
    external: true
    name: {{ docker_reverse_proxy_network }}

3. What steps have you already taken to try and solve the problem?

I have looked for reasons why the Graylog datanode would randomly break, but have not found anything with Google. This has happened on versions from 6.1 up to the current 6.3.2, and my setup has now broken again.

4. How can the community help?

What can cause the OpenSearch instance in the datanode container to fail allocating a shard?

Hi @usbpc,

Next time it happens, I’d recommend checking

_cluster/allocation/explain?include_yes_decisions=true&include_disk_info=true

to see what the actual problem with the allocation is. This could give you some hints.
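For reference, here is a minimal sketch of calling that endpoint from Python instead of curl. It assumes the same host, port and client certificate files (`cert.crt`, `cert.key`) as the curl commands earlier in the thread; the function names `explain_url`, `build_request` and `fetch_explanation` are hypothetical. Certificate verification is switched off only to mirror `curl -k` and should not be copied blindly. As far as I know, a request without a body lets OpenSearch pick an unassigned shard itself, while a body with `index`, `shard` and `primary` asks about one specific shard.

```python
import json
import ssl
import urllib.parse
import urllib.request

BASE_URL = "https://localhost:9200"  # assumption: same address as the curl calls


def explain_url(base_url: str = BASE_URL) -> str:
    """Build the allocation-explain URL with the two suggested query flags."""
    query = urllib.parse.urlencode(
        {"include_yes_decisions": "true", "include_disk_info": "true"}
    )
    return f"{base_url}/_cluster/allocation/explain?{query}"


def build_request(index=None, shard=0, primary=True, base_url=BASE_URL):
    """Without an index, send no body; otherwise ask about one specific shard."""
    if index is None:
        return urllib.request.Request(explain_url(base_url), method="GET")
    body = json.dumps({"index": index, "shard": shard, "primary": primary}).encode()
    return urllib.request.Request(
        explain_url(base_url),
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_explanation(request, cert_file="cert.crt", key_file="cert.key"):
    """Send the request with a client certificate and return the parsed JSON."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False       # mirrors `curl -k`; insecure
    ctx.verify_mode = ssl.CERT_NONE  # mirrors `curl -k`; insecure
    ctx.load_cert_chain(cert_file, key_file)
    with urllib.request.urlopen(request, context=ctx) as response:
        return json.load(response)
```

A call such as `fetch_explanation(build_request("matrix_log_index_20"))` would then return the explanation as a dictionary, provided the cluster is reachable and the certificate files exist.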

{
    "index": "matrix_log_index_27",
    "shard": 0,
    "primary": true,
    "current_state": "unassigned",
    "unassigned_info":
    {
        "reason": "ALLOCATION_FAILED",
        "at": "2025-09-22T12:24:14.492Z",
        "failed_allocation_attempts": 1,
        "details": "failed shard on node [wsD44htiSU-22UKf6X2xBA]: shard failure, reason [merge failed], failure MergeException[java.lang.IllegalStateException: this writer hit an unrecoverable error; cannot merge]; nested: IllegalStateException[this writer hit an unrecoverable error; cannot merge]; nested: CorruptIndexException[checksum failed (hardware problem?) : expected=80e19434 actual=9c1c8edc (resource=BufferedChecksumIndexInput(MemorySegmentIndexInput(path=\"/var/lib/graylog-datanode/opensearch/data/nodes/0/indices/KACBauZTQviiMgTK_NhtcA/0/index/_4ex.cfs\") [slice=_4ex_Lucene99_0.tim]))]; ",
        "last_allocation_status": "no_valid_shard_copy"
    },
    "cluster_info":
    {
        "nodes":
        {
            "wsD44htiSU-22UKf6X2xBA":
            {
                "node_name": "datanode",
                "least_available":
                {
                    "path": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
                    "total_bytes": 97825230848,
                    "used_bytes": 71469965312,
                    "free_bytes": 26355265536,
                    "free_disk_percent": 26.9,
                    "used_disk_percent": 73.1
                },
                "most_available":
                {
                    "path": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
                    "total_bytes": 97825230848,
                    "used_bytes": 71469965312,
                    "free_bytes": 26355265536,
                    "free_disk_percent": 26.9,
                    "used_disk_percent": 73.1
                }
            }
        },
        "shard_sizes":
        {
            "[.opendistro-ism-managed-index-history-2025.09.18-000158][0][p]_bytes": 208,
            "[.opendistro-ism-managed-index-history-2025.09.03-000144][0][p]_bytes": 208,
            "[.plugins-ml-config][0][p]_bytes": 4021,
            "[.opendistro-ism-managed-index-history-2025.09.24-000164][0][p]_bytes": 208,
            "[.opendistro-ism-managed-index-history-2025.09.20-000160][0][p]_bytes": 208,
            "[.opendistro-ism-managed-index-history-2025.09.08-000149][0][p]_bytes": 208,
            "[gl-events_0][0][p]_bytes": 208,
            "[.opendistro-ism-managed-index-history-2025.09.14-000154][0][p]_bytes": 208,
            "[.opendistro_security][0][p]_bytes": 71251,
            "[graylog_11][0][p]_bytes": 349499203,
            "[.opendistro-ism-managed-index-history-2025.09.10-000151][0][p]_bytes": 208,
            "[.opendistro-ism-managed-index-history-2025.08.27-000137][0][p]_bytes": 51432,
            "[.opendistro-ism-managed-index-history-2025.09.02-000143][0][p]_bytes": 208,
            "[.opendistro-ism-managed-index-history-2025.09.19-000159][0][p]_bytes": 45541,
            "[.opendistro-ism-managed-index-history-2025.09.23-000163][0][p]_bytes": 208,
            "[.opendistro-ism-managed-index-history-2025.08.31-000141][0][p]_bytes": 208,
            "[.opendistro-ism-managed-index-history-2025.09.15-000155][0][p]_bytes": 208,
            "[gl-system-events_7][0][p]_bytes": 19872,
            "[.opendistro-ism-managed-index-history-2025.09.05-000146][0][p]_bytes": 208,
            "[.opendistro-ism-managed-index-history-2025.09.11-000152][0][p]_bytes": 30266,
            "[.opendistro-ism-managed-index-history-2025.08.28-000138][0][p]_bytes": 53668,
            "[matrix_log_index_26][0][p]_bytes": 1538552755,
            "[matrix_log_index_25][0][p]_bytes": 2384072591,
            "[.ds-gl-datanode-metrics-000002][0][p]_bytes": 14182659,
            "[.ds-gl-datanode-metrics-000001][0][p]_bytes": 1122747,
            "[.ds-gl-datanode-metrics-000004][0][p]_bytes": 1142379,
            "[.ds-gl-datanode-metrics-000008][0][p]_bytes": 3556409,
            "[.ds-gl-datanode-metrics-000003][0][p]_bytes": 53520661,
            "[.ds-gl-datanode-metrics-000006][0][p]_bytes": 2997581,
            "[.ds-gl-datanode-metrics-000005][0][p]_bytes": 6052869,
            "[.ds-gl-datanode-metrics-000007][0][p]_bytes": 4396964,
            "[.opendistro-ism-managed-index-history-2025.09.16-000156][0][p]_bytes": 208,
            "[matrix_log_index_23][0][p]_bytes": 1802959777,
            "[.opendistro-job-scheduler-lock][0][p]_bytes": 8792,
            "[.opendistro-ism-managed-index-history-2025.09.01-000142][0][p]_bytes": 208,
            "[.opendistro-ism-managed-index-history-2025.08.30-000140][0][p]_bytes": 208,
            "[matrix_log_index_24][0][p]_bytes": 1857840239,
            "[.opendistro-ism-config][0][p]_bytes": 82764,
            "[.opendistro-ism-managed-index-history-2025.09.06-000147][0][p]_bytes": 208,
            "[.opendistro-ism-managed-index-history-2025.09.22-000162][0][p]_bytes": 208,
            "[graylog_12][0][p]_bytes": 75985881,
            "[.opendistro-ism-managed-index-history-2025.09.09-000150][0][p]_bytes": 208,
            "[.opendistro-ism-managed-index-history-2025.08.29-000139][0][p]_bytes": 208,
            "[.opendistro-ism-managed-index-history-2025.09.04-000145][0][p]_bytes": 208,
            "[graylog_10][0][p]_bytes": 434406668,
            "[gl-system-events_8][0][p]_bytes": 208,
            "[.opendistro-ism-managed-index-history-2025.09.25-000165][0][p]_bytes": 208,
            "[gl-system-events_6][0][p]_bytes": 11085,
            "[.opendistro-ism-managed-index-history-2025.09.17-000157][0][p]_bytes": 208,
            "[graylog_9][0][p]_bytes": 75834,
            "[.opendistro-ism-managed-index-history-2025.09.13-000153][0][p]_bytes": 208,
            "[.opendistro-ism-managed-index-history-2025.09.07-000148][0][p]_bytes": 49955,
            "[.opendistro-ism-managed-index-history-2025.09.21-000161][0][p]_bytes": 208
        },
        "shard_paths":
        {
            "[graylog_10][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=T42l18_WQpWQOOK8WQX_XQ]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-ism-managed-index-history-2025.09.21-000161][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=YQkIes3zTx6hVtd6Es1RUQ]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-ism-managed-index-history-2025.09.05-000146][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=lbBYLmkUTXKQAT_BlhtCOQ]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-ism-managed-index-history-2025.09.14-000154][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=3wucyh7FTm6j6LlMj0Hzmg]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[matrix_log_index_26][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=tFsTCOzISNmRl9H7G1fjBg]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-ism-managed-index-history-2025.09.18-000158][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=d4q6Q7P9RVOI0YzqktqXAQ]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-ism-managed-index-history-2025.09.19-000159][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=aeHIZ9QZSGKCztLil0IqRw]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-ism-managed-index-history-2025.09.03-000144][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=BmPvSa6JTimUhDeKbqq9dw]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-ism-managed-index-history-2025.09.04-000145][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=xKeCVoYkQwm40TtTDqf5cQ]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-ism-managed-index-history-2025.09.02-000143][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=gyercprySyKhtm7dwoHBlw]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-ism-managed-index-history-2025.09.22-000162][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=YfrtJXAoTGyx1StdZl96bg]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.ds-gl-datanode-metrics-000003][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=5Zn1eCUbTaqdmAbCOsd2RQ]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.plugins-ml-config][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=x0iPL-sZTcuw2WS9HAd0hg]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-ism-managed-index-history-2025.08.30-000140][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=P7WMLe9RSriyLDe9oQFnqg]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-ism-managed-index-history-2025.09.01-000142][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=wSLNndW_TTCf14xRcg9FCQ]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-ism-managed-index-history-2025.09.20-000160][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=iX67VzjBSSWeZPpvwsRJ6Q]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.ds-gl-datanode-metrics-000006][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=IKWd1Q1aTOutAgVjbx_J9Q]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-ism-managed-index-history-2025.09.25-000165][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=JT_FSlEWR0-CVMaqvcrpLQ]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.ds-gl-datanode-metrics-000005][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=8rYni8zOSfa8Z8qFhgeh7w]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-ism-managed-index-history-2025.09.17-000157][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=3CVPEOJeRgeeNJzLtv29HA]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-ism-managed-index-history-2025.09.23-000163][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=u5WVvK3kTgikouRVXz_j8w]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-ism-managed-index-history-2025.09.10-000151][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=MRSWyvOsRf2_9AMTmJowDw]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[graylog_12][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=ck7Vr8NDSRGZ9O31uU6snQ]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-ism-managed-index-history-2025.09.15-000155][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=CbKNbT1PQ5iIiPirBHDA1A]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-ism-managed-index-history-2025.08.27-000137][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=mKm-dQmGRueBP_OUkVWXrg]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[matrix_log_index_25][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=RhU871nnRf6dFdMWqPR6Mg]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro_security][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=5UhhmBTFRkClAeM_kPaWmg]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-ism-managed-index-history-2025.08.28-000138][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=Sn29uJu2SN6snK6l80cJSA]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[graylog_9][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=R6WmPHxbQtCciysXL8trrA]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[gl-system-events_7][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=VHFQIJJDSNmrvHxUci57Fg]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[gl-events_0][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=Hxjv2wq8Qh2B4GNCHOxlDQ]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-ism-managed-index-history-2025.08.31-000141][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=qoj7oQwnQYC_rMQ7Vvcclg]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[gl-system-events_6][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=ycj2dSHMRemoqEUv9XvcCA]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[matrix_log_index_23][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=X3GmXMLxRTGBuTyExtgAdw]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.ds-gl-datanode-metrics-000007][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=urWuL2_2S8Srce9D5UcP3g]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-ism-managed-index-history-2025.09.06-000147][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=cZueMU61QSuy6QNcbBTXTQ]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[graylog_11][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=pUlEFB5RQkaU51XNUp_SzA]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-job-scheduler-lock][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=TAKT7ZQYQSKJRfwAA64Y4g]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-ism-managed-index-history-2025.09.11-000152][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=9wOB0mz3RsSGpAeawpCd4A]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[matrix_log_index_24][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=CDbk9Yl-QeqyZ4Kez7sS2Q]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.ds-gl-datanode-metrics-000004][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=aRtY9vSrRkeLZMsh1u6Qrg]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-ism-managed-index-history-2025.08.29-000139][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=j_2pelyiQ92_SjstWELbIg]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-ism-managed-index-history-2025.09.07-000148][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=2gNKEXCMShW2SU3nsobruQ]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-ism-managed-index-history-2025.09.08-000149][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=CjOPKPXyQcKVWvMOHUgc6g]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-ism-managed-index-history-2025.09.24-000164][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=Ia6BVKFcSQqVt-K6lauN3A]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.ds-gl-datanode-metrics-000008][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=JM-Ka11PSBGwdqlEQh4Q3Q]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.ds-gl-datanode-metrics-000001][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=QdzH1Jg7TeGtsxCaxEHVrA]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.ds-gl-datanode-metrics-000002][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=M_JKDl-zQCeP3uHoZ8XlbQ]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[gl-system-events_8][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=IHgMd8ttSBqvM18kVg6MAw]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-ism-managed-index-history-2025.09.09-000150][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=qqqvPTHzQ9-vSFDvaDx98Q]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-ism-managed-index-history-2025.09.13-000153][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=XWOOEFCnShi1VUei3p9_4Q]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-ism-config][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=DQTpXjP9Rxa28SMz3XgEYQ]": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
            "[.opendistro-ism-managed-index-history-2025.09.16-000156][0], node[wsD44htiSU-22UKf6X2xBA], [P], s[STARTED], a[id=3gYfPknBTiGp8TB5Y8osVg]": "/var/lib/graylog-datanode/opensearch/data/nodes/0"
        },
        "reserved_sizes":
        [
            {
                "node_id": "wsD44htiSU-22UKf6X2xBA",
                "path": "/var/lib/graylog-datanode/opensearch/data/nodes/0",
                "total": 0,
                "shards":
                [
                    "[.plugins-ml-config][0]",
                    "[graylog_10][0]",
                    "[.opendistro-ism-managed-index-history-2025.09.23-000163][0]",
                    "[.opendistro-ism-managed-index-history-2025.09.19-000159][0]",
                    "[.opendistro-ism-managed-index-history-2025.08.28-000138][0]",
                    "[matrix_log_index_24][0]",
                    "[.ds-gl-datanode-metrics-000003][0]",
                    "[.opendistro-ism-managed-index-history-2025.09.09-000150][0]",
                    "[.opendistro-ism-managed-index-history-2025.09.16-000156][0]",
                    "[.opendistro-ism-managed-index-history-2025.09.14-000154][0]",
                    "[.opendistro-ism-managed-index-history-2025.09.04-000145][0]",
                    "[graylog_9][0]",
                    "[.ds-gl-datanode-metrics-000004][0]",
                    "[.ds-gl-datanode-metrics-000005][0]",
                    "[graylog_11][0]",
                    "[.opendistro-job-scheduler-lock][0]",
                    "[.opendistro-ism-managed-index-history-2025.09.11-000152][0]",
                    "[.opendistro-ism-managed-index-history-2025.08.29-000139][0]",
                    "[.opendistro-ism-managed-index-history-2025.09.22-000162][0]",
                    "[.opendistro-ism-managed-index-history-2025.08.30-000140][0]",
                    "[.opendistro-ism-managed-index-history-2025.08.31-000141][0]",
                    "[.opendistro-ism-managed-index-history-2025.08.27-000137][0]",
                    "[.opendistro-ism-managed-index-history-2025.09.06-000147][0]",
                    "[gl-events_0][0]",
                    "[matrix_log_index_26][0]",
                    "[.ds-gl-datanode-metrics-000002][0]",
                    "[.ds-gl-datanode-metrics-000007][0]",
                    "[gl-system-events_6][0]",
                    "[gl-system-events_8][0]",
                    "[.ds-gl-datanode-metrics-000001][0]",
                    "[.ds-gl-datanode-metrics-000006][0]",
                    "[matrix_log_index_23][0]",
                    "[.opendistro-ism-managed-index-history-2025.09.15-000155][0]",
                    "[.opendistro-ism-managed-index-history-2025.09.08-000149][0]",
                    "[.opendistro-ism-managed-index-history-2025.09.01-000142][0]",
                    "[.opendistro-ism-managed-index-history-2025.09.25-000165][0]",
                    "[.opendistro-ism-managed-index-history-2025.09.05-000146][0]",
                    "[.opendistro-ism-managed-index-history-2025.09.24-000164][0]",
                    "[matrix_log_index_25][0]",
                    "[graylog_12][0]",
                    "[.opendistro-ism-managed-index-history-2025.09.02-000143][0]",
                    "[.opendistro-ism-managed-index-history-2025.09.07-000148][0]",
                    "[.opendistro-ism-managed-index-history-2025.09.17-000157][0]",
                    "[.opendistro-ism-config][0]",
                    "[.opendistro-ism-managed-index-history-2025.09.10-000151][0]",
                    "[.opendistro-ism-managed-index-history-2025.09.20-000160][0]",
                    "[.opendistro-ism-managed-index-history-2025.09.18-000158][0]",
                    "[gl-system-events_7][0]",
                    "[.ds-gl-datanode-metrics-000008][0]",
                    "[.opendistro-ism-managed-index-history-2025.09.13-000153][0]",
                    "[.opendistro_security][0]",
                    "[.opendistro-ism-managed-index-history-2025.09.21-000161][0]",
                    "[.opendistro-ism-managed-index-history-2025.09.03-000144][0]"
                ]
            }
        ]
    },
    "can_allocate": "no_valid_shard_copy",
    "allocate_explanation": "cannot allocate because all found copies of the shard are either stale or corrupt",
    "node_allocation_decisions":
    [
        {
            "node_id": "wsD44htiSU-22UKf6X2xBA",
            "node_name": "datanode",
            "transport_address": "172.19.0.3:9300",
            "node_attributes":
            {
                "shard_indexing_pressure_enabled": "true"
            },
            "node_decision": "no",
            "store":
            {
                "in_sync": true,
                "allocation_id": "7NWgIG9SR6yR1iU3dEBQyA",
                "store_exception":
                {
                    "type": "corrupt_index_exception",
                    "reason": "failed engine (reason: [merge failed]) (resource=preexisting_corruption)",
                    "caused_by":
                    {
                        "type": "i_o_exception",
                        "reason": "failed engine (reason: [merge failed])",
                        "caused_by":
                        {
                            "type": "corrupt_index_exception",
                            "reason": "checksum failed (hardware problem?) : expected=80e19434 actual=9c1c8edc (resource=BufferedChecksumIndexInput(MemorySegmentIndexInput(path=\"/var/lib/graylog-datanode/opensearch/data/nodes/0/indices/KACBauZTQviiMgTK_NhtcA/0/index/_4ex.cfs\") [slice=_4ex_Lucene99_0.tim]))"
                        }
                    }
                }
            }
        }
    ]
}

I got this output from the endpoint you suggested. I think it’s telling me that it suspects a hardware problem, because the checksum of one of the index files (_4ex.cfs) failed?
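The interesting part of that response is buried at the bottom, in the nested `caused_by` entries under `store_exception`. A small sketch of how one might pull out the innermost cause and the affected file path follows. The helper names (`root_cause`, `store_root_causes`, `corrupt_file`) are made up, and `SAMPLE` is a trimmed copy of the response above with the outer two reason strings left as given there.

```python
import re

REASON = (
    'checksum failed (hardware problem?) : expected=80e19434 actual=9c1c8edc '
    '(resource=BufferedChecksumIndexInput(MemorySegmentIndexInput('
    'path="/var/lib/graylog-datanode/opensearch/data/nodes/0/indices/'
    'KACBauZTQviiMgTK_NhtcA/0/index/_4ex.cfs") [slice=_4ex_Lucene99_0.tim]))'
)

# Trimmed copy of the allocation-explain response shown above.
SAMPLE = {
    "node_allocation_decisions": [
        {
            "node_name": "datanode",
            "store": {
                "store_exception": {
                    "type": "corrupt_index_exception",
                    "reason": "failed engine (reason: [merge failed]) (resource=preexisting_corruption)",
                    "caused_by": {
                        "type": "i_o_exception",
                        "reason": "failed engine (reason: [merge failed])",
                        "caused_by": {"type": "corrupt_index_exception", "reason": REASON},
                    },
                }
            },
        }
    ]
}


def root_cause(exception: dict) -> dict:
    """Follow nested `caused_by` entries down to the innermost exception."""
    while isinstance(exception.get("caused_by"), dict):
        exception = exception["caused_by"]
    return exception


def store_root_causes(explanation: dict) -> list:
    """Return (node_name, type, reason) for every node that reports a store exception."""
    results = []
    for decision in explanation.get("node_allocation_decisions", []):
        exception = decision.get("store", {}).get("store_exception")
        if exception:
            cause = root_cause(exception)
            results.append(
                (decision.get("node_name", "?"), cause.get("type", "?"), cause.get("reason", ""))
            )
    return results


def corrupt_file(reason: str):
    """Extract the file path quoted in a Lucene checksum message, if there is one."""
    match = re.search(r'path="([^"]+)"', reason)
    return match.group(1) if match else None


for node, kind, reason in store_root_causes(SAMPLE):
    print(node, kind, corrupt_file(reason))
```

This only summarizes what the response already says: the innermost exception is a `corrupt_index_exception` for a checksum mismatch in `_4ex.cfs`. It does not tell you whether disk, memory or filesystem is at fault.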
