Could not retrieve Elasticsearch cluster health. Fetching Elasticsearch cluster health failed: There was an error fetching a resource: Internal Server Error. Additional information: Couldn't read Elasticsearch cluster health

1. Describe your incident:
On the Indices page I get the message “Could not retrieve Elasticsearch cluster health. Fetching Elasticsearch cluster health failed: There was an error fetching a resource: Internal Server Error. Additional information: Couldn’t read Elasticsearch cluster health”.
The whole story:
I migrated Graylog from a standalone server to Docker with Elasticsearch. There were no errors.
I did the upgrades in small steps from Graylog 2.x, Elasticsearch 5.x and MongoDB 3.x.
After that I tried to upgrade Elasticsearch to OpenSearch, but there was a problem: the OpenSearch container couldn’t open all the Elasticsearch data, so in the end the “solution” was to delete all data from Elasticsearch and start with a new, empty OpenSearch. (It is a test system; the settings are important, the data is not.) (I tried both compatibility and normal mode.)
The MongoDB contains the old/migrated data.

Now the system shows the following errors:

The error message mentioned above.

The overview logs are full of “There is no index target to point to. Creating one now.”

It writes out the logs, but it doesn’t delete the old indices.
(screenshot)

The search doesn’t work, because it tries to access an old index. (“Recalculate index ranges” doesn’t work.)

I can’t open an index set.
(screenshot)

2. Describe your environment:
I run it in Docker; here is the compose file.

docker-compose.yml
version: "3.8"

services:
  graylog-mongodb:
    image: "mongo:5.0.14"
    container_name: graylog-mongodb
    networks:
      graylog_net:
        ipv4_address: 172.20.0.2
    volumes:
      - "/srv/graylog/mongodb:/data/db"
    environment:
      TZ: "Europe/Budapest"
    restart: "always"

  graylog-opensearch:
#    image: "docker.elastic.co/elasticsearch/elasticsearch:7.17.6"
    image: "opensearchproject/opensearch:2.3.0"
#    image: "opensearchproject/opensearch:1.3.2"
    container_name: graylog-opensearch
    networks:
      graylog_net:
        ipv4_address: 172.20.0.3
    environment:
      - "node.name=graylog-opensearch"
#      - "compatibility.override_main_response_version=true"
      - "cluster.name=graylog"
      - "OPENSEARCH_JAVA_OPTS=-Xms1g -Xmx1g"
      - "bootstrap.memory_lock=true"
      - "discovery.type=single-node"
      - "action.auto_create_index=false"
      - "TZ=Europe/Budapest"
      - "plugins.security.ssl.http.enabled=false"
      - "plugins.security.disabled=true"
    ulimits:
      memlock:
        hard: -1
        soft: -1
    volumes:
      - "/srv/graylog/opensearch:/usr/share/opensearch/data"
#      - "/srv/graylog/opensearch:/usr/share/elasticsearch/data"
    restart: "always"

  graylog-server:
    hostname: "server"
    image: "graylog/graylog:5.0.2"
    container_name: graylog-server
    depends_on:
      graylog-opensearch:
        condition: "service_started"
      graylog-mongodb:
        condition: "service_started"
    networks:
      graylog_net:
        ipv4_address: 172.20.0.4
    entrypoint: "/usr/bin/tini -- wait-for-it graylog-opensearch:9200 --  /docker-entrypoint.sh"
    environment:
      GRAYLOG_NODE_ID_FILE: "/usr/share/graylog/data/config/node-id"
      GRAYLOG_PASSWORD_SECRET: "XX"
      GRAYLOG_ROOT_PASSWORD_SHA2: "XX"
      GRAYLOG_HTTP_BIND_ADDRESS: "0.0.0.0:9000"
      GRAYLOG_HTTP_EXTERNAL_URI: "http://localhost:9000/"
      GRAYLOG_ELASTICSEARCH_HOSTS: "http://graylog-opensearch:9200"
      GRAYLOG_MONGODB_URI: "mongodb://graylog-mongodb:27017/graylog"
      TZ: "Europe/Budapest"
      GRAYLOG_TIMEZONE: "Europe/Budapest"
      GRAYLOG_ROOT_TIMEZONE: "Europe/Budapest"
    ports:
    - "1514:1514/tcp"   # Syslog
    - "9000:9000/tcp"   # Server API
    volumes:
      - "/srv/graylog/graylog/node-id:/usr/share/graylog/data/config/node-id:ro"
      - "/srv/graylog/graylog/data:/usr/share/graylog/data/data"
      - "/srv/graylog/graylog/journal:/usr/share/graylog/data/journal"
    restart: "always"
networks:
  graylog_net:
    driver: bridge
    driver_opts:
      com.docker.network.bridge.host_binding_ipv4: "192.168.254.20"
    ipam:
      config:
        - subnet: 172.20.0.0/24

I tried to check the OpenSearch cluster status:

root@bds-docker:/srv/graylog/docker-graylog# docker exec -it  graylog-server bash
graylog@server:~$ curl -XGET http://elasticsearch:9200/_cluster/health?pretty
{
  "cluster_name" : "graylog",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "discovered_master" : true,
  "discovered_cluster_manager" : true,
  "active_primary_shards" : 40,
  "active_shards" : 40,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}
graylog@server:~$ curl -XGET http://graylog-opensearch:9200/_cluster/health?pretty
{
  "cluster_name" : "graylog",
  "status" : "green",
  "timed_out" : false,
  "number_of_nodes" : 1,
  "number_of_data_nodes" : 1,
  "discovered_master" : true,
  "discovered_cluster_manager" : true,
  "active_primary_shards" : 40,
  "active_shards" : 40,
  "relocating_shards" : 0,
  "initializing_shards" : 0,
  "unassigned_shards" : 0,
  "delayed_unassigned_shards" : 0,
  "number_of_pending_tasks" : 0,
  "number_of_in_flight_fetch" : 0,
  "task_max_waiting_in_queue_millis" : 0,
  "active_shards_percent_as_number" : 100.0
}

Any idea?

Hey, @macko003 ,

Thanks for coming to the community to get free support from peers. As you already know, we have several expert users here who can help you find a solution for your question. I’ll start off the responses, which I’m sure you’ll get from others, by looking at your issue from the perspective of our documentation. Please let me know if this helps.

I wonder if the issue is related to the fact that you deleted all data from Elasticsearch and started with a new empty index. This means that Graylog is not able to find the index it is looking for and is unable to retrieve the Elasticsearch cluster health.

If that’s the case, it’s likely that the Graylog server is still configured to use the old Elasticsearch index, which no longer exists. To test this, you will need to update the Graylog server configuration to point to the new index.

You could create a new index set in the Graylog web interface, and then configure the Graylog server to use the new index set. You can do this by going to System > Indices > Manage indices and creating a new index set. Then, you can go to System > Indices > Configuration and choose the new index set as the active index set.

Alternatively, you might try updating the Graylog server configuration file (graylog.conf) to point to the new Elasticsearch index by updating the elasticsearch_index_prefix and elasticsearch_index_name_template settings.

Check the environment variables, the configuration, and the Elasticsearch version to ensure they match the Graylog version, and that the graylog-elasticsearch plugin is installed.

If you’re running graylog in a container, make sure that the containers are running and communicating properly, and also check the logs of all the containers to see if there are any issues.

Finally, I’d suggest checking the environment variables in the compose file. You may have to update the values for the elasticsearch version, time zone and other variables. Hope this helps!

Please let me know if you have any questions or need further assistance.


Hello @macko003

There is an issue that looks the same as yours, if you read further down the post.

I believe this has to do with the connection between Graylog and ES/OS. Since OpenSearch has plugins.security, I noticed members are trying to use certificates and a secure connection between the two. By default, the OpenSearch YAML file has this enabled already.

So, using this configuration in the Graylog config file does not work:

elasticsearch_hosts = https://192.168.1.100:9200

But this does

elasticsearch_hosts = http://192.168.1.100:9200

So enabling anything but user/password will not work:

elasticsearch_hosts = http://node1:9200,http://user:password@node2:19200

Normally, when I see “loading” in my GUI, it’s either certificates (HTTPS), or Graylog is unable to connect to ES/OS due to configuration.
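A quick way to check which scheme OpenSearch is actually answering on (a sketch; the address below is the example IP from this post, adjust it to your host):

```shell
# Example address from this post; replace with your own OpenSearch host.
OS_HOST="192.168.1.100:9200"

# With plugins.security disabled, plain HTTP should return the cluster
# banner JSON, and HTTPS should fail because nothing is serving TLS.
curl -s  --max-time 3 "http://${OS_HOST}/"  || echo "plain HTTP failed"
curl -sk --max-time 3 "https://${OS_HOST}/" || echo "HTTPS failed (expected when TLS is off)"
```

Whichever of the two succeeds is the scheme that belongs in `elasticsearch_hosts`.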

Last, what do the log files show? Not only from Graylog, but also from MongoDB and OpenSearch?

I do have Graylog working in Docker; here is my docker-compose file, hope it might help.

version: '3'
services:
   # MongoDB: https://hub.docker.com/_/mongo/
  mongodb:
   # Container time Zone     
    image: mongo:4.4.18    
    network_mode: bridge
   # DB in share for persistence
    volumes:
      - mongo_data:/data/db
   
  opensearch-node2:
    image: opensearchproject/opensearch:1.3.2
    network_mode: bridge
    #data folder in share for persistence
    volumes:
      - es_data:/usr/share/opensearch/data
    environment:
      - http.host=0.0.0.0
      - transport.host=localhost
      - network.host=0.0.0.0     
      - "ES_JAVA_OPTS=-Xms512m -Xmx512m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    mem_limit: 1g
  graylog:
    #image: graylog/graylog-enterprise:4.3.3-jre11
    image: graylog/graylog-enterprise:4.3.9-jre11   
    network_mode: bridge
    dns:
      - 8.8.8.8
      - 8.8.4.4
   # journal and config directories in local NFS share for persistence
    volumes:
       - graylog_journal:/usr/share/graylog/data/journal
       # - graylog_bin:/usr/share/graylog/bin
       - graylog_bin:/usr/share/graylog-server/bin/
       - graylog_data:/usr/share/graylog/data/config
       - graylog_log:/usr/share/graylog/data/log
       - graylog_plugin:/usr/share/graylog/data/plugin
       - graylog_content:/usr/share/graylog/data/contentpacks
      # Mount local configuration directory into Docker container
       - graylog_scripts:/usr/share/graylog/scripts
       #- ./graylog/data/journal:/usr/share/graylog/data/journal
       #- ./graylog/config:/usr/share/graylog/data/config

    environment:
      # Container time Zone
      - TZ=America/Chicago
      # CHANGE ME (must be at least 16 characters)!
      - GRAYLOG_PASSWORD_SECRET=pJod1TRZAckHmqM2oQPqX1qnLVJS99jHm2DuCux2Bpiuu2XLT
      # Password: admin
      - GRAYLOG_ROOT_PASSWORD_SHA2=ef92b778bafe771e89245b89ecbc911881f383d4473e94f
      - GRAYLOG_HTTP_BIND_ADDRESS=0.0.0.0:9000
      - GRAYLOG_HTTP_EXTERNAL_URI=http://192.168.1.100:9000/
      - GRAYLOG_ROOT_TIMEZONE=America/Chicago
      - GRAYLOG_ROOT_EMAIL=greg.smith@domain.com
      - GRAYLOG_HTTP_PUBLISH_URI=http://192.168.1.100:9000/
      - GRAYLOG_TRANSPORT_EMAIL_PROTOCOL=smtp
      - GRAYLOG_HTTP_ENABLE_CORS=true
      - GRAYLOG_TRANSPORT_EMAIL_WEB_INTERFACE_URL=http://192.168.1.100:9000/
      - GRAYLOG_TRANSPORT_EMAIL_HOSTNAME=192.168.1.100
      - GRAYLOG_TRANSPORT_EMAIL_ENABLED=true
      - GRAYLOG_TRANSPORT_EMAIL_PORT=25
      - GRAYLOG_TRANSPORT_EMAIL_USE_AUTH=false
      - GRAYLOG_TRANSPORT_EMAIL_USE_TLS=false
      - GRAYLOG_TRANSPORT_EMAIL_USE_SSL=false
      - GRAYLOG_TRANSPORT_FROM_EMAIL=root@localhost
      - GRAYLOG_TRANSPORT_SUBJECT_PREFIX=[graylog]
      - GRAYLOG_REPORT_DISABLE_SANDBOX=true
      #- depends_on: GRAYLOG_REPORT_RENDER_URI=http://192.168.1.100:9000
      - GRAYLOG_REPORT_USER=graylog-report
      - GRAYLOG_REPORT_RENDER_ENGINE_PORT=9515
    logging:
      driver: syslog
      options:
        syslog-address: "udp://192.168.2.120:51420"
        syslog-facility: "local7"
        syslog-format: "rfc3164"
        tag: "ansible"
    links:
      - mongodb:mongo
      - opensearch-node2
    #restart: always
    depends_on:
      - mongodb
      - opensearch-node2
    ports:
      # Graylog web interface and REST API
      - 9000:9000
      # Syslog TCP
      - 8514:8514
      # Elasticsearch
      - 9200:9200
      - 9300:9300
      # Syslog UDP
      - 8514:8514/udp
      # GELF TCP
      #- 12201:12201
      # GELF UDP
      - 12201:12201/udp
      # Reports
      - 9515:9515
      - 9515:9515/udp
      # beats
      - 5044:5044
      # email
      - 25:25
      - 25:25/udp
      # web
      - 80:80
      - 443:443
      - 21:21
      # Forwarder
      - 13302:13302
      - 13301:13301
      # keycloak
      - 8443:8443
      # packetbeat
      - 5055:5055
      # Syslogs
      - 51420:51420
      # CEF Messages
      - 5555:5555/udp
#Volumes for persisting data, see https://docs.docker.com/engine/admin/volumes/volumes/
volumes:
  mongo_data:
    driver: local
  es_data:
    driver: local
  graylog_journal:
    driver: local
  graylog_bin:
    driver: local
  graylog_data:
    driver: local
  graylog_log:
    driver: local
  graylog_plugin:
    driver: local
  graylog_content:
    driver: local
  graylog_scripts:
    driver: local

Thanks for the help.

OK, I have cloned the system to make a playground:
empty graylog and opensearch folders, the original mongo data.
The error is the same.

I recreated it again with completely empty folders/data. Everything is fine in that system: it shows a green cluster status, and I can recreate the active index.

So the problem is somewhere in the mongo data.
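One way to see what the old MongoDB data still references is to query Graylog's collections directly (a sketch; the container name comes from my compose file, `index_sets` is the collection Graylog uses for index set configuration, and on newer images the shell is `mongosh` instead of `mongo`):

```shell
# Graylog database name from the compose file (GRAYLOG_MONGODB_URI).
DB="graylog"

# List the index sets Graylog still knows about from the old installation.
docker exec -i graylog-mongodb mongo "$DB" --eval \
  'db.index_sets.find({}, {title: 1, index_prefix: 1}).forEach(printjson)' || true
```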

So the docker logs:

mongo:
No errors, just entries like this:

{"t":{"$date":"2023-01-23T17:44:31.092+01:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"Checkpointer","msg":"WiredTiger message","attr":{"message":"[1674492271:92567][1:0x7fde96783700], WT_SESSION.checkpoint: [WT_VERB_CHECKPOINT_PROGRESS] saving checkpoint snapshot min: 1031585, snapshot max: 1031585 snapshot count: 0, oldest timestamp: (0, 0) , meta checkpoint timestamp: (0, 0) base write gen: 8741050"}}
{"t":{"$date":"2023-01-23T17:45:31.156+01:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"Checkpointer","msg":"WiredTiger message","attr":{"message":"[1674492331:156534][1:0x7fde96783700], WT_SESSION.checkpoint: [WT_VERB_CHECKPOINT_PROGRESS] saving checkpoint snapshot min: 1031751, snapshot max: 1031751 snapshot count: 0, oldest timestamp: (0, 0) , meta checkpoint timestamp: (0, 0) base write gen: 8741050"}}
{"t":{"$date":"2023-01-23T17:46:31.212+01:00"},"s":"I",  "c":"STORAGE",  "id":22430,   "ctx":"Checkpointer","msg":"WiredTiger message","attr":{"message":"[1674492391:211988][1:0x7fde96783700], WT_SESSION.checkpoint: [WT_VERB_CHECKPOINT_PROGRESS] saving checkpoint snapshot min: 1031923, snapshot max: 1031923 snapshot count: 0, oldest timestamp: (0, 0) , meta checkpoint timestamp: (0, 0) base write gen: 8741050"}}

opensearch
Just info, no errors:

[2023-01-23T17:39:55,531][INFO ][o.o.j.s.JobSweeper       ] [graylog-opensearch] Running full sweep
[2023-01-23T17:44:55,532][INFO ][o.o.j.s.JobSweeper       ] [graylog-opensearch] Running full sweep
[2023-01-23T17:46:29,302][INFO ][o.o.c.m.MetadataMappingService] [graylog-opensearch] [graylog_70/l-3UTHBSS8SHOGDMSsCD0A] update_mapping [_doc]
[2023-01-23T17:46:29,353][INFO ][o.o.a.u.d.DestinationMigrationCoordinator] [graylog-opensearch] Detected cluster change event for destination migration

Graylog:
Full of errors… :frowning:
(I tried to pick one of each error, not the full log.)

2023-01-19 12:02:37,611 WARN : org.graylog2.indexer.indices.Indices - Couldn't create index gl-failures_0. Error: No index template provider found for type 'failures'
java.lang.IllegalStateException: No index template provider found for type 'failures'
2023-01-19 12:02:37,615 ERROR: org.graylog2.periodical.IndexRotationThread - Couldn't point deflector to a new index
java.lang.RuntimeException: Could not create new target index <gl-failures_0>.

2023-01-19 12:02:39,843 ERROR: org.graylog.events.processor.EventProcessorEngine - Caught an unhandled exception while executing event processor <aggregation-v1/Threat IP-vel való kommunikáció/60aa10648ba6ce6b579ef227> - Make sure to modify the event processor to throw only EventProcessorExecutionException so we get more context!
org.graylog2.indexer.IndexNotFoundException: Unable to perform scroll search[graylog_884]

Index not found for query: graylog_884. Try recalculating your index ranges.

2023-01-19 12:02:39,896 WARN : org.graylog.plugins.map.geoip.MaxMindIpResolver - Error creating DatabaseReader for 'MaxMindIpAsnResolver' with config file ''
2023-01-19 12:02:39,900 WARN : org.graylog.plugins.map.geoip.MaxMindIpResolver - Error creating DatabaseReader for 'MaxMindIpAsnResolver' with config file ''
2023-01-19 12:02:41,217 ERROR: org.graylog2.indexer.messages.Messages - Caught exception during bulk indexing: ElasticsearchException{message=ElasticsearchException[An error occurred: ]; nested: IOException[Unable to parse response body for Response{requestLine=POST /_bulk?timeout=1m HTTP/1.1, host=http://graylog-opensearch:9200, response=HTTP/1.1 200 OK}]; nested: NullPointerException;, errorDetails=[]}, retrying (attempt #1).
2023-01-19 12:02:41,244 ERROR: org.graylog2.indexer.messages.Messages - Caught exception during bulk indexing: ElasticsearchException{message=ElasticsearchException[An error occurred: ]; nested: IOException[Unable to parse response body for Response{requestLine=POST /_bulk?timeout=1m HTTP/1.1, host=http://graylog-opensearch:9200, response=HTTP/1.1 200 OK}]; nested: NullPointerException;, errorDetails=[]}, retrying (attempt #2).
2023-01-19 12:02:41,557 ERROR: org.graylog.plugins.threatintel.whois.ip.WhoisIpLookup - Could not lookup WHOIS information for [192.168.1.2] at [ARIN].
2023-01-19 12:02:41,561 ERROR: org.graylog.plugins.threatintel.whois.ip.WhoisIpLookup - Could not lookup WHOIS information for [192.168.0.15] at [ARIN].

What I tried:

  • Deleted the deflectors. They were created again, and Graylog writes data to OpenSearch. I can also “rotate active write index”.
  • Disabled the GeoIP resolver and the Threat Intel plugins.
  • Recalculated index ranges, because Graylog doesn’t find graylog_884 (I deleted it, so that is expected), but I got the same error…
  • Created the graylog_884 index. This solved the search error, I see the data now under search. AND it also solved the cluster state error :slight_smile:
  • Created gl-failures_0.
  • Recalculated all index sets’ ranges, which was successful.
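Creating the missing index by hand can be done with a single curl call (a sketch; the host and index name are the ones from this thread). Setting `number_of_replicas` to 0 up front avoids the yellow cluster state a single-node setup gets with the default of one replica:

```shell
# OpenSearch endpoint from the compose file, and the index Graylog
# was still looking for.
OS_URL="http://graylog-opensearch:9200"
INDEX="graylog_884"

# 0 replicas, because a single node has nowhere to place replica shards.
curl -s -XPUT "${OS_URL}/${INDEX}" \
  -H 'Content-Type: application/json' \
  -d '{"settings": {"number_of_replicas": 0}}' || true
```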

When I tried to rotate the index set, I get:

2023-01-23 18:26:15,293 INFO : org.graylog2.rest.resources.system.DeflectorResource - Cycling deflector for index set <61fc1f731bf7eb4ce04d9a7d>. Reason: REST request.
2023-01-23 18:26:15,298 INFO : org.graylog2.indexer.MongoIndexSet - Cycling from <gl-failures_0> to <gl-failures_1>.
2023-01-23 18:26:15,299 INFO : org.graylog2.indexer.MongoIndexSet - Creating target index <gl-failures_1>.
2023-01-23 18:26:15,306 WARN : org.graylog2.indexer.indices.Indices - Couldn't create index gl-failures_1. Error: No index template provider found for type 'failures'
java.lang.IllegalStateException: No index template provider found for type 'failures'

This is the only error left.
I did some more research and found this:

https://community.graylog.org/t/migration-from-elasticsearch-to-opensearch-gone-wrong/27237

@gsmith mentioned a security plugin. I haven’t installed it. BUT the old Graylog was an Enterprise one (a demo; we don’t need the feature). That could be the problem. I can’t delete the index set, because a stream is connected to it.
Can I delete the stream and the index set? How?

(screenshot)
(screenshot)
(I tried to check how I can remove the Enterprise feature, but I only found “remove the plugin and restart”.)

Hey,

That’s kind of odd, what you went through to solve some of your issues, and thanks for the feedback on what you had to do.

By chance, did you use cURL to create the index graylog_884? For testing, did you rotate that index set so you are on graylog_885, and does it work correctly?

It’s connected to a stream called “Processing and indexing failures”, and you can’t delete that stream.

Just out of curiosity, and since this is a “playground”: how about deleting the Graylog database (which I believe should be called graylog) and then doing a restart? If this works without errors, then I’m 100% sure something is going on with MongoDB; if not, we can look elsewhere. To make it easier to revert, you can always do a mongodump and restore it back.
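The mongodump idea could look like this (a sketch; the container and database names follow the compose file earlier in this thread):

```shell
# Dump the graylog database to a local archive before experimenting.
docker exec graylog-mongodb mongodump --db graylog --archive > graylog-backup.archive || true

# Restore it later if the experiment goes wrong:
# docker exec -i graylog-mongodb mongorestore --archive --drop < graylog-backup.archive
```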

Hey,

I just found out that it is indeed caused by old data in MongoDB. I dropped the graylog database, restarted Graylog, and all my errors went away.

I’m currently recreating all my config.

Yes, I created it with curl, and I tried the rotate because graylog_884 was created with one default replica, so my cluster went yellow; after the rotate I got graylog_885 and deleted graylog_884 :slight_smile:

No, dropping the database is only an option for the playground. My MongoDB is full of pipelines, alerts, etc. that I would like to save for the future.

Hold my beer…

mongo is accessible in the system:

> db.streams.find( {} )
{ "_id" : ObjectId("000000000000000000000001"), "creator_user_id" : "local:admin", "is_default_stream" : true, "index_set_id" : "5dc03820af2dc41d09b7f4a8", "matching_type" : "AND", "remove_matches_from_default_stream" : false, "description" : "Stream containing all messages", "created_at" : ISODate("2019-11-04T14:39:28.316Z"), "disabled" : false, "title" : "All messages" }
{ "_id" : ObjectId("000000000000000000000002"), "creator_user_id" : "admin", "is_default_stream" : false, "index_set_id" : "60a540b18ba6ce6b5799b9ba", "matching_type" : "AND", "remove_matches_from_default_stream" : true, "description" : "Stream containing all events created by Graylog", "created_at" : ISODate("2021-05-19T16:45:37.736Z"), "disabled" : false, "title" : "All events" }
{ "_id" : ObjectId("000000000000000000000003"), "creator_user_id" : "admin", "is_default_stream" : false, "index_set_id" : "60a540b18ba6ce6b5799b9bd", "matching_type" : "AND", "remove_matches_from_default_stream" : true, "description" : "Stream containing all system events created by Graylog", "created_at" : ISODate("2021-05-19T16:45:37.804Z"), "disabled" : false, "title" : "All system events" }
{ "_id" : ObjectId("000000000000000000000004"), "creator_user_id" : "admin", "is_default_stream" : false, "index_set_id" : "61fc1f731bf7eb4ce04d9a7d", "matching_type" : "AND", "remove_matches_from_default_stream" : true, "description" : "Stream containing messages that failed to be processed or indexed", "created_at" : ISODate("2022-02-03T18:31:15.594Z"), "disabled" : false, "title" : "Processing and Indexing Failures" }
....
> db.streams.deleteMany( {  "_id" : ObjectId("000000000000000000000004") } )
{ "acknowledged" : true, "deletedCount" : 1 }
> db.streams.find( {} )
{ "_id" : ObjectId("000000000000000000000001"), "creator_user_id" : "local:admin", "is_default_stream" : true, "index_set_id" : "5dc03820af2dc41d09b7f4a8", "matching_type" : "AND", "remove_matches_from_default_stream" : false, "description" : "Stream containing all messages", "created_at" : ISODate("2019-11-04T14:39:28.316Z"), "disabled" : false, "title" : "All messages" }
{ "_id" : ObjectId("000000000000000000000003"), "creator_user_id" : "admin", "is_default_stream" : false, "index_set_id" : "60a540b18ba6ce6b5799b9bd", "matching_type" : "AND", "remove_matches_from_default_stream" : true, "description" : "Stream containing all system events created by Graylog", "created_at" : ISODate("2021-05-19T16:45:37.804Z"), "disabled" : false, "title" : "All system events" }
{ "_id" : ObjectId("000000000000000000000002"), "creator_user_id" : "admin", "is_default_stream" : false, "index_set_id" : "60a540b18ba6ce6b5799b9ba", "matching_type" : "AND", "remove_matches_from_default_stream" : true, "description" : "Stream containing all events created by Graylog", "created_at" : ISODate("2021-05-19T16:45:37.736Z"), "disabled" : false, "title" : "All events" }
...

And after that I can delete the ‘Graylog Message Failures’ index set.

I got a Graylog error, but it is “normal” in this state:

2023-01-24 12:49:54,422 ERROR: org.graylog2.indexer.indices.jobs.IndexSetCleanupJob - Unable to delete index template <gl-failures-template>
org.graylog.shaded.opensearch2.org.opensearch.OpenSearchException: Unable to delete index template gl-failures-template
...
{"error":{"root_cause":[{"type":"index_template_missing_exception","reason":"index_template [gl-failures-template] missing"}],"type":"index_template_missing_exception","reason":"index_template [gl-failures-template] missing"},"status":404}

2023-01-24 12:49:54,423 INFO : org.graylog2.indexer.indices.jobs.IndexSetCleanupJob - Removing index range information for index: gl-failures_0
2023-01-24 12:49:54,424 INFO : org.graylog2.indexer.indices.jobs.IndexSetCleanupJob - Deleting index <gl-failures_0> in index set <61fc1f731bf7eb4ce04d9a7d> (Graylog Message Failures)

I tried a Graylog restart, and I got the error again:

2023-01-24 12:49:54,422 ERROR: org.graylog2.indexer.indices.jobs.IndexSetCleanupJob - Unable to delete index template <gl-failures-template>
org.graylog.shaded.opensearch2.org.opensearch.OpenSearchException: Unable to delete index template gl-failures-template

So I created it…

curl -XPOST "localhost:9200/_template/gl-failures-template?pretty" -H 'Content-Type: application/json' -d '{"index_patterns": ["asdfghjk"]}'

I restarted Graylog: no errors.
Then I removed the template:

curl -XDELETE "localhost:9200/_template/gl-failures-template"

A restart again, and no more errors…

So, for the future… (the solution in a shorter version)
I recognised the following errors and problems:

  • After the OpenSearch data was deleted, Graylog started writing the indices from 0 again, but still tried to find the old ones. This caused the “Could not retrieve Elasticsearch cluster health. Fetching Elasticsearch cluster health failed: There was an error fetching a resource: Internal Server Error. Additional information: Couldn’t read Elasticsearch cluster health” error message, and all the other Elasticsearch/OpenSearch connection/query problems. Creating the missing index manually solved it.
  • The MongoDB contains some data from the previous Enterprise version (the stream “Processing and indexing failures” and the index set “Graylog Message Failures”), and OpenSearch doesn’t contain the necessary template (gl-failures-template). I had to delete the stream from mongo and the unnecessary index set from the browser.
  • After the “cleaning”, Graylog still searched for the gl-failures-template template, so I created it, restarted Graylog, and then removed it.

Guys, thanks for your time; it just took a lot of time to debug everything (which I caused for myself). I think it was 4-6 hours :frowning:


Hey,

Awesome, and thanks for the feedback :+1: , glad you resolved it without losing data.

Holy cow @macko003 , you had to go through a lot to get rid of that error. I’m wondering why this happened?