Docker stack setup

1. Describe your incident:
The default index set and the active input (GELF UDP) are removed after running Graylog for some time. It’s not possible to access any streams or indices, or to run a search. Furthermore, the cluster ID is just 000-000-000…

We’re not sure how these issues are related.

2. Describe your environment:

  • OS Information: Docker on Linux 4.18.0-193.75.1.el8_2.x86_64 (Red Hat), 15 GB RAM, 150 GB storage

  • Package Version: Docker compose file:

  mongodb:
    image: "mongo:5.0"
    volumes:
      - "mongodb_data:/data/db"
    networks:
      wardnet:
        aliases:
          - ward-mongo.dk

  elasticsearch:
    environment:
      ES_JAVA_OPTS: "-Xms1g -Xmx1g -Dlog4j2.formatMsgNoLookups=true"
      bootstrap.memory_lock: "true"
      discovery.type: "single-node"
      http.host: "0.0.0.0"
      action.auto_create_index: "false"
    image: "docker.elastic.co/elasticsearch/elasticsearch-oss:7.10.2"
    ulimits:
      memlock:
        hard: -1
        soft: -1
    volumes:
      - "es_data:/usr/share/elasticsearch/data"
    networks:
        wardnet:
            aliases:
                - ward-elasticsearch.dk

  graylog:
    image: "graylog/graylog:4.3"
    depends_on:
      - elasticsearch
      - mongodb
    entrypoint: "/usr/bin/tini -- wait-for-it ward-elasticsearch.dk:9200 --  /docker-entrypoint.sh"
    environment:
      GRAYLOG_NODE_ID_FILE: "/usr/share/graylog/data/config/node-id"
      GRAYLOG_PASSWORD_SECRET: $GRAYLOG_PASSWORD_SECRET
      GRAYLOG_ROOT_PASSWORD_SHA2: $GRAYLOG_ROOT_PASSWORD_SHA2
      GRAYLOG_HTTP_EXTERNAL_URI: http://$HOST_DNS:9000/
      GRAYLOG_HTTP_BIND_ADDRESS: "0.0.0.0:9000"
      GRAYLOG_HTTP_EXTERNAL_URI: "http://localhost:9000/"
      GRAYLOG_ELASTICSEARCH_HOSTS: "http://ward-elasticsearch.dk:9200"
      GRAYLOG_MONGODB_URI: "mongodb://ward-mongo.dk:27017/graylog"
    ports:
    - "5044:5044/tcp"   # Beats
    - "5140:5140/udp"   # Syslog
    - "5140:5140/tcp"   # Syslog
    - "5555:5555/tcp"   # RAW TCP
    - "5555:5555/udp"   # RAW TCP
    - "9000:9000/tcp"   # Server API
    - "12201:12201/tcp" # GELF TCP
    - "12201:12201/udp" # GELF UDP
    #- "10000:10000/tcp" # Custom TCP port
    #- "10000:10000/udp" # Custom UDP port
    - "13301:13301/tcp" # Forwarder data
    - "13302:13302/tcp" # Forwarder config
    volumes:
      - "graylog_data:/usr/share/graylog/data/data"
      - "graylog_journal:/usr/share/graylog/data/journal"
      - "./graylog/node-id:/usr/share/graylog/data/config/node-id"
    networks:
        wardnet:
            aliases:
                - ward-graylog.dk
volumes:
  mongodb_data:
    driver: local
  es_data:
    driver: local
  graylog_data:
    driver: local
  graylog_journal:
    driver: local
  • Service logs, configurations, and environment variables:
    in comments due to link limit

3. What steps have you already taken to try and solve the problem?
We’ve tried persisting inputs via content packs, but these are also removed/not reloaded when our issue occurs.

We checked Elasticsearch for storage, watermark and read-only issues, and tried tweaking the memory limits of Elasticsearch, as we think that’s the root cause.

We also tried various combinations and upgrades and downgrades of graylog-mongo-elastic versions.

We recently reconfigured everything from scratch based on the docker-compose.yml in the Graylog2/docker-compose repository on GitHub (the only thing changed is the node-id).

Checked elastic cluster: “Elasticsearch cluster docker-cluster is green. Shards: 4 active, 0 initializing, 0 relocating, 0 unassigned”

Indexer failures: “Hurray! There are not any indexer failures.”

4. How can the community help?
Can you please review the logs / configuration and check for any rookie mistakes?

Thank you so much! :smiley:

PS. We do have Graylog exposed on a public URL. Would it be okay/helpful to provide that plus credentials here (it’s just a test server where the issue persists), or is that a no-go?

  • Service logs, configurations, and environment variables:

Graylog: Graylog08042022 - Pastebin.com

Elastic search: Elasticsearch04082022 - Pastebin.com

mongo(latest due to max size): MongoDB04082022 - Pastebin.com

Hello,

What’s with the dot before the forward slash (in the ./graylog/node-id volume mount)?

Question: could you explain more about this? How did you remove the default index set, or did you assign a new default?

EDIT: I went over all the logs (a lot); here are some of my findings:

cause java.lang.IllegalStateException: GELF message is too short. Not even the type header would fit.)
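To rule out the input itself, it can help to send one well-formed GELF message and see whether it arrives. The following is a minimal sketch, assuming `nc` is installed and the GELF UDP input listens on localhost:12201 as in the compose file above; `gelf_payload` is a hypothetical helper name, not part of Graylog.

```shell
#!/bin/sh
# gelf_payload HOST MESSAGE: print a minimal GELF 1.1 JSON document.
# No JSON escaping is done, so keep both arguments free of quotes and backslashes.
gelf_payload() {
    printf '{"version":"1.1","host":"%s","short_message":"%s"}' "$1" "$2"
}

# Usage (assumption: nc is available and the input is reachable on localhost:12201/udp):
#   gelf_payload "$(hostname)" "test message" | nc -w 1 -u localhost 12201
```

If the test message shows up but the "too short" exceptions continue, something else is probably sending non-GELF packets to that port, which is plausible on a publicly exposed host.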

WARN : org.mongodb.driver.connection - Got socket exception on connection [connectionId{localValue:33, serverValue:56}] to ward-mongo.dk:27017. All connections to ward-mongo.dk:27017 will be closed

Maybe this post might help.

ERROR: org.graylog2.periodical.IndexRotationThread - Couldn’t perform index block check for index set :

Maybe this post might help.

WARN : org.glassfish.jersey.internal.Errors - The following warnings have been detected: WARNING: Unknown HK2 failure detected:

  • Is there enough disk space for MongoDB to run?
  • Is there enough main memory for MongoDB to run?
  • Are the data files uncorrupted?
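The first two questions can be answered quickly on the Docker host. This is a rough sketch for Linux only; the helper names are made up for illustration, and the path to check is an assumption (Docker volumes usually live under /var/lib/docker unless the data root was changed).

```shell
#!/bin/sh
# disk_used_pct PATH: print the used percentage (integer) of the filesystem holding PATH.
disk_used_pct() {
    df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# mem_available_mb: print MemAvailable from /proc/meminfo in MiB (Linux only).
mem_available_mb() {
    awk '/^MemAvailable:/ { printf "%d\n", $2 / 1024 }' /proc/meminfo
}

# CHECK_PATH is an assumption; point it at the filesystem that holds the Docker volumes.
CHECK_PATH="${CHECK_PATH:-/}"
echo "disk used on $CHECK_PATH: $(disk_used_pct "$CHECK_PATH")%"
echo "memory available: $(mem_available_mb) MiB"
```

For example, `CHECK_PATH=/var/lib/docker sh check.sh` would report on the default Docker data root, where `check.sh` is whatever name the script was saved under.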

"node.name": "684a07264547", "message": "gateway.auto_import_dangling_indices is disabled, dangling indices will not be automatically detected or imported and must be managed manually" }

Check Elasticsearch for dangling indices:

curl -X GET localhost:9200/_dangling?pretty

Check Elasticsearch Indices:

curl -X GET localhost:9200/_cat/shards
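Since the Graylog log also complains about the index block check, it may be worth looking at cluster health and index blocks in the same pass. A small sketch, assuming Elasticsearch is reachable at http://localhost:9200 (override `ES_HOST` otherwise); `es_url` and `es_get` are hypothetical helper names.

```shell
#!/bin/sh
# Base URL of the Elasticsearch node (assumption: reachable on localhost:9200).
ES_HOST="${ES_HOST:-http://localhost:9200}"

# es_url PATH: print the full URL for an Elasticsearch API path.
es_url() {
    printf '%s/%s\n' "${ES_HOST%/}" "${1#/}"
}

# es_get PATH: fetch an API path and fail on HTTP errors.
es_get() {
    curl --silent --show-error --fail "$(es_url "$1")"
}

# Usage (needs a running node):
#   es_get '_cluster/health?pretty'                # overall cluster status
#   es_get '_cat/indices?v'                        # list of indices
#   es_get '_all/_settings/index.blocks.*?pretty'  # read-only or other index blocks
```

In this compose setup port 9200 is not published to the host, so the commands would have to run inside the Docker network, for example from within the Elasticsearch container.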

max virtual memory areas vm.max_map_count [65530] is too low, increase to at least [262144]" }

sysctl -w vm.max_map_count=262144
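Note that `sysctl -w` has to be run on the Docker host, not inside the container, and that it does not survive a reboot. The sketch below checks the current value and prints the commands to raise and persist it; the file name `99-elasticsearch.conf` is only an example, and `needs_raise` is a hypothetical helper.

```shell
#!/bin/sh
# Minimum taken from the Elasticsearch bootstrap check message quoted above.
REQUIRED=262144

# needs_raise CURRENT: succeeds (exit 0) when CURRENT is below the required minimum.
needs_raise() {
    [ "$1" -lt "$REQUIRED" ]
}

# Read the live value on Linux; fall back to 0 if it cannot be read.
current=$(cat /proc/sys/vm/max_map_count 2>/dev/null || echo 0)

if needs_raise "$current"; then
    echo "vm.max_map_count=$current is below $REQUIRED; as root on the Docker host, run:"
    echo "  sysctl -w vm.max_map_count=$REQUIRED"
    echo "  echo 'vm.max_map_count=$REQUIRED' > /etc/sysctl.d/99-elasticsearch.conf"
    echo "  sysctl --system"
else
    echo "vm.max_map_count=$current is sufficient"
fi
```

The script itself changes nothing; it only reports, so it is safe to run as a normal user.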

To be honest, it seems that there are multiple issues with this environment. If you removed the default index set, that would probably be the reason why you have dangling indices. Just a thought.
