1. Describe your incident:
After Graylog has been running for some time, the default index set and the active input (GELF UDP) are removed. It is then no longer possible to access any streams or indices, or to run a search. Furthermore, the cluster ID is just 000-000-000…
We’re not sure how these issues are related.
2. Describe your environment:
- OS Information: Docker on Linux 4.18.0-193.75.1.el8_2.x86_64 (Red Hat), 15 GB RAM, 150 GB storage
- Package Version: Docker compose file:
mongodb:
  image: "mongo:5.0"
  volumes:
    - "mongodb_data:/data/db"
  networks:
    wardnet:
      aliases:
        - ward-mongo.dk
elasticsearch:
  environment:
    ES_JAVA_OPTS: "-Xms1g -Xmx1g -Dlog4j2.formatMsgNoLookups=true"
    bootstrap.memory_lock: "true"
    discovery.type: "single-node"
    http.host: "0.0.0.0"
    action.auto_create_index: "false"
  image: "docker.elastic.co/elasticsearch/elasticsearch-oss:7.10.2"
  ulimits:
    memlock:
      hard: -1
      soft: -1
  volumes:
    - "es_data:/usr/share/elasticsearch/data"
  networks:
    wardnet:
      aliases:
        - ward-elasticsearch.dk
graylog:
  image: "graylog/graylog:4.3"
  depends_on:
    - elasticsearch
    - mongodb
  entrypoint: "/usr/bin/tini -- wait-for-it ward-elasticsearch.dk:9200 -- /docker-entrypoint.sh"
  environment:
    GRAYLOG_NODE_ID_FILE: "/usr/share/graylog/data/config/node-id"
    GRAYLOG_PASSWORD_SECRET: $GRAYLOG_PASSWORD_SECRET
    GRAYLOG_ROOT_PASSWORD_SHA2: $GRAYLOG_ROOT_PASSWORD_SHA2
    GRAYLOG_HTTP_EXTERNAL_URI: http://$HOST_DNS:9000/
    GRAYLOG_HTTP_BIND_ADDRESS: "0.0.0.0:9000"
    GRAYLOG_HTTP_EXTERNAL_URI: "http://localhost:9000/"
    GRAYLOG_ELASTICSEARCH_HOSTS: "http://ward-elasticsearch.dk:9200"
    GRAYLOG_MONGODB_URI: "mongodb://ward-mongo.dk:27017/graylog"
  ports:
    - "5044:5044/tcp" # Beats
    - "5140:5140/udp" # Syslog
    - "5140:5140/tcp" # Syslog
    - "5555:5555/tcp" # RAW TCP
    - "5555:5555/udp" # RAW TCP
    - "9000:9000/tcp" # Server API
    - "12201:12201/tcp" # GELF TCP
    - "12201:12201/udp" # GELF UDP
    #- "10000:10000/tcp" # Custom TCP port
    #- "10000:10000/udp" # Custom UDP port
    - "13301:13301/tcp" # Forwarder data
    - "13302:13302/tcp" # Forwarder config
  volumes:
    - "graylog_data:/usr/share/graylog/data/data"
    - "graylog_journal:/usr/share/graylog/data/journal"
    - "./graylog/node-id:/usr/share/graylog/data/config/node-id"
  networks:
    wardnet:
      aliases:
        - ward-graylog.dk
volumes:
  mongodb_data:
    driver: local
  es_data:
    driver: local
  graylog_data:
    driver: local
  graylog_journal:
    driver: local
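As a quick sanity check on the compose file above, a small script can flag environment keys that are defined more than once; GRAYLOG_HTTP_EXTERNAL_URI, for instance, appears twice in the graylog service, and depending on the YAML parser only one of the two values may take effect. The following is a minimal sketch in Python, not a real YAML parser: the function name `duplicate_env_keys` and the `sample` text are hypothetical, and it simply counts upper-case keys across the whole text, so the same key used in two different services would also be reported.

```python
import re
from collections import Counter

# Matches lines such as `GRAYLOG_HTTP_BIND_ADDRESS: "0.0.0.0:9000"`.
ENV_KEY = re.compile(r"^\s*([A-Z][A-Z0-9_]*)\s*:")


def duplicate_env_keys(compose_text: str) -> list[str]:
    """Return upper-case keys that are defined more than once in the text."""
    counts = Counter(
        m.group(1)
        for line in compose_text.splitlines()
        if (m := ENV_KEY.match(line))
    )
    return sorted(key for key, n in counts.items() if n > 1)


# Hypothetical excerpt mirroring the graylog environment block above.
sample = """\
graylog:
  environment:
    GRAYLOG_HTTP_EXTERNAL_URI: http://$HOST_DNS:9000/
    GRAYLOG_HTTP_BIND_ADDRESS: "0.0.0.0:9000"
    GRAYLOG_HTTP_EXTERNAL_URI: "http://localhost:9000/"
"""

print(duplicate_env_keys(sample))  # ['GRAYLOG_HTTP_EXTERNAL_URI']
```

To check the real file, read it with `open("docker-compose.yml").read()` and pass the text to the function.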
- Service logs, configurations, and environment variables:
in comments due to link limit
3. What steps have you already taken to try and solve the problem?
We’ve tried persisting inputs via content packs, but these are also removed/not reloaded when our issue occurs.
We checked Elasticsearch for storage, watermark and read-only issues, and tried tweaking its memory limits, as we think that is the root cause.
We also tried various combinations, upgrades and downgrades of the Graylog, MongoDB and Elasticsearch versions.
We recently reconfigured everything from scratch based on this docker-compose/docker-compose.yml at main · Graylog2/docker-compose · GitHub (the only thing changed is the node-id).
We checked the Elasticsearch cluster: “Elasticsearch cluster docker-cluster is green. Shards: 4 active, 0 initializing, 0 relocating, 0 unassigned”
Indexer failures: “Hurray! There are not any indexer failures.”
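The Elasticsearch checks described above (cluster health plus read-only blocks, which Elasticsearch sets when the flood-stage disk watermark is exceeded) can be sketched roughly as follows. This is only an illustration under assumptions: `ES_URL` reuses the host from GRAYLOG_ELASTICSEARCH_HOSTS and resolves only inside the compose network, the helper names `read_only_indices`, `fetch_json` and `report` are hypothetical, and the sample settings at the bottom are made up to show the expected response shape of `GET /_all/_settings`.

```python
import json
from urllib.request import urlopen

# Assumption: taken from GRAYLOG_ELASTICSEARCH_HOSTS; reachable only from
# inside the compose network, since port 9200 is not published to the host.
ES_URL = "http://ward-elasticsearch.dk:9200"


def read_only_indices(settings: dict) -> list[str]:
    """Return the indices whose settings carry a read-only block."""
    blocked = []
    for index, body in settings.items():
        blocks = body.get("settings", {}).get("index", {}).get("blocks", {})
        if any(
            str(blocks.get(key, "false")).lower() == "true"
            for key in ("read_only", "read_only_allow_delete")
        ):
            blocked.append(index)
    return sorted(blocked)


def fetch_json(path: str) -> dict:
    """GET a JSON document from the Elasticsearch HTTP API."""
    with urlopen(ES_URL + path, timeout=10) as resp:
        return json.load(resp)


def report() -> None:
    """Print cluster status and any indices that are blocked for writes."""
    health = fetch_json("/_cluster/health")
    print("status:", health["status"], "unassigned:", health["unassigned_shards"])
    print("read-only indices:", read_only_indices(fetch_json("/_all/_settings")))


# Offline example with made-up settings in the shape returned by /_all/_settings.
example_settings = {
    "graylog_0": {"settings": {"index": {"blocks": {"read_only_allow_delete": "true"}}}},
    "graylog_1": {"settings": {"index": {"number_of_shards": "4"}}},
}
print(read_only_indices(example_settings))  # ['graylog_0']
```

Calling `report()` from a container attached to the same network runs the live check; an empty list of read-only indices together with a green status would match what we observed.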
4. How can the community help?
Can you please review the logs / configuration and check for any rookie mistakes?
Thank you so much!
PS. We do have Graylog exposed on a public URL. Would it be okay/helpful to provide that plus credentials here (it is just a test server where the issue persists), or is that a no-go?