Hi All,
We have recently started to ramp up the volume of log messages going to our Graylog instance. It is certainly not a massive amount, but the system cannot cope: it crashes, or the Docker containers need a restart. I want to understand how to find out which component is causing this behaviour and how to identify the bottleneck.
I initially assumed disk load was the cause, so I switched to SSDs on GCE, but that did not reduce the issue.
Disclaimer: I'm not a system admin. I just threw this together so we would have a working solution.
Google Cloud Compute Instance
1 vCPU, 4.75 GB memory
SSD: 200GB
docker-compose file:
version: '3.2'
services:
  # MongoDB: https://hub.docker.com/_/mongo/
  mongo:
    image: mongo:3
    restart: unless-stopped
    volumes:
      - mongo_data:/data/db
    networks:
      - graylog
  # Elasticsearch: https://www.elastic.co/guide/en/elasticsearch/reference/6.x/docker.html
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch-oss:6.8.5
    restart: unless-stopped
    volumes:
      - es_data:/usr/share/elasticsearch/data
    environment:
      - http.host=0.0.0.0
      - transport.host=localhost
      - network.host=0.0.0.0
      - "ES_JAVA_OPTS=-Xms1536m -Xmx1536m"
    ulimits:
      memlock:
        soft: -1
        hard: -1
    #mem_limit: 1g
    networks:
      - graylog
  # Graylog: https://hub.docker.com/r/graylog/graylog/
  graylog:
    image: graylog/graylog:3.3
    volumes:
      - graylog_journal:/usr/share/graylog/data/journal
    restart: unless-stopped
    networks:
      - graylog
    depends_on:
      - mongo
      - elasticsearch
    environment:
      GRAYLOG_SERVER_JAVA_OPTS: "-Djavax.net.ssl.trustStore=/usr/share/graylog/data/config/ssl/cacerts.jks"
    ports:
      # Graylog HTTPS and REST API
      - 443:443
      #- 127.0.0.1:9000:9000
      # Syslog TCP
      - 514:514
      # Syslog UDP
      - 514:514/udp
      # Syslog UDP Tag Systems
      - 515:515/udp
      # Syslog UDP for Linux Hosts
      - 1514:1514/udp
      # GELF TCP
      - 12201:12201
      # GELF UDP
      - 12201:12201/udp
    logging:
      driver: "json-file"
    volumes:
      # Mount local configuration directory into Docker container
      - ./graylog/config:/usr/share/graylog/data/config
      # Mount GeoIP database directory
      - ./graylog/geoip:/usr/share/graylog/data/geoip
      # Mount local plugin files into Docker container
      #- ./graylog/plugin/graylog-plugin-auth-sso-3.3.0.jar:/usr/share/graylog/plugin/graylog-plugin-auth-sso-3.3.0.jar
      - ./graylog/plugin/graylog-plugin-enterprise-integrations-3.3.7.jar:/usr/share/graylog/plugin/graylog-plugin-enterprise-integrations-3.3.7.jar
      - ./graylog/plugin/graylog-plugin-integrations-3.3.7.jar:/usr/share/graylog/plugin/graylog-plugin-integrations-3.3.7.jar
      - ./graylog/plugin/graylog-plugin-enterprise-3.3.7.jar:/usr/share/graylog/plugin/graylog-plugin-enterprise-3.3.7.jar
      # Mount local Graylog Enterprise binary files into Docker container
      - ./graylog/bin/chromedriver:/usr/share/graylog/bin/chromedriver
      - ./graylog/bin/chromedriver_start.sh:/usr/share/graylog/bin/chromedriver_start.sh
      - ./graylog/bin/headless_shell:/usr/share/graylog/bin/headless_shell
volumes:
  mongo_data:
    driver: local
  es_data:
    driver: local
  graylog_journal:
    driver: local
networks:
  graylog:
Docker container stats:
CONTAINER ID   NAME                      CPU %   MEM USAGE / LIMIT     MEM %    NET I/O           BLOCK I/O         PIDS
5ea64500913e   graylog_elasticsearch_1   3.84%   1.946GiB / 4.594GiB   42.35%   3.58GB / 1.02GB   2.09GB / 9.73GB   43
7aa916c45374   graylog_graylog_1         1.50%   1.475GiB / 4.594GiB   32.11%   6.16GB / 6.81GB   244MB / 1.45GB    227
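As a quick sanity check on those numbers, here is a small Python sketch that adds up the memory the two listed containers use against the 4.594GiB limit that docker reports. The figures are copied from the stats output above. The mongo container does not appear in that output, so it is not counted, and the helper name `memory_summary` is just something made up for illustration.

```python
# Figures copied from the `docker stats` output above, in GiB.
# Only the two containers shown there are included; mongo is not listed.
TOTAL_GIB = 4.594
usage_gib = {
    "graylog_elasticsearch_1": 1.946,
    "graylog_graylog_1": 1.475,
}


def memory_summary(usage, total):
    """Return (used, remaining, percent_used) for the given container usage."""
    used = sum(usage.values())
    return used, total - used, 100.0 * used / total


used, free, pct = memory_summary(usage_gib, TOTAL_GIB)
print(f"listed containers: {used:.3f} GiB ({pct:.1f}% of {TOTAL_GIB} GiB)")
print(f"left for mongo, the OS and page cache: {free:.3f} GiB")
```

If I have done the arithmetic right, the two containers together already account for roughly three quarters of the memory, which may or may not be relevant to the crashes.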
Stats from host system: [image not preserved]
Config of Graylog and stats: [image not preserved]
Inputs:
GELF UDP
linux-syslog
syslog
tag-syslog
Let me know if you need any further information to point me in the right direction.