Good morning,
Graylog cluster: 3 nodes, with each machine also running an Elasticsearch cluster node and a MongoDB replica set member.
Each node: 8 CPUs (Intel® Core™ i7-7700 @ 3.60GHz), 64 GB RAM, 480 GB SSD, 1 Gbit/s local network, Debian Linux.
Throughput: ~1k msg/sec.
One index has 3 shards and 2 replicas.
Index rotation strategy: index size
Max index size: 2.0 GB
Max number of indices: 245
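If my math is right, that makes 245 indices × 3 shards × 3 copies (1 primary + 2 replicas) = 2,205 shard copies in the cluster, roughly 735 per node.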
In the last week I have started noticing these errors in the log files more and more often:
2018-05-11T22:34:08.968+03:00 WARN [StreamRouterEngine] Unexpected error during stream matching
java.util.concurrent.TimeoutException: null
at java.util.concurrent.FutureTask.get(FutureTask.java:205) ~[?:1.8.0_171]
at com.google.common.util.concurrent.SimpleTimeLimiter.callWithTimeout(SimpleTimeLimiter.java:147) ~[graylog.jar:?]
at org.graylog2.streams.StreamRouterEngine$Rule.matchWithTimeOut(StreamRouterEngine.java:303) [graylog.jar:?]
at org.graylog2.streams.StreamRouterEngine$Rule.access$000(StreamRouterEngine.java:264) [graylog.jar:?]
at org.graylog2.streams.StreamRouterEngine.match(StreamRouterEngine.java:191) [graylog.jar:?]
at org.graylog2.streams.StreamRouter.route(StreamRouter.java:99) [graylog.jar:?]
at org.graylog2.filters.StreamMatcherFilter.filter(StreamMatcherFilter.java:42) [graylog.jar:?]
at org.graylog2.messageprocessors.MessageFilterChainProcessor.process(MessageFilterChainProcessor.java:100) [graylog.jar:?]
at org.graylog2.shared.buffers.processors.ProcessBufferProcessor.handleMessage(ProcessBufferProcessor.java:114) [graylog.jar:?]
at org.graylog2.shared.buffers.processors.ProcessBufferProcessor.dispatchMessage(ProcessBufferProcessor.java:100) [graylog.jar:?]
at org.graylog2.shared.buffers.processors.ProcessBufferProcessor.onEvent(ProcessBufferProcessor.java:77) [graylog.jar:?]
at org.graylog2.shared.buffers.processors.ProcessBufferProcessor.onEvent(ProcessBufferProcessor.java:42) [graylog.jar:?]
at com.lmax.disruptor.WorkProcessor.run(WorkProcessor.java:143) [graylog.jar:?]
at com.codahale.metrics.InstrumentedThreadFactory$InstrumentedRunnable.run(InstrumentedThreadFactory.java:66) [graylog.jar:?]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_171]
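If I read the trace right, this timeout comes from stream_processing_timeout = 2000 in my config below, i.e. some stream rule needs more than 2 seconds per message. A sketch of how I would look for the slow rule over the REST API (the metrics namespace endpoint and the org.graylog2.plugin.streams metric names are my assumption from the docs; credentials are placeholders):

# Dump per-stream/per-rule timing metrics and look for high executionTime values
curl -s -u admin:PASSWORD \
  'http://10.10.1.4:9000/api/system/metrics/namespace/org.graylog2.plugin.streams?pretty=true'

Around the same time the Elasticsearch log shows: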
[2018-05-11T22:53:38,451][WARN ][o.e.m.j.JvmGcMonitorService] [graylog] [gc][young][14432][3555] duration [1.8s], collections [1]/[2.3s], total [1.8s]/[5.4m], memory [3gb]->[2.5gb]/[30.9gb], all_pools {[young] [517.6mb]->[4.8mb]/[532.5mb]}{[survivor] [16.6mb]->[14.4mb]/[66.5mb]}{[old] [2.4gb]->[2.4gb]/[30.3gb]}
[2018-05-11T22:53:38,451][WARN ][o.e.m.j.JvmGcMonitorService] [graylog] [gc][14432] overhead, spent [1.8s] collecting in the last [2.3s]
[2018-05-11T23:03:36,968][INFO ][o.e.c.m.MetaDataCreateIndexService] [graylog] [graylog_3157] creating index, cause [api], templates [graylog-custom-mapping, graylog-internal], shards [3]/[2], mappings [message]
[2018-05-11T23:03:38,379][INFO ][o.e.c.r.a.AllocationService] [graylog] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[graylog_3157][1]] ...]).
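The GC warnings come from Elasticsearch itself, so I also watch heap usage and GC activity per node with the standard stats API:

# Heap usage percent and GC counters for every Elasticsearch node
curl -s 'http://10.10.1.2:9200/_nodes/stats/jvm?pretty' | grep -E 'heap_used_percent|collection_count'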
My Graylog config (server.conf):
is_master = false # false on two nodes; true on the one master node
node_id_file = /etc/graylog/server/node-id
root_timezone = Europe/Berlin
plugin_dir = /usr/share/graylog-server/plugin
rest_listen_uri = http://10.10.1.4:9000/api/
rest_transport_uri = http://10.10.1.4:9000/api/
rest_enable_cors = true
trusted_proxies = 127.0.0.1/32, 10.10.1.1/32
web_listen_uri = http://10.10.1.4:9000/
elasticsearch_hosts = http://10.10.1.2:9200,http://10.10.1.3:9200,http://10.10.1.4:9200
elasticsearch_connect_timeout = 10s
elasticsearch_max_total_connections = 300
elasticsearch_max_total_connections_per_route = 300
elasticsearch_max_retries = 10
elasticsearch_discovery_enabled = true
elasticsearch_discovery_frequency = 45s
rotation_strategy = count
elasticsearch_max_docs_per_index = 20000000
elasticsearch_max_number_of_indices = 20
retention_strategy = close
elasticsearch_shards = 3
elasticsearch_replicas = 2
elasticsearch_index_prefix = graylog
allow_leading_wildcard_searches = true
allow_highlighting = true
elasticsearch_analyzer = standard
elasticsearch_index_optimization_jobs = 30
output_batch_size = 2000
output_flush_interval = 1
output_fault_count_threshold = 5
output_fault_penalty_seconds = 30
processbuffer_processors = 20
outputbuffer_processors = 10
processor_wait_strategy = blocking
ring_size = 262144
inputbuffer_ring_size = 262144
inputbuffer_processors = 2
inputbuffer_wait_strategy = blocking
message_journal_enabled = true
message_journal_dir = /var/lib/graylog-server/journal
message_journal_max_size = 5gb
lb_recognition_period_seconds = 0
stream_processing_timeout = 2000
stream_processing_max_faults = 3
alert_check_interval = 60
stale_master_timeout = 5000
mongodb_uri = mongodb://10.10.1.2,10.10.1.3,10.10.1.4/graylog?replicaSet=rs01
mongodb_max_connections = 1000
mongodb_threads_allowed_to_block_multiplier = 5
dashboard_widget_default_cache_time = 10s
proxied_requests_thread_pool_size = 32
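To see whether processing keeps up with this config, I check the disk journal utilization on each node (assuming the /api/system/journal endpoint of Graylog 2.x; credentials are placeholders):

# Uncommitted journal entries growing over time = process buffer can't keep up
curl -s -u admin:PASSWORD 'http://10.10.1.4:9000/api/system/journal?pretty=true'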
My Elasticsearch config (elasticsearch.yml):
cluster.name: graylog
node.master: false # true on two nodes; false on this node
node.data: true # true on all three nodes
node.name: ${HOSTNAME}
path.data: /opt/elasticsearch/
path.logs: /var/log/graylog/elasticsearch/
network.host: 10.10.1.4
discovery.zen.ping.unicast.hosts: ["10.10.1.2", "10.10.1.3","10.10.1.4"]
discovery.zen.minimum_master_nodes: 2
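For the Elasticsearch side I check cluster health and the shard layout with the standard cat APIs:

# Overall health, newest indices, and where their shards sit
curl -s 'http://10.10.1.2:9200/_cluster/health?pretty'
curl -s 'http://10.10.1.2:9200/_cat/indices/graylog_*?v' | tail -n 5
curl -s 'http://10.10.1.2:9200/_cat/shards/graylog_*?v' | head -n 20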
and the Elasticsearch JVM heap:
-Xms31g
-Xmx31g
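I read that above roughly 30.5 GB the JVM loses compressed ordinary object pointers, so I verify whether 31g is still under that limit on this machine (standard HotSpot flags; I assume /usr/bin/java is the same binary Elasticsearch uses):

# Should print UseCompressedOops = true; if false, the 31g heap is too big
/usr/bin/java -Xmx31g -XX:+PrintFlagsFinal -version 2>/dev/null | grep -i UseCompressedOops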
Graylog JVM settings (/etc/default/graylog-server):
# Path to the java executable.
JAVA=/usr/bin/java
# Default Java options for heap and garbage collection.
GRAYLOG_SERVER_JAVA_OPTS="-Xms5g -Xmx5g -XX:NewRatio=1 -server -XX:+ResizeTLAB -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:+CMSClassUnloadingEnabled -XX:+UseParNewGC -XX:-OmitStackTraceInFastThrow"
# Pass some extra args to graylog-server. (i.e. "-d" to enable debug mode)
GRAYLOG_SERVER_ARGS=""
# Program that will be used to wrap the graylog-server command. Useful to
# support programs like authbind.
GRAYLOG_COMMAND_WRAPPER=""
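To check whether Graylog's own GC pauses line up with the stream timeouts, I could extend the same options with standard Java 8 GC logging (the log path is my choice; the directory must be writable by the graylog user):

# Same options as above, plus GC logging to a file
GRAYLOG_SERVER_JAVA_OPTS="-Xms5g -Xmx5g -XX:NewRatio=1 -server -XX:+ResizeTLAB -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:+CMSClassUnloadingEnabled -XX:+UseParNewGC -XX:-OmitStackTraceInFastThrow -XX:+PrintGCDetails -XX:+PrintGCDateStamps -Xloggc:/var/log/graylog-server/gc.log"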
Elasticsearch defaults (/etc/default/elasticsearch):
################################
# Elasticsearch
################################
# Elasticsearch home directory
#ES_HOME=/usr/share/elasticsearch
# Elasticsearch Java path
#JAVA_HOME=
# Elasticsearch configuration directory
#CONF_DIR=/etc/elasticsearch
# Elasticsearch data directory
#DATA_DIR=/var/lib/elasticsearch
# Elasticsearch logs directory
#LOG_DIR=/var/log/elasticsearch
# Elasticsearch PID directory
#PID_DIR=/var/run/elasticsearch
# Additional Java OPTS
#ES_JAVA_OPTS=
# Configure restart on package upgrade (true, every other setting will lead to not restarting)
#RESTART_ON_UPGRADE=true
################################
# Elasticsearch service
################################
# SysV init.d
#
# When executing the init script, this user will be used to run the elasticsearch service.
# The default value is 'elasticsearch' and is declared in the init.d file.
# Note that this setting is only used by the init script. If changed, make sure that
# the configured user can read and write into the data, work, plugins and log directories.
# For systemd service, the user is usually configured in file /usr/lib/systemd/system/elasticsearch.service
#ES_USER=elasticsearch
#ES_GROUP=elasticsearch
# The number of seconds to wait before checking if Elasticsearch started successfully as a daemon process
ES_STARTUP_SLEEP_TIME=5
################################
# System properties
################################
# Specifies the maximum file descriptor number that can be opened by this process
# When using Systemd, this setting is ignored and the LimitNOFILE defined in
# /usr/lib/systemd/system/elasticsearch.service takes precedence
#MAX_OPEN_FILES=65536
# The maximum number of bytes of memory that may be locked into RAM
# Set to "unlimited" if you use the 'bootstrap.memory_lock: true' option
# in elasticsearch.yml.
# When using systemd, LimitMEMLOCK must be set in a unit file such as
# /etc/systemd/system/elasticsearch.service.d/override.conf.
#MAX_LOCKED_MEMORY=unlimited
# Maximum number of VMA (Virtual Memory Areas) a process can own
# When using Systemd, this setting is ignored and the 'vm.max_map_count'
# property is set at boot time in /usr/lib/sysctl.d/elasticsearch.conf
#MAX_MAP_COUNT=262144
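Since I left MAX_LOCKED_MEMORY commented out and have no bootstrap.memory_lock in elasticsearch.yml, I can verify whether the ES process locks its memory at all (standard check from the ES docs):

# mlockall should be true if memory locking is configured; false means the heap can be swapped out
curl -s 'http://10.10.1.2:9200/_nodes?filter_path=**.mlockall&pretty'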
Elasticsearch JVM options (jvm.options):
## JVM configuration
################################################################
## IMPORTANT: JVM heap size
################################################################
##
## You should always set the min and max JVM heap
## size to the same value. For example, to set
## the heap to 4 GB, set:
##
## -Xms4g
## -Xmx4g
##
## See https://www.elastic.co/guide/en/elasticsearch/reference/current/heap-size.html
## for more information
##
################################################################
# Xms represents the initial size of total heap space
# Xmx represents the maximum size of total heap space
-Xms31g
-Xmx31g
################################################################
## Expert settings
################################################################
##
## All settings below this section are considered
## expert settings. Don't tamper with them unless
## you understand what you are doing
##
################################################################
## GC configuration
-XX:+UseConcMarkSweepGC
-XX:CMSInitiatingOccupancyFraction=75
-XX:+UseCMSInitiatingOccupancyOnly
## optimizations
# pre-touch memory pages used by the JVM during initialization
-XX:+AlwaysPreTouch
## basic
# force the server VM (remove on 32-bit client JVMs)
-server
# explicitly set the stack size (reduce to 320k on 32-bit client JVMs)
-Xss1m
# set to headless, just in case
-Djava.awt.headless=true
# ensure UTF-8 encoding by default (e.g. filenames)
-Dfile.encoding=UTF-8
# use our provided JNA always versus the system one
-Djna.nosys=true
# use old-style file permissions on JDK9
-Djdk.io.permissionsUseCanonicalPath=true
# flags to configure Netty
-Dio.netty.noUnsafe=true
-Dio.netty.noKeySetOptimization=true
-Dio.netty.recycler.maxCapacityPerThread=0
# log4j 2
-Dlog4j.shutdownHookEnabled=false
-Dlog4j2.disable.jmx=true
-Dlog4j.skipJansi=true
## heap dumps
# generate a heap dump when an allocation from the Java heap fails
# heap dumps are created in the working directory of the JVM
-XX:+HeapDumpOnOutOfMemoryError
# specify an alternative path for heap dumps
# ensure the directory exists and has sufficient space
#-XX:HeapDumpPath=${heap.dump.path}
## GC logging
#-XX:+PrintGCDetails
#-XX:+PrintGCTimeStamps
#-XX:+PrintGCDateStamps
#-XX:+PrintClassHistogram
#-XX:+PrintTenuringDistribution
#-XX:+PrintGCApplicationStoppedTime
# log GC status to a file with time stamps
# ensure the directory exists
#-Xloggc:${loggc}
# By default, the GC log file will not rotate.
# By uncommenting the lines below, the GC log file
# will be rotated every 128MB at most 32 times.
#-XX:+UseGCLogFileRotation
#-XX:NumberOfGCLogFiles=32
#-XX:GCLogFileSize=128M
# Elasticsearch 5.0.0 will throw an exception on unquoted field names in JSON.
# If documents were already indexed with unquoted fields in a previous version
# of Elasticsearch, some operations may throw errors.
#
# WARNING: This option will be removed in Elasticsearch 6.0.0 and is provided
# only for migration purposes.
#-Delasticsearch.json.allow_unquoted_field_names=true
The errors keep appearing more and more often. Which direction should I dig in? Thanks!