Newbie here. We’ve been having issues with Graylog going down in some deployments because it was receiving too many logs. We solved it (hopefully) by increasing the JVM heap size for both Graylog and Elasticsearch to 4 GB each.
We were at 963,141 unprocessed messages when it went down.
Now we want to set up some kind of alert that tells us when there are too many unprocessed messages (all of our infrastructure is on AWS), so we get notified when Graylog is falling behind instead of only finding out by manually checking the server. Something like the sketches below is what we have in mind.
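To be concrete, here is a rough sketch of the metric side: a small script (run from cron on the Graylog box) that reads the journal's uncommitted entry count and pushes it to CloudWatch as a custom metric. This assumes Graylog's `GET /api/system/journal` endpoint and its `uncommitted_journal_entries` field, plus an API token; the `Graylog` namespace and metric name are just placeholders we made up.

```python
# Sketch: poll Graylog's journal API and push the uncommitted entry count
# to CloudWatch. Assumes GET /api/system/journal returns a JSON body with
# an "uncommitted_journal_entries" field; endpoint, token, namespace and
# metric name below are placeholders for our setup.
import boto3
import requests

GRAYLOG_JOURNAL_URL = "http://localhost:9000/api/system/journal"
API_TOKEN = "REPLACE_ME"  # Graylog API token (sent as username, password is the literal "token")

def publish_journal_metric():
    resp = requests.get(
        GRAYLOG_JOURNAL_URL,
        auth=(API_TOKEN, "token"),
        headers={"Accept": "application/json"},
        timeout=10,
    )
    resp.raise_for_status()
    unprocessed = resp.json()["uncommitted_journal_entries"]

    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_data(
        Namespace="Graylog",  # placeholder namespace
        MetricData=[{
            "MetricName": "UncommittedJournalEntries",
            "Value": unprocessed,
            "Unit": "Count",
        }],
    )

if __name__ == "__main__":
    publish_journal_metric()
```

We’d probably run this every minute from cron, but if there’s a more standard way to get this number into CloudWatch we’re open to it.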
This is our server.conf:

```
is_master = true
node_id_file = /etc/graylog/server/node-id
bin_dir = /usr/share/graylog-server/bin
data_dir = /var/lib/graylog-server
plugin_dir = /usr/share/graylog-server/plugin
http_bind_address = 0.0.0.0:9000
rotation_strategy = count
elasticsearch_max_docs_per_index = 20000000
elasticsearch_max_number_of_indices = 20
retention_strategy = delete
elasticsearch_shards = 4
elasticsearch_replicas = 0
elasticsearch_index_prefix = graylog
allow_leading_wildcard_searches = false
allow_highlighting = false
elasticsearch_analyzer = standard
output_batch_size = 500
output_flush_interval = 1
output_fault_count_threshold = 5
output_fault_penalty_seconds = 30
processbuffer_processors = 5
outputbuffer_processors = 3
processor_wait_strategy = blocking
ring_size = 65536
inputbuffer_ring_size = 65536
inputbuffer_processors = 2
inputbuffer_wait_strategy = blocking
message_journal_enabled = true
message_journal_dir = /var/lib/graylog-server/journal
lb_recognition_period_seconds = 3
mongodb_uri = mongodb://localhost/graylog
mongodb_max_connections = 1000
mongodb_threads_allowed_to_block_multiplier = 5
proxied_requests_thread_pool_size = 32
```
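And for the AWS side, a minimal sketch of the alarm we’d hang off that custom metric, using boto3’s put_metric_alarm. The threshold and the SNS topic ARN are placeholders we’d still need to tune (we went down at roughly 963k, so alarming well before that seems sensible):

```python
# Sketch: a CloudWatch alarm on the custom metric published above.
# Threshold and SNS topic ARN are placeholders, not tested values.
import boto3

cloudwatch = boto3.client("cloudwatch")
cloudwatch.put_metric_alarm(
    AlarmName="graylog-unprocessed-messages",
    Namespace="Graylog",                       # must match the publisher's namespace
    MetricName="UncommittedJournalEntries",
    Statistic="Maximum",
    Period=300,                                # evaluate 5-minute windows
    EvaluationPeriods=2,                       # two consecutive breaches before alarming
    Threshold=100000,                          # placeholder; tune for your traffic
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:REGION:ACCOUNT_ID:graylog-alerts"],  # placeholder ARN
    TreatMissingData="breaching",              # missing data likely means the poller or Graylog is down
)
```

Does this look like a reasonable approach, or is there a built-in Graylog/AWS way to alert on the journal backlog that we’re missing?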