Newbie here. We’ve been having issues with Graylog going down in some deployments because it was receiving too many logs. We solved it (hopefully) by increasing the JVM heap size for both Graylog and Elasticsearch to 4 GB each.
We were at 963,141 unprocessed messages when it went down.
Now we want to set up some kind of alert that tells us when there are too many unprocessed messages (all of our infrastructure is on AWS), so we get notified when Graylog is falling behind instead of only finding out by manually checking the server. Something like the sketches below is what we have in mind.
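To be concrete, here is a rough sketch of the metric side: a small script (run from cron on the Graylog box) that reads the journal's uncommitted entry count and pushes it to CloudWatch as a custom metric. This assumes Graylog's `GET /api/system/journal` endpoint and its `uncommitted_journal_entries` field, plus an API token; the `Graylog` namespace and metric name are just placeholders we made up.

```python
# Sketch: poll Graylog's journal API and push the uncommitted entry count
# to CloudWatch. Assumes GET /api/system/journal returns a JSON body with
# an "uncommitted_journal_entries" field; endpoint, token, namespace and
# metric name below are placeholders for our setup.
import boto3
import requests

GRAYLOG_JOURNAL_URL = "http://localhost:9000/api/system/journal"
API_TOKEN = "REPLACE_ME"  # Graylog API token (sent as username, password is the literal "token")

def publish_journal_metric():
    resp = requests.get(
        GRAYLOG_JOURNAL_URL,
        auth=(API_TOKEN, "token"),
        headers={"Accept": "application/json"},
        timeout=10,
    )
    resp.raise_for_status()
    unprocessed = resp.json()["uncommitted_journal_entries"]

    cloudwatch = boto3.client("cloudwatch")
    cloudwatch.put_metric_data(
        Namespace="Graylog",  # placeholder namespace
        MetricData=[{
            "MetricName": "UncommittedJournalEntries",
            "Value": unprocessed,
            "Unit": "Count",
        }],
    )

if __name__ == "__main__":
    publish_journal_metric()
```

We’d probably run this every minute from cron, but if there’s a more standard way to get this number into CloudWatch we’re open to it.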
This is our server.conf:

```
is_master = true
node_id_file = /etc/graylog/server/node-id
bin_dir = /usr/share/graylog-server/bin
data_dir = /var/lib/graylog-server
plugin_dir = /usr/share/graylog-server/plugin
http_bind_address = 0.0.0.0:9000
rotation_strategy = count
elasticsearch_max_docs_per_index = 20000000
elasticsearch_max_number_of_indices = 20
retention_strategy = delete
elasticsearch_shards = 4
elasticsearch_replicas = 0
elasticsearch_index_prefix = graylog
allow_leading_wildcard_searches = false
allow_highlighting = false
elasticsearch_analyzer = standard
output_batch_size = 500
output_flush_interval = 1
output_fault_count_threshold = 5
output_fault_penalty_seconds = 30
processbuffer_processors = 5
outputbuffer_processors = 3
processor_wait_strategy = blocking
ring_size = 65536
inputbuffer_ring_size = 65536
inputbuffer_processors = 2
inputbuffer_wait_strategy = blocking
message_journal_enabled = true
message_journal_dir = /var/lib/graylog-server/journal
lb_recognition_period_seconds = 3
mongodb_uri = mongodb://localhost/graylog
mongodb_max_connections = 1000
mongodb_threads_allowed_to_block_multiplier = 5
proxied_requests_thread_pool_size = 32
```
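And for the AWS side, a minimal sketch of the alarm we’d hang off that custom metric, using boto3’s put_metric_alarm. The threshold and the SNS topic ARN are placeholders we’d still need to tune (we went down at roughly 963k, so alarming well before that seems sensible):

```python
# Sketch: a CloudWatch alarm on the custom metric published above.
# Threshold and SNS topic ARN are placeholders, not tested values.
import boto3

cloudwatch = boto3.client("cloudwatch")
cloudwatch.put_metric_alarm(
    AlarmName="graylog-unprocessed-messages",
    Namespace="Graylog",                       # must match the publisher's namespace
    MetricName="UncommittedJournalEntries",
    Statistic="Maximum",
    Period=300,                                # evaluate 5-minute windows
    EvaluationPeriods=2,                       # two consecutive breaches before alarming
    Threshold=100000,                          # placeholder; tune for your traffic
    ComparisonOperator="GreaterThanThreshold",
    AlarmActions=["arn:aws:sns:REGION:ACCOUNT_ID:graylog-alerts"],  # placeholder ARN
    TreatMissingData="breaching",              # missing data likely means the poller or Graylog is down
)
```

Does this look like a reasonable approach, or is there a built-in Graylog/AWS way to alert on the journal backlog that we’re missing?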