Get alerts when unprocessed messages reach a limit

Newbie here. We've been having issues with Graylog going down in some deployments because it was receiving too many logs. We solved it (hopefully) by increasing the JVM heap size for both Graylog and Elasticsearch (4G each).

We were at 963,141 unprocessed messages when it went down.

Now we want to set up some kind of alert that tells us when there are too many unprocessed messages (all of our infrastructure is on AWS), so we know when Graylog is stuck processing them. Right now we have no way of knowing unless we manually check the server.

This is our server.conf:

is_master = true
node_id_file = /etc/graylog/server/node-id
bin_dir = /usr/share/graylog-server/bin
data_dir = /var/lib/graylog-server
plugin_dir = /usr/share/graylog-server/plugin
http_bind_address = 0.0.0.0:9000
rotation_strategy = count
elasticsearch_max_docs_per_index = 20000000
elasticsearch_max_number_of_indices = 20
retention_strategy = delete
elasticsearch_shards = 4
elasticsearch_replicas = 0
elasticsearch_index_prefix = graylog
allow_leading_wildcard_searches = false
allow_highlighting = false
elasticsearch_analyzer = standard
output_batch_size = 500
output_flush_interval = 1
output_fault_count_threshold = 5
output_fault_penalty_seconds = 30
processbuffer_processors = 5
outputbuffer_processors = 3
processor_wait_strategy = blocking
ring_size = 65536
inputbuffer_ring_size = 65536
inputbuffer_processors = 2
inputbuffer_wait_strategy = blocking
message_journal_enabled = true
message_journal_dir = /var/lib/graylog-server/journal
lb_recognition_period_seconds = 3
mongodb_uri = mongodb://localhost/graylog
mongodb_max_connections = 1000
mongodb_threads_allowed_to_block_multiplier = 5
proxied_requests_thread_pool_size = 32

Thanks!


Check this: Monitoring.

If you are processing a high volume of logs, also look into Graylog sizing.


Hello && welcome @richiexlinares

Adding on to @ramindia's suggestion.

Using the Metrics on Graylog node works…

Found here.

What I did was enable Prometheus in the Graylog config file, then install Grafana. This gave me the ability to send an alert if something went wrong.
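For a quick check before building Grafana alert rules, the exporter output can also be read directly. This is only a minimal sketch, assuming the exporter is turned on with prometheus_exporter_enabled = true in server.conf and listens on its default address 127.0.0.1:9833. The metric name gl_journal_entries_uncommitted and the threshold are assumptions, so check which names your own exporter actually exposes:

```python
import urllib.request

# Assumed defaults; adjust to your setup.
EXPORTER_URL = "http://127.0.0.1:9833/api/metrics/prometheus"
METRIC = "gl_journal_entries_uncommitted"  # assumed metric name, verify on your node
THRESHOLD = 100_000  # hypothetical alert limit


def read_metric(exposition_text, name):
    """Return the first sample value for `name` in Prometheus text format, or None."""
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip HELP/TYPE comments and blank lines
        if "{" in line and "}" in line:
            # Sample with labels: name{label="value"} 123 [timestamp]
            sample_name = line.split("{", 1)[0]
            rest = line.rsplit("}", 1)[1]
        else:
            # Sample without labels: name 123 [timestamp]
            sample_name, _, rest = line.partition(" ")
        if sample_name == name:
            fields = rest.split()
            if fields:
                return float(fields[0])
    return None


def backlog_too_high(url=EXPORTER_URL, name=METRIC, threshold=THRESHOLD):
    """Fetch the exporter page and report whether the metric is at or above the threshold."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        text = resp.read().decode("utf-8")
    value = read_metric(text, name)
    return value is not None and value >= threshold
```

Calling backlog_too_high() from a cron job and sending a notification when it returns True would be one simple way to use it. Grafana alerting on the same metric is the more complete option.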

Example:

This also can be done through Zabbix.

Unprocessed messages: I believe that would be the Journal.
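If you do not want to run Prometheus, the journal backlog can also be polled from the Graylog REST API. This is a rough sketch, not a tested setup: it assumes the API is reachable on port 9000 (as in the http_bind_address above), that the journal status is served at /api/system/journal, and that the response carries an uncommitted_journal_entries field. The threshold and the function names are made up for illustration:

```python
import base64
import json
import urllib.request

# Assumed endpoint; the port matches http_bind_address in the server.conf above.
API_URL = "http://127.0.0.1:9000/api/system/journal"
THRESHOLD = 500_000  # hypothetical limit, pick one well below where your node falls over


def journal_backlog(payload):
    """Extract the uncommitted entry count from the journal status JSON.

    The field name is an assumption; returns 0 when it is missing.
    """
    return int(payload.get("uncommitted_journal_entries", 0))


def should_alert(payload, threshold=THRESHOLD):
    """True when the journal backlog is at or above the threshold."""
    return journal_backlog(payload) >= threshold


def fetch_journal_status(user, password, url=API_URL):
    """Query the journal endpoint with HTTP basic auth and return the parsed JSON."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        url,
        headers={"Authorization": f"Basic {token}", "Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=10) as resp:
        return json.load(resp)
```

A scheduled job could call should_alert(fetch_journal_status(user, password)) and, since the infrastructure is on AWS, publish a notification or push the count as a custom metric when it returns True.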

Example:
