Every morning I come in at 8 a.m. and Graylog appears to be processing a backlog of messages (pic below), usually around 1.5 million. It seems to take a couple of hours before it catches up and the dashboards start updating again. I'm assuming I've botched a config or have some taxing regex filters or something. Can someone tell me where to look to find out what the problem is?
I've looked in /var/log/graylog-server/graylog.log and the last entry is Kafka barfing over the journal lock; however, IIRC, that was two days ago after a reboot, when I accidentally had two graylog-server services trying to start.
2017-07-31T08:07:39.265-05:00 ERROR [KafkaJournal] Unable to start logmanager.
kafka.common.KafkaException: Failed to acquire lock on file .lock in /var/lib/graylog-server/journal. A Kafka instance in another process or thread is using this directory.
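In case that lock error comes back, here's roughly how to confirm that only one graylog-server instance is running and see who owns the journal (this assumes the stock package paths and service name, so adjust to your install):

systemctl status graylog-server                # should show exactly one active instance
pgrep -a -f graylog-server                     # any stray duplicate processes?
lsof /var/lib/graylog-server/journal/.lock     # which process currently holds the journal lock
du -sh /var/lib/graylog-server/journal         # how much backlog is sitting in the journal on disk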
You'll have to find out what's causing Graylog to restart. Also check the logs of your Elasticsearch node(s), and look for cron jobs that might be putting additional load on your systems.
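For example, something along these lines (assuming Elasticsearch listens on localhost:9200 and the host uses systemd; names and paths may differ on your setup):

curl -s 'http://localhost:9200/_cluster/health?pretty'    # cluster status and unassigned shards
tail -n 200 /var/log/elasticsearch/graylog.log            # recent Elasticsearch log entries
journalctl -u graylog-server --since yesterday            # did graylog-server restart overnight?
ls /etc/cron.d /etc/cron.daily                            # system-wide cron jobs
crontab -l                                                # per-user cron jobs (check root's as well)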
I have this in the Elasticsearch log from last night; nothing in there from 8/02 yet. Something caused a "RED" status change. I've attached a CPU graph which seems to show some major processing going on from 4:00 to 9:30 (obviously the backlog processing) and then again from 18:00 to 22:30. I don't see any cron jobs starting at 4:00 or 18:00. Is there some housecleaning that Graylog does by default at those times? It could be a remote backup process, but I'm not sure. I'll have to check with the storage ops guys…
# cat /var/log/elasticsearch/graylog.log
[2017-08-01 10:35:41,330][INFO ][cluster.metadata ] [Dragon of the Moon] [graylog_41] update_mapping [message]
[2017-08-01 18:49:28,353][INFO ][cluster.metadata ] [Dragon of the Moon] [graylog_42] creating index, cause [api], templates [graylog-internal], shards [4]/[0], mappings [message]
[2017-08-01 18:49:28,406][INFO ][cluster.routing.allocation] [Dragon of the Moon] Cluster health status changed from [RED] to [GREEN] (reason: [shards started [[graylog_42][0], [graylog_42][0]] ...]).
[2017-08-01 18:49:28,692][INFO ][cluster.metadata ] [Dragon of the Moon] [graylog_42] update_mapping [message]
[2017-08-01 18:50:05,293][INFO ][cluster.metadata ] [Dragon of the Moon] [graylog_42] update_mapping [message]
[2017-08-01 18:54:15,331][INFO ][cluster.metadata ] [Dragon of the Moon] [graylog_42] update_mapping [message]
[2017-08-01 18:54:38,343][INFO ][cluster.metadata ] [Dragon of the Moon] [graylog_42] update_mapping [message]
[2017-08-01 18:55:00,517][INFO ][cluster.metadata ] [Dragon of the Moon] [graylog_42] update_mapping [message]
[2017-08-01 18:56:03,338][INFO ][cluster.metadata ] [Dragon of the Moon] [graylog_42] update_mapping [message]
[2017-08-01 18:56:23,927][INFO ][cluster.metadata ] [Dragon of the Moon] [graylog_42] update_mapping [message]
[2017-08-01 18:56:23,946][INFO ][cluster.metadata ] [Dragon of the Moon] [graylog_42] update_mapping [message]
[2017-08-01 19:30:31,270][INFO ][cluster.metadata ] [Dragon of the Moon] [graylog_42] update_mapping [message]
[2017-08-01 21:36:41,343][INFO ][cluster.metadata ] [Dragon of the Moon] [graylog_42] update_mapping [message]
[2017-08-01 23:17:53,325][INFO ][cluster.metadata ] [Dragon of the Moon] [graylog_42] update_mapping [message]
[2017-08-01 23:22:24,487][INFO ][cluster.metadata ] [Dragon of the Moon] [graylog_42] update_mapping [message]
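For what it's worth, the graylog_42 "creating index, cause [api]" line at 18:49 looks like Graylog's normal index rotation rather than a cron job. A rough way to compare those times against the index creation dates and the rotation/retention settings, assuming Elasticsearch on localhost:9200 and the stock config path (treat this as a sketch):

curl -s 'http://localhost:9200/_cat/indices/graylog_*?v'                          # all Graylog indices with doc counts and sizes
curl -s 'http://localhost:9200/graylog_42/_settings?pretty' | grep creation_date  # when graylog_42 was created (epoch millis)
grep -E 'rotation|retention' /etc/graylog/server/server.conf                      # rotation/retention strategy on older setups (newer versions configure this under System -> Indices in the UI)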
It seems to be related to a process on another server that dumps massive amounts of logs to Graylog during an import at those times. I've disabled logging from that host, and the problem isn't there this morning.