Post update event scheduler is not working

DaveC · December 13, 2022, 8:26am


1. Describe your incident:
The event scheduler has not been working since the Graylog update.

Events now show
Status:
runnable
Next execution:
2022-12-12 15:38:11.064 (A few mins in the past)

I’m guessing that with a date in the past it’s never going to trigger.
Notifications work, and everything else that I can see works. I switched logging to debug and disabled and re-enabled the events, but to no avail.
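
To make the "next execution in the past" symptom concrete, here is a minimal Python sketch that checks whether a timestamp copied from the event definition page is already overdue. The function name `is_overdue` is made up for illustration, and treating the displayed value as UTC is an assumption (the UI may show it in the user's time zone), so take it as a rough check rather than anything Graylog provides.

```python
from datetime import datetime, timezone

# Format of the "Next execution" value as shown above, e.g. 2022-12-12 15:38:11.064
UI_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def is_overdue(next_execution: str, now: datetime) -> bool:
    """Return True if the displayed next-execution time is earlier than `now`.

    Assumption: the displayed value is UTC; adjust if the UI shows local time.
    """
    scheduled = datetime.strptime(next_execution, UI_FORMAT).replace(tzinfo=timezone.utc)
    return scheduled < now


# Compare the value from the event page with the server time reported further down.
server_time = datetime(2022, 12, 13, 8, 18, 56, tzinfo=timezone.utc)
print(is_overdue("2022-12-12 15:38:11.064", server_time))
```

A trigger that stays overdue across several refreshes of the page, while the status still says runnable, is the pattern described here.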

ubuntu@ip-10-60-40-12:~$ cat /etc/graylog/server/server.conf | egrep -v "^\s*(#|$)"
is_master = true
node_id_file = /etc/graylog/server/node-id
password_secret =

root_password_sha2 =

root_timezone = Etc/GMT-1
bin_dir = /usr/share/graylog-server/bin
data_dir = /var/lib/graylog-server
plugin_dir = /usr/share/graylog-server/plugin
http_bind_address = 0.0.0.0:9000
http_external_uri = https://xxxxxx.com/
rotation_strategy = count
elasticsearch_max_docs_per_index = 20000000
elasticsearch_max_number_of_indices = 20
retention_strategy = delete
elasticsearch_shards = 4
elasticsearch_replicas = 0
elasticsearch_index_prefix = graylog
allow_leading_wildcard_searches = false
allow_highlighting = false
elasticsearch_analyzer = standard
output_batch_size = 500
output_flush_interval = 1
output_fault_count_threshold = 5
output_fault_penalty_seconds = 30
processbuffer_processors = 5
outputbuffer_processors = 3
processor_wait_strategy = blocking
ring_size = 65536
inputbuffer_ring_size = 65536
inputbuffer_processors = 2
inputbuffer_wait_strategy = blocking
message_journal_enabled = true
message_journal_dir = /var/lib/graylog-server/journal
lb_recognition_period_seconds = 3
mongodb_uri = mongodb://grayuser:i-xxxx@127.0.0.1:27017/graylog
mongodb_uri = mongodb://grayuser:i-xxxx@127.0.0.1:27017/graylog
mongodb_uri = mongodb://grayuser:xxxx@127.0.0.1:27017/graylog
mongodb_max_connections = 1000
mongodb_threads_allowed_to_block_multiplier = 5
proxied_requests_thread_pool_size = 32
prometheus_exporter_enabled = true
prometheus_exporter_bind_address = xxxx:9833

2. Describe your environment:

OS Information:
Package Version:
Service logs, configurations, and environment variables:
Stand alone ubuntu server running
Version:
4.2.13+9c90b93, codename Noir
JVM:
PID 1101, Ubuntu 11.0.17 on Linux 5.4.0-1092-aws
Time:
2022-12-13 08:18:56 +00:00

3. What steps have you already taken to try and solve the problem?
We did take a snapshot, but as this wasn’t noticed at the time we don’t want to revert; we want to fix forward.

We may update again to the next version if this is a bug.

4. How can the community help?

Seen something similar here but no indication of fix - Alert/Event not firing


DaveC · December 13, 2022, 3:55pm

The filter seems to work, but still no alerts.

We thought it might be a time issue, but it doesn’t seem to be, as we fixed the NTP issue on the server.

DaveC · December 13, 2022, 3:56pm

I get a notification that I’ve updated, so I guess this has been written to the DB.

tmacgbay · December 13, 2022, 8:36pm

When you say Event Scheduler, you mean Alerts…

I don’t think you want the quotes in the search query - I believe that changes it from checking a field to looking for that string…

gsmith · December 13, 2022, 11:06pm

Hello,

By chance, does this setting work? Three Mongo nodes with the same IP?

If it is a replica set, maybe something like this:

mongodb_uri = mongodb://grayloguser:secret@mongo_node01:27017,mongo_node02:27018,mongo_node03:27019/graylog?replicaSet=rs01

DaveC · December 15, 2022, 8:34am

Hi,
What would that do, set up the same user on different ports?

I don’t think this is an issue, as there’s only one DB.

DaveC · December 15, 2022, 8:47am

Yes, this does work in quotes, as we are looking for that specific string to alert on.

So the way I think it works -
When a new Event is made or modified, its details are written to MongoDB and a schedule is automatically created to trigger the check against the DB; when a match happens, this creates the alert. Our alerts seem to be written to the DB, but the internal schedule is not triggered.
Hence the last execution message and no next execution message.

Another older, unmodified alert -

The string in quotes is working, as we see a result given back onscreen in the filter preview. I think if that wasn’t working then it might be a case of no matches and no alerts.

Let me be clear: no alerts are working, neither older events nor ones newly created since the update.

DaveC · December 15, 2022, 9:46am

Also, just thought I’d mention that we get a lot of old notifications on reboot.

I found a very similar issue here - Alerting not working if cluster contains nodes with no active inputs · Issue #6415 · Graylog2/graylog2-server · GitHub

I’d like to see the output from here:

but I get this output -

DaveC · December 15, 2022, 9:59am

I think i may have gotten something working

2022-12-15T09:57:27.037Z INFO [DiagnosticEventLogger] Current thread pool executor state: ExecutorStateEvent(executorName=SchedulerThreadPoolExecutor, currentQueueSize=0, activeThreads=0, coreThreads=0, leasesOwned=1, largestPoolSize=2, maximumPoolSize=2147483647)
2022-12-15T09:57:33.050Z INFO [Scheduler] Current stream shard assignments: shardId-000000000000
2022-12-15T09:57:33.050Z INFO [Scheduler] Sleeping …
2022-12-15T09:57:40.422Z INFO [DiagnosticEventLogger] Current thread pool executor state: ExecutorStateEvent(executorName=SchedulerThreadPoolExecutor, currentQueueSize=0, activeThreads=0, coreThreads=0, leasesOwned=1, largestPoolSize=2, maximumPoolSize=2147483647)
2022-12-15T09:57:42.042Z INFO [Scheduler] Current stream shard assignments: shardId-000000000000
2022-12-15T09:57:42.042Z INFO [Scheduler] Sleeping …
2022-12-15T09:57:44.049Z INFO [Scheduler] Current stream shard assignments: shardId-000000000000
2022-12-15T09:57:44.049Z INFO [Scheduler] Sleeping …

tmacgbay · December 16, 2022, 2:02pm

If no alerts are working, are you sure the Notification that is attached to the Alert Event is working? What kind of Notification are you using?

To @gsmith’s point: with mongodb_uri defined three times, most likely only the last definition takes effect, since the previous value is usually overwritten when you define something more than once…
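
A quick way to spot repeated keys like this is a few lines of Python. This is only a sketch: `find_duplicate_keys` is a made-up helper, it assumes simple `key = value` lines, and it only reports the last value rather than proving which one Graylog actually uses.

```python
from collections import defaultdict


def find_duplicate_keys(config_text: str) -> dict:
    """Map each key defined more than once to its values, in file order."""
    values = defaultdict(list)
    for line in config_text.splitlines():
        stripped = line.strip()
        # Skip blank lines, comments, and anything that is not key = value.
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            continue
        # Split on the first "=" only, so values containing "=" stay intact.
        key, _, value = stripped.partition("=")
        values[key.strip()].append(value.strip())
    return {key: vals for key, vals in values.items() if len(vals) > 1}


# Excerpt of the config posted above (passwords already redacted in the thread).
sample = """\
is_master = true
mongodb_uri = mongodb://grayuser:i-xxxx@127.0.0.1:27017/graylog
mongodb_uri = mongodb://grayuser:i-xxxx@127.0.0.1:27017/graylog
mongodb_uri = mongodb://grayuser:xxxx@127.0.0.1:27017/graylog
"""

for key, vals in find_duplicate_keys(sample).items():
    print(f"{key} is defined {len(vals)} times; last value: {vals[-1]}")
```

To run it against the real file, read `/etc/graylog/server/server.conf` into a string and pass that in instead of `sample`.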

For the quoted search where you are looking for a snippet in the full message… yes, that works… it’s just not efficient. In the example you have given, you are asking Graylog to search through all full messages that have come in for the past 28 hours for “Response Code: 96” … depending on the number of messages over that time, this could be a very expensive search. Graylog is designed so that when the message comes in, you can use extractors and/or the pipeline to break the full message into its constituent parts, which allows for a far more efficient search… <find all response_code fields that have a value of 96 in the past 28 hours>. My initial thought was that the search failed or took too long, since you were searching through so much every minute.
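
As a rough illustration of what breaking the message into its parts means, here is a small Python sketch that pulls the code out of a message into a `response_code` field. This is not Graylog extractor or pipeline syntax, and the regex, the function name, and the sample message text are all made up for the example.

```python
import re

# Hypothetical pattern for the snippet quoted in this thread ("Response Code: 96").
RESPONSE_CODE = re.compile(r"Response Code:\s*(\d+)")


def extract_response_code(message: str) -> dict:
    """Return {'response_code': <int>} if the message carries one, else {}."""
    match = RESPONSE_CODE.search(message)
    return {"response_code": int(match.group(1))} if match else {}


# Made-up sample message; real messages will look different.
print(extract_response_code("payment failed, Response Code: 96, retrying"))
```

Once a field like that exists on each message, the alert can match on the field value instead of scanning the full message text for a quoted string.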

DaveC · December 16, 2022, 2:18pm

Hi, thanks for that.
So we continued to troubleshoot and restored a snapshot to another instance. There must have been a crash before the update, as the pre-update snapshot was also broken. We did the same with a snapshot from 7 days earlier and all is working: notifications, events, the lot.

The 28hr time frame was purely to trigger the event, as that’s when it had last occurred in the logs.

I think the OOM killer killed the Graylog process and something was damaged. We will try restoring the older snapshot to a bigger instance with more memory.

tmacgbay · December 16, 2022, 2:24pm

That sucks to have to go back that far! Grrr… Good luck!

DaveC · December 21, 2022, 3:08pm

OK, so we figured out the issue to some extent, after Graylog restored from our original snapshot ran for about 12 hours and hit the same issue.

We had an ongoing issue that triggered 20k logs, and an alert that was triggered also tried to give us 20k notifications. The event scheduler broke well before that.

We have disabled the events that match the issue until our devs can address it, and after disabling them and rebooting the server, events began to work again.

tmacgbay · December 21, 2022, 3:38pm

yow! Glad you found it!

system · January 4, 2023, 3:38pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
