Alert condition triggered but not all error messages were sent in mail(s)


(Marick) #1

Hello everyone!

My config

Graylog v2.4.5

collector-sidecar-0.1.6-1.x86_64

elasticsearch-5.6.10

OS - Oracle Linux 7.4

All messages arrive in the stream and the alert fires, but not all error messages are present in the mail, and some mails don’t arrive at all.



Mail message

What could be the reason that not all ORA- messages were included in the email, or that the second email didn’t arrive at all?

Please help!
Thank you!


(Marick) #2

Where can I see the log for alerts, so that I can check whether the alert was triggered or not?

thanks!


(Jan Doberstein) #3

I guess you have used the RPM for installation? Then you can find the information in the documentation:

http://docs.graylog.org/en/2.4/pages/configuration/file_location.html#rpm-package


(Marick) #4

Hi Jan,
thank you for the reply! I have read the documentation several times.
I can’t understand where the problem is… I have the same configuration in filebeat, streams, and alerts. Some of them work well (message routed into stream - trigger fired - alert mail received), some do not (message routed into stream (I see it) - trigger not fired - alert mail not received - not shown in unresolved alerts).

Documentation:
“Sometimes sending alert notifications may fail for some reason. Graylog includes details of the configured notifications at the time an alert was triggered and the result of executing those notifications, helping you to debug and fix any problems that may arise.”

How can I debug this?
Thank you!


(Marick) #5

I think I have found the problem… it seems that unsynchronized time between servers was causing triggers to not fire.

Monitoring…

Thanks!


(Marick) #6

It seems that the problem persists.
Alerts are not triggered every time messages matching (message: “ORA-”) are received.
Configured: Grace period: 0 minutes. Including last 10 messages in alert notification. Configured to repeat notifications.
Messages received:

2018-10-31 22:29:37.147 mac.lan.ccc.com
ORA-16957: SQL Analyze time limit interrupt
2018-10-31 22:29:22.146 mac.lan.ccc.com
ORA-16957: SQL Analyze time limit interrupt

After this I received just one mail with just one ORA- error message.


Filebeat is scanning every 10 seconds.
So if a message comes into the stream it should raise an alert (in this case 2 alerts).
Am I understanding correctly?
Thank you!


(Jan Doberstein) #7

But you will not receive one alert for every single event found. You will receive one alert when the condition is matched in the searched time range, whether it matches once or multiple times. No matter if the reason for the alert is found 100 times or 1 time, it will generate one alert.

That system will improve in a future version, but that is the current way it works.
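
To make the "one alert per check, not one alert per message" behaviour concrete, here is a rough Python sketch. It is only a simplified model and not Graylog's actual code: the names `Message` and `run_checks`, the 60 second check interval, and the window handling are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Message:
    timestamp: datetime
    text: str


def run_checks(messages, start, end, interval, backlog=10):
    """Simplified model (hypothetical, not Graylog's implementation).

    Every `interval`, look at the messages that arrived since the previous
    check. If at least one contains "ORA-", record exactly one alert,
    carrying at most `backlog` of the most recent matching messages.
    """
    ordered = sorted(messages, key=lambda m: m.timestamp)
    alerts = []
    check_time = start + interval
    while check_time <= end:
        window_start = check_time - interval
        matches = [
            m for m in ordered
            if window_start < m.timestamp <= check_time and "ORA-" in m.text
        ]
        if matches:
            # One alert for this check, however many messages matched.
            alerts.append({
                "checked_at": check_time,
                "match_count": len(matches),
                "backlog": matches[-backlog:],
            })
        check_time += interval
    return alerts


# The two messages from post #6, checked once per (assumed) minute.
msgs = [
    Message(datetime(2018, 10, 31, 22, 29, 22, 146000),
            "ORA-16957: SQL Analyze time limit interrupt"),
    Message(datetime(2018, 10, 31, 22, 29, 37, 147000),
            "ORA-16957: SQL Analyze time limit interrupt"),
]
alerts = run_checks(
    msgs,
    start=datetime(2018, 10, 31, 22, 29, 0),
    end=datetime(2018, 10, 31, 22, 30, 0),
    interval=timedelta(minutes=1),
)
print(len(alerts), alerts[0]["match_count"])  # prints "1 2"
```

In this model both messages fall into the same check window, so they produce a single alert with two matches rather than two alerts. How many of the matching messages end up in the mail depends on the real check timing and backlog settings, which this sketch does not try to reproduce.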


(system) #8

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.