1. Describe your incident:
Since the last update of Graylog-Server and OpenSearch (done on 08.08.24) my Event Definitions based on the aggregation count() are not working correctly anymore. Example Event:
Type: Aggregation
Search Query: some search query
Search Filters: No filters configured
Streams: Stream XYZ
Search within: 5 minutes
Execute search every: 5 minutes
Enable scheduling: yes
Group by Field(s): No Group by configured
Create Events if: count() >= 15
Since that update, the event matches on every scheduled run (in this case every 5 minutes), with count() values that aren't correct. For example, an event is created because of “count()=44.0”, but when I click “Replay search” it shows exactly 1 message, so no event should have been created. The wrong count value changes every time the event matches…
Because both Graylog and OpenSearch were updated, I don't know which component is causing this issue…
Does anyone have an idea how to solve or further debug this?
A second screenshot, taken after hitting “Replay search”, shows that only 1 message appeared in the searched time range. It will follow in the next post (I can only upload 1 in this post somehow…)
To your second question: If I search by myself inside the stream and timeframe, the count is the same as shown on the second screenshot (count 1), so the count of 24 from the first screenshot is incorrect, yes.
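The manual cross-check described above can also be run against OpenSearch directly, bypassing Graylog, to see which side reports the wrong number. The following is only a minimal sketch: the helper names `build_count_body` and `would_trigger` are made up for illustration, and it assumes that messages carry Graylog's usual `timestamp` field in `yyyy-MM-dd HH:mm:ss.SSS` (UTC) form. It builds a request body for the OpenSearch `_count` API and repeats the `count() >= 15` comparison locally.

```python
from datetime import datetime, timedelta, timezone


def build_count_body(query, window_minutes, now):
    """Build a body for the OpenSearch _count API covering the last
    `window_minutes` before `now` (hypothetical helper, not Graylog code)."""
    fmt = "%Y-%m-%d %H:%M:%S.%f"

    def ts(dt):
        # Assumption: the timestamp field uses millisecond precision,
        # so trim strftime's microseconds down to three digits.
        return dt.strftime(fmt)[:-3]

    start = now - timedelta(minutes=window_minutes)
    return {
        "query": {
            "bool": {
                "must": [{"query_string": {"query": query}}],
                "filter": [
                    {"range": {"timestamp": {"gte": ts(start), "lt": ts(now)}}}
                ],
            }
        }
    }


def would_trigger(count, threshold=15):
    """Mirror the 'Create Events if count() >= 15' condition."""
    return count >= threshold
```

The resulting body can be sent as JSON to the `_count` endpoint of the indices behind the stream. If OpenSearch itself returns 1 for the window while the event reports 44, the plain search is fine and the discrepancy lies in how the aggregation result is produced or read.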
This is an odd one. Which versions of Graylog/Mongo/OS were you on, and which did you move to?
Assuming this event was carried across from before the update: if you create a new Event Definition with the same configuration, does it yield the same results?
Sounds like this has to do with an issue we recently discovered with OpenSearch 2.16 (see “Opensearch 2.16.0 breaks alerts”, Issue #20119 in Graylog2/graylog2-server on GitHub). We currently do not recommend using this version because of this issue, but OpenSearch is already addressing it and we expect a fix soon. Sorry that you are experiencing the issue!
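To check quickly whether a node is running the version named in that issue, the JSON returned by the OpenSearch root endpoint (`GET /`) can be inspected. This is a rough sketch with a hypothetical helper name, `is_affected_version`. It flags only the exact version `2.16.0` from the issue title, which is an assumption: the thread does not say which later releases contain the fix.

```python
def is_affected_version(root_info):
    """Return True if the parsed `GET /` response reports OpenSearch 2.16.0.

    Assumption: only 2.16.0 is flagged, because that is the version named
    in the issue title; other releases are not judged here.
    """
    version = root_info.get("version", {})
    return (
        version.get("distribution") == "opensearch"
        and version.get("number") == "2.16.0"
    )
```

For example, passing the parsed response of a 2.15.0 node returns `False`, while a 2.16.0 node returns `True`.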
@Wine_Merchant
Mongo and OS weren't updated since the problem started.
Problem started after:
graylog-server-6.0.4-1.x86_64 → graylog-server-6.0.5-1.x86_64
opensearch-2.15.0-1.x86_64 → opensearch-2.16.0-1.x86_64
Unfortunately yes, I created a new event with the same configuration, but the problem still exists.
@mako42
Thank you very much for mentioning that issue; it sounds exactly like the problem I am facing at the moment. I will keep an eye on it, and hopefully it will be solved with the next update.