I have a stream with logs from Nginx; all the logs from this stream are stored in the index set nginx_0.
The filtering rule looks like this: "source must match regular expression nginx-ext*"
Then I created a new stream, api. I want it to hold the logs that come from nginx and match both of these rules:
“source must match regular expression nginx-ext*” AND
“request must match regular expression /api/myservice”.
This stream uses its own index set, api_0.
But if I open the api stream and search for logs in it, then SOMETIMES (not always) I see duplicate records: one record is in the index nginx_25, and the other is in the index api_14.
How can I see logs in this stream from an index that is not tied to the stream? And how can I get rid of this?
The bug looks intermittent: out of 100 logs, about 10 are duplicated, with the copy sitting in a different index.
My guess as to what's happening is that if a log message matches the rules of both streams, it gets routed to both streams. Graylog stores a copy of the message in each unique index set used by the applicable streams.
Are you able to share screenshots of the stream rules on these 2 streams?
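To illustrate the guess, here is a minimal Python sketch of rule matching. It is not Graylog code: the `STREAMS` table, the `matching_streams` helper, and the sample messages are hypothetical, and it assumes "must match regular expression" means the pattern is found anywhere in the field value.

```python
import re

# Hypothetical stream definitions mirroring the rules described in this thread.
# Every (field, pattern) pair of a stream must match for the message to be routed there.
STREAMS = {
    "nginx": [("source", r"nginx-ext*")],
    "api": [("source", r"nginx-ext*"), ("request", r"/api/myservice")],
}

def matching_streams(message):
    """Return the names of all streams whose rules all match the message."""
    matched = []
    for name, rules in STREAMS.items():
        if all(re.search(pattern, str(message.get(field, ""))) for field, pattern in rules):
            matched.append(name)
    return matched

# Made-up sample messages for illustration only.
api_request = {"source": "nginx-ext-01", "request": "GET /api/myservice/users"}
other_request = {"source": "nginx-ext-01", "request": "GET /index.html"}

print(matching_streams(api_request))    # ['nginx', 'api']
print(matching_streams(other_request))  # ['nginx']
```

Since the api rules are a stricter version of the nginx rule, anything that lands in api would also land in nginx under this assumption.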
Due to how Graylog works, if a message is routed to 2 streams, and those 2 streams each use a different index set, the message is stored twice.
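As a rough model of that behavior (a sketch only, not how Graylog is implemented), the number of stored copies equals the number of distinct index sets among the streams a message was routed to. The `stored_copies` helper and the stream-to-index-set mappings below are hypothetical.

```python
# Hypothetical mapping of stream name to the index set it writes to.
STREAM_INDEX_SET = {"nginx": "nginx", "api": "api"}

def stored_copies(stream_names, stream_index_set=STREAM_INDEX_SET):
    """Return the sorted, distinct index sets a message would be written to."""
    return sorted({stream_index_set[name] for name in stream_names})

# Routed to both streams, each with its own index set: two copies.
print(stored_copies(["nginx", "api"]))  # ['api', 'nginx']

# Same routing, but both streams share one index set: one copy.
shared = {"nginx": "shared", "api": "shared"}
print(stored_copies(["nginx", "api"], shared))  # ['shared']
```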
You do have a couple of options though:
Configure the 2 streams to use the same index set
When searching, filter for only one of the 2 streams to remove duplicates
You can also add a filter to the search query using _index:<index-name>* to filter only for a specific index (note the trailing * wildcard character, which matches all indices that start with a specific name or pattern):
_index:graylog_* only returns messages from indices that start with graylog_
-_index:graylog_* (notice the prepended -, which functions as a NOT operator) only returns messages from indices that DO NOT start with graylog_
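If you build search queries from a script, the two filter forms can be generated like this. This is just a string-building sketch; `index_filter` is a hypothetical helper, and the `api_` and `nginx_` prefixes are assumed from the index names mentioned in this thread.

```python
def index_filter(prefix, exclude=False):
    """Build an _index filter term for all indices whose name starts with prefix."""
    term = f"_index:{prefix}*"
    # A leading "-" negates the term, as described above.
    return f"-{term}" if exclude else term

print(index_filter("api_"))                  # _index:api_*
print(index_filter("nginx_", exclude=True))  # -_index:nginx_*
```

Either resulting string can be pasted into the search bar while viewing the api stream.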
Thank you for your answer! I understand that because of this rule, messages end up in two different indexes and are duplicated. But I don’t understand the following. Streams are tied to different index sets. Why, if I look at one stream, do I get messages from another index set?
You can see that I’m searching in the API Stream; at the bottom I see two messages, one stored in the api_256 index and the other in the nginx_56 index. Where does nginx come from here, if this stream has nothing to do with it?
Plus, I’ll add that this behavior is not constant. Out of 30 messages, at least one from a different index appears in the stream. I think this is a bug.