Different search results in pipeline and normal search

1. Describe your incident:

I rerouted matching messages from one stream to a new stream with a pipeline rule that checks the message field for certain keywords. When I search the new stream for the same keywords combined with OR, I would logically expect to find ALL rerouted messages, right? Instead, the pipeline routes more messages than the search finds.
I get the same search results when I run that search on the original stream (without the re-routed messages).

This is my search:
message: "gtm_debug" OR message: "_ga" OR message: "utm_source" OR message: "utm_id" OR "utm_medium" OR message: "utm_campaign" OR message: "utm_term" OR message: "utm_source_platform" OR message: "utm_creative_format" OR message: "utm_marketing_tactic" OR message: "gclid" OR message: "fbclid" OR message: "twclid"

And this is (the relevant part of) my pipeline rule for rerouting:
contains(to_string($message.message), "twclid")
|| contains(to_string($message.message), "fbclid")
|| contains(to_string($message.message), "gclid")
|| contains(to_string($message.message), "utm_marketing_tactic")
|| contains(to_string($message.message), "utm_creative_format")
|| contains(to_string($message.message), "utm_source_platform")
|| contains(to_string($message.message), "utm_term")
|| contains(to_string($message.message), "utm_campaign")
|| contains(to_string($message.message), "utm_medium")
|| contains(to_string($message.message), "utm_source")
|| contains(to_string($message.message), "utm_id")
|| contains(to_string($message.message), "_ga")
|| contains(to_string($message.message), "gtm_debug")
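
To make explicit what I understand this condition to do, here is a minimal Python sketch (not Graylog code). It assumes that contains() is a plain substring test that is case-sensitive unless told otherwise; the function name rule_matches is just something I made up for illustration.

```python
# Keywords taken from the pipeline rule above.
KEYWORDS = [
    "twclid", "fbclid", "gclid", "utm_marketing_tactic", "utm_creative_format",
    "utm_source_platform", "utm_term", "utm_campaign", "utm_medium",
    "utm_source", "utm_id", "_ga", "gtm_debug",
]


def rule_matches(message: str) -> bool:
    """Return True if any keyword occurs anywhere in the message.

    Assumption: this mirrors a case-sensitive substring check, so a keyword
    embedded in a longer word still counts as a match.
    """
    return any(keyword in message for keyword in KEYWORDS)


print(rule_matches("GET /shop?gclid=abc"))   # keyword present as written
print(rule_matches("GET /shop?GCLID=abc"))   # only present in upper case
print(rule_matches("cookie _gat=1"))         # "_ga" embedded in "_gat"
```

Under that assumption, the rule fires on any message where a keyword appears as part of a longer string, not only as a separate word.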

Based on some documentation I found, I expected the search to work basically like a "contains", but some messages that the pipeline rule reroutes are not found by the search.

The messages that the pipeline matches but the normal search misses also contain at least one of those terms, so I would expect the search to find them as well.
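
One way the two could diverge, sketched in Python: this assumes the message field is analyzed into lowercase tokens at index time and that a quoted search term has to equal a whole token, whereas the pipeline check is a substring test. The tokenizer below is only a rough stand-in (it splits on everything except letters, digits and underscores) and is not the actual OpenSearch analyzer. The function names are hypothetical.

```python
import re

# Keywords taken from the search and the pipeline rule.
KEYWORDS = [
    "twclid", "fbclid", "gclid", "utm_marketing_tactic", "utm_creative_format",
    "utm_source_platform", "utm_term", "utm_campaign", "utm_medium",
    "utm_source", "utm_id", "_ga", "gtm_debug",
]


def tokenize(message: str) -> set[str]:
    """Rough stand-in for an index-time analyzer (an assumption, not the real one):
    lowercase the text and keep runs of letters, digits and underscores."""
    return set(re.findall(r"[a-z0-9_]+", message.lower()))


def substring_match(message: str) -> bool:
    """Pipeline-style check: a keyword anywhere inside the raw string."""
    return any(keyword in message for keyword in KEYWORDS)


def token_match(message: str) -> bool:
    """Search-style check under the assumption above: a keyword must equal a whole token."""
    return not tokenize(message).isdisjoint(KEYWORDS)


for sample in ("GET /landing?utm_source=mail HTTP/1.1",
               "GET /landing?_gat=1 HTTP/1.1"):
    print(sample, substring_match(sample), token_match(sample))
```

In this sketch the second sample passes the substring check (because "_gat" contains "_ga") but has no token equal to a keyword, so a message like that would be rerouted without being found by the term search. Whether this is what actually happens in my setup is exactly what I am unsure about.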

I also tried wildcards, e.g. "message: *gtm_debug*" OR ..., and got similar results.

2. Describe your environment:

  • OS Information:

Graylog 5.0 with OpenSearch 2.6 on SUSE Linux.

3. What steps have you already taken to try and solve the problem?

I consulted the documentation and tried to find information on forums.
I haven't yet gone fully into the Elasticsearch/OpenSearch documentation. It seems time-consuming to truly understand, and I'd like to try asking here before going down the rabbit hole, in the hope that my misunderstanding is on a more basic level.

4. How can the community help?
I really need someone to help me understand why those two “searches” are not equivalent and how to make sure I find every single message that has any of those keywords anywhere in the message. I need to be able to filter data reliably as I will have to provide an analytics team with accurate info.
