Perplexed by the regex search

Hello,

I’m trying to search via regex and am trying to understand some unexpected results. I have a set of entries where the message field looks like “61 minute gap after 2022-06-23 15:59:00 -0500 CDT”

If I search for /minute/, I get lots of results. If I search for /minute gap/ or /.*minute gap.*/ or even /.*minute.*gap.*/ I get no matches. I would have expected lots of results for these searches, so I feel like there’s something very basic I’m missing. Can someone help point me in the right direction?

This is a plain vanilla install on a single machine for experimentation purposes, all default configurations. Graylog 4.2.4+b643d2b on graylog (Private Build 1.8.0_312 on Linux 5.4.0-121-generic)

Thanks in advance!

Hello && welcome @JonKPowers

I might be able to help,
First, did you check out the documentation for the global search syntax?

From that maybe something like this…

"minute gap"

Example from the documentation.

If you have the following characters they must be escaped with a backslash:

& | : \ / + - ! ( ) { } [ ] ^ " ~ * ?

Example:

source:lab\-sql1.domain\-labs.com
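As a rough illustration of the escaping rule above, here is a minimal Python sketch that puts a backslash in front of each listed character. The helper name `escape_query_value` is hypothetical and is not part of Graylog; it only mirrors the character list quoted from the documentation.

```python
# Characters the documentation excerpt above lists as needing a backslash.
SPECIAL_CHARS = set('&|:\\/+-!(){}[]^"~*?')


def escape_query_value(value: str) -> str:
    """Prefix every listed special character in `value` with a backslash.

    Hypothetical helper for illustration only; not a Graylog API.
    """
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in value)


# Build the same query as the documentation example.
print("source:" + escape_query_value("lab-sql1.domain-labs.com"))
```

Only the field value is escaped here, so the colon that separates the field name from the value stays as-is.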

Not sure if this was checked, but make sure the time zone and date/time are correct, not only on the GL server but also per user.


Hi @gsmith, thanks for the response and the warm welcome.

I did look at the search syntax page and am specifically trying to apply a regex search, which it says to surround by / in the search bar. I need to search on things like [^6]\d minute gap and 16:\d{2}:\d{2}, but just running the simple regexes in my original post is not working as I would have expected, which leads me to wonder if there’s something more fundamental that I’m missing about the regex engine used here. I would expect the regex .*minute gap.* to match 61 minute gap after 2022-06-23 15:59:00 -0500 CDT but it’s not.

Any insight into what’s going on here would be much appreciated.

In the link that @gsmith posted there is an obscure passage that sort of explains what is going on.

message, full_message, and source are the only fields that are being analyzed by default. While wildcard searches (using * and ?) work on all indexed fields, analyzed fields will behave a little bit different. See wildcard and regexp queries for details.

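Essentially what this means is that message, full_message, and source are analyzed: they are split into individual terms at index time, and a regex is matched against each term rather than against the whole string. A pattern that spans a space, like /minute gap/, therefore never matches, while /minute/ does. Breaking out the information in the message into separate fields will make the details searchable the way you want. It may sound and feel a little odd but I believe it’s an efficiency decision.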
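To make that concrete, here is a small Python sketch that imitates the behaviour. It is only an approximation under stated assumptions: `analyze` is a crude stand-in for the real analyzer (lowercase, then split on anything that is not a letter or digit), and `re.fullmatch` stands in for a regex that has to match an entire term. Both function names are made up for this example.

```python
import re


def analyze(text):
    """Crude stand-in for an analyzer: lowercase, split on non-alphanumerics."""
    return [tok for tok in re.split(r"[^0-9a-z]+", text.lower()) if tok]


def term_regex_matches(pattern, text):
    """True if the pattern matches at least one whole term of the text."""
    return any(re.fullmatch(pattern, tok) for tok in analyze(text))


message = "61 minute gap after 2022-06-23 15:59:00 -0500 CDT"

for pattern in ["minute", "minute gap", ".*minute gap.*", ".*minute.*gap.*"]:
    print(f"/{pattern}/ ->", term_regex_matches(pattern, message))
```

Only /minute/ finds a match in this sketch, because "minute" and "gap" end up as separate terms and no single term contains both. That lines up with the results described in the original post.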


Ok @JonKPowers

It’s demo time :cowboy_hat_face:

I’m labbing this issue and hope this is what you’re looking for.

Ingested the following line with a ton of other messages.

Messages are in the input as shown below.

Using the command from my previous post below.

I get this…

