How to search for messages that start with [

I’m mildly confused by the Graylog search syntax. I was under the impression that I could use basic regular expressions, but perhaps it’s a bit more sophisticated than that.

I’m trying to find all messages that do not start with “[” (this is one but not the only condition). And I cannot figure out how to write a search query for that purpose.

I started with a basic query:

message:"[DEBUG.*"

It returned messages like

[DEBUG] 20230531-12:50:09.161780 [140690873374464] blah blah blah

But

message:"[.*"

returns nothing. I tried escaping the “[”, but the result is still the same.
Then I tried searching for

message:"[INFO.*"

and it returned

2023/05/31 12:55:03 887260 INFO [140690873374464] blah blah blah

That makes no sense to me… I also could not find detailed search query documentation. I feel like I’ve just messed up the syntax, but I’m unsure how.

I read a bunch of similar posts and played with the search engine, and I think Graylog/Elastic or Lucene can’t really work with any complex regex. I could not even make this regex work

message:/[0-9]{4}\/[0-9]{2}\/*/

to search for logs that start with “YYYY/MM”.

Basically, the use of special characters (such as / or [) breaks the query. It works if I replace each special character with ?

message:/[0-9]{4}?[0-9]{2}?[0-9]{2}*/

Obviously this query would capture much more than just “YYYY/MM/DD” so it’s not really a solution.

I wonder if this is a known limitation of Graylog due to its use of Elastic / Lucene? Or is there some other explanation for this?

I use v.3.1

Hey @roman.potato

[quote=“roman.potato, post:1, topic:29011”]
[DEBUG] 20230531-12:50:09.161780 [140690873374464] blah blah blah
[/quote]

Here is an example of a regex getting the string “DEBUG” between brackets. BTW, I used your example log.

Thanks, but this is an extractor. Extractors work fine with regex; it’s the search that I have problems with. Try to make a search query that contains / or [

@roman.potato

The extractor is just for testing here; the regex works:

\[ : [ is a meta char and needs to be escaped if you want to match it literally.
(.*?) : match everything in a non-greedy way and capture it.
\] : ] is a meta char and needs to be escaped if you want to match it literally.
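
As a minimal sketch of the three pieces explained above, here is the same pattern run with Python’s re module instead of a Graylog extractor (that swap is an assumption made for illustration, and the helper name first_bracketed is hypothetical). The sample line is the one from the first post.

```python
import re

# Pattern from the explanation above: literal "[", lazy capture, literal "]".
BRACKET_PATTERN = re.compile(r"\[(.*?)\]")


def first_bracketed(text):
    """Return the text inside the first [...] pair, or None if there is none."""
    match = BRACKET_PATTERN.search(text)
    return match.group(1) if match else None


sample = "[DEBUG] 20230531-12:50:09.161780 [140690873374464] blah blah blah"
print(first_bracketed(sample))  # DEBUG
```

Because the capture is non-greedy, it stops at the first closing bracket rather than running on to the second bracketed value in the line.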

For example:

Interesting.

message:\[WARN\]

Works, but

message:\[

Returns nothing. I believe it’s because the first search forms a complete token ( [WARN] as a word ) but the second does not. It requires converting it to a regex. But if I do

message:/\[WARN\]*/

It no longer works. Any idea how to convert message:\[WARN\] into a query that would return everything that starts with [ ?

Hey @roman.potato

I would have to play around with it a little bit here. I tried a couple of quick configs and they didn’t work.

@gsmith thanks for checking it. At least it’s not me nor my version of Graylog/Elastic. I guess it might be just a limitation of the search engine or a very obscure syntax.

Hey @roman.potato

Sorry I can’t be more help. I’m missing something, but I’m not sure what. I have too much on my plate this weekend to try this in my home lab. If I come across anything I’ll hit you up here, or perhaps someone else could join in this convo.

message:/\[WARN\]*/ doesn’t work because in a regex * means “repeat the previous character (0 or more times)”.
So it would match something like [WARN]]]]]]].
If you want to use a regex you need to add the “.” character before the wildcard: message:/\[WARN\].*/
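
The difference between the two quantifiers can be sketched in Python. This is only an illustration: Python’s re module is not the engine Graylog uses, re.fullmatch is used here as an assumption to mimic a pattern that must cover the whole value, and the “disk full” text is made up. It shows what each pattern accepts, not how Graylog matches against an analyzed field.

```python
import re


def matches_whole(pattern, text):
    """Emulate an anchored regex: the pattern must cover the entire text."""
    return re.fullmatch(pattern, text) is not None


star_only = r"\[WARN\]*"  # "*" repeats the preceding "]" zero or more times
dot_star = r"\[WARN\].*"  # ".*" allows any characters after "[WARN]"

print(matches_whole(star_only, "[WARN]]]]"))         # True
print(matches_whole(star_only, "[WARN] disk full"))  # False
print(matches_whole(dot_star, "[WARN] disk full"))   # True
```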

message:\[
Returns nothing. I believe it’s because the first search forms a complete token ([WARN] as a word) but the second does not. It requires converting it to a regex.

Exactly, it’s because it’s a word. But I think it’s not mandatory to use regex.
Can you try

message:\[*

(In normal mode (not regex mode) the wildcard can replace any characters)

Makes sense, but it still does not work with Graylog.

It doesn’t return anything.

For what it’s worth, Elasticsearch and OpenSearch have some interesting, let’s call them, quirks in how partial string matches and regex matching work.

Given these “quirks”, I recommend parsing out any text that you need to search or match on into its own field using pipeline processing (see also Streams & Pipelines Webinar) or extractors.
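
To make the idea concrete, here is a rough Python sketch of the kind of derived fields such parsing could produce. It is not Graylog pipeline rule syntax, and the field names starts_with_bracket and level, the function name, and the assumption that a leading bracket holds an upper-case level are all hypothetical.

```python
import re

# Hypothetical pattern: an upper-case level in brackets at the very start.
LEVEL_PATTERN = re.compile(r"^\[([A-Z]+)\]")


def add_search_fields(log):
    """Return a copy of the message dict with hypothetical derived fields."""
    message = log.get("message", "")
    enriched = dict(log)
    # A plain boolean field avoids regex matching on the analyzed message text.
    enriched["starts_with_bracket"] = message.startswith("[")
    match = LEVEL_PATTERN.match(message)
    if match:
        enriched["level"] = match.group(1)
    return enriched


debug_log = {"message": "[DEBUG] 20230531-12:50:09.161780 [140690873374464] blah blah blah"}
info_log = {"message": "2023/05/31 12:55:03 887260 INFO [140690873374464] blah blah blah"}
print(add_search_fields(debug_log)["starts_with_bracket"])  # True
print(add_search_fields(info_log)["starts_with_bracket"])   # False
```

A search could then filter on the derived field instead of trying to match the leading bracket inside the message text.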

To expand on the quirkiness, elasticsearch and opensearch have different behavior for analyzed fields using the standard analyzer. This is important because it dictates how the text is stored in elasticsearch and opensearch, which affects how you can search for that text.

As a quick example, if you have the message “apple banana orange”, and this is stored in an analyzed field using the standard analyzer, each word will be its own token and you won’t be able to apply search filters to the message as a whole.
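
A very crude Python approximation of that tokenization is sketched below. It is not the real standard analyzer, which follows Unicode word-boundary rules; this stand-in simply lowercases runs of ASCII letters and digits, and the “[WARN] disk full” input is made up.

```python
import re


def rough_standard_tokens(text):
    """Crude stand-in for a standard analyzer: lowercase alphanumeric runs."""
    return [token.lower() for token in re.findall(r"[A-Za-z0-9]+", text)]


print(rough_standard_tokens("apple banana orange"))  # ['apple', 'banana', 'orange']
print(rough_standard_tokens("[WARN] disk full"))     # ['warn', 'disk', 'full']
```

Under this approximation no stored token begins with “[”, which would be consistent with message:\[* returning nothing earlier in the thread.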

In Graylog, the message and full_message fields are analyzed, so they behave differently than all other fields.

Hope this helps.
