I’m mildly confused by Graylog search syntax. I was under the impression that I could use basic regular expressions, but perhaps it’s a bit more sophisticated than that.
I’m trying to find all messages that do not start with “[” (this is one but not the only condition). And I cannot figure out how to write a search query for that purpose.
I read a bunch of similar posts and played with the search, and I think Graylog/Elastic or Lucene can’t really work with any complex regex. I could not even get this regex to work:
message:/[0-9]{4}\/[0-9]{2}\/*/
to search for logs that start with “YYYY/MM”.
Basically, the use of special characters (such as / or [) breaks the query. It works if I replace each special character with ?
message:/[0-9]{4}?[0-9]{2}?[0-9]{2}*/
Obviously this query would capture much more than just “YYYY/MM/DD” so it’s not really a solution.
I wonder if this is a known limitation of Graylog due to its use of Elastic / Lucene? Or is there some sort of explanation for this?
Thanks, but this is an extractor. Extractors work fine with regex; it’s the search that I have problems with. Try to make a search query that contains / or [
\[ : [ is a meta char and needs to be escaped if you want to match it literally.
(.*?) : match everything in a non-greedy way and capture it.
\] : ] is a meta char and needs to be escaped if you want to match it literally.
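For anyone who wants to see those three pieces working together, here is a minimal sketch in Python. It assumes the full pattern is \[(.*?)\] assembled from the parts above, and it uses Python's re module, which is not the engine Graylog extractors or searches use, so treat it only as an illustration of the escaping. The sample log lines and the helper name first_bracketed are made up.

```python
import re

# Pattern assembled from the three pieces explained above:
# escaped "[", a non-greedy capture group, escaped "]".
BRACKETED = re.compile(r"\[(.*?)\]")


def first_bracketed(text):
    """Return the text inside the first [...] pair, or None if there is none."""
    match = BRACKETED.search(text)
    return match.group(1) if match else None


# Hypothetical sample lines, not taken from a real log.
print(first_bracketed("[WARN] disk usage at 91%"))  # WARN
print(first_bracketed("[a] and [b]"))               # a (non-greedy stops at the first "]")
print(first_bracketed("no brackets here"))          # None
```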
Returns nothing. I believe it’s because the first search forms a complete token ([WARN] as a word) but the second does not. It requires converting it to a regex. But if I do
message:/\[WARN\]*/
It no longer works. Any idea how to convert message:\[WARN\] to a query that would return everything that starts with [ ?
@gsmith thanks for checking it. At least it’s not me nor my version of Graylog/Elastic. I guess it might be just a limitation of the search engine or a very obscure syntax.
Sorry I can’t be more help. I’m missing something but I’m not sure what. I have a lot on my plate this weekend, so I can’t try this in my home lab. If I come across anything I’ll hit you up here, or perhaps someone else could join in this convo.
message:/\[WARN\]*/ doesn’t work because in a regex * means repeat the previous character (0 or more times).
So it would match something like [WARN]]]]]]].
If you want to use a regex you need to add the “.” character before the wildcard: message:/\[WARN\].*/
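To make the difference between the two patterns concrete, here is a small sketch in Python. It uses re.fullmatch to imitate the fact that Lucene regular expressions have to match the whole term; Python's re is a different engine, and whether the fixed query actually returns hits in Graylog still depends on how the message field is analyzed. The helper name matches_whole and the sample strings are made up.

```python
import re


def matches_whole(pattern, text):
    """True if the pattern matches the entire string (anchored at both ends)."""
    return re.fullmatch(pattern, text) is not None


broken = r"\[WARN\]*"   # "*" repeats only the preceding "]"
fixed = r"\[WARN\].*"   # ".*" means any character, any number of times

print(matches_whole(broken, "[WARN]]]]"))         # True
print(matches_whole(broken, "[WARN"))             # True (zero "]" is allowed)
print(matches_whole(broken, "[WARN] disk full"))  # False
print(matches_whole(fixed, "[WARN] disk full"))   # True
```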
message:\[
Returns nothing. I believe it’s because the first search forms a complete token ([WARN] as a word) but the second does not. It requires converting it to a regex.
Exactly, it’s because it’s a word. But I think it’s not mandatory to use regex.
Can you try
message:\[*
(In normal mode (not regex mode) the wildcard can replace any characters)
For what it’s worth, elasticsearch and opensearch have some interesting, let’s call them, quirks in how partial string matches and regex matching work.
Given these “quirks”, I recommend parsing out any text that you need to search or match on into its own field using pipeline processing (see also Streams & Pipelines Webinar) or extractors.
To expand on the quirkiness, elasticsearch and opensearch have different behavior for analyzed fields using the standard analyzer. This is important because it dictates how the text is stored in elasticsearch and opensearch, which affects how you can search for that text.
As a quick example, if you have the message “apple banana orange”, and this is stored in an analyzed field using the standard analyzer, each word will be its own token and you won’t be able to apply search filters to the message as a whole.
In Graylog, the message and full_message fields are analyzed, so they behave differently than all other fields.
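As a rough illustration of the tokenization described above, here is a short Python sketch. It is only an approximation I made up for this post (lowercase the text, then split on anything that is not a letter, digit or underscore); the real standard analyzer follows Unicode text segmentation rules and differs in the details. Under this approximation, characters such as [ and / never end up inside a token, which may be one reason queries containing them find nothing in an analyzed field.

```python
import re


def rough_standard_tokens(text):
    """Very rough stand-in for the standard analyzer.

    Lowercases the text and keeps runs of word characters as tokens.
    This is an approximation for illustration, not the real tokenizer.
    """
    return re.findall(r"\w+", text.lower())


print(rough_standard_tokens("apple banana orange"))  # ['apple', 'banana', 'orange']
print(rough_standard_tokens("[WARN] disk full"))     # ['warn', 'disk', 'full']
print(rough_standard_tokens("2023/01/15 started"))   # ['2023', '01', '15', 'started']
```

Since a search term is compared against individual tokens rather than the whole message, copying the part you care about into its own field at ingest time is usually the more predictable route.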