My problem is with the user agent. In a normal case, the user agent is just a string retrievable with %{DATA:useragent}. But in some cases, the user agent contains a semicolon, so it is enclosed in quotes in the log line.
How can I get this field in all cases?
I tried to use the %{QUOTEDSTRING} grok pattern, which does not work when there are no quotes. So I tried to create a new pattern (%{QUOTEDSTRING}|.*), but when there are no quotes it also catches the semicolons that follow (Mozilla/5.0;13;14;22).
I also tried with (%{QUOTEDSTRING}|.*?), without success.
Here are two examples of lines (anonymized of course)
With quotes because of the semicolon in the user agent: 21:50:32,522;434FBF4-9B38-DEADBEEF-01BB-6418E98D-13E52BB8-1057E;1679353197584;;8.8.8.8;;;802;;;/myuripath/search?date=1292523600000;;;"Mozilla/5.0 (Linux; Android 12; SM-G970U1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/111.0.0.0 Mobile Safari/537.36";0;0;0;0;;0;0;;;
Looking at “Without quotes”, it looks like you pulled some data from the message field and that is what's left. Are you trying to separate the message into different fields or modify the message itself?
By chance, what type of input are you using? And what types of inputs have you used?
I usually work with rules and not with extractors. I do not know how long Graylog will support extractors.
I can start with a GROK debugger like this one:
Sometimes OpenAI can help to build the basic GROK pattern. If it is something well known, the AI will often tell me which fields are optional.
The next step would be looking into the Graylog Schema. You can use your own field names instead of the schema fields, but if you have the choice, using the schema means you may be able to import dashboards that work out of the box.
Not all GROK patterns will run out of the box, because the escaping is different. For a double quote, for example, the Grok debugger needs just a single quote character, but the same GROK pattern in Graylog might need four escapes.
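On the original question, the catch-all `.*` is what swallows the following semicolons. One option that might work is to make the unquoted alternative stop at the delimiter, i.e. "a quoted string, or else anything that is not a semicolon". Below is a minimal sketch of that idea as a plain regex in Python, not a tested Graylog pattern. The function name `extract_useragent`, the assumption that the user agent is always the 14th field, and the unquoted sample line are mine, not from the thread. In Grok the same idea would presumably look like `(%{QUOTEDSTRING}|[^;]*)`, but I have not verified that in Graylog, and the escaping may differ as mentioned above.

```python
import re

# Either a double-quoted string (which may contain semicolons) or, failing
# that, a run of characters that stops at the next semicolon.
USERAGENT = r'(?P<useragent>"[^"]*"|[^;]*)'

# Assumption: the user agent is the 14th semicolon-delimited field, so
# 13 fields (possibly empty) are skipped before it.
LINE_RE = re.compile(r'^(?:[^;]*;){13}' + USERAGENT + r';')


def extract_useragent(line):
    """Return the user agent without surrounding quotes, or None if no match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    return match.group("useragent").strip('"')


# Sample line from the post (quoted user agent).
QUOTED = (
    '21:50:32,522;434FBF4-9B38-DEADBEEF-01BB-6418E98D-13E52BB8-1057E;'
    '1679353197584;;8.8.8.8;;;802;;;/myuripath/search?date=1292523600000;;;'
    '"Mozilla/5.0 (Linux; Android 12; SM-G970U1) AppleWebKit/537.36 '
    '(KHTML, like Gecko) Chrome/111.0.0.0 Mobile Safari/537.36";'
    '0;0;0;0;;0;0;;;'
)

# Hypothetical unquoted variant, made up for illustration only.
UNQUOTED = (
    '21:50:32,522;434FBF4-9B38-DEADBEEF-01BB-6418E98D-13E52BB8-1057E;'
    '1679353197584;;8.8.8.8;;;802;;;/myuripath/search?date=1292523600000;;;'
    'Mozilla/5.0;13;14;22;0;;0;0;;;'
)

if __name__ == "__main__":
    print(extract_useragent(QUOTED))
    print(extract_useragent(UNQUOTED))
```

The order of the alternation matters: the quoted form is tried first, so a user agent containing semicolons is taken whole, and only an unquoted value falls through to `[^;]*`. This sketch does not handle escaped quotes inside a quoted user agent.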