Use of %{QUOTEDSTRING} causes Graylog to seize

1. Describe your incident:
We are creating a grok pattern to parse a complex log message that contains quoted strings with commas inside the quotes. Whenever we use the default %{QUOTEDSTRING} or %{QS} grok patterns against these logs, Graylog stops sending all logs to OpenSearch until the Graylog server is rebooted. The problem does not occur when %{QUOTEDSTRING} or %{QS} is removed from the grok pattern.

2. Describe your environment:
We’re running Graylog 6.1 and OpenSearch 2.15 in a two-server configuration on a security-hardened Oracle Linux 8 OS.

3. What steps have you already taken to try and solve the problem?
We found nothing in the Graylog server logs or OpenSearch cluster logs related to the issue.

4. How can the community help?

  • How can we grok quoted strings with commas without using %{QUOTEDSTRING} or %{QS}?

  • How can we use %{QUOTEDSTRING} or %{QS} without Graylog seizing?

Searching for QUOTEDSTRING in this forum, I see a bunch of people are using it without encountering your issue. We are going to need more information …

Are you using a grok input extractor or one of the grok functions in a pipeline rule?
Are there other inputs and do those all cease indexing?
Do you see messages in the Processing and Indexing Failures stream?
Please share a sanitized version of the pattern and the log data.

Are you using a grok input extractor or one of the grok functions in a pipeline rule?
We’re using the grok function “Extract grok to fields” in a pipeline rule.

Are there other inputs and do those all cease indexing?
There is another separate input/stream/pipeline/index. Confirmed that all cease indexing.

Do you see messages in the Processing and Indexing Failures stream?
“No failed indexing attempts in the last 24 hours.”

Please share a sanitized version of the pattern and the log data.

Pattern:

%{WORD:logsrc}.div.company.com %{NUMBER:num},%{DATA:receive_time},%{NUMBER:serial},%{WORD:type},%{WORD:subtype},%{NUMBER:port},%{DATA:time_generated},%{DATA:src},%{DATA:dst},%{DATA:natsrc},%{DATA:natdst},%{DATA:rule},%{DATA:srcuser},%{DATA:dstuser},%{DATA:app},%{DATA:vsys},%{DATA:from},%{DATA:to},%{DATA:inbound_if},%{DATA:outboundif},%{DATA:logset},%{DATA:unknown_time},%{NUMBER:sessionid},%{NUMBER:repeatcnt},%{NUMBER:sport},%{NUMBER:dport},%{NUMBER:natsport},%{NUMBER:natdport},%{DATA:flags},%{DATA:proto},%{DATA:action},%{NUMBER:bytes},%{NUMBER:bytes_sent},%{NUMBER:bytes_received},%{NUMBER:packets},%{DATA:start},%{NUMBER:elapsed},%{DATA:category},%{DATA:unknown_field1},%{DATA:seqno},%{DATA:actionflags},%{DATA:srcloc},%{DATA:dstloc},%{DATA:unknown_field2},%{DATA:pkts_sent},%{DATA:pkts_received},%{DATA:session_end_reason},%{DATA:dg_hier_level_1},%{DATA:dg_hier_level_2},%{DATA:dg_hier_level_3},%{DATA:dg_hier_level_4},%{DATA:vsys_name},%{DATA:device_name},%{DATA:action_source},%{DATA:src_uuid},%{DATA:dst_uuid},%{DATA:tunnelid},%{DATA:monitortag},%{DATA:parent_session_id},%{DATA:parent_start_time},%{DATA:tunnel},%{NUMBER:assoc_id},%{NUMBER:chunks},%{NUMBER:chunks_sent},%{NUMBER:chunks_received},%{DATA:rule_uuid},%{DATA:http2_connection},%{DATA:link_change_count},%{DATA:policy_id},%{DATA:link_switches},%{DATA:sdwan_cluster},%{DATA:sdwan_device_type},%{DATA:sdwan_cluster_type},%{DATA:sdwan_site},%{DATA:dynusergroup_name},%{DATA:xff_ip},%{DATA:src_category},%{DATA:src_profile},%{DATA:src_model},%{DATA:src_vendor},%{DATA:src_osfamily},%{DATA:src_osversion},%{DATA:src_host},%{DATA:src_mac},%{DATA:dst_category},%{DATA:dst_profile},%{DATA:dst_model},%{DATA:dst_vendor},%{DATA:dst_osfamily},%{DATA:dst_osversion},%{DATA:dst_host},%{DATA:dst_mac},%{DATA:container_id},%{DATA:pod_namespace},%{DATA:pod_name},%{DATA:src_edl},%{DATA:dst_edl},%{DATA:hostid},%{DATA:serialnumber},%{DATA:src_dag},%{DATA:dst_dag},%{DATA:session_owner},%{DATA:high_res_timestamp},%{DATA:nssai_sst},%{DATA:nssai_sd},%{DATA:subcategory_of_app},%{DATA:category_of_app},%{DATA:technology_of_app},%{DATA:risk_of_app},%{QUOTEDSTRING:characteristic_of_app},

Sample Data:

panorama01.div.company.com 1,2024/12/02 15:18:02,024101003988,TRAFFIC,end,2817,2024/12/02 15:18:02,10.101.19.7,10.101.159.130,0.0.0.0,0.0.0.0,inside-in_37,msrpc-base,vdiv1,PROD.Internal,PROD.DIVNET,ethernet1/2,ethernet1/1,default,2024/12/02 15:18:02,655765,1,61636,49681,0,0,0x401a,tcp,allow,4896,3798,1098,16,2024/12/02 15:17:19,29,any,7432514093394733498,0x8000000000000000,10.0.0.0-10.255.255.255,10.0.0.0-10.255.255.255,9,7,tcp-rst-from-client,20,11,0,0,PA2.NJ17,from-policy,0,0,N/A,0,0,0,0,1d654101-d3e3-774e-df69-82f43a334666,0,0,2024-12-02T15:18:02.940-05:00,infrastructure,networking,network-protocol,2,"has-known-vulnerability,tunnel-other-application,pervasive-use",msrpc,untunneled,no,no,0,NonProxyTraffic,

It’s a PAN-OS TRAFFIC log. The “characteristic_of_app” field is quoted but contains commas. The “Test with Sample Data” button works, but if we press “Update Pattern” with %{QUOTEDSTRING} included in the pattern, the output metric goes straight to zero and no further logs are indexed. If we then remove the %{QUOTEDSTRING} reference from the pattern (without rebooting the Graylog server), the output stays at zero. Commands to stop or restart the graylog-server service just hang until we hit CTRL-C. If the server is rebooted with %{QUOTEDSTRING} removed from the pattern, it resumes as normal.
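To illustrate the quoted field with embedded commas, here is a minimal Python sketch (Python is used purely for illustration and does not run inside Graylog). It splits a shortened tail of the sample line with a quote-aware CSV reader, assuming the raw log uses straight double quotes:

```python
import csv

# Shortened tail of the sample line; straight double quotes are assumed.
tail = '2,"has-known-vulnerability,tunnel-other-application,pervasive-use",msrpc,untunneled'

# csv.reader honours double quotes, so commas inside the quoted
# field are treated as data rather than as separators.
fields = next(csv.reader([tail]))

print(len(fields))  # 4
print(fields[1])    # the whole quoted value, without the surrounding quotes
```

A naive split on every comma would instead break that one value into three pieces, which is why the field needs a quote-aware match.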

The pattern doesn’t match. I used https://grokdebugger.com/ to test and found that the last matching pattern is natsport. The rest of the message string is not matched. I was only using subsets of the pattern, so this didn’t involve QUOTEDSTRING at all.
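The same subset testing can be approximated offline. This is a rough Python sketch, not the grok engine: the three regexes in `GROK` are simplified stand-ins for the stock WORD, NUMBER and DATA patterns, and `to_regex` and `last_matching_field` are hypothetical helper names. It grows the pattern one comma-separated token at a time and reports the last field that still matches:

```python
import re

# Simplified stand-ins for the stock grok patterns (assumptions, not the exact definitions).
GROK = {
    "WORD": r"\w+",
    "NUMBER": r"[+-]?\d+(?:\.\d+)?",
    "DATA": r".*?",
}

def to_regex(pattern: str) -> str:
    """Turn %{NAME:field} tokens into named groups and escape the literal text between them."""
    out, pos = [], 0
    for m in re.finditer(r"%\{(\w+):(\w+)\}", pattern):
        out.append(re.escape(pattern[pos:m.start()]))
        out.append(f"(?P<{m.group(2)}>{GROK[m.group(1)]})")
        pos = m.end()
    out.append(re.escape(pattern[pos:]))
    return "".join(out)

def last_matching_field(pattern: str, line: str):
    """Return the field name of the longest pattern prefix that still matches, or None."""
    tokens = [t for t in pattern.split(",") if t]
    last = None
    for i in range(1, len(tokens) + 1):
        # Keep the trailing comma so each field must end at a separator.
        prefix = ",".join(tokens[:i]) + ","
        if re.match(to_regex(prefix), line) is None:
            break
        last = re.search(r":(\w+)\}$", tokens[i - 1]).group(1)
    return last

# Toy example: the third field is not a number, so matching stops at "port".
print(last_matching_field("%{WORD:type},%{NUMBER:port},%{NUMBER:sport}",
                          "TRAFFIC,2817,tcp,"))
```

Because DATA is a lazy wildcard, it can stretch across commas to help a later token match, so the reported field is only a hint about where the pattern and the data drift apart.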

As far as GL not indexing: I’m guessing that simply no message is being processed successfully.
