Pipeline Rule not able to parse double from string field

js275 · July 25, 2022, 2:18pm

1. Describe your incident:
I’m trying to convert a string value from an nginx JSON log to a double value. The raw message is as follows:
{"remote_addr": "127.0.0.1", "certificate_subject": "", "remote_user": "reverse-proxy", "time_local": "25/Jul/2022:14:01:19 +0000", "request": "POST /api/v1/access/auth HTTP/1.1", "status": "200", "body_bytes_sent": "1656", "http_referrer": "", "http_user_agent": "lua-resty-http/0.08 (Lua) ngx_lua/10011", "request_id": "fdcf14b7-e812-4f3c-aa62-4d374a85b91e", "request_length": "482", "request_time": "0.011", "upstream_addr": "192.168.3.33:14871", "upstream_response_time": 0.012, "pipe": ".", "ssl_protocol": "", "ssl_cipher": ""}

The value in question is “upstream_response_time”. I have defined a rule as follows:

rule "Convert to Double"
when
    true
then
    let converted_double = to_double($message.upstream_response_time, 987.6);
    set_field("upstream_response_time_converted", converted_double);
end

(The default value is there for debug purposes)

But looking at the stream message, I only get the following result:
[screenshot: graylog_converted]

Notice that the “converted” field contains the default value of 987.6 as defined in the rule and not the converted value of the original field. This is the case for all messages.

2. Describe your environment:

Graylog version 4.3.2
JDK: Oracle Corporation 1.8.0_162 on Linux 4.15.0-189-generic
The nginx messages are converted using the extractor:

      "title": "Nginx JSON access log",
      "extractor_type": "json",
      "converters": [],
      "order": 0,
      "cursor_strategy": "copy",
      "source_field": "message",
      "target_field": "",
      "extractor_config": {
        "flatten": false,
        "list_separator": ",",
        "kv_separator": "=",
        "key_prefix": "",
        "key_separator": "_",
        "replace_key_whitespace": false,
        "key_whitespace_replacement": "_"
      },

3. What steps have you already taken to try and solve the problem?
I have run the same message through the simulator to see if I get any helpful messages, but it shows the same result.

4. How can the community help?
Is my rule correct, or what am I doing wrong?
How/where can I get more helpful messages that point out what is going wrong?

tmacgbay · July 25, 2022, 7:07pm

I wonder if upstream_response_time_converted is already a long and to_double() is expecting a string value?

`to_double(value, [default]) : Double` Converts a value to a double value using its string representation

patrickmann · July 26, 2022, 8:34am

to_double is actually really simple - it will take either a string or number. It only returns the default if the value is null.
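
A minimal sketch for checking what the rule actually sees, assuming the field name from the original post. It uses the built-in `debug()` function, whose output goes to the server log of the Graylog node that processed the message, not to the simulator. The rule name is made up for illustration.

```
rule "Debug upstream_response_time conversion"
when
    has_field("upstream_response_time")
then
    // Log the raw value exactly as the rule receives it
    debug(concat("raw upstream_response_time: ", to_string($message.upstream_response_time)));

    // Same conversion as in the original rule, with the same debug default
    let converted_double = to_double($message.upstream_response_time, 987.6);

    // If this logs 987.6, the default was used instead of the field value
    debug(concat("converted: ", to_string(converted_double)));
end
```

Comparing the two log lines shows whether the conversion itself works, independently of how the stored field is later displayed.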

I think you have uncovered a bizarre bug. Doing some testing I found that when the new field name is an existing numeric field name with _converted appended, it does not get processed correctly.

Try renaming the new field to something else.
I have filed a bug on this. I also requested better documentation.

On a side note: The function parser is a bit strange with regard to field names. It requires additional escaping when the name has a dash (but it didn’t help in your example):

github.com/Graylog2/graylog2-server

Pipeline Rules experienced trouble with dash in fieldname

opened 12:07PM - 05 Sep 19 UTC

closed 11:22AM - 27 Nov 19 UTC

xtruthx

needs-input bug triaged

There are fields auto-generated by a kv-value based rule therefore the field names are generated by this rule. There several field names including a dash or multiple dashes. If i want to work with them as reference for field in pipeline rules graylog is not accepting them cause it interprets the dash separated values as variables.

## Expected Behavior

```
rule "ise grok cisco_client_mac from field cise_Acct-Session-Id"
when
    has_field("cise_Acct-Session-Id") // is working fine
then
    set_fields(grok(
        pattern: "%{SOMEPATTERN}",
        value: to_string($message.cise_Acct-Session-Id),
        only_named_captures: true
        )
    );
end
```

## Current Behavior

The rule interpreter not accept the field reference and throws following errors:

Undeclared Variable Session in line 7 pos 50
Undeclared Variable Id in line 7 pos 58

## Possible Solution/Workarround

Use trim function vor keys in kv_value function to remove dashes.

## Steps to Reproduce (for bugs)

1. Create a field named cise_Acct-Session-Id or foo_bar-basel
2. try to use this fild in a pipeline function

## Context

There are fields auto-generated by a kv-value based rule therefore the field names are generated by this rule. There several field names including a dash or multiple dashes. If i want to work with them as reference for field in pipeline rules graylog is not accepting them cause it interprets the dash separated values as variables.

## Your Environment

* Graylog Version: 3.1.0+aa5175e, codename Quantum Dog
* JVM: PID 25565, Oracle Corporation 1.8.0_222 on Linux 4.9.0-8-amd64
* Elasticsearch Version: 6.8.1
* MongoDB Version: 4.0
* Operating System: PRETTY_NAME="Debian GNU/Linux 9 (stretch)" NAME="Debian GNU/Linux" VERSION_ID="9" VERSION="9 (stretch)" ID=debian HOME_URL="https://www.debian.org/" SUPPORT_URL="https://www.debian.org/support" BUG_REPORT_URL="https://bugs.debian.org/"
* Browser version: Firefox Quantum 60.8.0esr (64-Bit)
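
For the dash case described in that issue, the extra escaping is done with backticks around the field name. The sketch below reuses the rule from the issue; `SOMEPATTERN` is the issue's own placeholder, and the shortened rule name is made up for illustration. Treat it as a sketch and check it against the rule editor in your Graylog version.

```
rule "grok cisco_client_mac from cise_Acct-Session-Id"
when
    has_field("cise_Acct-Session-Id")
then
    // Backticks make the parser treat the whole name as one identifier,
    // so the dashes are not read as minus signs between variables
    set_fields(
        grok(
            pattern: "%{SOMEPATTERN}",
            value: to_string($message.`cise_Acct-Session-Id`),
            only_named_captures: true
        )
    );
end
```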

js275 · July 28, 2022, 2:51pm

Thanks @patrickmann

This indeed seems to be a bug related to _ in field names.
I changed the field name from “upstream_response_time” to a camel case “responseTime”. The field was immediately recognized correctly as a float value. That can also be seen when hovering over field names (see screenshot, notice = float vs = string with underscores).
So, no pipeline at all needed, as I now can do numerical statistics on the original field.

[screenshots: graylog_string, graylog_float]

patrickmann · August 5, 2022, 12:30pm

@js275 I need to set the record straight - my earlier statement that there might be a bug in the handling of variable names was completely off the mark. Turns out there is no bug at all and this is expected behavior!

What’s going on is due to Elastic’s dynamic type mapping. The first time Elastic encounters a new field, it deduces the type, and you cannot subsequently change it. When you later try to assign a value of some other type, it results in a processing error. Particularly when developing rules, it’s easy to unintentionally lock a name to the wrong type.
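
A minimal sketch of how a rule can sidestep an already locked name, assuming the field from the original post: write the converted value to a field name that has never been indexed before, so its type is deduced from the double. The target name `upstream_rt_double` and the rule name are hypothetical, chosen only for illustration.

```
rule "Store upstream response time under a fresh field name"
when
    has_field("upstream_response_time")
then
    // Convert via the string representation; the default applies if no value is available
    let rt = to_double($message.upstream_response_time, 0.0);

    // upstream_rt_double is a hypothetical, previously unused name,
    // so the first value indexed for it is a number
    set_field("upstream_rt_double", rt);
end
```

If the original field name has to be kept, the usual route is to let the index rotate so that a new index builds a fresh mapping. That is an assumption about a default index setup, not something confirmed in this thread.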

Here is another explanation: Indexer failures

tmacgbay · August 5, 2022, 3:41pm

Argh - usually I spot these! Good second catch @patrickmann!

