Pipeline Rule not able to parse double from string field

1. Describe your incident:
I’m trying to convert a string value from an nginx JSON log to a double value. The raw message is as follows:
{"remote_addr": "127.0.0.1", "certificate_subject": "", "remote_user": "reverse-proxy", "time_local": "25/Jul/2022:14:01:19 +0000", "request": "POST /api/v1/access/auth HTTP/1.1", "status": "200", "body_bytes_sent": "1656", "http_referrer": "", "http_user_agent": "lua-resty-http/0.08 (Lua) ngx_lua/10011", "request_id": "fdcf14b7-e812-4f3c-aa62-4d374a85b91e", "request_length": "482", "request_time": "0.011", "upstream_addr": "192.168.3.33:14871", "upstream_response_time": 0.012, "pipe": ".", "ssl_protocol": "", "ssl_cipher": ""}

The value in question is “upstream_response_time”. I have defined a rule as follows:

rule "Convert to Double"
when
  true
then
  let converted_double = to_double($message.upstream_response_time, 987.6);
  set_field("upstream_response_time_converted", converted_double);
end

(The default value is there for debug purposes)

But looking at the stream message, I only get the following result:
[Screenshot: graylog_converted]

Notice that the “converted” field contains the default value of 987.6 as defined in the rule and not the converted value of the original field. This is the case for all messages.

2. Describe your environment:

  • Graylog version 4.3.2
  • JDK: Oracle Corporation 1.8.0_162 on Linux 4.15.0-189-generic
    The nginx messages are converted using the extractor:
      "title": "Nginx JSON access log",
      "extractor_type": "json",
      "converters": [],
      "order": 0,
      "cursor_strategy": "copy",
      "source_field": "message",
      "target_field": "",
      "extractor_config": {
        "flatten": false,
        "list_separator": ",",
        "kv_separator": "=",
        "key_prefix": "",
        "key_separator": "_",
        "replace_key_whitespace": false,
        "key_whitespace_replacement": "_"
      },

3. What steps have you already taken to try and solve the problem?
I have run the same message through the simulator to see if I get any helpful messages, but it shows the same result.

4. How can the community help?
Is my rule correct, or what am I doing wrong?
How/where can I get more helpful messages that point out what is going wrong?

I wonder if upstream_response_time_converted is already a long and to_double() is expecting a string value?

`to_double(value, [default]) : Double` Converts a value to a double value using its string representation

to_double is actually really simple - it will take either a string or number. It only returns the default if the value is null.
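
To see what the rule actually receives, one option is to log the raw and converted values with the built-in debug() function. This is a minimal sketch rather than a confirmed fix: it assumes the field is named upstream_response_time as in the message above, and that debug() output ends up in the Graylog server log on your setup.

```
rule "Debug upstream_response_time conversion"
when
  // Only run when the field exists, so a missing field is not mistaken for a failed conversion
  has_field("upstream_response_time")
then
  let raw = $message.upstream_response_time;
  // Log the value exactly as the pipeline sees it
  debug(concat("raw upstream_response_time: ", to_string(raw)));
  let converted_double = to_double(raw, 987.6);
  // Log the result of the conversion (987.6 would indicate the default was used)
  debug(concat("converted upstream_response_time: ", to_string(converted_double)));
end
```

If the second log line shows the real value but the stored message still shows something else, the conversion itself is working and the problem lies further downstream.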

I think you have uncovered a bizarre bug. Doing some testing I found that when the new field name is an existing numeric field name with _converted appended, it does not get processed correctly.

Try renaming the new field to something else.
I have filed a bug on this. I also requested better documentation.

On a side note: The function parser is a bit strange with regard to field names. It requires additional escaping when the name has a dash (but it didn’t help in your example):


Thanks @patrickmann

This indeed seems to be a bug related to _ in field names.
I changed the field name from “upstream_response_time” to a camel case “responseTime”. The field was immediately recognized correctly as a float value. That can also be seen when hovering over field names (see screenshot, notice = float vs = string with underscores).
So, no pipeline at all needed, as I now can do numerical statistics on the original field.

[Screenshots: graylog_string, graylog_float]


@js275 I need to set the record straight - my earlier statement that there might be a bug in the handling of variable names was completely off the mark. Turns out there is no bug at all and this is expected behavior!

What’s going on is due to Elastic’s dynamic type mapping. The first time Elastic encounters a new field in an index, it deduces the type, and you cannot subsequently change it in that index. When you later try to assign a value of some other type, it results in a processing error. Particularly when developing rules, it’s easy to unintentionally lock a name to the wrong type.

Here is another explanation: Indexer failures
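
One way to sidestep a field name that is already locked to the wrong type is to write the converted value to a name that has never been indexed before. A minimal sketch, assuming the hypothetical field name upstream_response_time_seconds has not been used in the current index:

```
rule "Store response time under a fresh field name"
when
  has_field("upstream_response_time")
then
  // "upstream_response_time_seconds" is a hypothetical name chosen for illustration;
  // any name not yet present in the index mapping should behave the same way
  set_field("upstream_response_time_seconds", to_double($message.upstream_response_time));
end
```

Because the mapping is decided by the first value stored under a name, it helps to settle on the final type before the first test message is indexed under that name.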


Argh - usually I spot these! Good second catch @patrickmann! :smiley: