Extractor cut-mode is not working

Hello!

I have the following problem:

I receive messages with an IPv6 address in a field that I extract to “clientip”. This field is interpreted as an IP type by our ES 2.3, which does not support IPv6.

So I tried to create an extractor that copies the field contents to a new field called “clientip_v6” and set the extractor mode to “cut”, which I thought should remove the contents of “clientip”.

However, it does not. “clientip” is still intact (clientip_v6 is filled, though).

Does anybody have an idea what I’m doing wrong here?

Thanks.

Kind regards
Dennis

Please post the complete extractor configuration and some sample messages.

Yes, sure.

This is our complete extractor set for the input:

{
  "extractors": [
    {
      "title": "Puppet: Puppet run time",
      "extractor_type": "regex",
      "converters": [
        {
          "type": "numeric",
          "config": {}
        }
      ],
      "order": 7,
      "cursor_strategy": "copy",
      "source_field": "message",
      "target_field": "puppetRunTime",
      "extractor_config": {
        "regex_value": "^.*puppet-agent\\[\\d[0-9]{0,9}.*\\]: Finished catalog run in ((\\d[0-9]{0,9}\\.[0-9]{0,9})) seconds"
      },
      "condition_type": "regex",
      "condition_value": "^.*(puppet-agent\\[\\d[0-9]{0,9}.*\\]: Finished catalog run in (\\d[0-9]{0,9}\\.[0-9]{0,9}) seconds)"
    },
    {
      "title": "Puppet: Puppet Configuration Version",
      "extractor_type": "regex",
      "converters": [
        {
          "type": "numeric",
          "config": {}
        }
      ],
      "order": 16,
      "cursor_strategy": "copy",
      "source_field": "message",
      "target_field": "puppetConfigVersion",
      "extractor_config": {
        "regex_value": "^.*puppet-agent\\[\\d[0-9]{0,9}.*\\]: Applying configuration version '(\\d[0-9]{0,9})'"
      },
      "condition_type": "regex",
      "condition_value": "^.*(puppet-agent\\[\\d[0-9]{0,9}.*\\]: Applying configuration version '(\\d[0-9]{0,9})')"
    },
    {
      "title": "Apache Combined Log Extractor",
      "extractor_type": "grok",
      "converters": [],
      "order": 8,
      "cursor_strategy": "copy",
      "source_field": "message",
      "target_field": "",
      "extractor_config": {
        "grok_pattern": "%{COMBINEDAPACHELOG}",
        "named_captures_only": true
      },
      "condition_type": "none",
      "condition_value": ""
    },
    {
      "title": "Graylog server log format",
      "extractor_type": "grok",
      "converters": [],
      "order": 12,
      "cursor_strategy": "copy",
      "source_field": "message",
      "target_field": "",
      "extractor_config": {
        "grok_pattern": "%{GRAYLOG_SERVER_LOG}",
        "named_captures_only": true
      },
      "condition_type": "none",
      "condition_value": ""
    },
    {
      "title": "MongoDB Log",
      "extractor_type": "grok",
      "converters": [],
      "order": 10,
      "cursor_strategy": "copy",
      "source_field": "message",
      "target_field": "",
      "extractor_config": {
        "grok_pattern": "%{MONGODB_LOG}",
        "named_captures_only": true
      },
      "condition_type": "none",
      "condition_value": ""
    },
    {
      "title": "Copy IP to text",
      "extractor_type": "copy_input",
      "converters": [],
      "order": 22,
      "cursor_strategy": "copy",
      "source_field": "clientip",
      "target_field": "clientip_text",
      "extractor_config": {},
      "condition_type": "none",
      "condition_value": ""
    },
    {
      "title": "HAProxy TCP Extractor",
      "extractor_type": "grok",
      "converters": [],
      "order": 14,
      "cursor_strategy": "copy",
      "source_field": "message",
      "target_field": "",
      "extractor_config": {
        "grok_pattern": "%{HAPROXYTCP}",
        "named_captures_only": false
      },
      "condition_type": "none",
      "condition_value": ""
    },
    {
      "title": "Postfix SMTP",
      "extractor_type": "grok",
      "converters": [],
      "order": 2,
      "cursor_strategy": "copy",
      "source_field": "message",
      "target_field": "",
      "extractor_config": {
        "grok_pattern": "%{POSTFIXSMTP}"
      },
      "condition_type": "none",
      "condition_value": ""
    },
    {
      "title": "Postfix SMTPD",
      "extractor_type": "grok",
      "converters": [],
      "order": 15,
      "cursor_strategy": "copy",
      "source_field": "message",
      "target_field": "",
      "extractor_config": {
        "grok_pattern": "%{POSTFIX_SMTPD}"
      },
      "condition_type": "none",
      "condition_value": ""
    },
    {
      "title": "Postfix Queue Manager",
      "extractor_type": "grok",
      "converters": [],
      "order": 11,
      "cursor_strategy": "copy",
      "source_field": "message",
      "target_field": "",
      "extractor_config": {
        "grok_pattern": "%{POSTFIX_QMGR}"
      },
      "condition_type": "none",
      "condition_value": ""
    },
    {
      "title": "Postfix Statistics",
      "extractor_type": "grok",
      "converters": [],
      "order": 13,
      "cursor_strategy": "copy",
      "source_field": "message",
      "target_field": "",
      "extractor_config": {
        "grok_pattern": "%{POSTFIX_ANVIL}"
      },
      "condition_type": "none",
      "condition_value": ""
    },
    {
      "title": "IPTables",
      "extractor_type": "grok",
      "converters": [],
      "order": 17,
      "cursor_strategy": "copy",
      "source_field": "message",
      "target_field": "",
      "extractor_config": {
        "grok_pattern": "%{IPTABLES}"
      },
      "condition_type": "none",
      "condition_value": ""
    },
    {
      "title": "JIRA_PROJECT_ID",
      "extractor_type": "grok",
      "converters": [],
      "order": 18,
      "cursor_strategy": "copy",
      "source_field": "request",
      "target_field": "",
      "extractor_config": {
        "grok_pattern": ".*selectedProjectId=(?<project_id>[^&]*)"
      },
      "condition_type": "none",
      "condition_value": ""
    },
    {
      "title": "Artifactory Request Log",
      "extractor_type": "grok",
      "converters": [],
      "order": 3,
      "cursor_strategy": "copy",
      "source_field": "message",
      "target_field": "",
      "extractor_config": {
        "grok_pattern": "%{DATESTAMP_EVENTLOG:timestamp}\\|%{NUMBER:size}\\|REQUEST\\|%{IPV4:client}\\|%{USER:user}\\|%{WORD:method}\\|(?<path>[^|]*)\\|HTTP/%{NUMBER:httpversion}\\|%{NUMBER:response}\\|.+",
        "named_captures_only": true
      },
      "condition_type": "string",
      "condition_value": "|"
    },
    {
      "title": "AEM Request Log response",
      "extractor_type": "grok",
      "converters": [],
      "order": 5,
      "cursor_strategy": "copy",
      "source_field": "message",
      "target_field": "",
      "extractor_config": {
        "grok_pattern": "%{HTTPDATE} \\[%{NUMBER:sequence}\\] %{NOTSPACE:direction} %{NUMBER:response} %{NOTSPACE:content-type} %{NUMBER:duration}ms"
      },
      "condition_type": "none",
      "condition_value": ""
    },
    {
      "title": "AEM Request Log request",
      "extractor_type": "grok",
      "converters": [],
      "order": 4,
      "cursor_strategy": "copy",
      "source_field": "message",
      "target_field": "",
      "extractor_config": {
        "grok_pattern": "%{HTTPDATE} \\[%{NUMBER:sequence}\\] %{NOTSPACE:direction} %{WORD:verb} %{NOTSPACE:path} HTTP/%{NUMBER:httpversion}"
      },
      "condition_type": "none",
      "condition_value": ""
    },
    {
      "title": "AEM Access Log",
      "extractor_type": "grok",
      "converters": [],
      "order": 0,
      "cursor_strategy": "copy",
      "source_field": "message",
      "target_field": "",
      "extractor_config": {
        "grok_pattern": "%{IPORHOST:clientip} %{HTTPDUSER:ident} %{USER:auth} %{HTTPDATE:timestamp} \"(?:%{WORD:verb} %{NOTSPACE:request}(?: HTTP/%{NUMBER:httpversion})?|%{DATA:rawrequest})\" %{NUMBER:response} (?:%{NUMBER:bytes}|-) %{QS:referrer} %{QS:agent}"
      },
      "condition_type": "none",
      "condition_value": ""
    },
    {
      "title": "AEM Error Log",
      "extractor_type": "grok",
      "converters": [],
      "order": 1,
      "cursor_strategy": "cut",
      "source_field": "message",
      "target_field": "",
      "extractor_config": {
        "grok_pattern": "%{DATE} %{HOUR}:%{MINUTE}:%{SECOND}.(?<second_fraction>[0-9][0-9][0-9]) \\*%{LOGLEVEL}\\* "
      },
      "condition_type": "none",
      "condition_value": ""
    },
    {
      "title": "Fix POSTFIX_KEYVALUE length",
      "extractor_type": "substring",
      "converters": [],
      "order": 20,
      "cursor_strategy": "cut",
      "source_field": "POSTFIX_KEYVALUE",
      "target_field": "POSTFIX_KEYVALUE",
      "extractor_config": {
        "end_index": 32766,
        "begin_index": 0
      },
      "condition_type": "regex",
      "condition_value": "^.{32765,}$"
    },
    {
      "title": "Fix postfix_keyvalue_data length",
      "extractor_type": "substring",
      "converters": [],
      "order": 21,
      "cursor_strategy": "cut",
      "source_field": "postfix_keyvalue_data",
      "target_field": "postfix_keyvalue_data",
      "extractor_config": {
        "end_index": 32766,
        "begin_index": 0
      },
      "condition_type": "regex",
      "condition_value": "^.{32765,}$"
    },
    {
      "title": "Fix POSTFIX_SMTPD length",
      "extractor_type": "substring",
      "converters": [],
      "order": 19,
      "cursor_strategy": "cut",
      "source_field": "POSTFIX_SMTPD",
      "target_field": "POSTFIX_SMTPD",
      "extractor_config": {
        "end_index": 32766,
        "begin_index": 0
      },
      "condition_type": "regex",
      "condition_value": "^.{32765,}$"
    },
    {
      "title": "Christ Apache Logs",
      "extractor_type": "grok",
      "converters": [],
      "order": 9,
      "cursor_strategy": "copy",
      "source_field": "message",
      "target_field": "",
      "extractor_config": {
        "grok_pattern": "%{CHRISTAPACHELOG}"
      },
      "condition_type": "none",
      "condition_value": ""
    },
    {
      "title": "EP Apache Log extractor",
      "extractor_type": "grok",
      "converters": [],
      "order": 6,
      "cursor_strategy": "copy",
      "source_field": "message",
      "target_field": "",
      "extractor_config": {
        "grok_pattern": "%{EPAPACHELOG}"
      },
      "condition_type": "none",
      "condition_value": ""
    },
    {
      "title": "Remove IPv6",
      "extractor_type": "regex_replace",
      "converters": [],
      "order": 23,
      "cursor_strategy": "cut",
      "source_field": "clientip",
      "target_field": "clientip_v6",
      "extractor_config": {
        "regex": ".*",
        "replacement": "0.0.0.0",
        "replace_all": true
      },
      "condition_type": "string",
      "condition_value": ":"
    }
  ],
  "version": "2.2.0-SNAPSHOT"
}

I posted the complete export, because maybe it’s a side effect of another extractor. The extractor giving me the headache is “Remove IPv6”

This is a sample message:

0:0:0:0:0:0:0:1 - admin 07/Mar/2018:08:51:59 +0100 "GET /etc/replication/agents.publish/flush.2.json HTTP/1.1" 200 725 "-" "Ruby"

Any ideas? :no_mouth:

Try using the regular expression (.*) in your “Remove IPv6” extractor instead of .*.

That, umm, works partly. Now the content of the clientip field is “fullyCutByExtractor” :grin:
Can this be changed?

Use “Copy”, not “Cut”. The contents of the field will be replaced by the extractor anyway.

Hm. With “copy”, clientip isn’t changed, but clientip_v6 is set to “0.0.0.00.0.0.0”…
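A note on the doubled value: it is consistent with how a bare `.*` behaves under replace-all semantics. The pattern first matches the whole string and then matches the empty string left at the end, so the replacement is inserted twice. The sketch below reproduces this with Python's `re` module (3.7 or later), which treats the trailing empty match the same way Java's regex `replaceAll` does. That the extractor behaves like this internally is an assumption based on the output reported above, not something confirmed in this thread.

```python
import re

REPLACEMENT = "0.0.0.0"
sample = "0:0:0:0:0:0:0:1"  # the clientip value from the sample message

# ".*" matches the full string, then the empty string at the end,
# so the replacement is applied twice (Python 3.7+).
doubled = re.sub(r".*", REPLACEMENT, sample)

# Anchoring the pattern rules out the second, empty match.
anchored = re.sub(r"^.*$", REPLACEMENT, sample)

# Requiring at least one character has the same effect.
non_empty = re.sub(r".+", REPLACEMENT, sample)

# Limiting to a single replacement (comparable to turning replace-all off).
first_only = re.sub(r".*", REPLACEMENT, sample, count=1)

print(doubled)     # 0.0.0.00.0.0.0
print(anchored)    # 0.0.0.0
print(non_empty)   # 0.0.0.0
print(first_only)  # 0.0.0.0
```

If that is indeed the cause, using `^.*$` or `.+` as the regex, or setting "replace_all" to false, might avoid the duplicated replacement.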

FWIW, I would use a pipeline rule for that instead of an extractor (which might not be easy to follow when lots of extractors run before and after).

Example

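rule "copy-ipv6-address"
when
  // only act on messages whose clientip contains a colon, i.e. looks like IPv6
  has_field("clientip") && contains(to_string($message.clientip), ":")
then
  let clientip = to_string($message.clientip);
  // keep the original address in a separate field
  set_field("clientip_v6", clientip);
  // overwrite clientip with a placeholder IPv4 address
  set_field("clientip", "0.0.0.0");
end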
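
For readers who want to check the rule's logic outside Graylog, here is a minimal Python sketch of the same steps applied to a message represented as a plain dictionary. The function name `move_ipv6_client_ip` and the dictionary representation are made up for illustration; they are not part of Graylog.

```python
from typing import Any, Dict

PLACEHOLDER_IPV4 = "0.0.0.0"


def move_ipv6_client_ip(message: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the message with an IPv6-looking clientip moved aside.

    Hypothetical helper mirroring the rule above: if clientip exists and
    contains a colon, copy it to clientip_v6 and overwrite clientip.
    """
    result = dict(message)
    client_ip = result.get("clientip")
    if client_ip is not None and ":" in str(client_ip):
        result["clientip_v6"] = str(client_ip)
        result["clientip"] = PLACEHOLDER_IPV4
    return result


ipv6_message = move_ipv6_client_ip({"clientip": "0:0:0:0:0:0:0:1", "verb": "GET"})
ipv4_message = move_ipv6_client_ip({"clientip": "192.0.2.10", "verb": "GET"})

print(ipv6_message)
print(ipv4_message)
```

The IPv6 message ends up with the original address in `clientip_v6` and the placeholder in `clientip`, while the IPv4 message passes through unchanged.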

Ah! I managed to get it working by setting the “store as field” value to the same field name as the source field. Now it’s just overwriting the field, which is okay. I could extract the IPv6 address with another extractor and copy it to another field, if I wanted.

Thanks for the support!
