Indexer failures and regex with set_field in Pipeline 2

Hello, I have the same problem as in the topic “Indexer failures and regex with set_field” in Pipeline by Grakkal. I have read the documentation linked in jochen’s answer, but I am having trouble creating a new index template.
I want the field to look like this:

a1_fw_device_name
XG210

but I get:

a1_fw_device_name
{"0":"XG210"}

Example of $message.message:
device="SFW" date=2020-02-10 time=15:19:35 timezone="CET" device_name="XG210" log_type="Firewall" log_component="Firewall Rule" log_subtype="Allowed" status="Allow" priority=Information duration=0 fw_rule_id=132 policy_type=1 user_name=""

My pipeline rule:

rule "Sorted fields of Sophos syslog"
when
    true
then
    let str_msg = to_string($message.message);
    let fw_device_name = regex("device_name=\"([^\"]*)\"", str_msg);
    set_field("a1_fw_device_name", fw_device_name["0"]);
end
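
For anyone reproducing this outside Graylog: the output shown above suggests that regex() returns a map of capture groups keyed by index strings ("0" for the first group), so the field only becomes a plain string once a single group is picked out. The following is a minimal Python sketch of that distinction. The helper name regex_groups is hypothetical and only mimics the result shape seen in this thread; it is not Graylog's implementation.

```python
import re

def regex_groups(pattern: str, value: str) -> dict:
    """Mimic the result shape seen in the thread: capture groups keyed "0", "1", ..."""
    match = re.search(pattern, value)
    if match is None:
        return {}
    return {str(i): g for i, g in enumerate(match.groups())}

# Abbreviated version of the sample message from the post.
msg = 'device="SFW" date=2020-02-10 time=15:19:35 timezone="CET" device_name="XG210" log_type="Firewall"'

groups = regex_groups(r'device_name="([^"]*)"', msg)
whole_result = groups        # storing the whole map gives the unwanted object form
single_group = groups["0"]   # picking one group gives the plain string

print(whole_result)
print(single_group)
```

The mapper_parsing_exception quoted further down is consistent with this: if a1_fw_device_name was first indexed as the whole map, the active index may have mapped it as an object, and a plain string sent later would then conflict with that mapping. That reading is an assumption based on the error text only.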

My attempt at a new index template:

{
  "template": "graylog_*",
  "mappings" : {
    "message" : {
      "properties" : {
        "0" : {
          "type" : "text"
        }
      }
    }
  }
}

I still get the error {"type":"mapper_parsing_exception","reason":"object mapping for [a1_fw_device_name] tried to parse field [a1_fw_device_name] as object, but found a concrete value"}.
Version: Graylog 3.1.4+1149fe1.
Could you please tell me what I am doing wrong?
Many thanks in advance.

Your sample message is in perfect key-value form. Why not use that?

I’m not using key-value because:

  1. I want to define the order of the fields. (For example, src_ip and src_port should appear together, followed by dst_ip and dst_port, but Graylog sorts fields alphabetically and I don’t know how to change that.)
  2. I posted a shortened example of $message.message; the full one is here:
    device="SFW" date=2020-02-10 time=15:46:25 timezone="CET" device_name="XG210" device_id=D57076BFDD6HG42 log_id=010101600001 log_type="Firewall" log_component="Firewall Rule" log_subtype="Allowed" status="Allow" priority=Information duration=10 fw_rule_id=132 policy_type=1 user_name="" user_gp="" iap=0 ips_policy_id=0 appfilter_policy_id=0 application="" application_risk=0 application_technology="" application_category="" in_interface="Port5" out_interface="Port5.2135" src_mac=00:00:00:00:00:00 src_ip=192.168.22.9 src_country_code=R1 dst_ip=192.168.23.148 dst_country_code=R1 protocol="TCP" src_port=41538 dst_port=10050 sent_pkts=6 recv_pkts=5 sent_bytes=358 recv_bytes=290 tran_src_ip= tran_src_port=0 tran_dst_ip= tran_dst_port=0 srczonetype="LAN" srczone="LAN" dstzonetype="DMZ" dstzone="DMZ" dir_disp="" connevent="Stop" connid="3264209776" vconnid="" hb_health="No Heartbeat" message="" appresolvedby="Signature" app_is_cloud=0
    There are many keys that I don’t need as separate fields.

You don’t have to use all fields that split() creates. In the case below I have commented out the fields I don’t want (in case they are needed in the future).

let message   = to_string($message.message);
let splittraf = split(",", message);
set_field("DateTime",                 splittraf[0]);
set_field("RequestId",                splittraf[1]);
//set_field("MajorVersion",           splittraf[2]);
//set_field("MinorVersion",           splittraf[3]);
set_field("BuildVersion",             splittraf[4]);
set_field("RevisionVersion",          splittraf[5]);
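
The same idea as a small offline Python sketch: split once, then keep only the positions you care about. The sample line and its values are invented for illustration; only the field names come from the rule above.

```python
# Hypothetical comma-separated line; the values are invented for illustration.
line = "2020-02-10 15:19:35,req-42,3,1,1149,7"

parts = line.split(",")

# Map each wanted field name to its position; skipped positions (2 and 3 here)
# remain available in `parts` if they are needed later.
wanted = {
    "DateTime": 0,
    "RequestId": 1,
    "BuildVersion": 4,
    "RevisionVersion": 5,
}
fields = {name: parts[index] for name, index in wanted.items()}

print(fields)
```

Positional access like this only works while every line has the same number of columns in the same order.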

Hi @sequento,
Graylog 3.2 has an option to show fields in All Messages (the Message Table widget) in your own order.

https://docs.graylog.org/en/3.2/pages/searching/widgets.html#message-table

Thank you, @tmacgbay,
Your example helps simplify the code in my rule (no need for regex or an extra variable)!

Hi @shoothub,
I upgraded Graylog to 3.2 and tried what you posted, but it seems to have no effect on the order of the fields.

Hi @sequento,
If you want to change the order of fields in All Messages, you have to edit the widget, remove the fields, and insert them one by one in your desired order.

Or you can change the order of the fields by drag and drop (hold the left mouse button on a field for at least 2 seconds, then move it to its desired position).

Another option is to use the Format String decorator. Click the select box below Decorators and select Format String. Add a new decorator that concatenates two or more fields into one.
https://docs.graylog.org/en/3.2/pages/searching/decorators.html#format-string

Hi @shoothub,
Finally I understood how the Message Table widget works, but it’s not what I need.
It changes the order in the short description of a message (maybe it officially has a different name), but I need the order in the fully opened message (see the picture).
Screenshot from 2020-02-12 12-24-50

Hi, @tmacgbay,
Is it possible to split a message by regex, from a string like this?

device="SFW" date=2020-02-10 time=15:46:25 timezone="CET"

to

splittraf[0] = "SFW"
splittraf[1] = 2020-02-10
splittraf[2] = 15:46:25
splittraf[3] = "CET"

I get the result I need, but with some spaghetti code:

    let str_msg = to_string($message.message);
    let splittraf = split(" ", str_msg);
    
    set_field("Date",                   split("=", to_string(splittraf[1]))[1]);
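
To make the nested split easier to follow, here is the same two-step approach as an offline Python sketch (not Graylog rule code). The message is an abbreviated form of the sample posted earlier. It also shows where this approach stops working: a quoted value that contains a space is cut into two tokens.

```python
# Abbreviated form of the sample message from this thread.
msg = 'device="SFW" date=2020-02-10 time=15:46:25 timezone="CET" log_component="Firewall Rule" status="Allow"'

# Step 1: split the whole message on spaces.
tokens = msg.split(" ")

# Step 2: token 1 is date=..., so take the part after "=".
date_value = tokens[1].split("=")[1]

# Tokens without "=" are leftovers of quoted values that contained a space.
broken = [t for t in tokens if "=" not in t]

print(date_value)
print(tokens[4], "|", tokens[5])
print(broken)
```

Because the leftover token shifts every later index by one, positional access after a multi-word value is unreliable with this approach.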

Hi all,
When @jan suggested using key-value, I thought of extractors, but Graylog also has a key_value function, so I used it and got the result I needed, with a few exceptions.
First exception:
log_component="Firewall Rule"
This string is a pitfall for the key_value and split functions, because the value contains the delimiter (whitespace) used for the entire message.
Second exception:
To order the fields, I had to give them special names so that they sort alphabetically.
So the solution for me looks like this (of course it’s not elegant, but it works):

rule "Sophos sorted 3"
when
    true
then
    let str_msg = to_string($message.message);
    
    let values = key_value(str_msg);
    
    set_field("fw11_device_name", values["device_name"]);
    set_field("fw12_log_id", values["log_id"]);
    set_field("fw13_log_type", values["log_type"]);
    set_field("fw14_log_subtype", values["log_subtype"]);
    set_field("fw15_status", values["status"]);
    set_field("fw16_username", values["user_name"]);
    set_field("fw17_rule_id", values["fw_rule_id"]);
    set_field("fw18_protocol", values["protocol"]);

    set_field("fw21_src_ip", values["src_ip"]);
    set_field("fw22_src_port", values["src_port"]);
    set_field("fw23_local_interfaceip", values["in_interface"]);
    
    set_field("fw31_dst_ip", values["dst_ip"]);
    set_field("fw32_dst_port", values["dst_port"]);
    set_field("fw33_remote_interfaceip", values["out_interface"]);
    
    set_field("fw41_timezone", values["timezone"]);
    set_field("fw42_time", values["time"]);
    set_field("fw43_date", values["date"]);
end
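
Why the numeric prefixes work can be checked with a short Python sketch. It assumes, as described earlier in this thread, that the field list is sorted alphabetically; Python's default string sort gives the same result for these lowercase ASCII names.

```python
# A few field names from the rule above, deliberately shuffled.
names = [
    "fw31_dst_ip", "fw11_device_name", "fw22_src_port",
    "fw43_date", "fw21_src_ip", "fw32_dst_port",
]

# The same names without the fwNN_ prefix, for comparison.
plain = [n.split("_", 1)[1] for n in names]

sorted_prefixed = sorted(names)
sorted_plain = sorted(plain)

print(sorted_prefixed)
print(sorted_plain)
```

With prefixes, the source fields stay together ahead of the destination fields; without them, the dst_* fields would sort before the src_* fields.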

This was my first topic on the forum, so sorry for the confusion in questions and answers.
Thank you all!

I was looking around for a solution to the “space in a field” issue and came across the following Marketplace solution.

It uses GROK to parse the line, which would let you choose the field names and handle multi-word field values.


Hey @sequento,

The function has some parameters that let you work with that:

That might already help you in this situation. I’m also sorry that I wasn’t clear in my earlier message that I meant the processing pipelines …

The SNAFU in the example given is that some values are quoted and some are not. In some cases the quoted values contain a space, which is also the delimiter between key-value pairs. Examples:

  • src_ip=192.168.22.9
  • hb_health="No Heartbeat"
  • message=""

Both split() and key_value() would have a hard time with that space between the quotes. (I spent some time trying to find a regex_replace() solution to this.)
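
One way to reason about the quoted-versus-unquoted problem is a pattern that matches whole pairs instead of splitting on spaces. The following is an offline Python sketch, not a Graylog function: the name parse_pairs and the pattern are illustrative assumptions, and the pattern has not been tested against Graylog's own regex engine.

```python
import re

# A key, then "=", then either a double-quoted value (may contain spaces)
# or a bare value without whitespace (may be empty).
PAIR = re.compile(r'(\w+)=(?:"([^"]*)"|(\S*))')

def parse_pairs(message: str) -> dict:
    """Return key/value pairs, keeping a quoted value with spaces as one value."""
    result = {}
    for key, quoted, bare in PAIR.findall(message):
        # findall yields "" for whichever alternative did not match.
        result[key] = quoted if quoted else bare
    return result

# Pairs taken from the sample message in this thread.
sample = 'src_ip=192.168.22.9 hb_health="No Heartbeat" message="" tran_src_ip= tran_src_port=0'
pairs = parse_pairs(sample)

print(pairs)
```

The quoted alternative is tried first, so a space inside quotes never ends a value. Empty values, whether written as message="" or as tran_src_ip= with nothing after it, come back as empty strings.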

The layered GROK solution from the marketplace seems like it would do the trick … in a moment of angst… I have been wrong before…

Post up what you end up doing to solve it for future searchers!

Hi, @jan
It’s OK, everyone is unclear about something now and then. In any case, thank you, I got to my solution with your help.

Hi, @tmacgbay
I will try the GROK solution this week and give feedback on it.

Hi, @tmacgbay
I think I will not use GROK, because its syntax confuses me somewhat and you have to write the entire pattern on one line (I can’t find a way to spread it over multiple lines).

Fair enough - but you can create a custom Grok pattern under System->Grok Patterns (there is a whole testing setup there) and reference it by its name in your pipeline to make it cleaner. If you think of Grok as a way to assign fields via multi-level pointer names to underlying regular expressions, it gets a little easier to understand. Either way, good luck!
