Filebeat - adding custom fields does not work for Windows DHCP Server Logs

Hello,

1. Describe your incident:

I’m trying to add custom fields to the Windows DHCP Server log files collected with Filebeat.
The default logs collected with Winlogbeat give only limited information and include neither lease details nor MAC addresses.

When I simulate the extractor or the pipeline rules, everything works, but the fields are not created on the target stream.

2. Describe your environment:

  • Filebeat sidecar configuration:
# Needed for Graylog
fields_under_root: true
fields.collector_node_id: ${sidecar.nodeName}
fields.gl2_source_collector: ${sidecar.nodeId}

output.logstash:
   hosts: ["graylog.company.lan:5044"]
   
path.data: C:\Program Files\Graylog\sidecar\cache\winlogbeat\data
path.logs: C:\Program Files\Graylog\sidecar\logs

filebeat.inputs:
- type: log
  enabled: true
  paths:
    - "C:/Windows/System32/dhcp/DhcpSrvLog-*.log"
  include_lines: ["^[0-9]{2},"]
  fields_under_root: true
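
As a quick sanity check of the include_lines filter in the configuration above, here is a small Python sketch that applies the same regular expression to a few sample lines. The header line is a hypothetical stand-in for the descriptive lines at the top of a DHCP log file; only the event line comes from this thread. Filebeat uses Go's regexp engine rather than Python's, so this only approximates its behavior, although a pattern this simple should behave the same way in both.

```python
import re

# Same pattern as the include_lines entry in the Filebeat configuration.
INCLUDE = re.compile(r"^[0-9]{2},")


def keep_line(line: str) -> bool:
    """Return True when the line would pass the include_lines filter."""
    return INCLUDE.search(line) is not None


sample_lines = [
    # Hypothetical header line, not copied from a real log file.
    "ID,Date,Time,Description,IP Address,Host Name,MAC Address",
    # Event line taken from this thread.
    "11,08/21/23,13:01:09,Renouveler,192.168.1.103,o8-test1.company.lan,"
    "005056B2E381,,1245140141,0,,,,,,,,,0",
    # Empty line.
    "",
]

kept = [line for line in sample_lines if keep_line(line)]
print(len(kept))
```

Only lines that start with a two-digit event ID followed by a comma are kept, so header and empty lines never reach Graylog.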

Beats input already configured with my previous Winlogbeat configurations.

  • RAW Message received for the first time:
11,08/21/23,13:01:09,Renouveler,192.168.1.103,o8-test1.company.lan,005056B2E381,,1245140141,0,,,,,,,,,0
  • Grok Pattern Extractor:
^%{INT:dhcpv4.id},(?<timestamp>(%{MONTHNUM}\/%{MONTHDAY}\/%{YEAR},%{HOUR}:%{MINUTE}:%{SECOND})),%{GREEDYDATA:dhcpv4.option.message_type}?((,%{IP:dhcpv4.client_ip},%{DATA:dhcpv4.option.hostname},%{DATA:dhcpv4.client_mac},%{DATA:user.name})|(,,%{DATA:dhcpv4.option.hostname},,)|(,,,,)),%{WORD:dhcpv4.transaction_id},%{WORD:dhcpv4.op_code},,?,,%{DATA:dhcpv4.option.class_identifier},%{DATA:dhcpv4.option.vendor_identifying_options}?((,,,,)|,%{WORD:dhcpv4.option.user_identifying_options},%{WORD:dhcpv4.option.relay_agent_information},,)%{WORD:dhcpv4.option.message}
  • Result:

3. What I want to achieve:

The custom fields I want to add are:

  • dhcpv4_client_mac_prefix:

I created an extractor to extract the first 6 hexadecimal characters:

^([\dA-Za-z]{6})

It should create the field: dhcpv4_client_mac_prefix with value 005056 but there is no field created.
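
To confirm that the pattern itself is fine, it can be tried outside Graylog. This is a minimal Python sketch, assuming the extractor is applied to the MAC address value from the raw message above; the function name mac_prefix is made up for illustration.

```python
import re

# Same pattern as the extractor: six leading letters or digits.
MAC_PREFIX = re.compile(r"^([\dA-Za-z]{6})")


def mac_prefix(mac: str):
    """Return the first six characters of the MAC address, or None if it is too short."""
    match = MAC_PREFIX.match(mac)
    return match.group(1) if match else None


print(mac_prefix("005056B2E381"))
```

Note that the pattern accepts any letter, not only A to F. A stricter variant such as `^([0-9A-Fa-f]{6})` would reject non-hexadecimal input.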

- dhcpv4_mac_vendor

I have a list of key-value pairs mapping MAC address prefixes to vendors.
Lookup Table name: mac_to_vendor

"mac_address";"Manufactor"
"10E992";"INGRAM MICRO SERVICES"
"78F276";"Cyklop Fastjet Technologies (Shanghai) Inc."
"286FB9";"Nokia Shanghai Bell Co., Ltd."
"E0A129";"Extreme Networks Headquarters"

Pipeline

rule "Windows DHCP Server: macaddress to manufactor"

when
  has_field("dhcpv4_client_mac_prefix")
  then
   let update_source = lookup_value("mac_to_vendor", to_string($message.dhcpv4_client_mac_prefix));
   set_field("mac_vendor", update_source);

end

I think it does not work because it depends on the field created by the previous extractor for dhcpv4_client_mac_prefix.
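
The lookup step can be checked independently of Graylog. The Python sketch below loads the CSV excerpt shown above (semicolon separator, double-quoted values) and looks up a prefix. The helper names load_lookup and lookup_vendor are hypothetical, and upper-casing the key is an assumption of this sketch, not something the lookup table is known to do.

```python
import csv
import io

# Excerpt of the mac_to_vendor CSV from this thread.
CSV_TEXT = '''"mac_address";"Manufactor"
"10E992";"INGRAM MICRO SERVICES"
"78F276";"Cyklop Fastjet Technologies (Shanghai) Inc."
"286FB9";"Nokia Shanghai Bell Co., Ltd."
"E0A129";"Extreme Networks Headquarters"
'''


def load_lookup(text: str, key_column: str, value_column: str) -> dict:
    """Read a semicolon-separated, double-quoted CSV into a key -> value dict."""
    reader = csv.DictReader(io.StringIO(text), delimiter=";", quotechar='"')
    return {row[key_column]: row[value_column] for row in reader}


vendors = load_lookup(CSV_TEXT, "mac_address", "Manufactor")


def lookup_vendor(prefix: str):
    """Return the vendor for a MAC prefix, or None when the prefix is not listed."""
    # Assumption: keys are stored in upper case, as in the excerpt above.
    return vendors.get(prefix.upper())


print(lookup_vendor("10E992"))
print(lookup_vendor("005056"))
```

The second lookup finds nothing because 005056 is not part of the excerpt, so an empty result from the table does not necessarily mean the rule failed to run.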

- dhcpv4_op_description:

I have created a lookup table and pipeline for this one:

Lookup Table name: dhcp_op_code

Content:

"dhcpv4_op_code";"dhcpv4_op_description"
"0";"No Quarantine"
"1";"Quarantine"
"2";"Drop Packet"
"3";"Probation"
"6";"No Quarantine Information Probation Time"

Pipeline rule:

rule "Windows Server DHCP: dhcpOPCode and dhcpOPdescription Lookup"

when
  has_field("dhcpv4_op_code")
  then
   let update_source = lookup_value("dhcp_op_code", to_string($message.dhcpv4_op_code));
   set_field("dhcpv4_op_description", update_source);

end
  • Simulator result:

But the field does not appear in the Filebeat stream for newly generated logs.

4. How can the community help?

I was wondering whether extractors are processed in a particular order.
The first extractor parses the full DHCP message. I then created a second extractor based on a field it creates, and pipeline rules based on the field created by that second extractor.

Maybe the second extractor tries to match against the raw input rather than the message as processed by the first extractor?

Can you share the processing order from your message processors configuration? Depending on which Graylog version you are running, it is somewhere on the System/Configurations page.

My recommendation would be to replace your extractors with corresponding pipeline rules. I recommend this for several reasons: extractors are being deprecated, and extractors are significantly slower than pipelines.

That should also make troubleshooting much simpler. I understand this isn’t as simple as “copy and paste”, but it provides a lot of benefits.

Hello,

This is my message processors order:

For now I only use pipelines for lookup tables and for adding custom fields that are not already present.

From what I understand, extractors are used to parse logs and extract data that already exists. Parsing raw logs with pipelines, which I thought were intended for creating custom fields, looks complicated, and for now parsing with extractors seems easier.

I will look into how to convert my regex and grok extractors to pipeline rules.

For example, I tried replacing the MAC address prefix extractor with a pipeline rule:

rule "Windows Server DHCP: MAC ADDRESS PREFIX EXTRACTOR"

when
  has_field("dhcpv4_client_mac")
  then
   let extract_prefix = regex(pattern: "^([\\dA-Za-z]{6})",value: to_string($message.dhcpv4_client_mac));
   set_field("dhcpv4_client_mac_prefix", extract_prefix);

end

But the simulator returns something weird:

It gives me [object Object] instead of the first 6 hexadecimal characters of the MAC address.

I’m trying to convert the grok pattern extractor to pipeline rules, but it gives me headaches :smiley:

I forgot to add to_string in the set_field call!

rule "Windows Server DHCP: MAC ADDRESS PREFIX EXTRACTOR"

when
  has_field("dhcpv4_client_mac")
then
   let extract_prefix = regex(
       pattern: "^([\\dA-Za-z]{6})",
       value: to_string($message.dhcpv4_client_mac)
       );
   set_field("dhcpv4_client_mac_prefix", to_string(extract_prefix));

end

Now I get the result, but I get 0= in addition to the extracted string and I don’t know where it comes from.

Try changing set_field("dhcpv4_client_mac_prefix", to_string(extract_prefix));

to set_field("dhcpv4_client_mac_prefix", to_string(extract_prefix["0"])); (I can’t remember if the 0 needs to be double quoted. Different functions require it differently, so if you get validation errors, remove the double quotes :slight_smile: )
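
The stray 0= most likely appears because the match result behaves like a map from group index to captured text, so converting the whole result to a string prints the map instead of the captured value. The Python sketch below imitates that shape with a plain dict. The helper regex_groups is hypothetical and only mimics a result whose unnamed groups are keyed "0", "1", and so on; it is not Graylog's implementation.

```python
import re


def regex_groups(pattern: str, value: str) -> dict:
    """Hypothetical stand-in for a match result: capture groups keyed "0", "1", ..."""
    match = re.search(pattern, value)
    if match is None:
        return {}
    return {str(index): group for index, group in enumerate(match.groups())}


result = regex_groups(r"^([\dA-Za-z]{6})", "005056B1DE71")

whole = str(result)    # string form of the whole map, key included
prefix = result["0"]   # first capture group only

print(whole)
print(prefix)
```

Selecting the entry for group "0" before converting to a string is what removes the key from the stored field value.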

Nice, it works like a charm in the simulator!

But I still do not see the new field when, for example, a computer gets a new lease.
I can see dhcpv4_client_mac, which contains the computer’s MAC address, but not the new field dhcpv4_client_mac_prefix.

I have the same issue with another pipeline rule (which adds the field dhcpv4_op_description associated with dhcpv4_op_code): it works in the simulator but not in the stream.

What I understand from the processors order is:

Computer filebeat agent → INPUT BEAT → grok pattern extractor → stream → pipeline rules → stream

Is that right ?

OK, so I deleted the grok pattern extractor.
I now have a new pipeline with 4 stages:

Stage 0: Pipeline Rule with Grok Pattern to extract all basic fields

rule "Filebeat - DHCP Server - Grok Pattern"
when
  has_field("message")
then
  let msg = to_string($message.message);
  let dhcp_server_log = grok(pattern:"^%{INT:dhcpv4.id},(?<dhcpv4.logtime>(%{MONTHNUM}\\/%{MONTHDAY}\\/%{YEAR},%{HOUR}:%{MINUTE}:%{SECOND})),%{GREEDYDATA:dhcpv4.option.message_type}?((,%{IP:dhcpv4.client_ip},%{DATA:dhcpv4.option.hostname},%{DATA:dhcpv4.client_mac},%{DATA:user.name})|(,,%{DATA:dhcpv4.option.hostname},,)|(,,,,)),%{WORD:dhcpv4.transaction_id},%{WORD:dhcpv4.op_code},,?,,%{DATA:dhcpv4.option.class_identifier},%{DATA:dhcpv4.option.vendor_identifying_options}?((,,,,)|,%{WORD:dhcpv4.option.user_identifying_options},%{WORD:dhcpv4.option.relay_agent_information},,)%{WORD:dhcpv4.option.message}", value: to_string(msg), only_named_captures: true);
  set_fields(dhcp_server_log);
end
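
As a cross-check of which value should land in which field, the same line can be split on commas outside Graylog. This Python sketch is not a replacement for the grok rule. It names only the first ten columns, uses the underscore spelling of the field names that is recommended later in this thread, and assumes that no value contains a comma. The function name parse_dhcp_line is made up for illustration.

```python
# Names for the first ten columns, in the order used by the grok pattern.
FIELD_NAMES = [
    "dhcpv4_id", "date", "time", "dhcpv4_option_message_type",
    "dhcpv4_client_ip", "dhcpv4_option_hostname", "dhcpv4_client_mac",
    "user_name", "dhcpv4_transaction_id", "dhcpv4_op_code",
]


def parse_dhcp_line(line: str) -> dict:
    """Split one DHCP log line and return the non-empty named fields."""
    parts = line.split(",")
    if len(parts) < len(FIELD_NAMES):
        raise ValueError("unexpected number of columns")
    fields = dict(zip(FIELD_NAMES, parts))
    # The grok pattern captures date and time together as one value.
    fields["dhcpv4_logtime"] = fields.pop("date") + "," + fields.pop("time")
    # Drop empty columns, such as the user name in the sample below.
    return {name: value for name, value in fields.items() if value != ""}


sample = ("11,08/23/23,11:37:33,Renouveler,192.168.1.118,CLIF-ADM2.lab.lan,"
          "989096AB22DE,,443592307,0,,,,0x4D53465420352E30,MSFT 5.0,,,,0")

fields = parse_dhcp_line(sample)
print(fields["dhcpv4_client_mac"])
print(fields["dhcpv4_op_code"])
```

Comparing this output with the fields shown in the simulator makes it easier to spot a capture that ended up under an unexpected name.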

Stage 1: Pipeline Rule to add field dhcpv4_client_mac_prefix

rule "Filebeat - DHCP Server - Prefix Extractor"

when
  has_field("dhcpv4_client_mac")
then
   let extract_prefix = regex(
       pattern: "^([\\dA-Za-z]{6})",
       value: to_string($message.dhcpv4_client_mac)
       );
   set_field("dhcpv4_client_mac_prefix", to_string(extract_prefix["0"]));

end

Stage 2: Pipeline Rule to get the MAC vendor from dhcpv4_client_mac_prefix

rule "Filebeat - DHCP Server - MAC Vendor from mac_prefix"

when
  has_field("dhcpv4_client_mac_prefix")
  then
   let update_source = lookup_value("mac_to_vendor", to_string($message.dhcpv4_client_mac_prefix));
   set_field("mac_vendor", update_source);

end

Stage 3: Pipeline Rule to get dhcpv4_op_description from dhcpv4_op_code

rule "Filebeat - DHCP Server - OP Code"

when
  has_field("dhcpv4_op_code")
  then
   let update_source = lookup_value("dhcp_op_code", to_string($message.dhcpv4_op_code));
   set_field("dhcpv4_op_description", update_source);

end

If I select this raw message, taken before any pipeline execution, and simulate, only the first stage (Stage 0) works.

{
  "agent_hostname": "SRVAD1",
  "agent_id": "5d550a99-fce6-49f0-8fe9-49379c3f19e1",
  "agent_name": "SRVAD1",
  "log_offset": 13060,
  "collector_node_id": "srvad1",
  "gl2_remote_ip": "192.168.1.11",
  "@metadata_version": "7.11.1",
  "gl2_remote_port": 63412,
  "source": "SRVAD1",
  "beats_type": "filebeat",
  "gl2_source_input": "64b79d63a69f204330b8a149",
  "ecs_version": "1.6.0",
  "@metadata_beat": "filebeat",
  "@metadata_type": "_doc",
  "log_file_path": "C:\\Windows\\System32\\dhcp\\DhcpSrvLog-Mer.log",
  "gl2_source_node": "3b147713-efd6-45b0-83e4-f3b8aeea69ef",
  "timestamp": "2023-08-23T09:37:41.942Z",
  "gl2_accounted_message_size": 632,
  "gl2_source_collector": "78fd7c59-b789-4596-be22-1754f8a179cf",
  "agent_ephemeral_id": "37c95a2c-87bb-48ca-b35f-e6ad624d373b",
  "streams": [
    "64e30b20634f7440cf55f7c7"
  ],
  "input_type": "log",
  "gl2_message_id": "01H8GW3HHP002CMARVMPN360PZ",
  "message": "11,08/23/23,11:37:33,Renouveler,192.168.1.118,CLIF-ADM2.lab.lan,989096AB22DE,,443592307,0,,,,0x4D53465420352E30,MSFT 5.0,,,,0",
  "agent_type": "filebeat",
  "agent_version": "7.11.1",
  "@timestamp": "2023-08-23T09:37:41.942Z",
  "_id": "b5695e41-4198-11ee-8c14-0242ac150002",
  "host_name": "SRVAD1"
}
  • Stage 0 result only:

If I select this raw message, which was already processed by the first stage, and simulate again, the 3 other stages work.

{
  "agent_hostname": "SRVAD1",
  "agent_id": "5d550a99-fce6-49f0-8fe9-49379c3f19e1",
  "agent_name": "SRVAD1",
  "dhcpv4_client_ip": "192.168.1.103",
  "collector_node_id": "srvad1",
  "log_offset": 13758,
  "gl2_remote_ip": "192.168.1.11",
  "@metadata_version": "7.11.1",
  "gl2_remote_port": 63412,
  "source": "SRVAD1",
  "beats_type": "filebeat",
  "gl2_source_input": "64b79d63a69f204330b8a149",
  "ecs_version": "1.6.0",
  "@metadata_beat": "filebeat",
  "dhcpv4_option_message_type": "Renouveler",
  "@metadata_type": "_doc",
  "log_file_path": "C:\\Windows\\System32\\dhcp\\DhcpSrvLog-Mer.log",
  "dhcpv4_logtime": "08/23/23,11:44:43",
  "gl2_source_node": "3b147713-efd6-45b0-83e4-f3b8aeea69ef",
  "dhcpv4_id": "11",
  "dhcpv4_option_message": "0",
  "timestamp": "2023-08-23T09:44:47.291Z",
  "gl2_accounted_message_size": 848,
  "gl2_source_collector": "78fd7c59-b789-4596-be22-1754f8a179cf",
  "agent_ephemeral_id": "37c95a2c-87bb-48ca-b35f-e6ad624d373b",
  "dhcpv4_transaction_id": "1289519543",
  "streams": [
    "64e30b20634f7440cf55f7c7"
  ],
  "input_type": "log",
  "gl2_message_id": "01H8GWGGXV002CVNJSB9PN44RQ",
  "dhcpv4_option_hostname": "o8-test1.iss.lan",
  "message": "11,08/23/23,11:44:43,Renouveler,192.168.1.103,o8-test1.iss.lan,005056B1DE71,,1289519543,0,,,,,,,,,0",
  "agent_type": "filebeat",
  "agent_version": "7.11.1",
  "dhcpv4_op_code": "0",
  "@timestamp": "2023-08-23T09:44:47.291Z",
  "dhcpv4_client_mac": "005056B1DE71",
  "_id": "b2f13ce0-4199-11ee-8c14-0242ac150002",
  "host_name": "SRVAD1"
}

Do you know why? It seems that the pipeline processes only the first stage and then stops.

I did some quick testing and found that the . in the field names is causing the issue. In short, dots (.) in field names cause problems.

in Filebeat - DHCP Server - Grok Pattern
%{DATA:dhcpv4.client_mac}

in Filebeat - DHCP Server - Prefix Extractor
has_field("dhcpv4_client_mac")

Notice that dhcpv4_client_mac doesn’t exist so Filebeat - DHCP Server - Prefix Extractor does not match:

If I change %{DATA:dhcpv4.client_mac} to %{DATA:dhcpv4_client_mac}, the rule succeeds:

Longer explanation: https://go2docs.graylog.org/5-0/planning_your_deployment/faq.html

My field names contain dots and stream alerts do not match anymore

Due to restrictions in certain Elasticsearch versions, Graylog needs to convert field names that contain . characters with another character, by default the replacement character is _.

This replacement is done just prior to writing messages to Elasticsearch, which causes a mismatch between what stream rules and alert conditions see as field names when they are evaluated.

Stream rules, the conditions that determine whether or not a message is routed to a stream, are being run as data is being processed by Graylog. These see the field names as containing the dots.

However, alert conditions, which are also attached to streams, are converted to searches and run in the background. They operate on stored data in Elasticsearch and thus see the replacement character for the dots. Thus alert conditions need to use the _ instead of . when referring to fields. There is currently no way to maintain backwards compatibility and transparently fixing this issue, so you need to take action.

The best option, apart from not sending fields with dots, is to remember to write alert conditions using the replacement character, and never use . in the field names. In general Graylog will use the version with _ in searches etc.

For example, if an incoming message contains the field docker.container stream rules use that name, whereas alert conditions need to use docker_container. You will notice that the search results also use the latter name.

If you change your grok capture to use _ instead of . you should be good to go. Sorry for the confusion!
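
The mismatch can be pictured with a small Python model. This is only an illustration of the behavior described in the FAQ excerpt above, not Graylog code: the functions has_field and write_to_storage are hypothetical stand-ins for the pipeline check and for the dot replacement that happens just before a message is stored.

```python
def has_field(message: dict, name: str) -> bool:
    """Stand-in for the pipeline check: an exact match on the field name."""
    return name in message


def write_to_storage(message: dict, replacement: str = "_") -> dict:
    """Stand-in for the write step: dots in field names are replaced."""
    return {key.replace(".", replacement): value for key, value in message.items()}


# What the later stages see after a grok capture named with dots.
in_pipeline = {"dhcpv4.client_mac": "005056B1DE71"}
print(has_field(in_pipeline, "dhcpv4_client_mac"))  # False

# What a search shows once the message has been stored.
stored = write_to_storage(in_pipeline)
print(has_field(stored, "dhcpv4_client_mac"))  # True

# With underscores in the capture name, both views agree.
fixed = {"dhcpv4_client_mac": "005056B1DE71"}
print(has_field(fixed, "dhcpv4_client_mac"))  # True
```

This would also explain why simulating with an already stored message made the later stages work: by then the field names had been rewritten with underscores.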


Many thanks for that! The problem is always hidden in the subtleties. It also happens when you copy and paste a grok pattern without understanding it. My fault!

It works like a charm now! And thanks to you, I just discovered that I could trace the simulation!
I did not even know it was possible :sweat_smile:
