Date Parsing peculiarities

Hi,
I have asked this before, but the thread it was asked on is now closed. Apologies for the long post; I’m trying to understand this behaviour.

Basically, I have a sidecar input sending RADIUS accounting messages. The logs from it are recorded to Graylog, and my sidecar config has a processor that treats the message as CSV - decode_csv_fields.

I use “extract_array” to associate each CSV column with a ‘column name’.

All this works perfectly - except for the parsing of the date.

The message contains a date string in the format ‘yyyy/MM/dd HH:mm:ss’.
This date string has no timezone associated with it, but it is BST (British Summer Time).
My Graylog server is configured with the same timezone as the source of the messages (and synced to the same NTP source), as is my client desktop.

I’ve tried using the sidecars to get the timestamp I want. However, the value appears to be parsed automatically - for example,

message
Interim-Update,user1@realm,2021/10/11 09:16:36,0,BNG-2,30B87900029966615E4F3B,X.X.X.X,Framed-User,PPP,,,X.X.X.X,255.255.255.255,,,,3795710611,20,3327212325,,30783831,,61022672,PPPoEoQinQ,193687867,lag-11:2231.315,Radius,369577,2axx:xxxx:xx:xxxx::/56

the date extracted from the above is

2021/10/11 09:16:36

however, the value that is appearing in the field is

Event_Timestamp

2021-10-11 10:16:36.000 +01:00
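
One possible reading of the one-hour shift above, sketched in Python rather than in Graylog itself: if a timestamp with no zone is assumed to be UTC and then displayed in the user's local zone, a BST wall-clock time moves forward by an hour. This is only an assumption about what happens to the field, not a confirmed description of Graylog's behaviour, and "Europe/London" is used here as a stand-in zone for BST/GMT.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo  # Python 3.9+

RAW = "2021/10/11 09:16:36"          # value from the CSV message
FORMAT = "%Y/%m/%d %H:%M:%S"         # strptime equivalent of yyyy/MM/dd HH:mm:ss
LONDON = ZoneInfo("Europe/London")   # assumed stand-in for the BST/GMT zone

naive = datetime.strptime(RAW, FORMAT)

# Case 1 (assumed cause): the zoneless value is taken as UTC,
# then rendered in the viewer's local zone.
shown_if_utc = naive.replace(tzinfo=timezone.utc).astimezone(LONDON)

# Case 2 (desired): the value is declared to be local wall-clock time.
as_local = naive.replace(tzinfo=LONDON)

print(shown_if_utc.isoformat())  # 2021-10-11T10:16:36+01:00
print(as_local.isoformat())      # 2021-10-11T09:16:36+01:00
```

The first output matches the value seen in the field, and the second is the value that was expected.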

I then tried using a pipeline, here I created the pipeline with a single rule.

rule "Event_Timestamp"
when
    has_field("Event_Timestamp")
then
    let new_date = parse_date(to_string($message.Event_Timestamp), "yyyy/MM/dd HH:mm:ss", "BST", "Europe/Isle_of_Man") - hours(1);
    set_field("Event-Timestamp", new_date);
end

Which does actually work and provides the date I want. However, this is not ideal, as BST goes back to GMT at the end of the month, and the hard-coded one-hour offset will then make my dates wrong again.
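
On the BST-to-GMT worry, a minimal Python sketch (not Graylog code) of why a region-based zone avoids a fixed one-hour correction: the zone database applies +01:00 before the clocks change and +00:00 after, based on the date being parsed. "Europe/London" and the helper name `parse_local` are assumptions for illustration. Whether the same holds for the timezone argument of `parse_date` in this pipeline has not been tested here.

```python
from datetime import datetime, timedelta
from zoneinfo import ZoneInfo  # Python 3.9+

FORMAT = "%Y/%m/%d %H:%M:%S"
LONDON = ZoneInfo("Europe/London")  # assumed stand-in for the BST/GMT zone


def parse_local(raw: str) -> datetime:
    """Parse a zoneless timestamp and mark it as local wall-clock time."""
    return datetime.strptime(raw, FORMAT).replace(tzinfo=LONDON)


summer = parse_local("2021/10/11 09:16:36")  # before the clocks go back
winter = parse_local("2021/11/11 09:16:36")  # after the clocks go back

print(summer.utcoffset())  # 1:00:00
print(winter.utcoffset())  # 0:00:00
```

The wall-clock time stays at 09:16:36 in both cases, and only the offset changes, so no manual subtraction is needed on either side of the switch.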

I then took a look at an extractor -

{
  "extractors": [
    {
      "title": "Event_1",
      "extractor_type": "split_and_index",
      "converters": [],
      "order": 0,
      "cursor_strategy": "copy",
      "source_field": "message",
      "target_field": "Event_1",
      "extractor_config": {
        "index": 3,
        "split_by": ","
      },
      "condition_type": "none",
      "condition_value": ""
    },
    {
      "title": "Event_Timestamp",
      "extractor_type": "copy_input",
      "converters": [
        {
          "type": "date",
          "config": {
            "date_format": "yyyy/MM/dd HH:mm:ss",
            "time_zone": "Europe/Isle_of_Man",
            "locale": "und"
          }
        }
      ],
      "order": 0,
      "cursor_strategy": "copy",
      "source_field": "Event_1",
      "target_field": "EventTimestamp",
      "extractor_config": {},
      "condition_type": "none",
      "condition_value": ""
    }
  ],
  "version": "4.1.0"
}

As you will see there are two extractors.
I first tried a single extractor that splits on ‘,’ and takes index ‘3’.

I get the date in the extractor preview.

I save this, but the extracted date is once again automatically parsed and ends up being 1 hour ahead of what I want.

If I try using a single extractor - but this time with a date converter, I get this in my error logs

[type=illegal_argument_exception, reason=failed to parse date field [2021-10-11T10:16:36.000Z] with format [yyyy/MM/dd HH:mm:ss........

I tried to change the format to ‘yyyy-MM-dd'T'HH:mm:ss.SSSZ’ but then the error said

java.lang.IllegalArgumentException: Invalid format: "2021/10/11 07:55:24" at "/10/11 07:55:24"

But this time the error appears to be a java error…
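
The two errors look like two sides of the same mismatch: one pattern is applied to a value in ISO form, the other to the raw slash-separated value. A small Python sketch of that idea follows. It uses `strptime` directives as a stand-in for the Joda-style patterns in the thread, and `try_parse` is a hypothetical helper, so it illustrates the mismatch only and does not reproduce Graylog or Elasticsearch behaviour.

```python
from datetime import datetime
from typing import Optional

SLASH_FORMAT = "%Y/%m/%d %H:%M:%S"      # stand-in for yyyy/MM/dd HH:mm:ss
ISO_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"   # stand-in for yyyy-MM-dd'T'HH:mm:ss.SSSZ


def try_parse(value: str, fmt: str) -> Optional[datetime]:
    """Return the parsed datetime, or None when value does not match fmt."""
    try:
        return datetime.strptime(value, fmt)
    except ValueError:
        return None


raw = "2021/10/11 07:55:24"        # value as it appears in the CSV
iso = "2021-10-11T10:16:36.000Z"   # value as it appears in the first error

print(try_parse(raw, SLASH_FORMAT) is not None)  # True
print(try_parse(iso, SLASH_FORMAT) is not None)  # False
print(try_parse(raw, ISO_FORMAT) is not None)    # False
print(try_parse(iso, ISO_FORMAT) is not None)    # True
```

Each pattern accepts only the representation it was written for, which would be consistent with the value having already been converted once before the second pattern is applied.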

Using the two extractors as above is the only way I have found to get the date I want.

I recall seeing a post about automatic date parsing but I can no longer find it. Is there such a thing?

I would be keen to hear whether I’m alone with this peculiar parsing issue, and I would be grateful for any feedback on whether what I am doing is “the best way”, or whether there is something else I can try.

Hello,

It’s a little confusing, so correct me if I’m wrong; I’m going to sum this up.

  • Your messages are received in GMT and your time zone is GMT +1 (BST)?
  • You then corrected the message timestamp with a pipeline, so timestamp is converted to GMT +1 on all messages?
  • User time zone is set to GMT +1
  • At the end of the month, you need to revert the message timestamp back to GMT?

For better clarity here is an example of my user in GMT-5 (GMT/UTC - 4h during Daylight Saving Time) and my messages are coming from GMT-6 (GMT/UTC - 5h during Daylight Saving Time)

So at the end of the month the messages with the converted timestamp need to be reverted back? If this is correct, I have not seen this type of situation before, but maybe someone else here has.
The only automatic date parsing that I’m aware of is pipelines and extractors. I don’t know if you have tried this but on the extractors at the bottom page there is a converter as shown below.

Hi @gsmith,
Thanks for the reply.
Sorry for the confusion - it’s confusing the hell out of me…

The messages sent to graylog are GMT+1 (BST) from 2x remote nodes using sidecars.

# Needed for Graylog
fields_under_root: true
fields.collector_node_id: ${sidecar.nodeName}
fields.gl2_source_collector: ${sidecar.nodeId}

filebeat.inputs:
- input_type: log
  paths:
    - /var/AAA/accounting/*/*.csv
  type: log
output.logstash:
   hosts: ["x.x.x.x:5020"]
path:
  data: /var/lib/graylog-sidecar/collectors/filebeat/data
  logs: /var/lib/graylog-sidecar/collectors/filebeat/log

processors:
  - decode_csv_fields:
      fields:
        message: csv
      overwrite_keys: true

  - extract_array:
      field: csv
      overwrite_keys: true
      omit_empty: true
      mappings:
        Acct-Status-Type: 0
        User-Name: 1
        Event_Timestamp: 2
        Acct-Delay-Time: 3
        NAS-Identifier: 4
        Acct-Session-Id: 5
        NAS-IP-Address: 6
        Service-Type: 7
        Framed-Protocol: 8
        Framed-Compression: 9
        Unisphere-PPPoE-Description: 10
        Framed-IP-Address: 11
        Framed-IP-Netmask: 12
        Unisphere-Ingress-Policy-Name: 13
        Calling-Station-Id: 14
        Acct-Input-Gigawords: 15
        Acct-Input-Octets: 16
        Acct-Output-Gigawords: 17
        Acct-Output-Octets: 18
        Unisphere-Input-Gigapackets: 19
        Acct-Input-Packets: 20
        Unisphere-Output-Gigapackets: 21
        Acct-Output-Packets: 22
        NAS-Port-Type: 23
        NAS-Port: 24
        NAS-Port-Id: 25
        Acct-Authentic: 26
        Acct-Session-Time: 27
        Delegated-IPv6-Prefix: 28

  - convert:
      ignore_missing: true
      fail_on_error: false
      fields:
        - {from: Framed-IP-Address, type: ip}
        - {from: NAS-IP-Address, type: ip}
        - {from: Delegated-IPv6-Prefix, type: ip}
        - {from: Acct-Session-Time, type: long}
        - {from: Acct-Input-Octets, type: long}
        - {from: Acct-Output-Octets, type: long}
        - {from: Acct-Input-Gigawords, type: long}
        - {from: Acct-Output-Gigawords, type: long}
        - {from: Acct-Input-Packets, type: long}
        - {from: Acct-Output-Packets, type: long}
        - {from: Port, type: long}
        - {from: Delegated-IPv6-Prefix, type: ip}


  - drop_fields:
      fields:
        - csv

When looking at the messages collected by this sidecar, the field that contains my timestamp of interest, instead of appearing as

Event_Timestamp
2021/10/12 23:48:17

it appears as

Event_Timestamp
2021-10-12 0:48:17.000 +01:00

At this stage, I have asked for no processing of the date in the Event_Timestamp field, but the time appears as though it’s parsed automatically.

My workaround was what I did with the two extractors - it’s the only way I have found to express the data as I need it.

The Graylog server, my desktop and the 2x remote sources are all on BST (GMT+1).

If this is a FileBeat situation, have you seen or tried this?

That was in my original sidecar config

- timestamp:
      field: Event_timestamp
      layouts:
        - '2021/09/23 08:00:05'

Found lots of examples that made no difference.

I did recall seeing a post not so long ago mentioning that you could disable the automatic parsing of a timestamp, but I did not bookmark it and now I can’t find it. Typical…

Hello,

This makes me question what’s going on in your environment and why you can’t change your date/time layout.

If your main concern is the Event_Timestamp layout I think looking at just a pipeline is the way to go.
Not sure how you’re testing new configurations, but I would remove any old configurations made previously and test only the new configuration.
I assume you have seen this?

Functions - Processing Pipelines

What would help more here would be to show what you have tried and the outcome of what happened; we’re pretty much in the dark over here.

@Oirbsiu

I came across this earlier. Perhaps this was what you were talking about?

Dynamic date detection can be disabled by setting date_detection to false: