Timestamp from log line in nginx JSON logs via filebeat / beats

Hi!

I have Graylog 4.2.1 and nginx 1.21.4, with Filebeat 7.15.2 shipping to a Beats input.

Nginx logs each request to a log file in JSON format, like this:

{ "time_iso8601": "2021-11-20T09:58:22+00:00", "msec": "1637402302.360", "connection": "106855", "connection_requests": "1", "pid": "18354", "request_id": "6f930f70f80c9cf8d0a6015bd42f6930", "request_length": "525", "remote_addr": "35.235.111.111", "remote_user": "", "remote_port": "43414", "time_local": "20/Nov/2021:09:58:22 +0000", "request": "GET /2013/08/06/?lang=en HTTP/1.1", "request_uri": "/2013/08/06/?lang=en", "args": "lang=en", "status": "301", "body_bytes_sent": "5", "bytes_sent": "399", "http_referer": "", "http_user_agent": "Mozilla/5.0 (compatible; AhrefsBot/7.0; +http://ahrefs.com/robot/)", "http_x_forwarded_for": "51.222.253.8", "http_host": "example.com", "server_name": "example.com", "request_time": "0.313", "upstream": "127.0.0.1:9001", "upstream_connect_time": "0.001", "upstream_header_time": "0.314", "upstream_response_time": "0.314", "upstream_response_length": "22", "upstream_cache_status": "", "ssl_protocol": "TLSv1.2", "ssl_cipher": "ECDHE-RSA-AES128-GCM-SHA256", "scheme": "https", "request_method": "GET", "server_protocol": "HTTP/1.1", "pipe": ".", "gzip_ratio": "", "http_cf_ray": "", "request_completion": "OK"}

Filebeat is configured in this way:

filebeat.inputs:

- input_type: log
  paths:
    - /var/log/nginx/json.log
  fields:
    logtype: nginx-access-json
  fields_under_root: true

I can receive the logs and parse them into fields with a JSON extractor:

    {
      "title": "Extract JSON fields",
      "extractor_type": "json",
      "converters": [],
      "order": 1,
      "cursor_strategy": "copy",
      "source_field": "message",
      "target_field": "",
      "extractor_config": {
        "flatten": true,
        "list_separator": ", ",
        "kv_separator": "=",
        "key_prefix": "",
        "key_separator": "_",
        "replace_key_whitespace": false,
        "key_whitespace_replacement": "_"
      },
      "condition_type": "none",
      "condition_value": ""
    },

My problem is that Graylog uses the time from “filebeat_@timestamp” as “timestamp”. That shows me when the logs were actually received by Graylog, but what I really want is to analyze the situation on the origin server, i.e. when each request was actually executed (logs may arrive in batches with some delay).

In ES + Logstash + Kibana I used this kind of Logstash config to update the timestamp:

filter {
  if [logtype] == "nginx-access-custom" {
  [..]
    date {
      match => [ "time_local" , "dd/MMM/YYYY:HH:mm:ss Z" ]
      target => "@timestamp"
    }
  [..]
  }
}

I read many different topics here and on Stack Overflow about the proper way to implement something like this in Graylog via extractors or pipelines, but everything I tried failed, either silently or with “gl2_processing_error”.

The cleanest way I can see is to add one more extractor that copies the “time_local” field, with the proper date format, to “timestamp”. First I tested this approach by copying it to a “timestamp2” field:

    {
      "title": "timestamp from JSON time_local",
      "extractor_type": "regex",
      "converters": [
        {
          "type": "date",
          "config": {
            "date_format": "dd/MMM/yyyy:HH:mm:ss Z"
          }
        }
      ],
      "order": 2,
      "cursor_strategy": "copy",
      "source_field": "time_local",
      "target_field": "timestamp2",
      "extractor_config": {
        "regex_value": "^(.*)$"
      },
      "condition_type": "none",
      "condition_value": ""
    }

And it works perfectly: I can see a new “timestamp2” field with the correct date from the logs.

But when I update this extractor to use the “timestamp” field as the target, I get “gl2_processing_error”.

P.S. I’m not yet familiar with pipelines, so I would like to achieve this with extractors if possible.

Hello,

I have seen this issue multiple times in the forum, and I kind of understand what’s happening. The timestamp field is there by default, so you need to create another field so that the two don’t conflict. I’m not 100% sure, but I think there is a fix.
On another note, we have tags now. Just an FYI: you can search for timestamp issues. Not sure if you have seen this.

Correct me if I’m wrong, but is your time_iso8601 the same time/date as in the original log file?
Perhaps something like this will help.

Thank you very much for this link, @gsmith! For some reason I was not able to find it by myself, so this timestamp tag is really convenient to use.

I was able to achieve my goal with this pipeline:

rule "parse timestamp"
when
    has_field("time_local")
then
    let new_time = parse_date(value: to_string($message.time_local), pattern:"dd/MMM/yyyy:HH:mm:ss Z");
    set_field("timestamp", new_time);
end

Note that it is important to configure “Message Processors Configuration” so that “Pipeline Processor” runs after “Message Filter Chain”. That way the JSON message is parsed into fields first, and the pipeline can then work with these fields easily.
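
For anyone who wants to sanity-check the date pattern outside Graylog, here is a rough Python sketch. It is not part of the Graylog setup: `sample` is a trimmed copy of the log line from the first post, `event_times` is just an illustrative helper name, and the `strptime` directives are Python's closest equivalent of the `dd/MMM/yyyy:HH:mm:ss Z` pattern, not the pattern syntax Graylog itself uses. It also suggests that, at least for this sample line, “time_local”, “time_iso8601” and “msec” all describe the same instant.

```python
import json
from datetime import datetime, timezone

# Trimmed copy of the sample log line from the question (time fields only).
sample = (
    '{"time_iso8601": "2021-11-20T09:58:22+00:00", '
    '"msec": "1637402302.360", '
    '"time_local": "20/Nov/2021:09:58:22 +0000"}'
)


def event_times(line: str) -> dict:
    """Parse the three time fields of one nginx JSON log line (illustrative helper)."""
    rec = json.loads(line)
    return {
        # Python counterpart of dd/MMM/yyyy:HH:mm:ss Z.
        # %b expects English month abbreviations in the default C locale.
        "time_local": datetime.strptime(rec["time_local"], "%d/%b/%Y:%H:%M:%S %z"),
        # ISO 8601 with a numeric offset is handled by fromisoformat.
        "time_iso8601": datetime.fromisoformat(rec["time_iso8601"]),
        # Unix epoch seconds with a millisecond fraction.
        "msec": datetime.fromtimestamp(float(rec["msec"]), tz=timezone.utc),
    }


times = event_times(sample)
print(times["time_local"].isoformat())  # 2021-11-20T09:58:22+00:00
```

Only “msec” carries sub-second precision, so it differs from the other two by the fractional part.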

Nice, and thanks for posting your resolution to the issue. Much appreciated.