Failed to index [1] messages. failed to parse field [DateTime] of type [date] in document

Hello, I already made a post; however, things have changed a lot since then, and I thought it would be best to just make a new one since I know exactly what the issue is.
Original post: JSON Extractor stops messages from showing up in input - #7 by cesq

So I have an Input that receives nginx access logs in JSON format, and whenever I add an extractor (which works correctly in the preview), the messages stop coming in. Here’s a sample message that fails to extract:

{
   "timestamp":"1658474614.043",
   "remote_addr":"x.x.x.x.x",
   "body_bytes_sent":229221,
   "request_time":0.005,
   "response_status":200,
   "request":"GET /foo/bar/1999/09/sth.jpeg HTTP/2.0",
   "request_method":"GET",
   "host":"www…somesite.com",
   "upstream_cache_status":"",
   "upstream_addr":"x.x.x.x.x:xxx",
   "http_x_forwarded_for":"",
   "http_referrer":"https:://www.somesite.com/foo/bar/woo/boo/moo",
   "http_user_agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/103.0.0.0 Safari/537.36",
   "http_version":"HTTP/2.0",
   "nginx_access":true
}

I have reviewed the server logs located in /var/log/graylog-server/ and found the following error:


2022-07-25T09:4-:47.146+02:00 ERROR [MessagesAdapterES7] Failed to index [1] messages. Please check the index error log in your web interface for the reason. Error: failure in bulk execution: [255]: index [graylog_313], type [_doc], id [1324e361-0bee-11ed-be39-0050568fbcc4], message [ElasticsearchException[Elasticsearch exception [type=mapper_parsing_exception, reason=failed to parse field [DateTime] of type [date] in document with id '1324e361-0bee-11ed-be39-0050568fbcc4'. Preview of field's value: 'DateTime']]; nested: ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=failed to parse date field [DateTime] with format [strict_date_optional_time||epoch_millis]]]; nested: ElasticsearchException[Elasticsearch exception [type=date_time_parse_exception, reason=Failed to parse with all enclosed parsers]];]


So I have concluded that the error lies somewhere in the way it is parsing the [DateTime] field. However, this is as far as I could get. I’ve been researching this error a lot and getting nowhere. I’ve seen people with similar issues, but sadly I just can’t tackle this one. I kindly ask for your help!

Graylog version: 4.2.10+37fbc90 
OS: Red Hat Linux (kernel 4.18)

I am not sure why you started a new post; the other one was fine other than needing the formatting cleanup that @gsmith suggested using the </> tool (you can see the tool in all the post and reply fields). I took the liberty of reformatting your post with it so that it is more readable for those who are helping you.

The timestamp appears to be in epoch time, which is different from the Elasticsearch date type. You need to set up a way to convert the epoch timestamp to a date type so Elasticsearch can accept it.
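To illustrate what that conversion amounts to, here is a minimal Python sketch, assuming the `timestamp` field holds epoch seconds with a millisecond fraction as in the sample message above:

```python
from datetime import datetime, timezone

# The sample log carries epoch seconds with a millisecond fraction,
# e.g. "1658474614.043". Split on the dot rather than going through
# float to avoid precision surprises.
raw = "1658474614.043"
secs, millis = raw.split(".")

# Build a UTC datetime from the whole seconds...
dt = datetime.fromtimestamp(int(secs), tz=timezone.utc)

# ...and render it in an ISO-8601 shape that a date-typed
# Elasticsearch field will accept.
iso = dt.strftime("%Y-%m-%dT%H:%M:%S") + "." + millis + "Z"
print(iso)  # 2022-07-22T07:23:34.043Z
```

In Graylog itself this conversion would be done in an extractor converter or a pipeline rule rather than in Python; the sketch just shows the shape the value needs to end up in.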

You haven’t posted anything about your extractor settings, so I am not clear (and neither is @gsmith) on what/how you are extracting.

1 Like

The extractor’s options are the defaults.
I create the extractor by clicking on a message and selecting a JSON extractor. That’s all. I don’t have access to the server at the moment, so if you want to see the exact configuration, I will post it in about 14 hours.

What you said about the timestamp field makes sense. However, I don’t know how I would convert that one specific field.

Hello,
I agree with @tmacgbay: you should have just stayed on the other post. Also, as @tmacgbay stated, you need to convert the DateTime field. It just so happens I have a rule for that; you may need to adjust it to your needs.

rule "Epoch Convert"
when
  has_field("BatMan")
then
  // "BatMan" is a placeholder field name; the division by 1000 assumes
  // the field holds epoch microseconds, scaled down to milliseconds here
  let ts_millis = to_long($message.BatMan) / 1000;
  let new_date = parse_unix_milliseconds(ts_millis);
  set_field("epoch_timestamp", new_date);
end
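To make the unit handling concrete: `parse_unix_milliseconds()` expects milliseconds, so the scaling depends on what the source field actually holds. A small Python sketch with hypothetical values — divide by 1000 if the source is microseconds (as the rule above does), multiply by 1000 if it is plain epoch seconds like the nginx sample:

```python
# Hypothetical epoch values showing how to scale each unit to the
# milliseconds that a parse_unix_milliseconds-style converter expects.
epoch_micros = 1658474614043000  # epoch microseconds -> divide by 1000
epoch_secs = 1658474614          # epoch seconds      -> multiply by 1000

millis_from_micros = epoch_micros // 1000
millis_from_secs = epoch_secs * 1000

print(millis_from_micros)  # 1658474614043
print(millis_from_secs)    # 1658474614000
```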

Hi, I set out to ditch the extractor altogether and build a pipeline instead, like in this post: Failing JSON Extractor - #3 by mulgurul

Here’s my rule:

rule "parse the json log entries"
when
  true
then
  let json_tree = parse_json(to_string($message.json)); 
  let json_fields = select_jsonpath(json_tree, { time: ".$timestamp", remote_addr: ".$remote_addr", body_bytes_sent: ".$body_bytes_sent", request_time: ".$request_time", response_status: ".$response_status", request: ".$request", request_method: ".$request_method", host: ".$host", upstream_cache_status: ".$upstream_cache_status", upstream_addr: ".$upstream_addr" , http_x_forwarded_for: ".$http_x_forwarded_for" , http_referrer: ".$http_referrer", http_user_agent: ".$http_user_agent", http_version: ".$http_version", nginx_access: ".$nginx_access"});

  set_field("remote_addr", to_string(json_fields.remote_addr));
  set_field("body_bytes_sent", to_double(json_fields.body_bytes_sent));
  set_field("request_time", to_double(json_fields.request_time));
  set_field("response_status", to_double(json_fields.response_status));
  set_field("request", to_string(json_fields.request));
  set_field("request_method", to_string(json_fields.request_method));
  set_field("host", to_string(json_fields.host));
  set_field("upstream_cache_status", to_string(json_fields.upstream_cache_status));
  set_field("upstream_addr", to_string(json_fields.upstream_addr));
  set_field("http_x_forwarded_for", to_string(json_fields.http_x_forwarded_for));
  set_field("http_referrer", to_string(json_fields.http_referrer));
  set_field("http_user_agent", to_string(json_fields.http_user_agent));
  set_field("http_version", to_string(json_fields.http_version));
  set_field("nginx_access", to_bool(json_fields.nginx_access));
  set_field("timestamp", parse_date(substring(to_string(json_fields.time), 0, 23), "yyyy-MM-dd HH:mm:ss.SSS"));
  
end

However when I run this rule, all the fields get created but look like this:

host
[]
http_referrer
[]
http_user_agent
[]
http_version
[]
http_x_forwarded_for
[]

Any idea what’s wrong here?

This is a sample message that gets received:

**http_x_forwarded_for**
[]
**json**
{ "timestamp": "1658824872.073", "remote_addr": "x.x.x.x", "body_bytes_sent": 147051, "request_time": 0.093, "response_status": 200, "request": "GET /VoiceR/Images/315200dasdadaf00037304824556xsa.jpg HTTP/1.1", "request_method": "GET", "host": "xxxx","upstream_cache_status": "","upstream_addr": "x.x.x.x:x","http_x_forwarded_for": "","http_referrer": "", "http_user_agent": "Dart/2.15 (dart:io)", "http_version": "HTTP/1.1", "nginx_access": true }
**level**
6
**message**
MyHost nginx: { "timestamp": "1658824872.073", "remote_addr": "x.x.x.x", "body_bytes_sent": 147051, "request_time": 0.093, "response_status": 200, "request": "GET /VoicerwerewR/Irwerewmages/31520000037304824556.jpg HTTP/1.1", "request_method": "GET", "host": "xxx","upstream_cache_status": "","upstream_addr": "x.x.x.x:x","http_x_forwarded_for": "","http_referrer": "", "http_user_agent": "Dart/2.15 (dart:io)", "http_version": "HTTP/1.1", "nginx_access": true }
**nginx_access**
false
**remote_addr**
[]
**request**
[]
**request_method**
[]
**request_time**
0

I eventually got it all working. Turns out you should put $. instead of .$.
I also got the conversion from epoch to Date up and running by adjusting the rule that @gsmith wrote.
Since my epoch values arrive like this: 0000000000.000, I had to drop the .000 part since the value wouldn’t convert to long otherwise. Oh, and for now I added 7200 seconds to the value, since the server’s timezone is UTC+00:00 and my timezone is UTC+02:00. I will change the server’s timezone at a later date.
My full rule:

rule "parse the json log entries"
when has_field("json")
then

  let json_tree = parse_json(to_string($message.json));
  
  let json_fields = select_jsonpath(json_tree, { time: "$.timestamp", remote_addr: "$.remote_addr", body_bytes_sent: "$.body_bytes_sent", request_time: "$.request_time", response_status: "$.response_status", request: "$.request", request_method: "$.request_method", host: "$.host", upstream_cache_status: "$.upstream_cache_status", upstream_addr: "$.upstream_addr" , http_x_forwarded_for: "$.http_x_forwarded_for" , http_referrer: "$.http_referrer", http_user_agent: "$.http_user_agent", http_version: "$.http_version", nginx_access: "$.nginx_access"});

  let s_epoch = to_string(json_fields.time);
  let s = substring(s_epoch, 0, 10);
  let ts_millis = (to_long(s) + 7200) * 1000;
  let new_date = parse_unix_milliseconds(ts_millis);
  
  set_field("date", new_date);
  
  

  set_field("remote_addr", to_string(json_fields.remote_addr));
  set_field("body_bytes_sent", to_double(json_fields.body_bytes_sent));
  set_field("request_time", to_double(json_fields.request_time));
  set_field("response_status", to_double(json_fields.response_status));
  set_field("request", to_string(json_fields.request));
  set_field("request_method", to_string(json_fields.request_method));
  set_field("host", to_string(json_fields.host));
  set_field("upstream_cache_status", to_string(json_fields.upstream_cache_status));
  set_field("upstream_addr", to_string(json_fields.upstream_addr));
  set_field("http_x_forwarded_for", to_string(json_fields.http_x_forwarded_for));
  set_field("http_referrer", to_string(json_fields.http_referrer));
  set_field("http_user_agent", to_string(json_fields.http_user_agent));
  set_field("http_version", to_string(json_fields.http_version));
  set_field("nginx_access", to_bool(json_fields.nginx_access));
  
end
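For reference, the epoch handling in the rule above can be sketched in Python, using the timestamp from the sample message (variable names are illustrative only):

```python
# Mirror of the rule's date handling: keep the 10-digit seconds part,
# add the author's temporary +7200 s (UTC+02:00) offset, and scale
# to the milliseconds that parse_unix_milliseconds expects.
s_epoch = "1658824872.073"
s = s_epoch[0:10]                   # "1658824872" (fraction dropped)
ts_millis = (int(s) + 7200) * 1000
print(ts_millis)  # 1658832072000
```

One caveat on the offset: adding 7200 seconds shifts the actual instant the timestamp represents, not just its display. Keeping the value in UTC and letting Graylog render it in the viewer’s timezone is the cleaner long-term fix, which matches the plan to change the server’s timezone later.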

Big thanks to you guys for helping me out, I really appreciate it!!

2 Likes

Good job on that pipeline :+1:

1 Like

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.