Syslog messages disappearing after JSON extractor applied

(Matt) #1

I have a 3rd party application sending in syslog messages. The message string is in JSON format. Without an extractor on that input, the message string shows up fine. When I apply the JSON extractor and test it, all fields parse correctly. Once I save the extractor, though, the input says messages are incoming, but from that point on nothing shows up. I’m not confident about this, but could it be because their “date” field isn’t formatted correctly? I’m grasping at straws here, as I’ve never had this issue before when using the extractor. Normally it just works.

For reference, timestamps on the raw message as it comes in look like this: “2018-01-26T16:04:35.294Z”. They are sending their date field like this: “2018-01-26 16:03:38.218051”. Not sure if this matters or not.

On a side note, on that same input I’m getting a pretty heavy number of “Empty messages discarded”, something I’ve never seen before. Is that something on the Graylog end, or something I should kick back to the 3rd party devs to work on? My guess is it’s on them.

(Jochen) #2

Please provide some example messages and the complete configuration of the Syslog input including the extractors.

(Matt) #3

Here is the message field I’m getting. The JSON extractor test works just fine, but as soon as I apply it, showing incoming messages no longer works. The message count is still there, but I get nothing on “Show messages” anymore.


{"whitelist_score":134217728,"visitor_ip":"","valid_ajax":false,"user_agent":"Mozilla/5.0 (iPhone; CPU iPhone OS 11_2_1 like Mac OS X) AppleWebKit/604.4.7 (KHTML, like Gecko) Version/11.0 Mobile/15C153 Safari/604.1","time":1517238078.493,"threat_score":0,"server_serial":"60c20ce9-25f9-404c-90f0-115de631f853","server_ip":"","sdk_version":null,"sdk_token_id":"","sdk_token_expire":null,"sdk_platform":"","sdk_fingerprint_id":"","sdk_application_version_id":"","sdk_application_instance_id":"","request_protocol":"http","request_id":"60e1bcff-cdbe-4723-a8e3-436967df70b4","request":"GET /images/footergradient.png HTTP/1.1","referer":"","real_ip_header_value":"","re_field_3":"","re_field_2":"","re_field_1":"","primitive_id":"A52A50FA-E350-3E55-8F5D-B0667BDD6BF3","per_path_calculated_pages_per_session":1,"per_path_calculated_pages_per_minute":0.045,"path_security_type":"web","path_rule_scope_id":"","origin_status_code":"200","origin_response_time":"0.003","origin_content_type":"image/png","origin_address":"","nginx_worker_process":19971,"new_platform_domain_id":"77d34097-b1c2-467e-81ab-02f3e4d64568","machine_learning_score":null,"legacy_unique_id":"7E84FE53-C34B-3A12-8FFC-EC7F1B7034A1","legacy_domain_id":"11853","lb_request_time":"","k_s":null,"js_kv_additional_threats":null,"js_additional_threats":null,"informed_id":"D276B080-AAA4-3907-92FB-E86A6F4350B4","identifier_record_value":"","identifier_record_pointer":"","identification_provider":"web","http_status_code":200,"http_request_length":956,"http_host":"","http_accept_charset":"","hostname":"","geoip_org":"Verizon Wireless","geoip_country":"US","experiment_score":null,"experiment_id":null,"experiment_group_id":null,"experiment_auxiliary_string":"","distil_action":"@proxy","date":"2018-01-29 
15:01:18.493900","datacenter_id":371,"connection":"close","calculated_session_length":1325.713,"calculated_pages_per_session":7,"calculated_pages_per_min":0.317,"cache_status":"-","bytes_sent":794,"bytes_returned_origin":279,"billable":false,"allowed":21,"account_id":2887,"accept_language":"en-us","accept_encoding":"br, gzip, deflate","accept":"image/png,image/svg+xml,image/*;q=0.8,*/*;q=0.5","ZUID":"AF1C298A-32F5-38FB-9502-210961B19FAF","ZID":"5F2966E7-4324-3F75-A778-E3837BB87E54","SID":"","HSIG":"HA,ULE_FC"}

Input setup

allow_override_date: true
expand_structured_data: false
force_rdns: false
max_message_size: 2097152
override_source: <empty>
port: 8686
recv_buffer_size: 1048576
store_full_message: false
tcp_keepalive: false
tls_cert_file: <empty>
tls_client_auth: disabled
tls_client_auth_cert_file: <empty>
tls_enable: false
tls_key_file: <empty>
tls_key_password: ********
use_null_delimiter: false

Here is the extractor config. All is default.

  "extractors": [
      "title": "test",
      "extractor_type": "json",
      "converters": [],
      "order": 0,
      "cursor_strategy": "copy",
      "source_field": "message",
      "target_field": "",
      "extractor_config": {
        "list_separator": ", ",
        "kv_separator": "=",
        "key_prefix": "",
        "key_separator": "_",
        "replace_key_whitespace": false,
        "key_whitespace_replacement": "_"
      "condition_type": "none",
      "condition_value": ""
  "version": "2.2.0-SNAPSHOT"

(Jochen) #4

Anything in the logs of your Graylog node(s)?

(Matt) #5

Looks like my suspicions in the first message were correct. Now the question is: do I have them make changes on their end, or is there a way for me to modify the field as it comes in?

 [127]: index [graylog_5295], type [message], id [4c624a21-0507-11e8-a973-0050568b02c2], message [MapperParsingException[failed to parse [date]]; nested: IllegalArgumentException[Invalid format: "2018-01-29 15:15:53.856552" is malformed $

(Jochen) #6

Either create a custom index mapping for the “date” field (and really any field you’re using) or ensure a specific and fixed type for the “date” field by using a processing pipeline rule.
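
To make the second option concrete, here is a minimal sketch of the conversion such a rule would need to perform, written in Python rather than in Graylog's pipeline rule language. The function name `to_iso8601` is hypothetical. Treating the vendor's value as UTC is an assumption, although it is consistent with the sample message, where the epoch value in `time` matches `date` when read as UTC. The sketch drops the microseconds down to milliseconds to match the style of the raw message timestamps quoted in the first post.

```python
from datetime import datetime, timezone

# Format of the vendor's "date" field as quoted in this thread.
VENDOR_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def to_iso8601(vendor_date: str) -> str:
    """Convert 'YYYY-MM-DD HH:MM:SS.ffffff' to ISO 8601 with a 'Z' suffix.

    Assumption: the vendor timestamp is in UTC.
    """
    parsed = datetime.strptime(vendor_date, VENDOR_FORMAT)
    parsed = parsed.replace(tzinfo=timezone.utc)
    # timespec="milliseconds" truncates the microseconds to three digits.
    return parsed.isoformat(timespec="milliseconds").replace("+00:00", "Z")


print(to_iso8601("2018-01-29 15:15:53.856552"))  # 2018-01-29T15:15:53.856Z
```

The same conversion could instead be done by the sending application, in which case nothing needs to change on the Graylog side.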

(Matt) #7

That worked, and now I understand what was happening with the “date” field. Thank you for the quick help and response, Jochen.
