JSON extractor problem


(123dev) #1

Hi,

We’re encountering an issue with the JSON extractor. It’s giving us the following errors:

2018-02-13_14:57:48.40168 WARN  [Messages] Failed to index message: index=<graylog_616> id=<3a91c820-10ce-11e8-8b55-0eeefea68770> error=<{"type":"mapper_parsing_exception","reason":"failed to parse [e_request_pickUpDates]","caused_by":{"type":"illegal_argument_exception","reason":"Invalid format: \"2018-02-13T00:00:00-0500, 2018-02-14T00:00:00-0500, 2018...\" is malformed at \", 2018-02-14T00:00:00-0500, 2018...\""}}>
2018-02-13_14:57:48.40250 ERROR [Messages] Failed to index [1] messages. Please check the index error log in your web interface for the reason. Error: One or more of the items in the Bulk request failed, check BulkResult.getItems() for more information.
2018-02-13_14:57:54.28800 WARN  [Messages] Failed to index message: index=<graylog_616> id=<40552771-10ce-11e8-8b55-0eeefea68770> error=<{"type":"mapper_parsing_exception","reason":"failed to parse [e_request_pickUpDates]","caused_by":{"type":"illegal_argument_exception","reason":"Invalid format: \"2018-02-13T00:00:00-0500, 2018-02-14T00:00:00-0500\" is malformed at \", 2018-02-14T00:00:00-0500\""}}>
2018-02-13_14:57:54.29061 ERROR [Messages] Failed to index [1] messages. Please check the index error log in your web interface for the reason. Error: One or more of the items in the Bulk request failed, check BulkResult.getItems() for more information.
2018-02-13_14:57:55.56865 WARN  [Messages] Failed to index message: index=<graylog_616> id=<43a40451-10ce-11e8-8b55-0eeefea68770> error=<{"type":"mapper_parsing_exception","reason":"failed to parse [e_request_pickUpDates]","caused_by":{"type":"illegal_argument_exception","reason":"Invalid format: \"2018-02-13T00:00:00-0500, 2018-02-14T00:00:00-0500, 2018...\" is malformed at \", 2018-02-14T00:00:00-0500, 2018...\""}}>
2018-02-13_14:57:55.56925 ERROR [Messages] Failed to index [1] messages. Please check the index error log in your web interface for the reason. Error: One or more of the items in the Bulk request failed, check BulkResult.getItems() for more information.
2018-02-13_14:57:55.62012 WARN  [Messages] Failed to index message: index=<graylog_616> id=<4477f622-10ce-11e8-8b55-0eeefea68770> error=<{"type":"mapper_parsing_exception","reason":"failed to parse [e_request_pickUpDates]","caused_by":{"type":"illegal_argument_exception","reason":"Invalid format: \"02-13-2018\" is malformed at \"-13-2018\""}}>
2018-02-13_14:57:55.62052 ERROR [Messages] Failed to index [1] messages. Please check the index error log in your web interface for the reason. Error: One or more of the items in the Bulk request failed, check BulkResult.getItems() for more information.
2018-02-13_14:58:00.30708 WARN  [Messages] Failed to index message: index=<graylog_616> id=<458fcec1-10ce-11e8-8b55-0eeefea68770> error=<{"type":"mapper_parsing_exception","reason":"failed to parse [e_request_pickUpDates]","caused_by":{"type":"illegal_argument_exception","reason":"Invalid format: \"2018-02-14T00:00:00-0500, 2018-02-15T00:00:00-0500\" is malformed at \", 2018-02-15T00:00:00-0500\""}}>
2018-02-13_14:58:00.30843 ERROR [Messages] Failed to index [1] messages. Please check the index error log in your web interface for the reason. Error: One or more of the items in the Bulk request failed, check BulkResult.getItems() for more information.
2018-02-13_14:58:01.36052 WARN  [Messages] Failed to index message: index=<graylog_616> id=<46d63170-10ce-11e8-8b55-0eeefea68770> error=<{"type":"mapper_parsing_exception","reason":"failed to parse [e_request_pickUpDates]","caused_by":{"type":"illegal_argument_exception","reason":"Invalid format: \"02-13-2018\" is malformed at \"-13-2018\""}}>
2018-02-13_14:58:01.36375 WARN  [Messages] Failed to index message: index=<graylog_616> id=<4706b751-10ce-11e8-8b55-0eeefea68770> error=<{"type":"mapper_parsing_exception","reason":"failed to parse [e_request_pickUpDates]","caused_by":{"type":"illegal_argument_exception","reason":"Invalid format: \"02-13-2018\" is malformed at \"-13-2018\""}}>
2018-02-13_14:58:01.36420 ERROR [Messages] Failed to index [2] messages. Please check the index error log in your web interface for the reason. Error: One or more of the items in the Bulk request failed, check BulkResult.getItems() for more information.

The field we’re trying to extract from contains a valid JSON message, with an entry similar to this:
"pickUpDates": ["2018-02-13T00:00:00-0500", "2018-02-14T00:00:00-0500"],

Is it seeing the word “date” in the field / entry name and therefore trying to set the type to date?
If that is the issue, do we have to create custom mappings for every possible field that we could ever receive in the payload? That would be quite cumbersome.

Is this a bug or are we doing something wrong?
When we try a message in the JSON extractor creation screen, it parses properly and shows the following:
[screenshot: extractor_issue]

Thanks


(123dev) #2

It would be convenient to be able to enable / disable an extractor on the manage extractors page, rather than having to delete it until we figure out a proper solution.
That way we don’t lose the extractor settings.

Thanks


(123dev) #3

Looking closely at the error

{
	"type": "mapper_parsing_exception",
	"reason": "failed to parse [e_request_pickUpDates]",
	"caused_by": {
		"type": "illegal_argument_exception",
		"reason": "Invalid format: \"02-16-2018, 02-17-2018, 02-18-2018\" is malformed at \"-16-2018, 02-17-2018, 02-18-2018\""
	}
}

It says malformed at "-16-2018, 02-17-2018, 02-18-2018". Where did the 02 part go?


(123dev) #4

Here’s something interesting that could shed some light.

We have the following stream, as we wanted the API input to be stored in the API index.
Does the setting “Remove matches from ‘All messages’ stream” remove the messages from the default Graylog index when they are stored in the API index?

We’re seeing all messages in both indexes, graylog and api (we were expecting to see them only in api).
The exception is the messages that are causing the error: they are in the API index, but are NOT in the graylog index.

Thanks


(Jochen) #5

Elasticsearch expects “e_request_pickUpDates” to be a valid (single) date, which it is not in your case.

You’ll have to create a custom index mapping for these fields:
http://docs.graylog.org/en/2.4/pages/configuration/elasticsearch.html#custom-index-mappings


(123dev) #6

Thanks Jochen for taking the time to respond.

I have already done that and am still getting these:

2018-02-13_18:07:11.56396 WARN  [Messages] Failed to index message: index=<graylog_616> id=<b523cec0-10e8-11e8-8b55-0eeefea68770> error=<{"type":"mapper_parsing_exception","reason":"failed to parse [e_request_pickUpDates]","caused_by":{"type":"illegal_argument_exception","reason":"Invalid format: \"02-15-2018\" is malformed at \"-15-2018\""}}>
2018-02-13_18:07:11.56488 WARN  [Messages] Failed to index message: index=<api_1> id=<b523cec0-10e8-11e8-8b55-0eeefea68770> error=<{"type":"mapper_parsing_exception","reason":"failed to parse [e_request_pickUpDates]","caused_by":{"type":"illegal_argument_exception","reason":"Invalid format: \"02-15-2018\" is malformed at \"-15-2018\""}}>
2018-02-13_18:07:11.56595 WARN  [Messages] Failed to index message: index=<graylog_616> id=<b5a8dd40-10e8-11e8-8b55-0eeefea68770> error=<{"type":"mapper_parsing_exception","reason":"failed to parse [e_request_pickUpDates]","caused_by":{"type":"illegal_argument_exception","reason":"Invalid format: \"02-16-2018, 02-17-2018\" is malformed at \"-16-2018, 02-17-2018\""}}>
2018-02-13_18:07:11.56679 WARN  [Messages] Failed to index message: index=<api_1> id=<b5a8dd40-10e8-11e8-8b55-0eeefea68770> error=<{"type":"mapper_parsing_exception","reason":"failed to parse [e_request_pickUpDates]","caused_by":{"type":"illegal_argument_exception","reason":"Invalid format: \"02-16-2018, 02-17-2018\" is malformed at \"-16-2018, 02-17-2018\""}}>
2018-02-13_18:07:11.56856 ERROR [Messages] Failed to index [4] messages. Please check the index error log in your web interface for the reason. Error: One or more of the items in the Bulk request failed, check BulkResult.getItems() for more information.

Here are my graylog and api mapping files. Do you see any problems in them?

{
    "template" : "api_*",
    "settings": {
      "index.mapping.total_fields.limit": 1200
    },
    "mappings" : {
        "message" : {
            "properties" : {
                "e_request_pickUpDates" : {
                    "type" : "text",
                    "index" : true
                },
                "http_status" : {
                    "type" : "long"
                },
                "ThreadID" : {
                    "type" : "keyword",
                    "index" : true
                }
            }
        }
    }
}

and the graylog one (can the two be merged? I run them separately)

{
    "template" : "graylog_*",
    "settings": {
      "index.mapping.total_fields.limit": 1200
    },
    "mappings" : {
        "message" : {
            "properties" : {
                "e_request_pickUpDates" : {
                    "type" : "text",
                    "index" : true
                },
                "http_status" : {
                    "type" : "long"
                },
                "ThreadID" : {
                    "type" : "keyword",
                    "index" : true
                }
            }
        }
    }
}

Also, I would like to avoid having all the messages in both indexes. Is that possible, and what is the proper way? I was expecting “Remove matches from ‘All messages’ stream” to achieve that, or does it only take effect at index rotation?

Update: to be clear, I have rotated the indexes after applying the mappings.

Thanks


(Jochen) #7

Have you manually rotated the active write index of these two index sets? Index templates are only applied to new indices.


(123dev) #8

Thanks

In my previous post, right after posting it, I added an update saying that I had already rotated the indexes.
But I had only rotated the API one, not the graylog one (silly oversight).
So it’s working fine now, thank you.

So how did it decide that it should be a date type? Based on the content or the field name?

Also, regarding my other question about why I was still seeing the messages in the graylog indices in addition to the api indices: another silly mistake. I had one other stream that was still routed to the default index :blush:

Thanks for the follow up.
Have a good day


(Jochen) #9

Based on the content and what you want to do with it.
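
For context, this content-based guess is Elasticsearch's dynamic field mapping: with date detection enabled (the default), a new string field whose first value looks like a date is mapped as `date`. That would explain the errors above, since later values such as the comma-joined lists or "02-13-2018" no longer parse as a single date. If none of the extracted fields need to be dates, one option is to switch date detection off for the index set. A minimal sketch, assuming the same `api_*` pattern and `message` type as in the templates posted earlier, and not verified against this setup:

```json
{
    "template" : "api_*",
    "mappings" : {
        "message" : {
            "date_detection" : false
        }
    }
}
```

As with the other custom mappings in this thread, this would only apply to indices created after the template is installed, so the active write index has to be rotated afterwards.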


(123dev) #10

Thanks,

One last question. Considering that I’m adding the prefix e_request_, is it possible to create a mapping that sets all e_request_* fields to string?

I’m seeing a mix of keyword, long, … depending on the content, but I don’t want to create a mapping for each possible field. We’re parsing the request payload, which is pretty dynamic. It’s not infinite, but it would be quite a bit of work to define them all one by one.

Thanks


(Jochen) #11

That’s possible via dynamic templates:
https://www.elastic.co/guide/en/elasticsearch/reference/5.6/dynamic-templates.html
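
A rough sketch of what such a dynamic template could look like for the `e_request_` prefix, assuming Elasticsearch 5.x and the same `api_*` pattern and `message` type as in the templates posted earlier. The template name `e_request_fields_as_keyword` is made up for illustration, and `keyword` is used because the old `string` type is replaced by `keyword` / `text` in 5.x:

```json
{
    "template" : "api_*",
    "mappings" : {
        "message" : {
            "dynamic_templates" : [
                {
                    "e_request_fields_as_keyword" : {
                        "match" : "e_request_*",
                        "mapping" : {
                            "type" : "keyword"
                        }
                    }
                }
            ]
        }
    }
}
```

Here `match` is a pattern on the field name, so any newly seen field starting with `e_request_` would be mapped as `keyword` regardless of its content. How this combines with the dynamic templates Graylog's own index template already defines is not covered here and is worth checking on a test index first. Use `text` instead of `keyword` if the values should be analyzed for full-text search.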


(123dev) #12

Awesome, thanks
Much appreciated.

