How to identify the source of indexer failures

rayterrill · November 14, 2018, 9:09pm

We have messages like this:

{"type":"mapper_parsing_exception","reason":"failed to parse [deviceCustomDate1]","caused_by":{"type":"illegal_argument_exception","reason":"Invalid format: \"50840-10-23\" is malformed at \"0-10-23\""}}

{"type":"mapper_parsing_exception","reason":"failed to parse [time]","caused_by":{"type":"illegal_argument_exception","reason":"Invalid format: \"2018-11-14 00:00:04\" is malformed at \" 00:00:04\""}}

Is there any way to match the LetterID or something up to the underlying message or source? Not even sure where to start to track this down.
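
As a starting point, the failure entries themselves can at least be tallied by field and rejected value. The following is a minimal Python sketch that assumes the failures are available as JSON strings shaped like the two above; the helper name `summarize_failure` is made up for illustration and is not part of Graylog.

```python
import json
import re

# Matches the field name in "failed to parse [fieldName]".
FIELD_RE = re.compile(r"failed to parse \[([^\]]+)\]")
# Matches the rejected value in 'Invalid format: "value" is malformed ...'.
VALUE_RE = re.compile(r'Invalid format: "([^"]*)"')


def summarize_failure(raw):
    """Return (field, rejected_value) for one indexer failure JSON string."""
    failure = json.loads(raw)
    field = FIELD_RE.search(failure.get("reason", ""))
    cause = failure.get("caused_by", {}).get("reason", "")
    value = VALUE_RE.search(cause)
    return (
        field.group(1) if field else None,
        value.group(1) if value else None,
    )


# The two failures quoted above, as raw strings so the JSON escapes survive.
failures = [
    r'{"type":"mapper_parsing_exception","reason":"failed to parse [deviceCustomDate1]","caused_by":{"type":"illegal_argument_exception","reason":"Invalid format: \"50840-10-23\" is malformed at \"0-10-23\""}}',
    r'{"type":"mapper_parsing_exception","reason":"failed to parse [time]","caused_by":{"type":"illegal_argument_exception","reason":"Invalid format: \"2018-11-14 00:00:04\" is malformed at \" 00:00:04\""}}',
]

summaries = [summarize_failure(raw) for raw in failures]
for field, value in summaries:
    print(field, "rejected value:", value)
```

This only tells you which fields and values are being rejected, not which source sent them.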

jan · November 15, 2018, 7:18am

Hi @rayterrill

The complete message should give you an idea of which source is sending it. If not, you need to identify the messages that have the fields deviceCustomDate1 and/or time. It looks like the field is stored as a timestamp in Elasticsearch, which expects a different date format.

Your options are now:

Create a custom mapping for these fields and force them to be strings ( http://docs.graylog.org/en/2.4/pages/configuration/elasticsearch.html#custom-index-mappings )
Reformat the timestamp into the form Elasticsearch expects ( http://docs.graylog.org/en/2.4/pages/pipelines/functions.html#parse-date )
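
To illustrate the second option outside of Graylog: the sketch below is plain Python, not a pipeline rule, and only shows the idea of normalizing a value to ISO 8601 before it reaches the index. It assumes the field's mapping accepts ISO 8601; the list of input layouts and the name `normalize_timestamp` are assumptions made for the example.

```python
from datetime import datetime

# Assumed input layouts; extend to match what your sources actually send.
KNOWN_FORMATS = ("%Y-%m-%d %H:%M:%S", "%Y-%m-%dT%H:%M:%S", "%Y-%m-%d")


def normalize_timestamp(value):
    """Return an ISO 8601 string, or None if no known layout matches."""
    for fmt in KNOWN_FORMATS:
        try:
            return datetime.strptime(value, fmt).isoformat()
        except ValueError:
            continue
    return None


print(normalize_timestamp("2018-11-14 00:00:04"))  # space-separated layout from the error above
print(normalize_timestamp("50840-10-23"))          # five-digit year, matches no layout
```

A value that comes back as None, such as the five-digit-year date, cannot be repaired by reformatting and is better stored as a string or dropped.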

rayterrill · November 15, 2018, 3:03pm

@jan So there’s no way to retrieve the actual message, or turn on a higher level of debug to capture the full message in that mapper_parsing_exception message? That would seem to be super useful in quickly targeting the offending messages.

I can definitely see using Graylog search and something like “_exists_:deviceCustomDate1” to try and find the messages, but it’s a little like trying to find a needle in a haystack with so many sources.

jan · November 15, 2018, 4:46pm

I can definitely see using Graylog search and something like “_exists_:deviceCustomDate1” to try and find the messages, but it’s a little like trying to find a needle in a haystack with so many sources.

Currently that is a bit painful, yes. Sorry about that, but if Graylog dumped the complete messages, a third party could harm your Graylog just by sending core dumps or similar. We know this needs to be better.

You might know what kind of data is delivered to Graylog, and from that you can identify which sender generates deviceCustomDate1. Or you might know which extractor or processing pipeline creates that field. Either way, it should be easier to identify the source.
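
If you export search results for messages that contain the field, a small script can show which senders emit it. This is a rough Python sketch assuming the export has been loaded as a list of dicts with a `source` key; the sample messages and the name `sources_emitting` are hypothetical. Messages that failed to index are not searchable, so this only shows which sources send the field at all.

```python
from collections import Counter


def sources_emitting(messages, field):
    """Count, per source, how many messages carry the given field."""
    counts = Counter()
    for msg in messages:
        if field in msg:
            counts[msg.get("source", "unknown")] += 1
    # Most frequent senders first.
    return counts.most_common()


# Hypothetical export of indexed messages.
messages = [
    {"source": "fw-01", "deviceCustomDate1": "2018-11-14T00:00:04"},
    {"source": "fw-01", "deviceCustomDate1": "2018-11-14T00:01:10"},
    {"source": "proxy-02", "deviceCustomDate1": "2018-11-14T00:02:00"},
    {"source": "web-03", "message": "no custom date here"},
]

ranking = sources_emitting(messages, "deviceCustomDate1")
for source, count in ranking:
    print(source, count)
```

The sources at the top of the list are the first candidates to check for malformed values.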

rayterrill · November 15, 2018, 7:22pm

@jan Ok, we’ll try to back into it.

Currently that is a bit painful, yes. Sorry about that, but if Graylog dumped the complete messages, a third party could harm your Graylog just by sending core dumps or similar. We know this needs to be better.

Even something like the first 100 characters of the message would be super helpful in terms of tracking down the messages. Maybe with some sort of debug flag we could set that’s off by default, but you could enable on-the-fly to help track down things in the event that you’re seeing indexer errors.

Thanks for the help. We’ll start digging.

-Ray

123dev · November 15, 2018, 10:50pm

This has been our main pain point with Graylog.
With so many sources sending messages to Graylog, it is almost impossible to isolate the offending sender and take corrective action.
We do use custom mappings to make sure that only the correct type ends up in ES. We have even created a separate input for each distinct source; by stopping one input at a time and observing whether the errors stop, we can narrow it down to the offending source. But that only works if the error rate is high, and it means a production input is not logging data while it is stopped.

I think any form of capturing the offending messages, even to a log file with a short buffer, would greatly help isolate such issues.

The other painful option is to create a separate index per source (not recommended), and since the index is logged for the offending message, one can identify the source.

Hopefully one day a bit more information will be captured: the source, the input, and part or all of the message.

Thanks

system · November 29, 2018, 10:50pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
