How to identify the source of indexer failures


(Ray) #1

We have messages like this:

{"type":"mapper_parsing_exception","reason":"failed to parse [deviceCustomDate1]","caused_by":{"type":"illegal_argument_exception","reason":"Invalid format: \"50840-10-23\" is malformed at \"0-10-23\""}}

{"type":"mapper_parsing_exception","reason":"failed to parse [time]","caused_by":{"type":"illegal_argument_exception","reason":"Invalid format: \"2018-11-14 00:00:04\" is malformed at \" 00:00:04\""}}

Is there any way to match the LetterID or something up to the underlying message or source? Not even sure where to start to track this down.
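
As a starting point, the failure text itself already names the field and the rejected value. A minimal sketch, assuming the failures are available as JSON strings shaped like the two above (the function name `summarize_failure` is made up for illustration and is not part of Graylog):

```python
import json
import re

# Sample failure text copied from the message above.
FAILURE = r'''{"type":"mapper_parsing_exception","reason":"failed to parse [deviceCustomDate1]","caused_by":{"type":"illegal_argument_exception","reason":"Invalid format: \"50840-10-23\" is malformed at \"0-10-23\""}}'''


def summarize_failure(raw):
    """Return (field, bad_value) extracted from one indexer failure message."""
    err = json.loads(raw)
    # The outer reason names the field: failed to parse [<field>]
    field = re.search(r"failed to parse \[(.+?)\]", err["reason"])
    # The inner reason quotes the value Elasticsearch rejected.
    value = re.search(r'Invalid format: "(.*?)" is malformed',
                      err["caused_by"]["reason"])
    return (field.group(1) if field else None,
            value.group(1) if value else None)


print(summarize_failure(FAILURE))
```

Grouping failures by field and value this way at least shows how many distinct problems there are, even though it does not reveal the sender.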


(Jan Doberstein) #2

Hey @rayterrill

The complete message should give you an idea of which source is sending it. If not, you need to identify the messages that contain the fields deviceCustomDate1 and/or time. It looks like these fields are stored as timestamps in Elasticsearch, which expects a different date format.
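
To illustrate the format mismatch, here is a rough sketch. It assumes the fields use Elasticsearch's default date format `strict_date_optional_time||epoch_millis`; the actual mapping in this setup is not shown in the thread, and the regular expressions below only approximate that format. The name `looks_indexable` is hypothetical.

```python
import re

# Approximation of strict_date_optional_time: a four-digit year, a date,
# and an optional time part introduced by "T" (assumed default mapping).
ISO_LIKE = re.compile(
    r"^\d{4}-\d{2}-\d{2}"
    r"(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:?\d{2})?)?$"
)
# epoch_millis: a plain integer number of milliseconds.
EPOCH_MILLIS = re.compile(r"^-?\d+$")


def looks_indexable(value):
    """Return True if the value resembles a date the assumed mapping accepts."""
    return bool(ISO_LIKE.match(value) or EPOCH_MILLIS.match(value))


for sample in ["50840-10-23", "2018-11-14 00:00:04", "2018-11-14T00:00:04"]:
    print(sample, looks_indexable(sample))
```

Under that assumption, the first value fails because of the five-digit year and the second because a space separates date and time instead of `T`.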

Your options are now:


(Ray) #3

@jan So there’s no way to retrieve the actual message, or turn on a higher level of debug to capture the full message in that mapper_parsing_exception message? That would seem to be super useful in quickly targeting the offending messages.

I can definitely see using Graylog search and something like “_exists_:deviceCustomDate1” to try and find the messages, but it’s a little like trying to find a needle in a haystack with so many sources.
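
One way to shrink the haystack might be to ask Elasticsearch directly which sources have ever sent the field. The sketch below only builds the request body, using the standard `exists` query and `terms` aggregation. It assumes Graylog's `source` field is mapped so that it can be aggregated, and the body would be sent to the `_search` endpoint of the Graylog indices, whose names depend on the installation. Messages that failed to index will not be counted, only messages from the same senders that were stored successfully. The function name `sources_with_field` is made up.

```python
import json


def sources_with_field(field, size=20):
    """Build an Elasticsearch search body that counts messages per source
    for documents containing the given field."""
    return {
        "size": 0,  # only the aggregation is needed, not the hits
        "query": {"exists": {"field": field}},
        "aggs": {
            # Assumption: "source" is an aggregatable (keyword) field.
            "by_source": {"terms": {"field": "source", "size": size}}
        },
    }


print(json.dumps(sources_with_field("deviceCustomDate1"), indent=2))
```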


(Jan Doberstein) #4

I can definitely see using Graylog search and something like “_exists_:deviceCustomDate1” to try and find the messages, but it’s a little like trying to find a needle in a haystack with so many sources.

Currently that is a bit painful, yes. Sorry about that, but if Graylog dumped the complete messages, a third party could harm your Graylog just by sending core dumps or similar. We know this needs to get better.

You might know what kind of data is delivered to Graylog and can identify that way which sender generates deviceCustomDate1, or you might know which extractor or processing pipeline creates that field. That should make it easier to identify.


(Ray) #5

@jan Ok, we’ll try to back into it.

Currently that is a bit painful, yes. Sorry about that, but if Graylog dumped the complete messages, a third party could harm your Graylog just by sending core dumps or similar. We know this needs to get better.

Even something like the first 100 characters of the message would be super helpful in terms of tracking down the messages. Maybe with some sort of debug flag we could set that’s off by default, but you could enable on-the-fly to help track down things in the event that you’re seeing indexer errors.

Thanks for the help. We’ll start digging.

-Ray


(123dev) #6

This has been our main pain point with Graylog.
With so many sources sending messages to Graylog, it is almost impossible to isolate the offending sender and take corrective action.
We do use a custom mapping to make sure that only the correct types end up in ES. We have even created separate inputs for distinct sources: by stopping one input at a time and observing whether the errors stop, we can narrow it down to the offending source. But that only works if the error rate is high, not to mention that a production input has stopped logging data in the meantime.

I think any form of capturing the offending messages, even to a log file with a short buffer, would greatly help isolate such issues.
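
To make the idea concrete, here is a purely hypothetical sketch of such a buffer. It is not an existing Graylog feature or API: the class `FailureBuffer` and its parameters are invented for illustration. It keeps only a bounded number of entries and only the first 100 characters of each message, so a sender cannot flood it with huge payloads.

```python
from collections import deque


class FailureBuffer:
    """Bounded in-memory record of truncated messages that failed to index."""

    def __init__(self, max_entries=1000, max_chars=100):
        # deque(maxlen=...) silently drops the oldest entry when full.
        self._entries = deque(maxlen=max_entries)
        self._max_chars = max_chars

    def record(self, source, message):
        # Store only a prefix of the message to cap memory use.
        self._entries.append((source, message[: self._max_chars]))

    def dump(self):
        return list(self._entries)


buf = FailureBuffer(max_entries=2, max_chars=10)
buf.record("host-a", "first message that is long")
buf.record("host-b", "second message")
buf.record("host-c", "third message")
print(buf.dump())
```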

The other painful option is to create a separate index per source (not recommended). Since the index is logged for the offending message, one can then identify the source.

Hopefully one day, a bit more information would be captured?
Source / Input / part of or the full message …

Thanks

