Identify source and reason of bad messages


(Eroji) #1

We are constantly getting errors logged in the Graylog server log indicating there are messages failing to be processed… Theses errors are what I’m seeing. How do I look up where they are coming from and why Graylog is not able to process them?

   2017-02-22 12:05:52,271 ERROR: org.graylog2.indexer.messages.Messages - Failed to index [1] messages. Please check the index error log in your web interface for the reason. Error: failure in bulk execution:
   [355]: index [graylog_8832], type [message], id [4f973db0-f93a-11e6-b284-0050568f1eb6], message [java.lang.IllegalArgumentException: Document contains at least one immense term in field="xml" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped.  Please correct the analyzer to not produce such terms.  The prefix of the first immense term is: '[60, 63, 120, 109, 108, 32, 118, 101, 114, 115, 105, 111, 110, 61, 34, 49, 46, 48, 34, 32, 101, 110, 99, 111, 100, 105, 110, 103, 61, 34]...', original message: bytes can be at most 32766 in length; got 34194]
   2017-02-22 12:05:54,162 ERROR: org.graylog2.indexer.messages.Messages - Failed to index [1] messages. Please check the index error log in your web interface for the reason. Error: failure in bulk execution:
   [101]: index [graylog_8832], type [message], id [50bf9111-f93a-11e6-b284-0050568f1eb6], message [java.lang.IllegalArgumentException: Document contains at least one immense term in field="xml" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped.  Please correct the analyzer to not produce such terms.  The prefix of the first immense term is: '[60, 63, 120, 109, 108, 32, 118, 101, 114, 115, 105, 111, 110, 61, 34, 49, 46, 48, 34, 32, 101, 110, 99, 111, 100, 105, 110, 103, 61, 34]...', original message: bytes can be at most 32766 in length; got 53808]
   2017-02-22 12:05:55,938 ERROR: org.graylog2.inputs.codecs.GelfCodec - Could not parse JSON, first 400 characters:
   com.fasterxml.jackson.databind.JsonMappingException: No content to map due to end-of-input
    at [Source: ; line: 1, column: 0]
           at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:255) ~[graylog.jar:?]
           at com.fasterxml.jackson.databind.ObjectMapper._initForReading(ObjectMapper.java:3851) ~[graylog.jar:?]
           at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3792) ~[graylog.jar:?]
           at com.fasterxml.jackson.databind.ObjectMapper.readTree(ObjectMapper.java:2332) ~[graylog.jar:?]
           at org.graylog2.inputs.codecs.GelfCodec.decode(GelfCodec.java:120) [graylog.jar:?]
           at org.graylog2.shared.buffers.processors.DecodingProcessor.processMessage(DecodingProcessor.java:146) [graylog.jar:?]
           at org.graylog2.shared.buffers.processors.DecodingProcessor.onEvent(DecodingProcessor.java:87) [graylog.jar:?]
           at org.graylog2.shared.buffers.processors.ProcessBufferProcessor.onEvent(ProcessBufferProcessor.java:58) [graylog.jar:?]
           at org.graylog2.shared.buffers.processors.ProcessBufferProcessor.onEvent(ProcessBufferProcessor.java:35) [graylog.jar:?]
           at com.lmax.disruptor.WorkProcessor.run(WorkProcessor.java:143) [graylog.jar:?]
           at com.codahale.metrics.InstrumentedThreadFactory$InstrumentedRunnable.run(InstrumentedThreadFactory.java:66) [graylog.jar:?]
           at java.lang.Thread.run(Thread.java:745) [?:1.8.0_91]
   2017-02-22 12:05:55,939 ERROR: org.graylog2.shared.buffers.processors.DecodingProcessor - Unable to decode raw message RawMessage{id=51ee7420-f93a-11e6-b284-0050568f1eb6, journalOffset=12642597106, codec=gelf, payloadSize=8, timestamp=2017-02-22T20:05:55.938Z} on input <56be878ce4b0e6e046749b5b>.
   2017-02-22 12:05:55,939 ERROR: org.graylog2.shared.buffers.processors.DecodingProcessor - Error processing message RawMessage{id=51ee7420-f93a-11e6-b284-0050568f1eb6, journalOffset=12642597106, codec=gelf, payloadSize=8, timestamp=2017-02-22T20:05:55.938Z}
   com.fasterxml.jackson.databind.JsonMappingException: No content to map due to end-of-input
    at [Source: ; line: 1, column: 0]
           at com.fasterxml.jackson.databind.JsonMappingException.from(JsonMappingException.java:255) ~[graylog.jar:?]
           at com.fasterxml.jackson.databind.ObjectMapper._initForReading(ObjectMapper.java:3851) ~[graylog.jar:?]
           at com.fasterxml.jackson.databind.ObjectMapper._readMapAndClose(ObjectMapper.java:3792) ~[graylog.jar:?]
           at com.fasterxml.jackson.databind.ObjectMapper.readTree(ObjectMapper.java:2332) ~[graylog.jar:?]
           at org.graylog2.inputs.codecs.GelfCodec.decode(GelfCodec.java:120) ~[graylog.jar:?]
           at org.graylog2.shared.buffers.processors.DecodingProcessor.processMessage(DecodingProcessor.java:146) ~[graylog.jar:?]
           at org.graylog2.shared.buffers.processors.DecodingProcessor.onEvent(DecodingProcessor.java:87) [graylog.jar:?]
           at org.graylog2.shared.buffers.processors.ProcessBufferProcessor.onEvent(ProcessBufferProcessor.java:58) [graylog.jar:?]
           at org.graylog2.shared.buffers.processors.ProcessBufferProcessor.onEvent(ProcessBufferProcessor.java:35) [graylog.jar:?]
           at com.lmax.disruptor.WorkProcessor.run(WorkProcessor.java:143) [graylog.jar:?]
           at com.codahale.metrics.InstrumentedThreadFactory$InstrumentedRunnable.run(InstrumentedThreadFactory.java:66) [graylog.jar:?]
           at java.lang.Thread.run(Thread.java:745) [?:1.8.0_91]

Where the length limitation comes from or is set up?
(Jochen) #2

The problem is that Lucene (the search library underlying Elasticsearch) can index only fields with a size less than 32 KB. Larger fields can be stored but they cannot be indexed (and not analyzed).

Also see this Discourse topic and this GitHub issue:

https://github.com/Graylog2/graylog2-server/issues/873


(Eroji) #3

Is there a way to trim the messages so that it can be indexed at least on a partial level? Or see what the actual full raw message is and source so that we can do something about them from the source?


(Paul) #4

I found that searching for the term “bytes can be at most 32766” (including quotes) within the graylog query builder, you should come across messages that show the source of the large message.

In our case we could uninstall the service from the hosts generating the large messages. Perhaps if the issue occurs frequently enough that’s what we’ll do…


(Eroji) #5

Didn’t come back with anything when I tried searching with that with quotes included.


(Paul) #6

Did you increase the time period you are searching in … like last day or something in stead of last 5 minutes


(Eroji) #7

Yes, I did 7 days. Nothing.


(Paul) #8

Yikes … I’ll take a look when I’m at work in the morning for you how I found it. Just to make sure.

We using graylog primarily for windows logs coming in via beats sidecar. How are you using it?

In my case, Only on 6 indexes at the moment. I noticed yours was 8832. Is yours an old deployment? I’m on v2.1

When the large message comes in the Web interface stops showing new messages for anything between 10 and 60 minutes. The cpu on the server hits the roof … incoming messages queue on the disk journal. Then as if by magic it’ll just come back to life, show messages, cpu back to idle and queue empty.


(Philipp Ruland) #9

Hey @eroji,

Look at the abbreviate() function provided by pipelines to limit string lenght. :slight_smile:

Greetings - Phil


(Paul) #10

Hi

Just checked and I have a saved search for “failed to execute bulk item”. That’s how I have been identifying the source.

Hope that helps … If you figure out how to prevent this the error from occurring, please let me know as I’m I’m in the same boat.

Thanks