Errors: RemoteTransportException / BlockingBatchedESOutput (org.joda.time.DateTime)

(Toomas Ormisson) #1


I’ve had a lot of problems with my Graylog/Elasticsearch cluster lately, which I’ve managed, more or less, to stabilize by implementing various best practices that weren’t in place when I took over administration of the cluster.
Now that the cluster is more or less stable, two problems still keep showing up in my logs:

The first shows up in the Graylog web interface under indexer failures:

RemoteTransportException[[es-node1][<ipaddress:port>][indices:data/write/bulk[s]]]; nested: RemoteTransportException[[es-node1][<ipaddress:port>][indices:data/write/bulk[s][p]]]; nested: EsRejectedExecutionException[rejected execution of org.elasticsearch.transport.TransportService$4@32d2251 on EsThreadPoolExecutor[bulk, queue capacity = 50, org.elasticsearch.common.util.concurrent.EsThreadPoolExecutor@59ac05b8[Running, pool size = 32, active threads = 32, queued tasks = 50, completed tasks = 11182813]]];

It seems to me that Elasticsearch isn’t able to process incoming messages fast enough, but what exactly does this mean? Unfortunately, the documentation isn’t helping me much here. I started polling this metric every minute, and during active hours I average about 15K failures/minute. That does not sound good… Am I losing log messages?
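Reading the exception text itself: the node’s bulk thread pool (32 threads, all active) and its queue (capacity 50, all 50 slots taken) were both full, so Elasticsearch rejected that bulk sub-request outright. One commonly suggested mitigation for bursty load, assuming Elasticsearch 2.x settings syntax, is to enlarge the bulk queue so short spikes are absorbed instead of rejected; this is a sketch, not a fix for sustained overload:

```yaml
# elasticsearch.yml (Elasticsearch 2.x syntax, assumption based on the 2.4.4
# version above). Enlarges the bulk queue from the default 50 so short
# ingestion bursts queue up instead of being rejected. If rejections are
# continuous rather than bursty, the cluster simply can't keep up and this
# only delays the problem.
threadpool.bulk.queue_size: 200
```

To see whether rejections are bursty or constant, the per-node rejection counter can be watched with the cat API, e.g. `curl 'http://<es-host>:9200/_cat/thread_pool?v&h=host,bulk.active,bulk.queue,bulk.rejected'`.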

The second is in the graylog-server log:

2017-06-20T08:44:21.643+03:00 ERROR [BlockingBatchedESOutput] Unable to flush message buffer
java.lang.ClassCastException: Cannot cast java.lang.String to org.joda.time.DateTime
        at java.lang.Class.cast( ~[?:1.8.0_121]
        at org.graylog2.plugin.Message.getFieldAs( ~[graylog.jar:?]
        at org.graylog2.plugin.Message.getTimestamp( ~[graylog.jar:?]
        at org.graylog2.indexer.messages.Messages.propagateFailure( ~[graylog.jar:?]
        at org.graylog2.indexer.messages.Messages.bulkIndex( ~[graylog.jar:?]
        at org.graylog2.outputs.ElasticSearchOutput.writeMessageEntries( ~[graylog.jar:?]
        at org.graylog2.outputs.BlockingBatchedESOutput.flush( [graylog.jar:?]
        at org.graylog2.outputs.BlockingBatchedESOutput.writeMessageEntry( [graylog.jar:?]
        at org.graylog2.outputs.BlockingBatchedESOutput.write( [graylog.jar:?]
        at org.graylog2.buffers.processors.OutputBufferProcessor$ [graylog.jar:?]
        at com.codahale.metrics.InstrumentedExecutorService$ [graylog.jar:?]
        at java.util.concurrent.Executors$ [?:1.8.0_121]
        at [?:1.8.0_121]
        at java.util.concurrent.ThreadPoolExecutor.runWorker( [?:1.8.0_121]
        at java.util.concurrent.ThreadPoolExecutor$ [?:1.8.0_121]
        at [?:1.8.0_121]

It feels like Graylog is expecting a timestamp at some point but is getting something else. Could this be causing the indexer failures in the frontend, and how could I find out which log messages exactly trigger this error?
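For what it’s worth, the stack trace itself shows the mechanism: Message.getTimestamp() calls Message.getFieldAs(), which ends in a plain java.lang.Class.cast() on whatever object is stored in the message’s field map. The following is a minimal self-contained sketch of that mechanism, not Graylog’s actual code; the class and method names are stand-ins, and a local DateTime dummy replaces org.joda.time.DateTime so it compiles without Joda-Time:

```java
import java.util.HashMap;
import java.util.Map;

// Minimal sketch (NOT Graylog's actual code) of why a String stored in the
// "timestamp" field produces "Cannot cast java.lang.String to ...DateTime".
public class TimestampCastDemo {

    // Stand-in for org.joda.time.DateTime so the sketch needs no Joda-Time jar.
    static class DateTime {}

    // Mirrors the shape of Message.getFieldAs(Class<T>, String): a plain
    // Class.cast() on whatever Object happens to be stored under the key.
    static <T> T getFieldAs(Map<String, Object> fields, Class<T> type, String key) {
        return type.cast(fields.get(key));
    }

    // Returns "ok" if the stored value casts to DateTime, otherwise the
    // ClassCastException message (same format as the one in the log above).
    static String describeCast(Object timestampValue) {
        Map<String, Object> fields = new HashMap<>();
        fields.put("timestamp", timestampValue);
        try {
            getFieldAs(fields, DateTime.class, "timestamp");
            return "ok";
        } catch (ClassCastException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // A String where a DateTime is expected fails the cast.
        System.out.println(describeCast("2017-06-20T08:44:21.643+03:00"));
        // A real DateTime object casts fine.
        System.out.println(describeCast(new DateTime()));
    }
}
```

If that is what is happening here, something upstream (an input, extractor, or plugin) is presumably writing a plain string into the message’s timestamp field instead of a parsed date.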

Graylog version: 2.2.3
Elasticsearch version: 2.4.4

Any feedback would be appreciated.

Thank you!

(system) closed #2

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.