Graylog Error Broken Pipe


(GT) #1

Has anyone else had this error and are there any preventative measures to ensure it doesn’t happen again?

2018-07-10T10:05:38.153+01:00 ERROR [ServerRuntime$Responder] An I/O error has occurred while writing a response message entity to the container output stream.
org.glassfish.jersey.server.internal.process.MappableException: java.io.IOException: Connection closed
        at org.glassfish.jersey.server.internal.MappableExceptionWrapperInterceptor.aroundWriteTo(MappableExceptionWrapperInterceptor.java:92) ~[graylog.jar:?]
        at org.glassfish.jersey.message.internal.WriterInterceptorExecutor.proceed(WriterInterceptorExecutor.java:162) ~[graylog.jar:?]
        at org.glassfish.jersey.message.internal.MessageBodyFactory.writeTo(MessageBodyFactory.java:1130) ~[graylog.jar:?]
        at org.glassfish.jersey.server.ServerRuntime$Responder.writeResponse(ServerRuntime.java:711) [graylog.jar:?]
        at org.glassfish.jersey.server.ServerRuntime$Responder.processResponse(ServerRuntime.java:444) [graylog.jar:?]
        at org.glassfish.jersey.server.ServerRuntime$Responder.process(ServerRuntime.java:434) [graylog.jar:?]
        at org.glassfish.jersey.server.ServerRuntime$2.run(ServerRuntime.java:329) [graylog.jar:?]
        at org.glassfish.jersey.internal.Errors$1.call(Errors.java:271) [graylog.jar:?]
        at org.glassfish.jersey.internal.Errors$1.call(Errors.java:267) [graylog.jar:?]
        at org.glassfish.jersey.internal.Errors.process(Errors.java:315) [graylog.jar:?]
        at org.glassfish.jersey.internal.Errors.process(Errors.java:297) [graylog.jar:?]
        at org.glassfish.jersey.internal.Errors.process(Errors.java:267) [graylog.jar:?]
        at org.glassfish.jersey.process.internal.RequestScope.runInScope(RequestScope.java:317) [graylog.jar:?]
        at org.glassfish.jersey.server.ServerRuntime.process(ServerRuntime.java:305) [graylog.jar:?]
        at org.glassfish.jersey.server.ApplicationHandler.handle(ApplicationHandler.java:1154) [graylog.jar:?]
        at org.glassfish.jersey.grizzly2.httpserver.GrizzlyHttpContainer.service(GrizzlyHttpContainer.java:384) [graylog.jar:?]
        at org.glassfish.grizzly.http.server.HttpHandler$1.run(HttpHandler.java:224) [graylog.jar:?]
        at com.codahale.metrics.InstrumentedExecutorService$InstrumentedRunnable.run(InstrumentedExecutorService.java:176) [graylog.jar:?]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_171]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_171]
        at java.lang.Thread.run(Thread.java:748) [?:1.8.0_171]
Caused by: java.io.IOException: Connection closed
        at org.glassfish.grizzly.asyncqueue.TaskQueue.onClose(TaskQueue.java:331) ~[graylog.jar:?]
        at org.glassfish.grizzly.nio.AbstractNIOAsyncQueueWriter.onClose(AbstractNIOAsyncQueueWriter.java:501) ~[graylog.jar:?]
        at org.glassfish.grizzly.nio.transport.TCPNIOTransport.closeConnection(TCPNIOTransport.java:402) ~[graylog.jar:?]
        at org.glassfish.grizzly.nio.NIOConnection.doClose(NIOConnection.java:647) ~[graylog.jar:?]
        at org.glassfish.grizzly.nio.NIOConnection$6.run(NIOConnection.java:613) ~[graylog.jar:?]
        at org.glassfish.grizzly.nio.DefaultSelectorHandler.execute(DefaultSelectorHandler.java:238) ~[graylog.jar:?]
        at org.glassfish.grizzly.nio.NIOConnection.terminate0(NIOConnection.java:607) ~[graylog.jar:?]
        at org.glassfish.grizzly.nio.transport.TCPNIOConnection.terminate0(TCPNIOConnection.java:291) ~[graylog.jar:?]
        at org.glassfish.grizzly.nio.transport.TCPNIOAsyncQueueWriter.write0(TCPNIOAsyncQueueWriter.java:136) ~[graylog.jar:?]
        at org.glassfish.grizzly.nio.transport.TCPNIOAsyncQueueWriter.write0(TCPNIOAsyncQueueWriter.java:106) ~[graylog.jar:?]
        at org.glassfish.grizzly.nio.AbstractNIOAsyncQueueWriter.processAsync(AbstractNIOAsyncQueueWriter.java:344) ~[graylog.jar:?]
        at org.glassfish.grizzly.filterchain.DefaultFilterChain.process(DefaultFilterChain.java:108) ~[graylog.jar:?]
        at org.glassfish.grizzly.ProcessorExecutor.execute(ProcessorExecutor.java:77) ~[graylog.jar:?]
        at org.glassfish.grizzly.nio.transport.TCPNIOTransport.fireIOEvent(TCPNIOTransport.java:526) ~[graylog.jar:?]
        at org.glassfish.grizzly.strategies.AbstractIOStrategy.fireIOEvent(AbstractIOStrategy.java:112) ~[graylog.jar:?]
        at org.glassfish.grizzly.strategies.SameThreadIOStrategy.executeIoEvent(SameThreadIOStrategy.java:103) ~[graylog.jar:?]
        at org.glassfish.grizzly.strategies.AbstractIOStrategy.executeIoEvent(AbstractIOStrategy.java:89) ~[graylog.jar:?]
        at org.glassfish.grizzly.nio.SelectorRunner.iterateKeyEvents(SelectorRunner.java:415) ~[graylog.jar:?]
        at org.glassfish.grizzly.nio.SelectorRunner.iterateKeys(SelectorRunner.java:384) ~[graylog.jar:?]
        at org.glassfish.grizzly.nio.SelectorRunner.doSelect(SelectorRunner.java:348) ~[graylog.jar:?]
        at org.glassfish.grizzly.nio.SelectorRunner.run(SelectorRunner.java:279) ~[graylog.jar:?]
        at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:593) ~[graylog.jar:?]
        at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:573) ~[graylog.jar:?]
        ... 1 more
Caused by: java.io.IOException: Broken pipe
        at sun.nio.ch.FileDispatcherImpl.write0(Native Method) ~[?:1.8.0_171]
        at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47) ~[?:1.8.0_171]
        at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:93) ~[?:1.8.0_171]
        at sun.nio.ch.IOUtil.write(IOUtil.java:51) ~[?:1.8.0_171]
        at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:471) ~[?:1.8.0_171]
        at org.glassfish.grizzly.nio.transport.TCPNIOUtils.flushByteBuffer(TCPNIOUtils.java:149) ~[graylog.jar:?]
        at org.glassfish.grizzly.nio.transport.TCPNIOUtils.writeSimpleBuffer(TCPNIOUtils.java:133) ~[graylog.jar:?]
        at org.glassfish.grizzly.nio.transport.TCPNIOAsyncQueueWriter.write0(TCPNIOAsyncQueueWriter.java:126) ~[graylog.jar:?]
        at org.glassfish.grizzly.nio.transport.TCPNIOAsyncQueueWriter.write0(TCPNIOAsyncQueueWriter.java:106) ~[graylog.jar:?]
        at org.glassfish.grizzly.nio.AbstractNIOAsyncQueueWriter.processAsync(AbstractNIOAsyncQueueWriter.java:344) ~[graylog.jar:?]
        at org.glassfish.grizzly.filterchain.DefaultFilterChain.process(DefaultFilterChain.java:108) ~[graylog.jar:?]
        at org.glassfish.grizzly.ProcessorExecutor.execute(ProcessorExecutor.java:77) ~[graylog.jar:?]
        at org.glassfish.grizzly.nio.transport.TCPNIOTransport.fireIOEvent(TCPNIOTransport.java:526) ~[graylog.jar:?]
        at org.glassfish.grizzly.strategies.AbstractIOStrategy.fireIOEvent(AbstractIOStrategy.java:112) ~[graylog.jar:?]
        at org.glassfish.grizzly.strategies.SameThreadIOStrategy.executeIoEvent(SameThreadIOStrategy.java:103) ~[graylog.jar:?]
        at org.glassfish.grizzly.strategies.AbstractIOStrategy.executeIoEvent(AbstractIOStrategy.java:89) ~[graylog.jar:?]
        at org.glassfish.grizzly.nio.SelectorRunner.iterateKeyEvents(SelectorRunner.java:415) ~[graylog.jar:?]
        at org.glassfish.grizzly.nio.SelectorRunner.iterateKeys(SelectorRunner.java:384) ~[graylog.jar:?]
        at org.glassfish.grizzly.nio.SelectorRunner.doSelect(SelectorRunner.java:348) ~[graylog.jar:?]
        at org.glassfish.grizzly.nio.SelectorRunner.run(SelectorRunner.java:279) ~[graylog.jar:?]
        at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.doWork(AbstractThreadPool.java:593) ~[graylog.jar:?]
        at org.glassfish.grizzly.threadpool.AbstractThreadPool$Worker.run(AbstractThreadPool.java:573) ~[graylog.jar:?]
        ... 1 more

For a bit of context that may help out with identifying the root cause of this issue:
OS: Ubuntu 16.04 LTS
GL: 2.4.5
ES 5.6
Mongo: 3.4.0

All of the above is on one node (it is a test box). At the time of the incident no one was logged into the box via SSH or the web ui and the incident seems to have occurred out of no where. When the error occurred message processing stopped for about 15 minutes and then resumed by itself.

Thanks in advance,

George


(Jochen) #2

Please provide the complete configuration and logs of the Graylog node (and Elasticsearch, if it’s running on the same machine), as well as a short description of the hardware specs of the machine running Graylog
:arrow_right: http://docs.graylog.org/en/2.4/pages/configuration/file_location.html


(GT) #3

Elasticsearch logs:

[2018-07-10T00:00:02,811][INFO ][o.e.c.m.MetaDataMappingService] [MA2aQJ4] [graylog_58/iAXgXgNGQteNhY1L6ovrTA] update_mapping [message]
[2018-07-10T00:21:12,631][INFO ][o.e.c.m.MetaDataMappingService] [MA2aQJ4] [graylog_58/iAXgXgNGQteNhY1L6ovrTA] update_mapping [message]
[2018-07-10T01:00:07,827][INFO ][o.e.c.m.MetaDataCreateIndexService] [MA2aQJ4] [graylog_59] creating index, cause [api], templates [graylog-internal], shards [4]/[0], mappings [message]
[2018-07-10T01:00:07,924][INFO ][o.e.c.r.a.AllocationService] [MA2aQJ4] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[graylog_59][0]] ...]).
[2018-07-10T01:00:09,771][INFO ][o.e.c.m.MetaDataMappingService] [MA2aQJ4] [graylog_59/zwh_SHnnQLuxH2hWjlr7bA] update_mapping [message]
[2018-07-10T01:00:09,774][INFO ][o.e.c.m.MetaDataMappingService] [MA2aQJ4] [graylog_59/zwh_SHnnQLuxH2hWjlr7bA] update_mapping [message]
[2018-07-10T01:00:09,812][INFO ][o.e.c.m.MetaDataMappingService] [MA2aQJ4] [graylog_59/zwh_SHnnQLuxH2hWjlr7bA] update_mapping [message]
[2018-07-10T01:00:09,841][INFO ][o.e.c.m.MetaDataMappingService] [MA2aQJ4] [graylog_59/zwh_SHnnQLuxH2hWjlr7bA] update_mapping [message]
[2018-07-10T01:00:09,894][INFO ][o.e.c.m.MetaDataMappingService] [MA2aQJ4] [graylog_59/zwh_SHnnQLuxH2hWjlr7bA] update_mapping [message]
[2018-07-10T01:01:08,783][INFO ][o.e.c.m.MetaDataMappingService] [MA2aQJ4] [graylog_59/zwh_SHnnQLuxH2hWjlr7bA] update_mapping [message]
[2018-07-10T01:01:57,784][INFO ][o.e.c.m.MetaDataDeleteIndexService] [MA2aQJ4] [graylog_56/MSg_-lnkQmie6c1nVpIvDw] deleting index
[2018-07-10T01:02:09,766][INFO ][o.e.c.m.MetaDataMappingService] [MA2aQJ4] [graylog_59/zwh_SHnnQLuxH2hWjlr7bA] update_mapping [message]
[2018-07-10T02:53:51,799][INFO ][o.e.c.m.MetaDataMappingService] [MA2aQJ4] [graylog_59/zwh_SHnnQLuxH2hWjlr7bA] update_mapping [message]
[2018-07-10T04:32:44,785][INFO ][o.e.c.m.MetaDataMappingService] [MA2aQJ4] [graylog_59/zwh_SHnnQLuxH2hWjlr7bA] update_mapping [message]
[2018-07-10T04:32:46,768][INFO ][o.e.c.m.MetaDataMappingService] [MA2aQJ4] [graylog_59/zwh_SHnnQLuxH2hWjlr7bA] update_mapping [message]
[2018-07-10T05:18:31,776][INFO ][o.e.c.m.MetaDataMappingService] [MA2aQJ4] [graylog_59/zwh_SHnnQLuxH2hWjlr7bA] update_mapping [message]
[2018-07-10T05:19:09,774][INFO ][o.e.c.m.MetaDataMappingService] [MA2aQJ4] [graylog_59/zwh_SHnnQLuxH2hWjlr7bA] update_mapping [message]
[2018-07-10T05:19:09,837][INFO ][o.e.c.m.MetaDataMappingService] [MA2aQJ4] [graylog_59/zwh_SHnnQLuxH2hWjlr7bA] update_mapping [message]
[2018-07-10T08:07:21,781][INFO ][o.e.c.m.MetaDataMappingService] [MA2aQJ4] [graylog_59/zwh_SHnnQLuxH2hWjlr7bA] update_mapping [message]
[2018-07-10T08:11:08,781][INFO ][o.e.c.m.MetaDataMappingService] [MA2aQJ4] [graylog_59/zwh_SHnnQLuxH2hWjlr7bA] update_mapping [message]
[2018-07-10T09:33:08,790][INFO ][o.e.c.m.MetaDataMappingService] [MA2aQJ4] [graylog_59/zwh_SHnnQLuxH2hWjlr7bA] update_mapping [message]

Graylog logs(either side of the error):

2018-07-10T09:54:58.379+01:00 INFO  [AggregatesMaintenance] Removed 0 history items
2018-07-10T09:55:58.378+01:00 INFO  [AggregatesMaintenance] Removed 0 history items
2018-07-10T09:56:58.378+01:00 INFO  [AggregatesMaintenance] Removed 0 history items
2018-07-10T09:57:58.378+01:00 INFO  [AggregatesMaintenance] Removed 0 history items
2018-07-10T09:58:58.378+01:00 INFO  [AggregatesMaintenance] Removed 0 history items
2018-07-10T09:59:58.378+01:00 INFO  [AggregatesMaintenance] Removed 0 history items
2018-07-10T10:00:58.377+01:00 INFO  [AggregatesMaintenance] Removed 0 history items
2018-07-10T10:01:58.378+01:00 INFO  [AggregatesMaintenance] Removed 0 history items
2018-07-10T10:02:58.378+01:00 INFO  [AggregatesMaintenance] Removed 0 history items
2018-07-10T10:03:58.378+01:00 INFO  [AggregatesMaintenance] Removed 0 history items
2018-07-10T10:04:58.378+01:00 INFO  [AggregatesMaintenance] Removed 0 history items

Error occurred here
2018-07-10T10:05:58.378+01:00 INFO [AggregatesMaintenance] Removed 0 history items
2018-07-10T10:06:58.378+01:00 INFO [AggregatesMaintenance] Removed 0 history items
2018-07-10T10:07:46.275+01:00 INFO [GelfTcpTransport] Channel disconnected!
2018-07-10T10:07:58.377+01:00 INFO [AggregatesMaintenance] Removed 0 history items
2018-07-10T10:08:58.378+01:00 INFO [AggregatesMaintenance] Removed 0 history items
2018-07-10T10:09:58.377+01:00 INFO [AggregatesMaintenance] Removed 0 history items
2018-07-10T10:10:58.382+01:00 INFO [AggregatesMaintenance] Removed 0 history items
2018-07-10T10:11:58.653+01:00 INFO [AggregatesMaintenance] Removed 0 history items
2018-07-10T10:12:58.376+01:00 INFO [AggregatesMaintenance] Removed 0 history items
2018-07-10T10:13:58.377+01:00 INFO [AggregatesMaintenance] Removed 0 history items
2018-07-10T10:14:58.378+01:00 INFO [AggregatesMaintenance] Removed 0 history items
2018-07-10T10:15:58.380+01:00 INFO [AggregatesMaintenance] Removed 0 history items
2018-07-10T10:16:58.378+01:00 INFO [AggregatesMaintenance] Removed 0 history items
2018-07-10T10:17:58.378+01:00 INFO [AggregatesMaintenance] Removed 0 history items
2018-07-10T10:18:58.378+01:00 INFO [AggregatesMaintenance] Removed 0 history items
2018-07-10T10:19:58.379+01:00 INFO [AggregatesMaintenance] Removed 0 history items
2018-07-10T10:20:58.379+01:00 INFO [AggregatesMaintenance] Removed 0 history items
2018-07-10T10:21:58.379+01:00 INFO [AggregatesMaintenance] Removed 0 history items
2018-07-10T10:22:58.379+01:00 INFO [AggregatesMaintenance] Removed 0 history items
2018-07-10T10:23:58.379+01:00 INFO [AggregatesMaintenance] Removed 0 history items
2018-07-10T10:24:58.378+01:00 INFO [AggregatesMaintenance] Removed 0 history items
2018-07-10T10:25:58.378+01:00 INFO [AggregatesMaintenance] Removed 0 history items
2018-07-10T10:26:58.378+01:00 INFO [AggregatesMaintenance] Removed 0 history items

Also what GL config do you require as I’d prefer to just send you some and not all of it.

Cheers,

G


(Jochen) #4

Are there any other messages in the logs?
Hint: You might want to disable this particular logger, since it seems to spam the logs unnecessarily every 60 seconds.

The Graylog configuration file and JVM settings.