Graylog system slows down and not all inputs are processed

Graylog log

/var/log/graylog-server/server.log

2017-03-17T13:56:47.434+05:30 WARN  [ProxiedResource] Unable to call http://192.168.0.71:12900/system/inputstates on node <19ce2075-9719-40b4-8fe9-c41a4568d204>
java.net.SocketTimeoutException: timeout
        at okio.Okio$4.newTimeoutException(Okio.java:227) ~[graylog.jar:?]
        at okio.AsyncTimeout.exit(AsyncTimeout.java:284) ~[graylog.jar:?]
        at okio.AsyncTimeout$2.read(AsyncTimeout.java:240) ~[graylog.jar:?]
        at okio.RealBufferedSource.indexOf(RealBufferedSource.java:325) ~[graylog.jar:?]
        at okio.RealBufferedSource.indexOf(RealBufferedSource.java:314) ~[graylog.jar:?]
        at okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:210) ~[graylog.jar:?]
        at okhttp3.internal.http1.Http1Codec.readResponse(Http1Codec.java:191) ~[graylog.jar:?]
        at okhttp3.internal.http1.Http1Codec.readResponseHeaders(Http1Codec.java:132) ~[graylog.jar:?]
        at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.java:54) ~[graylog.jar:?]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[graylog.jar:?]
        at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:45) ~[graylog.jar:?]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[graylog.jar:?]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) ~[graylog.jar:?]
        at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93) ~[graylog.jar:?]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[graylog.jar:?]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) ~[graylog.jar:?]
        at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93) ~[graylog.jar:?]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[graylog.jar:?]
        at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:120) ~[graylog.jar:?]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[graylog.jar:?]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) ~[graylog.jar:?]
        at org.graylog2.rest.RemoteInterfaceProvider.lambda$get$0(RemoteInterfaceProvider.java:59) ~[graylog.jar:?]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[graylog.jar:?]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) ~[graylog.jar:?]
        at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:179) ~[graylog.jar:?]
        at okhttp3.RealCall.execute(RealCall.java:63) ~[graylog.jar:?]
        at retrofit2.OkHttpCall.execute(OkHttpCall.java:174) ~[graylog.jar:?]
        at org.graylog2.shared.rest.resources.ProxiedResource.lambda$null$0(ProxiedResource.java:76) ~[graylog.jar:?]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_91]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_91]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_91]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_91]
Caused by: java.net.SocketException: Socket closed
        at java.net.SocketInputStream.read(SocketInputStream.java:203) ~[?:1.8.0_91]
        at java.net.SocketInputStream.read(SocketInputStream.java:141) ~[?:1.8.0_91]
        at okio.Okio$2.read(Okio.java:138) ~[graylog.jar:?]
        at okio.AsyncTimeout$2.read(AsyncTimeout.java:236) ~[graylog.jar:?]
        ... 29 more
2017-03-17T13:56:47.434+05:30 WARN  [ProxiedResource] Unable to call http://192.168.0.71:12900/system/inputstates on node <19ce2075-9719-40b4-8fe9-c41a4568d204>
java.net.SocketTimeoutException: timeout
        at okio.Okio$4.newTimeoutException(Okio.java:227) ~[graylog.jar:?]
        at okio.AsyncTimeout.exit(AsyncTimeout.java:284) ~[graylog.jar:?]
        at okio.AsyncTimeout$2.read(AsyncTimeout.java:240) ~[graylog.jar:?]
        at okio.RealBufferedSource.indexOf(RealBufferedSource.java:325) ~[graylog.jar:?]
        at okio.RealBufferedSource.indexOf(RealBufferedSource.java:314) ~[graylog.jar:?]
        at okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:210) ~[graylog.jar:?]
        at okhttp3.internal.http1.Http1Codec.readResponse(Http1Codec.java:191) ~[graylog.jar:?]
        at okhttp3.internal.http1.Http1Codec.readResponseHeaders(Http1Codec.java:132) ~[graylog.jar:?]
        at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.java:54) ~[graylog.jar:?]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[graylog.jar:?]
        at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:45) ~[graylog.jar:?]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[graylog.jar:?]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) ~[graylog.jar:?]
        at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93) ~[graylog.jar:?]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[graylog.jar:?]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) ~[graylog.jar:?]
        at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93) ~[graylog.jar:?]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[graylog.jar:?]
        at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:120) ~[graylog.jar:?]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[graylog.jar:?]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) ~[graylog.jar:?]
        at org.graylog2.rest.RemoteInterfaceProvider.lambda$get$0(RemoteInterfaceProvider.java:59) ~[graylog.jar:?]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[graylog.jar:?]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) ~[graylog.jar:?]
        at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:179) ~[graylog.jar:?]
        at okhttp3.RealCall.execute(RealCall.java:63) ~[graylog.jar:?]
        at retrofit2.OkHttpCall.execute(OkHttpCall.java:174) ~[graylog.jar:?]
        at org.graylog2.shared.rest.resources.ProxiedResource.lambda$null$0(ProxiedResource.java:76) ~[graylog.jar:?]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_91]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_91]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_91]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_91]
Caused by: java.net.SocketException: Socket closed
        at java.net.SocketInputStream.read(SocketInputStream.java:203) ~[?:1.8.0_91]
        at java.net.SocketInputStream.read(SocketInputStream.java:141) ~[?:1.8.0_91]
        at okio.Okio$2.read(Okio.java:138) ~[graylog.jar:?]
        at okio.AsyncTimeout$2.read(AsyncTimeout.java:236) ~[graylog.jar:?]
        ... 29 more
2017-03-17T13:56:47.453+05:30 WARN  [ProxiedResource] Unable to call http://192.168.0.71:12900/system/inputstates on node <19ce2075-9719-40b4-8fe9-c41a4568d204>
java.net.SocketTimeoutException: Read timed out
        at java.net.SocketInputStream.socketRead0(Native Method) ~[?:1.8.0_91]
        at java.net.SocketInputStream.socketRead(SocketInputStream.java:116) ~[?:1.8.0_91]
        at java.net.SocketInputStream.read(SocketInputStream.java:170) ~[?:1.8.0_91]
        at java.net.SocketInputStream.read(SocketInputStream.java:141) ~[?:1.8.0_91]
        at okio.Okio$2.read(Okio.java:138) ~[graylog.jar:?]
        at okio.AsyncTimeout$2.read(AsyncTimeout.java:236) ~[graylog.jar:?]
        at okio.RealBufferedSource.indexOf(RealBufferedSource.java:325) ~[graylog.jar:?]
        at okio.RealBufferedSource.indexOf(RealBufferedSource.java:314) ~[graylog.jar:?]
        at okio.RealBufferedSource.readUtf8LineStrict(RealBufferedSource.java:210) ~[graylog.jar:?]
        at okhttp3.internal.http1.Http1Codec.readResponse(Http1Codec.java:191) ~[graylog.jar:?]
        at okhttp3.internal.http1.Http1Codec.readResponseHeaders(Http1Codec.java:132) ~[graylog.jar:?]
        at okhttp3.internal.http.CallServerInterceptor.intercept(CallServerInterceptor.java:54) ~[graylog.jar:?]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[graylog.jar:?]
        at okhttp3.internal.connection.ConnectInterceptor.intercept(ConnectInterceptor.java:45) ~[graylog.jar:?]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[graylog.jar:?]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) ~[graylog.jar:?]
        at okhttp3.internal.cache.CacheInterceptor.intercept(CacheInterceptor.java:93) ~[graylog.jar:?]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[graylog.jar:?]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) ~[graylog.jar:?]
        at okhttp3.internal.http.BridgeInterceptor.intercept(BridgeInterceptor.java:93) ~[graylog.jar:?]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[graylog.jar:?]
        at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept(RetryAndFollowUpInterceptor.java:120) ~[graylog.jar:?]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[graylog.jar:?]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) ~[graylog.jar:?]
        at org.graylog2.rest.RemoteInterfaceProvider.lambda$get$0(RemoteInterfaceProvider.java:59) ~[graylog.jar:?]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:92) ~[graylog.jar:?]
        at okhttp3.internal.http.RealInterceptorChain.proceed(RealInterceptorChain.java:67) ~[graylog.jar:?]
        at okhttp3.RealCall.getResponseWithInterceptorChain(RealCall.java:179) ~[graylog.jar:?]
        at okhttp3.RealCall.execute(RealCall.java:63) ~[graylog.jar:?]
        at retrofit2.OkHttpCall.execute(OkHttpCall.java:174) ~[graylog.jar:?]
        at org.graylog2.shared.rest.resources.ProxiedResource.lambda$null$0(ProxiedResource.java:76) ~[graylog.jar:?]
        at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_91]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [?:1.8.0_91]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [?:1.8.0_91]
        at java.lang.Thread.run(Thread.java:745) [?:1.8.0_91]

This is only the log of Graylog, which you’ve basically already posted.

See http://docs.graylog.org/en/2.2/pages/configuration/file_location.html#omnibus-package for additional file locations.

Hi Jochen,

Please refer to the Elasticsearch log at the following link:

https://mega.nz/#!XIZ2WSoC

Decryption key “!iW1jQyU-tRpcs49lfG0f924NoB6r7DmP4XWcmq0wJqU”

I had something possibly related happen when I updated Graylog from 2.1.1 to 2.2.2 and Elasticsearch from 2.3.2 to 2.4.4. It worked great until a few days later, when the processing queue went to max. The Graylog server was barely taking any CPU or memory hit at the time. I tried restarting Graylog, creating a new index, removing old indexes, restarting the Elasticsearch cluster, checking the NIC speeds, and removing the syslog extractors from the input. I even completely bounced the Graylog and ES nodes.

Turned out I had to rebuild the input and everything has been working fine since then…

@karlt Can you explain exactly what you did? Did your system work perfectly after you rebuilt the input?

Did you delete all your Graylog inputs at once and then add them back one by one, or did you delete and re-add them one at a time?

@karj I deleted the inputs one at a time and added each one back as a new input. I did this for all my Graylog system inputs and then restarted the Graylog service. But the system performance remains the same, with no improvement.

I did exactly what you did. I deleted the input and built it again. After I did that, I added my extractor, and I have not had an issue with the processing queue filling up since then. Unfortunately, it looks like your issue persists and is not the same as mine. Also, I’m not using OpenJDK, I’m using: Java™ SE Runtime Environment (build 1.8.0_91-b14)
Java HotSpot™ 64-Bit Server VM (build 25.91-b14, mixed mode)

Have your log sources changed at all to send larger or differently formatted events?

@karlt How did you monitor the processing queue of Graylog? Did you use a command, or the Graylog heap monitor?

I just watch the buffer queues under Nodes in the web UI.
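If the Nodes page itself is too slow to load, the same numbers can probably be read straight from the REST API. The following is only a rough sketch: it assumes the API exposes a /system/buffers resource and that the response looks like {"buffers": {"input": {"utilization_percent": ...}, ...}}. Both are assumptions to verify in the API browser of your own version, and the function names are made up for illustration.

```python
import base64
import json
import urllib.request


def fetch_buffers(api_url, user, password, timeout=10):
    """GET <api_url>/system/buffers and return the decoded JSON body.

    The /system/buffers path is an assumption; check it against the
    API browser of your Graylog version.
    """
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request = urllib.request.Request(
        api_url.rstrip("/") + "/system/buffers",
        headers={"Authorization": "Basic " + token, "Accept": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def busy_buffers(payload, threshold_percent=80.0):
    """Return the names of buffers at or above the threshold, sorted by name.

    Assumes payload["buffers"][name]["utilization_percent"] exists;
    missing values are treated as 0.
    """
    buffers = payload.get("buffers", {})
    return sorted(
        name
        for name, stats in buffers.items()
        if stats.get("utilization_percent", 0.0) >= threshold_percent
    )
```

Calling busy_buffers(fetch_buffers("http://192.168.0.71:12900", user, password)) from a cron job or a loop would then show which buffers are close to full without going through the web interface.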

If you need some advice on what and how to monitor.

Finally, I have decided to uninstall graylog-server 2.2.2, keep the other packages as they are, and install graylog-server 2.2.2 again.

But I need to keep the existing index files in the /var/graylog folder. If I uninstall Graylog, will it remove the /var/graylog folder?

/var/graylog/
drwxrwxrwx. 3 root root 4096 Jun 17 2016 data
drwxr-xr-x. 3 graylog graylog 4096 Mar 22 15:43 journal
drwxrwxrwx. 3 root root 4096 Jun 17 2016 log

Only if you “purge” (in contrast to “uninstall”) the package.

@jochen Sorry I didn’t get your point

I was referring to apt-get remove (uninstall) versus apt-get purge. See http://manpages.ubuntu.com/manpages/xenial/en/man8/apt-get.8.html for details.

If I uninstall only graylog-server 2.2.2 and keep Elasticsearch and MongoDB as they are, will the uninstall process remove the indices?

I need to keep the Graylog settings and Elasticsearch as they are and remove only the graylog-server 2.2.2 rpm. How can I do that?

Then I will install graylog-server 2.2.2 again and check whether the performance issue persists.

Make a snapshot of your machine and try it out. If it doesn’t work as intended, restore the previously created snapshot.

I am now getting the error below in Graylog. Is it caused only by the output buffer, or is it related to some other issue?

Earlier output_batch_size was 500, and I have set it to 1000 (output_batch_size = 1000), but I am still getting this message.

I am still stuck on improving the output buffer in Graylog. The journal drops messages because Elasticsearch doesn’t write all of them.

Can anyone help me make the necessary changes to increase the Graylog output buffer? I need to know which configuration file and which setting to edit.

My system has 32 GB of RAM and receives nearly 10000 msg/sec.

If indexing is too slow, the solution is making Elasticsearch more performant (e.g. by adding more nodes or by adding more hardware resources), not increasing output buffers in Graylog. This would only lead to full buffers at a later time.
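To put rough numbers on that: when messages arrive faster than Elasticsearch indexes them, the backlog grows at the difference between the two rates, so a bigger buffer only postpones the moment it is full. A minimal sketch of the arithmetic follows. The 10000 msg/sec ingest rate is the figure from this thread; the indexing rate and the buffer capacities are purely hypothetical values for illustration.

```python
def seconds_until_full(capacity_msgs, ingest_rate, index_rate):
    """Return the seconds until a buffer holding capacity_msgs messages is full.

    ingest_rate and index_rate are in messages per second. Returns None
    when indexing keeps up with ingest, because the buffer then never fills.
    """
    backlog_growth = ingest_rate - index_rate
    if backlog_growth <= 0:
        return None
    return capacity_msgs / backlog_growth


# Hypothetical example: 10000 msg/sec arriving, 8000 msg/sec indexed.
print(seconds_until_full(1_000_000, 10_000, 8_000))  # 500.0
print(seconds_until_full(2_000_000, 10_000, 8_000))  # 1000.0
```

Doubling the capacity only doubles the time until the buffer is full. Raising the indexing rate to at least the ingest rate is the only change that makes the function return None, i.e. the buffer never fills.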

@jochen, fine. If there is an issue with Elasticsearch, we can look into that separately.

But why is the whole Graylog system getting slow? Every page takes longer to load, and some pages do not load their data at all, especially
http://192.168.0.71:9000/system/nodes/19ce2075-9719-40b4-8fe9-xxxxxxxxxxx

which is the page we need in order to monitor the Graylog buffer performance.