Graylog server extremely slow, can’t open System/Inputs

2017-11-16T06:11:58.388+05:30 INFO  [cluster] Exception in monitor thread while connecting to server localhost:27017
java.lang.OutOfMemoryError: Java heap space
2017-11-16T06:16:44.297+05:30 WARN  [ProcessBuffer] Unable to process event MessageEvent{raw=null, message=null, messages=null}, sequence 13
java.lang.OutOfMemoryError: Java heap space
2017-11-16T06:16:44.300+05:30 WARN  [InputBufferImpl] Unable to process event <invalid>, sequence 275
java.lang.OutOfMemoryError: Java heap space
2017-11-16T06:16:44.309+05:30 INFO  [cluster] No server chosen by WritableServerSelector from cluster description ClusterDescription{type=UNKNOWN, connectionMode=SINGLE, serverDescriptions=[ServerDescription{address=localhost:27017, type=UNKNOWN, state=CONNECTING, exception={java.lang.OutOfMemoryError: Java heap space}}]}. Waiting for 30000 ms before timing out
2017-11-16T06:16:44.784+05:30 INFO  [cluster] Monitor thread successfully connected to server with description ServerDescription{address=localhost:27017, type=STANDALONE, state=CONNECTED, ok=true, version=ServerVersion{versionList=[3, 2, 17]}, minWireVersion=0, maxWireVersion=4, maxDocumentSize=16777216, roundTripTimeNanos=777202}
2017-11-16T07:20:19.464+05:30 WARN  [ProcessBuffer] Unable to process event MessageEvent{raw=null, message=null, messages=null}, sequence 10
java.lang.OutOfMemoryError: Java heap space

I also see this notification:

There was no master Graylog server node detected in the cluster. (triggered 11 hours ago)
Certain operations of Graylog server require the presence of a master node, but no such master was started. Please ensure that one of your Graylog server nodes contains the setting is_master = true in its configuration and that it is running. Until this is resolved index cycling will not be able to run, which means that the index retention mechanism is also not running, leading to increased index sizes. Certain maintenance functions as well as a variety of web interface pages (e.g. Dashboards) are unavailable.

First, you should format your log files so that people who try to help you can read them faster.

As for your error:

  • java.lang.OutOfMemoryError: Java heap space
    • your Graylog is simply out of memory

Set the JVM heap for Graylog to a proper value and make sure that your server has at least that amount of RAM available.

● graylog-server.service - Graylog server
   Loaded: loaded (/usr/lib/systemd/system/graylog-server.service; enabled; vendor preset: disabled)
   Active: active (running) since Thu 2017-11-16 13:13:33 +0530; 45s ago
     Docs: http://docs.graylog.org/
 Main PID: 23229 (graylog-server)
   CGroup: /system.slice/graylog-server.service
           ├─23229 /bin/sh /usr/share/graylog-server/bin/graylog-server
           └─23230 /usr/bin/java -Xms2g -Xmx5g -XX:NewRatio=1 -server -XX:+ResizeTLAB -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:+CMSClassUnloadingEnabled -XX:+UseParNewGC -XX...

Nov 16 13:13:33 testmaster systemd[1]: Started Graylog server.
Nov 16 13:13:33 testmaster systemd[1]: Starting Graylog server...

This is the service status.

The graylog-server service is working perfectly. The minimum heap is 2G and the maximum is 5G. Is that insufficient?

[root@testmaster ~]# free
              total        used        free      shared  buff/cache   available
Mem:        6111184     5686784      124416        3596      299984      165688
Swap:       2125820     1098108     1027712
[root@testmaster ~]#

Why does graylog-server take this much memory?

Nalin, I have now edited your earlier posts so that you can actually see how to format them!

Second: this is not a chat and not a direct helpline. Write a proper post and add all the information that a person who is not you, and who does not work for your company or department, might need to answer the question.

Now your problem:

  • Is Graylog the only service running on this server with 6GB of RAM available? Or is some other process fighting with Graylog for the system resources?
  • Why did you raise the Graylog Java heap?

My max RAM allocation was previously 1g, so I changed it to 2g min and 5g max. There are no other processes besides Elasticsearch and this Graylog server. The ps -aux output is included in this post if you are interested. There is only one input right now. I have ticked “store full message” in the input settings; that may sometimes affect memory.

root       747  0.0  0.0  21624   360 ?        Ss   Nov15   0:04 /usr/sbin/irqbalance --foreground
root       751  0.0  0.0 335384  4188 ?        Ssl  Nov15   0:15 /usr/sbin/syslog-ng -F -p /var/run/syslogd.pid
dbus       753  0.0  0.0  32772   608 ?        Ssl  Nov15   0:03 /bin/dbus-daemon --system --address=systemd: --nofork --nopidfile --systemd-activation
root       756  0.0  0.0   4340     0 ?        Ss   Nov15   0:00 /usr/sbin/acpid
ntp        760  0.0  0.0  25668   596 ?        Ss   Nov15   0:00 /usr/sbin/ntpd -u ntp:ntp
root       765  0.0  0.0 469620  1076 ?        Ssl  Nov15   0:02 /usr/sbin/NetworkManager --no-daemon
root       766  0.0  0.0  24204   896 ?        Ss   Nov15   0:01 /usr/lib/systemd/systemd-logind
root       769  0.0  0.0 126236   432 ?        Ss   Nov15   0:01 /usr/sbin/crond -n
root       779  0.0  0.0 110044     0 tty1     Ss+  Nov15   0:00 /sbin/agetty --noclear tty1 linux
root      1052  0.0  0.0      0     0 ?        S<   Nov15   0:00 [kworker/1:1H]
root      8842  0.0  0.0 147852   204 ?        Ss   10:08   0:00 sshd: root@pts/0
root      8847  0.0  0.0 115524   184 pts/0    Ss+  10:08   0:00 -bash
elastic+  9195  1.1  6.8 5339432 417660 ?      Ssl  10:11   2:55 /bin/java -Xms2g -Xmx2g -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=75 -XX:+UseCMSInitiatingOccupancyOnly -XX:+
root      9529  0.0  0.0 147852   120 ?        Ss   10:16   0:00 sshd: root@pts/1
root      9533  0.0  0.0 115524     4 pts/1    Ss+  10:16   0:00 -bash
root      9583  0.0  0.0 147852   220 ?        Ss   10:22   0:00 sshd: root@pts/2
root      9587  0.0  0.0 115524   740 pts/2    Ss   10:22   0:00 -bash
root     12361  0.1  0.0 176268  1668 ?        S    Nov15   1:39 /usr/sbin/vmtoolsd
root     12391  0.0  0.0  52140     0 ?        S    Nov15   0:00 /usr/lib/vmware-vgauth/VGAuthService -s
root     12443  0.0  0.0 200652   424 ?        Sl   Nov15   0:43 /usr/lib/vmware-caf/pme/bin/ManagementAgentHost
root     12452  0.0  0.0      0     0 ?        S    11:01   0:00 [kworker/2:1]
root     12593  0.0  0.0 105996    32 ?        Ss   Nov15   0:00 /usr/sbin/sshd -D
root     12597  0.0  0.0 562392   588 ?        Ssl  Nov15   0:15 /usr/bin/python -Es /usr/sbin/tuned -l -P
root     12600  0.0  0.0 115644    80 ?        Ss   Nov15   0:00 /usr/bin/rhsmcertd
root     12603  0.0  0.0  27116     0 ?        Ss   Nov15   0:00 /usr/sbin/xinetd -stayalive -pidfile /var/run/xinetd.pid
mongod   12641  1.0  1.1 874952 67740 ?        Sl   Nov15  13:38 /usr/bin/mongod -f /etc/mongod.conf
root     12712  0.0  0.0 107920   100 ?        Ss   Nov15   0:00 rhnsd
root     15656  0.0  0.0      0     0 ?        S    11:40   0:00 [kworker/0:2]
root     19632  0.0  0.0      0     0 ?        S    12:28   0:00 [kworker/2:0]
root     22235  0.0  0.0      0     0 ?        R    13:01   0:02 [kworker/0:1]
root     25225  0.0  0.0      0     0 ?        S    13:40   0:00 [kworker/u6:0]
root     26032  0.0  0.0      0     0 ?        S    13:51   0:00 [kworker/u6:1]
root     27816  0.0  0.0      0     0 ?        S    14:12   0:00 [kworker/1:1]
root     28616  0.0  0.0      0     0 ?        S    14:22   0:00 [kworker/1:0]
root     28991  0.0  0.0      0     0 ?        S    14:27   0:00 [kworker/1:2]
graylog  28997  0.0  0.0 113128  1432 ?        Ss   14:28   0:00 /bin/sh /usr/share/graylog-server/bin/graylog-server
graylog  28998  255 44.6 8553996 2731424 ?     Sl   14:28   4:15 /usr/bin/java -Xms2g -Xmx5g -XX:NewRatio=1 -server -XX:+ResizeTLAB -XX:+UseConcMarkSweepGC -XX:+CMSConcurrentMTEnabled -XX:+CM
root     29167  5.0  0.0 151064  1828 pts/2    R+   14:29   0:00 ps -aux

Please post some suggestions for reducing memory usage, and some possible reasons for this much memory usage.

Your system only has 6 gigabytes of memory but you’ve assigned a total of 7 gigabytes to various JVM processes.

Color me surprised that an OutOfMemoryError occurred…

Graylog rarely needs more than 2 gigabytes of heap. Additionally, make sure to leave some memory for the operating system’s disk cache, otherwise performance will tank.

Oh, and swap “memory” makes everything worse. Make sure to never swap.
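
The arithmetic behind that remark can be checked with a short script. This is only a sketch: the command-line fragments and the total RAM figure are copied from the ps and free output posted earlier in the thread, while the helper names (`max_heap_kib`, `heap_budget`) are made up for illustration. It only understands `-Xmx` values with a k/m/g suffix.

```python
import re

# Command-line fragments copied from the ps output earlier in this thread.
JVM_COMMANDS = {
    "graylog": "/usr/bin/java -Xms2g -Xmx5g -XX:NewRatio=1 -server",
    "elasticsearch": "/bin/java -Xms2g -Xmx2g -XX:+UseConcMarkSweepGC",
}

# "Mem: total" from the free output in this thread (free reports KiB by default).
TOTAL_RAM_KIB = 6111184

_UNITS_KIB = {"k": 1, "m": 1024, "g": 1024 * 1024}


def max_heap_kib(command_line):
    """Return the -Xmx value of a JVM command line in KiB, or None if absent.

    Hypothetical helper: values without a k/m/g suffix are not handled.
    """
    match = re.search(r"-Xmx(\d+)([kKmMgG])", command_line)
    if match is None:
        return None
    return int(match.group(1)) * _UNITS_KIB[match.group(2).lower()]


def heap_budget(commands, total_ram_kib):
    """Sum the maximum heaps and report whether they exceed physical RAM."""
    heaps = {name: max_heap_kib(cmd) or 0 for name, cmd in commands.items()}
    committed = sum(heaps.values())
    return {
        "heaps_kib": heaps,
        "committed_kib": committed,
        "overcommitted": committed > total_ram_kib,
    }


if __name__ == "__main__":
    report = heap_budget(JVM_COMMANDS, TOTAL_RAM_KIB)
    print(report)
```

Note that the heap is only part of what a JVM uses, and MongoDB plus the operating system's disk cache need memory as well, so the real shortfall is larger than this sum suggests.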


I changed it to 2g min and 2g max. The slowness is still the same. When I refresh the web frontend, it says
"We are preparing Graylog for you…"

I have attached the system logs in case you are interested:

Nov 16 14:41:37 localhost syslog-ng[751]: Syslog connection broken; fd='12', server='AF_INET(127.0.0.1:1514)', time_reopen='10'
Nov 16 14:41:47 localhost syslog-ng[751]: Syslog connection established; fd='12', server='AF_INET(127.0.0.1:1514)', local='AF_INET(0.0.0.0:0)'
Nov 16 14:41:47 localhost syslog-ng[751]: Syslog connection broken; fd='12', server='AF_INET(127.0.0.1:1514)', time_reopen='10'
Nov 16 14:41:57 localhost syslog-ng[751]: Syslog connection established; fd='12', server='AF_INET(127.0.0.1:1514)', local='AF_INET(0.0.0.0:0)'
Nov 16 14:41:57 localhost syslog-ng[751]: I/O error occurred while writing; fd='12', error='Connection refused (111)'
Nov 16 14:41:57 localhost syslog-ng[751]: Syslog connection broken; fd='12', server='AF_INET(127.0.0.1:1514)', time_reopen='10'
Nov 16 14:42:07 localhost syslog-ng[751]: Syslog connection established; fd='12', server='AF_INET(127.0.0.1:1514)', local='AF_INET(0.0.0.0:0)'
Nov 16 14:42:07 localhost syslog-ng[751]: internal() messages are looping back, preventing loop by suppressing all internal messages until the current message is processed; trigger-msg='', first-suppressed-msg='I/O error occurred while writing; fd=\'12\', error=\'Connection refused (111)\''
Nov 16 14:42:07 localhost syslog-ng[751]: Syslog connection broken; fd='12', server='AF_INET(127.0.0.1:1514)', time_reopen='10'
Nov 16 14:42:17 localhost syslog-ng[751]: Syslog connection established; fd='12', server='AF_INET(127.0.0.1:1514)', local='AF_INET(0.0.0.0:0)'
Nov 16 14:42:17 localhost syslog-ng[751]: internal() messages are looping back, preventing loop by suppressing all internal messages until the current message is processed; trigger-msg='', first-suppressed-msg='I/O error occurred while writing; fd=\'12\', error=\'Connection refused (111)\''
Nov 16 14:42:17 localhost syslog-ng[751]: Syslog connection broken; fd='12', server='AF_INET(127.0.0.1:1514)', time_reopen='10'
Nov 16 14:42:27 localhost syslog-ng[751]: Syslog connection established; fd='12', server='AF_INET(127.0.0.1:1514)', local='AF_INET(0.0.0.0:0)'
Nov 16 14:42:27 localhost syslog-ng[751]: internal() messages are looping back, preventing loop by suppressing all internal messages until the current message is processed; trigger-msg='', first-suppressed-msg='I/O error occurred while writing; fd=\'12\', error=\'Connection refused (111)\''
Nov 16 14:42:27 localhost syslog-ng[751]: Syslog connection broken; fd='12', server='AF_INET(127.0.0.1:1514)', time_reopen='10'
Nov 16 14:42:37 localhost syslog-ng[751]: Syslog connection established; fd='12', server='AF_INET(127.0.0.1:1514)', local='AF_INET(0.0.0.0:0)'
Nov 16 14:42:37 localhost syslog-ng[751]: internal() messages are looping back, preventing loop by suppressing all internal messages until the current message is processed; trigger-msg='', first-suppressed-msg='I/O error occurred while writing; fd=\'12\', error=\'Connection refused (111)\''
Nov 16 14:42:37 localhost syslog-ng[751]: Syslog connection broken; fd='12', server='AF_INET(127.0.0.1:1514)', time_reopen='10'
Nov 16 14:42:47 localhost syslog-ng[751]: Syslog connection established; fd='12', server='AF_INET(127.0.0.1:1514)', local='AF_INET(0.0.0.0:0)'
Nov 16 14:42:47 localhost syslog-ng[751]: I/O error occurred while writing; fd='12', error='Connection refused (111)'
Nov 16 14:42:47 localhost syslog-ng[751]: Syslog connection broken; fd='12', server='AF_INET(127.0.0.1:1514)', time_reopen='10'
Nov 16 14:42:57 localhost syslog-ng[751]: Syslog connection established; fd='12', server='AF_INET(127.0.0.1:1514)', local='AF_INET(0.0.0.0:0)'
Nov 16 14:42:57 localhost syslog-ng[751]: I/O error occurred while writing; fd='12', error='Connection refused (111)'
Nov 16 14:42:57 localhost syslog-ng[751]: Syslog connection broken; fd='12', server='AF_INET(127.0.0.1:1514)', time_reopen='10'

Why do you attach random log files?

My second question wasn’t answered:

Why did you raise the Graylog JAVA HEAP?

As we can see from your ps -aux output, you have one server with 6GB of RAM that has Elasticsearch, MongoDB and Graylog running. What again was the reason to raise the Graylog Java heap from the default?

After you returned to the default (1GB), what exactly is your problem?

thank you

Thank you very much for your contribution to my problem. It is highly appreciated.

My problem was Graylog server slowness. The Java heap error appeared in the Graylog logs. It is gone now, after I increased the RAM allocation to 2g, but the slowness is still there. The system log said the following:

internal() messages are looping back, preventing loop by suppressing all internal messages until the current message is processed

So I stopped sending internal messages from the syslog-ng server into Graylog.

The syslog-ng settings are as follows:

destination d_net {
    syslog("localhost" transport("udp") port(1514));
};
# Tell syslog-ng to send data from source s_src to the newly defined syslog destination.
log {
    #source(s_sys); # Defined in the default syslog-ng configuration.
    #destination(d_net);
};

The slowness is still there, but the system error messages are gone.

The problem is still there; I can’t even log in to Graylog.

Make sure you don’t over-provision the memory of your system and that there is still enough memory left for the operating system to use for the disk cache.

For reference:

I have allocated 2g to Graylog and stopped everything else. Elasticsearch has 2g, and 2g is left for the system.

To date you have provided very little information about the system and have been somewhat demanding. Both Jan and Jochen have been very gracious and tolerant in their responses.

I highly recommend you read the Graylog install documents for whatever OS you are running. Particularly any parts about memory requirements, swap, and open file handlers.
http://docs.graylog.org/en/2.3/pages/installation.html

From looking at the free command you posted, it’s obvious you are swapping on the system, which is going to cause performance issues due to misconfiguration and/or insufficient resources.
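
As a rough illustration of how to read that output, the following sketch parses the free output posted earlier in the thread and reports how much swap is in use. The function names (`parse_free`, `swap_used_fraction`) are hypothetical, and the parser assumes the plain column layout shown in that post, with values in KiB.

```python
# Output of `free` copied from the earlier post in this thread.
FREE_OUTPUT = """\
              total        used        free      shared  buff/cache   available
Mem:        6111184     5686784      124416        3596      299984      165688
Swap:       2125820     1098108     1027712
"""


def parse_free(text):
    """Parse the plain output of `free` into {row label: {column: value}}."""
    lines = [line for line in text.splitlines() if line.strip()]
    columns = lines[0].split()
    result = {}
    for line in lines[1:]:
        label, *values = line.split()
        # The Swap row has fewer columns than the Mem row;
        # zip() stops at the shorter sequence, so only total/used/free are kept.
        result[label.rstrip(":")] = dict(zip(columns, map(int, values)))
    return result


def swap_used_fraction(stats):
    """Return the fraction of swap in use, or 0.0 if no swap is configured."""
    swap = stats.get("Swap", {})
    total = swap.get("total", 0)
    return swap.get("used", 0) / total if total else 0.0


if __name__ == "__main__":
    stats = parse_free(FREE_OUTPUT)
    print("available KiB:", stats["Mem"]["available"])
    print("swap used KiB:", stats["Swap"]["used"])
    print("swap used fraction: %.2f" % swap_used_fraction(stats))
```

With the numbers from the post, roughly half of the swap space is in use while only about 160 MiB of memory is reported as available, which matches the diagnosis above.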

For Graylog, a 1gb min/max heap should be fine.
For Elasticsearch, if you set the min/max heap to 2gb, then I would plan on having an equal amount of RAM free in case Elasticsearch crashes.
For MongoDB, I would have at least 1gb of RAM.
In total, I would allocate a minimum of 8gb of RAM to the system if it has very little load.

If you want help with this, you need to post sanitized Graylog, Elasticsearch, and MongoDB configs. They need to be formatted so we can read them, as Jan mentioned above.

We also need to know what kind of environment you are running on and the number of messages you are sending per second to the server.

Just to chime in here, 256m of memory for MongoDB should be fine in most cases. Small setups might even work with 64m for MongoDB.

I am using Red Hat Enterprise Linux 7.3.

In some cases the message load is 3000 msg/sec.
