Incoming messages not writing to index


(GT) #1

I have logs coming in. This can be verified by the counter at the top right (‘In 12 / Out 0 msg/s’) and by the input, which also shows messages coming in. These logs aren’t being written to the active write index. I tried rotating the index, but that made no difference. I can provide more info if needed.

G


(Jochen) #2

What’s in the logs of your Graylog and Elasticsearch nodes?
http://docs.graylog.org/en/2.2/pages/configuration/file_location.html


(GT) #3

This is the ES log. Messages were showing up in the web interface, but this stopped at approximately 14:43.

[2017-06-26T13:29:18,815][INFO ][o.e.m.j.JvmGcMonitorService] [d_v_zrN] [gc][25] overhead, spent [434ms] collecting in the last [1.3s]
[2017-06-26T13:30:13,206][WARN ][o.e.m.j.JvmGcMonitorService] [d_v_zrN] [gc][young][78][8] duration [1.6s], collections [1]/[2.3s], total [1.6s]/[2.5s], memory [155.7mb]->[125.3mb]/[1.9gb], all_pools {[young] [100.7mb]->[16.1mb]/[133.1mb]}{[survivor] [16.6mb]->[12.5mb]/[16.6mb]}{[old] [38.3mb]->[96.7mb]/[1.8gb]}
[2017-06-26T13:30:13,237][WARN ][o.e.m.j.JvmGcMonitorService] [d_v_zrN] [gc][78] overhead, spent [1.6s] collecting in the last [2.3s]
[2017-06-26T13:32:28,480][INFO ][o.e.m.j.JvmGcMonitorService] [d_v_zrN] [gc][young][213][9] duration [714ms], collections [1]/[1s], total [714ms]/[3.3s], memory [240.3mb]->[121.8mb]/[1.9gb], all_pools {[young] [131.1mb]->[1.1mb]/[133.1mb]}{[survivor] [12.5mb]->[7.9mb]/[16.6mb]}{[old] [96.7mb]->[112.7mb]/[1.8gb]}
[2017-06-26T13:32:28,480][WARN ][o.e.m.j.JvmGcMonitorService] [d_v_zrN] [gc][213] overhead, spent [714ms] collecting in the last [1s]
[2017-06-26T13:52:15,724][INFO ][o.e.c.m.MetaDataCreateIndexService] [d_v_zrN] [graylog_15] creating index, cause [api], templates [graylog-internal], shards [4]/[0], mappings [message]
[2017-06-26T13:52:15,822][INFO ][o.e.c.r.a.AllocationService] [d_v_zrN] Cluster health status changed from [YELLOW] to [GREEN] (reason: [shards started [[graylog_15][1], [graylog_15][3], [graylog_15][2], [graylog_15][0]] ...]).
[2017-06-26T13:53:36,914][INFO ][o.e.m.j.JvmGcMonitorService] [d_v_zrN] [gc][1479] overhead, spent [413ms] collecting in the last [1s]
[2017-06-26T14:47:06,560][INFO ][o.e.c.m.MetaDataMappingService] [d_v_zrN] [graylog_15/VcUxherFTM6xQzqE4SEOPg] update_mapping [message]
[2017-06-26T15:21:36,098][WARN ][o.e.m.j.JvmGcMonitorService] [d_v_zrN] [gc][young][6741][494] duration [9.1s], collections [1]/[9.8s], total [9.1s]/[19.7s], memory [183.2mb]->[158.1mb]/[1.9gb], all_pools {[young] [26.3mb]->[2.7mb]/[133.1mb]}{[survivor] [6.7mb]->[4.9mb]/[16.6mb]}{[old] [150.1mb]->[150.4mb]/[1.8gb]}
[2017-06-26T15:21:36,835][WARN ][o.e.m.j.JvmGcMonitorService] [d_v_zrN] [gc][6741] overhead, spent [9.1s] collecting in the last [9.8s]
[2017-06-26T15:25:59,240][WARN ][o.e.m.j.JvmGcMonitorService] [d_v_zrN] [gc][young][7003][670] duration [1.1s], collections [1]/[1.2s], total [1.1s]/[23.3s], memory [277.3mb]->[176.6mb]/[1.9gb], all_pools {[young] [100.8mb]->[34kb]/[133.1mb]}{[survivor] [2.6mb]->[1.9mb]/[16.6mb]}{[old] [173.8mb]->[174.7mb]/[1.8gb]}
[2017-06-26T15:25:59,242][WARN ][o.e.m.j.JvmGcMonitorService] [d_v_zrN] [gc][7003] overhead, spent [1.1s] collecting in the last [1.2s]
[2017-06-26T15:40:49,462][INFO ][o.e.c.m.MetaDataMappingService] [d_v_zrN] [graylog_15/VcUxherFTM6xQzqE4SEOPg] update_mapping [message]
[2017-06-26T15:40:49,567][INFO ][o.e.c.m.MetaDataMappingService] [d_v_zrN] [graylog_15/VcUxherFTM6xQzqE4SEOPg] update_mapping [message]
[2017-06-26T15:40:49,648][INFO ][o.e.c.m.MetaDataMappingService] [d_v_zrN] [graylog_15/VcUxherFTM6xQzqE4SEOPg] update_mapping [message]
[2017-06-26T15:40:51,505][INFO ][o.e.c.m.MetaDataMappingService] [d_v_zrN] [graylog_15/VcUxherFTM6xQzqE4SEOPg] update_mapping [message]
[2017-06-26T15:40:51,596][INFO ][o.e.c.m.MetaDataMappingService] [d_v_zrN] [graylog_15/VcUxherFTM6xQzqE4SEOPg] update_mapping [message]
[2017-06-26T15:40:53,428][INFO ][o.e.c.m.MetaDataMappingService] [d_v_zrN] [graylog_15/VcUxherFTM6xQzqE4SEOPg] update_mapping [message]
[2017-06-26T15:40:55,398][INFO ][o.e.c.m.MetaDataMappingService] [d_v_zrN] [graylog_15/VcUxherFTM6xQzqE4SEOPg] update_mapping [message]

The Graylog Log file shows:

2017-06-26T14:43:11.933+01:00 WARN  [AbstractNioSelector] Failed to initialize an accepted socket.
java.io.IOException: overrun, bytes = 612

I was receiving this type of error for about ten minutes before logs stopped being visible, and the errors continued for about ten minutes afterwards.

G


(Jochen) #4

There is something wrong with that setup.

Garbage collection times of 9 seconds are pretty abysmal (usually it’s only a few milliseconds) and the error message from the Graylog node’s logs also looks strange. Is there more content in the logs of Graylog?
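
To make pauses like these easier to spot, here is a minimal Python sketch that pulls the `duration [...]` value out of `JvmGcMonitorService` lines in the format pasted above. The function names and the one-second threshold are illustrative assumptions, and only the `ms` and `s` units seen in this log are handled.

```python
import re

# Matches e.g. "[gc][young][6741][494] duration [9.1s]" or "... duration [714ms]".
# Only the "ms" and "s" units that appear in the log above are handled.
GC_DURATION = re.compile(
    r"\[gc\]\[(?:young|old)\]\[\d+\]\[\d+\] duration \[(?P<value>[\d.]+)(?P<unit>ms|s)\]"
)
# The timestamp is the first bracketed field of the line.
TIMESTAMP = re.compile(r"^\[(?P<ts>[^\]]+)\]")


def gc_pause_seconds(line):
    """Return the GC pause in seconds, or None if the line has no GC duration."""
    match = GC_DURATION.search(line)
    if match is None:
        return None
    value = float(match.group("value"))
    return value / 1000.0 if match.group("unit") == "ms" else value


def long_pauses(lines, threshold_seconds=1.0):
    """Return (timestamp, seconds) for every pause at or above the threshold."""
    found = []
    for line in lines:
        seconds = gc_pause_seconds(line)
        if seconds is None or seconds < threshold_seconds:
            continue
        ts_match = TIMESTAMP.match(line)
        found.append((ts_match.group("ts") if ts_match else "", seconds))
    return found
```

Feeding the Elasticsearch log file to `long_pauses` line by line lists only the collections worth worrying about; the "overhead" summary lines are skipped because they carry no `duration` field.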

What are the hardware specs of the machines running Graylog and Elasticsearch?
How did you install Graylog?
How did you configure Graylog and Elasticsearch?
What are the JVM (heap) settings for Graylog and Elasticsearch?


(GT) #5

The specs are 2 cores, 2 GB RAM and 20 GB disk space.

We used an Ansible playbook to install it; it’s installed on an Ubuntu 16.04 VM.

We also use the Ansible playbook to configure our instances. I’m hesitant to post our whole config, but are there any specific parts of it you need?

I’m not sure how to find the JVM settings. If you could point me in the right direction, I’d be happy to provide them.

There was nothing else in the Graylog logs at the time.

G


(Jochen) #6

That’s not enough to run Graylog and Elasticsearch and MongoDB on the same machine with decent performance.

Try using a machine with at least 4 GB of memory and make sure to not use everything for JVM processes, so that the operating system can make use of the memory as disk cache.
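
As a rough illustration of that budgeting, the Python sketch below subtracts the configured JVM heaps from total RAM. The Elasticsearch figure comes from the GC lines earlier in the thread, which report a heap limit of `1.9gb` (about 1945 MB); the Graylog heap of 1024 MB and the function name are hypothetical placeholders, since that value was never posted.

```python
def os_headroom_mb(total_ram_mb, jvm_heap_mb):
    """RAM left for the OS, disk cache and non-JVM processes after all JVM heaps.

    Heap size is only a lower bound on what a JVM uses, so real headroom is smaller.
    """
    return total_ram_mb - sum(jvm_heap_mb.values())


# Elasticsearch: 1.9gb heap limit seen in the GC log lines (about 1945 MB).
# Graylog: 1024 MB is an assumed placeholder, not a value from this thread.
heaps = {"elasticsearch": 1945, "graylog": 1024}

print(os_headroom_mb(2048, heaps))  # -921: the heaps alone exceed a 2 GB machine
print(os_headroom_mb(4096, heaps))  # 1127: some room left for disk cache and MongoDB
```

A negative or near-zero result means the machine will swap or starve the disk cache, which is consistent with multi-second garbage collection pauses.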

For where to find the JVM settings, please refer to http://docs.graylog.org/en/2.2/pages/configuration/file_location.html#deb-package
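
Once the files are located, the heap settings are the `-Xms` and `-Xmx` flags. A minimal Python sketch for extracting them is below. The two paths are assumptions based on the usual DEB package layout (Graylog's defaults file and Elasticsearch 5.x's `jvm.options`), so check them against the page linked above; the function name is illustrative.

```python
import re
from pathlib import Path

# Matches JVM heap flags such as -Xms1g or -Xmx2048m.
HEAP_FLAG = re.compile(r"-(Xm[sx])(\d+[kKmMgG]?)")

# Assumed DEB package locations; verify them for your installation.
CANDIDATE_FILES = [
    "/etc/default/graylog-server",
    "/etc/elasticsearch/jvm.options",
]


def heap_flags(text):
    """Return e.g. {'Xms': '1g', 'Xmx': '1g'} from uncommented lines of a config file."""
    flags = {}
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # skip commented-out settings
        for name, size in HEAP_FLAG.findall(stripped):
            flags[name] = size
    return flags


if __name__ == "__main__":
    for path in CANDIDATE_FILES:
        try:
            print(path, heap_flags(Path(path).read_text()))
        except OSError as error:
            # Missing file or insufficient permissions.
            print(path, "not readable:", error)
```

If a file is reported as not readable, run the script with sufficient privileges or open the file manually and look for the same two flags.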


(GT) #7

I have just upgraded the box to 4 GB of RAM and restarted all of the services. I’m happy to report that this completely solved the problem and logs are being indexed.

Thank you very much for your time and assistance.

G

