Graylog Big Problem

gsmith · November 17, 2022, 6:24am

Hello,

Elasticsearch is still ingesting logs, so you will not see new messages until it is done. If the journal does not go down, I would look at your Elasticsearch and Graylog log files.

Ensure Elasticsearch is functioning correctly. Depending on the volume of logs being ingested, you may need to increase the buffer processor counts, but if you do not have enough resources (CPU, memory), I would not increase those settings until you have an adequate amount of resources.
I believe the defaults are processbuffer_processors = 5 and outputbuffer_processors = 3. You should have at least 8 CPUs on the Graylog server.
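
As a rough sketch of how to compare those settings with the CPUs you actually have: in server.conf the settings are named inputbuffer_processors, processbuffer_processors and outputbuffer_processors. The config path below is an assumption (the usual package location), and the function names are only illustrative. A common rule of thumb is to keep the total at or below the number of CPU cores.

```shell
#!/usr/bin/env bash
# Print the buffer processor settings that are explicitly set in a Graylog
# server.conf. The default path is an assumption (typical package install).
show_buffer_settings() {
  local conf="${1:-/etc/graylog/server/server.conf}"
  grep -E '^[[:space:]]*(inputbuffer|processbuffer|outputbuffer)_processors[[:space:]]*=' "$conf"
}

# Sum the configured processor counts so they can be compared with `nproc`.
# Commented-out lines are ignored, so built-in defaults are not counted.
total_buffer_processors() {
  show_buffer_settings "$1" | awk -F= '{ sum += $2 } END { print sum + 0 }'
}

# Usage (on the Graylog host):
#   show_buffer_settings
#   echo "configured: $(total_buffer_processors), CPUs: $(nproc)"
```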

Grayuser78 · November 21, 2022, 2:44am

Dear @gsmith,
Thank you for the additional information.
I recently stopped sending logs to the Graylog server, but the process buffer still shows 100%, as in the screenshot below. Could you please advise further? Thanks.

gsmith · November 21, 2022, 10:16pm

Hello,

Looks like you have three alerts at the top of the picture. What do they show?
First, I would check the Elasticsearch service status and use curl to check the health of ES, then post the results. If you do, don't forget to remove personal information.

Example:

systemctl status elasticsearch

curl -XGET 'http://es_node:9200/_cluster/health?pretty'

Please post your Elasticsearch and Graylog configuration files. Also, did you by chance tail -f the Graylog log file when this was happening? If so, what did you see? That picture could have multiple causes. Normally when I see buffers fill up, it is an Elasticsearch, resource, configuration, and/or connection issue.
I need to see if your ES status is good: not only running and green, but also not stuck in read-only mode. That could cause the journal and buffers to fill up quickly.
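
A minimal sketch of how the read-only check could look, assuming Elasticsearch listens on 127.0.0.1:9200 (adjust ES_URL to your node). The function names are only illustrative. It asks for every index.blocks.* setting and counts the indices that carry the read_only_allow_delete block.

```shell
#!/usr/bin/env bash
# ES_URL is an assumption; point it at your own node.
ES_URL="${ES_URL:-http://127.0.0.1:9200}"

# Fetch any index.blocks.* settings for all indices (pretty-printed JSON).
get_index_blocks() {
  curl -s "$ES_URL/_all/_settings/index.blocks.*?pretty"
}

# Count the lines on stdin where read_only_allow_delete is set to "true".
# With pretty-printed output that is one line per affected index.
count_read_only() {
  grep -c 'read_only_allow_delete"[[:space:]]*:[[:space:]]*"true"'
}

# Usage:
#   get_index_blocks | count_read_only
```

A count above zero would mean at least one index is refusing writes.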

Grayuser78 · November 24, 2022, 7:11am

Dear @gsmith,
Thank you for the additional information. Please find the details below:
- The three alerts are here:

$ systemctl status elasticsearch
● elasticsearch.service - Elasticsearch
Loaded: loaded (/usr/lib/systemd/system/elasticsearch.service; enabled; vendor preset: disabled)
Active: failed (Result: exit-code) since Wed 2022-11-16 18:46:20 +07; 1 weeks 0 days ago
Docs: https://www.elastic.co
Process: 31885 ExecStart=/usr/share/elasticsearch/bin/systemd-entrypoint -p ${PID_DIR}/elasticsearch.pid --quiet (code=exited, status=128)
Main PID: 31885 (code=exited, status=128)
$ curl -XGET http://es_node:9200/_cluster/health?pretty
curl: (6) Could not resolve host: es_node; Unknown error
$ tail -f /var/log/graylog-server/server.log
… 11 more
Caused by: java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method) ~[?:?]
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:777) ~[?:?]
at org.graylog.shaded.elasticsearch7.org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvent(DefaultConnectingIOReactor.java:174) ~[?:?]
at org.graylog.shaded.elasticsearch7.org.apache.http.impl.nio.reactor.DefaultConnectingIOReactor.processEvents(DefaultConnectingIOReactor.java:148) ~[?:?]
at org.graylog.shaded.elasticsearch7.org.apache.http.impl.nio.reactor.AbstractMultiworkerIOReactor.execute(AbstractMultiworkerIOReactor.java:351) ~[?:?]
at org.graylog.shaded.elasticsearch7.org.apache.http.impl.nio.conn.PoolingNHttpClientConnectionManager.execute(PoolingNHttpClientConnectionManager.java:221) ~[?:?]
at org.graylog.shaded.elasticsearch7.org.apache.http.impl.nio.client.CloseableHttpAsyncClientBase$1.run(CloseableHttpAsyncClientBase.java:64) ~[?:?]
… 1 more
2022-11-24T09:17:00.848+07:00 ERROR [VersionProbe] Unable to retrieve version from Elasticsearch node: Failed to connect to /127.0.0.1:9200. - Connection refused (Connection refused).
2022-11-24T09:17:00.920+07:00 ERROR [IndexFieldTypePollerPeriodical] Couldn’t update field types for index set <Default index set/62f4724c6e7df24565785364>
org.graylog.shaded.elasticsearch7.org.elasticsearch.ElasticsearchException: An error occurred:
at org.graylog.storage.elasticsearch7.ElasticsearchClient.exceptionFrom(ElasticsearchClient.java:140) ~[?:?]
at org.graylog.storage.elasticsearch7.ElasticsearchClient.execute(ElasticsearchClient.java:100) ~[?:?]
at org.graylog.storage.elasticsearch7.ElasticsearchClient.execute(ElasticsearchClient.java:93) ~[?:?]
at org.graylog.storage.elasticsearch7.IndicesAdapterES7.resolveAlias(IndicesAdapterES7.java:139) ~[?:?]
at org.graylog2.indexer.indices.Indices.aliasTarget(Indices.java:145) ~[graylog.jar:?]
at org.graylog2.indexer.MongoIndexSet.getActiveWriteIndex(MongoIndexSet.java:202) ~[graylog.jar:?]
at org.graylog2.indexer.fieldtypes.IndexFieldTypePollerPeriodical.lambda$schedule$4(IndexFieldTypePollerPeriodical.java:249) ~[graylog.jar:?]
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515) [?:?]
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:305) [?:?]
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:305) [?:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128) [?:?]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628) [?:?]
at java.lang.Thread.run(Thread.java:829) [?:?]
Caused by: java.net.ConnectException: Connection refused
at org.graylog.shaded.elasticsearch7.org.elasticsearch.client.RestClient.extractAndWrapCause(RestClient.java:849) ~[?:?]
at org.graylog.shaded.elasticsearch7.org.elasticsearch.client.RestClient.performRequest(RestClient.java:259) ~[?:?]
at org.graylog.shaded.elasticsearch7.org.elasticsearch.client.RestClient.performRequest(RestClient.java:246) ~[?:?]
at org.graylog.shaded.elasticsearch7.org.elasticsearch.client.RestHighLevelClient.internalPerformRequest(RestHighLevelClient.java:1613) ~[?:?]
at org.graylog.shaded.elasticsearch7.org.elasticsearch.client.RestHighLevelClient.performRequest(RestHighLevelClient.java:1583) ~[?:?]
at org.graylog.shaded.elasticsearch7.org.elasticsearch.client.RestHighLevelClient.performRequestAndParseEntity(RestHighLevelClient.java:1553) ~[?:?]
at org.graylog.shaded.elasticsearch7.org.elasticsearch.client.IndicesClient.getAlias(IndicesClient.java:1315) ~[?:?]
at org.graylog.storage.elasticsearch7.IndicesAdapterES7.lambda$resolveAlias$2(IndicesAdapterES7.java:139) ~[?:?]
at org.graylog.storage.elasticsearch7.ElasticsearchClient.execute(ElasticsearchClient.java:98) ~[?:?]
… 11 more

Please check and advise further.
Thanks,
Best Regards

gsmith · November 24, 2022, 10:44pm

Hey,

Man, you really need to check your alerts; those were from two months ago. This tells me you have a bigger problem than expected.

From the logs it looks like Elasticsearch crashed because of the warning two months ago. Basically, when a disk gets over 95% full, I believe Elasticsearch puts the indices into read-only mode as a safety precaution.
I'll be honest, you have a lot of work to bring this back to life.

If this happened on my server, I would shut the Graylog service down.
Then take ES out of read-only mode. This link should give you a better idea of what is needed.
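
For reference, a hedged sketch of the usual way to clear the block once disk space has been freed, assuming a node at 127.0.0.1:9200 (ES_URL and the function names are illustrative). Setting index.blocks.read_only_allow_delete to null resets it on every index. Newer Elasticsearch 7.x releases should also release the block on their own once usage drops below the high watermark.

```shell
#!/usr/bin/env bash
ES_URL="${ES_URL:-http://127.0.0.1:9200}"   # assumption: local node

# JSON body that resets the block; null restores the default (no block).
clear_block_body() {
  printf '%s' '{"index.blocks.read_only_allow_delete": null}'
}

# Apply the body to every index. Run only after disk space has been freed,
# otherwise Elasticsearch will set the block again.
clear_read_only() {
  curl -s -X PUT "$ES_URL/_all/_settings" \
       -H 'Content-Type: application/json' \
       -d "$(clear_block_body)"
}

# Usage:
#   clear_read_only
```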

Then increase the volume to something larger if possible, or this may happen again.

Once that is completed, calculate the sizes of the indices and make sure they do not exceed the volume they reside on.
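
One way to do that calculation, as a sketch only: sum the store size of all indices from the _cat/indices API and compare it with the size of the data volume. ES_URL and ES_DATA below are assumptions (local node, default package data path), and the function names are illustrative.

```shell
#!/usr/bin/env bash
ES_URL="${ES_URL:-http://127.0.0.1:9200}"       # assumption: local node
ES_DATA="${ES_DATA:-/var/lib/elasticsearch}"    # assumption: default data path

# List "index size_in_bytes" pairs, one per line.
index_sizes() {
  curl -s "$ES_URL/_cat/indices?h=index,store.size&bytes=b"
}

# Sum the second column (bytes) of the lines on stdin.
sum_bytes() {
  awk '{ total += $2 } END { printf "%.0f\n", total }'
}

# Total size in bytes of the filesystem that holds the given path.
volume_bytes() {
  df -Pk "$1" | awk 'NR == 2 { printf "%.0f\n", $2 * 1024 }'
}

# Usage:
#   echo "indices: $(index_sizes | sum_bytes) bytes"
#   echo "volume:  $(volume_bytes "$ES_DATA") bytes"
```

Leave headroom when comparing the two numbers, since Elasticsearch starts restricting itself well before the volume is completely full.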

I would not start the Graylog service until the Elasticsearch status is good. Use curl to make sure your indices are good, no errors are found, the system status is good, etc.
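
A small sketch of such a check, assuming a node at 127.0.0.1:9200 (ES_URL and the function names are illustrative). It pulls the cluster health, extracts the status field, and lists any indices that are red.

```shell
#!/usr/bin/env bash
ES_URL="${ES_URL:-http://127.0.0.1:9200}"   # assumption: local node

# Fetch the cluster health document.
cluster_health() {
  curl -s "$ES_URL/_cluster/health?pretty"
}

# Extract the "status" value (green, yellow or red) from health JSON on stdin.
health_status() {
  sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([a-z]*\)".*/\1/p' | head -n 1
}

# List only the indices whose health is red.
red_indices() {
  curl -s "$ES_URL/_cat/indices?v&health=red"
}

# Usage:
#   cluster_health | health_status
#   red_indices
```

If the status comes back red, or red_indices lists anything, I would sort that out before starting Graylog again.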

And if you do get this going, please pay attention to the errors shown; you could have avoided this two months ago.

Notice it states "Could not resolve host: es_node". Unless you named your Elasticsearch node “es_node”, this will not work. It should be an IP address, FQDN, or localhost.
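
If you are not sure which address to use, one option is to read it from the Graylog configuration. This is only a sketch: the config path is an assumption, the function name is illustrative, and it falls back to http://127.0.0.1:9200 (the address your Graylog log shows it trying) when elasticsearch_hosts is not set.

```shell
#!/usr/bin/env bash
# Print the elasticsearch_hosts value from server.conf, or
# http://127.0.0.1:9200 when the setting is absent or commented out.
# The default path is an assumption (typical package install).
es_hosts() {
  local conf="${1:-/etc/graylog/server/server.conf}"
  local value
  value=$(sed -n 's/^[[:space:]]*elasticsearch_hosts[[:space:]]*=[[:space:]]*//p' "$conf" | tail -n 1)
  printf '%s\n' "${value:-http://127.0.0.1:9200}"
}

# Usage (takes the first host if several are listed):
#   curl -s "$(es_hosts | cut -d, -f1)/_cluster/health?pretty"
```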

I think you need to read this documentation.

Grayuser78 · December 2, 2022, 7:56am

Dear @gsmith,
Thanks for the additional information.
But as I noted, I have always monitored the hard disk space of the server, and it has never gone over 95%. Hmm… it seems that I need to set up the server again, but I will lose the log data again :((

joe.gross · December 5, 2022, 4:57pm

It does not need to go over 95%. If you hit the high watermark, it will stop accepting new logs.
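
To make the thresholds concrete, here is a rough sketch that classifies a disk usage percentage against Elasticsearch's default watermarks (low 85%, high 90%, flood stage 95%). The function names are illustrative, and the thresholds are the defaults, so a cluster with custom cluster.routing.allocation.disk.watermark.* settings will differ. At the low watermark no new shards are allocated to the node, which on a single node can already stop a freshly rotated index from being created. At flood stage the indices are made read-only.

```shell
#!/usr/bin/env bash
# Classify an integer disk usage percentage against Elasticsearch's default
# watermarks (low 85%, high 90%, flood stage 95%).
disk_stage() {
  local used="$1"
  if   [ "$used" -ge 95 ]; then echo "flood_stage"
  elif [ "$used" -ge 90 ]; then echo "high"
  elif [ "$used" -ge 85 ]; then echo "low"
  else echo "ok"
  fi
}

# Current usage of the filesystem holding a path, as an integer percent.
disk_used_percent() {
  df -P "$1" | awk 'NR == 2 { sub(/%/, "", $5); print $5 }'
}

# Usage (the data path is an assumption):
#   disk_stage "$(disk_used_percent /var/lib/elasticsearch)"
```

The _cat/allocation?v endpoint reports the same usage per node as Elasticsearch itself sees it.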

system · December 19, 2022, 4:58pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
