I collect a very large volume of logs. When I do a search (last 5, 10, 15 minutes …), I do not receive the logs. (Just for information, my log servers are powerful, i.e. it is not a resource problem on the server side.)
While retrieving data for this widget, the following error(s) occurred:
- Connection refused (Connection refused).
Please Help !!!
In this case, is it necessary to add more resources to Elasticsearch, or to adjust the output settings from Graylog to Elasticsearch? Can somebody please explain it to me?
Based on the messages it looks like your Elasticsearch instance/cluster is, at the very least, not accessible, and possibly also not healthy or not able to keep up (resources) with the amount of data it is receiving. It appears the Graylog server is journaling messages more quickly than Elasticsearch can process them, hence the warning about messages being flushed from the journal before they can be indexed.
Make sure ES is reachable by Graylog. If it is, check health and performance.
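A quick way to check both at once is the `GET _cluster/health` endpoint. Here's a minimal sketch of interpreting its response; the field names follow the standard Elasticsearch health API, but the sample dict is illustrative, not real cluster output:

```python
# Interpret an Elasticsearch GET _cluster/health response.
# In practice you would fetch it first, e.g.:
#   curl http://localhost:9200/_cluster/health
# The sample dict below is illustrative, not real cluster output.

def summarize_health(health: dict) -> str:
    """Return a short verdict from a _cluster/health response."""
    status = health.get("status", "unknown")
    unassigned = health.get("unassigned_shards", 0)
    pending = health.get("number_of_pending_tasks", 0)
    if status == "red":
        return "red: at least one primary shard is unassigned - searches can miss data"
    if status == "yellow" or unassigned > 0:
        return f"yellow: {unassigned} unassigned replica shard(s)"
    if pending > 0:
        return f"green, but {pending} pending cluster task(s) - master may be busy"
    return "green: all shards assigned"

sample = {"status": "green", "unassigned_shards": 0, "number_of_pending_tasks": 0}
print(summarize_health(sample))  # green: all shards assigned
```

Note that "green" only means all shards are assigned; it says nothing about indexing throughput, so a green cluster can still fall behind.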
Thanks for your feedback. ES is reachable by Graylog and its status is green.
I need a solution. The problem is that I receive a lot of logs.
Is ES served up on the same host as Graylog, or is it a separate host? If they are on separate hosts, then messages queueing up on Graylog faster than ES can index them could point to a problem with the network connection between them.
How is the performance of Graylog? Do you have any pipeline rules?
Graylog and Elasticsearch are on the same host! I assure you there is no network problem; Graylog has enough RAM (12 GB) and 8 vCPUs, and a large amount of storage…
Ok. So where is the issue? Do you have any pipeline rules? Is the storage able to keep up with both journaling and indexing messages?
Now it is working properly. I can load searches (at least the last 5 minutes …) and it is OK.
No, I don’t have any pipeline rules. I keep the logs for a period of 4 months. I used to rotate indices monthly, but they were getting heavy, so I now rotate them daily (still keeping 4 months).
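Worth noting: daily rotation with 4 months of retention multiplies the number of indices, and every index carries its own shards, which Elasticsearch must manage. A back-of-the-envelope sketch; the shards-per-index and replica counts are illustrative defaults, not necessarily this cluster's settings:

```python
# Rough shard-count comparison for monthly vs daily index rotation.
# Assumes 4 shards per index and 1 replica (illustrative defaults,
# not necessarily this cluster's configuration).

def total_shards(indices: int, shards_per_index: int = 4, replicas: int = 1) -> int:
    """Total shard copies the cluster must manage."""
    return indices * shards_per_index * (1 + replicas)

retention_days = 120  # ~4 months
monthly = total_shards(indices=4)             # 4 monthly indices
daily = total_shards(indices=retention_days)  # 120 daily indices

print(f"monthly rotation: {monthly} shards")  # 32
print(f"daily rotation:   {daily} shards")    # 960
```

So switching from monthly to daily rotation at the same retention can raise the shard count by roughly 30x, and oversharding is itself a common cause of slow searches and cluster overhead.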
If there is no evidence of compute exhaustion, there’s no network issue to worry about, and you don’t have any special processing like pipeline rules, then I would suspect storage performance. You mention that you are processing a lot of messages. If you look at iotop/atop/top (the wa column), do you see many processes waiting for disk?
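If you'd rather script it than eyeball `top`, the iowait ("wa") figure can be pulled out of the CPU summary line. A minimal sketch against a captured line; the sample line and the 20% threshold are illustrative, and on a live box you'd feed in the `%Cpu(s):` line from `top -bn1`:

```python
import re

# Extract the iowait ("wa") percentage from a `top` CPU summary line.
# The sample line below is illustrative; a sustained high wa value
# means the CPU is mostly idle waiting on disk I/O.

def iowait_pct(cpu_line: str) -> float:
    match = re.search(r"([\d.]+)\s*wa", cpu_line)
    if not match:
        raise ValueError("no 'wa' field found")
    return float(match.group(1))

sample = "%Cpu(s):  3.1 us,  1.0 sy,  0.0 ni, 62.4 id, 33.2 wa,  0.0 hi,  0.3 si,  0.0 st"
wa = iowait_pct(sample)
print(f"iowait: {wa}%")                           # iowait: 33.2%
print("disk-bound" if wa > 20 else "looks ok")    # threshold is illustrative
```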
Related, we found this guide very valuable for monitoring performance to identify any issues:
We also followed the Elasticsearch sizing guidelines for our deployment to ensure that we are within, or as near as possible to, the recommendations for shard count/size, and adjusted the Graylog indices as necessary.
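For reference, the commonly cited Elasticsearch guidance is to keep individual shards roughly in the 10–50 GB range. A small sketch that flags outliers; the index names and sizes below are made up, and in practice you'd pull real sizes from `GET _cat/shards?bytes=gb`:

```python
# Flag shards outside the commonly recommended 10-50 GB range.
# Index names and sizes are illustrative, not from a real cluster.

LOW_GB, HIGH_GB = 10, 50

def check_shard_sizes(shard_sizes_gb: dict) -> list:
    """Return (index, size, verdict) tuples for out-of-range shards."""
    issues = []
    for index, size in shard_sizes_gb.items():
        if size < LOW_GB:
            issues.append((index, size, "too small - fewer shards or longer rotation"))
        elif size > HIGH_GB:
            issues.append((index, size, "too large - more shards or shorter rotation"))
    return issues

sample = {
    "graylog_2019_01": 72.0,     # monthly index, oversized shard
    "graylog_2019_05_04": 0.6,   # daily index, tiny shard
    "graylog_2019_05_05": 25.0,  # in range
}
for index, size, verdict in check_shard_sizes(sample):
    print(f"{index}: {size} GB -> {verdict}")
```

This also shows the trade-off from earlier in the thread: monthly rotation tends toward oversized shards, daily rotation toward many tiny ones, so the rotation period is really a shard-sizing knob.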
I’ll check the document and get back to you , thanks @ttsandrew
For posterity, this was resolved over in this thread.
yes , thanks a lot @ttsandrew
This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.