I am having an issue where messages always take a very long time (>3 minutes) to show up on the search page. During this time the page is unresponsive and the browser (Firefox or Chrome) pops up the “page unresponsive” dialog. After the first load, searching is quick and everything is normal until the page is reloaded. No log messages are generated in the Graylog server log, the Elasticsearch log, the MongoDB log or the browser dev console when this issue occurs.
Description of steps you’ve taken to attempt to solve the issue
Switched from OpenJDK 11 to OpenJDK 8
Upgraded from MongoDB 4.1 to 4.2
Upgraded from ES 6.8 to 7.10
Removed unneeded plugins
Removed unneeded content packs
Recalculated index ranges
Rotated active write index
Environmental information
Intel Xeon E3-1271 v3
32GB RAM
8x1TB SSD in RAID10
Mongo, Elastic and Graylog all on this server
I might be able to help. I need to ask a couple of questions that may pertain to web loading problems.
What volume of logs are you ingesting per hour/day?
What do your index settings look like? Do you have multiple index sets besides the default ones?
What are your log retention settings?
Do you have a lot of extractors configured? If so, what type and how many do you have?
What is your Java heap setting for Graylog?
Judging from your environmental information, if you are ingesting a lot of logs, keep in mind that Elasticsearch is resource intensive. That is why most community members separate ES from Graylog/MongoDB in a situation like that. I have also seen cases where people over-sharded their server, causing lag and an unresponsive interface. You can find out why here
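To put a number on the over-sharding question, here is a minimal Python sketch that counts shards per node from the output of Elasticsearch's `_cat/shards?format=json` API. The function names, the `localhost:9200` address and the threshold of roughly 20 shards per GB of JVM heap are assumptions for illustration (the threshold is a commonly quoted rule of thumb, not a hard limit), so adjust them to your setup.

```python
import json
import urllib.request
from collections import Counter

# Assumption: rule of thumb of roughly 20 shards per GB of JVM heap per node.
SHARDS_PER_GB_HEAP = 20


def fetch_shards(base_url="http://localhost:9200"):
    """Return the rows of GET /_cat/shards?format=json as a list of dicts."""
    with urllib.request.urlopen(f"{base_url}/_cat/shards?format=json") as resp:
        return json.load(resp)


def shards_per_node(rows):
    """Count assigned shards per node; unassigned shards have no node name."""
    return Counter(row["node"] for row in rows if row.get("node"))


def over_sharded_nodes(rows, heap_gb, per_gb=SHARDS_PER_GB_HEAP):
    """Return {node: shard_count} for nodes holding more than heap_gb * per_gb shards."""
    limit = heap_gb * per_gb
    return {node: n for node, n in shards_per_node(rows).items() if n > limit}
```

For example, `over_sharded_nodes(fetch_shards(), heap_gb=8)` would list any node holding more than 160 shards, where `heap_gb` is the Elasticsearch heap size on that node.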
I also recall reading somewhere that Graylog does not like lengthy log messages. I have some Windows Active Directory logs like this, which are quite lengthy. Could this be related?
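One way to check whether message length is a factor is to measure it. The sketch below assumes you have a set of messages available as Python dicts (for example, parsed from an export of a search result); the helper names are hypothetical and it only reports sizes, it does not talk to Graylog.

```python
def field_sizes(message):
    """Return {field: size in bytes} for one message, with values UTF-8 encoded."""
    return {k: len(str(v).encode("utf-8")) for k, v in message.items()}


def largest_messages(messages, top=5):
    """Return (total_bytes, message) pairs for the largest messages, largest first."""
    sized = [(sum(field_sizes(m).values()), m) for m in messages]
    return sorted(sized, key=lambda pair: pair[0], reverse=True)[:top]
```

If the largest messages turn out to be many kilobytes each, the first page load has to transfer and render all of them, which would fit a slow first load followed by quick searches.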
Same here, I have a CentOS 7 server with 14 CPU cores, 12 GB RAM, and a 500 GB disk. It ingests 30 GB of logs a day with 30 days of retention, using TCP/TLS for the web UI and inputs, but no problems, or I should say not yet.
EDIT: At a quick glance, here are my Windows logs for an hour.
The number of shards looks fine for the resources you have. To be honest, you shouldn’t have a problem. As you stated, it takes 3+ minutes for messages to show in the web UI, but alerts are fine.
Clear your browser cache and reload the tab?
Check firewall?
Check SELinux/AppArmor?
Do you have a proxy (Nginx, Apache) in front of Graylog?
Have you checked your network performance when loading the Web UI Search page?
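To separate network or server delay from time spent in the browser, it can help to time the same request outside the browser. This is a minimal sketch using only the Python standard library; `time_call` and `fetch` are hypothetical helper names, and the URL and headers are whatever the browser's network tab shows for the slow search request.

```python
import time
import urllib.request


def time_call(fn, *args, **kwargs):
    """Run fn and return (elapsed_seconds, result)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return time.perf_counter() - start, result


def fetch(url, headers=None, timeout=300):
    """GET url and return (status, body_size_in_bytes)."""
    req = urllib.request.Request(url, headers=headers or {})
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status, len(resp.read())
```

Calling `time_call(fetch, url, headers=...)` returns the elapsed seconds along with the status and response size. If the request completes in seconds here but the page still hangs for minutes, the delay is more likely in the browser rendering the response than in the network or the server.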
I have searched the forum for other cases where the web interface is slow or unresponsive. I think we’re missing something, but I’m not sure what. My system does not have a problem like that, and I think we ingest about the same amount of logs, but you have triple the resources I do.
BTW, I have two sets of Windows Active Directory servers sending logs to Graylog, so I know what you mean about large messages. We also turned up the audit logging on all four MSAD servers to get even more logs.
It might be a configuration issue, though I’m not sure how you set your system up.