Web interface stops responding intermittently

gsmith · December 23, 2021, 2:00am

To check, use top/htop and see whether the added CPUs show up. If they do, you should be all good. I’ve been looking into your issue all day and re-reading what’s been posted here. It could be a couple of things, but I want to take one step at a time: resources first, and if it’s not that, we move on to configuration.

sparrowhawk · December 23, 2021, 2:02am

Htop screenshot below. Still can’t access the web interface though.

sparrowhawk · December 23, 2021, 2:03am

Thanks gsmith, I really appreciate your help on this, by the way.

gsmith · December 23, 2021, 2:07am

No problem, glad to help. Give me a few minutes, I want to do some research. If you get a chance, those log files would help, but make sure you cut/blur out personal info.

sparrowhawk · December 23, 2021, 2:09am

Thanks. What log files do you need, and how do I get them?

gsmith · December 23, 2021, 2:18am

The Graylog logs, since we’re dealing with the Web UI.

One more idea: it could be a network port conflict. I’ve seen this before when two or more services try to use the same port, for example port 9000. Just make sure your Graylog web UI and Elasticsearch are using different ports.

You can find out by using something like this.

sudo lsof -i -P -n | grep LISTEN

Results,

If those are all good, the next place to look is your Graylog configuration file.
Also check your firewall and SELinux/AppArmor, if they’re enabled.

sparrowhawk · December 23, 2021, 2:23am

I have probably misled you regarding the default port for ES; I think that is still 9200. The web interface is on port 9000.

systemd-r  797 systemd-resolve   13u  IPv4  19826      0t0  TCP 127.0.0.53:53 (LISTEN)
mongod     945         mongodb   11u  IPv4  20270      0t0  TCP 127.0.0.1:27017 (LISTEN)
java       999   elasticsearch  124u  IPv6  27295      0t0  TCP [::1]:9300 (LISTEN)
java       999   elasticsearch  126u  IPv6  26217      0t0  TCP 127.0.0.1:9300 (LISTEN)
java       999   elasticsearch  151u  IPv6  26242      0t0  TCP 127.0.0.1:9200 (LISTEN)
java       999   elasticsearch  152u  IPv6  26241      0t0  TCP [::1]:9200 (LISTEN)
sshd      1023            root    3u  IPv4  22797      0t0  TCP *:22 (LISTEN)
sshd      1023            root    4u  IPv6  22799      0t0  TCP *:22 (LISTEN)
nginx     1066            root    8u  IPv4  23022      0t0  TCP *:80 (LISTEN)
nginx     1066            root    9u  IPv6  23023      0t0  TCP *:80 (LISTEN)
nginx     1066            root   10u  IPv4  23024      0t0  TCP *:443 (LISTEN)
nginx     1067        www-data    8u  IPv4  23022      0t0  TCP *:80 (LISTEN)
nginx     1067        www-data    9u  IPv6  23023      0t0  TCP *:80 (LISTEN)
nginx     1067        www-data   10u  IPv4  23024      0t0  TCP *:443 (LISTEN)
nginx     1068        www-data    8u  IPv4  23022      0t0  TCP *:80 (LISTEN)
nginx     1068        www-data    9u  IPv6  23023      0t0  TCP *:80 (LISTEN)
nginx     1068        www-data   10u  IPv4  23024      0t0  TCP *:443 (LISTEN)
nginx     1070        www-data    8u  IPv4  23022      0t0  TCP *:80 (LISTEN)
nginx     1070        www-data    9u  IPv6  23023      0t0  TCP *:80 (LISTEN)
nginx     1070        www-data   10u  IPv4  23024      0t0  TCP *:443 (LISTEN)
nginx     1071        www-data    8u  IPv4  23022      0t0  TCP *:80 (LISTEN)
nginx     1071        www-data    9u  IPv6  23023      0t0  TCP *:80 (LISTEN)
nginx     1071        www-data   10u  IPv4  23024      0t0  TCP *:443 (LISTEN)
java      1182         graylog   92u  IPv6  28853      0t0  TCP x.x.x.x:9000 (LISTEN)

gsmith · December 23, 2021, 2:34am

Oh, you have Nginx installed. I assume you’re using Nginx as a proxy?

I would look into the Nginx logs as well. A while back a community member had an issue similar to yours, and it ended up being something with Nginx; I’m not sure whether they fixed it. I’ll try to dig that thread up, maybe something in there will help.

EDIT: Check your firewall and SELinux/AppArmor, if they’re enabled.

Here is my Lab GL Server configuration. Maybe something in there might help.

[root@graylog elasticsearch]# cat /etc/graylog/server/server.conf | egrep -v "^\s*(#|$)"
is_master = true
node_id_file = /etc/graylog/server/node-id
password_secret = 8-6-7-5-3-0-9_something_something
root_password_sha2 =
root_email = "greg.smith@domain.com"
root_timezone = America/Chicago
bin_dir = /usr/share/graylog-server/bin
data_dir = /var/lib/graylog-server
plugin_dir = /usr/share/graylog-server/plugin
http_bind_address = 10.10.10.10:9000
http_enable_cors = true
elasticsearch_hosts = http://10.10.10.10:9200
rotation_strategy = count
elasticsearch_max_docs_per_index = 20000000
elasticsearch_max_number_of_indices = 20
retention_strategy = delete
elasticsearch_shards = 4
elasticsearch_replicas = 0
elasticsearch_index_prefix = graylog
allow_leading_wildcard_searches = true
allow_highlighting = false
elasticsearch_analyzer = standard
output_batch_size = 5000
output_flush_interval = 1
output_fault_count_threshold = 5
output_fault_penalty_seconds = 30
processbuffer_processors = 5
outputbuffer_processors = 3
processor_wait_strategy = blocking
ring_size = 65536
inputbuffer_ring_size = 65536
inputbuffer_processors = 2
inputbuffer_wait_strategy = blocking
message_journal_enabled = true
message_journal_dir = /var/lib/graylog-server/journal
message_journal_max_size = 12gb
lb_recognition_period_seconds = 3
mongodb_uri = mongodb://mongo_admin:password@localhost:27017/graylog
mongodb_max_connections = 1000
mongodb_threads_allowed_to_block_multiplier = 5
transport_email_enabled = true
transport_email_hostname = localhost
transport_email_port = 25
transport_email_subject_prefix = [graylog]
transport_email_from_email = root@domain.com
transport_email_web_interface_url = http://10.10.10.10:9000
http_connect_timeout = 10s
proxied_requests_thread_pool_size = 32
prometheus_exporter_enabled = true
prometheus_exporter_bind_address = graylog.domain.com:9833

sparrowhawk · December 23, 2021, 3:20pm

Nginx is loaded but both the access and error logs are empty.

SELinux is not running, but AppArmor is.

Here’s the result of your cat command on my GL config file.

root@SystemsLoggingGraylog-Live:/etc/graylog/server# cat /etc/graylog/server/server.conf | egrep -v "^\s*(#|$)"
is_master = true
node_id_file = /etc/graylog/server/node-id
password_secret = ***********************
root_password_sha2 = ********************
bin_dir = /usr/share/graylog-server/bin
data_dir = /var/lib/graylog-server
plugin_dir = /usr/share/graylog-server/plugin
http_bind_address = 10.10.0.131:9000
rotation_strategy = count
elasticsearch_max_docs_per_index = 20000000
elasticsearch_max_number_of_indices = 20
retention_strategy = delete
elasticsearch_shards = 4
elasticsearch_replicas = 0
elasticsearch_index_prefix = graylog
allow_leading_wildcard_searches = false
allow_highlighting = false
elasticsearch_analyzer = standard
output_batch_size = 500
output_flush_interval = 1
output_fault_count_threshold = 5
output_fault_penalty_seconds = 30
processbuffer_processors = 5
outputbuffer_processors = 3
processor_wait_strategy = blocking
ring_size = 65536
inputbuffer_ring_size = 65536
inputbuffer_processors = 2
inputbuffer_wait_strategy = blocking
message_journal_enabled = true
message_journal_dir = /var/lib/graylog-server/journal
message_journal_max_size = 6gb
lb_recognition_period_seconds = 3
mongodb_uri = mongodb://localhost/graylog
mongodb_max_connections = 1000
mongodb_threads_allowed_to_block_multiplier = 5
proxied_requests_thread_pool_size = 32

There are a few lines missing, and some differences in values that I’ve highlighted.

Thanks
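
The two filtered configs above can also be compared mechanically. A minimal sketch, assuming both files are available locally as plain text; `active_settings` and `compare_settings` are hypothetical helper names, not Graylog tools.

```shell
#!/bin/sh
# Print only the active (non-comment, non-blank) lines of a config file,
# sorted so that setting order does not affect the comparison.
active_settings() {
  grep -Ev '^[[:space:]]*(#|$)' "$1" | sort
}

# Show the settings that differ between two config files.
# Lines starting with "<" come from the first file, ">" from the second.
compare_settings() {
  a=$(mktemp)
  b=$(mktemp)
  active_settings "$1" > "$a"
  active_settings "$2" > "$b"
  # grep exits non-zero when nothing differs; that is not an error here.
  diff "$a" "$b" | grep '^[<>]' || true
  rm -f "$a" "$b"
}
# Usage (file names are placeholders): compare_settings lab-server.conf my-server.conf
```

A setting that exists in only one file appears as a single "<" or ">" line, which makes missing lines as easy to spot as changed values.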

gsmith · January 1, 2022, 6:08am

Hello,

A couple of things I noticed in your config file. Going by the processor settings you showed above, I would expect you to have 10 physical CPU cores. I believe each of those settings (processbuffer_processors, outputbuffer_processors and inputbuffer_processors) creates that many threads, and it is recommended that their combined total is no greater than your number of physical CPU cores. Your config has 5 + 3 + 2 = 10, so in other words you should have 10 physical cores or more.

And

If you’re receiving that many messages, I would definitely kick it up a notch. One of my small Graylog servers in production is running 14 CPUs, 10 GB of memory and a TB of storage. It’s receiving about 30 GB of logs a day and is in a happy place, using the following configuration. The three values add up to 12, and I reserved 2 cores for the server.

processbuffer_processors = 7
outputbuffer_processors = 3
inputbuffer_processors = 2

I’m not 100% sure, but you may need more than 4 CPU cores.
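
The rule of thumb above can be checked with a short script. This is only a sketch: `sum_processors` and `check_processors` are hypothetical helper names, and the core count is passed in by hand because a tool such as nproc reports logical CPUs (vCPUs on a cloud instance), which may not match physical cores.

```shell
#!/bin/sh
# Add up the three *_processors settings found in a Graylog server.conf.
# Commented-out lines are ignored because the setting name must start the line.
sum_processors() {
  awk -F= '
    /^[ \t]*(processbuffer|outputbuffer|inputbuffer)_processors[ \t]*=/ {
      gsub(/[ \t]/, "", $2)   # strip whitespace around the value
      total += $2
    }
    END { print total + 0 }
  ' "$1"
}

# Compare that total against a core count supplied by the caller.
check_processors() {
  conf=$1
  cores=$2
  total=$(sum_processors "$conf")
  if [ "$total" -gt "$cores" ]; then
    echo "over: $total buffer processors for $cores cores"
  else
    echo "ok: $total buffer processors for $cores cores"
  fi
}
# Usage: check_processors /etc/graylog/server/server.conf 4
```

With the values posted earlier in the thread (5, 3 and 2) and 4 cores, this reports the configuration as over the suggested limit.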

sparrowhawk · January 4, 2022, 11:55am

Hey gsmith, happy New Year, I hope you had a good one?

This server is an EC2 t2.xlarge instance with 4 vCPUs. Pardon my lack of knowledge about AWS, but how does that relate to physical CPUs? I’ve upgraded the instance to a t2.2xlarge with 8 vCPUs but I still can’t get the web interface to respond.

Thanks

gsmith · January 4, 2022, 10:47pm

Thank you, I hope you had a good one also.

A couple of things to try.

Check your services:
systemctl status graylog-server
Check the Graylog log file. This would be a good place to start to find out why and what went wrong.

If you don’t see anything in the log that pertains to the issue, try restarting the Graylog service while tailing the Graylog log file. If I understand you correctly, you cannot get the Web UI to respond at all? Can you log in to the Web UI?

Is your Graylog configuration still the same as shown above? If not, can you repost your new config here?
Also, when I add CPU or memory to a virtual machine, a reboot sometimes helps.
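
The checks above can be sketched as follows. The log path /var/log/graylog-server/server.log is an assumption (the usual location for package installs; adjust it if your setup differs), and `recent_problems` is a hypothetical helper that assumes each log line carries its level (ERROR or WARN) as a separate word.

```shell
#!/bin/sh
# 1. Check whether the service is running:
#      sudo systemctl status graylog-server
# 2. Restart it and follow the log while retrying the web UI:
#      sudo systemctl restart graylog-server
#      sudo tail -f /var/log/graylog-server/server.log

# Hypothetical helper: print the last N ERROR/WARN lines from the log.
# The default path is an assumption; pass another path as the first argument.
recent_problems() {
  log=${1:-/var/log/graylog-server/server.log}
  count=${2:-20}
  grep -E ' (ERROR|WARN) ' "$log" | tail -n "$count"
}
# Usage: recent_problems /var/log/graylog-server/server.log 50
```

Filtering to warnings and errors keeps a post to the forum short, which also makes it easier to blur out personal information before sharing.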

system · January 18, 2022, 10:48pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
