Graylog 3.0 Upgrade Problem

Hello All,

The problem occurred after upgrading from Graylog 2.5.x (latest version) to Graylog 3.0.0+.

The upgrade went well and the Graylog services started right up. I used the following links to apply the upgrade.

http://docs.graylog.org/en/3.0/pages/upgrade/graylog-3.0.html

http://docs.graylog.org/en/3.0/pages/installation/operating_system_packages.html

After logging into the web interface, I noticed my Inputs are no longer running and I’m unable to start them, but I’m still receiving messages from those Inputs. The certificates also seem to be working with the GELF_TCP/TLS Inputs.

Under System/Nodes I receive the following error:

“Getting plugins on node "58aba0a0-9aee-4a8b-b6d6-1a75394fbab1" failed: Error: cannot GET https://<FQDN>:9000/api/cluster/58aba0a0-9aee-4a8b-b6d6-1a75394fbab1/plugins (500)”

Error: cannot GET https://<FQDN>:9000/api/cluster/58aba0a0-9aee-4a8b-b6d6-1a75394fbab1/jvm (500) Check your Graylog logs for more information.

Graylog Server Logs:

2019-02-25T19:06:27.463-06:00 WARN [ProxiedResource] Unable to call https://<FQDN>:9000/api/system/metrics/multiple on node <58aba0a0-9aee-4a8b-b6d6-1a75394fbab1>

javax.net.ssl.SSLPeerUnverifiedException: Hostname not verified:

2019-02-25T19:06:18.786-06:00 WARN [ProxiedResource] Unable to call https://<FQDN>:9000/api/system on node <58aba0a0-9aee-4a8b-b6d6-1a75394fbab1>

javax.net.ssl.SSLPeerUnverifiedException: Hostname not verified:

I created self-signed certificates following the link below and have been using them without any problems until now.

My environment:

  • Virtual machine with CentOS 7, all packages fully updated. Hardware: 6 processors, 8 GB RAM, and 1 TB HDD.
  • graylog-server-3.0.0-12.noarch
  • elasticsearch-6.6.1-1.noarch
  • mongodb-org-4.0.6-1.el7.x86_64

Graylog configuration file:

is_master = true
node_id_file = /etc/graylog/server/node-id
password_secret =qGcHdYTZSVIQMInA056Te0uSZLtyvDqt3hdVmTWXFM1rAocHR5E9dgnm3TTd5Wy5uOin3neYQhAvvqlfAPgEe2NdgHdTQl2c
root_password_sha2 =ce1dedff58447c834034af15c7c139aa1ad6149366ad8c87984058ae98ae4dae
root_timezone = America/Chicago
bin_dir = /usr/share/graylog-server/bin
data_dir = /var/lib/graylog-server
plugin_dir = /usr/share/graylog-server/plugin
http_bind_address =FQDN:9000
http_enable_tls = true
http_tls_cert_file =/etc/graylog/graylog-certificate.pem
http_tls_key_file =/etc/graylog/graylog-key.pem
http_tls_key_password =secret
elasticsearch_hosts = http://IP ADDRESS:9200
rotation_strategy = count
elasticsearch_max_docs_per_index = 20000000
elasticsearch_max_number_of_indices = 20
retention_strategy = delete
elasticsearch_shards = 4
elasticsearch_replicas = 0
elasticsearch_index_prefix = graylog
allow_leading_wildcard_searches = false
allow_highlighting = false
elasticsearch_analyzer = standard
output_batch_size = 500
output_flush_interval = 1
output_fault_count_threshold = 5
output_fault_penalty_seconds = 30
processbuffer_processors = 5
outputbuffer_processors = 3
processor_wait_strategy = blocking
ring_size = 65536
inputbuffer_ring_size = 65536
inputbuffer_processors = 2
inputbuffer_wait_strategy = blocking
message_journal_enabled = true
message_journal_dir = /var/lib/graylog-server/journal
message_journal_max_size = 5gb
lb_recognition_period_seconds = 3
mongodb_uri = mongodb://localhost/graylog
mongodb_max_connections = 1000
mongodb_threads_allowed_to_block_multiplier = 5
proxied_requests_thread_pool_size = 32

I have tried recreating the certificates, and the problem still occurs. Any advice would be appreciated.

I guess that your certificate does not contain the IP you get when Graylog resolves the hostname.

http://docs.graylog.org/en/3.0/pages/configuration/server.conf.html#web-rest-api

According to the docs, http_bind_address should be an IP …

@jan
For testing, I created a fresh install with the same hardware configuration as the virtual machine above. I also created the certificates the same way as before, which worked on the previous Graylog 2.5 server.

My Graylog server configuration file is shown below:

is_master = true
node_id_file = /etc/graylog/server/node-id
password_secret =qGcHdYTZSVIQMInA056Te0uSZLtyvDqt3hdVmTWXFM1rAocHR5E9dgnm3TTd5Wy5uOin3neYQhAvvqlfAPgEe2NdgHdTQl2c
root_password_sha2 =ce1dedff58447c834034af15c7c139aa1ad6149366ad8c87984058ae98ae4dae
root_timezone = America/Chicago
bin_dir = /usr/share/graylog-server/bin
data_dir = /var/lib/graylog-server
plugin_dir = /usr/share/graylog-server/plugin
http_bind_address =10.200.6.21:9000
http_enable_tls = true
http_tls_cert_file =/etc/graylog/graylog-certificate.pem
http_tls_key_file =/etc/graylog/graylog-key.pem
http_tls_key_password =secret
elasticsearch_hosts = http://10.200.6.21:9200
rotation_strategy = count
elasticsearch_max_docs_per_index = 20000000
elasticsearch_max_number_of_indices = 20
retention_strategy = delete
elasticsearch_shards = 4
elasticsearch_replicas = 0
elasticsearch_index_prefix = graylog
allow_leading_wildcard_searches = false
allow_highlighting = false
elasticsearch_analyzer = standard
output_batch_size = 500
output_flush_interval = 1
output_fault_count_threshold = 5
output_fault_penalty_seconds = 30
processbuffer_processors = 5
outputbuffer_processors = 3
processor_wait_strategy = blocking
ring_size = 65536
inputbuffer_ring_size = 65536
inputbuffer_processors = 2
inputbuffer_wait_strategy = blocking
message_journal_enabled = true
message_journal_dir = /var/lib/graylog-server/journal
message_journal_max_size = 5gb
lb_recognition_period_seconds = 3
mongodb_uri = mongodb://localhost/graylog
mongodb_max_connections = 1000
mongodb_threads_allowed_to_block_multiplier = 5
proxied_requests_thread_pool_size = 32

I still receive the following error.

2019-02-26T16:41:38.719-06:00 WARN [ProxiedResource] Unable to call https://10.200.6.21:9000/api/system/metrics/multiple on node <58aba0a0-9aee-4a8b-b6d6-1a75394fbab1>
javax.net.ssl.SSLPeerUnverifiedException: Hostname 10.200.6.21 not verified:
certificate: sha256/lrL7c+PZE2Hxo5Fq/nK7F4FDrG+yRrVvDtNrt2E7CLY=
DN: CN=test-install.enseva-labs.net, OU="Computers ", O=enseva-labs, L=cedar rapids, ST=iowa, C=us

Test with curl

[root@test-install server]# curl -k 'https://test-install.enseva-labs.net:9000/api/?pretty=true'
{
"cluster_id" : "d8d5cc3d-72cd-4104-ac0e-57431230c052",
"node_id" : "58aba0a0-9aee-4a8b-b6d6-1a75394fbab1",
"version" : "3.0.0+db6cf59",
"tagline" : "Manage your logs in the dark and have lasers going and make it look like you're from space!"
}[root@test-install server]#

Any more suggestions would be appreciated.

Hostname 10.200.6.21 not verified:

Did you add the certificate or the CA to the Java keystore so that it can be verified?

@jan
Yes, I did. The keystore and certs did not change from the previous version, Graylog 2.5. The only thing I did was upgrade Graylog to 3.0.0+, along with Elasticsearch and MongoDB.

I have tried two different configurations, and I get the same results.
http_bind_address =10.200.6.21:9000
And
http_bind_address =test-install.enseva-labs.net:9000

I’m not sure, but do you think it might be Java??

[root@test-install security]# java -version
openjdk version "1.8.0_191"
OpenJDK Runtime Environment (build 1.8.0_191-b12)
OpenJDK 64-Bit Server VM (build 25.191-b12, mixed mode)
[root@test-install security]#

Did you check whether the startup parameters still contain the reference to your custom keystore?
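
One way to check this is to look at the arguments of the running JVM for a `-Djavax.net.ssl.trustStore=...` option. The helper below is only a sketch: the name `truststore_from_args` is made up for illustration, and it assumes the option, if present, is passed on the java command line.

```shell
# Print the value of -Djavax.net.ssl.trustStore from a JVM command line.
# Prints nothing when the option is absent, in which case the JVM uses the
# default "cacerts" trust store that ships with the JRE.
truststore_from_args() {
  printf '%s\n' "$1" | tr ' ' '\n' |
    sed -n 's/^-Djavax\.net\.ssl\.trustStore=//p' | head -n 1
}

# Example against the running java processes (Linux ps syntax):
#   truststore_from_args "$(ps -o args= -C java)"
```

If a path is printed, `keytool -list -keystore <that path>` should show whether the Graylog certificate or its CA is in there. On RPM installs the Java options are typically set through `GRAYLOG_SERVER_JAVA_OPTS` in `/etc/sysconfig/graylog-server`, so that file is worth comparing before and after the upgrade.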

@jan
I have been using the “cacerts”. What confuses me is why this stopped working after the upgrade. I’m not sure what could have changed to cause this error. Everything seems to be functioning except System/Nodes, the Inputs (which will not start but are still receiving messages), and System/Logging.

This is a different virtual machine, but with the same configuration.
I checked DNS and the hostname, and I was able to ping this virtual machine by its FQDN from a different node.


I’m reproducing this error on a different machine; for now I have disabled HTTPS on my production server until I get this figured out.

Your browser is telling you that it does not trust the CA that signed the certificate. I guess it is the same for Graylog: it does not trust the CA.

Hello.

Does anyone know how to solve this? I have the exact same problem.
Everything was working, then I upgraded from version 2.5 to 3.0, changed the API variables, etc.
Everything works fine except the Inputs, and I get the errors already shown in this thread. There were no changes to the certificates and I can see them in the keystore.
The only change was that I upgraded Graylog.

Any ideas?

Best Regards

You should post your rest_* and web_* configuration from before 3.0 and your current http_* settings so that we can help you. @JohnDoe
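
To collect those settings for a post like this, something along the following lines could help. It is only a sketch: `api_settings` is a made-up helper name, and the path in the usage comment assumes the usual package location of the config file. It also masks any `*_password` value so the key password does not end up in the forum.

```shell
# Print the rest_*, web_* and http_* settings from a Graylog config file,
# replacing the value of any *_password setting with ***.
api_settings() {
  grep -E '^[[:space:]]*(rest|web|http)_' "$1" |
    sed -E 's/^([^=]*_password[[:space:]]*=).*/\1 ***/'
}

# Example (assumed default path):
#   api_settings /etc/graylog/server/server.conf
```

Running it against both the pre-3.0 config backup and the current file gives the two lists side by side.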

Those variables were replaced by:

http_bind_address = 10.10.25.145:9000

I can access the web interface without issues using HTTPS. But when I go to System/Nodes -> Node I get an error, and it is also impossible to start the Inputs.
What I noticed is that if I set http_enable_tls = false and access it over HTTP, everything works.
I thought there might be a problem with the certificate, but the certificate is valid, it’s in the keystore, and it is the same one that was used before the upgrade without issues.

Best Regards

@JohnDoe
Yeah, I’m still struggling with this. I even transferred my certs from our CA and I still get the same results. Everything seems to work except System/Nodes, the Inputs (which will not start), and Logging.
I rolled back Graylog to 2.5.x using the same certs and it worked fine. I think something in Java might be the problem, but I’m not sure.
I have tried the following:
http://docs.graylog.org/en/3.0/pages/configuration/https.html


Still no joy…

@jan
If this helps, here are my config files from before and after the upgrade.

The following is the Graylog configuration file for version 2.5.x:
is_master = true
node_id_file = /etc/graylog/server/node-id
password_secret = 6hHJ8HKQ1k3ZqVOmWAyTlfjScdFkRZpt1BEm7Rch4QY63djfKjT0puF1POIZOrzVlA7k13zMssGKu4wSpSZL5jw4hT2Kk2Rs
root_password_sha2 =ce1dedff58447c834034af15c7c139aa1ad6149366ad8c87984058ae98ae4dae
root_email = ""
root_timezone = America/Chicago
plugin_dir = /usr/share/graylog-server/plugin
rest_listen_uri = http://graylog.enseva-labs.net:9000/api/
rest_enable_tls = true
rest_tls_cert_file = /etc/graylog/graylog-certificate.pem
rest_tls_key_file = /etc/graylog/graylog-key.pem
rest_tls_key_password = secret
web_listen_uri = http://graylog.enseva-labs.net:9000/
web_enable_tls = true
web_tls_cert_file = /etc/graylog/graylog-certificate.pem
web_tls_key_file = /etc/graylog/graylog-key.pem
web_tls_key_password = secret
elasticsearch_hosts = http://10.200.6.70:9200

The following is the Graylog configuration file for version 3.0.x:

is_master = true
node_id_file = /etc/graylog/server/node-id
password_secret = 6hHJ8HKQ1k3ZqVOmWAyTlfjScdFkRZpt1BEm7Rch4QY63djfKjT0puF1POIZOrzVlA7k13zMssGKu4wSpSZL5jw4hT2Kk2Rs
root_password_sha2 =ce1dedff58447c834034af15c7c139aa1ad6149366ad8c87984058ae98ae4dae
root_email = ""
root_timezone = America/Chicago
bin_dir = /usr/share/graylog-server/bin
data_dir = /var/lib/graylog-server
plugin_dir = /usr/share/graylog-server/plugin
http_bind_address = graylog.enseva-labs.net:9000
elasticsearch_hosts = http://10.200.6.70:9200

@gsmith @JohnDoe

You are both having issues for the same reason. I asked for the specific configuration parameters because I think that your certificate does not include what you currently have configured.

That means in 2.5 you had configured the IPs for inter-node communication, but now you use node names or something similar. That is why you should debug your certificate: does it include all the information, hostnames and IPs, that might be used for communication when you refer to the hostname?

The problem is related to the certificates, not Graylog. What has changed in Graylog is how the interface is configured, and that might be the reason you run into issues now.
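
The check described above boils down to: is the exact value in http_bind_address listed in the certificate's subjectAltName entries? A rough sketch of that comparison follows. It assumes the verifier only consults SAN entries and ignores the CN (the fix reported at the end of this thread points that way), it handles exact matches and IPv4 only, and `san_covers` is a hypothetical helper name.

```shell
# Return 0 if TARGET is listed in an openssl-style SAN line such as
#   "DNS:graylog.example.org, IP Address:10.200.6.21"
# An IPv4 target must appear as an "IP Address:" entry, a host name as a
# "DNS:" entry. Wildcard entries (*.example.org) are not expanded here.
san_covers() {
  case $2 in
    *[!0-9.]*) entry="DNS:$2" ;;         # anything besides digits and dots: a name
    *)         entry="IP Address:$2" ;;  # digits and dots only: an IPv4 address
  esac
  printf '%s\n' "$1" | tr ',' '\n' | sed 's/^[[:space:]]*//' |
    grep -qxF -- "$entry"
}

# Example:
#   san_covers "DNS:graylog.example.org" "10.200.6.21" || echo "IP not covered"
```

A certificate that only carries the FQDN in its CN, like the one in the log output earlier in the thread, has no SAN line at all, so neither the IP nor the host name would pass this check.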

Hi
If it helps, this is my configuration (2.5 to 3.0).

I just replaced

rest_listen_uri = http://192.168.1.206:9000/api/
web_listen_uri = http://192.168.1.206:9000/

with

http_bind_address = 192.168.1.206:9000

Configuration :

is_master = true
node_id_file = /etc/graylog/server/node-id
elasticsearch_max_docs_per_index = 20000000
password_secret =  ***
root_password_sha2 = ***
plugin_dir = /usr/share/graylog-server/plugin

#rest_listen_uri = http://192.168.1.206:9000/api/
#web_listen_uri = http://192.168.1.206:9000/

http_bind_address = 192.168.1.206:9000

rotation_strategy = count
elasticsearch_hosts = http://192.168.1.208:9200
elasticsearch_max_number_of_indices = 20
retention_strategy = delete
elasticsearch_shards = 4
elasticsearch_replicas = 0
elasticsearch_index_prefix = graylog
allow_leading_wildcard_searches = false
allow_highlighting = false
elasticsearch_analyzer = standard
output_batch_size = 500
output_flush_interval = 1
output_fault_count_threshold = 5
output_fault_penalty_seconds = 30
processbuffer_processors = 2
outputbuffer_processors = 2
processor_wait_strategy = blocking
ring_size = 65536
inputbuffer_ring_size = 65536
inputbuffer_processors = 2
inputbuffer_wait_strategy = blocking
message_journal_enabled = true
message_journal_dir = /var/lib/graylog-server/journal
lb_recognition_period_seconds = 3
mongodb_uri = mongodb://192.168.1.207:27017/graylog
mongodb_max_connections = 1000
mongodb_threads_allowed_to_block_multiplier = 5
content_packs_dir = /usr/share/graylog-server/contentpacks
content_packs_auto_load = grok-patterns.json
proxied_requests_thread_pool_size = 32
root_timezone = Europe/Paris

is that IP 192.168.1.206 part of your certificate?

No. In this case I use HTTP; I don’t have a certificate. It’s a local configuration.

I don’t have problems ^^ I’m showing my example to help people who do.

@jan
My self-signed certs were created with the FQDN. My configurations in Graylog’s config file are shown above. I checked the Java keystore and it does show my FQDN, and the certs are trusted. I’m not sure what else I can check or troubleshoot.
This is my current Graylog config using the self-signed certs:

is_master = true
node_id_file = /etc/graylog/server/node-id
password_secret = 6hHJ8HKQ1k3ZqVOmWAyTlfjScdFkRZpt1BEm7Rch4QY63djfKjT0puF1POIZOrzVlA7k13zMssGKu4wSpSZL5jw4hT2Kk2Rs
root_password_sha2 =ce1dedff58447c834034af15c7c139aa1ad6149366ad8c87984058ae98ae4dae
root_email = "greg.smith@enseva.com"
root_timezone = America/Chicago
bin_dir = /usr/share/graylog-server/bin
data_dir = /var/lib/graylog-server
plugin_dir = /usr/share/graylog-server/plugin
http_bind_address = graylog.enseva-labs.net:9000
http_enable_tls = true
http_tls_cert_file = /etc/graylog/graylog-certificate.pem
http_tls_key_file = /etc/graylog/graylog-key.pem
http_tls_key_password = secret
elasticsearch_hosts = http://10.200.6.70:9200
rotation_strategy = count
elasticsearch_max_docs_per_index = 20000000
elasticsearch_max_number_of_indices = 20
retention_strategy = delete
elasticsearch_shards = 4
elasticsearch_replicas = 0
elasticsearch_index_prefix = graylog
allow_leading_wildcard_searches = false
allow_highlighting = false
elasticsearch_analyzer = standard
elasticsearch_index_optimization_timeout = 22h
output_batch_size = 500
output_flush_interval = 1
output_fault_count_threshold = 5
output_fault_penalty_seconds = 30
processbuffer_processors = 15
outputbuffer_processors = 9
processor_wait_strategy = blocking
ring_size = 65536
inputbuffer_ring_size = 65536
inputbuffer_processors = 4
inputbuffer_wait_strategy = blocking
message_journal_enabled = true
message_journal_dir = /var/lib/graylog-server/journal
message_journal_max_size = 8gb
lb_recognition_period_seconds = 3
mongodb_uri = mongodb://localhost/graylog
mongodb_uri = mongodb://mongo_admin:PASSWORD-Here@localhost:27017/graylog
mongodb_max_connections = 1000
mongodb_threads_allowed_to_block_multiplier = 5
transport_email_enabled = true
transport_email_hostname = localhost
transport_email_port = 25
transport_email_from_email = root@graylog.enseva-labs.net
transport_email_web_interface_url = https://10.200.6.70:9000
proxied_requests_thread_pool_size = 32

Any advice would be appreciated.

OK Sherlock, now down in the dirt. When you connect to https://<FQDN>:9000 with CLI tools, what is the output?

nmap --script ssl-enum-ciphers -p 9000 <FQDN>
openssl s_client -connect <FQDN>:9000

That is the dirty way to debug this now and find the error.
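
The `openssl s_client` output can also be piped into `openssl x509` to see which names the served certificate actually carries. A rough sketch, where `san_line` is a hypothetical helper name and `<FQDN>` stays a placeholder:

```shell
# Read `openssl x509 -noout -text` output on stdin and print only the
# subjectAltName values. Prints nothing if the certificate has no SAN.
san_line() {
  grep -A1 'X509v3 Subject Alternative Name' | tail -n 1 |
    sed 's/^[[:space:]]*//'
}

# Example against the live listener (replace <FQDN> first):
#   openssl s_client -connect <FQDN>:9000 </dev/null 2>/dev/null |
#     openssl x509 -noout -text | san_line
#
# Example against the file on disk:
#   openssl x509 -in /etc/graylog/graylog-certificate.pem -noout -text | san_line
```

An empty result means the certificate identifies itself by CN only.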

I finally got this to work. I had to reissue my certificates, adding my FQDN and my IP as SANs in the certificates. The old ones worked just fine without the SAN in version 2.5.
Try this and tell us if it works for you too.
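
For anyone who wants to try the same fix with a self-signed certificate, here is a minimal sketch of an OpenSSL config that puts both the FQDN and the IP into the SAN. The helper name `write_san_config` and the file names are assumptions for illustration, not what the poster above used.

```shell
# Write a minimal OpenSSL request config whose certificate carries both a
# DNS name and an IP address as subjectAltName entries.
# Arguments: output file, FQDN, IPv4 address.
write_san_config() {
  cat > "$1" <<EOF
[req]
distinguished_name = dn
x509_extensions = v3_ext
prompt = no

[dn]
CN = $2

[v3_ext]
subjectAltName = @alt_names

[alt_names]
DNS.1 = $2
IP.1 = $3
EOF
}

# Example with the test machine from this thread:
#   write_san_config san.cnf test-install.enseva-labs.net 10.200.6.21
#   openssl req -x509 -newkey rsa:2048 -nodes -days 365 -config san.cnf \
#     -keyout graylog-key.pem -out graylog-certificate.pem
```

This only covers the SAN part. Key format, key password and importing the new certificate into the Java trust store still need to follow whatever procedure was used for the old certificate.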