Graylog was initially set up three years ago, and after several upgrades and cert changes I’m now getting this Java Keystore error:
"None of the TrustManagers trust this certificate chain"
I found instructions for fixing it here: How-To Guide: Securing Graylog with TLS
But the instructions are confusing. For instance, they show this sample keytool command for a root cert PEM:
sudo keytool -importcert -keystore /etc/graylog/graylog.jks -storepass changeit -alias cachain -file /etc/graylog/enterpriseRootCA.pem
And then it reads “Repeat the above step for all root and intermediate CAs. For example, if you have a private Root CA that issues certs to a sub CA, add both the Root CA and Sub CA public certs to the above JKS”. Well, you can’t repeat the step as written, because it uses a fixed alias, cachain, and keytool refuses to import a second certificate under an alias that already exists. So I assumed the goal was to add the certificate chain.
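If the guide really means one import per CA, each under its own alias, I think it would look something like the sketch below. The helper names `alias_for` and `import_ca` and the file name `enterpriseSubCA.pem` are my own placeholders, not from the guide; the keystore path and password are the ones from the guide's sample command.

```shell
#!/usr/bin/env bash
# Keystore path and password copied from the guide's sample command.
KEYSTORE=/etc/graylog/graylog.jks
STOREPASS=changeit

# Derive a distinct alias from the certificate file name
# (hypothetical helper), e.g. enterpriseRootCA.pem -> enterpriserootca.
alias_for() {
  basename "$1" .pem | tr '[:upper:]' '[:lower:]'
}

# Import one CA certificate under its own alias (hypothetical helper).
# -noprompt skips the interactive "Trust this certificate?" question.
import_ca() {
  sudo keytool -importcert -noprompt \
    -keystore "$KEYSTORE" -storepass "$STOREPASS" \
    -alias "$(alias_for "$1")" -file "$1"
}
```

Usage would then be one call per CA file, for example `import_ca /etc/graylog/enterpriseRootCA.pem` followed by `import_ca /etc/graylog/enterpriseSubCA.pem`.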
We have an AD CA, so I navigated to http://certserver/certsrv, grabbed the certificate chain, converted it from p7b using openssl, and added the result to the keystore. It appeared to do nothing, but just as I was preparing to try something else, all three hosts in the cluster suddenly fired up and started ingesting.
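For reference, this is roughly the conversion step, plus a split of the resulting bundle into one file per certificate. I'm including the split because, as far as I understand it, keytool adds only one trusted certificate per `-importcert` call, so importing a multi-certificate PEM as a single entry may not add the whole chain. The function names and the `ca-NN.pem` output naming are placeholders of mine.

```shell
#!/usr/bin/env bash

# Convert a PKCS#7 chain (.p7b) into a PEM bundle.
# If the file was downloaded DER encoded, add: -inform DER
p7b_to_pem() {
  openssl pkcs7 -print_certs -in "$1" -out "$2"
}

# Split a PEM bundle into one file per certificate: ca-01.pem, ca-02.pem, ...
# $1 = PEM bundle, $2 = output directory.
# Lines outside BEGIN/END markers (subject=/issuer= text) are dropped.
split_bundle() {
  mkdir -p "$2"
  awk -v dir="$2" '
    /-----BEGIN CERTIFICATE-----/ { n++; out = sprintf("%s/ca-%02d.pem", dir, n) }
    out != ""                     { print > out }
    /-----END CERTIFICATE-----/   { close(out); out = "" }
  ' "$1"
}
```

Each resulting file could then be imported separately under its own alias.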
The thing is, in the past 24 hours, one or more hosts have occasionally shown as unavailable in the Nodes tab and then, equally inexplicably, come back up. If I click on a node while it’s down, I get the Java Keystore error.
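One check I can think of is whether all three nodes actually hold the same CA entries. A rough sketch, assuming each node has its own copy of the keystore at the path from the guide and that `keytool -list` prints one "Certificate fingerprint" line per entry; the function names are placeholders of mine.

```shell
#!/usr/bin/env bash

# Read `keytool -list` output on stdin and print the fingerprints, sorted,
# so the output of different nodes can be compared line by line.
list_fingerprints() {
  grep -i 'fingerprint' | awk '{ print $NF }' | sort
}

# List the fingerprints in the local Graylog keystore
# (path and password taken from the guide's sample command).
keystore_fingerprints() {
  sudo keytool -list -keystore /etc/graylog/graylog.jks -storepass changeit \
    | list_fingerprints
}
```

Running `keystore_fingerprints` on each node and diffing the results would at least show whether one node is missing a CA that the others have.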
I’m in the process of upgrading. I’ve gotten to 6.1, so I have three more versions to jump to, but I want to get this handled before I move forward because the fewer things to troubleshoot, the better. So, is the certificate chain out of ADCS what I want to add to the Java keystore, or is it something else entirely? And is there any explanation for why it works most of the time but not all of the time?
System Setup: 3-node cluster running on RHEL 8
Graylog ver 6.1, OpenSearch ver 1.1, MongoDB ver 6.0