Errors after upgrading from 2.5.2 to 3.1

Greetings! I’d highly appreciate it if someone wiser here could help with a few issues I’m experiencing.

I updated our Graylog instance from 2.5.2 to the latest 3.1, following the upgrade documentation as closely as possible. Elasticsearch was already at 5.6.13, so no separate update was required for it. In short, I’ve updated the server.conf with the new settings, as well as the nginx configuration.

Graylog loads up and I’m able to access it, its various settings, and the dashboards we have in use. However, I’m unable to open System > Nodes > our node ID. This results in an HTTP 500 error with the following error messages:

Could not get plugins: 
Getting plugins on node "<node ID>" failed: Error: cannot GET https://<FQDN>/api/cluster/<node ID>/plugins (500)
Could not get JVM information
Getting JVM information for node '<node ID>' failed: Error: cannot GET https://<FQDN>/api/cluster/<node ID>/jvm (500)

I’m also unable to start any inputs, resulting in the following error message and log entry:

Input '<input name>' could not be started. Request to start input '<input name>' failed. Check your Graylog logs for more information. 
WARN  [ProxiedResource] Unable to call https://api/system/inputstates/<input ID> on node <<node ID>>: api 

At this point I also noticed that the server.log is getting spammed with similar error messages:

 WARN  [ProxiedResource] Unable to call https://api/system/metrics/multiple on node <<node ID>>: api

Correct me if I’m wrong, but there’s definitely something fishy going on if the system is trying to call an address without the server IP or FQDN? The IP/FQDN-related parameters that I have in the server.conf are the following:

http_bind_address =
http_publish_uri = https://$http_bind_address/
http_external_uri = <FQDN>/ (same issues persist even with this removed)
elasticsearch_hosts =

Considering that everything was working before updating from 2.5.2, what am I missing? Luckily I do have a snapshot from before the update, but it would be nice to have the new features, you know?

Thanks a ton in advance! :slight_smile:

Hey @Werrex

If we knew your rest_* and web_* settings from before the upgrade, we might be able to help you.

Hi Jan, thanks for the quick reply. I’ve pasted the parameters from the configuration below.

As a note, all parameters were set to use HTTP before the update. After the update I’ve tested both HTTP and HTTPS, with and without the endpoint/external URI, with no effect on the outcome. Also, during the update I kept the existing configuration file, manually updated the changed parameters, and added the new bin_dir and data_dir parameters.

I can paste the full server.conf if necessary, although no other parameters were touched.

bin_dir = /usr/share/graylog-server/bin
data_dir = /var/lib/graylog-server

#rest_listen_uri = 
http_bind_address =

#rest_transport_uri =
http_publish_uri = https://$http_bind_address/

#web_listen_uri =

#web_endpoint_uri =
http_external_uri = <hostname.domain>.net/

if you use http_publish_uri = https://$http_bind_address/ did you enable https in the server.conf with a certificate and by setting?

In addition, http_external_uri = <hostname.domain>.net/ needs the protocol too …
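As a rough illustration of the "needs the protocol" point, the following Python sketch checks whether a configured value is an absolute http(s) URI. It uses only the standard library and is not Graylog's own validation; the function name `is_absolute_http_uri` and the hostname `graylog.example.net` are made up for the example.

```python
from urllib.parse import urlsplit


def is_absolute_http_uri(value: str) -> bool:
    """Return True if value has an http(s) scheme and a non-empty host."""
    parts = urlsplit(value.strip())
    return parts.scheme in ("http", "https") and bool(parts.hostname)


# graylog.example.net is a placeholder, not the hostname from this thread.
for candidate in ("graylog.example.net/", "https://graylog.example.net/"):
    print(candidate, "->", is_absolute_http_uri(candidate))
```

The first value is rejected because, without a scheme, the whole string is parsed as a path and no host is found.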

I guess you have Graylog running on ONE server and you have a proxy like NGINX, Apache or similar that makes it available to the external network.

Yes, I’m running a single server with Graylog and Elasticsearch, with nginx set up for external HTTPS access. HTTPS and the certificates are set up in the server.conf, and I haven’t had any certificate issues before or after the update, as I’m able to access the Graylog portal via HTTPS with no issues.

I wasn’t aware that http_external_uri needs to include the protocol; that has now been fixed.

Unfortunately, and perhaps as expected, this did not help with the error messages I’m receiving. The error message below appears in the log every second or more frequently.

 WARN  [ProxiedResource] Unable to call https://api/system/metrics/multiple on node <<node ID>>: api

I’ve come across other discussions on this forum with similar errors, but even in those, the error messages contained an actual URI, not an https://api/system/...
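To make the oddity concrete, here is a small Python sketch (standard library only, not Graylog code) that parses the URL from the log line. It shows that `api` sits in the host position, which would be consistent with a publish URI whose host part is missing or unusable, so that the `api/` path prefix ends up being read as the hostname. That interpretation is an assumption, not something confirmed in this thread.

```python
from urllib.parse import urlsplit

# URL copied from the WARN [ProxiedResource] log line above.
logged_url = "https://api/system/metrics/multiple"

parts = urlsplit(logged_url)
print("host:", parts.hostname)  # host: api
print("path:", parts.path)      # path: /system/metrics/multiple
```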

While attempting to open System > Nodes > nodeID, the following error is written to the logs:

2019-09-09T18:16:57.267+03:00 ERROR [AnyExceptionClassMapper] Unhandled exception in REST resource api
at ~[?:1.8.0_222]
at ~[?:1.8.0_222]
at ~[?:1.8.0_222]
at okhttp3.Dns.lambda$static$0( ~[graylog.jar:?]
at okhttp3.internal.connection.RouteSelector.resetNextInetSocketAddress( ~[graylog.jar:?]
at okhttp3.internal.connection.RouteSelector.nextProxy( ~[graylog.jar:?]
at ~[graylog.jar:?]
at okhttp3.internal.connection.ExchangeFinder.findConnection( ~[graylog.jar:?]
at okhttp3.internal.connection.ExchangeFinder.findHealthyConnection( ~[graylog.jar:?]
at okhttp3.internal.connection.ExchangeFinder.find( ~[graylog.jar:?]
at okhttp3.internal.connection.Transmitter.newExchange( ~[graylog.jar:?]
at okhttp3.internal.connection.ConnectInterceptor.intercept( ~[graylog.jar:?]
at okhttp3.internal.http.RealInterceptorChain.proceed( ~[graylog.jar:?]
at okhttp3.internal.http.RealInterceptorChain.proceed( ~[graylog.jar:?]
at okhttp3.internal.cache.CacheInterceptor.intercept( ~[graylog.jar:?]
at okhttp3.internal.http.RealInterceptorChain.proceed( ~[graylog.jar:?]
at okhttp3.internal.http.RealInterceptorChain.proceed( ~[graylog.jar:?]
at okhttp3.internal.http.BridgeInterceptor.intercept( ~[graylog.jar:?]
at okhttp3.internal.http.RealInterceptorChain.proceed( ~[graylog.jar:?]
at okhttp3.internal.http.RetryAndFollowUpInterceptor.intercept( ~[graylog.jar:?]
at okhttp3.internal.http.RealInterceptorChain.proceed( ~[graylog.jar:?]
at okhttp3.internal.http.RealInterceptorChain.proceed( ~[graylog.jar:?]
at$get$0( ~[graylog.jar:?]
at okhttp3.internal.http.RealInterceptorChain.proceed( ~[graylog.jar:?]
at okhttp3.internal.http.RealInterceptorChain.proceed( ~[graylog.jar:?]
at okhttp3.RealCall.getResponseWithInterceptorChain( ~[graylog.jar:?]
at okhttp3.RealCall.execute( ~[graylog.jar:?]
at retrofit2.OkHttpCall.execute( ~[graylog.jar:?]
at ~[graylog.jar:?]
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method) ~[?:1.8.0_222]
at sun.reflect.NativeMethodAccessorImpl.invoke( ~[?:1.8.0_222]
at sun.reflect.DelegatingMethodAccessorImpl.invoke( ~[?:1.8.0_222]
at java.lang.reflect.Method.invoke( ~[?:1.8.0_222]
at org.glassfish.jersey.server.model.internal.ResourceMethodInvocationHandlerFactory$1.invoke( ~[graylog.jar:?]
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher$ ~[graylog.jar:?]
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.invoke( ~[graylog.jar:?]
at org.glassfish.jersey.server.model.internal.JavaResourceMethodDispatcherProvider$TypeOutInvoker.doDispatch( ~[graylog.jar:?]
at org.glassfish.jersey.server.model.internal.AbstractJavaResourceMethodDispatcher.dispatch( ~[graylog.jar:?]
at org.glassfish.jersey.server.model.ResourceMethodInvoker.invoke( ~[graylog.jar:?]
at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply( ~[graylog.jar:?]
at org.glassfish.jersey.server.model.ResourceMethodInvoker.apply( ~[graylog.jar:?]
at org.glassfish.jersey.server.ServerRuntime$ [graylog.jar:?]
at org.glassfish.jersey.internal.Errors$ [graylog.jar:?]
at org.glassfish.jersey.internal.Errors$ [graylog.jar:?]
at org.glassfish.jersey.internal.Errors.process( [graylog.jar:?]
at org.glassfish.jersey.internal.Errors.process( [graylog.jar:?]
at org.glassfish.jersey.internal.Errors.process( [graylog.jar:?]
at org.glassfish.jersey.process.internal.RequestScope.runInScope( [graylog.jar:?]
at org.glassfish.jersey.server.ServerRuntime.process( [graylog.jar:?]
at org.glassfish.jersey.server.ApplicationHandler.handle( [graylog.jar:?]
at org.glassfish.jersey.grizzly2.httpserver.GrizzlyHttpContainer.service( [graylog.jar:?]
at org.glassfish.grizzly.http.server.HttpHandler$ [graylog.jar:?]
at com.codahale.metrics.InstrumentedExecutorService$ [graylog.jar:?]
at java.util.concurrent.ThreadPoolExecutor.runWorker( [?:1.8.0_222]
at java.util.concurrent.ThreadPoolExecutor$ [?:1.8.0_222]
at [?:1.8.0_222]

That is true, I’m running Graylog and Elasticsearch on the same server with nginx.

I also receive a Java-related error when attempting to open the node settings. The post with this log message is pending review from our beloved moderators. :slight_smile:

just to clarify

http_publish_uri = https://$http_bind_address/

is written exactly like that? Just make a comment # before or use

so the default setting kicks in.

Plus, make sure that this holds the address at which nginx is reachable:

http_external_uri = https://<hostname.domain>/

Disable SSL in Graylog itself, or, if you want or need it, use https for the publish URI and take care that the certificates are trusted by Graylog.

Thanks Jan, it seems that commenting out the following:

http_publish_uri = https://$http_bind_address/

helped, as all of the inputs have now auto-started and the node is back online without any HTTP 500 errors. This is with HTTPS and TLS certificates in place as before, and with http_external_uri pointing to the nginx address, as it has since yesterday.

To be honest, I’m not quite certain why that one parameter, which is part of the 3.x configuration documentation, would be the cause of this. Well, for now I’ll be testing the hell out of our Graylog in case any other issues or errors appear.
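One possible explanation, stated here as an assumption rather than something confirmed in this thread: if server.conf values are read literally, with no variable expansion, then `$http_bind_address` in the documentation is placeholder notation for the default and is not meant to be copied verbatim. Under that assumption, a quick Python sketch like the one below could flag active settings that still contain a literal `$`. The function name `find_literal_placeholders` and the sample hostname are hypothetical.

```python
def find_literal_placeholders(conf_text: str) -> dict:
    """Return active key/value pairs whose value contains a literal '$'."""
    found = {}
    for raw in conf_text.splitlines():
        line = raw.strip()
        # Skip blank lines, commented-out settings, and lines without '='.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if "$" in value:
            found[key.strip()] = value.strip()
    return found


# Sample modelled on the settings quoted in this thread; the hostname is made up.
sample = """\
#rest_transport_uri =
http_publish_uri = https://$http_bind_address/
http_external_uri = https://graylog.example.net/
"""
print(find_literal_placeholders(sample))
```

Commenting out the flagged line, as was done here, leaves the setting unset so the default applies.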

Big thanks to you Jan! :smiley:
