Hey all
So we’ve been building out our Graylog infrastructure, and are getting to the point where we want to present it to clients. Naturally, we need to turn on TLS, so I spent some time doing that today, following the guide here: http://docs.graylog.org/en/2.4/pages/configuration/https.html
Once I had it working, though, I started getting errors on my dashboard widgets. About 40% of them don’t load initially and instead show ‘N/A’ with the red triangle exclamation icon; hovering over it shows: “Error loading widget value: cannot GET… (500)”. If I wait a minute or so, they eventually load. Searches exhibit similar behaviour, often failing to load initially with a 500 error.
I tried disabling TLS, but the issue persisted. I also tried updating to the latest Graylog version, but the error persisted. My Elasticsearch cluster is healthy and green. Messages are being ingested at around 1000-2000 per second, and results seem up to date when the searches don’t return 500.
Example of performing a search via API (partly redacted):
# curl -vv "http://ro-user:********:9000/api/search/universal/absolute?query=*****************&from=2018-05-24%2000%3A00%3A00&to=2018-05-24%2000%3A05%3A00&fields=source&filter=streams%3A59df7d23da1a031aaea70e66&limit=1&decorate=false"
* Trying *****...
* TCP_NODELAY set
* Connected to ******* port 9000 (#0)
* Server auth using Basic with user 'ro-user'
> GET /api/search/universal/absolute?query=*************&from=2018-05-24%2000%3A00%3A00&to=2018-05-24%2000%3A05%3A00&fields=source&filter=streams%3A59df7d23da1a031aaea70e66&limit=1&decorate=false HTTP/1.1
> Host: ***********:9000
> Authorization: Basic **********
> User-Agent: curl/7.58.0
> Accept: */*
>
< HTTP/1.1 500 Internal Server Error
< X-Graylog-Node-ID: 4a6e00a9-5b27-4241-b1ba-cbad1f430f18
< X-Runtime-Microseconds: 306860
< Content-Type: application/json
< Date: Fri, 08 Jun 2018 21:08:42 GMT
< Connection: close
< Content-Length: 57
<
* Closing connection 0
{"message":"Unable to perform search query","details":[]}
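
In case it helps anyone reproduce this, here is a rough sketch of how the intermittent 500s could be quantified by repeating the same request and tallying the status codes. The function names (`fetch_status`, `tally_statuses`, `probe`) and the one-second pause are my own assumptions for illustration, not anything Graylog provides; the only real tool used is curl with its `-w '%{http_code}'` option.

```shell
#!/bin/sh
# Hypothetical helpers for measuring how often a request fails.
# None of these names come from Graylog; they are illustrative only.

# Print the HTTP status code of a single GET request (000 if curl cannot connect).
fetch_status() {
  curl -s -o /dev/null -w '%{http_code}' "$1"
}

# Read one status code per line on stdin and print "<code> <count>", sorted by code.
tally_statuses() {
  sort | uniq -c | awk '{ print $2, $1 }'
}

# Repeat the request N times (default 20), one second apart, then tally the results.
probe() (
  url="$1"
  n="${2:-20}"
  i=0
  while [ "$i" -lt "$n" ]; do
    fetch_status "$url"
    echo              # curl's -w output has no trailing newline
    sleep 1
    i=$((i + 1))
  done | tally_statuses
)

# Only run when a URL is supplied on the command line.
if [ "$#" -ge 1 ]; then
  probe "$@"
fi
```

Saved as, say, `probe.sh`, this would be run with the same search URL as in the curl call above plus an optional repeat count. The output is one line per status code seen, which should show roughly what fraction of identical requests come back as 500.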