1. Describe your incident:
Graylog has recently become very slow at processing messages, often running 24-36 hours behind.
2. Describe your environment:
- OS Information: Ubuntu 22.04
- Package Version: 5.07
- Service logs, configurations, and environment variables: N/A
3. What steps have you already taken to try and solve the problem?
I tried increasing the available memory and CPU resources. However, my process queue had over 300,000 unprocessed messages, and the processing rate was 0.34 messages per second.
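To put those two numbers in perspective, here is a minimal Python sketch of the arithmetic. It assumes the rate stays constant and that no new messages arrive, so it is an optimistic lower bound; the function name `drain_time_hours` is just an illustrative helper, not anything from Graylog.

```python
def drain_time_hours(backlog: int, rate_per_s: float) -> float:
    """Hours needed to clear `backlog` messages at a constant `rate_per_s`,
    ignoring any new messages that arrive in the meantime."""
    if rate_per_s <= 0:
        raise ValueError("rate_per_s must be positive")
    return backlog / rate_per_s / 3600


# Figures from this post: 300,000 queued messages at 0.34 messages/second.
hours = drain_time_hours(300_000, 0.34)
print(f"{hours:.0f} hours (~{hours / 24:.1f} days)")  # prints "245 hours (~10.2 days)"
```

At that rate the queue could never catch up on its own, which matches the growing delay I was seeing.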
4. How can the community help?
I am posting to share the solution I found. I am enriching logs with WHOIS data as well as OTX data.
I noticed that I was getting a lot of HTTP 504 (Gateway Timeout) responses from OTX in my Graylog log.
Has anyone else experienced this with OTX?
I removed OTX lookups from my pipeline rules, and the queue cleared out in a matter of seconds.
I surmise that each lookup request was going to OTX and Graylog was waiting for a response before moving on to the next message. Since the requests were timing out, a large queue formed. Any ideas on how to improve this situation from a Graylog perspective?
On the data adapter, my timeouts are 15000 ms for Connect, 10000 ms for Write, and 75000 ms for Read. I may try dropping those significantly, but it seems like everything has been timing out…
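A rough way to reason about this is a back-of-the-envelope model in Python. It is only a sketch under assumptions: that each processing thread blocks on one lookup at a time, and that there are 5 processing threads (an assumed value, check `processbuffer_processors` in your `server.conf`). The 2-second timeout is a hypothetical lower setting, and `max_rate` and `implied_wait_s` are illustrative names, not Graylog APIs.

```python
def max_rate(workers: int, seconds_per_message: float) -> float:
    """Upper bound on messages/second when every worker blocks for
    `seconds_per_message` on each message."""
    return workers / seconds_per_message


def implied_wait_s(workers: int, observed_rate: float) -> float:
    """Average seconds each worker spends per message, given an observed rate."""
    return workers / observed_rate


WORKERS = 5  # assumption: number of processing threads, not taken from my config

# Worst case: every lookup runs into the read timeout.
for label, timeout_s in [("75 s read timeout (current)", 75.0),
                         ("2 s read timeout (hypothetical)", 2.0)]:
    print(f"{label}: at most {max_rate(WORKERS, timeout_s):.2f} msg/s")

# Working backwards from the 0.34 msg/s observed in this post.
print(f"implied wait per message: {implied_wait_s(WORKERS, 0.34):.1f} s")
```

Under this model, shorter timeouts only raise the ceiling while OTX is failing. Every failed lookup still costs a full timeout, so throughput stays far below normal until the lookups succeed or are skipped.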
I will also check in with OTX on the reason for the timeout messages.