Alert when no logs are sent from a host | ChatGPT solution

I spent a few hours yesterday trying to figure out how to create an Event that would alert me when no logs are received from any one of my hosts, and I could not find a solution on the internet. There was a solution that works on an entire stream, but not on individual hosts reporting to that stream.

Here is the problem statement:
I have several hosts reporting to Graylog, all to the same stream. I want to be alerted when any of the hosts stops sending logs (let’s say for an hour or a day).

After some back and forth with ChatGPT, it proposed the following solution:
Create a Pipeline Rule such as

rule "Check Host Logs"
when
 // Define the time frame for evaluating the condition
 has_field("timestamp") && to_long($message.timestamp) >= (now() - 5 * 60 * 1000)
then
 // Check if logs are missing for any of the hosts
 let hosts = ["host1", "host2", "host3"]; // Add your hostnames or IPs here
 let missingLogs = hosts - search("stream:YourStreamName AND source:(" + hosts.join(" OR ") + ")", 1);
 if (missingLogs.size() > 0) {
   // Trigger an alert when logs are missing
   set_field("alert", true);
   set_field("missing_hosts", missingLogs.join(", "));
 }
end

Create a Monitor that would trigger when the field alert is set to true.

If I understand correctly, it’s based on the fact that while one host can stop reporting logs, others will continue, and that a Pipeline Rule can actually perform searches on previous logs. While it’s astonishing that ChatGPT could come up with this, I’m skeptical about performance. Another problem is that it still uses a static list of hosts embedded deep in the Rule’s code.
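To make the intent of that rule concrete, here is a minimal sketch in plain Python (not Graylog rule code). It assumes you already have the recent messages in hand as dictionaries with `source` and `timestamp` keys; the function name `silent_hosts` and the sample data are made up for illustration.

```python
from datetime import datetime, timedelta


def silent_hosts(expected_hosts, messages, now, window):
    """Return the expected hosts that sent nothing within the window."""
    cutoff = now - window
    # Hosts that produced at least one message inside the time window.
    seen = {m["source"] for m in messages if m["timestamp"] >= cutoff}
    # Expected hosts minus seen hosts: the idea behind "hosts - search(...)".
    return sorted(set(expected_hosts) - seen)


now = datetime(2023, 6, 1, 12, 0, 0)
messages = [
    {"source": "host1", "timestamp": now - timedelta(minutes=5)},
    {"source": "host2", "timestamp": now - timedelta(hours=3)},  # too old
]
missing = silent_hosts(
    ["host1", "host2", "host3"], messages, now, timedelta(hours=1)
)
print(missing)  # ['host2', 'host3']
```

The sketch only shows the logic; it says nothing about whether a Pipeline Rule can actually run such a lookup.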

I wonder what you think about this approach?

Well…
I pasted the code into a rule editor. It pointed out 5 different lines which will not work:

I like the idea in general, and I have already seen it being used to write Grok Patterns, but there is still work needed.

I was able to get 2 more solutions out of ChatGPT that were based on Message Processors and Extractors with Actions, but they both seem to be for v.4+, so I could not verify them on my v.3.

Then I asked ChatGPT to create a feature request for the Graylog team, and here we go:

Subject: Request for Simple Monitoring for Sidecars in Graylog

Dear Graylog Team,

I would like to request the addition of a simple and user-friendly monitoring feature for sidecars in Graylog. Sidecars play a crucial role in collecting logs from various sources, and having a built-in monitoring capability would greatly enhance the overall management and visibility of the sidecar infrastructure.

Feature Description:

  1. Status Overview: Introduce a dedicated dashboard or section in the Graylog web interface that provides an overview of the sidecars deployed in the environment. This dashboard should display key metrics such as the sidecar status (active/inactive), last heartbeat, and the number of logs successfully collected.

  2. Alerting: Enable configurable alerts for sidecar-related issues, such as sidecar failures or unresponsive sidecars. Users should be able to define conditions (e.g., sidecar offline for a specific duration) and select notification methods (e.g., email, Slack, etc.) for receiving alerts.

  3. Historical Monitoring: Implement a mechanism to track and display historical data for sidecar performance. This could include metrics like log collection rate, connectivity status, and any errors encountered by the sidecars.

  4. Integration with Graylog Streams: Enable integration with Graylog Streams to provide a seamless experience for managing sidecars within the existing stream-based log processing. This integration would allow users to easily apply rules, filters, and alerts specifically targeting sidecar logs.

By incorporating these monitoring features, Graylog users would benefit from enhanced visibility, proactive monitoring, and streamlined troubleshooting of their sidecar infrastructure. It would simplify the management of sidecars and help ensure reliable log collection from diverse sources.

Thank you for considering this feature request. I believe it would greatly contribute to the usability and effectiveness of Graylog in log management and analysis.

Sincerely,
[Your Name]
[Your Organization (if applicable)]

Besides the 4th point, it looks legit to me. I asked it to elaborate on the 4th point and it went the route of full customization for Sidecars specifically, like a separate Stream-Sidecar management workflow that could be more flexible than the regular Stream workflow. I think that’s unreasonable.

Yeah, I battled with it for a bit, but it wouldn’t produce working code. It kept using nonexistent functions. Perhaps it tried to use features added in the latest version of Graylog. Or maybe it was pseudocode.

If I understand correctly, it’s based on the fact that while one host can stop reporting logs, others will continue. And that Pipeline Rule can actually perform searches on previous logs.

Huh, I don’t think there is a function for Pipeline Rules to “search” in past logs. I can’t seem to find such a function.

It’s dreaming, as usual. Not in the way humans dream, in the way AI dreams. It has seen similar stuff elsewhere and presents it as something plausible.

My suggestion: build an event on the cardinality (= number of different values) of your source field for each stream. You don’t need pipelines for this.
If the number is too high, you might think about a script that builds the events via the API, one per source.
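A script for the one-event-per-source idea could start by generating one check description per host. This is only a sketch in Python: the dictionary keys (`title`, `query`, `stream`, `window_minutes`, `condition`) are hypothetical placeholders, not the Graylog API schema, and the actual API call is deliberately left out.

```python
def per_host_checks(stream_name, hosts, window_minutes):
    """Build one 'no logs from this host' check description per host."""
    checks = []
    for host in hosts:
        checks.append({
            "title": f"No logs from {host}",
            # Search restricted to a single source.
            "query": f'source:"{host}"',
            "stream": stream_name,
            "window_minutes": window_minutes,
            # Alert when the search finds nothing in the window.
            "condition": "count() < 1",
        })
    return checks


checks = per_host_checks("YourStreamName", ["host1", "host2"], 60)
for check in checks:
    print(check["title"], "->", check["query"])
```

Each entry would then have to be translated into whatever payload your Graylog version expects for an event definition.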

Does this mean counting the unique values of “source” and alerting if the count is less than the expected value (assuming that we have a fixed number of servers)?

I haven’t thought of it. I’ll try that.
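In plain Python, the cardinality check described above amounts to the following sketch. The list of sources and the expected count of 3 are made-up sample values; in Graylog the aggregation itself would do the counting.

```python
def fewer_sources_than_expected(sources, expected_count):
    """True when the number of distinct sources drops below the expected count."""
    return len(set(sources)) < expected_count


# Sources seen in the time window: host2 is absent.
sources_last_hour = ["host1", "host3", "host1", "host3"]
alert = fewer_sources_than_expected(sources_last_hour, expected_count=3)
print(alert)  # True
```

Note that this only says that some host is missing, not which one, and it relies on the number of hosts being fixed.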

For this type of alert, you can specify your search criteria that you expect to be found (but alerted if not), and then add an aggregation to test if the count() is < 1:

I just did a quick test and verified that this event is triggered as expected (when I disabled that log source for a minute):

Thanks, but I don’t think this is a workable solution when you have multiple beats reporting to the same stream, which is my case.
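A small Python sketch of the objection, using made-up sample data: with several hosts in one stream, the stream-wide count stays above zero even when one host has gone silent, so a `count() < 1` condition over the whole stream never fires.

```python
from collections import Counter

expected_hosts = ["host1", "host2", "host3"]
# Sources of the messages seen in the time window: host2 sent nothing.
sources_in_window = ["host1", "host1", "host3"]

# Stream-wide check: other hosts keep the count above zero.
stream_alert = len(sources_in_window) < 1

# Per-host check: count messages for each expected host separately.
per_host = Counter(sources_in_window)
host_alerts = [h for h in expected_hosts if per_host[h] < 1]

print(stream_alert)  # False
print(host_alerts)   # ['host2']
```

The per-host version needs the list of expected hosts, because a host that sent nothing does not show up in the results at all.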

Hey @roman.potato

I did a config for something like that. I have a widget for the number of sources, like this. It shows the trend per day.

What I used was Cardinality.

As for an alert/notification, I haven’t configured that yet.