Apache piped log, unresponsive graylog and best practices

NeoGeo · June 27, 2019, 1:29pm

We recently set up Graylog for a cluster of Apache2 servers, using piped log messages via /bin/nc.

At some point, Graylog's listener stopped working and the Apache2 error logs started filling up with "piped log program '/bin/nc -u OURSERVER.com 12201' failed unexpectedly" errors.

Around the same time, our web clusters started dying unexpectedly and falling off their load balancers. I suspect the hung connections to Graylog were causing them to fault somehow.

In other words, I suspect the failure of a Graylog listener caused a cascading failure across our web server clusters.

With the above in mind, what is best practice to get data from an apache2 webserver cluster to a graylog install in a way that prevents this from happening in the future should the graylog instance “go away”?

Many thanks in advance for the conversation and advice to come!

jan · June 28, 2019, 6:55am

The problem is local to your Apache servers. Because you use UDP to send the logs, nothing in Graylog can be the cause.

You should check the log files of the systems themselves, and your metrics.

NeoGeo · June 28, 2019, 12:25pm

Hello Jan, thanks very much for your reply, time and efforts!

Initially this is what I thought as well, but restarting the Graylog box corrected the problem. The sources graph corroborates this: all source activity stops abruptly at the same time that all three Apache servers begin to log the error described.

I’ve since added -w 1 to the nc command in order to enforce a timeout / prevent indefinite waiting / hanging up the apache instance, but I was hoping there was a better solution?
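For illustration, here is a minimal sketch of a replacement for nc as the piped log program. It keeps reading stdin and swallows send errors instead of exiting, so Apache never has to respawn it. The assumption here, not confirmed in this thread, is that the "failed unexpectedly" messages come from nc exiting when the UDP port on the Graylog side becomes unreachable. The script name and the forward() helper are hypothetical; the host and port are the placeholders from the first post.

```python
#!/usr/bin/env python3
"""Hypothetical piped-log forwarder: one UDP datagram per log line."""
import socket
import sys

GRAYLOG_HOST = "OURSERVER.com"  # placeholder from the thread; an IP avoids a DNS lookup per line
GRAYLOG_PORT = 12201


def forward(lines, send):
    """Send each non-empty line and count failures instead of raising."""
    sent = dropped = 0
    for line in lines:
        payload = line.rstrip(b"\r\n")
        if not payload:
            continue
        try:
            send(payload)
            sent += 1
        except OSError:
            # Covers refused, unreachable, would-block and name resolution
            # errors: the line is dropped and the process keeps running.
            dropped += 1
    return sent, dropped


def main():
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setblocking(False)  # never wait on the network
    target = (GRAYLOG_HOST, GRAYLOG_PORT)
    # Read raw bytes so an odd byte in a log line cannot raise a decode error.
    forward(sys.stdin.buffer, lambda data: sock.sendto(data, target))


if __name__ == "__main__":
    main()
```

Assuming the script were saved as /usr/local/bin/udp_forward.py (a hypothetical path), it could be wired in with something like CustomLog "|/usr/bin/python3 /usr/local/bin/udp_forward.py" combined. The trade-off is deliberate: log lines are lost while Graylog is down, but the web server is not affected.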

Perhaps some way to queue messages and transmit them in bulk every minute or so in order to prevent multiple open connections all of which are at risk of hanging up?
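A rough sketch of that queueing idea, assuming a bounded in-memory buffer that discards the oldest lines when full, so a long Graylog outage cannot exhaust memory on the web server. BoundedBuffer and its method names are hypothetical, and the send callable is whatever transport is in use.

```python
from collections import deque


class BoundedBuffer:
    """Hypothetical fixed-size queue of log lines awaiting delivery."""

    def __init__(self, max_lines=10000):
        # A deque with maxlen silently discards the oldest entry when full.
        self._lines = deque(maxlen=max_lines)

    def add(self, line):
        self._lines.append(line)

    def flush(self, send):
        """Send queued lines in order; stop at the first failure.

        Lines that were not sent stay queued for the next flush.
        Returns the number of lines handed to send() successfully.
        """
        delivered = 0
        while self._lines:
            try:
                send(self._lines[0])
            except OSError:
                break
            self._lines.popleft()
            delivered += 1
        return delivered

    def __len__(self):
        return len(self._lines)
```

A caller would add() each line as it arrives and call flush() on a timer, for example once a minute. Note that over UDP a send that does not raise is no proof of delivery, so retrying like this is only really meaningful with a transport that reports failures, such as TCP.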

With great appreciation,

George

jan · June 28, 2019, 3:32pm

Personally, I would never make my user-facing servers that vulnerable by sending the messages out directly. Because the solution you selected is hacky, you run into strange issues.

I would write the log messages locally and have Filebeat collect them; if you have space issues, delete the files quickly after rotation. That way your frontend does not depend on any back-office logging working properly.
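To show why this decouples the web server from the logging backend, here is a toy sketch of the core step a separate shipper performs: reading only what was appended to a local log file since the last pass. This is not how Filebeat is implemented, and read_new_lines is a hypothetical helper; a real shipper also handles rotation properly, persists its position across restarts, and applies backpressure.

```python
import os


def read_new_lines(path, offset):
    """Return (complete_lines, new_offset) for data appended since offset."""
    try:
        with open(path, "rb") as f:
            size = os.fstat(f.fileno()).st_size
            if size < offset:
                offset = 0  # file shrank: assume it was truncated or rotated
            f.seek(offset)
            data = f.read()
    except OSError:
        # File missing or unreadable: report nothing and keep the old offset.
        return [], offset
    # Consume only up to the last newline so a half-written line is left
    # for the next pass.
    end = data.rfind(b"\n") + 1
    return data[:end].splitlines(), offset + end
```

Apache only ever writes to the local file. If the shipper or Graylog is down, lines simply accumulate on disk and are picked up on a later pass from the saved offset.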

NeoGeo · June 28, 2019, 4:38pm

This is great advice - I will look into Filebeat.

Is there a best-practices doc somewhere that I should be following?

BTW - Re: Hacky - this is what turned up as the solution when I searched for the above. It's actually piped through a VPC and is not user-facing (the port is not exposed), so the transmission is secure end-to-end.

Any available docs / faq regarding setting this up for an Apache cluster you could reference would be greatly appreciated, as is your help with this problem thus far - many thanks!

jan · June 29, 2019, 10:26am

You will get some insights in this community, and the documentation will give you some more.

There is no single "final solution", as there is more than one way to reach your goal. You need to decide what fits your setup.

NeoGeo · July 5, 2019, 11:57am

Some more information about this issue, which continues for us:

Every 12-48 hours, the input itself stops ingesting messages.

I was able to get it running again by stopping / starting the input from the web interface alone this time (as opposed to restarting the entire server).

The input is currently frozen/locked up - I will screenshot what I can and put it below for reference:

If anyone has suggestions for what to check or try next, I would greatly appreciate it. Thank you.

system · July 19, 2019, 11:57am

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
