Hello, all… I’m fairly new to Graylog, but I understand the basics. I have a business requirement for the following type of installation:
Apps (Docker containerized) -> Graylog (via LogSpout) -> Beats -> ELK
Apps are a series of Go-based microservices running in Docker containers. Each Docker HOST node in the cluster runs a LogSpout agent, which picks up the JSON log entries coming from the various app microservices.
LogSpout sends the JSON-formatted messages to Graylog successfully, where Graylog works its magic and I get all the goodness I need, including automatic field generation. This part is working very well.
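For reference, the per-host LogSpout wiring looks roughly like this (sketch only; it assumes a LogSpout build that includes a GELF adapter module, which the stock gliderlabs/logspout image may not, and the Graylog host/port are placeholders):

```yaml
# One LogSpout container per Docker host, forwarding container stdout/stderr
# to a Graylog GELF UDP input. Host and port are placeholders.
services:
  logspout:
    image: gliderlabs/logspout          # swap for a GELF-enabled build
    command: "gelf://graylog.example.com:12201"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    restart: unless-stopped
```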
What I am trying to do from here is configure an automatic output on a given Graylog stream so that anything matching that stream is forwarded to an ELK stack (via Beats) that expects JSON input. I’ve tried creating GELF and Kafka outputs on that stream to send directly from Graylog to the target server’s ingestion points. This doesn’t seem to work out of the box, because our ELK cluster requires authentication tokens to be present in the request headers, and messages without them are rejected outright. I’ve also tried using the FileOutput plugin with a Beats agent, which DOES work (because I can configure the Beats agent to send the proper headers with the token values). BUT, with this approach, Graylog always insists on logging a flat-format message (i.e., it converts the original JSON message into a syslog-style line, stripping out the JSON formatting before writing the entry to the logfile).
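In case it clarifies the intent, this is roughly the Beats config I have in mind on the file-output side, assuming Graylog could be made to write JSON lines to a file. The paths, host, and ELK_TOKEN variable are placeholders, and the `headers` setting needs a Filebeat version that supports it:

```yaml
# Read JSON lines written by Graylog's file output and keep the original fields,
# then ship to the corporate Elasticsearch with the auth token as an HTTP header.
filebeat.inputs:
  - type: log
    paths:
      - /var/log/graylog-forward/*.json   # placeholder path for the Graylog output file
    json.keys_under_root: true            # lift JSON fields to the top level of the event
    json.add_error_key: true              # flag lines that fail to parse instead of dropping them

output.elasticsearch:
  hosts: ["https://corporate-elk.example.com:9200"]   # placeholder
  headers:
    Authorization: "Bearer ${ELK_TOKEN}"               # token supplied via environment
```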
So, in a nutshell, the questions are: Can Graylog write a JSON-formatted message to a logfile? Can either the GELF or Kafka output plugin accept a configurable custom header token to send to a target server, without modifying the existing plugin sources to add one (e.g., via a config file on disk)? Or is there a better solution to what I’m trying to do?
If it helps to identify a better solution, here is why we’re trying to do this: We have a corporate ELK stack to which all application logs are supposed to be sent. Which is all good, BUT… we have ZERO control over what happens there, so we can’t create custom alerts or automation actions when certain log conditions occur. We can’t do customized reports or charts, or much of anything with the log data once it gets there. So the thought was to insert a 2–3 node Graylog cluster as a sort of man-in-the-middle that my team could access and control, and use it for the custom reporting/charting we want, custom alert triggers that might kick off automation actions, etc., without affecting the logging that goes to the corporate ELK stack.
Any other suggestions are warmly welcomed (I have no pride left at this point).
Thanks in advance!