Tracking message source when using output to forward to another Graylog server

samf · February 2, 2022, 10:26pm

We are deploying Graylog as a monitoring tool across multiple client sites. If it’s relevant, we are using the Docker images at all client sites and for our main server.

Our plan is to configure multiple Graylog instances (one at each client site) and then have those instances output some or all of their messages to our central Graylog server.

I’ve got this set up and working, but I can’t see any way to identify the location a message came from.

I would like to be able to have all messages from the same source system at each client land on the central server in a single stream so we can easily apply rules, extractors etc to all the different input types. Essentially, Windows Log Stream at client A, B, C, D all land into Central Windows Log Stream on our central server.

The issue I’m having is that when they all land there, I have no idea where they’ve come from. How can I identify, or even better tag, the source? For instance, can I name each node with a Docker environment variable and then have that passed on with the output as a field? Or can I at least see the source IP a message came to us from? That would allow me to narrow down the client based on their public IP.

I’m assuming that once I have this set up, I will very easily be able to filter to different clients when I need to in searches or dashboards.

Thank you in advance for any guidance you can provide!

gsmith · February 2, 2022, 11:47pm

Hello @samf

Correct me if I’m wrong: you have remote Graylog servers sending messages to a central Graylog server, and you need a way to sort out the logs/messages from each DMZ? If that is correct, we have done this in a couple of different ways. The best one I know of is to create an input on the main server for each DMZ: zone-01, zone-02, zone-03.

Quick Example:

The output on the remote GL server.

From there you would be able to sort out your logs/messages with pipelines and/or extractors for the individual DMZs.

tmacgbay · February 2, 2022, 11:54pm

I have not used Graylog forwarding, so I can’t say if there is a setting in there that you can change. However, perhaps you can have your outlying Graylog servers create a new field that includes the source field, as well as any other relevant data, before you forward the message.

samf · February 3, 2022, 9:22am

Hi @gsmith, thanks for the reply. I am aware of this option, but I foresee an issue with scalability. Right now this will be deployed across a handful of sites, but I need something that can scale to hundreds of sites with as much automation as possible. These are small sites, but each will have its own server to potentially retain more data than will be retained at the central site.

I already have a more or less finalised method for deploying these servers in a 99% automated way.

In this context, needing to set up multiple inputs on the central server would not be ideal.

But on a more limiting note, these messages will be identical (apart from source), and I’d like to process them with the same extractors and then siphon them off into different streams on the basis of the data rather than the source. For example, critical security or high-risk events are retained longer and actively monitored by our staff, whereas run-of-the-mill data can be cleared after a few weeks.

From what I understand the best way to achieve this use case is with an extra field.

samf · February 3, 2022, 9:27am

Hi @tmacgbay, thanks for the reply, that would work!

Unfortunately I can’t figure it out. The extractor stuff I’ve found explains how to extract fields from messages, but what I actually need is just to add a static field to every message on the server. I could add an extractor to the incoming streams to do this, but how to write it is beyond me. I was hoping the answer would be “oh yeah, tick this box and Graylog will store the IP it received the message from”!

If it needs to be a customised extractor on each source server I can live with that, but can you guide me on how I would write an extractor to add a static field, please?

Alternatively, if it could pick up a system variable (i.e. the hostname of the server) and add it as a field, that would be even better, but that seems a bit of a pipe dream!

samf · February 3, 2022, 12:00pm

Found it!

On the Inputs screen, next to the Stop Input button for an input, just click More Actions → Add Static Field and add the field you want! I’m sure this can be automated via the API when I need to.
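A minimal Python sketch of how that API automation might look. Nothing here is confirmed in this thread: it assumes the static-field endpoint is `POST /api/system/inputs/{inputId}/staticfields` with a JSON body of `key` and `value`, and that an access token is sent as the basic-auth username with the literal password `token`. Check both against the API browser of your own Graylog version. The server URL, input ID, field name `client_site` and helper name are hypothetical placeholders. The function only builds the request, so it can be inspected before anything is sent.

```python
import base64
import json
import urllib.request


def build_static_field_request(base_url, input_id, key, value, token):
    """Build (but do not send) a request that adds a static field to an input.

    Assumed endpoint: POST /api/system/inputs/{inputId}/staticfields
    """
    url = f"{base_url.rstrip('/')}/api/system/inputs/{input_id}/staticfields"
    body = json.dumps({"key": key, "value": value}).encode("utf-8")

    req = urllib.request.Request(url, data=body, method="POST")
    req.add_header("Content-Type", "application/json")
    # Graylog expects this header on modifying requests; the value is arbitrary.
    req.add_header("X-Requested-By", "static-field-script")
    # Assumption: access-token auth, token as username, "token" as password.
    credentials = base64.b64encode(f"{token}:token".encode("utf-8")).decode("ascii")
    req.add_header("Authorization", f"Basic {credentials}")
    return req


if __name__ == "__main__":
    # Hypothetical values; replace with your own server, input ID and token.
    request = build_static_field_request(
        "https://graylog.example.com", "INPUT_ID", "client_site", "client-a", "TOKEN"
    )
    print(request.get_method(), request.full_url)
    # To actually send it: urllib.request.urlopen(request)
```

Running this once per site from the deployment tooling, with a different `client_site` value each time, would give the central server a field to filter on without needing a separate input per client.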

Thank you @tmacgbay and @gsmith for your help!

tmacgbay · February 3, 2022, 2:31pm

Glad you found a solution! That will at least put your site name in. If you are using Beats or nxlog, you can have the sidecar configurations add the host name. Here is an example of a Beats configuration that captures messages from Windows IIS and inserts the hostname as a field before the message is sent to Graylog. The line that does this is: test_hostname: ${sidecar.nodeName}

# Needed for Graylog
fields_under_root: true
fields.collector_node_id: ${sidecar.nodeName}
fields.gl2_source_collector: ${sidecar.nodeId}
output.logstash:
   hosts: ["${user.BeatsInput}"]
path:
  data: C:\Program Files\Graylog\sidecar\cache\winfilebeat\data
  logs: C:\Program Files\Graylog\sidecar\logs
tags:
  - windows
  - iis
filebeat:
  inputs:
    - type: log
      enabled: true
      # include_lines: ['example', 'Turf', 'stuff'] #Commented out... for now
      exclude_lines: ['^#'] # --exclude anything that starts with #
      fields:
        test_hostname: ${sidecar.nodeName}
      ignore_older: 7h
      paths:
        - R:\data\logs\iis\W3SVC2\*.log

You could also do it further down the path the message takes at the satellite office. Attach a pipeline to the stream associated with the local input(s) and use the source field to create a new, separate field to be picked up later. In its simplest form, the rule in the pipeline would look like this:

rule "the One True Source"
when
  true
then
  set_field("true_source", $message.source);
end

Also, mark your note as the solution for future searchers!

gsmith · February 3, 2022, 10:38pm

Nice

I completely understand, especially for larger environments. Maybe two or three DMZs, but after that it would be a pain.

Adding on to what @tmacgbay suggested: if you’re using nxlog, and depending on the type of Graylog input in use, the input block in the nxlog configuration can be named after your DMZ. I have done this before to filter out specific nodes for searches.

For example if this was configured on nxlog as…

<Extension gelf>
    Module      xm_gelf
</Extension>

# Place the zone name here, as the name of the Input block
<Input zone-01>
    Module      im_msvistalog
</Input>

This would be shown like this.

The field shown is generated automatically by the GELF input. From that field, your alerts, notifications, and re-routing can take place.

tmacgbay · February 4, 2022, 12:59pm

ms vista?

gsmith · February 4, 2022, 10:06pm

Yeah, that’s nxlog-ce.

system · February 18, 2022, 10:07pm

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.
