Graylog 3.0 filebeat configuration and tags

czechsys · February 18, 2019, 12:49pm

Hi,
with Graylog 2.5 I used collector-sidecar (now graylog-sidecar) with filebeat and tags. Every tag had its own configuration (for www, psql, php etc. logfiles). With Graylog 3.0 I am stuck on how to do the same. I can import the filebeat configuration into “Collector configuration”, but how can I distinguish between these two types of servers:
1] nginx+php+psql
2] nginx+php
The filebeat configuration was generated from those tags. Now I need to create two configs, one for 1] and one for 2]. So in general, do I have to create a config for every combination of tags in use? And when I need to edit one “tag”, will I have to edit every configuration that contains it?

Thanks.

benvanstaveren · February 18, 2019, 7:18pm

Graylog 3 no longer uses tags, instead it pushes an explicit full configuration to a sidecar, but it’s a manual action you have to perform.

So, for your two server types, you create two (full) configurations, one for each. And yes, you are correct: once you edit a config, that change is applied to all hosts to which you have assigned that configuration.

To make it easier, you can create “variables” that contain snippets of configuration that can be re-used between configurations. So you can create one for nginx, one for php and one for psql, then create a config that includes the variables for nginx+php+psql, and another that includes only nginx+php.

czechsys · February 19, 2019, 8:16am

So, variables are something like tags. Good.

But the downside… That change made configuration harder, even for automation.
1] Needing a unique name for every configuration (phpmysql, phpmysqlnginx, etc.) - I hate it.
2] With automation I added a tag to the sidecar config and restarted the service. Now what? I need to make the change in the web UI, or study the API and write a more complex playbook.

The simplicity is lost.

benvanstaveren · February 19, 2019, 8:52pm

You can still assign a configuration to multiple machines - it’s more like, a configuration represents a tag, and now you can only assign the single tag.

For automation, I’m afraid you’re stuck with having to do an API call. We’re in the process of solving it with some additions to our internal management system, since we already tag servers (and in previous Graylog that was a 1:1 match with the sidecar), in the current version we link a configuration to a tag and do an API call to have the appropriate config associated with a machine.

Bit more complicated, but I have to admit I’m liking the Graylog 3 implementation of the sidecar better.

bitfactory-henno · February 21, 2019, 9:45am

I also have been struggling to replicate the old configuration structure in Graylog 3. In 2.5.x it was ideal that you could just provide the correct tags in a server’s configuration, and Graylog would apply all configurations to that machine matching these tags.

Some tags were more generic (e.g. ‘linux’ which applied to all Linux systems and collected generic things in /var/log), and some specific for the logs of an application only installed on several machines, and some for specific machines.

Now I am forced to create a configuration for every former tag combination and put most of the configurations inside variables if I want to follow a DRY approach. Please tell me I’m overlooking something. Just allowing me to apply multiple configurations to a sidecar would solve all of this.

benvanstaveren · February 21, 2019, 2:53pm

Nope, unfortunately that’s how it is now. I actually have to admit it’s better this way because the previous tag system could actually get you a non-working configuration if you combined snippets and collector entries.

We solved it by having a ‘base’ configuration that always picks up syslog and auth.log (since that is our “linux” default), and every other thing we want to do is in a variable. We manage these mostly in Graylog, but our internal tooling that handles our inventory/ansible/etc. will use the server groups from ansible to discover what variables need to be included (by virtue of naming, e.g. “filebeat_docker” applies to all hosts in the “docker” group) and then checks to see if an exact configuration like that exists already (by naming, again) - if it does, it applies that config to the server, if it doesn’t, it creates it through the Graylog API, then applies it to a server.
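The naming-based lookup described above can be sketched roughly as follows. This is only an illustration, not the actual in-house tooling: the function names, the `base` prefix and the `-` separator are assumptions, and only the `filebeat_<group>` naming and the `${user.…}` reference syntax come from this thread.

```python
def variable_refs(groups, prefix="filebeat_"):
    """Map inventory groups to variable references, e.g. "docker" -> "${user.filebeat_docker}"."""
    return ["${user." + prefix + group + "}" for group in sorted(set(groups))]


def configuration_name(groups, base="base"):
    """Build a deterministic name so the same set of groups always maps to the same configuration.

    The "base-<group>-<group>" pattern is a hypothetical convention.
    """
    return "-".join([base] + sorted(set(groups)))


def plan(groups, existing_names):
    """Decide whether a configuration with the derived name can be reused or must be created first."""
    name = configuration_name(groups)
    action = "assign" if name in existing_names else "create_then_assign"
    return {"name": name, "variables": variable_refs(groups), "action": action}


if __name__ == "__main__":
    # existing_names would come from listing the configurations already stored in Graylog.
    print(plan(["php", "nginx"], existing_names={"base-nginx-php"}))
    print(plan(["docker"], existing_names={"base-nginx-php"}))
```

Sorting the group names makes the result independent of inventory order, so nginx+php and php+nginx resolve to the same configuration.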

A bit more work, granted, but I feel a bit more comfortable with this way of doing things since it’s all a very “explicit” thing, it’s push instead of pull, which tends to work out better in large scale environments.

But that’s just my opinion

P.S. You can always open feature request issues for the sidecar things, of course

Totally_Not_A_Robot · February 22, 2019, 7:57am

Wow, that’s the second “Awww crap, really?!” moment you’ve given me for Graylog 3.x.

I freaking loved the tags and subscriptions! It was super-duper wonderful to be able to say: “you need to grab this, this and this config for yourself” to each of our boxen. Also wonderful to set up with Ansible or Puppet. Moving away from that architecture seems like a step back; I don’t want to configure each box separately.

The logic of:

IF install application-stack
THEN add Sidecar tag

Was so easy to use!

EDIT:
Basically, what @bitfactory-henno said.

I can certainly see your point. Unfortunately many organizations are not mature enough to implement automation against the Graylog API. #Guilty

EDIT:
To sum it up, I’ve just warned my colleagues that upgrading from 2.x to 3.x is not something to take lightly. It is no small matter.

benvanstaveren · February 22, 2019, 8:53am

Definitely no small matter! It worked out for us since we really haven’t been using it for very long, so most of our in-house tooling wasn’t too set in its ways to be changed. Granted, tagging hosts with their ansible groups was easy, but error-prone.

We’re not quite there yet either, right now the entire thing of assigning configs to servers is a hot mess of perl scripts and doing things in Ansible you shouldn’t really do

jan · February 22, 2019, 10:15am

To sum it up, I’ve just warned my colleagues that upgrading from 2.x to 3.x is not something to take lightly. It is no small matter.

To be honest, following semantic versioning, this is the reason the major version number was raised: changed workflows and configuration are introduced, and they need attention.

We can’t improve without removing parts that do not work well. That does not mean users didn’t find them useful. But the sidecar in its pre-1.0 versions wasn’t really flexible and lacked the ability to work with collectors other than the ones we initially wrote it for. The most requested feature was the ability to integrate other collectors. The way it works now, we are not hardcoded to specific ones and any collector can be used.
The sidecar is now more generic.

benvanstaveren · February 22, 2019, 10:35am

Which is a good thing, however… the system of using tags to have the proper snippets pulled is one thing that did work very well - personally I’m not in favor of any one method, they both work for us, but it seems the tag-based configuration pull was in somewhat widespread use.

Ideally, I’d love to be able to assign a tag to a configuration I’ve created, and define that in the sidecar config so the sidecar can literally assign that configuration to itself - if a tag is supplied, of course. Would that be worth a feature request?

jan · February 22, 2019, 10:51am

Ideally, I’d love to be able to assign a tag to a configuration I’ve created, and define that in the sidecar config so the sidecar can literally assign that configuration to itself - if a tag is supplied, of course. Would that be worth a feature request?

Our idea was to use variables for that: in theory you can create a variable that holds a specific piece of configuration and use that variable in multiple configurations.

benvanstaveren · February 22, 2019, 11:59am

Yes, which we’re using right now as a “do not repeat yourself” type deal - but before with the tags, you assigned tags to a collector, and the collector would receive a configuration based on that. In the new mechanism, we need to explicitly tell Graylog to send a configuration to a collector. What I would like to do is have the option of telling a collector “this is your tag, you arrange with Graylog that you receive the configuration tagged with that” so one doesn’t have to do it in their deploy pipeline.

Again, for myself it’s not so much of a thing, we can use either way

jan · February 22, 2019, 12:01pm

It would be very helpful if you could describe the behaviour in a feature request for the sidecar.

thx

Totally_Not_A_Robot · February 22, 2019, 12:49pm

Fair points Jan, and I understand why you guys did it. So, not griping about Graylog and the devteam, just about the pain we’ll need to go through

benvanstaveren · February 22, 2019, 2:04pm

Yeah let me just say I’m not complaining, Graylog 3 is looking awesome

patcable · February 23, 2019, 5:57pm

I decided to dig in and figure how I will actually migrate to the new sidecar system if I end up going that route, since I rely on tags for configuration.

Using the API browser I noticed a PUT endpoint at /sidecars/configurations, which was encouraging… but the JSON model in the API browser only mentions a nodes object. After digging into the code, it looks like nodes is a way of referencing NodeConfiguration objects, which require node_id (string) and assignments (another array of objects) as values. That assignments object has two strings, configuration_id and collector_id.

You can get both collector_id and configuration_id by calling GET on /sidecars/configurations - add ?name=configname on that request to search for a specific named configuration. Once you have those values, you can build an object to send to /sidecars/configurations endpoint like so:

{
  "nodes": [
    {
      "node_id": "86cc6573-d443-4dc8-b589-4e9f1c6d7968",
      "assignments": [ 
        {
          "configuration_id": "5c716d9216698621505be8ae",
          "collector_id": "5c7066e279728d1cf7c10a9f"
        }
      ]
    }
  ]
}

After sending this to that PUT endpoint, the collector output shows:

time="2019-02-23T16:56:54Z" level=info msg="[filebeat] Configuration change detected, rewriting configuration file."

so it got the updated configuration. Hopefully this is helpful to folks looking to replicate the old behavior!
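For anyone scripting this, here is a minimal Python sketch of the same call. The payload shape and the `/sidecars/configurations` path are the ones described above. Everything else is an assumption to adapt to your setup: the API base URL, authenticating with an access token as the Basic-auth username and the literal password `token`, and sending an `X-Requested-By` header. The helper names are hypothetical.

```python
import base64
import json
import urllib.request


def build_assignment_payload(node_id, configuration_id, collector_id):
    """Build the body for PUT /sidecars/configurations as described in this post."""
    return {
        "nodes": [
            {
                "node_id": node_id,
                "assignments": [
                    {
                        "configuration_id": configuration_id,
                        "collector_id": collector_id,
                    }
                ],
            }
        ]
    }


def build_put_request(api_base, token, payload):
    """Prepare (but do not send) the PUT request; the auth scheme and headers are assumptions."""
    auth = base64.b64encode(f"{token}:token".encode("utf-8")).decode("ascii")
    return urllib.request.Request(
        url=api_base.rstrip("/") + "/sidecars/configurations",
        data=json.dumps(payload).encode("utf-8"),
        method="PUT",
        headers={
            "Content-Type": "application/json",
            "Authorization": "Basic " + auth,
            "X-Requested-By": "assign-sidecar-config",  # assumed to be required for write calls
        },
    )


if __name__ == "__main__":
    payload = build_assignment_payload(
        "86cc6573-d443-4dc8-b589-4e9f1c6d7968",
        "5c716d9216698621505be8ae",
        "5c7066e279728d1cf7c10a9f",
    )
    # Hypothetical host and token; replace with your own before sending.
    request = build_put_request("http://graylog.example.org:9000/api", "my-access-token", payload)
    print(request.get_method(), request.full_url)
    # To actually send it: urllib.request.urlopen(request)
```

The request is only constructed here, so the script can be run safely before pointing it at a real server.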

I think it’s important to mention, though: this change would be easier to handle if there were more complete documentation and examples on how to register a configuration to a sidecar programmatically. While I appreciate the flexibility of the new solution and the work you’ve done, bringing up SemVer and “this is better” as a response isn’t very empathetic or kind. For those who are using configuration management to deploy configuration out to their machines, it might be easier to deploy and configure filebeat without sidecar now.

Totally_Not_A_Robot · February 25, 2019, 9:23am

Thanks for your helpful work Patrick!

Hmmz, yea… now that you mention it. That would be an alternative approach.

bitfactory-henno · February 27, 2019, 8:35am

That’s what I’m doing on my FreeBSD machines right now due to the gosigar library issue, however for me it’s not really a great way to quickly change configurations everywhere right now (I’m using Ansible and the plays take quite long, I’m planning to split things into roles so I can maybe only run the Filebeat part on each host).

flosto · March 7, 2019, 3:01pm

@benvanstaveren I would like to try this approach but am struggling with multiline strings in variables:
Say I have a multiline snippet/variable:

multiline:
  match: after
  negate: true
  pattern: ^20[0-3][0-9]\.

that I want to include in my filebeat config like that:

fields_under_root: true
fields.collector_node_id: ${sidecar.nodeName}
fields.gl2_source_collector: ${sidecar.nodeId}

filebeat.inputs:
- input_type: log
  paths:
    - /var/log/*.log
  type: log
  ${user.multiline}
output.logstash:
   hosts: ["192.168.1.1:5044"]
path:
  data: /var/lib/graylog-sidecar/collectors/filebeat/data
  logs: /var/lib/graylog-sidecar/collectors/filebeat/log

Then the preview shows:

fields_under_root: true
fields.collector_node_id: <node name>
fields.gl2_source_collector: <node id>

filebeat.inputs:
- input_type: log
  paths:
    - /var/log/*.log
  type: log
  multiline:
  match: after
  negate: true
  pattern: ^20[0-3][0-9]\.
output.logstash:
   hosts: ["192.168.1.1:5044"]
path:
  data: /var/lib/graylog-sidecar/collectors/filebeat/data
  logs: /var/lib/graylog-sidecar/collectors/filebeat/log

So it screws up the indentation. Any idea how I can get multiline variables to always have the same indentation? Or did I misinterpret the snippets idea entirely?

benvanstaveren · March 7, 2019, 3:51pm

Unfortunately, snippets are entirely as-is, so if you need the same snippet but with twice the indentation, you will need to add two snippets, each having the correct indentation for their use case
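If the snippets are generated by external tooling rather than typed into the web UI, the differently indented variants can be produced from a single source. The sketch below is only a workaround outside Graylog, which still inserts variables as-is, and `indent_snippet` is a hypothetical helper name.

```python
import textwrap


def indent_snippet(snippet, spaces):
    """Return the snippet with every non-blank line shifted right by `spaces` columns."""
    normalized = textwrap.dedent(snippet).strip("\n")
    return textwrap.indent(normalized, " " * spaces)


if __name__ == "__main__":
    multiline = (
        "multiline:\n"
        "  match: after\n"
        "  negate: true\n"
        "  pattern: ^20[0-3][0-9]\\.\n"
    )
    # One source snippet, two stored variants with different indentation.
    print(indent_snippet(multiline, 2))
    print(indent_snippet(multiline, 4))
```

Because every line, including the first, carries its own indentation, a variant produced this way would presumably be referenced at the start of a line in the configuration (no leading spaces before `${user.multiline}`).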
