Migration from ELK: where to find data views or extractors?

I’m currently evaluating Graylog, coming from the ELK stack. I succeeded in setting up the Graylog/OpenSearch application on Ubuntu, and I also created two inputs: one as a UDP syslog server and one receiving Filebeat data with about 3 GB/day of various log data.

What I am missing are ready-made extractors, e.g. for syslog. In ELK there is something like the data views delivered with Filebeat; in Graylog I have not found anything comparable by searching the Marketplace. In the open source product Zabbix, for example, there is a template store, which is a good basis for building your own templates and avoids reinventing the wheel.

I would also be interested in extractors for Kea DHCP, Rspamd, and many other open source products.

So: where can I find common extractors or “templates” for typical types of log files?

@biller, the Graylog Marketplace is where you might find extractors or processor pipelines you can use, though I don’t know if parsers for those products specifically are there.

However, depending on the data source, it may be easy enough to create your own. If you can post a few examples of each log, we may be able to suggest a pipeline function, such as key-value pairs, that will parse them with a simple pipeline rule.
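For illustration, here is a minimal sketch of such a rule using Graylog's built-in key_value() function, assuming the message body consists of space-separated key=value pairs (the rule name and the pipeline/stage wiring are placeholders you would adapt):

rule "parse key-value pairs"
when
  has_field("message")
then
  // split the raw message into fields such as status=ok or user=alice
  let kv = key_value(value: to_string($message.message));
  set_fields(kv);
end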

One caveat about the Marketplace, though: some of the existing content was created for previous versions of Graylog. The 4.x and newer content may work in 5.x, but the older content almost certainly will not. Hope that saves you a bit of time and frustration.

Was that an answer from an AI bot? Sorry, but your answer is generic and does not lead anywhere. The idea is not to do the same thing that 1000 other Graylog users have done before. And as mentioned, searching the Marketplace does not produce useful results, so the approach of unstructured search does not seem helpful, even for very basic beginner questions like mine.

Hey @biller, there are no bots here! Chris is one of our SEs at GL, and is sending you down the best path to find what you are looking for. Unfortunately, if you don’t see what you are looking for in the Marketplace section of the forum, then the community has not developed any preexisting content for that source. The vast majority of the Graylog created parsing library is currently reserved for our Operations and Security customers. That being said, if you are looking for guidance on how to start building your own parser for certain kinds of logs you have in your environment, feel free to post a sample here and we will be more than happy to give you guidance on how to get started utilizing Processing Pipelines.

1 Like

So,
my first impressions of Graylog are:

  • The installation is quite straightforward. I managed it within a day and also created Ansible roles for it.
  • After the installation there is no real guidance for new customers on how to get a first small example running.
  • The “documentation” in Graylog Documentation is hijacked by the marketing team and of absolutely no benefit to technical staff.
  • The community section “Marketplace” does not lead to results, and there is no community support for very basic questions such as extractors for Filebeat data.
  • The API sounds promising. I started to build a library around it; however, there is no reference with examples.
  • There is no guidance about best practices and the different ways to organize log management within Graylog.
  • Graylog community support only gives truisms or replies to basic questions that there is no answer.

In total I spent several days doing very basic things after the installation, without knowing whether this will be the appropriate way to use the application.

Coming back to the topic of this thread:

  • Compared to ELK, quite some emphasis has been put on making the application installation easier.
  • The Beats concept provides much more integration into ELK than into Graylog.
  • Integration into an existing architecture is better with ELK, because the API reference in Graylog is hard to find.
  • I have not fully evaluated security yet. We successfully and easily mirrored the package sources to our internal installation servers. What is still pending on our side is how Graylog behaves with (automatic) updates; here ELK is quite a pain.

Hi @biller, if all you are looking for is a parser for Filebeat, then using the Beats input should parse your Beats messages automatically. If you want to parse the content of the messages Filebeat transmits, that will have to be built custom for the messages you are sending, and as offered in my previous reply, we can help point you in the right direction with that.

As for the topic of this thread, I am not exactly sure what it is at this point, other than you wanting to share your negative opinions. You asked a generic question which has been answered two different ways now, both of which asked for more information, which you have not provided. If you have specific questions you would like to ask about best practices, parsing, our API, etc., I encourage you to ask those in separate topics. We have answers for all of these things, many of which can be found in this forum and on our blog.

Lastly, you brought up our API multiple times, so here is a link to the documentation on how to access our in-product API Browser for you to work with. If you have feedback for our documentation team as to how they can improve our docs, feel free to message it to me directly, so I can ensure they receive it.
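As a quick illustration (a sketch, not taken from your setup: the host and endpoint below are placeholders), the REST API can be queried with an access token created under your user profile, where the token is used as the username and the literal word token as the password:

curl -u <your-token>:token \
     -H 'Accept: application/json' \
     http://graylog.example.com:9000/api/system/inputs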

Feel free to message me directly with any concerns you have going forward, and I will be happy to assist you where possible.

Hi Hermann,
We appreciate the feedback.

It’s important to understand the implicit trade-off when using free software: time vs. money. Using a free solution typically does require an upfront investment in time. It’s also not helpful to compare solutions directly, as they are developed by different people to solve different problems.

If you are interested in learning more about Graylog, there are some helpful resources. The documentation is a good first step; I recommend reading through some of the pages to get oriented with how Graylog works.

You don’t need to read every word on every page, but you should at least become familiar with the topics.

This blog post is very helpful for getting an understanding of processing pipelines and the concepts you can use to write your own: Graylog Parsing Rules and AI Oh My!. It looks intimidating, but I’m confident that you can do it.

I understand the desire to have something provided for you, but I believe you will get much more out of the experience by learning the basics of Graylog, pipeline rules, even regex, and experimenting with those concepts while using Graylog to solve problems or challenges specific to you.

I hope this is helpful. If you have any other specific questions, feel free to create a new topic for each individual question. My team and I check this forum frequently and provide high-quality replies.

About Sidecar… If I understand it correctly, then Sidecar for Linux is a filebeat.yml configurator?
Would it be possible to see an example of such a generated filebeat.yml file in order to get a first look and feel?

Sidecar behaves the same way on all three platforms: Windows, Linux, and macOS. It serves as a central point of management for “collectors” like Beats and NXLog. So in short, yes, it can be a filebeat.yml configurator, among other things.

One important thing to know is that Sidecar manages the collector itself, meaning it distributes the collector configuration (e.g. the .yml file) and starts/stops the processes. This means that on Linux, when you install Filebeat, you do not enable its service, since Sidecar will handle that. You also do not need to manually configure Filebeat on the Linux server, as it will consume a config generated by Sidecar.
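For example (a sketch assuming Filebeat was installed from the standard package on a systemd-based distribution; service names may differ on your system):

# do not enable Filebeat's own service; Sidecar launches the process itself
sudo systemctl disable --now filebeat
# Sidecar runs as its own service and spawns the collector with the generated config
sudo systemctl enable --now graylog-sidecar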

Here is an example Filebeat config that I use in my cluster.

From the Graylog UI perspective:

# Needed for Graylog
fields_under_root: true
fields.collector_node_id: ${sidecar.nodeName}
fields.gl2_source_collector: ${sidecar.nodeId}
fields.source: ${sidecar.nodeName}
fields.config_tag: graylog-server

filebeat.inputs:

- input_type: log
  paths:
    - /var/log/path/log.log
  multiline.type: pattern
  multiline.pattern: '^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}'
  multiline.negate: true
  multiline.match: after
  type: log

- type: filestream
  id: bts
  paths:
    - /home/user/path/log.log
  parsers:
    - multiline:
        type: pattern
        pattern: '^\['
        negate: true
        match: after

output.logstash:
   hosts: ["<ip addr>:5044"]

path:
  data: /var/lib/graylog-sidecar/collectors/filebeat/data
  logs: /var/lib/graylog-sidecar/collectors/filebeat/log

Generated on the server itself via /var/lib/graylog-sidecar/generated/64219d0478419f473d47d2b1/filebeat.yml:

# Needed for Graylog
fields_under_root: true
fields.collector_node_id: <redacted>
fields.gl2_source_collector: <redacted>
fields.source: <redacted>
fields.config_tag: graylog-server

filebeat.inputs:

- input_type: log
  paths:
    - /var/log/path/log.log
  multiline.type: pattern
  multiline.pattern: '^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d{3}'
  multiline.negate: true
  multiline.match: after
  type: log

- type: filestream
  id: bts
  paths:
    - /home/user/path/log.log
  parsers:
    - multiline:
        type: pattern
        pattern: '^\['
        negate: true
        match: after

output.logstash:
   hosts: ["<ip addr>:5044"]

path:
  data: /var/lib/graylog-sidecar/collectors/filebeat/data
  logs: /var/lib/graylog-sidecar/collectors/filebeat/log

Thank you for the comprehensive answer

1 Like

So… what I have achieved so far:

  • The problem with the API was a missing http_publish_uri entry in /etc/graylog/server/server.conf. After fixing that, the API was reachable by token without asking for passwords (see the sketch below this list).
  • I managed to write a pipeline for Ansible results:
    ansible-log → rsyslog → filebeat → graylog
    It was a bit tricky to get running. I had to change the processing order so that pipelines are processed before stream rules, under “System/Configurations” → “Message Processors”.
    Find the Grok-based pipeline rule below.
  • I created a small library of API commands which saves my Graylog configurations in JSON format into a Git repo.
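For reference, a minimal sketch of the two pieces mentioned above (URI, host, and endpoint are illustrative placeholders, not my exact values):

# /etc/graylog/server/server.conf
http_bind_address = 0.0.0.0:9000
http_publish_uri = http://graylog.example.com:9000/

# example call from the API library: dump stream definitions as JSON into a Git working copy
curl -s -u <your-token>:token -H 'Accept: application/json' \
     http://graylog.example.com:9000/api/streams > streams.json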

As of now my evaluation looks good. :slight_smile:

rule "ansible_extract_summary_fields"
when
  contains(to_string($message.message), "ansible_bootstrap") or
  contains(to_string($message.message), "ansible_log")
then
  let ansible_summary = grok(
    pattern: "%{SYSLOGTIMESTAMP:timestamp} %{DATA:host.hostname} %{DATA:ansible.tag}(\s.*\|)? %{DATA:ansible.hostname}\s+: ok=%{INT:ansible.result.ok}\s+changed=%{INT:ansible.result.changed}\s+unreachable=%{INT:ansible.result.unreachable}\s+failed=%{INT:ansible.result.failed}\s+rescued=%{INT:ansible.result.rescued}\s+ignored=%{INT:ansible.result.ignored}",
    value: to_string($message.message)
  );

  set_fields(ansible_summary);

  route_to_stream(
    name: "ansible",
    remove_from_default: true
  );
end

1 Like

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.