I just needed to clarify the terminology and basic structure of Graylog so I knew where to go to configure things. With tools like this, I do all the setup work once, then use them only periodically, and when I come back months later I have forgotten how to add or change something. So a picture is worth 1000 words and instantly refreshes my memory.
So I thought I’d share. This is very basic. But if I have something wrong feel free to let me know and I’ll correct it.
EDIT - Current Version (after discussions below):
Assuming the arrows represent message path, here are some thoughts:
- After the input are extractors. (Message Processors Configuration matters around here) Extractors and pipeline rules do similar things.
- The rules a stream has define which messages it pulls in from any/all inputs.
- Pipelines are assigned to streams (one or many) and massage the message via rules en route to the index.
- I think Alerts fire off on index queries so they would follow the index next to the dashboards rather than tailing off streams.
- Decorators can be put in with dashboards since they do not affect data, only how data is displayed.
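To make the message path in the list above concrete, here is a toy Python model of the default order (extractors, then stream rules, then pipelines, then the index). It is only a sketch: none of these names are Graylog APIs, the source name `fw01` and the `Firewall` stream are made up, and the default stream and processor ordering options are left out.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Message:
    fields: dict
    streams: set = field(default_factory=set)


def extract_src_ip(msg: Message) -> None:
    """Extractor stand-in: copy the first IPv4-looking token into its own field."""
    match = re.search(r"\b\d{1,3}(?:\.\d{1,3}){3}\b", msg.fields["message"])
    if match:
        msg.fields["src_ip"] = match.group(0)


def is_firewall(msg: Message) -> bool:
    """Stream rule stand-in: match on the (hypothetical) source name."""
    return msg.fields.get("source") == "fw01"


def tag_firewall(msg: Message) -> None:
    """Pipeline rule stand-in: massage the message en route to the index."""
    msg.fields["category"] = "firewall"


EXTRACTORS = [extract_src_ip]              # attached to the input
STREAM_RULES = {"Firewall": is_firewall}   # stream name -> rule
PIPELINES = {"Firewall": [tag_firewall]}   # stream name -> connected pipeline rules
INDEX = []                                 # stands in for the index


def process(msg: Message) -> Message:
    for extractor in EXTRACTORS:                   # 1. extractors
        extractor(msg)
    for stream, rule in STREAM_RULES.items():      # 2. stream rules pick streams
        if rule(msg):
            msg.streams.add(stream)
    for stream in sorted(msg.streams):             # 3. pipelines of matched streams
        for rule in PIPELINES.get(stream, []):
            rule(msg)
    INDEX.append(msg)                              # 4. message lands in the index
    return msg


example = process(Message({"source": "fw01", "message": "deny from 10.0.0.5 port 22"}))
print(example.fields, example.streams)
```

Alerts and dashboards would then read from `INDEX` rather than from the stream itself, which matches the placement suggested in the list.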
I like the design! Those are my thoughts… though I wonder if I am slightly wrong on some things…
So Inputs - Extractors and then Streams?
So you think I should connect the Pipelines as an upstream requirement of Streams instead of inputs?
Alerts go after indexes? Good to know.
Decor - I’ll attach to the Dash. Thanks for setting me straight. I’m just trying to separate all of the components and how they interact on a basic level for newbs trying to understand.
My biggest issue when trying to read the docs was working out in which order to add things and learn things to get data the way I wanted.
Yes. I think this can be modified in your “Message Processors Configuration” under System/Configurations but most have it set up with Message Filter Chain first and Pipeline processor second.
Pipelines require a stream to function; streams work fine without pipelines (in case you are super awesome at extractors like @gsmith). Pipelines happen IN a stream. Once you connect a pipeline to a stream, anything the stream rules capture gets shot down the pipeline, or pipelines if you connect multiple to a stream. (Also of note: pipelines have stages, with their own method for how they sequence between pipelines… but that might be more nitty-gritty than you want to get into for this… that’s maybe a 10 foot view.)
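For anyone who does want the nitty-gritty on stages, here is a rough Python sketch of how I understand the sequencing: when several pipelines are connected to the same stream, all rules in the lowest-numbered stage run first across every pipeline, then the next stage, and so on. The pipeline names and the `note` helper are hypothetical, and the per-stage condition that decides whether a message continues to the next stage is left out.

```python
from collections import defaultdict


def run_connected_pipelines(message, pipelines):
    """Run every rule of every connected pipeline, lowest stage number first."""
    rules_by_stage = defaultdict(list)
    for pipeline in pipelines:
        for stage, rules in pipeline["stages"].items():
            rules_by_stage[stage].extend(rules)
    for stage in sorted(rules_by_stage):
        for rule in rules_by_stage[stage]:
            rule(message)
    return message


def note(label):
    """Build a hypothetical rule that just records when it ran."""
    def rule(message):
        message.setdefault("trace", []).append(label)
    return rule


# Two made-up pipelines connected to the same stream.
parse = {"name": "parse", "stages": {0: [note("parse:0")], 2: [note("parse:2")]}}
enrich = {"name": "enrich", "stages": {1: [note("enrich:1")]}}

result = run_connected_pipelines({"message": "hello"}, [parse, enrich])
print(result["trace"])  # ['parse:0', 'enrich:1', 'parse:2']
```

The point of the sketch is that the stage number, not the order in which pipelines were connected, decides what runs first.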
Tom Lawrence does a great walkthrough of Graylog:
I must admit having a complete layout of a logical diagram for Graylog like this is really nice to have here. Doing a good job
Event Definitions can be attached to a specific stream; if none is configured, they will search all streams. Notifications are attached to Event Definitions. Index → Stream → Alerts
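As a sketch of that relationship, the toy Python below treats an event definition as a search over indexed messages, optionally limited to certain streams, with notifications produced only when the definition fires. The dictionary keys (`title`, `query`, `threshold`, `streams`) are made up for illustration and are not Graylog's configuration schema; in a real setup a scheduler would run this kind of check on an interval.

```python
def matching_messages(index, query, stream_ids=None):
    """Return indexed messages that satisfy `query`, optionally limited to streams."""
    hits = []
    for doc in index:
        # With no stream filter configured, every stream is searched.
        if stream_ids is not None and not (set(doc["streams"]) & set(stream_ids)):
            continue
        if query(doc):
            hits.append(doc)
    return hits


def check_event(index, definition):
    """Run one check of an event definition and return its notifications."""
    hits = matching_messages(index, definition["query"], definition.get("streams"))
    if len(hits) >= definition["threshold"]:
        return [f"{definition['title']}: {len(hits)} matching messages"]
    return []


def failed_login(doc):
    return "failed" in doc["message"]


# Hypothetical indexed messages, each tagged with the streams it was routed to.
SAMPLE_INDEX = [
    {"message": "login failed", "streams": ["ssh"]},
    {"message": "login failed", "streams": ["vpn"]},
    {"message": "login ok", "streams": ["ssh"]},
]

all_streams = {"title": "Failed logins", "query": failed_login, "threshold": 2}
ssh_only = dict(all_streams, streams=["ssh"])

print(check_event(SAMPLE_INDEX, all_streams))
print(check_event(SAMPLE_INDEX, ssh_only))
```

Attaching the definition to a stream only narrows the search; the data still comes from the index.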
I wouldn’t put nxlog etc. inside sidecars; sidecars just launch a specific log shipper (like in the green box you have there) based on data they receive from Graylog. Also, rules are an integral part of streams: without rules there are no streams. Indices and pipelines are both dependent on streams, but messages ultimately go to indices. Outputs are dependent on streams too, so I’d put those as equal to alerts, indices and pipelines. Ideally pipelines should be between streams and indices, right?
The way I did the rules with the streams was sort of what I meant. It was implied you need them, like you need connections, stages and rules for the pipelines to work.
Trying to keep this simple, but I could wrap each general function with its dependencies to show they are necessary.
Point taken though. I’ll add a few words to clarify and wrap the streams.
Not entirely sure I follow about the event definitions. Let me redo it and tell me if this is right.
Seems like a lot of work is done within the streams.
This a little more accurate?
I could use you over here to make a couple of logical diagrams. Good job.
I think you have achieved this.
I always thought alerts work by periodically searching indices, and when I attach an alert to a stream it will just filter for “stream: my-stream-id” in this periodic search.
Otherwise LGTM if graylog is configured like this
But if you reverse the order of these two, then Pipelines will come first and Extractors and Stream Rules after them. Pipelines will then see everything come in on the “All messages” stream, which can look useless until they tell you pipelines can do route_to_stream. Then the message appears in the new stream and the pipelines connected to that stream start to run. So it’s possible to do everything in pipelines too.
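A small Python simulation of that pipelines-first flow, as a sketch only: every message starts in “All messages”, a rule connected there routes it to another stream, and the pipelines connected to the new stream then get their turn. The Python `route_to_stream` here is just a stand-in for the pipeline function of the same name, and the `SSH` stream and both rules are hypothetical.

```python
def route_to_stream(message, name):
    """Python stand-in for the pipeline function of the same name."""
    message["streams"].add(name)


def route_ssh(message):
    """Hypothetical rule connected to the "All messages" stream."""
    if "sshd" in message["message"]:
        route_to_stream(message, "SSH")


def mark_ssh(message):
    """Hypothetical rule connected to the "SSH" stream."""
    message["service"] = "ssh"


# stream name -> rules of the pipelines connected to it
CONNECTIONS = {
    "All messages": [route_ssh],
    "SSH": [mark_ssh],
}


def run_pipelines_first(message):
    """Process a message stream by stream until no newly routed stream is left."""
    message.setdefault("streams", set()).add("All messages")
    processed = set()
    while True:
        pending = sorted(message["streams"] - processed)
        if not pending:
            return message
        for stream in pending:
            processed.add(stream)
            for rule in CONNECTIONS.get(stream, []):
                rule(message)


print(run_pipelines_first({"message": "sshd[42]: Accepted password"}))
```

No stream rules or extractors are involved here, which is the sense in which everything can be done in pipelines.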
So is nisow95612 correct? Or how is the diagram as it stands?
He is actually right, but this depends on how you’re setting up your Graylog instance. For basic understanding you’re good. You could put a note about the settings in Processor Configuration.
I like the diagram in general a lot! I’m wondering, though, if it would be suitable to put the Output on the streams as well. From my point of view the output is configured per stream after all the processing is finished.
I use Lookup tables mostly in pipelines - I’m not aware of any other way to use them. I’d vote to shift them from Dashboards to Pipelines.
Good catch on the Dashboard tip.
After reading about Outputs.
All of these Outputs first write messages to an on-disk journal in the Graylog cluster. Messages stay in the on-disk journal until the Output is able to successfully send the data to the external receiver. Once the messages have been written to the journal, they are optionally run through a processing pipeline to modify or enrich logs with additional data, transform the message contents, or filter out some logs before sending.
So I believe you’re correct @ihe. I’m still learning something new.
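The journal behaviour described there can be sketched in a few lines of Python. This is an illustration of the idea only: the `JournaledOutput` class is hypothetical, the journal is an in-memory queue rather than an on-disk one, and the optional processing pipeline step is left out.

```python
from collections import deque


class JournaledOutput:
    """Toy output that keeps messages queued until the receiver accepts them."""

    def __init__(self, send):
        self.send = send          # callable returning True on successful delivery
        self.journal = deque()    # stand-in for the on-disk journal

    def write(self, message):
        """Messages always go to the journal first."""
        self.journal.append(message)

    def flush(self):
        """Try to deliver in order; stop at the first failure and keep the rest."""
        delivered = 0
        while self.journal:
            if not self.send(self.journal[0]):
                break
            self.journal.popleft()
            delivered += 1
        return delivered


# Hypothetical receiver that can be switched between down and up.
received = []
receiver_state = {"up": False}


def send_to_receiver(message):
    if not receiver_state["up"]:
        return False
    received.append(message)
    return True


output = JournaledOutput(send_to_receiver)
output.write("first")
output.write("second")
```

Calling `output.flush()` while the receiver is down leaves both messages in the journal; once `receiver_state["up"]` is set to `True`, the next flush delivers them in order.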
I will try to revise. So it only outputs from the streams?
This is correct from what I read.
Wouldn’t it be output from the Stream “Rules”? Or just Streams (All)?
It doesn’t let me edit my post, so if an admin could post this to the original and replace the old one, please. Thanks.
(added current diagram to the initial post)