Hello graylog community,
I have one more question this month:
Per Create/Cycle index set to specific ID? I am naming our index templates like “logs_r01m”, “logs_r06m”, “logs_r12m”. I am using stream filters to pick out logs for each retention period / index set. It works well, but lately I had to create a couple of special retention periods and realized that managing stream rules that route the messages to individual indices is getting harder and harder.
If I could use the “All messages” stream as a fallback, I would use it for the default retention period and considerably simplify the stream rules. Unfortunately, I see no way to change where logs from “All messages” go. They are forced into the “Default index set” and from there into the graylog_* indices.
I would very much like you to prove me wrong and to show me how to redirect them.
Oh, right, the answer to everything is “Pipeline processor”.
Obviously I can write a rule that reroutes everything from a default stream into another one and connect that to “All messages” stream.
But that approach raises a few complications. First, it forces me to mix “message processing” and “message routing” in the pipelines, making them even more convoluted and making stream routing even more non-debuggable. Second, there is a lack of documentation on what happens when pipelines re-route messages into a different stream:
I mean - Because messages are put through pipelines based on their stream membership, what happens when a pipeline rule changes stream membership?
Does pipeline processing continue like nothing happened?
Is pipeline processing aborted if the pipeline is no longer matching?
If so, does it abort after the step that changed streams, or after the pipeline runs to its end?
Is pipeline processing started in a newly matched pipeline?
If so, does it start at the “current” pipeline step, or the new pipeline start from its beginning?
Yes, I am using this. But the problem is crafting stream rules so that no message slips into graylog_* via ‘All messages’. All streams are set to “Remove matches from ‘All messages’ stream”.
Streams marked in red are there just to pull all remaining messages into the default retention period index set. If it was possible to redirect ‘All messages’, all these streams (50%!) would be unnecessary.
Rather than setting up so many stream rules, set up rules in a pipeline that create flags-as-fields and use pipeline staging to decide what to do with the flags it finds. In later stages you can route to streams based on flag fields you find and even delete a flag field if you no longer want it. This might end up as a routing stream for all incoming messages, with subsequent pipelines attached to the routed-to stream. You can use rule naming conventions to keep these routing rules alphabetically grouped for easier future edits.
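A minimal sketch of the flags-as-fields idea, as two rules placed in consecutive stages. The field name `retention_flag`, the source value, and the stream name are made up for illustration:

```
// Stage 0: tag the message with a routing flag (field name is illustrative)
rule "flag long-retention sources"
when
    has_field("source") && to_string($message.source) == "firewall01"
then
    set_field("retention_flag", "r12m");
end

// Stage 1: act on the flag, then clean it up
rule "route r12m flags"
when
    has_field("retention_flag") && to_string($message.retention_flag) == "r12m"
then
    route_to_stream(name: "logs_r12m", remove_from_default: true);
    remove_field("retention_flag");
end
```

Keeping the flag-setting and flag-consuming rules in separate stages is what lets later stages make decisions on everything the earlier stages tagged.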
Does pipeline processing continue like nothing happened?
I believe processing runs to the end of the pipeline regardless of when you route it. Easy to confirm in testing if you want to be sure.
Is pipeline processing aborted if the pipeline is no longer matching?
Nope, once you start a pipeline, it runs to the end
If so, does it abort after the step that changed streams, or after the pipeline runs to its end?
see previous answer
Is pipeline processing started in a newly matched pipeline?
Always. You can have multiple pipelines attached to a stream but I don’t think you can control sequence… yet. I think that’s what you were thinking…
If so, does it start at the “current” pipeline step, or the new pipeline start from its beginning?
Pipelines always start from the beginning and run all the way out. There were some questions in the forums recently asking why pipelines continue to run after a drop_message() function… this is what leads me to believe there is nothing that currently aborts a pipeline.
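The drop_message() observation above can be seen with a rule like this one (the rule name and the level threshold are illustrative; the debug() call just writes to the Graylog log so you can watch what still executes):

```
rule "drop noisy debug messages"
when
    has_field("level") && to_long($message.level) >= 7
then
    drop_message();
    // per the observation above, actions and later stages may still execute
    // before the message is finally discarded
    debug("dropped a debug-level message");
end
```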
It would be nice to choose pipeline sequence on a stream - or for that matter rule sequence in a pipeline stage… or even to have an abort_stream() function… all those are currently potential feature requests.
EDIT: Read further down for more detail - messages can run through multiple streams concurrently but execute in the same stage… but the current docs are unclear about what the sequence is on use of route_to_stream()
Thank you for this long answer.
Yes, I should have said I am asking “In case someone knows” and otherwise fall back to experimenting.
Let’s take a simple scenario:
two streams, ‘All messages’ and ‘Alternate stream’
no stream rules, everything goes to ‘All messages’
two pipelines: ‘Default’, connected to ‘All messages’, and ‘Alternate’, connected to ‘Alternate stream’.
both pipelines have stages -2, -1, 0 and 1
Now, a new message arrives and is routed to ‘All messages’, so processing in pipeline ‘Default’ starts.
Let’s imagine in step 0 there is route_to_stream(name:"Alternate stream", remove_from_default: true);
What do you expect to happen? OK, pipeline ‘Default’ will probably run all the way to its end.
You expect that the newly matched pipeline ‘Alternate’ will start at this point. Will it start processing its stage -2 concurrently with Default’s stage 1? Or will it wait until pipeline Default finishes and start after that?
Good question - someone inside Graylog or who has tested that scenario would have to answer… Maybe @aaronsachs can provide some insight? Based on how Graylog handles rules within each stage, it would suggest that the order in that scenario is not guaranteed - which is unhelpful. The takeaway is to be mindful and place your routing and dropping at the end of your pipeline stages.
Have you seen the Pipeline Simulator? I personally haven’t used it in a long time, but it did help me out with Stages and Rules. Maybe it will give you better insight.
@tmacgbay - Yes, keeping fingers crossed that your call for devs will succeed.
I don’t want my rules to depend on accidental/unstable behavior.
@gsmith - nice idea. Unfortunately, it is just giving me errors.
PS: The solution to my original question is trivial - assuming you avoid routing in pipelines like I do.
Just create a stream “All messages (Editable)” and attach this one-liner rule to All messages:
route_to_stream(name: "All messages (Editable)", remove_from_default: true);
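Wrapped in full rule syntax (the rule title is arbitrary), that one-liner becomes:

```
rule "redirect All messages"
when
    true
then
    route_to_stream(name: "All messages (Editable)", remove_from_default: true);
end
```

Attach this rule to a pipeline connected to ‘All messages’; the “All messages (Editable)” stream can then be pointed at whichever index set holds the default retention period.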
Oof. Off the top of my head, I believe pipelines are run in parallel. E.g., two pipelines with stages 0, 1, and 2 will run matching stages in parallel with each other.
So I am explicitly guaranteed that a newly matched pipeline will somehow start. Nice find @tmacgbay !
I think this also answers that clone_message() / create_message() will just start their own pipelines.
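If that holds, a clone sent to another stream should get its own pipeline run. A hypothetical rule sketch (the field name and stream name are invented; route_to_stream() takes an optional message parameter for routing a specific message such as a clone):

```
rule "clone to audit stream"
when
    has_field("audit")
then
    // the clone starts life in the same streams as the original,
    // so route it explicitly to where it should go
    let cloned = clone_message();
    route_to_stream(message: cloned, name: "Audit stream");
end
```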
The only question left is: Will A1 run together with M3 or after M3?
Built a test scenario where a message rides two pipelines, and as expected the message hit all stages in numerical sequence. When the second test pipeline was disconnected from streams and I used route_to_stream() to get a message to it, processing in the new pipeline only started after the first pipeline had finished, and it ran all stages regardless of where the initiating route_to_stream() was placed.
Where:
Pipeline one has stages MINUS TWO, ZERO and TWO
Pipeline two has stages MINUS ONE, and ONE
A message starting in more than one pipeline will run them in parallel, staying in the same stage numbers all the way through. Example results:
A message that gets to a pipeline via route_to_stream() will finish its current pipeline and then start the new pipeline from its beginning. Example results:
Certainly not comprehensive tests but interesting results!