Pipeline best practices for message intake?

I’m administering a low-volume, self-hosted, single-node Graylog Open deployment. I’m new to Graylog, and one thing that’s unclear to me is the current thinking behind stream rules and pipelines.

Stream rules are described as a legacy feature (or soon to be one). Trying to be future-proof, I decided to forgo Stream rules entirely for my newly created streams, which means I’m not using the Data Routing page at all. Instead, in the Pipelines pages, I’m just using the Route to Stream rule in the last stage of my pipelines, with remove_from_default checked.

A few questions:

  • Is this the intended way of using pipelines for configuring intake into streams? What makes me doubt myself is the fact that the Remove matches from ‘Default Stream’ checkbox on my custom streams was not sufficient to actually remove the messages from Default Stream, if they ended up in the custom stream via the Route to Stream rule of a pipeline. remove_from_default had to be additionally checked in the rule itself (and possibly the stream-level option isn’t even necessary when using it).
  • The Data Routing page, currently, has no “first class” support for intake via pipelines, though you can connect pipelines to it in the Processing step (for enrichment/extraction, I guess). Given that pipelines can do all three steps—intake, processing and output—is the Data Routing page going to undergo transformation into being entirely pipeline-based at some point?
    • If so, will my current pipelines with their manual Route to Stream actions need to be rebuilt into whatever form the new pipeline-based Data Routing view looks like?

Basically, I want to know if it makes sense to go all in on pipelines now, as a new user with a fresh install, or if legacy Stream rules will have a better migration path into whatever form the future pipeline-based system will take.

Yes, you want to go all in on pipelines.

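Yes, the two ways of routing are totally separate. The stream-level Remove matches from ‘Default Stream’ setting only applies to messages routed by stream rules, not to messages routed by a pipeline’s Route to Stream action, which has its own remove_from_default option.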
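
For reference, here is roughly what the pipeline side looks like in rule source form. This is a minimal sketch: the stream name "Firewall" and the application_name field are made up for illustration, so swap in your own.

```
rule "route firewall messages"
when
  // hypothetical match condition; use whatever identifies your messages
  has_field("application_name") &&
  to_string($message.application_name) == "firewall"
then
  // "Firewall" is an assumed stream name; the stream must already exist.
  // remove_from_default is set here, in the rule, because the stream-level
  // checkbox does not affect pipeline-routed messages.
  route_to_stream(name: "Firewall", remove_from_default: true);
end
```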

Pipelines are a little hidden in Data Routing for a couple of reasons. First, practically, the page grew out of the streams pages, so many of the things on it are just those same settings moved around. Second, pipelines are tied to streams much more loosely, which is much harder to show in the UI: a pipeline can be associated with many streams and can route to many streams, and it can even use variables to route to a stream that you couldn’t identify by looking at the rule code, because the target is driven by message field values.
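
To illustrate that last point, a rule along these lines picks the target stream from the message itself, so the UI has no way of knowing up front where messages will land. Treat it as a sketch: target_stream is a hypothetical field name, and it assumes a stream with a matching name already exists (if none does, the routing call will fail for that message).

```
rule "route by message field"
when
  // target_stream is an assumed field, e.g. set by an earlier stage
  has_field("target_stream")
then
  // the stream name comes from the message, not from the rule code
  route_to_stream(name: to_string($message.target_stream), remove_from_default: true);
end
```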

All that to say: yes, pipelines are the way, and they shouldn’t really need to be rewritten at any point, because you are using them the way they were built to be used.
