Removing field from stream but not "All Messages"

My graylog cluster is running 3.0.2 atm, upgrading soon.

I am trying to do something that seems fairly simple. I have a stream that matches all messages coming from the F5 ASM. They are assigned their own index, and all is working a treat there.

There are two very large fields that I want in the short term, but want to drop in the long term to save space.

Retention on my “All Messages” stream is short and will keep the full entry.

I have a pipeline attached to the ASM stream where I am removing the large field.
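For readers following along, a rule of that kind might look roughly like the sketch below. The field name `big_field_1` is a placeholder, since the actual field names are not given in this thread. `has_field` and `remove_field` are standard pipeline functions.

```
rule "drop large ASM field"
when
  // Only act when the (hypothetical) field is present.
  has_field("big_field_1")
then
  // Removes the field from the message currently being processed.
  remove_field("big_field_1");
end
```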

The problem is that it is removing it from BOTH the All and the ASM streams.

Since my pipeline processing runs after the Message Filter Chain, I thought pipeline processing would not touch the All stream unless I explicitly attached it to that stream.

Message Processors Configuration

The following message processors are executed in order. Disabled processors will be skipped.

#  Processor                  Status
1  Message Filter Chain       active
2  GeoIP Resolver             active
3  Pipeline Processor         active
4  AWS Instance Name Lookup   active

Thanks for the help!

Hey @ramerman

Is the routing into the two different streams with different index sets done with a stream filter? When you look at those messages and expand them in the search window, what streams are those messages in?

The stream routing is done via a static field set in the input. That is then matched with a stream rule.

One thing I wondered about but have not tried was getting rid of the stream rule and setting the stream in the pipeline.

I wondered if I did the stream assignment in stage 0 then dealt with the fields in stage 1 that might work?

I really thought this would not be a big deal. Since a pipeline is attached to just one stream, I was not expecting it to be able, by its very nature, to affect the All stream.

I just made this change but it is not having the desired effect:

I took out the rule in the ASM stream.
I added a new pipeline that is attached to the All Messages stream that routes to the ASM stream.
The pipeline attached to the ASM stream is unchanged.

My assumption was that this might more directly force the assignment of the messages to the ASM stream, leaving All Messages behind before the second pipeline was triggered.

That second pipeline is still affecting the messages in the All Messages stream.

I have another pipeline that DOES follow how I thought it should work.

It looks for messages that match a criterion and drops them.

If that pipeline is only attached to the ASM stream the message is still left in the All Messages stream.

I had to attach it also to the All Messages stream to drop the message there as well.

This makes sense to me.

So why is my pipeline that is only attached to the ASM stream affecting the message both in the ASM stream AND the All Messages stream!??

I know I’m kind of just restating the same question but it is now in the context of something that IS working as expected.

Is this just a bug? I could not find anything in the issue tracker.

Still have not found anything that explains this.

Does anyone have anything to offer?

I have a second pipeline that is attached to another stream and is changing the value of a field. It is also affecting both the attached stream and the All Messages stream. In this case, though, I did want this change on both, but this proves that this is not just an anomaly with either my pipeline or the first stream I was working with.

Any insight is GREATLY appreciated.

Thanks!

I think this stems from how you are thinking of streams, and perhaps from mixing them up with the index where the message is stored. A stream is a tag-like construct: it gives pipelines something to connect to, lets you restrict searches, and determines which index the message is stored in. “All Messages” is a stream “tag” applied to every message unless you have explicitly removed it. There is still only one message, so a change made by any pipeline shows up under every stream that message is tagged with.
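To make the "tag" idea concrete: a message routed into another stream keeps its "All Messages" tag unless the rule asks for it to be removed. A minimal sketch, assuming the stream is named `ASM` and that the static field set on the input is called `log_source` with the value `f5_asm` (both names are hypothetical, so substitute your own, and check the function reference for your Graylog version):

```
rule "route ASM and leave All Messages"
when
  has_field("log_source") && to_string($message.log_source) == "f5_asm"
then
  // remove_from_default: true takes the message out of the default
  // ("All Messages") stream as it is added to the named stream.
  route_to_stream(name: "ASM", remove_from_default: true);
end
```

After a rule like this there is still a single message, now tagged only with ASM. On its own it does not give you a second, full copy under a different retention.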

Back to what you are trying to accomplish: I would experiment with a pipeline rule that uses the functions create_message() and route_to_stream(). I have not used them, but it seems to me you could create your two messages and route them to different indexes via different streams. It would take a little stream/rule work to do the breaking out and removing fields for the one message… but that’s the fun of Graylog.

I do understand exactly what you are talking about, but given how pipelines are set up (you have to attach them to one or more streams to work on, and a stream assigns its messages to an index), this should work.

In my situation, the All Messages and ASM streams are in different indexes. For my pipeline to affect both, it has to work on both OR it has to do its work BEFORE the message has been written to an index, something that the stream controls.

If what you do in a pipeline always affects multiple streams, it does not make complete sense to attach it to a single stream.

It may be that the stream only affects which messages are evaluated, but not being very clear about that is a huge oversight if true.

I’m also wondering if this is just a bug in the particular version of Graylog I have, which is 3.0.2.

You will have to experiment with create_message(), clone_message(), remove_field() and route_to_stream().
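One way those functions might fit together, as an untested sketch rather than a confirmed fix for 3.0.2: clone the message, strip the large field from the clone, send the clone to the long-retention ASM stream, and leave the original untouched in "All Messages". It assumes the rule sits in a pipeline attached to the ASM stream. The field `big_field_1` and the marker field `asm_clone` are hypothetical names. The marker is there because a clone may itself be run through pipelines, and without a guard the rule could end up cloning its own clones.

```
rule "split ASM message into full and trimmed copies"
when
  // Skip clones made by this rule, so it cannot clone its own output.
  NOT has_field("asm_clone")
then
  let trimmed = clone_message();
  // Mark the clone, then drop the large field from the clone only.
  set_field(field: "asm_clone", value: true, message: trimmed);
  remove_field(field: "big_field_1", message: trimmed);
  // Put the clone in ASM and take it out of the default stream.
  route_to_stream(name: "ASM", message: trimmed, remove_from_default: true);
  // The original keeps every field; taking it out of ASM leaves it
  // in "All Messages".
  remove_from_stream(name: "ASM");
end
```

If this behaves as intended, the full message is stored only under the short-retention "All Messages" index and the trimmed clone only under the ASM index. Check in the search window which streams each copy ends up in before relying on it.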
