Replace string in field with pipeline rule

Hi,
I have a conundrum about how to write a pipeline rule that omits a password from a log.
Log line looks something like this:
Mar 19 12:28:36 localhost haproxy[56341]: 77.243.30.178:12273 [19/Mar/2019:12:28:36.317] api.example.com http-api/BACKEND-1 0/0/1/12/13 200 183 - - ---- 709/707/2/1/0 0/0 {||api.example.com} {189299} "GET /api/command?user=someUser&password=myPassWord&cmd=SOMETHING HTTP/1.1" - -

And I want to change it to look something like this:
Mar 19 12:28:36 localhost haproxy[56341]: 77.243.30.178:12273 [19/Mar/2019:12:28:36.317] api.example.com http-api/BACKEND-1 0/0/1/12/13 200 183 - - ---- 709/707/2/1/0 0/0 {||api.example.com} {189299} "GET /api/command?user=someUser&password=*****&cmd=SOMETHING HTTP/1.1" - -

So omitting password is the goal.
My progress so far is that I’ve created a regex that matches the password and incorporated it in a rule.

rule "omit passwords"
when
    regex("(?<=password=)[^&]+", to_string($message.message)).matches == true
then

end

I’ve connected it to the All messages stream, but not all messages contain a password, so they will not match, and with all my attempts at configuring “then” they seem to go unprocessed.

Prior to the pipeline, I’ve configured an HAProxy extractor and that works fine. So the fields that need editing are “http_request” and “message”.
Thanks in advance.

There is currently no way to replace text in a string via a regular expression (or otherwise), but you could file it as a feature request on the Graylog GitHub page.

On a side note, passing credentials via query string is… not so good. If it’s an app built in-house, try using headers instead, because with credentials in query strings the chance of them inadvertently leaking is larger than I’d personally be comfortable with.

If you are on Graylog 3.0, you can use the regex_replace function.
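
As a rough sketch of how that could look for the HAProxy line in the original question: this assumes the signature regex_replace(pattern, value, replacement), and that the extractor has already populated http_request by the time the rule runs. The rule name is made up, and the character class [^& ] is a small change from the original regex so that the match also stops at a space when password happens to be the last query parameter.

```
rule "mask password in http_request and message"
when
    // Only act on messages that actually carry a password parameter;
    // everything else skips this rule and is stored unchanged.
    has_field("http_request") &&
    contains(to_string($message.http_request), "password=")
then
    // Assumed signature: regex_replace(pattern, value, replacement).
    // The lookbehind keeps "password=" and replaces only the value after it.
    set_field("http_request",
        regex_replace("(?<=password=)[^& ]+", to_string($message.http_request), "*****"));
    set_field("message",
        regex_replace("(?<=password=)[^& ]+", to_string($message.message), "*****"));
end
```

Because the condition only matches messages with a password in http_request, messages without one should pass through the pipeline stage untouched rather than needing any special handling in “then”.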


Huh, I checked the documentation for 3.0 (functions list) and didn’t see regex_replace in there - but cool that it exists! :smiley:

Additionally, I believe that it’s not possible to edit the original “message” field? I’m guessing this would need to be done at the log source end, some kind of filebeat processor config? (assuming you’re using filebeat of course)

I believe that it’s not possible to edit the original “message” field?

You are able to edit ANY field of a message, and even remove and add as many as you like. The only thing you need to keep in mind is that Graylog needs the timestamp, source and message fields to store a message.

Thanks Jan,

I’m a Graylog newbie, but I was under the impression that “message” can’t be changed. I think I based this on the fact that “cut” isn’t an option on the message field when creating an extractor.

Apologies up front to the OP for going off-topic, but Jan could you clarify my understanding?

Let’s say I have a message

ColumnA,ColumnB,ColumnC
2019-04-10,Frank,Password123

(I believe this is similar to what the OP is mentioning.)

So I have my extractor, and after it runs Graylog has 4+ columns:

Timestamp,Message,UserName,Password

But I want to remove the password for security reasons. That’s easy: don’t extract it. But it’s still in the “message” field. So are you saying that I can edit the original message field to be

2019-04-10,Frank

?

And if that is the case, in order to save storage as I’ve already extracted all the info I need into separate columns, could I completely remove the “message” field, or completely blank it out?

If the message that is sent in to Graylog (let’s say to a raw TCP socket) is in fact 2019-04-10,Frank,Password123 then when Graylog has ingested it, internally the message is now:

{ "timestamp": "2019-04-10 08:39:00.000Z", "message": "2019-04-10,Frank,Password123", "source": "the-host-that-logged-it"}

(Plus some additional fields such as the gl2_* set that helps Graylog do smart things like being able to show all messages from an input)

Ideally what you would do is set up a Pipeline, not an extractor, something along the lines of this:

rule "parse message and discard password"
when
  true
then
  // Split "date,username,password" on commas: res[0] is the date, res[1] the username.
  let res = split(",", to_string($message.message));
  set_field("username", res[1]);
  // Rebuild the message from the first two columns only, dropping the password.
  set_field("message", concat(concat(to_string(res[0]), ","), to_string(res[1])));
end

The reason for the repeated concat is that while a regex_replace function exists (which could do it in one line), I haven’t found its usage documentation on the Graylog site.
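
For what it’s worth, a hedged sketch of that one-line version might look like the following. It assumes regex_replace takes (pattern, value, replacement) and that the password is always the last comma-separated column; the rule name is hypothetical.

```
rule "drop trailing password column"
when
  true
then
  // ",[^,]*$" matches the last comma and everything after it,
  // so replacing the match with "" removes the final column.
  set_field("message", regex_replace(",[^,]*$", to_string($message.message), ""));
end
```

With the sample input above, this should leave the message as 2019-04-10,Frank. Unlike the split-based rule, it does not set a separate username field.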

But, in essence, you can slice and dice the message, then put it back in a different form if you wish. I use this all the time on nginx logs - we send them in JSON format, a pipeline parses them and extracts the fields, then re-assembles a message field to look like a standard common log format so that some external legacy tools we use don’t get too upset :slight_smile:


Many thanks Ben :smile:
