Pipelines with regex and special character

Hi All,

I am having an issue with a pipeline rule and regex. My regex contains the special character [, which I escape as \\[ (the complete regex is below). When I run the rule in the pipeline simulator, I get extra \ characters in the results. Any idea how to avoid this?


rule "file categorize"
when
  // the condition was not shown in the original post; "true" is assumed here
  true
then
  set_field("pipeline", "message");
  let message_field = to_string($message.message);
  //set_field("xy", key_value(message_field));
  // inside a rule string, \\[ becomes the regex \[ (a literal opening bracket)
  set_field("Pal_Inputs", regex("inputs=(\\[(.*?)\\])", to_string(message_field), ["rayees"]));
  //set_fields(fields: key_value(value: "pipeline"));
  set_field("pipeline2", to_string($message.transaction_date));
end


2016-09-29 00:55:24,261 level=INFO tag="run_pal_workflow.py" msg="Run complete for appname=locationJoiner, job_date=20160912, status=Passed starttime=Thu Sep 29 00:10:31 2016, endtime=Thu Sep 29 00:55:19 2016, duration=0:44:47, inputs=[{"path": "/processed/pal/parse//staticDataJoin/latest", "tag": "static", "stats": {"size": "6.59GB"}}, {"path": "/processed/pal/parse/location/date=20160909", "tag": "location", "stats": {"size": "216.78GB"}}], outputs=[{"path": "/processed/pal/parse//test/locationStaticDataJoin", "tag": "locationstaticjoiner.output.path", "stats": {"diffSize": "45.96GB", "newFiles": ["/processed/pal/parse/test/locationStaticDataJoin/date=20160909"], "endSize": "218.80GB", "startSize": "172.84GB"}}]"


{"1":"{\"path\": \"/processed/pal/parse//staticDataJoin/latest\", \"tag\": \"static\", \"stats\": {\"size\": \"6.59GB\"}}, {\"path\": \"/processed/pal/parse/location/date=20160909\", \"tag\": \"location\", \"stats\": {\"size\": \"216.78GB\"}}","rayees":"[{\"path\": \"/processed/pal/parse//staticDataJoin/latest\", \"tag\": \"static\", \"stats\": {\"size\": \"6.59GB\"}}, {\"path\": \"/processed/pal/parse/location/date=20160909\", \"tag\": \"location\", \"stats\": {\"size\": \"216.78GB\"}}]"}
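
To see where the backslashes might come from, the same regex can be tried outside Graylog. The Python sketch below is not Graylog code: the message is a shortened, made-up stand-in for the log line above, and it assumes (this is not confirmed in the thread) that the simulator displays the result map as JSON. Under that assumption the captured text contains no backslashes; they only show up when the map is serialized.

```python
import json
import re

# Shortened stand-in for the log line above (assumption: same shape, fewer fields).
message = 'status=Passed, inputs=[{"path": "/a", "tag": "static"}], outputs=[{"path": "/b"}]'

# Same pattern as in the rule; a Python raw string needs only one level of escaping.
match = re.search(r'inputs=(\[(.*?)\])', message)

# Mimic the shape of the simulator result: outer group as "rayees", inner group as "1".
result = {"1": match.group(2), "rayees": match.group(1)}

# The captured text itself holds no backslashes ...
assert "\\" not in result["rayees"]

# ... they appear only once the map is serialized as JSON, because
# quotes inside a JSON string have to be written as \".
encoded = json.dumps(result)
print(encoded)
```

If this reading is right, the regex and its \\[ escaping work as intended, and the extra \ belongs to how the value is displayed rather than to the match itself.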


Do you mean the extra \ (backslash) in front of the " (quote)?

I guess this is because Graylog pipelines use “safe” strings that escape quotes with a backslash, to avoid parsing errors in the given string.

Afaik there is no way to properly remove them. Maybe a replaceAll() function would be nice for pipelines…
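
If the field is read by a client outside Graylog, and if the backslashes are indeed JSON escaping (an assumption, not something confirmed here), decoding the JSON should remove them without any replace function. A minimal Python sketch, using a shortened excerpt in the shape of the simulator output above:

```python
import json

# Shortened excerpt in the shape of the simulator output (assumption: it is valid JSON).
simulator_output = r'{"rayees":"[{\"path\": \"/processed/pal/parse/location/date=20160909\", \"tag\": \"location\"}]"}'

# First decode: parses the outer object and removes the \" escapes.
fields = json.loads(simulator_output)

# Second decode: the captured text happens to be a JSON list itself.
inputs = json.loads(fields["rayees"])

print(fields["rayees"])
print(inputs[0]["tag"])
```

This does not change what is stored in Graylog; it only shows that the escaped form can be turned back into the plain text on the consuming side.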

Greetings - Phil

Yes, I mean the \ (backslash) in front of the ". I can't find any function to replace \" with " in pipelines.

Is it a bug in Graylog? I was in the process of converting extractors to pipelines, but I am blocked now.

I also tried it as shown above, but still have the same issue.

Well, there is no replace() or replaceAll() function in pipelines (yet). Maybe one will be added sooner or later, or you could write a plugin for that.

You should file a feature request in the pipeline repo that describes what the function should be able to do.

Or you could write this functionality yourself.

For reference: