I have a field in the database called RequestLatency. It’s a string composed of a number with a trailing comma. Here’s an example:
12,
I want to strip the comma off and convert it to a long int.
I have the following so far…
rule "RequestLatency_to_long"
// Convert number with trailing comma to long
when
has_field("RequestLatency")
then
let RequestLatency = to_long($message.RequestLatency);
end
But the comma is still there.
I’ve been trying combinations of grok and set_field and regex_replace but almost everything a try is just a syntax error that doesn’t even save. I’m trying to read the documentation here… https://docs.graylog.org/v1/docs/functions
…but I can’t figure out what I’m looking at. If someone knows a site I could lookup to tell me how to interpret the documentation that would be great too!
Is this an extractor that has created the field RequestLatency? If so, just an idea, instead of patching up the field RequestLatency using a pipeline how about fixing the extractor so it doesn’t have a trailing “,”?
If we could see how this field RequestLatency is created and perhaps there is a fix to that. If not, is this something your looking for?
PUT _ingest/pipeline/my-pipeline-id
{
"description": "converts the content of the id field to an integer",
"processors" : [
{
"convert" : {
"field" : "id",
"type": "integer"
}
}
]
}
Maybe my Pipeline mentor @tmacgbay can have a look
Thank you for the quick response.
I don’t know why the comma is getting in there. I’m using the Key-value pair converter on the input and it’s not doing a good job.
I tried a bunch of stuff…
rule "RequestLatency_to_long"
// Convert number with trailing comma to long
when
has_field("RequestLatency")
then
let var_x = regex_replace("^(\\d*)", to_string($message.RequestLatency), "$1");
set_field("RequestLatency", var_x);
end
Still has the trailing comma in there! I have no idea if regex_replace is the right function to be using here.
And a message that would pertain to that input. Make sure you block out personal information.
Maybe we can come up with a fix for you. I think that were the problem resides.
I think the below should do it - no need to complicate with regex or grok.
rule "RequestLatency_to_long"
// Convert number with trailing comma to long
when
has_field("RequestLatency")
then
let var_x = replace( to_string($message.RequestLatency), ",","");
set_field("RequestLatency", to_long(var_x));
end
It would be more efficient to fix it in the key_value extractor you are using - @gsmith is the guru there… If you post your anonymized extractor and example message there may be a fix.
This code you provided worked great. Thank you! I’ll put this in my cookbook of reusable component’s.
But @gsmith is right. It would be nice if the parser on the input didn’t put trailing commas in the field. In fact, it’s doing it to all my key=value pairs, but RequestLatency was the one that was harassing me today.
I’ll start a different thread on the trailing comma issue.
Thanks!
@tmacgbay Actually, I think this is still not working. I’m having problems with running max() and avg() on this field. I think it’s still a string. When I do a search and look at a record and hover over the field it says RequestLatency = string.
I should have mentioned that. It will still be a string until you rotate the index it is in. Elasticsearch maintains the type until the index is incremented. All previous indexes will retain it as a string. There is a way to fix that that I wrote up but it is a rather involved process - you can find it here if you are interested.
Also - mark the answer for this topic so others can find it easily.
Oh, man. I looked at your post for the fix. I suspect I have another 60 hours of uninterrupted deep dive on how this all works before I can figure out what you’re describing. Maybe when I retire in 2052 I can devote some time into how ElasticSearch works!
My indexes rotate ever 24 hours. If I wait 24 hours and if I don’t run avg() across data more than 24 hours old, will I get it to work? And by Monday I’ll be able to run graphs 48 hours old, etc.