Manage email JsonArray

Hi,
I need to create a pipeline rule that works with a JSON array (stored as a string).
The message has a field ‘recipients’ that contains a list of email addresses, e.g. [“gianluca@gmail.com”, “nexemail@mail.net”, etc…].

In my scenario I need to check each email domain against a CSV file that tells me whether the domain is good or not.

So I created a lookup table that uses a CSV file to detect the allowed domains (e.g. gmail.com).
But now I don’t know how to perform the lookup against the recipients list structure.

Do you have a suggestion? I know that within a single rule it is not possible to use a for loop to inspect each email address and execute the lookup function.
How can I proceed?

Thanks
Gianluca

Hi @gianluca-valentini,
Please post the result you want to achieve. Do you want to extract the domain from each entry in the recipient list, compare it with the CSV using a lookup, and then what? The list holds several domains, so one could be allowed and another not. What do you want to do in that case?

Hi @shoothub,
you hit the point. In the input message I have this field:

"recipients": [
"g.valentini@gmail.com",
"g.valentini@aruba.com",
"g.valentini@mycompany.com"
],

At the same time I have a CSV file that contains:

domain, internal
gmail.com, false
mycompany.com, true

and other domains.
So for each recipient I need to extract the domain and determine whether it is internal or not by comparing it with the CSV entries.

After that I add a new field to the message recording the result, like
addField("internal", "true/false")

In another stage, if it is external, I send the message to Kafka or generate an alert. I can also run queries in Kibana to get statistics on that field.

But currently the problem is iterating over the recipients, like a Java forEach.
Can you help me do this in the rules?
Note that I don’t know how many entries the recipients field has.

Hi @gianluca-valentini,

But I don’t understand how you want to deal with situations like:

  1. several internal domains
  2. internal domains together with some external ones
  3. and so on

So do you want to add the field internal if one of the recipients is internal? And add a field, e.g. external, if all are external, or how? It’s not very clear to me, so I can’t help right now. Please explain.

In my scenario, if one recipient is external (according to the CSV file), this means that the email is sent externally.

So by checking all recipient emails I can detect this situation.

How can I handle the JSON array in order to check each entry?
Thanks
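
To make the requirement concrete, here is a minimal Python sketch of the check described above, written outside Graylog purely as an illustration. The names INTERNAL_BY_DOMAIN, domain_of and is_sent_externally are hypothetical, the dictionary stands in for the CSV-backed lookup table, and treating unknown domains as external is an assumption.

```python
from typing import Dict, Iterable

# Hypothetical stand-in for the CSV-backed lookup table: domain -> internal flag.
INTERNAL_BY_DOMAIN: Dict[str, bool] = {"gmail.com": False, "mycompany.com": True}


def domain_of(address: str) -> str:
    """Return the lower-cased text after the last '@', or '' if there is none."""
    if "@" not in address:
        return ""
    return address.rpartition("@")[2].strip().lower()


def is_sent_externally(recipients: Iterable[str]) -> bool:
    """True if at least one recipient's domain is not marked internal.

    Domains missing from the table are treated as external (an assumption).
    """
    return any(not INTERNAL_BY_DOMAIN.get(domain_of(r), False) for r in recipients)
```

With the three example recipients from this thread the function reports an external send, because gmail.com and aruba.com are not internal.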

Try these 2 pipeline rules.
I assume that you have the email addresses extracted in the field recipients.

The first one (step 1) removes the unnecessary characters [, ] and " from the recipients field, splits the values on commas, takes the first email, extracts its domain, looks it up in the lookup table, and sets the field internal (true/false).

rule "email jsonarray 1.1"
when
    has_field("recipients") AND contains(to_string($message.recipients), "@")
then
    // Remove the unnecessary characters [ ] " from the array field
    let fix_strings = regex_replace("(\\[|\"|\\])", to_string($message.recipients), "");
    // Split email addresses
    let split_emails = split(",", fix_strings);
    // Join emails without first one (from second to last)
    let join_emails = join(split_emails, ",", 1, -1);
    // Save it to a temporary field (also used in the second pipeline rule's condition)
    set_field("recipients_tmp", join_emails);
    // Extract domain from first email address
    let extract_domain = regex("@(.*)$", to_string(split_emails[0]));
    // Lookup domain in CSV - "email_domains" = name of lookup table
    let lookup_internal = lookup_value("email_domains", extract_domain["0"]);
    set_field("internal", to_bool(lookup_internal));
end
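
To see what the string handling in this rule produces, the same steps can be traced in plain Python. This is only an illustration, not Graylog code. The value of raw is an assumption about how the array is rendered by to_string; the real rendering may differ, for example by including spaces after the commas.

```python
import re

# Assumed string rendering of the recipients array (may differ in Graylog).
raw = '["g.valentini@gmail.com","g.valentini@aruba.com","g.valentini@mycompany.com"]'

# Same pattern as in the rule: drop [, ] and " characters.
cleaned = re.sub(r'(\[|"|\])', "", raw)

# Split on commas, then keep everything except the first address.
emails = cleaned.split(",")
remaining = ",".join(emails[1:])

# Extract the domain of the first address (text after '@').
match = re.search(r"@(.*)$", emails[0])
first_domain = match.group(1) if match else None

print(cleaned)
print(remaining)
print(first_domain)
```

Here first_domain corresponds to the key passed to the lookup table, and remaining corresponds to what the rule stores in recipients_tmp.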

The second pipeline rule does much the same as the first, but uses the temporary field recipients_tmp as its input. It runs only while internal=false and the field still contains @ (so it keeps running until it finds the first internal domain). Instead of join (used to create recipients_tmp in the first rule), this rule uses regex_replace, because Graylog can’t store an empty string in a field. If it used join like the first rule, the field could not be emptied after the last address and the rule would run many more times than expected.

rule "email jsonarray 1.2"
when
    has_field("recipients_tmp") AND contains(to_string($message.recipients_tmp), "@") AND to_bool($message.internal) == false
then
    // Split email addresses
    let split_emails = split(",", to_string($message.recipients_tmp));
    // Replace first email with |
    let replace_first = regex_replace("^[^,]+[,]{0,1}", to_string($message.recipients_tmp), "|");
    // Save remaining emails to the temporary field
    set_field("recipients_tmp", to_string(replace_first));
    // Extract domain from first email address
    let extract_domain = regex("@(.*)$", to_string(split_emails[0]));
    // Lookup domain in CSV - "email_domains" = name of lookup table
    let lookup_internal = lookup_value("email_domains", extract_domain["0"]);
    set_field("internal", to_bool(lookup_internal));
end
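
The stage-by-stage behaviour can be sketched in Python to show how the temporary field shrinks and when processing stops. This is a simplified model, not Graylog code: the function simulate and its parameters are hypothetical, all stages are treated uniformly (in the real pipeline the first stage uses join rather than regex_replace), and the set internal_domains stands in for the lookup table.

```python
import re
from typing import Iterable, Set, Tuple


def simulate(recipients: Iterable[str], internal_domains: Set[str],
             extra_stages: int) -> Tuple[bool, int]:
    """Model stage 1 plus `extra_stages` repetitions of the second rule.

    Returns (internal_flag, number_of_addresses_checked).
    """
    tmp = ",".join(recipients)
    internal = False
    checked = 0
    for _ in range(1 + extra_stages):
        # Rule condition: an address is left and no internal domain found yet.
        if "@" not in tmp or internal:
            break
        first = tmp.split(",")[0]
        # Replace the first address (and its comma) with '|', so the field
        # never becomes an empty string.
        tmp = re.sub(r"^[^,]+,?", "|", tmp, count=1)
        match = re.search(r"@(.*)$", first)
        domain = match.group(1) if match else ""
        internal = domain in internal_domains
        checked += 1
    return internal, checked


print(simulate(["a@gmail.com", "b@aruba.com", "c@mycompany.com"], {"mycompany.com"}, 5))
print(simulate(["a@gmail.com", "b@aruba.com", "c@mycompany.com"], {"mycompany.com"}, 1))
```

The second call shows the limitation of a fixed number of stages: with only one extra stage, the third address is never checked.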

How to use:

  1. Create lookup table email_domains with the Default single value set to false (string). Create a CSV file whose rows include only local (internal) domains, not external ones. Any domain not listed in the CSV will be treated as external (internal = false) via the Default single value. Use quotes in the CSV; Graylog requires them.
    Format of CSV:
"domain","internal"
"mycompany.com","true"
  2. Create a new pipeline
  3. Assign pipeline rule email jsonarray 1.1 to step 1
  4. Create as many steps (2 and up) as the probable maximum number of recipients in the array (e.g. 5-10 or more)
  5. Assign pipeline rule email jsonarray 1.2 to every step from 2 up to the number of steps created in point 4
  6. Done
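
As an illustration of how the quoted CSV and the default value work together, here is a small Python sketch. It only mimics the idea of a lookup with a fallback. The names load_domain_table and lookup_with_default are hypothetical and are not Graylog functions.

```python
import csv
import io
from typing import Dict

# Same content as the CSV format shown in the steps above.
CSV_TEXT = '"domain","internal"\n"mycompany.com","true"\n'


def load_domain_table(text: str) -> Dict[str, str]:
    """Parse the quoted CSV into a domain -> internal mapping (string values)."""
    reader = csv.DictReader(io.StringIO(text))
    return {row["domain"]: row["internal"] for row in reader}


def lookup_with_default(table: Dict[str, str], key: str, default: str = "false") -> str:
    """Return the stored value, or the default for domains not in the CSV."""
    return table.get(key, default)


table = load_domain_table(CSV_TEXT)
print(lookup_with_default(table, "mycompany.com"))
print(lookup_with_default(table, "gmail.com"))
```

Only internal domains need a row; everything else falls through to the default, which is why the CSV can stay short.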

The pipeline rules will process the email addresses in the JSON array field one by one and stop after the first internal domain.

@shoothub thanks a lot!
Your code is very useful, thanks.
The problem is that I don’t know the number of emails present in the recipients field :frowning:

And that number obviously changes for each input message.

Maybe I can write Java code for a function plugin and invoke the lookup there (even if I don’t know how).

Or I can assume a maximum number of recipients (e.g. 10).
Thanks for your time and help

No problem, just create as many steps as the maximum number of recipients you expect, because processing stops after the first step that finds an internal address in the CSV. The remaining steps will be skipped.
