Mapping IPs to subnets


(Bruce Givens) #1

Hello,

I have a list of about 50 subnets and corresponding names like this:

"subnet","subnet_name"
"192.168.0.0/16","office"
"10.10.10.0/24","datacenter1"
"10.10.20.0/24","datacenter2"
...

What I would like to do is look up an IP from an existing field and create an additional field that shows the name of the subnet to which it belongs. So, for example, looking up 192.168.1.1 would return ‘office’.

I had originally thought I could do this with a CSV Lookup Table, but if my understanding is correct, I would need to have each individual IP in the lookup table instead of the CIDR definition - is this correct or can a cidr_match be performed on the key of the lookup table?

I also looked at using a MaxMind DB for this as described at https://blog.maxmind.com/2015/09/29/building-your-own-mmdb-database-for-fun-and-profit/, but it appears that the lookup table data adapter only supports City- or Country-MMDBs.

Does anyone have a pointer on how this could be accomplished without having to populate a CSV with every single IP in each subnet?

Thanks!
Bruce


(Jochen) #2

http://docs.graylog.org/en/2.4/pages/pipelines/functions.html#cidr-match


(Bruce Givens) #3

Hello Jochen,

thanks for your reply!

I know that there is a cidr_match function, but I cannot figure out how to work it into the lookup_value function.

If I were to generate a CSV with each individual IP, I could then look up the subnet name something like this:

rule "lookup: source_ip subnet name"
when
  has_field("source_ip")
then
  let subnet_name = lookup_value(lookup_table: "subnet_table_name", key: $message.source_ip, default: "unknown");
  set_field("subnet_id", to_string(subnet_name));
end

but I am unsure of where cidr_match would fit into this pipeline rule when using the below CSV instead and attempting to get the subnet_name for (for example) 192.168.1.1:

"subnet","subnet_name"
"192.168.0.0/16","office"
"10.10.10.0/24","datacenter1"
"10.10.20.0/24","datacenter2"
...

What am I missing?

Thanks again,
Bruce


(Jan Doberstein) #4

How much data is in that CSV file?

Is it more like 4 entries or more like 50?


(Bruce Givens) #5

Hello Jan,

the CSV has about 50 subnets.

Bruce


#6

@bruce,
I think (but I’m not sure) the lookup does a 1:1 match, e.g. banana -> yellow fruit; carrot -> orange vegetable.
I don’t see the source_ip -> subnet (NOT subnet_name) conversion in your code.
If you use only /16 and /24 subnets, maybe you can cut off the end of the IP and do a lookup for “192.168.1”; if there is no match, you can try “192.168”, etc.
The GeoIP lookup may do the subnet conversion. If you can load your own data into a GeoIP database, maybe that can solve it.

First we need to know whether the lookup function does the IP -> subnet conversion, so that it can then do the subnet -> subnet_name match.
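
The prefix-truncation idea above can be sketched outside Graylog to check the logic. This is only an illustration in Python, assuming every subnet is octet-aligned (/8, /16 or /24); the table keys and the function name `subnet_name_by_prefix` are made up for the example and are not part of any Graylog API.

```python
# Hypothetical lookup table keyed by truncated prefixes instead of CIDR strings.
# This only works when every subnet ends on an octet boundary (/8, /16, /24).
PREFIX_TABLE = {
    "192.168": "office",        # stands in for 192.168.0.0/16
    "10.10.10": "datacenter1",  # stands in for 10.10.10.0/24
    "10.10.20": "datacenter2",  # stands in for 10.10.20.0/24
}


def subnet_name_by_prefix(ip: str, default: str = "unknown") -> str:
    """Try the 3-octet prefix first, then 2 octets, then 1."""
    octets = ip.split(".")
    for length in (3, 2, 1):
        key = ".".join(octets[:length])
        if key in PREFIX_TABLE:
            return PREFIX_TABLE[key]
    return default


print(subnet_name_by_prefix("192.168.1.1"))
```

Trying the longest prefix first means a /24 entry wins over a /16 that contains it. A subnet such as a /20 cannot be expressed this way at all.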


(Jason Keller) #7

Hi @bruce

Unfortunately that’s exactly what I’ve had to resort to doing: populating a CSV that maps every single IP to its subnet…

The lookup table functionality basically is a 1:1 mapping, i.e. the key needs to match one column, and it pulls in the other column(s) in that row. The cidr_match() function is pretty much moot at that point, as it cannot divine a subnet without knowing the masks of each subnet beforehand; the other side of this is you cannot call a lookup table without a key, and you cannot iterate over the table.

I think this is going to have to be a custom lookup table plugin, purpose built for single column lookups (all the subnets) utilizing a cidr_match() function iterated over all of the rows.
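
To make the idea concrete, here is a rough sketch of what such an adapter would have to do, written in Python rather than as an actual Graylog plugin. It reads the CSV from the first post, runs a CIDR containment test against every row, and returns the most specific match. The function names `load_subnets` and `lookup_subnet_name` are invented for the illustration.

```python
import csv
import io
import ipaddress

# The sample rows from the first post.
SUBNET_CSV = '''"subnet","subnet_name"
"192.168.0.0/16","office"
"10.10.10.0/24","datacenter1"
"10.10.20.0/24","datacenter2"
'''


def load_subnets(text):
    """Parse the CSV into a list of (network, name) pairs."""
    reader = csv.DictReader(io.StringIO(text))
    return [(ipaddress.ip_network(row["subnet"]), row["subnet_name"])
            for row in reader]


def lookup_subnet_name(ip, subnets, default="unknown"):
    """Iterate over all rows and return the name of the most specific match."""
    addr = ipaddress.ip_address(ip)
    matches = [(net, name) for net, name in subnets if addr in net]
    if not matches:
        return default
    # Prefer the longest prefix in case subnets overlap.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


subnets = load_subnets(SUBNET_CSV)
print(lookup_subnet_name("192.168.1.1", subnets))
```

With about 50 subnets a linear scan per message is cheap; a much larger table would probably want a prefix tree instead.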

At the very least, I’m happy to hear I’m not alone in this.

EDIT: Actually, after thinking about it, the threat plugin has to be doing this exact same thing to match against the CIDRs from Spamhaus DROP/EDROP. Overall it should take comparatively little modification. @jochen? @jan?


(Jan Doberstein) #8

Currently I have no “this is the solution” idea on this.

But if I were in your position, I would file a feature request in the processing pipeline repo ( https://github.com/Graylog2/graylog-plugin-pipeline-processor/issues ) asking for the option to do cidr_match lookups against lookup tables that contain network information.


(Bruce Givens) #9

Thanks to everyone for their input.

If the list of subnets were shorter, I could easily write a pipeline that would match each of the subnets and assign the appropriate value to a field, but this does not scale well.
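
One way to take some of the pain out of the per-subnet approach is to generate the rules from the CSV instead of writing them by hand. The following Python sketch does that. It assumes cidr_match takes the CIDR string and an IP value as in the documentation linked earlier, reuses the source_ip and subnet_id field names from the rule above, and the rule naming scheme is made up. The generated rules would still have to be added to a pipeline, and this has not been tested against a live Graylog.

```python
import csv
import io

# The sample rows from the first post.
SUBNET_CSV = '''"subnet","subnet_name"
"192.168.0.0/16","office"
"10.10.10.0/24","datacenter1"
"10.10.20.0/24","datacenter2"
'''

# One pipeline rule per subnet; the rule title is a hypothetical convention.
RULE_TEMPLATE = '''rule "subnet name: {name}"
when
  has_field("source_ip") &&
  cidr_match("{cidr}", to_ip($message.source_ip))
then
  set_field("subnet_id", "{name}");
end
'''


def generate_rules(text):
    """Return one rule source string per CSV row."""
    reader = csv.DictReader(io.StringIO(text))
    return [RULE_TEMPLATE.format(cidr=row["subnet"], name=row["subnet_name"])
            for row in reader]


for rule in generate_rules(SUBNET_CSV):
    print(rule)
```

This does not remove the scaling problem, since every message is still checked against every rule, but it keeps the CSV as the single source of truth.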

I have opened an issue as @jan suggested:

As mentioned in the issue, opening up the lookup table data adapters to accept custom-made MMDBs may also be an option.


(Jason Haar) #10

FYI we have an “our Internet IP addresses to sitename” lookup file. But unlike yours, it consists of a mixture of subnets plus individual IP addresses, so CIDR support doesn’t really help. I bit the bullet and simply generate a large CSV file with one IP per line. It’s not really that much extra work, and it can cover all the use cases.

Ours is 20K entries - Graylog happily gobbles it up and works fine :)
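
For anyone wanting to do the same, a generator for such a file can be quite short. This is a sketch in Python, not the script used for the file described above; the entries, the column names ip and subnet_name, and the function name `expand_to_csv` are placeholders.

```python
import csv
import io
import ipaddress

# Placeholder entries: a mix of subnets and individual addresses.
ENTRIES = [
    ("10.10.10.0/30", "datacenter1"),  # hypothetical small subnet
    ("203.0.113.7", "branch-vpn"),     # hypothetical single IP
]


def expand_to_csv(entries):
    """Write one quoted CSV row per IP address covered by each entry."""
    out = io.StringIO()
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(["ip", "subnet_name"])
    for spec, name in entries:
        # ip_network() treats a bare address as a single-address network,
        # so subnets and individual IPs go through the same loop.
        for addr in ipaddress.ip_network(spec):
            writer.writerow([str(addr), name])
    return out.getvalue()


print(expand_to_csv(ENTRIES))
```

Iterating over the network includes the network and broadcast addresses, which is probably what you want for a lookup table.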


(Jochen) #11

Just be aware that it’s loaded into heap memory completely and you will need to size the JVM heap memory of your Graylog nodes accordingly.
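
To get a feeling for the size before committing to this, the number of rows an expanded table would have can be counted up front. A small Python sketch using the three sample subnets from the first post; the function name `expanded_row_count` is made up, and the actual heap usage per row is not something this estimates.

```python
import ipaddress

# The three sample subnets from the first post.
SUBNETS = ["192.168.0.0/16", "10.10.10.0/24", "10.10.20.0/24"]


def expanded_row_count(subnets):
    """Number of rows a one-IP-per-line table would need for these subnets."""
    return sum(ipaddress.ip_network(s).num_addresses for s in subnets)


print(expanded_row_count(SUBNETS))  # 65536 + 256 + 256 = 66048
```

A single /16 already contributes 65,536 rows, so a list dominated by large subnets grows quickly.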

