I want to use the CSV file option in lookup tables to bring a range of metadata into Graylog. I’m definitely going to load our asset database (to map IP addresses to workstations), but I’m also thinking of using blacklists to tag external addresses as suspicious.

The latter would be hundreds of thousands of lines long, so I just wanted to check that Graylog would be ‘happy’ with that. For example, if you’re dealing with ingestion rates of 10K messages/sec and every message contains an IP address, that would be 10K lookups per second. Obviously there are some RAM requirements (minor in the scheme of things), but I’m mainly concerned about CPU: does Graylog use hash tables or something similar to optimise lookup performance?
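
To make the hash-table assumption concrete, here is a minimal Python sketch of what I have in mind. It is not Graylog code, and I don’t know how the CSV adapter actually works internally; the column names (`ip`, `reason`), the helper names and the generated sample data are all made up for illustration. It loads a CSV into a dict once, then times 10,000 key lookups against 200,000 rows.

```python
import csv
import io
import time


def load_lookup(csv_text, key_column, value_column):
    """Read CSV rows once into a dict (hash table) keyed by key_column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key_column]: row[value_column] for row in reader}


def make_sample_csv(n):
    """Generate n hypothetical blacklist rows with columns ip,reason."""
    lines = ["ip,reason"]
    for i in range(n):
        ip = f"10.{(i >> 16) & 255}.{(i >> 8) & 255}.{i & 255}"
        lines.append(f"{ip},suspicious-{i}")
    return "\n".join(lines)


# Hypothetical blacklist of 200,000 entries, loaded once up front.
table = load_lookup(make_sample_csv(200_000), "ip", "reason")

# Simulate one second's worth of traffic: 10,000 lookups.
start = time.perf_counter()
hits = sum(
    1 for i in range(10_000)
    if f"10.0.{(i >> 8) & 255}.{i & 255}" in table
)
elapsed = time.perf_counter() - start
print(f"{hits} hits out of 10000 lookups in {elapsed:.4f}s")
```

If the adapter does something along these lines, each lookup costs roughly the same regardless of file size, and the size of the file mostly affects memory and load time rather than per-message CPU.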



I’m not sure we’ve tested the CSV adapter with very large files, but I’d be very interested in your findings (as would many other people, I think).

