GROK, GeoIP and whole lotta hurt

Hi Everyone

Let me start off by saying thank you to all the contributors who make this such a great product!

I am very new to Graylog and have gotten a few basic things going, but I have been struggling with this use case for three days straight without any success.

What I am trying to do
Easy (for other people) - take the IP from an incoming HAProxy log, get the co-ordinates associated with said IP address, and plot these co-ordinates on a map in Grafana.

What I have done

  • I currently have 2 streams set up: one for all Apache logs and one for HAProxy logs. Messages are routed to a stream based on the source that sent the log.
  • I have installed the GeoLite2 database in /etc/graylog/server/GeoLite2-City.mmdb
  • I have set up a lookup table, a cache, and a data adapter. None of them reported any errors, so I assume they are configured correctly.
  • I have created a pipeline called ‘HAProxy Geolocation Data’. This pipeline has two stages: Stage0 contains a rule called ‘HAProxy get client IP’ and Stage1 contains a rule called ‘GeoIP lookup: *clientip’.

Example log line input

haproxy[806]: REDACTED_IP:51931 [18/Aug/2020:04:20:23.163] FRONTEND-NEXTCLOUD~ BACKEND-NEXTCLOUD/REDACTED_FQDN 0/0/1/507/508 207 1178 - - --NI 1/1/0/1/0 0/0 "PROPFIND /remote.php/dav/files/my/files HTTP/1.1"

Content of ‘HAProxy get client IP’:

rule "HAProxy get client IP"
when
    has_field("message")
then
    let message_field = to_string($message.message);
    
    let placeholder = grok(pattern: ": %{IPV4}:", value: message_field, only_named_captures: true);
    set_field("src_ip",to_ip(placeholder.IP));
    
end

I know the match is not refined enough. I will refine it to a “finer” match once I know the data is actually being retrieved and written.

Content of ‘GeoIP Lookup: *clientip’:

rule "GeoIP lookup: *clientip"
when

    has_field("src_ip")

then

    let geo = lookup("geo-lookup", to_string($message.src_ip));
    
    set_field("src_ip_geolocation", geo["coordinates"]);
    set_field("src_ip_geo_country", geo["country"].iso_code);
    set_field("src_ip_geo_country_name", geo["country"].names.en);
    set_field("src_ip_geo_city", geo["city"].names.en);

end

After these two rules are executed (I can see there are msg/s triggered on each) there are no messages displayed in my HAProxy stream page.

I know there is a plugin I can enable to get these details, but with it Grafana does not seem to interpret the co-ordinate values correctly (they are written as ‘lat,long’ and are of type string). I might be completely wrong about this, but it is the only reason I could think of for Grafana not plotting the co-ordinates.

Any help would be greatly appreciated.

PS: Sorry for the long post, but I wanted to give as much info as possible.

In the Graylog pipeline function documentation I have found this:

### grok
`grok(pattern: string, value: string, [only_named_captures: boolean])`

Applies the grok pattern `grok` to `value`. Returns a match object, containing a Map of field names and values. You can set `only_named_captures` to `true` to only return matches using named captures.

You can try changing the grok pattern in the placeholder line to something like this:
": %{IPV4:IP}:"

Otherwise ‘placeholder.IP’ should be ‘placeholder.IPV4’, since you do not name the capture at the moment.

I would also replace

when
    has_field("message")

with

when
   true
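
Putting both suggestions together, the first rule might look roughly like the sketch below. It is untested and only reuses the functions and names already shown in this thread (grok, to_ip, set_field, and the field name src_ip). The capture name IP is simply the one suggested above; any name would work as long as the pattern and the field access agree.

```
rule "HAProxy get client IP"
when
    true
then
    let message_field = to_string($message.message);

    // Name the capture "IP": with only_named_captures set to true,
    // the documentation quoted above says only named captures are returned.
    let placeholder = grok(pattern: ": %{IPV4:IP}:", value: message_field, only_named_captures: true);

    // The key read here must match the capture name used in the pattern.
    set_field("src_ip", to_ip(placeholder.IP));
end
```

If src_ip then shows up on incoming HAProxy messages, the second rule's has_field("src_ip") condition has something to match on, and the pattern can be refined afterwards.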

Thank you @LuPo for this answer! Let me try that and get back to you!
