Graylog ingesting Crowdstrike FDR Logs (refined repost)

I spent days searching for a solution to the above. Graylog’s AWS plugin doesn’t work in this case unless you have your own bucket that FDR is dumping into, and Filebeat can’t read the input (likely because the data is stored gzip-compressed). So for those who want an actual solution that doesn’t involve “Just spend thousands per month on Splunk!”, here it is:

  1. Use Logstash with the s3 plugin. Example conf.d/fdr.conf:
input {
  s3 {
    access_key_id => "AKblahblahblahblah"
    secret_access_key => "ThisIsNotTheSecretAccessKeyYouAreLookingFor"
    bucket => "CrowdstrikeWillSellYouThis"
    region => "us-some-region"
    additional_settings => {
      force_path_style => false
      follow_redirects => false
    }
  }
}

filter {
  json {
    source => "message"
  }
}

output {
  gelf {
    host => "GraylogIPorHostname"
    port => PortNumber
    sender => "FDR"
  }
}

Also: default FDR settings (no filters) will generate at least 5 GB/day on their own, flooding Graylog with new data every 5 minutes.

In my other post, I indicated GELF output wasn’t required, but after trying TCP, UDP, and Syslog outputs, apparently it is. Also, due to the way gelf interacts with Graylog, a separate “full_message” field is generated that replicates “message” and roughly doubles the size of each document (these messages are LONG). I have found no way of suppressing or deleting the full_message field, neither in Logstash via a filter nor in Graylog via pipelines, but I was at least able to use a regex replacement Extractor to replace a full match (.+) with a single word (blah) to trim the size. Not ideal, given that the system has to evaluate that fairly expensive pattern on every message, but it should at least reduce the storage used.


In the pipeline, couldn’t you do this:

set_field("full_message", "blah");

No need to regex…

(untested)
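For reference, wrapped in a complete pipeline rule (still untested; the rule name is arbitrary), that would be something like:

rule "shrink FDR full_message"
when
  has_field("full_message")
then
  set_field("full_message", "blah");
end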

That’s a good idea. I was in the process of smashing my head against a wall when I found success with the following config:

input {
  s3 {
    access_key_id => "AKblahblahblahblah"
    secret_access_key => "ThisIsNotTheSecretAccessKeyYouAreLookingFor"
    bucket => "CrowdstrikeWillSellYouThis"
    region => "us-some-region"
    additional_settings => {
      force_path_style => false
      follow_redirects => false
    }
  }
}

filter {
  json {
    source => "message"
  }
  # keep FeatureVector below Elasticsearch's indexed-term size limit (see change 3 below)
  truncate {
    fields => "FeatureVector"
    length_bytes => 50
  }
  # the gelf output drops any field named exactly "id", so rename it (see change 2 below)
  mutate {
    rename => {"id" => "ID"}
  }
}

output {
  gelf {
    host => "GraylogIPorHostname"
    port => PortNumber
    sender => "FDR"
    # static value replaces the duplicated copy of "message" (see change 1 below)
    full_message => "full"
  }
}

Changes:

  1. output: setting “full_message” to any static string essentially wipes out the duplicated value before the message hits Graylog.
  2. filters: While the json filter alone originally worked, I discovered an “undocumented feature” of the gelf output: a JSON field named exactly “id” will not be passed through, regardless of its value. The value of “id” carries meaning about the source, so I needed it processed. The mutate filter resolves that by capitalizing the field name, and Graylog then picked up the renamed field along with all of the other fields.
  3. When tailing Graylog’s log, I noticed some log vomit:

message [ElasticsearchException[Elasticsearch exception [type=illegal_argument_exception, reason=Document contains at least one immense term in field="FeatureVector" (whose UTF8 encoding is longer than the max length 32766), all of which were skipped.

Which eventually led me to a simple Truncate filter. This field’s data is extraneous, so a basic length trim resolved that issue.
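Since the field’s data is extraneous anyway, another option (untested on my end) would be to drop the field outright in the filter block instead of truncating it:

filter {
  # alternative to the truncate filter above: remove FeatureVector entirely
  mutate {
    remove_field => ["FeatureVector"]
  }
}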


Thanks for posting your solution!!