Ingesting IIS/Exchange CSV

Hi,

We’re in the process of adding our sources to Graylog, and I’m trying to configure a pipeline to extract fields from incoming Exchange and IIS logs, which are stored in CSV format and shipped via a Filebeat collector.

I have the messages entering Graylog without issue, and I’ve configured tags on the logs so we can distinguish the types, but I can’t get the pipeline to extract the fields. Here’s the logic we’ve been using:

Filebeat config:
filebeat:
  inputs:
    - type: log
      enabled: true
      paths:
        - D:\LogFiles\IMAP*.LOG
      fields: {log_type: Exch_IMAP}
    - type: log
      enabled: true
      paths:
        - D:\LogFiles\MessageTracking*.LOG
      fields: {log_type: Exch_MessageTracking}
    - type: log
      enabled: true
      paths:
        - D:\LogFiles\Connectivity*.LOG
      fields: {log_type: Exch_Connectivity}
    - type: log
      enabled: true
      paths:
        - D:\LogFiles\W3SVC**.log
      fields: {log_type: Exch_IIS}
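One thing worth knowing about that config: fields declared under `fields:` are nested under a top-level `fields` key by default, which is why they tend to show up in Graylog flattened as `fields_log_type`. If you’d rather have `log_type` arrive as a top-level field, Filebeat’s `fields_under_root` option does that — a sketch for one input (adjust the field name in your pipeline rules to match if you use it):

```yaml
filebeat:
  inputs:
    - type: log
      enabled: true
      paths:
        - D:\LogFiles\IMAP*.LOG
      fields: {log_type: Exch_IMAP}
      fields_under_root: true   # log_type becomes a top-level field instead of fields.log_type
```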

I’ve then got two pipeline rules: one to recognise the tag and the other to run the grok pattern (I tried having them in a single rule, but that also didn’t work).

Stage 0 rule:

rule "Exch_W3SVC_Logs_Tags"
when
    contains(to_string($message.fields_log_type), "Exch_IIS", true)
then
end

Stage 1 rule:

rule "Exch_W3SVC_Logs_Extract"
when
    true
then
    let mess = to_string($message.message);
    let parsed = grok(pattern: "%{EXCH_W3SVC1_LOG}", value: mess, only_named_captures: true);
    set_fields(parsed);
end
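For what it’s worth, the tag check and the grok call can be combined into a single rule by moving the condition into the `when` clause — a sketch, assuming `EXCH_W3SVC1_LOG` is the custom pattern you already have stored in Graylog:

```
rule "Exch_W3SVC_Logs_Extract"
when
    contains(to_string($message.fields_log_type), "Exch_IIS", true)
then
    let parsed = grok(pattern: "%{EXCH_W3SVC1_LOG}", value: to_string($message.message), only_named_captures: true);
    set_fields(parsed);
end
```

If neither variant fires at all, though, the pipeline-to-stream connection or the message processor order is the more likely culprit.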

Thus far the rules don’t seem to be running against incoming messages. Any thoughts would be appreciated.

  1. First check your processing order:
    https://docs.graylog.org/en/3.1/pages/pipelines/stream_connections.html#the-importance-of-message-processor-ordering
    https://docs.graylog.org/en/3.1/pages/pipelines/usage.html#configure-the-message-processor

  2. Try debugging the rule conditions using the debug() function:

let debug_message = concat("Tag ", to_string($message.fields_log_type));
debug(debug_message);

Then check the log file /var/log/graylog-server/server.log for the debug output:
https://docs.graylog.org/en/3.1/pages/pipelines/functions.html#debug
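Put together as a complete rule, that debug call might look like this — a sketch, with `when true` so it fires on every message routed through the pipeline:

```
rule "Debug_Log_Type"
when
    true
then
    let debug_message = concat("Tag ", to_string($message.fields_log_type));
    debug(debug_message);
end
```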


One thing to note: if you have grok’ed a field name with a space (or some other odd character) in it, set_fields() will fail without logging why (I put in a change request for it). This is where debug() was helpful! I have been setting up Exchange and IIS to use split() rather than GROK where possible; it isn’t as neat as GROK, but it reads well. Here is our IIS rule:

rule "winbeat-iis"
when
    has_field("filebeat_fields_tag")       &&
    ends_with(to_string($message.filebeat_fields_tag), "_iis",true)
then
    let splittraf = split(" ", to_string($message.message));
//  set_field("date",            splittraf[0]);
//  set_field("time",            splittraf[1]);
    set_field("s_ip",            splittraf[2]);
    set_field("cs_method",       splittraf[3]);
    set_field("cs_uri_stem",     splittraf[4]);
    set_field("cs_uri_query",    splittraf[5]);
    set_field("s_port",          splittraf[6]);
    set_field("cs_username",     splittraf[7]);
    set_field("c_ip",            splittraf[8]);
    set_field("cs_user_agent",   splittraf[9]);
    set_field("cs_referer",      splittraf[10]);
    set_field("sc_status",       splittraf[11]);
    set_field("sc_substatus",    splittraf[12]);
    set_field("sc_win32_status", splittraf[13]);
//  set_field("sc_bytes",        splittraf[14]);
//  set_field("cs_bytes",        splittraf[15]);
    set_field("time_taken",      splittraf[16]);
end

Hmm, do you mean that using a space as a delimiter in a grok pattern won’t work, or that having a space in the field name is what causes the issue?

I’ve tested the grok pattern using the “test with sample data” function, which seemed to work fine. Here’s the pattern, for reference:
(?%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME}) %{DATA:Server_IP} %{DATA:Method} %{DATA:URI_Stem} %{DATA:URI_Query} %{DATA:Server_Port} %{DATA:Client_Username} %{DATA:Client_IP} %{DATA:Client_UserAgent} %{DATA:Referrer} %{DATA:HTTP_Status} %{DATA:Protocol_Substatus} %{DATA:Win32_Status} %{DATA:Time_Taken} %{DATA:X-Forwarder-For}$
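Note that the leading `(?…)` group in that pattern is missing its capture name (it looks like the angle-bracketed name was swallowed by the forum’s formatting), so as posted the regex won’t compile. A variant with the name restored — `Timestamp` is just a placeholder of my choosing, not necessarily what the original said — and with the hyphens in `X-Forwarder-For` swapped for underscores, in case hyphens turn out to be among the field-name characters that trip up set_fields(), would look like:

```
(?<Timestamp>%{YEAR}-%{MONTHNUM}-%{MONTHDAY} %{TIME}) %{DATA:Server_IP} %{DATA:Method} %{DATA:URI_Stem} %{DATA:URI_Query} %{DATA:Server_Port} %{DATA:Client_Username} %{DATA:Client_IP} %{DATA:Client_UserAgent} %{DATA:Referrer} %{DATA:HTTP_Status} %{DATA:Protocol_Substatus} %{DATA:Win32_Status} %{DATA:Time_Taken} %{DATA:X_Forwarder_For}$
```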

We’ve temporarily set up input extractors, but at the moment we’ve configured multiple kinds of logs to go to the same input, so I’m wondering whether an input extractor is the better method or whether using pipelines would scale better?

There is a set of special characters that set_fields() will silently die on if they appear in the field name it’s trying to create; none of them are in your GROK. Extractors work fine, and when I asked about the preference on the forum, the answer was ambivalent. While I was troubleshooting a different issue I moved everything from extractors to pipelines; we have a small environment, so it wasn’t an issue.

So ignore me and go with shoothub’s advice: check the processing order and use debug() to see what is happening in the pipeline via the server logs:

tail -f /var/log/graylog-server/server.log


I recently set up Exchange 2016 IIS log ingestion. Here is my grok pattern; I don’t know if it’s ideal, but it works.

%{TIMESTAMP_ISO8601} %{IP:s-ip} %{NOTSPACE:cs-method} %{NOTSPACE:uri-stem} %{NOTSPACE:uri-query} %{BASE10NUM:port} %{NOTSPACE:username} %{IP:c-ip} %{NOTSPACE:user-agent} %{NOTSPACE:UNWANTED} %{BASE10NUM:status} %{NOTSPACE:UNWANTED} %{NOTSPACE:UNWANTED} %{BASE10NUM:time-taken} (-|%{IP:src_ip})