Writing an extractor for CSV type of log - Microsoft IAS / NPS

(bubba198) #1

Hi everyone,

I’m writing an extractor for a CSV type of log; this one happens to be IAS format straight out of Microsoft NPS. It lives on the Raw Text TCP input. The extractor works just as I want, with one exception: I had to use the “copy” type of extractor to get the conversion into my fields, and that creates a duplicate of the entire log line. One copy is in the “message” field and the second is in the target field I had to specify to create the extractor in the first place, which is called nps_message_log.

Ideally I don’t want either. I would like to eliminate both the original “message” field and my “nps_message_log” field, since my converter already places all the log data into the appropriate fields and I am happy with how that works. This is just housekeeping to avoid storing the raw line twice. Can both fields be deleted during extraction, so that I end up with only the nicely formatted fields that the extractor populates?

Any guidance will be appreciated. Here’s my extractor:

  {
    "extractors": [
      {
        "title": "NPS_IAS_logfile",
        "extractor_type": "copy_input",
        "converters": [
          {
            "type": "csv",
            "config": {
              "column_header": "ComputerName,ServiceName,Record-Date,Record-Time,Packet-Type,User-Name,Fully-Qualified-Distinguished-Name,Called-Station-ID,Calling-Station-ID,Callback-Number,Framed-IP-Address,NAS-Identifier,NAS-IP-Address,NAS-Port,Client-Vendor,Client-IP-Address,Client-Friendly-Name,Event-Timestamp,Port-Limit,NAS-Port-Type,Connect-Info,Framed-Protocol,Service-Type,Authentication-Type,Policy-Name,Reason-Code,Class,Session-Timeout,Idle-Timeout,Termination-Action,EAP-Friendly-Name,Acct-Status-Type,Acct-Delay-Time,Acct-Input-Octets,Acct-Output-Octets,Acct-Session-Id,Acct-Authentic,Acct-Session-Time,Acct-Input-Packets,Acct-Output-Packets,Acct-Terminate-Cause,Acct-Multi-Ssn-ID,Acct-Link-Count,Acct-Interim-Interval,Tunnel-Type,Tunnel-Medium-Type,Tunnel-Client-Endpt,Tunnel-Server-Endpt,Acct-Tunnel-Conn,Tunnel-Pvt-Group-ID,Tunnel-Assignment-ID,Tunnel-Preference,MS-Acct-Auth-Type,MS-Acct-EAP-Type,MS-RAS-Version,MS-RAS-Vendor,MS-CHAP-Error,MS-CHAP-Domain,MS-MPPE-Encryption-Types,MS-MPPE-Encryption-Policy,Proxy-Policy-Name,Provider-Type,Provider-Name,Remote-Server-Address,MS-RAS-Client-Name,MS-RAS-Client-Version"
            }
          }
        ],
        "order": 0,
        "cursor_strategy": "cut",
        "source_field": "message",
        "target_field": "nps_message_log",
        "extractor_config": {},
        "condition_type": "none",
        "condition_value": ""
      }
    ],
    "version": "2.1.0-SNAPSHOT"
  }
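
For readers unfamiliar with what a CSV converter of this kind does, the idea can be sketched in a few lines of Python. This is only an illustration of the column-name-to-value mapping, not Graylog's implementation; the header is shortened to the first six columns, and the sample log line and the function name `csv_line_to_fields` are made up for the example.

```python
import csv

# Shortened header for illustration; the real extractor lists every IAS column.
COLUMN_HEADER = "ComputerName,ServiceName,Record-Date,Record-Time,Packet-Type,User-Name"


def csv_line_to_fields(line, column_header=COLUMN_HEADER):
    """Pair each value of one CSV log line with its column name."""
    names = next(csv.reader([column_header]))
    values = next(csv.reader([line]))
    # zip() stops at the shorter list, so surplus names or values are dropped.
    return dict(zip(names, values))


# Hypothetical log line, not taken from a real NPS server.
sample = '"NPS01","IAS",01/02/2017,10:15:00,1,"EXAMPLE\\alice"'
fields = csv_line_to_fields(sample)
print(fields["User-Name"])
```

Each key in the resulting dictionary corresponds to one message field that the extractor would populate.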

(Philipp Ruland) #2

Hey @bubba198,

Extractors are not able to delete standard fields like message. You’ll need to get into Pipelines to achieve this.

Look here for further info:

Greetings - Phil

(bubba198) #3

Hi @derPhlipsi - thanks for the quick reply. I looked at pipelines, but they are beyond what I can do. As an alternative, is there a way to at least avoid duplicating the message field with my extractor?

At the moment it forces me to copy the message field into another field. Can I avoid that, so that the message field is still present at the end but there is no second field that is a duplicate of it? Thank you!


Have you tried to copy the message field to the message field itself? Did it work?

(bubba198) #5

Thanks @jtkarvo, I did try that and it worked: now I don’t have to duplicate the message field, but I still have a message field that holds the raw CSV data. This is an improvement; basically I’m saving 50% of the space by not having to duplicate the built-in “message” field into another field! I’m getting closer!

Also, for anyone reading this: I dump my CSV into Graylog using netcat:
C:\Soft\LogFiles>type test-import.log | nc 5555
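
Where netcat is not available, roughly the same thing can be done with a short Python script. This is a minimal sketch that assumes the TCP input splits messages on newlines; the function names and the host `graylog.example.org` are placeholders, and only the port 5555 comes from the command above.

```python
import socket


def frame_lines(lines):
    """Join log lines into newline-terminated bytes, skipping blank lines."""
    cleaned = [line.rstrip("\r\n") for line in lines]
    return "".join(line + "\n" for line in cleaned if line).encode("utf-8")


def send_file(path, host, port):
    """Send a whole log file to a TCP input, one message per line."""
    with open(path, encoding="utf-8") as handle:
        payload = frame_lines(handle)
    with socket.create_connection((host, port)) as conn:
        conn.sendall(payload)


# Example call (hypothetical host, port taken from the nc command):
# send_file("test-import.log", "graylog.example.org", 5555)
```

Stripping `\r\n` before re-adding `\n` keeps Windows line endings from leaking a trailing carriage return into the last CSV column.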

(Philipp Ruland) #6

Hey @bubba198,

this is the closest you will get with Extractors. They do not have any write access on the default fields, so you won’t be able to clear or delete it.

Greetings - Phil

(bubba198) #7

Thanks @derPhlipsi – I figured this is the end of the line with extractors and it isn’t that bad at all actually. Everything works as expected and only one copy of the “message” is kept in Graylog.

However, I must ask: is there a better way to import CSV-structured data into Graylog? Obviously that is my end goal, and I would like to know whether there’s a better vehicle that would deliver the same outcome.

Thank you

(Jochen) #8

You can use Logstash to preprocess the CSV data before sending it to Graylog:
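
The same preprocessing idea can also be sketched without Logstash. The Python below is only an illustration and is not the Logstash configuration referred to above: it assumes the data is sent to a GELF TCP input that uses the null-byte frame delimiter, and the function names, the `short_message` text, and the choice to skip empty columns are all assumptions made for the example.

```python
import csv
import json


def csv_line_to_gelf(line, column_names, source_host):
    """Turn one CSV log line into a GELF 1.1 payload (as a dict)."""
    values = next(csv.reader([line]))
    gelf = {
        "version": "1.1",
        "host": source_host,
        # GELF requires short_message; a brief summary stands in for the raw line.
        "short_message": "NPS log record",
    }
    for name, value in zip(column_names, values):
        if value != "":  # skip empty columns
            gelf["_" + name] = value  # additional fields carry a leading underscore
    return gelf


def encode_gelf_tcp(gelf):
    """Serialize one message, terminated by a null byte for a GELF TCP input."""
    return json.dumps(gelf).encode("utf-8") + b"\x00"
```

Because the fields are split before the message reaches Graylog, the stored message only carries the short summary rather than the full CSV line.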