Custom CSV log - how best to process

Description of your problem

I’m trying to determine the best method (pipeline, extractor, or something else?) to parse two incoming CSV log files into my Graylog 4.1 installation, and I’m not sure I’m approaching this the right way. Ideally, the header row of the CSVs would be ignored, and the tab-delimited fields would be processed in some manner for better analysis. The header fields are: TASK NAME / OPERATION STATUS / ERROR LINE NUM / ERROR DESCRIPTION INFORMATION

Description of steps you’ve taken to attempt to solve the issue

I’ve got the Sidecar installed on the server in question, and from what I can see the CSV data is coming in. What I’m trying to understand is which approach I should take: Grok patterns, pipelines, extractors, etc.

Environmental information

Server logs from a Windows 2019 server running Sidecar and a custom application
Log files are written as CSV

Operating system information

CentOS 7.9

Package versions

Standard vanilla Graylog 4.1 install from tarball

Thanks.

Personally I prefer pipelines over extractors but you could likely do either.
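For the parsing itself, a pipeline rule can split each tab-delimited row into named fields once the messages reach a stream. This is only a rough sketch, assuming the whole row lands in the message field; the rule name and field names are placeholders you would rename to suit:

rule "parse custom CSV row"
when
  has_field("message")
then
  // split the raw line on the tab character (the pattern is treated as a regex)
  let cols = split("\t", to_string($message.message));
  // map each column onto a named field so it can be searched and aggregated
  set_field("task_name", cols[0]);
  set_field("operation_status", cols[1]);
  set_field("error_line_num", cols[2]);
  set_field("error_description", cols[3]);
end

You would attach a rule like that to a stage in a pipeline and connect the pipeline to the stream your Beats input feeds. The when condition is up to you; matching on the unique_log_tag field set in the Filebeat config below would be one way to target only these files.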

Assuming you are using Beats on your Windows server, you can use the Beats configuration to ignore the header lines of the file. Below is one I am using (I changed some words around)…

# Needed for Graylog
fields_under_root: true
fields.collector_node_id: ${sidecar.nodeName}
fields.gl2_source_collector: ${sidecar.nodeId}
output.logstash:
   hosts: 
   - ${user.BeatsInput}
   ssl:
     verification_mode: none
path:
  data: C:\Program Files\Graylog\sidecar\cache\winlogbeat\data
  logs: C:\Program Files\Graylog\sidecar\logs
tags:
 - windows, vet
filebeat:
  inputs:
##### find animal things but not when line contains bark or lines that start with #
    - type: log
      enabled: true
      include_lines: ['my_brothers_dog', 'hidden_cat', 'lizard']
      exclude_lines: ['bark','^#']
      fields:
        unique_log_tag: animals
      ignore_older: 72h
      paths:
        - C:\Program Files\MS-VET\*.LOG
#
##### find quiet stuff but not if it contains shhhh or lines that start with #
    - type: log
      enabled: true
      include_lines: ['ASMR_videos']
      exclude_lines: ['shhhh','^#']
      fields:
        unique_log_tag: quiet
      ignore_older: 72h    
      paths:
        - C:\Program Files\MSQuiet\*.LOG

Hi Tmac,

Thank you. I’ll look this over to see if I can incorporate it. Yes, I am using beats on Windows as my “agent”.

In this example you provide - is this considered a pipeline, or is it an extractor? Or neither? :smile: More than one way to skin a cat I suppose…

Thx

Sorry I wasn’t clear… the example I provided is a Sidecar configuration for Filebeat on a Windows server. It lets you define what is sent to the Graylog server before it ever hits the input, extractors, or pipelines.

Thanks for the clarification!
I’ve got this entered into my sidecar config.
Just to understand:
If I want to exclude on the header row, I could enter something like
exclude_lines: ['header1', 'header2', 'header3'] ?
Would I need to explicitly state every header (column) entry?
Or would the first line that has ‘header1’ exclude the entirety of that row?

I believe it works regex style. That would mean that:

exclude_lines: ['header1', 'header2', 'header3']

would exclude any lines that contain the words header1, header2, or header3. In my case I wanted to exclude commented lines in an IIS log file, so it is any line that starts with #:

exclude_lines: ['^#']

With any luck the header lines are commented out, or they have something unique in them that you can match with a regex and exclude.
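For your specific files, assuming the header row literally begins with TASK NAME, an anchored pattern like this should drop only that row and keep every data row (adjust the text to match the actual header):

exclude_lines: ['^TASK NAME']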
