Does Graylog Sidecar ingest data from the Windows Server event log in an incremental fashion?

1. Describe your incident:
Graylog is ingesting more than 10GB of data every day from the event logs of our 10-12 Windows servers. We would like to understand why the ingestion volume is so high.

2. Describe your environment:

  • OS Information: Ubuntu 18.04.5 LTS

  • Package Version: 4.0.15+a7bed0d, codename Noir

  • Service logs, configurations, and environment variables:

3. What steps have you already taken to try and solve the problem?
Changed the ingestion interval in sidecar.yml from the initial 10 seconds to 24 hours, but the daily data usage shown on the Graylog system overview is still around 10GB.

4. How can the community help?

Can someone please explain how the Graylog system overview calculates the volume of data ingested from my Windows event logs using sidecars? I want to visualise how much event log data each server pushes through, and whether the update interval in sidecar.yml has an impact on this volume.

#Graylog #sidecar #communityedition

Hello && welcome @susmitk

This depends on how many fields are created after messages are ingested (i.e., input type, log type). The more fields generated, the more volume needed.

If the amount of logs you're receiving is the main issue, here are some suggestions:

Don't collect any logs unless they are required. Depending on what type of log shipper you're using (Winlogbeat, Metricbeat, Filebeat, NXLog, etc.), you can minimize the amount of data being sent.
For example:
In one environment I have over 150 Windows OS VMs sending logs to Graylog, using a Syslog UDP input with NXLog shippers on each. This generates about 5-6 GB a day.
In another environment I have 50 Windows nodes sending 32 GB a day. If you allow your DMZ to log verbosely or set GPOs to record every event log possible, you will get a lot of logs, and with the number of fields this can add up, as you are noticing.

Once logs hit Elasticsearch, it creates those fields and indexes that data; this is where the volume adds up.

Not knowing how your environment is set up, I would suggest minimizing the amount of data from all your nodes. There are multiple ways to reduce the white noise coming from Windows devices. If that's not an option, I would suggest dropping messages that are not needed before they hit Elasticsearch.
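As one hedged illustration of reducing noise at the source: assuming the shipper is Winlogbeat 7.x (where the event ID field is named `winlog.event_id`; in 6.x it is `event_id`), a `drop_event` processor can discard events before they leave the server. The event IDs below are only placeholders for whatever counts as noise in your environment.

```yaml
# Sketch only: goes in the Winlogbeat configuration pushed by the sidecar.
# Event IDs 5156 and 5158 are illustrative assumptions, not a recommendation.
processors:
  - drop_event:
      when:
        or:
          - equals:
              winlog.event_id: 5156
          - equals:
              winlog.event_id: 5158
```

Events dropped this way never reach Graylog, so they do not count toward the daily volume.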

You can create a widget, either on your Dashboard or in a saved search, then filter on each "Source" and add a message count.

Or you can use Metrics, which show the total amount of data (inputs, streams, etc.). These are under System/Nodes; click on the metric needed. If Prometheus is enabled, this might be another option to trend data.

Last suggestion: navigate to the built-in Dashboard called Sources (version 4.x).
Notice that each node shown there has a message count per minute, hour, or day, whichever you prefer.

Hope that helps


Thanks for this, I will definitely give this a try today to analyze the message count coming through from each server.

I am using winlogbeat sidecars to collect the data, utilising the default template. We have also set up a GPO to limit the size of the Windows event logs (Application: 150 MB, Security: 300 MB, System: 150 MB), which means technically a server can hold a maximum of 600 MB of logs at any point in time.

# Needed for Graylog
fields_under_root: true
fields.collector_node_id: ${sidecar.nodeName}
fields.gl2_source_collector: ${sidecar.nodeId}

output.logstash:
  hosts: ["xx.xx.xx.xx:5044"]
path:
  data: C:\Program Files\Graylog\sidecar\cache\winlogbeat\data
  logs: C:\Program Files\Graylog\sidecar\logs
tags:
  - windows
winlogbeat:

Lastly, could you please give me some directions on finding out which fields are created after message ingestion, and how to drop messages before they hit Elasticsearch? Any article or blog would be very helpful.

Thanks,
Susmit

Hello

It's sometimes not the size of the logs, it's the amount collected.
I can have 1000 logs at 1 MB or 1 log at 1000MB :thinking:

The example below will collect all events from Application, System, and Security.

output.logstash:
  hosts: ["8.8.8.8:5044"]
path:
  data: C:\Program Files\Graylog\sidecar\cache\winlogbeat\data
  logs: C:\Program Files\Graylog\sidecar\logs
tags:
  - windows
winlogbeat:
  event_logs:                  # <-- this section here
    - name: Application
    - name: System
    - name: Security

To limit what you collect, perhaps try something like this. Please adjust it to your needs; this is only an example.

winlogbeat.event_logs:
    - name: Application
      ignore_older: 72h    
    - name: System
      event_id: 400, 403, 600, 800 
    - name: Microsoft-Windows-PowerShell/Operational
      event_id: 4103, 4104, 4105, 4106

You can look here on HowTo.
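A further sketch along the same lines, using other `event_logs` options that Winlogbeat documents: `level` restricts a log to the listed severities, and a negative number in `event_id` excludes that ID. The specific levels and IDs here are assumptions chosen for illustration, so adjust them to your needs.

```yaml
winlogbeat.event_logs:
  - name: Application
    ignore_older: 72h                # skip events older than 72 hours
    level: critical, error, warning  # leave out information and verbose events
  - name: Security
    # include two IDs and a range, but exclude 4735 from that range
    event_id: 4624, 4625, 4700-4800, -4735
```

Filtering in `event_logs` happens when Winlogbeat reads the log, so excluded events are never shipped.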

To see which fields were created in an index, perhaps something like this, for example:

curl -XGET 'http://graylog_server:9200/graylog_101/_mapping?pretty'

Perhaps this link will help.
