Using Graylog as a Data Lake and forwarding logs to SIEM based on the use cases implemented

Hello,

I really need your help and advice on a situation I’m dealing with.
Here is the description:

We have a SIEM (QRadar) setup with Event Collectors that receive logs from various data sources belonging to many customers. Those logs are then correlated by the configured Use Cases/SIEM rules. However, many of the logs sent to the SIEM are not needed for any Use Case/SIEM rule. They are only stored for searching, reporting, etc.
Our goal is to reduce the number of events received by the SIEM (the Events Per Second rate).
We still want to receive all logs (both the ones needed for Use Cases and the ones that are not), but we want to filter them before they reach the SIEM, so that only the logs needed for Use Cases are sent to the SIEM and the rest stay in Graylog.
We were thinking of two ways:

1. Having the following structure - DATA SOURCES → LOG AGGREGATOR → DATA LAKE Graylog (the data lake receives all the logs and forwards the ones needed for Use Cases to the SIEM) → SIEM
2. Having the following structure - DATA SOURCES → LOG AGGREGATOR (the log aggregator receives all the logs, forwards the ones we need for Use Cases to the SIEM and the ones we don't need to Graylog) → SIEM

Is the approach I’m describing possible to implement?
Of these two approaches, which one do you think would be best?

Other questions I have are:

1. Does Graylog permit segregation of customers, and segregation of data sources?
2. We have logs that arrive over syslog in LEEF format. Here is an example: "LEEF:2.0|Check Point|VPN-1 & FireWall-1|1.0|Accept|...". Unlike a normal syslog header, the LEEF payload does not carry the source host IP or hostname, so the log will arrive at the Data Lake with the original source IP of the data source, but will leave the Data Lake with the Data Lake's IP as its source IP. The SIEM looks for the host in the payload first; if it doesn't find it, it uses the source IP the log came from.
3. Does Graylog permit forwarding logs with the source IP still being that of the original data source, without modifying the original log (for legal reasons)?
4. Does Graylog guarantee the integrity and authenticity of the logs for legal purposes?
5. Does Graylog permit applying log retention policies per customer?
6. Is Graylog capable of segregating data sources like LEEF?

Could you help me with this situation?

Thank you so much.

Hello @bit1290

Let me see if I understood this correctly, and perhaps simplify it a bit.

All logs/messages are received by Graylog, and you need to forward only specific logs to your SIEM solution. If so, yes, you can, but it may be limited depending on which Output you want to use.
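To make the routing idea concrete, here is a minimal Python sketch of the decision only: every event is kept in Graylog, and only events from sources that back a Use Case are also sent to the SIEM. This is not Graylog configuration; in Graylog itself the selection would be done with stream rules and an output. The `Event` type, the `USE_CASE_SOURCES` allow-list and the customer/source names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    customer: str      # hypothetical customer identifier
    source_type: str   # hypothetical data source label, e.g. "firewall"
    payload: str       # raw log line, never modified here


# Hypothetical allow-list: (customer, source_type) pairs that feed SIEM use cases.
USE_CASE_SOURCES = {("customer-a", "firewall"), ("customer-b", "vpn")}


def destinations(event: Event, use_case_sources=USE_CASE_SOURCES) -> list:
    """Return where an event should go: always Graylog, sometimes also the SIEM."""
    targets = ["graylog"]  # the data lake stores everything
    if (event.customer, event.source_type) in use_case_sources:
        targets.append("siem")  # only use-case logs count against the SIEM EPS
    return targets
```

For example, `destinations(Event("customer-a", "dns", "..."))` keeps the event in Graylog only, because that pair is not in the allow-list.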

Here are a couple examples

I'll try to answer your other questions.

It depends on the type of installation you have, but either way, using LDAP with permissions you may be able to achieve this.

You may want to read this documentation on the file format

Your best bet would be to create a stream that captures the source IP and then forward those messages to the SIEM. If the logs need to be modified, there are multiple solutions, such as Pipelines, that can achieve this.
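As a rough illustration of the source-IP problem, the Python sketch below reads the LEEF header from the example in the question and then prepends an RFC 3164 style syslog header that carries the original host, leaving the LEEF payload itself byte-for-byte unchanged. This is an assumption about one possible workaround, not something Graylog does out of the box, and whether a wrapped message is acceptable for legal purposes is a separate question. The function names, the default priority and the IP address are hypothetical.

```python
from datetime import datetime

MONTHS = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def parse_leef_header(payload: str) -> dict:
    """Split the pipe-delimited LEEF header into named fields."""
    if not payload.startswith("LEEF:"):
        raise ValueError("not a LEEF payload")
    parts = payload.split("|")
    if len(parts) < 6:
        raise ValueError("truncated LEEF header")
    return {
        "leef_version": parts[0][len("LEEF:"):],
        "vendor": parts[1],
        "product": parts[2],
        "product_version": parts[3],
        "event_id": parts[4],
    }


def wrap_with_syslog_header(payload: str, original_host: str,
                            when: datetime, pri: int = 13) -> str:
    """Prepend '<PRI>Mmm dd hh:mm:ss HOST ' without touching the payload."""
    # RFC 3164 timestamps pad single-digit days with a space, not a zero.
    timestamp = f"{MONTHS[when.month - 1]} {when.day:2d} {when:%H:%M:%S}"
    return f"<{pri}>{timestamp} {original_host} {payload}"
```

With the example line from the question, `parse_leef_header(...)["vendor"]` is `"Check Point"`, and `wrap_with_syslog_header(line, "10.1.2.3", received_at)` yields a message whose host field is the original data source rather than the Graylog node.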

This would depend on how you set up Graylog or the cluster, meaning the inputs, input types, streams, custom or default indices, etc.

It is open source. The license is SSPL; it’s similar to the GPL, but you may not offer the software as a service. I think there is no formal guarantee.

Data retention in Graylog is defined by the index set the stream writes to. If you use separate streams, each with its own index set, for the different retention periods, this will work.
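A small Python sketch of the arithmetic behind per-customer retention, assuming each customer's stream writes to its own index set, that indices are rotated on a fixed time period, and that old indices are deleted once a maximum count is exceeded. The `RetentionPlan` class and its fields are hypothetical planning helpers, not Graylog API objects.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RetentionPlan:
    customer: str            # hypothetical customer identifier
    retention_days: int      # how long this customer's logs must be kept
    rotation_days: int = 1   # assumed time-based rotation period of the index set

    def max_indices(self) -> int:
        """Number of indices to keep so that retention_days are covered."""
        return math.ceil(self.retention_days / self.rotation_days)
```

Under these assumptions, a customer with 90 days of retention and daily rotation needs roughly 90 indices, while a customer with 365 days and weekly rotation needs about 53.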
