I use Graylog to store the logs that a Fortigate appliance sends via Syslog. Naturally, the problem is the space required to keep the logs for as long as possible. Approximately 300-400 MB of logs are collected per day, and the dedicated space on the machine where I installed Graylog is 400 GB. However, I noticed that most of the messages are DNS requests, and I do not consider it necessary to store this information. I think eliminating the DNS records would save a lot of space. I created a "drop DNS" rule to drop messages with dstport=53. The rule seems to work in that it identifies the messages, but they still appear in the feed, and the volume of logs collected daily does not decrease enough. What other methods could I use to reduce the space dedicated to log storage?
Drop DNS rule:
rule "drop DNS"
when to_long($message.dstport) == 53
then drop_message();
end
300-400 MB of logs a day seems very little if you have 400 GB of storage.
DNS logs are fine to drop if you want to decrease your volume. I'd recommend saving just the DNS queries as a substitute, but those will eat up a lot of space again.
If you want to find out what else causes a lot of logs, I recommend heatmaps. Put the source in as the row and the destination as the column, and you will see the top talkers in your environment.
Thanks for the tip, I'll try to make a heatmap and maybe identify sources that can be eliminated. Although we do not offer critical, financial or security services, it is a client's requirement to keep logs for at least 48 months. Unfortunately, apart from DNS requests, almost any other recorded message could be of interest.
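A quick back-of-the-envelope check of the numbers mentioned so far (300-400 MB/day, 400 GB of disk, 48 months) can be sketched in Python. This assumes the on-disk index size equals the raw ingest volume, which is a simplification: real usage depends on index overhead, replicas and compression, so treat the result as a rough estimate only. The function names are made up for illustration.

```python
# Rough retention estimate. Assumption: 1 MB ingested == 1 MB on disk.
DAYS_PER_MONTH = 365.25 / 12  # average month length, an approximation


def retention_days(disk_gb: float, mb_per_day: float) -> float:
    """How many days of logs fit on the disk at a given daily volume."""
    return disk_gb * 1000 / mb_per_day


def required_gb(mb_per_day: float, months: float) -> float:
    """Disk space (GB) needed to keep `months` of logs at a given daily volume."""
    return mb_per_day * months * DAYS_PER_MONTH / 1000


if __name__ == "__main__":
    for rate in (300, 400):
        print(
            f"{rate} MB/day: 400 GB lasts about {retention_days(400, rate):.0f} days, "
            f"48 months needs about {required_gb(rate, 48):.0f} GB"
        )
```

Under these assumptions, 48 months needs more than 400 GB at either daily rate, so the available disk is tighter than it looks at first.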
If it's more about recording than searching, you might be interested in the paid Graylog Operations and its archive function. It lets you move your Elasticsearch/OpenSearch indices into cold-storage files.
As far as I know, Graylog Operations is available for free for instances with less than 2 GB of data ingested into OpenSearch/Elasticsearch.
It's a cost issue. We are a company with up to 100 employees. We collect and store logs for at least 48 months as a requirement imposed by a particular client. Investing in larger and better-performing storage equipment, licensing or subscriptions would increase the cost of this process, and it would no longer be profitable.
That's odd. From a security viewpoint, DHCP and DNS are key data for finding out how the hell your infrastructure got targeted months ago, and are key for protecting yourself when things go wrong.
The DNS servers for our domain are in a separate cloud and have their own log system. The DHCP server runs on Windows Server and stores its logs locally. From a security perspective, I also use Fortigate's cloud log management system (FortiCloud), but unfortunately it does not store more than 12 months. There is no way to download the old records from there either. FortiAnalyzer would be a solution, but it comes at a rather high cost that we cannot cover. As I said before, the Graylog instance exists strictly to satisfy the client's requirement to keep records for at least 48 months.
By default, the Pipeline Processor is placed before the Message Filter Chain, so the fields aren't ready yet when the rule runs. You can change the processor order under System > Configurations, in the right pane.
Just chiming in.
If you're looking for long-term storage (48 months), perhaps you could use "Close index" on the index.
A closed index has almost no overhead on the cluster (except for maintaining its metadata) and is blocked for read/write operations. A closed index can be opened again, which will then go through the normal recovery process. This is what I like to call "Deep Storage". It might require another volume to handle 48 months. With Elasticsearch/OpenSearch it's possible to create a simple repository and migrate the closed indices to it for deep storage. Just a thought…
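As a rough illustration of the "close index" and "simple repository" idea, the sketch below builds the corresponding Elasticsearch/OpenSearch REST requests in Python without sending them. The host, the index name `graylog_42`, the repository name `deep_storage` and the mount path are all hypothetical. A shared-filesystem repository also needs its location listed under `path.repo` in the node configuration, and whether a closed index can be snapshotted directly depends on the version, so check the documentation for your release before relying on this.

```python
import json
import urllib.request

BASE_URL = "http://localhost:9200"  # assumed address of the search backend


def close_index_request(base_url: str, index: str) -> urllib.request.Request:
    """Build POST /<index>/_close, which blocks reads and writes on the index."""
    return urllib.request.Request(f"{base_url}/{index}/_close", method="POST")


def register_fs_repo_request(base_url: str, repo: str, location: str) -> urllib.request.Request:
    """Build PUT /_snapshot/<repo> for a shared-filesystem snapshot repository."""
    body = json.dumps({"type": "fs", "settings": {"location": location}}).encode()
    return urllib.request.Request(
        f"{base_url}/_snapshot/{repo}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


def snapshot_request(base_url: str, repo: str, name: str, index: str) -> urllib.request.Request:
    """Build PUT /_snapshot/<repo>/<name> to snapshot a single index."""
    body = json.dumps({"indices": index}).encode()
    return urllib.request.Request(
        f"{base_url}/_snapshot/{repo}/{name}?wait_for_completion=true",
        data=body,
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


if __name__ == "__main__":
    # Only prints the requests; pass one to urllib.request.urlopen() to send it.
    for req in (
        register_fs_repo_request(BASE_URL, "deep_storage", "/mnt/deep_storage"),
        snapshot_request(BASE_URL, "deep_storage", "graylog_42_snap", "graylog_42"),
        close_index_request(BASE_URL, "graylog_42"),
    ):
        print(req.get_method(), req.full_url)
```

The requests are only constructed here so that nothing touches a live cluster by accident.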
Does “Close index” somehow help to reduce the final occupied space?
What we did was migrate the repository we created to another volume for "Deep Storage".
The only other solution I can see in your case is to have very few fields. This would mean using an input like Syslog UDP and only creating the fields you need. The point is, the more fields are created, the more space is needed.
And the last option would be the Graylog Operations/Enterprise version, which has a pretty awesome archive solution.
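To make the "fewer fields, less space" point concrete, here is a small Python sketch that parses a key=value log line and keeps only a handful of fields. The sample line and the field list are invented for illustration and are not taken from a real Fortigate log. In Graylog itself this trimming would be done with extractors or pipeline rules rather than Python.

```python
import json
import shlex

# Hypothetical set of fields worth keeping; adjust to your own requirements.
WANTED = ("srcip", "dstip", "dstport", "action")


def parse_kv(line: str) -> dict:
    """Split a key=value log line into a dict, honouring quoted values."""
    fields = {}
    for token in shlex.split(line):
        key, sep, value = token.partition("=")
        if sep:  # skip tokens that are not key=value pairs
            fields[key] = value
    return fields


def keep_fields(fields: dict, wanted=WANTED) -> dict:
    """Return only the wanted fields that are present in the message."""
    return {k: fields[k] for k in wanted if k in fields}


if __name__ == "__main__":
    # Invented sample line in key=value style.
    sample = (
        'date=2023-01-01 time=12:00:00 devname="FG 100" type="traffic" '
        'srcip=10.0.0.1 dstip=8.8.8.8 dstport=443 action="accept"'
    )
    full = parse_kv(sample)
    slim = keep_fields(full)
    print(len(json.dumps(full)), "bytes with all fields")
    print(len(json.dumps(slim)), "bytes with selected fields")
```

The JSON byte counts are only a stand-in for index size, but they show the direction of the effect.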