Graylog/Filebeat timestamp problem

Hello,
Here’s what I installed on my VM:
Red Hat RHEL 7.6 x86_64
Graylog 4.0.2
elasticsearch 7.10.2
mongodb 4.4.2
When I send logs from another VM with the Filebeat agent
filebeat-7.11.1-x86_64
Here is my Graylog/Filebeat problem:
Filebeat re-reads the entire log file whenever the log is changed.
How can I get Graylog/Filebeat to retrieve only the newest messages from a log file?
I would really like to fix this bug, which mixes up all my data.


Hello && Welcome

Filebeat keeps the state of each file and frequently flushes the state to disk in the registry file. The state is used to remember the last offset a harvester was reading from and to ensure all log lines are sent. If the output, such as Elasticsearch or Logstash, is not reachable, Filebeat keeps track of the last lines sent and will continue reading the files as soon as the output becomes available again. While Filebeat is running, the state information is also kept in memory for each input. When Filebeat is restarted, data from the registry file is used to rebuild the state, and Filebeat continues each harvester at the last known position.
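
For reference, on Filebeat 7.x the registry normally lives under the data path, e.g. /var/lib/filebeat/registry/filebeat/log.json on an RPM install. Each state entry records the file and the byte offset the harvester reached; roughly like this (a sketch from memory, so check your own registry file rather than trusting the exact field names):

{"k":"filebeat::logs::native::2076826-2050","v":{"source":"/opt/application/xxxx/logs/xxxxxx.log","offset":51320,"timestamp":"2021-03-01T14:30:00.000Z"}}

If that offset never advances, or the file is deleted and recreated between reads, Filebeat starts from the beginning again.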

This might be something you need to configure in your Filebeat configuration file, as shown below in this post.

Hope that helps

This is my configuration for filebeat.yml:

filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /opt/application/xxxx/logs/xxxxxx.log
  multiline.pattern: '[0-9]{4}-[0-9]{2}-[0-9]{2}'
  multiline.negate: true
  multiline.match: after

# ======================= Elasticsearch template setting =======================

setup.template.settings:
  index.number_of_shards: 1
  #index.codec: best_compression
  #_source.enabled: false

# ------------------------------ Logstash Output -------------------------------

output.logstash:
  # The Logstash hosts
  hosts: ["xx.xxx.xx.xx:1761"]

# ================================= Processors =================================

processors:
- add_host_metadata:
    when.not.contains.tags: forwarded
- add_cloud_metadata: ~
- add_docker_metadata: ~
- add_kubernetes_metadata: ~

# ============================== Filebeat modules ==============================

filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml
  # Set to true to enable config reloading
  reload.enabled: false
  # Period on which files under path should be checked for changes
  #reload.period: 10s

The problem is that on each new entry it re-reads the entire file, so when a message is from 14:30 and the agent picks it up at 16:00, there is a timestamp offset in Graylog!

I understand. As I posted earlier, and seeing your configuration, it looks like you configured it for OpenSearch/ELK and not Graylog, but I could be wrong.

What I posted above should help you identify your issue and correct it. Maybe try something like this:

path.data: /var/lib/filebeat
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /opt/application/xxxx/logs/xxxxxx.log
  multiline.pattern: '[0-9]{4}-[0-9]{2}-[0-9]{2}'
  multiline.negate: true
  multiline.match: after

filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

output.logstash:
  hosts: ["92.168.1.1:5044"]
processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded

logging.level: info
logging.to_files: true
logging.files:
  path: /var/log/filebeat
  name: filebeat.log
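
Note the multiline.* options sit at the input level, not nested under the paths list. With negate: true and match: after, any line that does not start with a yyyy-MM-dd date is appended to the previous event, so a multi-line entry such as a stack trace stays one message. A made-up example of how lines would be grouped:

2021-03-01 14:30:00 ERROR job failed
    at com.example.Foo.bar(Foo.java:42)      <- appended to the event above
2021-03-01 14:30:01 INFO next run started    <- new event (matches the date pattern)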

Just to be clear: Filebeat keeps the state of each file in its registry file (see the paragraph I quoted above) and, after a restart, resumes each harvester at the last known position.

Make sure Filebeat can access its registry file; this is where it keeps the information about where it left off in the log file. I haven’t looked into it, but it might also have to do with multiline handling; I’m not 100% sure, so you would need to test that out.
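
And since your thread title mentions timestamps: Filebeat stamps events with the time it read them, so if it catches up on a file late, the events land in Graylog with the read time rather than the time written in the log line. If you want the original event time, you can parse it out of the message on the Graylog side. A rough pipeline-rule sketch, assuming your lines start with yyyy-MM-dd HH:mm:ss (the rule name and pattern are mine to illustrate; adapt them and the timezone handling to your format):

rule "use log timestamp instead of ship time"
when
    has_field("message") &&
    regex("^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}", to_string($message.message)).matches == true
then
    // pull the leading date/time out of the message itself
    let m = regex("^([0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2})", to_string($message.message));
    // and overwrite the event timestamp with it (parsed as UTC by default)
    set_field("timestamp", parse_date(value: to_string(m["0"]), pattern: "yyyy-MM-dd HH:mm:ss"));
end

Then it does not matter when the agent catches up; each event keeps its original time.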

My setup is functional for Graylog.
I receive logs without problems!

The real problem is that when you restart the Filebeat agent, or when it picks up a new log file, there is a timestamp error.
But from what I understand, this looks like the nominal behavior of Filebeat.

In the end I can’t correct that.

Sorry about your issue with Filebeat; maybe you’ll have better luck with a different log shipper.

As I stated earlier, Filebeat uses a registry file which tells it the last known position, i.e. where it left off reading the log file(s); if it does not have read/write access to that registry, it will grab the whole file again when restarting. This is a possible cause of your issue.
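
To rule that out quickly, run these as the same user that runs the Filebeat service (paths assume the default RPM layout; adjust for yours):

# list the registry; the filebeat user needs read/write access here
ls -l /var/lib/filebeat/registry/filebeat/
# sanity-check the config file and the connection to the output
filebeat test config
filebeat test output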

Have you looked at the link I showed you? Within that documentation it talks about “rotating logs”.
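
The log input also has options aimed at rotated files; a sketch of the relevant ones (the values shown are, from memory, the defaults — check the Filebeat docs for your version before changing them):

filebeat.inputs:
- type: log
  paths:
    - /opt/application/xxxx/logs/xxxxxx.log
  # keep the harvester on a file after it is renamed by rotation
  close_renamed: false
  # stop harvesting once the file is removed
  close_removed: true
  # drop registry state for files that no longer exist on disk
  clean_removed: true

Getting these wrong is a classic way to end up with re-read or missing lines after rotation.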
