Graylog/Filebeat timestamp problem

Hello,
Here’s what I installed on my VM:
Red Hat RHEL 7.6 x86_64
Graylog 4.0.2
elasticsearch 7.10.2
mongodb 4.4.2
When I send logs from another VM with the Filebeat agent
filebeat-7.11.1-x86_64
Here is my Graylog/Filebeat problem:
Filebeat re-reads the entire log file whenever the log is changed.
How can I get Graylog/Filebeat to retrieve only the newest messages from a log file?
I would really like to fix this bug, which mixes up all my data.


Hello && Welcome

Filebeat keeps the state of each file and frequently flushes the state to disk in the registry file. The state is used to remember the last offset a harvester was reading from and to ensure all log lines are sent. If the output, such as Elasticsearch or Logstash, is not reachable, Filebeat keeps track of the last lines sent and will continue reading the files as soon as the output becomes available again. While Filebeat is running, the state information is also kept in memory for each input. When Filebeat is restarted, data from the registry file is used to rebuild the state, and Filebeat continues each harvester at the last known position.
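
For reference, on Filebeat 7.x the registry normally lives under the data path, e.g. /var/lib/filebeat/registry/filebeat/log.json on an RPM install. Each state entry records the file and the byte offset the harvester reached; roughly like this (a sketch from memory, so check your own registry file rather than trusting the exact field names):

{"k":"filebeat::logs::native::2076826-2050","v":{"source":"/opt/application/xxxx/logs/xxxxxx.log","offset":51320,"timestamp":"2021-03-01T14:30:00.000Z"}}

If that offset never advances, or the file is deleted and recreated between reads, Filebeat starts from the beginning again.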

This might be something you need to configure in your Filebeat configuration file, as shown below in this post.

Hope that helps

This is my configuration for filebeat.yml:

filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /opt/application/xxxx/logs/xxxxxx.log
  multiline.pattern: '[0-9]{4}-[0-9]{2}-[0-9]{2}'
  multiline.negate: true
  multiline.match: after

# ======================= Elasticsearch template setting =======================

setup.template.settings:
  index.number_of_shards: 1
  #index.codec: best_compression
  #_source.enabled: false

# ------------------------------ Logstash Output -------------------------------

output.logstash:
  # The Logstash hosts
  hosts: ["xx.xxx.xx.xx:1761"]

# ================================= Processors =================================

processors:
- add_host_metadata:
    when.not.contains.tags: forwarded
- add_cloud_metadata: ~
- add_docker_metadata: ~
- add_kubernetes_metadata: ~

# ============================== Filebeat modules ==============================

filebeat.config.modules:
  # Glob pattern for configuration loading
  path: ${path.config}/modules.d/*.yml
  # Set to true to enable config reloading
  reload.enabled: false
  # Period on which files under path should be checked for changes
  #reload.period: 10s

The problem is that on each new entry it re-reads the entire file, so when a message is from 14:30 and the agent picks it up at 16:00, there is a timestamp offset in Graylog!

I understand. As I posted earlier, and seeing your configuration, it looks like you configured it for OpenSearch/ELK and not Graylog, but I could be wrong.

What I posted above should help you identify your issue and correct it. Maybe try something like this:

path.data: /var/lib/filebeat
filebeat.inputs:
- type: log
  enabled: true
  paths:
    - /opt/application/xxxx/logs/xxxxxx.log
  multiline.pattern: '[0-9]{4}-[0-9]{2}-[0-9]{2}'
  multiline.negate: true
  multiline.match: after

filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false

output.logstash:
  hosts: ["92.168.1.1:5044"]
processors:
  - add_host_metadata:
      when.not.contains.tags: forwarded

logging.level: info
logging.to_files: true
logging.files:
  path: /var/log/filebeat
  name: filebeat.log
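
Note the multiline.* options sit at the input level, not nested under the paths list. With negate: true and match: after, any line that does not start with a yyyy-MM-dd date is appended to the previous event, so a multi-line entry such as a stack trace stays one message. A made-up example of how lines would be grouped:

2021-03-01 14:30:00 ERROR job failed
    at com.example.Foo.bar(Foo.java:42)      <- appended to the event above
2021-03-01 14:30:01 INFO next run started    <- new event (matches the date pattern)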

Just to be clear: Filebeat keeps the state of each file in its registry file (see the paragraph I quoted above) and, after a restart, resumes each harvester at the last known position.

Make sure Filebeat can access its registry file; this is where it keeps the information about where it left off in the log file. I haven’t looked into it, but it might also have to do with multiline handling; I’m not 100% sure, so you would need to test that out.
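
And since your thread title mentions timestamps: Filebeat stamps events with the time it read them, so if it catches up on a file late, the events land in Graylog with the read time rather than the time written in the log line. If you want the original event time, you can parse it out of the message on the Graylog side. A rough pipeline-rule sketch, assuming your lines start with yyyy-MM-dd HH:mm:ss (the rule name and pattern are mine to illustrate; adapt them and the timezone handling to your format):

rule "use log timestamp instead of ship time"
when
    has_field("message") &&
    regex("^[0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2}", to_string($message.message)).matches == true
then
    // pull the leading date/time out of the message itself
    let m = regex("^([0-9]{4}-[0-9]{2}-[0-9]{2} [0-9]{2}:[0-9]{2}:[0-9]{2})", to_string($message.message));
    // and overwrite the event timestamp with it (parsed as UTC by default)
    set_field("timestamp", parse_date(value: to_string(m["0"]), pattern: "yyyy-MM-dd HH:mm:ss"));
end

Then it does not matter when the agent catches up; each event keeps its original time.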

My setup is functional for Graylog.
I receive logs without problems!

The real problem is that when you restart the Filebeat agent, or when it picks up a new log file, there is a timestamp error.
But from what I understand, this looks like the nominal behavior of Filebeat.

In the end I can’t correct that.

Sorry about your issue with Filebeat; maybe you’ll have better luck with a different log shipper.

As I stated earlier, Filebeat uses a registry file which tells it the last known position, i.e. where it left off reading the log file(s); if it does not have read/write access to that registry, it will grab the whole file again when restarting. This is a possible cause of your issue.
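
To rule that out quickly, run these as the same user that runs the Filebeat service (paths assume the default RPM layout; adjust for yours):

# list the registry; the filebeat user needs read/write access here
ls -l /var/lib/filebeat/registry/filebeat/
# sanity-check the config file and the connection to the output
filebeat test config
filebeat test output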

Have you looked at the link I showed you? Within that documentation it talks about “rotating logs”.
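
The log input also has options aimed at rotated files; a sketch of the relevant ones (the values shown are, from memory, the defaults — check the Filebeat docs for your version before changing them):

filebeat.inputs:
- type: log
  paths:
    - /opt/application/xxxx/logs/xxxxxx.log
  # keep the harvester on a file after it is renamed by rotation
  close_renamed: false
  # stop harvesting once the file is removed
  close_removed: true
  # drop registry state for files that no longer exist on disk
  clean_removed: true

Getting these wrong is a classic way to end up with re-read or missing lines after rotation.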
