Sidecars filebeat clearing read issues


#1

As the title says: I have a simple Graylog setup on a single machine that runs Elasticsearch, Graylog, MongoDB, and the sidecar that collects the logs. I have set up a few extractors, but I am only using the all_messages stream and the default index set. I can access it remotely, since it is proxied by apache2.
Now to the problem(s): I started with a single logfile to work on the extraction. That was also the currently active logfile, so data was still being appended to it, which let me confirm that the extraction was working. Next I wanted to apply it to the full collection of logfiles and reindex everything. To clear the data out of Graylog, I rotated the index and deleted the old one. To make Filebeat read the files again, I deleted its registry.
The weird part: the logfile I use rotates through a folder structure, which groups it with other logfiles. I told the collector to pick up every file, so /*/*/*/logfile for year/month/day, but only the currently active logfile was put into Graylog, repeatedly. It went up to almost 100 copies of the exact same log entry. So I stopped the sidecar, cleaned up again, and tested with an inactive logfile. That works like a charm: it gets indexed once and is searchable with all its fields. After that test I cleaned up again and set it to read everything from the current month, which includes the currently active logfile. Now nothing gets sent from the collector.
Checking the registry (location found via https://github.com/Graylog2/collector-sidecar/issues/203), all the files are listed, and their offsets suggest that they have been fully read.
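For anyone who wants to repeat that registry check, here is a minimal Python sketch. It assumes the JSON registry layout of older Filebeat releases (a list of entries with `source` and `offset` keys); newer versions store state differently. The function names are made up for illustration, and the registry path is whatever your own setup uses.

```python
import json
import os


def read_status(entries, size_of=os.path.getsize):
    """Compare each registry offset with the current size of the file.

    `entries` is the parsed registry: a list of dicts with at least
    "source" (file path) and "offset" (bytes already read). That layout
    is an assumption based on older Filebeat versions.
    """
    report = []
    for entry in entries:
        path = entry["source"]
        offset = entry["offset"]
        try:
            size = size_of(path)
        except OSError:
            # File was rotated away or the mount is unavailable.
            report.append((path, offset, None, "missing"))
            continue
        if offset == size:
            state = "fully read"
        elif offset < size:
            state = "unread data"
        else:
            state = "offset beyond end of file"
        report.append((path, offset, size, state))
    return report


def load_registry(registry_path):
    """Load the registry file from a path you supply yourself."""
    with open(registry_path, encoding="utf-8") as handle:
        return json.load(handle)
```

Calling `read_status(load_registry(path))` with the sidecar stopped gives one row per tracked file. A row marked "offset beyond end of file" means the stored offset is larger than the size the filesystem currently reports.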

So, my questions are: Is there a second location that stores the registry, to protect against an accidental reset? And why don’t I get ANY data now, including fresh entries in the currently active log?


#2

Update on not getting any data: after I left Graylog and the sidecar running for a good while, it went back to what it did when I first pointed it at all the logs, reading the currently active log over and over. Each entry was ingested almost 200 times, and along the way the journal buffer was used and something was lost in it.
Second update: one mistake located, in the config for the collector. “Tail files” was checked, which makes the registry start at the end of each file. Filebeat then only reads from there onward, so nothing gets sent for inactive files.
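To spell out why that setting silences inactive files, here is a small Python sketch of the behavior as I understand it, not Filebeat's actual code. The assumption is that a file without a registry entry starts at its current end when “tail files” is on, and at 0 otherwise; both function names are hypothetical.

```python
def initial_offset(file_size, known_offset=None, tail_files=False):
    """Pick the offset a harvester would start reading from.

    Assumption: a file that already has a registry entry resumes at
    that offset; a file without one starts at the end when tail_files
    is enabled, otherwise at the beginning.
    """
    if known_offset is not None:
        return known_offset
    return file_size if tail_files else 0


def bytes_to_ship(file_size, known_offset=None, tail_files=False):
    """Number of bytes that would be read on the first pass."""
    return file_size - initial_offset(file_size, known_offset, tail_files)
```

With a freshly deleted registry, `bytes_to_ship(5000, tail_files=True)` is 0 for a logfile that no longer grows, while `bytes_to_ship(5000)` is 5000. Only a file that keeps receiving new lines ever produces output in the first case.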
The problem with repeated full-reading of the currently active log persists.


#3

Located the fault: Filebeat logs the file as truncated, probably because it sits on a remote mount for the VM this is running on. Are there any known solutions for that?
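My working theory, as a Python sketch rather than anything taken from Filebeat's source: a file counts as truncated when the size reported by the filesystem is smaller than the stored offset, and reading then restarts at 0. If the remote mount occasionally reports a stale, smaller size (an unconfirmed guess on my part), every such moment triggers one full re-read. The function names and the numbers are made up.

```python
def next_read_offset(stored_offset, reported_size):
    """Return (offset_to_read_from, truncated).

    Assumption: a reported size below the stored offset is treated as
    truncation, and the file is read again from the start.
    """
    if reported_size < stored_offset:
        return 0, True
    return stored_offset, False


def count_full_rereads(reported_sizes):
    """Count how often a growing file would be re-read from offset 0."""
    offset = 0
    rereads = 0
    for size in reported_sizes:
        offset, truncated = next_read_offset(offset, size)
        if truncated:
            rereads += 1
        # The harvester reads up to the size it was told about.
        offset = size
    return rereads
```

For example, `count_full_rereads([100, 120, 90, 130, 125, 140])` gives 2: the file only ever grew, but the two stale size reports (90 and 125) each look like a truncation.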


(Jan Doberstein) #4

Depending on how you have this mounted locally, a solution might exist. But it is known that Filebeat has issues with remote file locations.

