As the title says: I have a simple Graylog setup, a single machine running Elasticsearch, Graylog, MongoDB, and the sidecar that collects the logs. I set up a few extraction mechanisms, but I only use the all_messages stream and the default index set. I can access it remotely, since it is proxied by apache2.
Now to the problem(s): I started with a single logfile to work on the extraction. That was also the currently active logfile, so data was still being appended to it. That way I was able to confirm that the extraction worked fine. Next I wanted to apply it to the full collection of logfiles and reindex everything. To clear the data out of Graylog, I rotated the index and deleted the old one. To make it read the file again, I deleted Filebeat's registry.
The weird things: The logfile I use rotates through a folder structure, which groups it with other logfiles. I told the collector to pick up every file, so /*/*/*/logfile, for year/month/day, but only the currently active logfile was put into Graylog, repeatedly. It went up to almost 100 copies of the exact same log entry. So I stopped the sidecar, cleaned up again, and tested with an inactive logfile. That works like a charm: it gets indexed once and is searchable with all its fields. After that test I cleaned up again and set it to read everything from the current month, which includes the currently active logfile. Now nothing gets sent from the collector.
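To rule out the pattern itself, the files a year/month/day glob expands to can be listed outside the collector. This is only a minimal sketch: `base_dir` and the function name are hypothetical, and Python's `glob` is assumed to treat `*` the same way Filebeat does (one directory level per `*`), which I have not verified for every edge case.

```python
import glob
import os


def matching_logfiles(base_dir, filename="logfile"):
    """Return all files matching base_dir/*/*/*/filename, sorted by path.

    base_dir is a hypothetical root directory; each "*" stands for one
    level of the year/month/day folder structure.
    """
    pattern = os.path.join(base_dir, "*", "*", "*", filename)
    return sorted(glob.glob(pattern))
```

If this lists every daily logfile, the pattern is probably not the reason only the active file showed up.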
Checking the registry (location found via https://github.com/Graylog2/collector-sidecar/issues/203), all the files are listed in it, and their offsets suggest that they have been read completely.
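For anyone who wants to repeat that check, comparing each registry offset with the file's size on disk could look roughly like this. It is a sketch under assumptions: it expects the registry to be a single JSON array whose entries have `source` and `offset` keys, as older Filebeat versions wrote it (newer versions may store it differently), and the function names are my own.

```python
import json
import os


def load_registry(path):
    """Load the registry, assumed to be one JSON array of file states."""
    with open(path, encoding="utf-8") as fh:
        return json.load(fh)


def classify_entries(entries, size_of=os.path.getsize):
    """Compare each entry's offset with the current size of its file.

    Returns a list of (source, offset, size, status) tuples, where status
    is "fully read", "pending", or "missing" (file no longer exists).
    """
    report = []
    for entry in entries:
        source = entry["source"]  # assumed key: path of the harvested file
        offset = entry["offset"]  # assumed key: bytes already read
        try:
            size = size_of(source)
        except OSError:
            report.append((source, offset, None, "missing"))
            continue
        status = "fully read" if offset >= size else "pending"
        report.append((source, offset, size, status))
    return report
```

Calling `classify_entries(load_registry("/path/to/registry"))` (placeholder path) would then show whether the active logfile is marked "fully read" even though new lines have been appended since.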
So, my questions would be: Is there a second location where the registry is stored, as protection against an accidental reset? And why don’t I get ANY data now, including fresh entries in the currently active log?