bignjato
(Boris Ignjatović)
November 9, 2017, 8:36am
1
Hello,
I have a problem with missing log messages when importing log files from a folder through Graylog Collector Sidecar (NXLog backend).
I tested the upload with these results:
Log file with 30000 fields
Import test 1 - 29.967 logs imported
Import test 2 - 22.939 logs imported
Import test 3 - 27.646 logs imported
Log file with 100 fields
Import test 1 - 100 logs imported
Import test 2 - 100 logs imported
Import test 3 - 100 logs imported
Log file with 1000 fields
Import test 1 - 1000 logs imported
Import test 2 - 1000 logs imported
Import test 3 - 1000 logs imported
4 x log file with 100 fields
Import test 1 - 397 logs imported
Import test 2 - 342 logs imported
Import test 3 - 378 logs imported
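For reference, the loss implied by these counts can be worked out with a short sketch. It assumes that each "field" in a test file corresponds to one expected log message (so the 4 x 100 test expects 400), which the 100 and 1000 results suggest but the post does not state. The name `loss_percent` is made up for illustration.

```python
# Expected vs. imported counts taken from the tests listed above.
# Assumption: one "field" in a test file equals one expected log message.
RESULTS = {
    30000: [29967, 22939, 27646],
    100: [100, 100, 100],
    1000: [1000, 1000, 1000],
    400: [397, 342, 378],  # 4 files x 100 entries
}


def loss_percent(expected: int, imported: int) -> float:
    """Return the share of expected messages that never showed up, in percent."""
    if expected <= 0:
        raise ValueError("expected must be positive")
    return round(100.0 * (expected - imported) / expected, 2)


if __name__ == "__main__":
    for expected, runs in RESULTS.items():
        losses = [loss_percent(expected, imported) for imported in runs]
        print(f"{expected:>6} expected -> lost {losses} %")
```

Under that assumption the loss is not proportional to file size alone: the single 1000-entry file loses nothing, while four parallel 100-entry files lose up to 14.5 %.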
Graylog OVA with 6 CPUs and 12 GB RAM
Configuration
Path: /var/opt/graylog/data/journal
Earliest entry: 4 minutes ago
Maximum size: 50.0GB
Maximum age: 12 hours 0 minutes
Flush policy: Every 1,000,000 messages or 1 minutes 0 seconds
jochen
(Jochen)
November 9, 2017, 9:51am
2
How exactly are you transporting the logs to Graylog?
How exactly are you verifying how many log messages have been ingested and indexed?
bignjato
(Boris Ignjatović)
November 9, 2017, 11:21am
3
Hello @jochen
I import log files from a folder through Graylog Collector Sidecar (NXLog backend) and then verify the number of imported messages in the search.
I start every upload test from a clean state.
I search by the name of the imported file.
jochen
(Jochen)
November 9, 2017, 11:47am
4
What transport protocol are you using?
Raw/Plaintext? GELF? Syslog? Over TCP or UDP?
jochen
(Jochen)
November 9, 2017, 11:51am
6
The suspense is killing me…
What type of input are you using in Graylog?
It seems to be UDP-based, so keep in mind that UDP is a connectionless protocol without delivery guarantees (in contrast to TCP), and you might lose network packets without noticing.
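To illustrate the "without noticing" part, here is a minimal Python sketch (not taken from NXLog or Graylog) that sends a UDP datagram to a local port nobody is listening on. The helper name `send_udp_to_closed_port` is hypothetical; the point is only that the send call itself reports success.

```python
import socket


def send_udp_to_closed_port(payload: bytes) -> int:
    """Send one datagram to a local port nobody listens on; return bytes 'sent'."""
    # Reserve a free port, then close the socket so nothing is listening there.
    probe = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    probe.bind(("127.0.0.1", 0))
    closed_port = probe.getsockname()[1]
    probe.close()

    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # sendto() only hands the datagram to the kernel. There is no
        # acknowledgement, so a missing or overloaded receiver goes unnoticed.
        return sender.sendto(payload, ("127.0.0.1", closed_port))
    finally:
        sender.close()


if __name__ == "__main__":
    sent = send_udp_to_closed_port(b'{"short_message": "hello"}')
    print(f"sendto() reported {sent} bytes sent, yet nothing received them")
```

The same applies when a receiver exists but its socket buffer is full: the sender sees no error, and the datagram is simply dropped.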
bignjato
(Boris Ignjatović)
November 9, 2017, 11:55am
7
Sorry about that.
I run Graylog Collector Sidecar with the NXLog backend on the same machine as Graylog; it sends the logs from a folder to Graylog on that machine.
Yes, I use GELF UDP. Is your proposal to use TCP instead?
Sorry for my replies.
jochen
(Jochen)
November 9, 2017, 12:53pm
8
You could try that.
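For anyone testing a GELF TCP input by hand, a rough Python sketch of the framing follows. It assumes a GELF TCP input that uses the null byte as frame delimiter; the function names `build_gelf_tcp_frame` and `send_frames` are made up for this example and are not part of any Graylog or NXLog API.

```python
import json
import socket
import time


def build_gelf_tcp_frame(short_message: str, host: str, **extra: object) -> bytes:
    """Encode one GELF 1.1 message as a null-terminated frame for a GELF TCP input."""
    message = {
        "version": "1.1",
        "host": host,
        "short_message": short_message,
        "timestamp": time.time(),
    }
    # GELF additional fields carry a leading underscore.
    for key, value in extra.items():
        message[f"_{key}"] = value
    # GELF over TCP is uncompressed JSON, one message per frame, ended by a null byte.
    return json.dumps(message).encode("utf-8") + b"\x00"


def send_frames(frames, server: str, port: int, timeout: float = 5.0) -> None:
    """Send frames over one TCP connection; raises OSError if the connection fails."""
    with socket.create_connection((server, port), timeout=timeout) as conn:
        for frame in frames:
            conn.sendall(frame)
```

A call might look like `send_frames([build_gelf_tcp_frame("line 1", "sidecar-host", FileName="test.log")], "127.0.0.1", 12201)`, with host and port replaced by those of your own input. A successful `sendall()` only means the operating system accepted the data; it does not prove that the message was indexed.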
Also make sure to read this GitHub issue:
opened 05:23PM - 21 Jul 17 UTC
closed 12:19PM - 15 Sep 17 UTC
needs-input
to-verify
When messages are sent at a rate (presumably too high for Graylog to process), messages are dropped without any kind of client-side errors occurring. Test cases were run on multiple dockerized instances of Graylog server using the following test script: https://github.com/Aenima4six2/graylog-delivery-ratio
## Expected Behavior
Using a TCP input should guarantee that any message sent is successfully delivered and processed, or that an error occurs at the client end.
## Current Behavior
When a high rate of messages is sent to the Graylog server, the client is informed that all packets (thus messages) were received; however, the Graylog journal and Elasticsearch index show otherwise (e.g. 10000 messages sent and 9800 in the journal).
<img width="1064" alt="screen shot 2017-07-21 at 11 15 08 am" src="https://user-images.githubusercontent.com/2087362/28474831-455d0b58-6e07-11e7-8a00-6e1ac282704d.png">
## Possible Solution
Possibly rate limit or throttle incoming TCP/UDP connections such that all can be processed in time without timeouts or dropped messages.
## Steps to Reproduce (for bugs)
1. docker compose up with the provided compose file (below).
<img width="754" alt="screen shot 2017-07-21 at 11 10 28 am" src="https://user-images.githubusercontent.com/2087362/28474373-b1b40042-6e05-11e7-85d5-5ecc886c61c9.png">
2. Ensure a TCP input has been created and started on the default GELF port (12201).
3. If using provided test script, clone or download python test script from repo above. Ensure python3 is installed and restore all dependencies with pip install -r requirements.txt
4. Send an onslaught of messages via TCP to graylog using a correlation id to count how many of the messages were received in the batch. If using the test script, use:
`python3 test_delivery_ratio.py -t 10000 -m TCP -T 1 -v`
5. If using the test script, evaluate the delivery ratio; if not, query the Graylog journal and/or Elasticsearch using the correlation id above.
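The counting idea in steps 4 and 5 can be sketched in Python as follows. This is not the `test_delivery_ratio.py` script from the repository above; it only illustrates tagging a batch with one correlation id and a sequence number so that missing messages can be identified afterwards. The field names `_correlation_id` and `_sequence` and both function names are assumptions made for this example.

```python
import uuid


def make_batch(count: int, host: str = "delivery-test"):
    """Return (correlation_id, messages) where every message carries the same id."""
    correlation_id = uuid.uuid4().hex
    messages = [
        {
            "version": "1.1",
            "host": host,
            "short_message": f"delivery test {index}",
            "_correlation_id": correlation_id,  # hypothetical field name
            "_sequence": index,  # hypothetical field name
        }
        for index in range(count)
    ]
    return correlation_id, messages


def missing_sequences(sent_count: int, received_sequences) -> list:
    """Sequence numbers that were sent but are absent from the received ones."""
    return sorted(set(range(sent_count)) - set(received_sequences))


if __name__ == "__main__":
    correlation_id, messages = make_batch(10)
    # Pretend the server only stored some of them; in a real test these
    # numbers would come from a search on the correlation id.
    stored = [0, 1, 2, 4, 5, 6, 7, 9]
    print(correlation_id, "missing:", missing_sequences(len(messages), stored))
```

Knowing which sequence numbers are missing, rather than only how many, can help tell whether losses cluster at the start, the end, or during bursts.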
## Context
This issue requires us to use GELF over HTTP, which is inherently slower than TCP.
Note: Also tried raising the TCP receive buffer to 50 megabytes but problem still persists.
The problem seems to be at its worst when Graylog is under heavy load and concurrent TCP sends are in progress.
## Your Environment
Using the following docker compose file
```
version: "3"
services:
  bd-mongo:
    image: "mongo:3"
    volumes:
      - ./data/mongo:/data/db
  bd-elasticsearch:
    image: "elasticsearch:2"
    command: "elasticsearch -Des.cluster.name='graylog'"
    volumes:
      - ./data/elasticsearch:/usr/share/elasticsearch/data
  graylog:
    image: graylog2/server:latest
    volumes:
      - ./data/graylog/data/journal:/usr/share/graylog/data/journal
      - ./data/graylog/config:/usr/share/graylog/data/config
      - ./data/graylog/plugin:/usr/share/graylog/plugin
    environment:
      GRAYLOG_PASSWORD_SECRET: somepasswordpepper
      GRAYLOG_ROOT_PASSWORD_SHA2: 8c6976e5b5410415bde908bd4dee15dfb167a9c873fc4bb8a81f6f2ab448a918
      GRAYLOG_WEB_ENDPOINT_URI: http://localhost:9000/api
      GRAYLOG_SERVER_JAVA_OPTS: -Xdebug -agentlib:jdwp=transport=dt_socket,address=9999,server=y,suspend=n
    links:
      - bd-mongo:mongo
      - bd-elasticsearch:elasticsearch
    ports:
      - "9000:9000"
      - "12201:12201"
      - "12201:12201/udp"
      - "9999:9999"
      - "9999:9999/udp"
```
* Graylog Version: Graylog v2.2.3+7adc951
* Elasticsearch Version: 2.3
* MongoDB Version: 3
* Operating System: CentOS 7 and Mac OS 10.12.5
* Browser version: N/A
bignjato
(Boris Ignjatović)
November 9, 2017, 12:56pm
9
Yes, I tried with TCP and 1 000 000 messages went through fine. I think TCP will resolve all my problems!
Thanks, you saved me!
bignjato
(Boris Ignjatović)
November 15, 2017, 9:39am
10
No, the error is the same: when I import all my logs, the total count is different every time.
Import 1 - 17,951.131 messages
Import 2 - 9,671.136 messages
Import 3 - 16,245.694 messages
Before every test I clean the indices and the hot folder.
NXLog and collector-sidecar report no errors.
Which plugin do you use to import logs manually with collector-sidecar?
This is my NXLog setup:
define ROOT /usr/bin
<Extension gelf>
    Module xm_gelf
</Extension>
<Extension 59949e3cca105203b1fb0d79-multiline>
    Module xm_multiline
    HeaderLine /^-./
</Extension>
<Extension 599bea29ca105203d800403c-multiline>
    Module xm_multiline
    HeaderLine /^\d{4}-\d{2}-\d{2}/
</Extension>
<Processor 59949e3cca105203b1fb0d79-buffer>
    Type Mem
    Module pm_buffer
    MaxSize 16384
</Processor>
<Processor 599bea29ca105203d800403c-buffer>
    Module pm_buffer
    MaxSize 16384
    Type Mem
</Processor>
<Processor 59a664bfca105208352055fc-buffer>
    Module pm_buffer
    MaxSize 16384
    Type Mem
</Processor>
User nxlog
Group nxlog
Moduledir /usr/lib/nxlog/modules
CacheDir /var/spool/collector-sidecar/nxlog
PidFile /var/run/graylog/collector-sidecar/nxlog.pid
define LOGFILE /var/log/graylog/collector-sidecar/nxlog.log
LogFile %LOGFILE%
LogLevel INFO
<Extension logrotate>
    Module xm_fileop
    <Schedule>
        When @daily
        Exec file_cycle('%LOGFILE%', 7);
    </Schedule>
</Extension>
<Input 59949e3cca105203b1fb0d79>
    Module im_file
    File '/swisslog/*.log*'
    PollInterval 120
    SavePos True
    ReadFromLast False
    Recursive True
    RenameCheck True
    Exec $FileName = file_name(); # Send file name with each message
    InputType 59949e3cca105203b1fb0d79-multiline
</Input>
<Input 599bea29ca105203d800403c>
    Module im_file
    File '/swisslog/*.txt*'
    PollInterval 30
    SavePos True
    ReadFromLast False
    Recursive True
    RenameCheck True
    Exec $FileName = file_name(); # Send file name with each message
    InputType 599bea29ca105203d800403c-multiline
</Input>
<Input 59a664bfca105208352055fc>
    Module im_file
    File '/swisslog/*.tsv*'
    PollInterval 5
    SavePos True
    ReadFromLast False
    Recursive True
    RenameCheck True
    Exec $FileName = file_name(); # Send file name with each message
</Input>
<Output 59949e20ca105203b1fb0d58>
    Module om_tcp
    Host 172.16.11.19
    Port 5045
    OutputType GELF_TCP
    Exec $short_message = $raw_event; # Avoids truncation of the short_message field.
    Exec $gl2_source_collector = 'f2fb322c-dcd3-44e8-b5c1-10fb0ba92188';
    Exec $collector_node_id = 'graylog-collector-sidecar';
    Exec $Hostname = hostname_fqdn();
</Output>
<Output 599bea29ca105203d800403b>
    Module om_tcp
    Host 172.16.11.19
    Port 5055
    OutputType GELF_TCP
    Exec $short_message = $raw_event; # Avoids truncation of the short_message field.
    Exec $gl2_source_collector = 'f2fb322c-dcd3-44e8-b5c1-10fb0ba92188';
    Exec $collector_node_id = 'graylog-collector-sidecar';
    Exec $Hostname = hostname_fqdn();
</Output>
<Output 59a664bfca105208352055fb>
    Module om_tcp
    Host 172.16.11.19
    Port 5060
    OutputType GELF_TCP
    Exec $short_message = $raw_event; # Avoids truncation of the short_message field.
    Exec $gl2_source_collector = 'f2fb322c-dcd3-44e8-b5c1-10fb0ba92188';
    Exec $collector_node_id = 'graylog-collector-sidecar';
    Exec $Hostname = hostname_fqdn();
</Output>
<Route route-0>
    Path 59949e3cca105203b1fb0d79 => 59949e3cca105203b1fb0d79-buffer => 59949e20ca105203b1fb0d58
</Route>
<Route route-1>
    Path 599bea29ca105203d800403c => 599bea29ca105203d800403c-buffer => 599bea29ca105203d800403b
</Route>
<Route route-2>
    Path 59a664bfca105208352055fc => 59a664bfca105208352055fc-buffer => 59a664bfca105208352055fb
</Route>
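Because two of the inputs use xm_multiline, the number of events NXLog emits is not the number of lines in the files, so a plain line count is not a reliable reference for the totals above. One rough way to get an expected count is to count the lines that match the HeaderLine pattern, sketched here in Python for the `*.txt*` input. The function names are hypothetical, and the sketch ignores any lines that appear before the first header, whose handling depends on NXLog.

```python
import re
from pathlib import Path

# Same pattern as the HeaderLine of the second xm_multiline extension above.
HEADER = re.compile(r"^\d{4}-\d{2}-\d{2}")


def count_expected_events(lines) -> int:
    """Count lines that start a new multiline event (lines matching HEADER)."""
    return sum(1 for line in lines if HEADER.match(line))


def count_in_folder(folder: str, pattern: str = "*.txt*") -> int:
    """Sum expected events over all files below folder that match pattern."""
    total = 0
    # rglob() descends into subfolders, similar to Recursive True in im_file.
    for path in Path(folder).rglob(pattern):
        if path.is_file():
            with path.open(encoding="utf-8", errors="replace") as handle:
                total += count_expected_events(handle)
    return total
```

Running `count_in_folder("/swisslog")` before an import would give a number to compare against the search result for the same files, assuming the files are not modified in between.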
system
(system)
Closed
November 29, 2017, 9:39am
11
This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.