I want to implement a log aggregation solution in my company. I’m going for Graylog but I’m still struggling with how to implement log shipping.
I have a lot of different file types to ship to Graylog.
Syslog is not a problem; that part is easy and already done.
Here is what I did:
On the source server:
nxlog ships each file to a specific output (I’m still struggling with the nxlog configuration, by the way).
For each log type, I want to specify a regex that identifies the beginning of each message, but I’m unsure how to do it.
On the log server:
One input on a specific port per log type.
I’ll use extractors on each input.
Does this make sense? Is there a better way?
Maybe I missed something important about extractors?
As you can see, I’m a bit confused about how to organize the whole thing.
Generally, separating the log types is a good thing: you don’t waste resources working out the type of each message, e.g. running an extractor just to tell your nginx logs apart from the default OS logs.
I did read the documentation but I’m still confused about the use of pipelines.
I have several servers to configure. All have the same configuration:
tomcat
mysql
syslog: not mandatory (for me)
in-house application logs
With Graylog 2.5 and the old sidecar, I easily configured my agent (filebeat) to handle complex multi-line messages such as the Tomcat garbage collector log; specifically, I configured a regex that matches the beginning of each message.
With Graylog 3 and the new sidecar, which is a clever idea, I’m still struggling with this configuration.
I changed my agent from filebeat to nxlog, because nxlog can manage multiple outputs.
Because I’m having a hard time parsing messages in nxlog, I was wondering if pipelines could do the work I want from nxlog: detect the beginning of complex messages with a regex, especially for my garbage collector log.
Speaking generally (because I don’t use GL 3.0, and I last used nxlog a long time ago):
So yes, regexes are a good thing, but try to avoid them where you can. They consume resources, and you need those resources on the GL server.
Both nxlog and filebeat can add an extra tag to the message. Use that, and you only need to check whether the tag exists or whether a field has a fixed value (an exact match), instead of using a regex to recognize the format of the log.
So if you can, push some of the work to the client, and don’t create unnecessary work for yourself.
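To illustrate the tagging idea for nxlog, here is a minimal sketch, not a tested configuration. It assumes a custom field named log_type (a hypothetical name; any field name would do) and a single GELF TCP input on the Graylog side listening on port 12201 (also an assumption; use whatever port your input actually listens on). The file paths and the GC header regex are taken from the configuration posted in this thread.

```
<Extension gelfExt>
    Module      xm_gelf
</Extension>

<Extension multiline_gclog>
    Module      xm_multiline
    HeaderLine  /\{Heap\Wbefore\WGC/
</Extension>

<Input in_gclog>
    Module      im_file
    InputType   multiline_gclog
    File        '/artis/plateforme/tomcat/logs/gc*.log*'
    # log_type is a hypothetical custom field used as the tag
    Exec        $log_type = 'gclog';
</Input>

<Input in_mysql-slow>
    Module      im_file
    File        '/var/log/mysql/mysql-slow.log'
    Exec        $log_type = 'mysql-slow';
</Input>

<Output out_graylog>
    Module      om_tcp
    Host        GRAYLOG_IP
    # 12201 is an assumed port, not taken from the thread
    Port        12201
    OutputType  GELF_TCP
</Output>

<Route all_to_graylog>
    Path        in_gclog, in_mysql-slow => out_graylog
</Route>
```

With a tag like this on every message, a stream rule or pipeline rule on the Graylog side can test log_type for an exact value instead of running a regex against the message body.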
The problem is that I’m monitoring Linux servers. Furthermore, I use nxlog because it can route multiple inputs to multiple outputs. On my Graylog, I have one input per log type.
### VARIABLES ###
define ROOT /usr/bin
define BASELOGLINUX /var/log
define BASELOGARTIS /artis/ArtisWeb/logs
define BASEPLATEFORME /artis/plateforme
### EXTENSIONS ###
<Extension gelfExt>
Module xm_gelf
# Avoid truncation of the short_message field to 64 characters.
ShortMessageLength 65536
</Extension>
<Extension multiline_gclog>
Module xm_multiline
Headerline /\{Heap\Wbefore\WGC/
</Extension>
<Extension multiline_artisLog>
Module xm_multiline
Headerline /\d\d\/\d\d\/\d\d\W\d\d\:\d\d\:\d\d/
</Extension>
<Extension multiline_mysql-error>
Module xm_multiline
Headerline /[0-9]{4}\W[0-9]{2}\W[0-9]{2}\W[0-9]{2}\W[0-9]{2}\W[0-9]{2}/
</Extension>
### CONF NXLOG ###
User nxlog
Group nxlog
Moduledir /usr/lib/nxlog/modules
CacheDir /var/spool/nxlog/data
PidFile /var/run/nxlog/nxlog.pid
LogFile /var/log/nxlog/nxlog.log
### INPUTS ###
<Input in_mysql-error>
Module im_file
InputType multiline_mysql-error
File '%BASELOGLINUX%/mysql/error.log'
PollInterval 1
SavePos True
ReadFromLast True
Recursive False
RenameCheck False
Exec $FileName = file_name(); # Send file name with each message
</Input>
<Input in_mysql-slow>
Module im_file
File '%BASELOGLINUX%/mysql/mysql-slow.log'
PollInterval 1
SavePos True
ReadFromLast True
Recursive False
RenameCheck False
Exec $FileName = file_name(); # Send file name with each message
</Input>
<Input in_artisLog>
Module im_file
InputType multiline_artisLog
File '%BASELOGARTIS%/artisLog.log'
#Exec if $raw_event !~ /\d\d\/\d\d\/\d\d\W\d\d\:\d\d\:\d\d/ drop();
Exec $FileName = file_name(); # Send file name with each message
</Input>
<Input in_gclog>
Module im_file
InputType multiline_gclog
File '%BASEPLATEFORME%/tomcat*/logs/gc*.log*'
Recursive TRUE
#Exec if $raw_event !~ /\{Heap\Wbefore+\WGC/ drop();
Exec $FileName = file_name(); # Send file name with each message
</Input>
### OUTPUTS ###
<Output out_artisLog>
Module om_tcp
Host GRAYLOG_IP
Port 10203
OutputType GELF_TCP
<Exec>
# These fields are needed for Graylog
$gl2_source_collector = '${sidecar.nodeId}';
$collector_node_id = '${sidecar.nodeName}';
</Exec>
</Output>
<Output out_mysql-error>
Module om_tcp
Host GRAYLOG_IP
Port 10701
OutputType GELF_TCP
<Exec>
# These fields are needed for Graylog
$gl2_source_collector = '${sidecar.nodeId}';
$collector_node_id = '${sidecar.nodeName}';
</Exec>
</Output>
<Output out_mysql-slow>
Module om_tcp
Host GRAYLOG_IP
Port 10702
OutputType GELF_TCP
<Exec>
# These fields are needed for Graylog
$gl2_source_collector = '${sidecar.nodeId}';
$collector_node_id = '${sidecar.nodeName}';
</Exec>
</Output>
<Output out_gclog>
Module om_tcp
Host GRAYLOG_IP
Port 10601
OutputType GELF_TCP
<Exec>
# These fields are needed for Graylog
$gl2_source_collector = '${sidecar.nodeId}';
$collector_node_id = '${sidecar.nodeName}';
</Exec>
</Output>
### ROUTES ###
<Route mysql-error_to_tcp>
Path in_mysql-error => out_mysql-error
</Route>
<Route mysql-slow_to_tcp>
Path in_mysql-slow => out_mysql-slow
</Route>
<Route artislog_to_tcp>
Path in_artisLog => out_artisLog
</Route>
<Route gclog_to_tcp>
Path in_gclog => out_gclog
</Route>
As I was writing this, I found my problem: nxlog doesn’t handle wildcards in directory names, only in file names.
As soon as I changed this:
File '%BASEPLATEFORME%/tomcat*/logs/gc*.log*'
into this:
File '%BASEPLATEFORME%/tomcat/logs/gc*.log*'
I saw messages coming in. I’m still confused about pipelines and rules, but my first step is almost done: shipping multiple files from one server, each sorted to its own input on Graylog.
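If several Tomcat directories have to be covered and directory wildcards are not available, one possible workaround is to declare one input per directory and route them all to the same output. The sketch below is untested and assumes two hypothetical directories, tomcat1 and tomcat2. It relies on the BASEPLATEFORME define, the multiline_gclog extension, and the out_gclog output from the configuration above, and its route would replace the existing gclog_to_tcp route.

```
# tomcat1 and tomcat2 are hypothetical directory names
<Input in_gclog_tomcat1>
    Module      im_file
    InputType   multiline_gclog
    File        '%BASEPLATEFORME%/tomcat1/logs/gc*.log*'
    Exec        $FileName = file_name(); # Send file name with each message
</Input>

<Input in_gclog_tomcat2>
    Module      im_file
    InputType   multiline_gclog
    File        '%BASEPLATEFORME%/tomcat2/logs/gc*.log*'
    Exec        $FileName = file_name(); # Send file name with each message
</Input>

# Both inputs feed the existing GC log output
<Route gclog_to_tcp>
    Path        in_gclog_tomcat1, in_gclog_tomcat2 => out_gclog
</Route>
```

Because each message still carries $FileName, the originating Tomcat instance can be recovered on the Graylog side from the path.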