How to use Graylog multi-node

Hi
I have some questions about running Graylog with multiple nodes.
The multi-node setup itself works.
But I send messages to only one of the servers, and the other servers receive hardly any messages.

I use rsyslog v8.19.0 to send messages to Graylog.
In the rsyslog config, the server that receives the logs is p1log2. The other servers are idle.

Please help me analyze what the problem is.

What’s the configuration of your clients?
What’s the configuration of the inputs in Graylog?
Are you using any load-balancer in front of Graylog’s inputs?

Hi
I use rsyslog v8.19.0.

# config file
ruleset(name="graylog") {
    action(type="omfwd"
        Protocol="tcp"
        Target="gek.bl.com" # 10.201.240.2 / p1log2
        Template="IpTemplate"
        Port="1514"
    )
    stop
}

input(type="imfile"
    File="/root/graylog_test.log"
    Tag="test"
    PersistStateInterval="1000"
    reopenOnTruncate="on"
    addMetadata="on"
    Ruleset="graylog"
)

$template IpTemplate,"{\"hostname\": \"%HOSTNAME%\", \"ngx_tag\": \"%syslogtag%\", \"\": %msg:::sp-if-no-1st-sp%%msg:::drop-last-lf%}\n"

I used HAProxy to proxy a different port, to test whether I could forward logs to the other nodes, but then the source IP is lost.
The address of the log source becomes the address of HAProxy. HAProxy and Graylog are deployed on the same server.

HAProxy config file:

frontend ha_h5
    bind *:1536
    mode tcp
    option tcplog
    log global
    default_backend h5_tcp

backend h5_tcp
    mode tcp
    balance leastconn
    server gek1 10.201.240.1:1526 weight 5 check inter 2000 rise 2 fall 3
    server gek2 10.201.240.1:1526 weight 5 check inter 2000 rise 2 fall 3
    server gek3 10.201.240.1:1526 weight 5 check inter 2000 rise 2 fall 3

You’re using the same backend server for every request.

I’m sorry. I made a mistake.
Correct configuration:
server gek1 10.201.240.1:1526 weight 5 check inter 2000 rise 2 fall 3
server gek2 10.201.240.2:1526 weight 5 check inter 2000 rise 2 fall 3
server gek3 10.201.240.3:1526 weight 5 check inter 2000 rise 2 fall 3

I think I’ve found a solution: add a field to the rsyslog template that carries the source IP address (source: "IP address").
The template now looks like this:

$template IpTemplate,"{\"hostname\": \"%HOSTNAME%\", \"source\": \"10.201.143.183\", \"ngx_tag\": \"%syslogtag%\", \"\": %msg:::sp-if-no-1st-sp%%msg:::drop-last-lf%}\n"

For load balancing we use a different approach, which gives near-equal balancing over 4 nodes.
We send the syslog messages from rsyslog and syslog-ng via AMQP to a RabbitMQ cluster in front of Graylog.

This has 4 major benefits:

  • If Graylog cannot keep up with the incoming message load during a spike, the messages are stored on the RabbitMQ cluster in memory or on disk (depending on your policies) and collected a few seconds later, when Graylog can process them.
  • No loss of messages: each message committed to RabbitMQ will be delivered to a node. If a node goes down in an uncontrolled fashion and the message isn’t ACKed, it is sent to another Graylog node for processing.
  • The exact message format is retained in the RabbitMQ message, and non-syslog-format messages can be handled in the same input if required.
  • When a Graylog node is reloaded, traffic does not balance back to it. Instead, the TCP syslog connections stay active on the other nodes, which makes processing imbalanced. RabbitMQ solves this because messages are pulled into the Graylog nodes from RabbitMQ rather than pushed in via syslog.
    We found the very same problem when we started to scale up and used HAProxy to balance inbound syslog.

We have a RabbitMQ cluster running on 2 nodes that handles 8k per second and ramps up to 80k per second. This works much better for us than using HAProxy to try to roughly balance TCP connections.
If a log generator doesn’t support AMQP natively, there is a pair of syslog-ng boxes that convert syslog (UDP, TCP, any format) to AMQP. They also have disk buffering enabled to ensure no messages are lost.


Just to pick up the initial post again: when sending syslog over TCP, the syslog daemon won’t create a new TCP connection for each syslog message, which would be quite costly. Instead it keeps the connection open and alive as long as possible, so all messages end up on the node that accepted that connection.

If you want each syslog message to be able to go to a different receiver, you might want to use syslog over UDP instead.
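
As a minimal sketch of what that could look like with the rsyslog config posted earlier, only the protocol of the omfwd action changes. This assumes a Syslog UDP input is listening on the same port on the Graylog side, and that whatever answers for the target name can spread datagrams across nodes; the thread confirms neither.

```
ruleset(name="graylog") {
    action(type="omfwd"
        Protocol="udp"        # was "tcp"; every message is sent as its own datagram
        Target="gek.bl.com"
        Template="IpTemplate"
        Port="1514"           # assumption: a UDP input listens on this port
    )
    stop
}
```

With UDP there is no long-lived connection that pins the sender to one node, but delivery is no longer guaranteed, so this is a trade-off rather than a drop-in fix.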

HAProxy won’t load balance UDP, it’s TCP only, so the syslog traffic would need to go directly to the Graylog processing nodes, not via HAProxy :cold_sweat:

There are other load-balancers than HAProxy. :wink:

rsyslog might also help here by re-establishing the TCP connection at a certain interval: https://www.rsyslog.com/doc/v8-stable/configuration/modules/omfwd.html#rebindinterval
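
A rough sketch of how that option could be added to the omfwd action from the original post. The value 10000 is purely illustrative, not a recommendation; per the linked documentation the interval is counted in messages, after which the connection is closed and a new one is opened, giving a load balancer the chance to pick another node.

```
ruleset(name="graylog") {
    action(type="omfwd"
        Protocol="tcp"
        Target="gek.bl.com"
        Template="IpTemplate"
        Port="1514"
        RebindInterval="10000"   # illustrative value: reconnect after this many messages
    )
    stop
}
```

This only helps if the target is actually balanced (for example the HAProxy frontend above); rebinding to a single fixed node just reconnects to the same node.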
