Graylog Cluster and Load Balancing

Hi everyone!

First of all, I would like to thank the whole Graylog team for creating and maintaining such a great product. Thanks also to everyone who takes the time to come to this forum and offer solutions; it’s an excellent source of information! I hope my questions will lead to answers that will in turn help other people as well!

I feel like I’m stuck on, or misunderstanding, an important part of load balancing, and I haven’t found a solution online. There may be some, but I haven’t come across “clear” answers to my questions. Or maybe I just don’t understand them!

I’ll start right away with my two questions:

  • How to make Graylog work with Apache2 as a load balancer
  • Should / Can I use a second load balancer for ingesting logs or should I use the same one?

  1. How to make Graylog work with Apache2 as a load balancer

We’ve been playing around with Graylog in the Minimum Setup for a few months and decided to move to the Bigger Production Setup.

So far, everything works great in that bigger setup (everything Debian):

  • A cluster of three Elasticsearch nodes
  • A cluster of three MongoDB & Graylog nodes

I can go to the web interface of any Graylog node and check that everything is up and running.

I then went on and set up a server with just Apache as a load balancer with mod_proxy_balancer.

Here is the Apache load balancer configuration for now:

<VirtualHost *:80>

        ServerName graylog.my.domain/

        Header add Set-Cookie "ROUTEID=.%{BALANCER_WORKER_ROUTE}e; path=/" env=BALANCER_ROUTE_CHANGED

        ProxyPass "/" "balancer://graylog-cluster/"
        ProxyPassReverse "/" "balancer://graylog-cluster/"

        <Proxy "balancer://graylog-cluster">
                Order deny,allow
                Allow from all
                BalancerMember "gray01.my.domain/api" route=node1
                BalancerMember "gray02.my.domain/api" route=node2
                BalancerMember "gray03.my.domain/api" route=node3
                ProxySet stickysession=ROUTEID
        </Proxy>

</VirtualHost>

It works fine! But from what I understand, load balancing is supposed to be transparent, yet if I go to http://graylog.my.domain, I can see which node I am redirected to in the search bar…

And then I saw this page about load balancing web interfaces: since I’m new to the load balancing world, I don’t really know what to do with that configuration file…
Should I run an Apache server on each of the 3 Graylog nodes? I tried that, as far as I understood this how-to; it works fine, but nothing changed.

  2. Should / Can I use a second load balancer for ingesting logs or should I use the same one?

We are using the following inputs:

  • Sidecars with Filebeat and Winlogbeat
  • Palo Alto
  • RAW UDP
  • Syslog

From what I read here, here and there:

  • If load balancing doesn’t seem to work that well with TCP, is it worth using one?
  • Is it worth working with two load balancers: one for the web interface and another for inputs?
  • Should I go back and learn more about load balancing (I’m doing it anyway)?

THANKS a LOT for reading this, I know it’s a long topic.
I’m a Sys Admin trainee and not a native English speaker, so I’m sorry if I’m saying anything stupid.

Have a great day! :wave:

Hello && Welcome

There are a few ways to configure load balancers. First, just knowing what a load balancer does can be half the battle.

A simple explanation of load balancers:

The purpose of the load balancer (in this case an HTTP load balancer) is to distribute all incoming requests across our backend web servers. The load balancer hides all our backend servers from the public, and from the outside it looks like a single server doing all the work.
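
To make the “distribute incoming requests” idea a bit more concrete, here is a minimal Python sketch of round-robin selection, which is one simple way a balancer might choose a backend. The class name is made up for illustration and the hostnames are borrowed from the config earlier in this thread; this is not how Apache implements it internally.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands each incoming request to the next backend in turn."""

    def __init__(self, backends):
        if not backends:
            raise ValueError("at least one backend is required")
        # cycle() repeats the backend list endlessly, in order.
        self._next_backend = cycle(list(backends))

    def pick(self):
        # The client only talks to the balancer; this choice stays internal.
        return next(self._next_backend)


balancer = RoundRobinBalancer(
    ["gray01.my.domain", "gray02.my.domain", "gray03.my.domain"]
)
picks = [balancer.pick() for _ in range(6)]
print(picks)
```

From the client’s side only the balancer’s address should be visible; which backend served a request is a detail the balancer keeps to itself.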

Some firewalls can do this also.
Example:
https://docs.fortinet.com/document/fortigate/6.0.0/Handbook/154107/basic-load-balancing-configuration-example

The link below is just added info.

It would be a bad idea to have Apache on the same nodes as Graylog/MongoDB. Personally, I think they should be on separate servers or another device.

As for seeing which node you are redirected to:

This might need an extra setting in the Apache config to hide the nodes it connects to, but I’m not 100% sure. I personally do not see mine. I’m not sure which search bar you are referring to. Could you explain, or maybe post a screenshot?

Hi there and thanks a lot for your answer!

Right, so that’s kind of what I thought: I don’t see the point of having an Apache server on top of Graylog on the same node.

Unfortunately I’m not sure that a screenshot would help to understand better…

Some architectural details (hope that helps):

Load Balancer:

  • 192.168.20.10

Graylog Nodes:

  • 192.168.20.11
  • 192.168.20.12
  • 192.168.20.13

What I meant:

When I go to 192.168.20.10, I am sent to one of the three nodes, which is exactly what I am looking for. However, from what I understand, I’m not supposed to be able to tell which node I’m on. Right now I can see which node I’m being redirected to in the browser’s search bar (the address bar), because the address changes to 192.168.20.12, or 192.168.20.11, and so on.
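
One possible cause of this symptom (an assumption, not a confirmed diagnosis for this setup) is a redirect whose Location header names a node directly and reaches the browser unrewritten. Apache’s ProxyPassReverse exists to rewrite such headers. The hypothetical Python helper below only shows how to tell whether a given Location value points away from the balancer; the function name is made up for this sketch.

```python
from urllib.parse import urlsplit


def exposes_backend(location_header, balancer_host):
    """Return True if a redirect target names a host other than the balancer."""
    host = urlsplit(location_header).hostname
    # Relative redirects such as "/search" carry no host, so the browser
    # stays on whatever address it already used (the balancer).
    return host is not None and host != balancer_host


leak = exposes_backend("http://192.168.20.12/", "192.168.20.10")
relative = exposes_backend("/search", "192.168.20.10")
same_host = exposes_backend("http://192.168.20.10/search", "192.168.20.10")
print(leak, relative, same_host)
```

If a check like this reports a leak for the headers the browser actually receives, the proxy is passing backend addresses through instead of rewriting them.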

Do you use Apache as a load balancer too? Perhaps I should switch to HAProxy or something else.
And what about inputs? I’m thinking about trying Apache Kafka on another machine.

Thanks again for your answer, really appreciate it.

So, it seems that I was doing things wrong because, as always, I was trying to skip steps by mixing tutorials from the interweb without really knowing what I was doing. Rushing headlong into an unfamiliar technology was not as efficient as I thought. Who could have known? :open_mouth:

Here is my new configuration, which works just fine:

<VirtualHost *:80>

        ServerName graylog.my.domain
        # Reverse proxy only; never act as a forward proxy
        ProxyRequests Off

        # Set the sticky-session cookie whenever the balancer picks a new route
        Header add Set-Cookie "ROUTEID=.%{BALANCER_WORKER_ROUTE}e; path=/" env=BALANCER_ROUTE_CHANGED

        # Password-protected balancer-manager status page
        <Location /balancer-manager>
                SetHandler balancer-manager
                AuthType Basic
                AuthName "Load_Balancer_Manager"
                AuthBasicProvider file
                AuthUserFile "/path/to/passwords/files"
                Require user secretuser
        </Location>

        # Do not forward the balancer-manager page to the Graylog nodes
        ProxyPass /balancer-manager !

        <Proxy balancer://graylog>
                BalancerMember "http://node01.my.domain:9000" route=node1
                BalancerMember "http://node02.my.domain:9000" route=node2
                BalancerMember "http://node03.my.domain:9000" route=node3
                ProxySet lbmethod=byrequests
                ProxySet stickysession=ROUTEID
        </Proxy>

        # Tell Graylog which public URL the web interface should use
        RequestHeader set X-Graylog-Server-URL "http://graylog.my.domain"
        ProxyPass / balancer://graylog/
        ProxyPassReverse / balancer://graylog/

</VirtualHost>

I configured and secured the balancer manager by running htpasswd -c /path/to/passwords/files secretuser.
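
For anyone wondering how lbmethod=byrequests and stickysession=ROUTEID work together in the configuration above, here is a rough Python model. It is a simplified sketch under stated assumptions: the class and function names are made up, plain rotation stands in for Apache’s request-counting method, and the real mod_proxy_balancer also handles weights and failed members.

```python
from itertools import cycle

# Route names and backend URLs taken from the BalancerMember lines.
ROUTES = {
    "node1": "http://node01.my.domain:9000",
    "node2": "http://node02.my.domain:9000",
    "node3": "http://node03.my.domain:9000",
}


def route_from_cookie(cookie_value):
    """Extract the route name from a ROUTEID cookie value such as '.node1'."""
    if not cookie_value or "." not in cookie_value:
        return None
    # The route is whatever follows the first dot.
    return cookie_value.split(".", 1)[1] or None


class StickyBalancer:
    def __init__(self, routes):
        self._routes = dict(routes)
        self._rotation = cycle(list(self._routes))

    def handle(self, routeid_cookie=None):
        """Return (backend_url, set_cookie_header_or_None) for one request."""
        route = route_from_cookie(routeid_cookie)
        if route in self._routes:
            # Known route: keep the client on the same node, no new cookie.
            return self._routes[route], None
        # No usable cookie: pick the next node and remember it in a cookie.
        route = next(self._rotation)
        return self._routes[route], f"ROUTEID=.{route}; path=/"


lb = StickyBalancer(ROUTES)
first_backend, first_cookie = lb.handle()
second_backend, _ = lb.handle()
repeat_backend, repeat_cookie = lb.handle(".node1")
print(first_backend, first_cookie, second_backend, repeat_backend, repeat_cookie)
```

In this model a client that sends back its ROUTEID cookie keeps landing on the same node, while clients without a cookie are spread across the nodes in turn.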

I’m going to go and dive more into load balancing inputs tomorrow.

Thanks again!
