Hi everyone!
First of all, I would like to thank the whole Graylog team for creating and maintaining such a great product. Thanks also to everyone who takes the time to come to this forum and propose solutions; it’s an excellent source of information! I hope that my questions will lead to answers that will in turn help other people as well!
I feel like I’m stuck on, or misunderstanding, an important part of load balancing, and I haven’t found a solution online. Maybe solutions exist, but I haven’t come across clear answers to my questions. Or maybe I just don’t understand them!
I’ll start right away with my two questions:
- How to make Graylog work with Apache2 as a load balancer
- Should / Can I use a second load balancer for ingesting logs, or should I use the same one?
- How to make Graylog work with Apache2 as a load balancer
We’ve been playing around with Graylog for a few months in the Minimum Setup and decided to move to the Bigger Production Setup.
So far, everything works great in that bigger setup (everything runs on Debian):
- A cluster of three Elasticsearch nodes
- A cluster of three MongoDB & Graylog nodes
I can go to the web interface of any Graylog node and check that everything is up and running.
I then set up a separate server running only Apache as a load balancer, using mod_proxy_balancer.
Here is the Apache load balancer configuration for now:
<VirtualHost *:80>
    ServerName graylog.my.domain
    # Sticky-session cookie (requires mod_headers)
    Header add Set-Cookie "ROUTEID=.%{BALANCER_WORKER_ROUTE}e; path=/" env=BALANCER_ROUTE_CHANGED
    ProxyPass "/" "balancer://graylog-cluster/"
    ProxyPassReverse "/" "balancer://graylog-cluster/"
    <Proxy "balancer://graylog-cluster">
        Order deny,allow
        Allow from all
        # BalancerMember URLs need a scheme (http:// assumed here)
        BalancerMember "http://gray01.my.domain/api" route=node1
        BalancerMember "http://gray02.my.domain/api" route=node2
        BalancerMember "http://gray03.my.domain/api" route=node3
        ProxySet stickysession=ROUTEID
    </Proxy>
</VirtualHost>
It works fine! But from what I understand, load balancing is supposed to be transparent, and yet when I go to http://graylog.my.domain, I can see in the browser’s address bar which node I am redirected to…
Then I saw this page about load balancing web interfaces. Since I’m new to the load balancing world, I do not really know what to do with that configuration file…
Should I run an Apache server on each of the three Graylog nodes? I tried that, following this how-to as best I understood it. It works fine, but nothing changed.
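For context, here is a minimal sketch of the kind of reverse-proxy configuration I understand that page to describe. It is written from memory, not copied from my servers: the hostname `graylog.example.org` and the backend address `127.0.0.1:9000` are placeholders, and I am assuming `mod_proxy`, `mod_proxy_http` and `mod_headers` are enabled.

```apache
# Sketch only: hostname and backend address are placeholders (assumptions).
<VirtualHost *:80>
    ServerName graylog.example.org
    ProxyRequests Off

    <Location />
        # Tells the Graylog web interface which public URL the browser should use.
        RequestHeader set X-Graylog-Server-URL "http://graylog.example.org/"
        # Forward everything to the local Graylog HTTP interface (assumed port 9000).
        ProxyPass http://127.0.0.1:9000/
        ProxyPassReverse http://127.0.0.1:9000/
    </Location>
</VirtualHost>
```

What I can’t tell is whether the `X-Graylog-Server-URL` header belongs on the load balancer, on a per-node Apache, or on both.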
- Should / Can I use a second load balancer for ingesting logs, or should I use the same one?
We are using the following inputs:
- Sidecars with Filebeat and Winlogbeat
- Palo Alto
- RAW UDP
- Syslog
From what I read here, here and there:
- If load balancing does not seem to work that well with TCP, is it worth using one?
- Is it worth working with two load balancers: one for the web interface and another for the inputs?
- Should I go back and learn more about load balancing first (I’m doing that anyway)?
Thanks a lot for reading this; I know it’s a long post.
I’m a sysadmin trainee and not a native English speaker, so I’m sorry if I’ve said anything stupid.
Have a great day!