Best hardware setup for a small/medium-sized company

Hey All.
I have a relatively straight-forward question:
I’m going to be setting up Graylog for one of our client’s locations, but I was wondering what you would recommend for my setup:

  1. It’s a small/medium-sized company with roughly 30 employees and 2 servers
  2. We’ll be monitoring their 2 servers, firewall, and router (to start with)
  3. The Graylog server will be set up in their server room.

Should we purchase a standalone server box and install Graylog on it, and then set up 2 more physical boxes for Elasticsearch? Or… should we purchase a VMware server and set up Graylog and Elasticsearch on one box, but as separate VMs?

Any direction would be greatly appreciated. BTW…thank you guys for all that you do here in the forum.

If possible, go with a physical server … that is my advice.


Hey Jan.
Ok great thanks.

So would you say in most situations, a dedicated Graylog server and dedicated Elasticsearch servers are the way to go?

thanks again for your reply.

3 dedicated servers to monitor 3 devices seems like overkill. We have one VM for the entire Graylog setup, including Elasticsearch: 2 vCPUs, 4 GB RAM, 160 GB of disk space. We are currently monitoring 8 servers and 1 firewall, with a typical workday generating about 350 MB/day of data.
We will possibly migrate to a physical server next year; it is not the best idea to keep server logs in the same place as the servers themselves.

Interesting. Thanks “Karlis”, that’s very helpful. So what will you do? Move the physical server offsite?

Actually, I think a better question would be: do you keep your current Graylog VMs offsite? If so, how do you get the logs into Graylog? Are you using NGINX?

Thanks again.

No, the Graylog VM sits on an onsite virtual host. That’s why we are planning to move it to a physical server, and it will remain onsite.
No, we are not using Nginx.
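
On the question of getting logs into Graylog: a common approach, regardless of where the server sits, is plain syslog forwarding from each source host to a Graylog syslog input. A minimal rsyslog sketch (the hostname and port are placeholders; 5140 is just a commonly used unprivileged port for a Graylog syslog input, not a fixed default):

```
# /etc/rsyslog.d/60-graylog.conf
# Forward all facilities/severities to Graylog.
# @@ = TCP, a single @ would be UDP.
# graylog.example.local:5140 is a placeholder -- point it at the
# address and port of the syslog input you created in Graylog.
*.* @@graylog.example.local:5140
```

After dropping the file in place, restart rsyslog and the messages should appear on the matching Graylog input.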

Some of the questions you should ask (if you haven’t already) are:

  • what growth do you see in the near/distant future?
  • what’s your availability requirement?
  • what length of retention do you want/need?
  • what resources are available to you?
  • what’s your comfort level troubleshooting with documentation and a forum versus being able to pick up the phone?
  • do you need to audit log access, etc.?

Based on your use case, a single server running both Graylog and Elasticsearch will probably suffice, but there are challenges: both Graylog and Elasticsearch rely on Java, and configuration can take some tweaking to ensure there are no issues. Also, a single firewall can generate a large amount of logs. If you have next-gen features enabled, want/need to log NAT, or have your ACLs set to log, the volume of traffic can easily grow quicker than you may realize, and then your indices turn over faster, your retention drops… etc. Also, if you do have a single server, consider SSDs; SATA is fine, but NVMe or enterprise-class drives will help. A single server will also run up against the challenges surrounding ingesting, processing, storing, and retrieving messages, so ensure you have some kind of RAID setup.

As far as CPUs are concerned, 2 may suffice, but I don’t recommend less than 4 and prefer 8. Graylog has a configuration option for allocating CPUs to the various stages of the logging process. Being able to dedicate a CPU (or more) to these will help ensure the messages get processed quickly and will also compensate for certain configuration/parsing mistakes. For your setup, I would say 6 CPUs if you are putting everything on a single server, or 4 for Graylog and 4 for Elasticsearch if you split the servers.
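
The configuration option referred to here is the set of buffer processor counts in Graylog’s `server.conf`. A rough sketch of how a 6-CPU single-server box might be carved up between the input, processing, and output stages (the values are illustrative, not a recommendation; tune them to your own load):

```
# /etc/graylog/server/server.conf -- processor allocation sketch.
# These three settings control how many threads Graylog dedicates to
# each stage of the pipeline; together they should leave headroom for
# Elasticsearch and the OS on a shared box.
inputbuffer_processors = 1
processbuffer_processors = 3
outputbuffer_processors = 2
```

Heavy extractor/pipeline-rule use tends to push load onto the process buffer, which is why it usually gets the largest share.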

Elasticsearch likes memory, so go for as much as you can, but start with at least 8 GB. I prefer 16 and recommend 32. For your “small” setup, I would say 16 if you are putting everything on a single server, and 4 for Graylog and 16 for Elasticsearch if you split them.
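
Note that Elasticsearch only uses part of that RAM as JVM heap; the rest goes to the OS filesystem cache, which Elasticsearch also benefits from. The usual guidance is to pin the heap to about half the node’s RAM. A sketch for the 16 GB split suggested above (values are illustrative):

```
# Elasticsearch jvm.options -- fixed heap, min == max so the JVM
# never resizes it. 8g assumes a 16 GB node dedicated to Elasticsearch;
# scale to roughly half of whatever RAM the node actually has.
-Xms8g
-Xmx8g
```

Leaving min and max equal avoids heap-resize pauses, and keeping the heap under ~32 GB preserves compressed object pointers.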

If you want any kind of HA or resiliency, now or down the road, I would caution you that Graylog runs on MongoDB under the hood, and there are requirements for distributing MongoDB across multiple servers. Best to read up on that. Same for Elasticsearch. There are steps you can take now to allow for a smoother transition from a single node to multi-node, but that’s probably beyond the scope of this.

Without knowing much about your environment/resources/requirements, I would suggest a single server: 6 CPUs / 16 GB RAM / 500 GB usable SSD storage.
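
A quick back-of-the-envelope check on storage: the 350 MB/day figure comes from Karlis’s post above, and the overhead factor here is an assumption covering Elasticsearch index overhead plus headroom, not a measured number:

```python
def required_storage_gb(mb_per_day: float, retention_days: int, overhead: float = 1.3) -> float:
    """Rough disk estimate: raw daily ingest times retention, padded by
    an assumed factor for index overhead and free-space headroom."""
    return mb_per_day * retention_days * overhead / 1024

# ~350 MB/day kept for two weeks barely dents a 500 GB disk...
print(round(required_storage_gb(350, 14), 1))        # a handful of GB
# ...but a chatty firewall at 5 GB/day kept for 90 days is another story.
print(round(required_storage_gb(5 * 1024, 90), 1))   # hundreds of GB
```

This is why the earlier point about firewall log volume matters: retention requirements, not CPU, are usually what outgrows a small box first.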

I prefer virtual servers, but I can’t recommend a virtual solution if it means bringing in a virtualization platform just for this.

Ok great. Thanks for the update, Karlis. That’s very helpful.

First, thank you for the information. That’s extremely helpful. We already have a virtual Dell PowerEdge server that we’re going to use.

To answer the questions:
- Our growth in this area will be relatively slow, as we’re still in the second half of our testing phase, but eventually we would like to set up a more “enterprise”-ready solution. I don’t see that happening for at least another year.
- Availability needs to be decent. I would like to be up at 99.99% or better (if possible). We will be using RAID, Veeam backups, and battery backups.
- For retention, I would like to hold on to the logs for about 2 weeks to start with.
- We have a Dell PowerEdge EMS VM server with an 8-core Xeon processor, 16 GB of RAM, and two 2 TB 7.2K RPM drives in RAID 1.
- I am comfortable troubleshooting with docs and the forum. I have a pretty solid Linux background, so most things I am capable of dealing with.

- I will need access to audit logs eventually.

Thanks again for the insight.

Just for contrast: we run the Graylog stack on an off-the-shelf PC that was just lying around, with an 8-core AMD FX CPU, 16 GB of RAM, and 2 × 1 TB WD RE disks in RAID 1 on BTRFS with LZO compression. Besides Graylog it hosts a TIG stack for metrics, a MySQL database, and a small VirtualBox VM.
With around 160 users we collect an average of 1.5 million messages per day from our local and remote UTMs/routers, DC controllers, and various other devices. It’s kinda slow to use interactively, but as a headless server it works fine.
The I/O is the main bottleneck here. The CPU runs at 40% (though I had to limit Graylog to 3 cores via Docker, since as of 3.1 it tends to hammer the CPU). BTRFS compression does the job at around a 70% ratio for the Elasticsearch indices. As for retention, we keep data for two years, as required by government regulations here. We rotate indices daily, close them after 60 days, and eventually delete them manually via a bash script calling the Graylog REST API. It all works well.
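
The selection logic behind that kind of cleanup script can be sketched in a few lines. This assumes Graylog’s default `graylog_<N>` index naming with daily rotation, so “oldest” simply means “lowest number”; the REST endpoint mentioned in the comment is from memory, so check it against your own Graylog version’s API browser before relying on it:

```python
import re

def indices_to_delete(index_names, keep):
    """Given index names in Graylog's default 'graylog_<N>' scheme
    (one index per day under daily rotation), return the names that
    fall outside the newest `keep` indices, oldest first.
    Non-matching names (e.g. restored archives) are left alone."""
    numbered = sorted(
        (int(m.group(1)), name)
        for name in index_names
        if (m := re.fullmatch(r"graylog_(\d+)", name))
    )
    if len(numbered) <= keep:
        return []
    return [name for _, name in numbered[:-keep]]

# Each doomed index would then be removed through the REST API, e.g.
#   requests.delete(f"{base_url}/system/indexer/indices/{name}", auth=...)
# (endpoint path from memory -- verify in your version's API browser).
```

With daily rotation and a two-year mandate, `keep=730` reproduces the retention policy described above.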

@maniel That’s more than I would feel comfortable running on my Graylog server, but to each his own and it’s obviously working, so kudos on the cost savings.

@Justin19 Graylog’s enterprise features are included (without support) for ingest volumes below 5 GB/day, so if you are below that threshold, feel free to reach out to Graylog via the website and request a free enterprise license. You will then have audit, archive, and reporting features.
