Some of the questions you should ask (if you haven’t already) are:
- what growth do you see in the near/distant future?
- what’s your availability requirement?
- what length of retention do you want/need?
- what resources are available to you?
- what’s your comfort level troubleshooting with documentation and a forum versus being able to pick up the phone?
- do you need to audit log access, etc.?
Based on your use case, a single server running both Graylog and Elasticsearch will probably suffice, but there are challenges. Both Graylog and Elasticsearch run on Java, and the configuration can take some tweaking before everything runs without issues. Also, a single firewall can generate a large amount of logs. If you have next-gen features enabled, want or need to log NAT, or have your ACLs set to log, the volume can easily grow quicker than you realize. Then your indices turn over faster, your retention drops, and so on.
If you do go with a single server, use SSDs. SATA is fine, but NVMe or enterprise-class drives will help. A single server also has to ingest, process, store, and retrieve messages all on the same hardware, so make sure you have some kind of RAID setup.
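To make the link between log volume and retention concrete, here is a rough back-of-the-envelope sketch. The function name and the default values (average message size, index overhead, disk headroom) are assumptions for illustration only, not Graylog or Elasticsearch figures; measure your own traffic before relying on any of them.

```python
def retention_days(usable_storage_gb, messages_per_second,
                   avg_message_bytes=500,   # assumed average size of one log message
                   index_overhead=1.3,      # assumed growth once a message is indexed
                   headroom=0.8):           # assumed share of the disk you let indices fill
    """Estimate how many days of logs fit on disk before indices must rotate out."""
    seconds_per_day = 86_400
    # Bytes written to disk per day at the given message rate.
    bytes_per_day = (messages_per_second * seconds_per_day
                     * avg_message_bytes * index_overhead)
    # Bytes of disk actually available to indices (decimal GB).
    usable_bytes = usable_storage_gb * 1e9 * headroom
    return usable_bytes / bytes_per_day


if __name__ == "__main__":
    # Compare a quiet firewall with one that logs NAT and ACL hits.
    for rate in (100, 1000):
        print(f"{rate} msg/s -> about {retention_days(500, rate):.0f} days")
```

With these assumed numbers, 500 GB at 100 messages per second works out to roughly 71 days, and ten times the message rate cuts that to about 7 days.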
As far as CPUs are concerned, 2 may suffice, but I don’t recommend fewer than 4 and prefer 8. Graylog has configuration options for allocating processors to the various stages of the logging pipeline (the inputbuffer_processors, processbuffer_processors, and outputbuffer_processors settings in server.conf). Being able to dedicate a CPU (or more) to each of these helps ensure messages get processed quickly and also compensates for certain configuration or parsing mistakes. For your setup, I would say 6 CPUs if you are putting everything on a single server, or 4 for Graylog and 4 for Elasticsearch if you split the servers.
Elasticsearch likes memory, so go for as much as you can, but start with at least 8 GB. I prefer 16 and recommend 32. For your “small” setup, I would say 16 GB if you are putting everything on a single server, or 4 GB for Graylog and 16 GB for Elasticsearch if you split them.
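As a rough illustration of how that RAM might be divided: Elasticsearch's own guidance is to give its JVM heap no more than about half of the memory available to it and to stay below roughly 32 GB, leaving the rest to the operating system's file cache. The sketch below applies that rule; the function name, the 2 GB Graylog heap, and the 31 GB cap are assumptions chosen for the example, not settings taken from either product.

```python
def suggest_es_heap_gb(total_ram_gb, graylog_heap_gb=0.0, heap_cap_gb=31.0):
    """Suggest an Elasticsearch heap size in GB for a server.

    total_ram_gb    -- RAM installed in the server
    graylog_heap_gb -- heap reserved for Graylog on the same box (0 if split)
    heap_cap_gb     -- assumed upper limit, kept just under 32 GB
    """
    available = total_ram_gb - graylog_heap_gb
    if available <= 0:
        raise ValueError("no memory left for Elasticsearch")
    # Half for the heap, the other half stays free for the OS file cache.
    return min(available / 2, heap_cap_gb)


if __name__ == "__main__":
    # Single 16 GB server shared with Graylog (assumed 2 GB Graylog heap).
    print("combined:", suggest_es_heap_gb(16, graylog_heap_gb=2), "GB")
    # Dedicated 16 GB Elasticsearch server.
    print("dedicated:", suggest_es_heap_gb(16), "GB")
```

Whatever value you settle on, set the minimum and maximum heap to the same size in Elasticsearch's JVM options.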
If you want any kind of HA or resiliency, now or down the road, I would caution you that Graylog keeps its configuration in MongoDB, and there are requirements for running MongoDB across multiple servers. It is best to read up on that. The same goes for Elasticsearch. There are steps you can take now to allow for a smoother transition from a single node to multiple nodes, but that’s probably beyond the scope of this reply.
Without knowing much about your environment, resources, or requirements, I would suggest a single server with 6 CPUs, 16 GB RAM, and 500 GB of usable SSD storage.
I prefer virtual servers, but I can’t recommend going virtual if it means bringing in a virtualization platform you don’t already have.