Graylog Architecture and Setup

I have set up a Graylog cluster (3 hosts) in Docker and it's running perfectly fine:

  • 32 GB RAM per host
  • 4 cores per host
  • 1.5 TB disk per host

I am expecting more than 100 applications to send logs to the Graylog cluster in the future, so I have a central Graylog for logging. My questions are:

  • I am expecting 120 GB of logs per day in the future; is the current architecture fine for that?

  • Retention time will be 30 days, so 120 GB × 30 = 3600 GB of logs. Are 3 servers enough to handle that much load (after adding additional disk space)?

    Graylog heap size

    GRAYLOG_SERVER_1_GL_HEAP="-Xms2g -Xmx4g"
    GRAYLOG_SERVER_2_GL_HEAP="-Xms2g -Xmx4g"
    GRAYLOG_SERVER_3_GL_HEAP="-Xms2g -Xmx4g"

    Elasticsearch heap size

    GRAYLOG_SERVER_1_ES_HEAP="16g"
    GRAYLOG_SERVER_2_ES_HEAP="16g"
    GRAYLOG_SERVER_3_ES_HEAP="16g"
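A quick back-of-the-envelope check of the storage math, assuming one Elasticsearch replica per shard and roughly 10% index overhead (both figures are assumptions, not from this thread; tune them to your setup):

```python
# Rough Elasticsearch storage estimate for the scenario above.
# Assumptions (not from the thread): 1 replica copy, ~10% index overhead.
daily_gb = 120          # expected ingest per day
retention_days = 30
replicas = 1            # each replica copy doubles the primary data
overhead = 0.10         # index structures, translog, etc.

raw_gb = daily_gb * retention_days                  # primaries only
total_gb = raw_gb * (1 + replicas) * (1 + overhead) # primaries + replicas + overhead
per_node_gb = total_gb / 3                          # spread evenly over 3 hosts

print(raw_gb, round(total_gb), round(per_node_gb))
```

Under these assumptions the 3600 GB of raw logs becomes roughly 7.9 TB on disk, or about 2.6 TB per node, so with 1.5 TB per host today the "after adding additional disk space" caveat is the important part.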

I've run 10 TB worth of Elasticsearch indices in retention on a single server plus Graylog, with slightly higher specs than listed. I wouldn't recommend doing that, but it's possible.
You should be fine with the specs listed.

How many logs are you getting per day in your single-Graylog setup, and what is the retention time in your Graylog?

It was a test environment, but around 300 million events / 250 GB a day, if I remember correctly. It was retaining for 30 days.

Another question related to Architecture.

All applications run in Docker containers and send logs via Logstash over GELF UDP. I want to make Graylog available 24x7, so my questions are:

  • What if the network breaks between Logstash and the load balancer? Can we set up some buffer system in Logstash to save logs until the network is available again?

  • The current volume is 40 million log entries per day, so what if the network is not available for 6 hours? Where can I store those 10 million logs?

  • Can we use Kafka? But then the same question remains for a network outage between Logstash and Kafka.

Your question is more Logstash-specific …

… but Logstash has no buffer: if Logstash can't push messages somewhere, those messages are lost. You can place buffers like Kafka or AMQP in the network, but if those are in turn not reachable from Logstash, messages are again lost.
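As an illustration of the Kafka buffering mentioned above, a Logstash pipeline could publish to Kafka instead of sending GELF UDP straight at the load balancer, with Graylog then consuming the topic via its Kafka input. A sketch only; the broker address and topic name are made up:

    # Hypothetical Logstash output: buffer messages in Kafka instead of
    # sending GELF UDP directly to the Graylog load balancer.
    output {
      kafka {
        bootstrap_servers => "kafka1:9092"   # assumed broker address
        topic_id          => "app-logs"      # assumed topic name
      }
    }

Kafka persists messages on disk for a configurable retention period, so a multi-hour Graylog outage does not lose data that already reached the brokers; but as noted above, messages that cannot reach Kafka in the first place can still be lost.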

Theoretically, you could direct the Logstash logs to the local syslog daemon (rsyslog) and use it as a relay point. Its disk-assisted action queue can do what you want:

$ActionQueueFileName fwdRule1 # unique name prefix for spool files
$ActionQueueMaxDiskSpace 1g   # 1gb space limit (use as much as possible)
$ActionQueueSaveOnShutdown on # save messages to disk on shutdown
$ActionQueueType LinkedList   # run asynchronously
$ActionResumeRetryCount -1    # infinite retries if host is down
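For what it's worth, the same queue settings in rsyslog's newer RainerScript syntax would look roughly like this (the target host is a placeholder):

    # Disk-assisted forwarding queue, RainerScript style (sketch)
    action(type="omfwd" target="graylog.example.org" port="514" protocol="tcp"
           queue.type="LinkedList"          # run asynchronously
           queue.filename="fwdRule1"        # enables disk spooling
           queue.maxDiskSpace="1g"          # spool size limit
           queue.saveOnShutdown="on"        # persist queue on shutdown
           action.resumeRetryCount="-1")    # retry forever if host is down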

But I'm not sure whether syslog can actually receive GELF messages in practice.

This topic was automatically closed 14 days after the last reply. New replies are no longer allowed.