How many nodes are needed for >5TB of logs per day?

Haha, no, I’m not Indonesian - I lived in Jakarta for 8 years though, and got reasonably fluent in Bahasa :slight_smile: Saya lupa banyak, tapi masih bisa bicara sedikit… (I’ve forgotten a lot, but I can still speak a little…)

We’re not using all the features just yet, but I think we’re getting there: many pipelines, many streams, and some alerts - and a lot of developers building new tools on top of the Graylog API, so that’s good :smiley:

And yeah, paid log management is a killer at 5TB/day - I thought we were high volume with our 100-150GB per day, but damn… 5TB :smiley: (To give you an idea: to handle our 100-150GB per day and keep the data “live” for 3+ months, we run 3 Graylog nodes with 24 cores each for the processing, and a total of 19 Elasticsearch data nodes with 6TB of storage each - just, you know, so you know what you’re in for at some point :D)
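
If you want to do that back-of-the-envelope math yourself, here’s a rough sketch in Python - the overhead and headroom factors are my assumptions, not measurements, so tune them to your own data:

```python
# Rough Elasticsearch storage sizing. The overhead and headroom factors
# below are assumptions, not measurements - adjust for your own cluster.
daily_gb = 150          # raw log volume per day (GB)
retention_days = 90     # keep data "live" for ~3 months
replicas = 1            # one replica copy of every shard
index_overhead = 1.2    # assume ~20% extra for index structures
disk_headroom = 0.75    # don't fill disks past ~75% (ES disk watermarks)
node_capacity_tb = 6.0  # storage per data node

needed_tb = daily_gb * retention_days * (1 + replicas) * index_overhead / 1000
nodes = needed_tb / (node_capacity_tb * disk_headroom)
print(f"~{needed_tb:.1f}TB on disk, ~{nodes:.1f} data nodes minimum")
# -> ~32.4TB on disk, ~7.2 data nodes; the rest of our 19 is indexing/query headroom
```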


Yes, I have 22 streams in the Streams menu.
Each stream is paired with a user account.
I have set retention to 7 days :slight_smile:
4 nodes seem to be able to handle 1.5TB right now…
No errors in Graylog, but Elasticsearch is slightly busier than before.

I am so happy I can talk and exchange ideas with you and @Totally_Not_A_Robot.
Heel erg bedankt voor je hulp (thank you very much for your help) :slight_smile:


Noted, friend.
Time to add a new ES node :smiley:
I think I need to work with my developers on standardizing our logging.

But could you tell me: what were your (@benvanstaveren, @Totally_Not_A_Robot) reasons for choosing Graylog + Elasticsearch rather than Kibana/Logstash + Elasticsearch?


Okay, sit down and grab a kopi hitam (black coffee)… this may be a long story :wink:

So at first we had a standard ELK stack (Elasticsearch for storage, Logstash for ingest, Kibana to look at the pretty data), and this worked out alright. Then we ran into Logstash’s limitations: we couldn’t do things like pull in external data or build proper pipelines - well, we could, but it was a huge mess of if-then-else in the configuration.

So we replaced Logstash with a home-brewed thing that acted a lot like Graylog does (pipelines, streams, etc.), and that worked fine too. But then we hit the point where Kibana’s all-or-nothing access became a problem: we had passwords on it via basic authentication, but there were no roles and no limits on who could do what. You can get that with X-Pack, for a stupid amount of money, which we felt wasn’t worth it. There are open-source authentication packs for Elasticsearch (SearchGuard, for example), but those didn’t quite do the trick either, because Kibana was just too “generic” for what we wanted.

I had looked at Graylog at some point in the past, and figured, you know, I’d take another look and see what it’s like these days (version 2.4). I liked what I saw, so we went ahead and converted the entire setup in something like 2 days.

Anyway, the actual bullet-point reasons are as follows:

  • Graylog has built-in LDAP integration; since we use LDAP extensively, that was nice.
  • Ease of processing - there are streams, there are pipelines, and there are rules in a pipeline. Very straightforward, very easy to use once you’ve wrapped your head around how it works (there’s a small example after this list).
  • Ease of deployment: it’s hilariously easy to add another Graylog node if you need more processing power. And with the new sidecar implementation (and the old one even), managing our log shippers has turned from a dirty dirty Ansible playbook into a case of just clicking a few things.
  • Graylog has roles, which means now our developers only get to see the things they need to see, no more, no less. This solves issues where enterprising individuals have taken it upon themselves to send commands to the ES cluster that required us to restart the whole thing (trust me when I say rebooting a 25 node cluster from scratch is a whole-day operation).
  • Community! Graylog’s forums and the people on them seem to be much better at actually solving problems. On the Elastic discussion forums, for example, you can ask a question and never get an answer, or people tell you you’re doing it wrong but won’t tell you more than that.
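
To make the pipeline point concrete, here’s a minimal sketch of creating a rule through the REST API - the endpoint, the token login, and the message fields are from memory, so double-check them against the API browser on your version:

```python
import requests

API = "http://graylog.example.com:9000/api"  # hypothetical host
AUTH = ("YOUR_ACCESS_TOKEN", "token")        # Graylog access tokens log in with the literal password "token"

# A simple pipeline rule: flag nginx messages at syslog level "error" or worse.
RULE_SOURCE = '''
rule "flag nginx errors"
when
  has_field("source") &&
  to_string($message.source) == "nginx" &&
  to_long($message.level) <= 3
then
  set_field("alert", true);
end
'''

resp = requests.post(
    f"{API}/system/pipelines/rule",
    json={"title": "flag nginx errors", "description": "demo rule", "source": RULE_SOURCE},
    auth=AUTH,
    headers={"X-Requested-By": "demo-script"},  # Graylog rejects POSTs without this header
)
resp.raise_for_status()
print("created rule", resp.json()["id"])
```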

I could go on, but in general Graylog has made my life as a lazy devops guy easier, and with the API, our devs can now do things that they couldn’t before (because nobody gets full access to the ES cluster), so they’re quite happy with it.
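
For example, a quick search from a script looks something like this (again just a sketch - the field names in the query are made up, and the endpoint may differ between versions):

```python
import requests

API = "http://graylog.example.com:9000/api"  # hypothetical host
AUTH = ("YOUR_ACCESS_TOKEN", "token")

# Pull the last 5 minutes of (hypothetical) nginx 5xx responses.
resp = requests.get(
    f"{API}/search/universal/relative",
    params={"query": "source:nginx AND http_response_code:>=500",
            "range": 300, "limit": 50},
    auth=AUTH,
)
resp.raise_for_status()
for hit in resp.json()["messages"]:
    print(hit["message"].get("timestamp"), hit["message"].get("message"))
```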


My story is quite comparable, so I’ll keep it short :slight_smile:

  • Security. Graylog has RBAC built in as well as TLS/SSL.
  • Money. ELK can get real expensive, real fast, just like Splunk. We can run Graylog for free in most cases, and we added TLS onto Elasticsearch for free with SearchGuard.
  • Ease of use. Sure, Splunk and Kibana offer much nicer visualizations, but for our team that’s not a priority. We need a few dashboards, but mostly quick querying for troubleshooting.

I really should talk my managers into paying for at least a support license. I honestly believe the Graylog team deserves to get some money from us, considering the value we get from their product!


Hi!
I’m running 3 Graylog nodes and 8 ES nodes. All servers are HP ProLiant Gen6-Gen8 with 128GB RAM.
SAN storage on all 8 ES nodes, 48TB of disk, which stores our logs for 30 days.
EPS averages 25k with peaks up to 40k, and data per day is around 1TB.
We’ve been running this setup for 5 years and it’s still rock solid.
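
If you want to sanity-check those numbers, the quick math looks like this (averages only - our event mix varies a lot):

```python
# Back-of-the-envelope: what do 25k EPS and ~1TB/day imply per event?
eps = 25_000                             # average events per second
events_per_day = eps * 86_400            # -> 2,160,000,000 events/day
bytes_per_event = 1e12 / events_per_day  # ~1TB raw per day
print(f"{events_per_day:,} events/day, ~{bytes_per_event:.0f} bytes/event on average")
# -> 2,160,000,000 events/day, ~463 bytes/event on average
```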

// Anders


Grab a kopi hitam :star_struck:
Seems you remember some of the memorable moments, haha.

I get your points - we are in the same situation with the user accounts problem. Graylog is the only one that comes close to the paid third-party apps there.
Nice explanation, friend.
Thank you very much for sharing your knowledge. I really like this community, where we can exchange ideas :slight_smile:

Special thanks to the Graylog team :wink:

I agree with you too… in some cases I need to implement TLS/SSL certificates for security purposes as part of migrating to Google Cloud Platform.
I also just learned about SearchGuard and Splunk :smile: Thank you for your sharing and your time, my friend.


I think we are on the same path… but saving 5TB of logs for 30 days is meh :smile:
Right now my Graylog has 4 nodes (16 cores, 32GB memory each) and 3 ES nodes (8 cores, 32GB memory each, 4TB SSD), saving logs for 1 week.

