Optimising for 5TB of Data


We’ve been using Graylog for a while for various things and have found it to be a great product. However, we’re now trying to build a data archive for one of our systems. It will keep 14 days of data (read off Kafka topics), about 5TB in total (1,100 million messages), and I feel a bit out of my depth with sizing and optimising my indices.

What I currently have (due to budget more than anything) is a single machine with a quad-core i5, 14GB RAM and 3 SSDs in RAID0 for a total of about 600GB of storage. I’ve given Elasticsearch a 6GB heap and Graylog 2GB, and everything is working nicely at about 1,000 messages a second (but without much retention, obviously).

I’ve got 3 index sets and have configured each index to roll over at 1GB (which works out at about 100,000 messages) and to be deleted once I reach 100 indices (per set). Right now I’ve got 1 shard per index and no replicas.

I guess my first question is: if I just throw more SSDs into this machine (and increase the number of old indices I keep), does it have any hope of coping with 14 days of data?
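For a rough sense of scale, the figures above can be put into a quick back-of-envelope calculation. This is only a sketch: it assumes decimal units (1TB = 1000GB), that the 5TB figure is on-disk index size, and the function name is made up for illustration.

```python
# Back-of-envelope retention check using the figures from the post.
# Assumption: decimal units (1 TB = 1000 GB) and 5 TB is on-disk index size.
TOTAL_GB = 5 * 1000
TOTAL_MESSAGES = 1_100_000_000
RETENTION_DAYS = 14
INDEX_SIZE_GB = 1
INDEX_SETS = 3
MAX_INDICES_PER_SET = 100


def retention_summary():
    """Return the daily volume, message rate and index counts implied by the post."""
    seconds = RETENTION_DAYS * 24 * 60 * 60
    return {
        "gb_per_day": TOTAL_GB / RETENTION_DAYS,
        "messages_per_second": TOTAL_MESSAGES / seconds,
        "indices_needed": TOTAL_GB // INDEX_SIZE_GB,
        "current_cap_gb": INDEX_SETS * MAX_INDICES_PER_SET * INDEX_SIZE_GB,
    }


if __name__ == "__main__":
    for name, value in retention_summary().items():
        print(f"{name}: {value:,.0f}")
```

With these inputs the target works out at roughly 357GB per day and about 5,000 one-gigabyte indices, against a current retention cap of 300GB (3 sets x 100 indices x 1GB).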

My second question is whether I’m even close to the right parameters with 1 shard per index and 1GB index size?
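One way to sanity-check the 1GB index size is to count how many shards 5TB would produce and compare that with a heap-based shard budget. The sketch below assumes a rule of thumb of roughly 20 shards per GB of Elasticsearch heap, which has appeared in Elastic's shard-sizing guidance. Treat that number as an assumption rather than a hard limit, and the function names as illustrative only.

```python
import math

# Assumption: ~20 shards per GB of heap is a guideline, not a hard limit.
SHARDS_PER_GB_HEAP = 20


def shard_count(total_gb, index_size_gb, shards_per_index=1):
    """Number of shards needed to hold total_gb at a given index rollover size."""
    return math.ceil(total_gb / index_size_gb) * shards_per_index


def shard_budget(heap_gb, shards_per_gb_heap=SHARDS_PER_GB_HEAP):
    """Rough upper bound on shards for a single node with heap_gb of heap."""
    return heap_gb * shards_per_gb_heap


if __name__ == "__main__":
    budget = shard_budget(heap_gb=6)
    for size_gb in (1, 10, 25, 50):
        shards = shard_count(total_gb=5000, index_size_gb=size_gb)
        verdict = "within" if shards <= budget else "over"
        print(f"{size_gb:>3} GB indices -> {shards:>5} shards ({verdict} budget of {budget})")
```

Under that assumption, a 6GB heap gives a budget of about 120 shards. Rolling over at 1GB would mean around 5,000 shards for 5TB, whereas rolling over at around 50GB would bring it down to about 100.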

I’ve done a lot of reading on Elasticsearch and Graylog but unfortunately haven’t found any real answers, other than concluding that I might be a bit optimistic about this single machine coping with the amount of data I want to keep.

Oh, and I know I’m taking a massive risk running a single machine with RAID0, but for the moment I have a very limited budget.

Any guidance (or telling me I’m deluded) is much appreciated.

First of all, I wouldn’t try scaling vertically (i.e. building a single machine that hosts everything) but horizontally (i.e. building multiple machines that form a cluster), for both performance and data integrity.

In the current setup, if you lose a single SSD (RAID0 has no redundancy) or the whole machine, you will lose all of your data.

I’d also recommend reading the following blog article at the Elastic blog:

Thanks for the reply. I understand and am well aware of the risk I’m running. The moment I have some budget I’ll horizontally scale the cluster.

I’m just trying to get a feel for the amount of data I’m likely to be able to store on this machine before I start having issues.
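As a rough way to estimate that, disk capacity can be divided by the daily volume implied by the target (5TB over 14 days). This is a minimal sketch: the 85% usable fraction is an assumed allowance for headroom, not a measured value, and it assumes no replicas and the same data rate as the 5TB estimate.

```python
# Daily volume implied by the original target: 5 TB (taken as 5000 GB) over 14 days.
GB_PER_DAY = 5000 / 14


def days_on_disk(disk_gb, gb_per_day=GB_PER_DAY, usable_fraction=0.85):
    """Days of retention that fit on disk_gb, leaving some headroom free."""
    return disk_gb * usable_fraction / gb_per_day


def disk_for_retention(days, gb_per_day=GB_PER_DAY, usable_fraction=0.85):
    """Raw disk (GB) needed to hold `days` of data with the same headroom."""
    return days * gb_per_day / usable_fraction


if __name__ == "__main__":
    print(f"600 GB holds about {days_on_disk(600):.1f} days")
    print(f"14 days needs about {disk_for_retention(14):,.0f} GB")
```

On those assumptions the existing 600GB would hold a little under a day and a half of data, and 14 days would need somewhere around 5.9TB of raw disk.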

That article is very helpful, by the way, especially the tips section.

