Basically what @gsmith said. To me, this reads like you were surprised by the volume of logs that CloudTrail generated and hadn't prepared for it. So let's walk through your questions.
It's not clear what your problems were, aside from storage being consumed quickly. The title of your post is "Service behavior," so I'm assuming you saw Graylog/Elasticsearch dying or not processing messages because your disk filled up. Let me know if that's not an accurate read. Assuming it is, then yes: Graylog and Elasticsearch both depend on having large amounts of disk available to them. Without a place to store messages, it's expected that Graylog won't be able to process them.
That's honestly something you're going to have to figure out for yourself. I'm assuming that when you deployed Graylog, you didn't have a solid understanding of how much data you'd be ingesting, and that's totally fine. I'm not sure if you got to this part of the doc that @gsmith linked, but it has this calculation for getting a close estimate:
A simple rule of thumb for planning storage is to take your average daily ingestion rate, multiply it by the number of days you need to retain the data online, and then multiply that number by 1.3 to account for metadata overhead. (GB/day x Ret. Days x 1.3 = storage req.).
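As a quick worked example of that rule of thumb, here's a minimal Python sketch. The function name and the sample numbers (10 GB/day, 30 days) are made up for illustration; only the 1.3 multiplier comes from the doc.

```python
def estimate_storage_gb(daily_ingest_gb: float,
                        retention_days: int,
                        overhead: float = 1.3) -> float:
    """Rule of thumb: GB/day x retention days x 1.3 (metadata overhead)."""
    if daily_ingest_gb < 0 or retention_days < 0:
        raise ValueError("ingest rate and retention must be non-negative")
    return daily_ingest_gb * retention_days * overhead


if __name__ == "__main__":
    # Hypothetical inputs: 10 GB/day kept online for 30 days.
    needed = estimate_storage_gb(10, 30)
    print(f"Estimated storage requirement: {needed:.0f} GB")
```

Swap in your own daily rate and retention period once you've measured them; the result is a planning estimate, not a guarantee.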
So you'll need to let Graylog run for several days to get a solid idea of how much log data something like CloudTrail generates. From there, you can decide how much storage you actually need, and whether you have any business requirement to keep everything it generates. What I mean is: do you need all the fields? Can you drop parts that are superfluous? If so, you can reduce your storage needs by working out exactly what you need to keep.
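To make that concrete, here's a rough Python sketch of what you might do with a few days of observed volume. The daily figures and the 20% reduction from dropping fields are hypothetical placeholders, and the helper names are my own; you'd get the real numbers from watching your own system.

```python
from statistics import mean


def summarize_daily_ingest(daily_gb: list[float]) -> tuple[float, float]:
    """Return (average, peak) GB/day from several days of observations."""
    if not daily_gb:
        raise ValueError("need at least one day of observations")
    return mean(daily_gb), max(daily_gb)


def reduced_daily_ingest(daily_gb: float, dropped_fraction: float) -> float:
    """Daily volume left after dropping a fraction of superfluous data."""
    if not 0 <= dropped_fraction <= 1:
        raise ValueError("dropped_fraction must be between 0 and 1")
    return daily_gb * (1 - dropped_fraction)


if __name__ == "__main__":
    # Hypothetical week of measured ingest volume, in GB/day.
    observed = [42.0, 47.5, 51.0, 39.5, 44.0, 20.0, 18.5]
    average, peak = summarize_daily_ingest(observed)
    print(f"average: {average:.1f} GB/day, peak: {peak:.1f} GB/day")

    # Assumption: dropping unneeded fields trims about 20% of the volume.
    trimmed = reduced_daily_ingest(average, 0.20)
    print(f"after dropping fields: {trimmed:.1f} GB/day")
```

Planning around the average is fine for long retention periods, but keep an eye on the peak so a busy day doesn't fill the disk.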
Graylog will archive indices once they've reached your specified criteria (e.g., number of messages, days of retention), and you can elect to move those archives into cold storage if they're not needed immediately. You could also have Graylog delete data after it's been archived. All that to say, storage can grow over time, and it's up to you as the system owner to figure out how you'll handle an ever-increasing storage need. It could also be that, through managing your retention periods, you end up with a solid feel for how much data you actually need. Provided you're not onboarding a ton of new services or applications, or continually adding sources, your storage requirements could be fairly modest and not grow. But again, it's up to you to decide how you want to handle your retention strategy and everything that comes with it.
Hopefully this helps.