The short story is that Graylog stores its own information and settings in MongoDB, but not the messages you are sending in. Once Graylog has finished processing a message, it is sent to Elasticsearch for storage and future retrieval.
You can search the community for sizing recommendations to find information that is relevant to your installation (future installation?). Here is an example. There are many factors to take into consideration, from clustering to shards in Elasticsearch/OpenSearch, etc. OF NOTE: If you are creating a new Graylog instance, it is likely better to start with OpenSearch, as that seems to be the current Graylog direction.
Graylog can handle a lot; I have seen 33,000+ messages per second. This depends on how the Graylog cluster is set up and the resources it is given. Kafka, Nginx, or a load balancer could be placed in front of a cluster. Again, this depends on the environment.
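To get a feel for what a rate like that means in volume, here is a rough sketch that converts messages per second into messages and terabytes per day. The 500-byte average message size is purely an assumption for illustration; measure your own, since it varies a lot between sources.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def messages_per_day(rate_per_second):
    """Messages received in one day at a sustained per-second rate."""
    return rate_per_second * SECONDS_PER_DAY


def daily_volume_tb(rate_per_second, avg_message_bytes):
    """Raw daily ingest in decimal terabytes (1 TB = 10**12 bytes)."""
    return messages_per_day(rate_per_second) * avg_message_bytes / 1e12


if __name__ == "__main__":
    rate = 33_000          # messages per second, the figure mentioned above
    avg_size = 500         # bytes per message: ASSUMED, not measured
    print(messages_per_day(rate))                     # 2851200000
    print(round(daily_volume_tb(rate, avg_size), 2))  # 1.43
```

A sustained peak rate and an average rate are very different things, so treat the result as an upper bound, not a forecast.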
Of course SSDs will work a lot better; the question is “is it necessary?”. If you have the money, I would roll out a couple of 4 TB SSDs and RAID them. 1 TB a week is a lot, and the total depends on how long you want to keep the data: for 30 days at 1 TB a week, that is a little over 4 TB a month, plus you need room for your OS. I would suggest keeping the Elasticsearch indices on their own volume.
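The retention arithmetic can be sketched like this. It is only a back-of-the-envelope estimate: it assumes the on-disk index size roughly equals the ingested volume (real indices can be larger or smaller), and the `replicas` and `headroom` parameters are hypothetical knobs I added for illustration, not Graylog settings.

```python
def estimate_index_storage_tb(weekly_ingest_tb, retention_days,
                              replicas=0, headroom=0.0):
    """Rough storage needed for the index volume, in TB.

    ASSUMPTION: stored size is about the same as ingested size.
    replicas: extra copies of each shard (0 = primaries only).
    headroom: fraction of free space to keep, e.g. 0.2 for 20%.
    """
    daily_tb = weekly_ingest_tb / 7
    primary_tb = daily_tb * retention_days
    with_replicas_tb = primary_tb * (1 + replicas)
    return with_replicas_tb * (1 + headroom)


if __name__ == "__main__":
    # 1 TB a week kept for 30 days, primaries only, no headroom
    print(round(estimate_index_storage_tb(1.0, 30), 2))  # 4.29
    # Same ingest with one replica and 20% free space kept
    print(round(estimate_index_storage_tb(1.0, 30, replicas=1, headroom=0.2), 2))
```

Even the plain 30-day figure is already past what a single 4 TB drive holds once you leave some free space, which is why the number of drives and the RAID level matter here.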