I am thinking about making a backup of Graylog, and I have considered some options:
-Make a copy of the logs from the monitored machines to a repository on another machine.
-Make a snapshot of OpenSearch.
-Copy the virtual machine where Graylog is installed.
Indices from your Elastic/OpenSearch cluster - this can be a bit trickier. You can certainly just make copies of the data files, although the Elastic/OpenSearch service must be stopped first. I have more recent experience with OpenSearch, and it has a feature called Snapshot Management that may be useful.
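If it helps, a snapshot can be taken through the OpenSearch REST API, roughly like this. The repository name `my_backup` and the filesystem path are placeholders; the path must be listed under `path.repo` in `opensearch.yml`, and a running cluster is required:

```shell
# Register a filesystem snapshot repository (the path must be allowed by path.repo).
curl -X PUT "localhost:9200/_snapshot/my_backup" \
  -H 'Content-Type: application/json' \
  -d '{"type": "fs", "settings": {"location": "/mnt/snapshots"}}'

# Take a snapshot of all indices; wait_for_completion blocks until it finishes.
curl -X PUT "localhost:9200/_snapshot/my_backup/snapshot_1?wait_for_completion=true"

# Check the snapshot afterwards.
curl -X GET "localhost:9200/_snapshot/my_backup/snapshot_1"
```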
First, thanks for replying to my post. I am trying to "copy" the data that I am receiving from the monitored machines. I thought that making snapshots was the same thing, but I can see from your words that it is not, so I am going to try both. That said, from what you say I much prefer copying MongoDB.
Can you tell me how many days the open-source Graylog with OpenSearch keeps data in the node?
I have been generating logs for more than a month and they are still there, but I need to know exactly how many days they are kept.
That helps me a lot. I know that Graylog's retention is unlimited by default. This is good on one side but bad on the other: good to have all the data, but bad for searching, because it will be slow.
Sorry for the personal question: do you speak Spanish? I ask because of your surname.
I unfortunately do not speak Spanish. I inherited this surname through adoption; my understanding is that the family it comes from was originally from El Salvador.
Getting the right mix of log retention and performance can sometimes be a bit of an art. The biggest consideration is how much RAM you can allocate to Elasticsearch/OpenSearch.
Elastic/OpenSearch recommend no more than 20 shards per 1 GB of heap (memory specifically dedicated to Elasticsearch/OpenSearch). In practice this means it is ideal to have as few shards as possible, and the sweet spot is between 20-40 GB of data per shard. Note that this is per shard, not per index, because an index can contain multiple shards.
I know this is a bit in the weeds, but in summary:
configure your Elastic/OpenSearch heap to use up to 31 GB of RAM. This number should also not exceed half of your total operating system RAM.
configure your Graylog index set settings to have as few shards as possible while still meeting your retention requirements. There is a trade-off between how long you retain logs and general cluster performance.
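As a rough sketch of the arithmetic above (the daily ingest and retention figures here are assumptions for illustration, not numbers from this thread):

```shell
# Hypothetical sizing sketch -- adjust the assumed numbers to your environment.
heap_gb=31                      # OpenSearch heap: <= 31 GB and <= half of system RAM
max_shards=$((heap_gb * 20))    # guidance: no more than 20 shards per GB of heap
daily_gb=10                     # assumed daily ingest volume
retention_days=90               # assumed retention target
total_gb=$((daily_gb * retention_days))
# Aim for 20-40 GB per shard; 30 GB is used here as a middle target.
shards=$(( (total_gb + 29) / 30 ))
echo "total data: ${total_gb} GB, target shards: ${shards} (limit: ${max_shards})"
```

With these assumed numbers, 900 GB of retained data needs about 30 shards, comfortably under the 620-shard ceiling a 31 GB heap allows.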
Regarding your last question about text logs, Graylog Sidecar can be very useful for this, as it allows you to control log collection agents installed on Windows or Linux (or even macOS!) devices. You can use an agent called filebeat, which is automatically included in the Windows Graylog Sidecar install. However, if you are using Linux, you will need to install it separately to use it with Graylog Sidecar.
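For reference, a minimal filebeat configuration of the kind Sidecar would render might look like this; the log path and the Graylog host are placeholders, and the output points at a Beats input configured in Graylog:

```yaml
filebeat.inputs:
  - type: log
    paths:
      - /var/log/myapp/*.log          # placeholder path to the text logs

output.logstash:
  hosts: ["graylog.example.org:5044"]  # placeholder host: a Graylog Beats input
```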
Thanks a lot for answering.
I am going to leave the default configuration, because the customer wants to keep as much data as possible for as long as possible.
For copies, I am going to recommend that my customer copy the virtual machine every month, or every three months, or something like that. I tried to use the curl command and it went wrong, and the OpenSearch manual literally says to make the copy with curl and then, to be sure it worked, restore the data to verify the copy was OK. So I would have to stop receiving data, make the copy, and then prove it worked by restoring all the data on another cluster or node. All of that is impossible for us, and spending so many resources is not viable.
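For completeness, the restore-to-verify step the manual describes would look roughly like this (the repository name `my_backup`, snapshot name `snapshot_1`, and index pattern are placeholders, and a spare cluster or node is still required, which is exactly the cost being described):

```shell
# Restore under a renamed index pattern so the restore does not clash
# with any live indices on the target cluster.
curl -X POST "localhost:9200/_snapshot/my_backup/snapshot_1/_restore" \
  -H 'Content-Type: application/json' \
  -d '{
        "indices": "graylog_*",
        "rename_pattern": "graylog_(.+)",
        "rename_replacement": "restored_graylog_$1"
      }'
```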
I have to keep reading the OpenSearch and Graylog manuals.