ES reindex API and Graylog2

I used the shrink index API to shrink a few hundred old indices (see this topic).

I created the shrunken indices with a naming scheme of, e.g., graylog_1400 -> s_graylog_1400

Then I recalculated the index ranges and deleted the old indices.
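For reference, here is a minimal sketch of that shrink workflow in Python. The hosts, credentials, node name, and the Graylog ranges-rebuild endpoint are assumptions based on my own setup, so adjust them to yours:

```python
import requests

ES = "http://localhost:9200"           # assumed Elasticsearch host
GRAYLOG = "http://localhost:9000/api"  # assumed Graylog API base
AUTH = ("admin", "password")           # assumed Graylog credentials

def shrink(src, dst):
    # The shrink API needs the source index read-only and all of its
    # shards relocated onto a single node first.
    requests.put(f"{ES}/{src}/_settings", json={
        "settings": {
            "index.blocks.write": True,
            "index.routing.allocation.require._name": "node-1",  # assumed node name
        }
    }).raise_for_status()
    # Shrink into a single-shard copy under the new name.
    requests.post(f"{ES}/{src}/_shrink/{dst}", json={
        "settings": {"index.number_of_shards": 1}
    }).raise_for_status()

shrink("graylog_1400", "s_graylog_1400")

# Then have Graylog recalculate the index ranges and drop the original.
requests.post(f"{GRAYLOG}/system/indices/ranges/rebuild", auth=AUTH).raise_for_status()
requests.delete(f"{ES}/graylog_1400").raise_for_status()
```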

Now, to get index rotation working nicely, I would like to move the shrunken indices back to the original index set. Would it be possible to simply use the ES reindex API https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html with s_graylog_1400 as the source index and graylog_1400 as the destination index, then recalculate the index ranges and be done? Or should I use some other means to do this?
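For clarity, the call I have in mind would be something like this sketch (the ES host is an assumption):

```python
import requests

ES = "http://localhost:9200"  # assumed Elasticsearch host

# Copy every document from the shrunken index back under the original name.
resp = requests.post(f"{ES}/_reindex", json={
    "source": {"index": "s_graylog_1400"},
    "dest":   {"index": "graylog_1400"},
})
resp.raise_for_status()
print(resp.json())  # reindex stats: took, total, created, ...
```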

The Elasticsearch Reindex API doesn’t move documents or indices. It merely reindexes (hence the name) documents into other indices (or the same index they were read from).

In other words, shrinking an index and then reindexing its documents into another index doesn’t make much sense: the reindex writes a complete new copy of every document, so the work the shrink saved you is spent all over again.

AFAIK, Elasticsearch currently doesn’t support renaming indices.

OK, so then I would first need to create the new destination index in the default graylog_* index set, then reindex the documents back there. Would the default templates be applied automatically to the created index?

If this becomes too cumbersome, I’d probably just delete indices from the shrunken index set manually when the time comes.

Yes, if they start with the prefix of the respective index set (default is “graylog_” for the default index set).
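You can verify this yourself; a quick sketch, assuming the template is named "graylog-internal" (as in my install) and ES runs locally:

```python
import requests

ES = "http://localhost:9200"  # assumed Elasticsearch host

# Graylog installs an index template (named "graylog-internal" here)
# whose pattern covers the index set prefix, e.g. "graylog_*".
tmpl = requests.get(f"{ES}/_template/graylog-internal").json()
print(tmpl["graylog-internal"]["template"])  # the pattern, e.g. "graylog_*" on ES 5.x

# Any index whose name matches that pattern picks up the template's
# mappings and settings automatically on creation.
requests.put(f"{ES}/graylog_1400").raise_for_status()
```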

Hi,

I tried that. Creating the index is easy, but the reindex takes a long time, and after that I would still need to force-merge, block writes, and recalculate the index ranges. Scripting that would be pretty easy, but the load it puts on the ES cluster seems like overkill, so I’ll just keep a separate index set and delete the indices there when the time comes.
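For anyone who wants to go the scripted route anyway, this is roughly what I had in mind. It is only a sketch; the hosts, credentials, and the Graylog ranges-rebuild endpoint are assumptions about my setup:

```python
import requests

ES = "http://localhost:9200"           # assumed Elasticsearch host
GRAYLOG = "http://localhost:9000/api"  # assumed Graylog API base
AUTH = ("admin", "password")           # assumed Graylog credentials

def move_back(shrunken, original):
    # 1. Create the destination index; the index set's template applies
    #    automatically because the name matches the "graylog_" prefix.
    requests.put(f"{ES}/{original}").raise_for_status()

    # 2. Copy the documents back. This is the slow part: every document
    #    is read and indexed again. (For big indices you'd probably run
    #    this with wait_for_completion=false and poll the tasks API.)
    requests.post(f"{ES}/_reindex", json={
        "source": {"index": shrunken},
        "dest":   {"index": original},
    }).raise_for_status()

    # 3. Merge down to a single segment, as Graylog does on rotation.
    requests.post(f"{ES}/{original}/_forcemerge?max_num_segments=1").raise_for_status()

    # 4. Make the index read-only like other rotated indices.
    requests.put(f"{ES}/{original}/_settings", json={
        "settings": {"index.blocks.write": True}
    }).raise_for_status()

    # 5. Drop the shrunken copy and recalculate the index ranges.
    requests.delete(f"{ES}/{shrunken}").raise_for_status()
    requests.post(f"{GRAYLOG}/system/indices/ranges/rebuild", auth=AUTH).raise_for_status()

move_back("s_graylog_1400", "graylog_1400")
```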

Thanks for your help, anyway. And just to let you all know: having a couple of thousand fewer shards can easily be felt in the responsiveness of the system, so shrinking was worth it. The blog post about sizing the ES cluster, https://www.elastic.co/blog/how-many-shards-should-i-have-in-my-elasticsearch-cluster, was really informative.

Especially useful were these two tips:

TIP: Small shards result in small segments, which increases overhead. Aim to keep the average shard size between a few GB and a few tens of GB. For use-cases with time-based data, it is common to see shards between 20GB and 40GB in size.

TIP: The number of shards you can hold on a node will be proportional to the amount of heap you have available, but there is no fixed limit enforced by Elasticsearch. A good rule-of-thumb is to ensure you keep the number of shards per node below 20 to 25 per GB heap it has configured. A node with a 30GB heap should therefore have a maximum of 600-750 shards, but the further below this limit you can keep it the better. This will generally help the cluster stay in good health.
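As a quick back-of-the-envelope check of that rule of thumb (just example numbers, not anything from my cluster):

```python
# Rule of thumb from the blog post: at most ~20-25 shards per GB of heap.
def shard_budget(heap_gb, per_gb=(20, 25)):
    return heap_gb * per_gb[0], heap_gb * per_gb[1]

low, high = shard_budget(30)
print(f"30 GB heap -> at most {low}-{high} shards")  # 600-750, matching the tip
```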

