Posted to solr-user@lucene.apache.org by Pure Host - Wolfgang Freudenberger <w....@pure-host.de> on 2018/08/29 13:17:46 UTC

Migrate 4 Shards/0 Replica to 1 Shard/1 Replica

Hi Guys,


I am currently restructuring a big-data cloud with 1000+ collections on
SolrCloud. The data is stored on 4 shards without a replica. This
data is deprecated and kept read-only for certain purposes, so I want to
migrate it to a new cloud with 1 shard and 1 replica.

Is there an "easy" way to merge the shards? Or do I have to read the
data from the old cloud and write it to the new one?

Thank you!

-- 
Kind regards

Wolfgang Freudenberger
Pure Host IT-Services
Münsterstr. 14
48341 Altenberge
GERMANY
Tel.: (+49) 25 71 - 99 20 170
Fax: (+49) 25 71 - 99 20 171

VAT ID DE259181123

See our full range of services at www.pure-host.de



Re: Migrate 4 Shards/0 Replica to 1 Shard/1 Replica

Posted by Shawn Heisey <ap...@elyograg.org>.
On 8/29/2018 7:17 AM, Pure Host - Wolfgang Freudenberger wrote:
> I am currently restructuring a big-data cloud with 1000+ collections
> on SolrCloud. The data is stored on 4 shards without a replica.
> This data is deprecated and kept read-only for certain purposes, so I
> want to migrate it to a new cloud with 1 shard and 1 replica.

If you have no replicas, then you have no data to query. You can create 
a collection with zero replicas, but then you must specifically add a 
replica before you can actually use it.

I think you probably mean that you are going from a one-replica install 
(replicationFactor=1) to a two-replica install.  The leaders are also 
replicas.
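
To illustrate the replica counting, here is a minimal Python sketch that builds a Collections API CREATE request for the target layout: one shard with two copies, counting the leader. The base URL and the collection name are placeholders rather than details of the actual install, and a real request would usually also name a configset.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, num_shards=1, replication_factor=2):
    """Build a Collections API CREATE URL (sketch; does not send the request)."""
    # replicationFactor counts every copy of a shard, the leader included,
    # so "1 shard plus 1 extra replica" means replicationFactor=2.
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


# Placeholder host and collection name, for illustration only.
print(create_collection_url("http://localhost:8983/solr", "archive_2018"))
```

With replication_factor=1 the same call describes what the old cloud has today: a single copy of each shard, which is its own leader.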

> Is there an "easy" way to merge the shards? Or do I have to read the
> data from the old cloud and write it to the new one?

The Collections API does not yet have a way to merge shards.  An issue 
has been created, but it hasn't been implemented yet.  I do not know 
when that might happen:

https://issues.apache.org/jira/browse/SOLR-9407

The CoreAdmin API does have an option to merge indexes -- but when 
running in cloud mode, the CoreAdmin API is an expert API and should not 
normally be used.

In practice, I would handle this by reindexing the data onto the new
cloud.  Anytime I upgrade Solr or build something new, I index from
scratch.  It works better that way. You should always be prepared to
reindex your data from scratch -- it's a common need with a search engine.

If reindexing is more difficult for you, and you're not upgrading Solr, 
then you could try this:  Copy cores from the older SolrCloud install to 
a standalone server, merge the indexes there, build the collection in 
the new cloud, and replace the index for that collection with the merged 
index.
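
The merge step on the standalone server could look roughly like the sketch below, which only builds the CoreAdmin MERGEINDEXES URL and does not send it. The host, the target core name, and the index directory paths are all hypothetical. It assumes the target core already exists on the standalone server and that the copied shard indexes are not being written to; a commit on the target core is typically needed afterwards.

```python
from urllib.parse import urlencode


def merge_indexes_url(base_url, target_core, index_dirs):
    """Build a CoreAdmin MERGEINDEXES URL (sketch; does not send the request)."""
    # One indexDir parameter per source index directory to merge into the target core.
    params = [("action", "MERGEINDEXES"), ("core", target_core)]
    params += [("indexDir", path) for path in index_dirs]
    return f"{base_url}/admin/cores?{urlencode(params)}"


# Hypothetical paths for the four shard indexes copied from the old cloud.
shard_dirs = [f"/data/copied/shard{n}/data/index" for n in range(1, 5)]
print(merge_indexes_url("http://localhost:8983/solr", "merged", shard_dirs))
```

This would be repeated once per collection, so with 1000+ collections it is worth scripting the copy, merge, and replace steps together.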

Thanks,
Shawn