Posted to solr-user@lucene.apache.org by Jack Schlederer <ja...@directsupply.com> on 2018/10/03 15:42:40 UTC

Restoring and upgrading a standalone index to SolrCloud

Hello,

We currently run Solr 5.4 as our production search backend. We run it in a
master/slave replication architecture, and we're starting an upgrade to
Solr 7.5 using a SolrCloud architecture.

One of our collections is around 20GB and hosts about 200M documents, and
it would take around 6 hours to do a full dataimport from the database, so
we'd like to upgrade the index and restore it to SolrCloud. I've
successfully upgraded the Lucene 5 index to Lucene 6, and then to Lucene 7,
so I think I have an index that can be restored to Solr 7. Do you know if
it's possible to restore an index like this to a SolrCloud environment if I
can get it into a directory that is shared by all the nodes?

Thanks,
Jack

Re: Restoring and upgrading a standalone index to SolrCloud

Posted by Shawn Heisey <ap...@elyograg.org>.
On 10/3/2018 10:45 AM, Shawn Heisey wrote:
> Here's one way to do this: 
<snip>

Oh, and when you delete the data directory, delete the tlog directory 
too.  Don't copy tlog from the non-cloud install.  Solr will re-create 
it as long as it has permission to write to the parent directory.
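
For illustration, here is a rough Python sketch of that cleanup and copy.  It assumes the Solr node is stopped, that the core directory holds a "data" directory, and that "tlog" sits either inside "data" or beside it (that depends on configuration).  The function name and paths are made up for the example, not part of Solr.

```python
import shutil
from pathlib import Path


def replace_core_data(source_data_dir, core_dir):
    """Swap a stopped replica core's index for a copy of a standalone index.

    Hypothetical helper: the directory layout is an assumption, so check it
    against your own install before running anything like this.
    """
    source = Path(source_data_dir)
    core = Path(core_dir)

    # Remove the old index and any transaction logs. A "tlog" directory may
    # live inside "data" or beside it; both cases are covered here.
    for name in ("data", "tlog"):
        shutil.rmtree(core / name, ignore_errors=True)

    # Copy the standalone data directory, skipping anything named "tlog"
    # so the old transaction logs never reach the cloud core.
    target = core / "data"
    shutil.copytree(source, target, ignore=shutil.ignore_patterns("tlog"))
    return target
```

Run it against one replica core only; the other cores just need their data and tlog directories removed.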

Thanks,
Shawn


Re: Restoring and upgrading a standalone index to SolrCloud

Posted by Shawn Heisey <ap...@elyograg.org>.
On 10/3/2018 9:42 AM, Jack Schlederer wrote:
> I've successfully upgraded the Lucene 5 index to Lucene 6, and then to Lucene 7,

Upgrading through two major versions is not guaranteed to work.  
Upgrading from an index fully built by major version X-1 is supported, 
but if X-2 or earlier has EVER touched the index, it's probably not 
going to work.  If you find that it does work, great ... but I wouldn't 
recommend trying it.

I recommend always building indexes from scratch when upgrading, even if 
the new version is capable of reading the index created by the old version.

> so I think I have an index that can be restored to Solr 7. Do you know if
> it's possible to restore an index like this to a SolrCloud environment if I
> can get it into a directory that is shared by all the nodes?

Each node needs its own copy of the data; they cannot share an index 
directory.  Lucene works really hard to prevent sharing indexes, and 
this behavior should not be overridden.

In general, yes, you can migrate an index (assuming it's an index that 
will work, note what I said above) from a non-cloud install to a cloud 
install.  That would be greatly complicated if the index were sharded 
already in the non-cloud install -- hopefully your 20GB index is one 
core, not multiple shards.  If it's sharded ... build it from scratch, 
because it's not likely that the SolrCloud collection will route data to 
shards in precisely the same way as a non-cloud install.

Here's one way to do this:

* Set up your cloud, create an empty collection with one shard and as 
many replicas as you want.
* Shut down all of the Solr nodes related to that collection.
* Delete the "data" directory under all of the cores related to that 
collection.
* Copy the data directory from the non-cloud install to one of those 
replica cores.
* Start the Solr node where you copied the data.
* Let the system fully stabilize so the replica you have just built 
shows up as green in the Cloud graph.
* Start the other Solr nodes with the other replicas.  They will copy 
the index from the one that got started first.
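
As a sketch of the first and sixth steps, the Python below builds a Collections API CREATE request for a one-shard collection and checks a parsed CLUSTERSTATUS response for replicas that are not yet active.  The base URL, collection name, and function names are placeholders for the example, and treating the "active" state as the equivalent of green in the Cloud graph is my assumption.

```python
from urllib.parse import urlencode


def create_collection_url(base_url, name, replicas):
    """Build a Collections API CREATE URL for a single-shard collection.

    base_url is assumed to look like "http://host:8983/solr".
    """
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": 1,  # one shard, so the copied index holds all documents
        "replicationFactor": replicas,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


def all_replicas_active(cluster_status, collection):
    """Return True if every replica in a parsed CLUSTERSTATUS reply is active."""
    shards = cluster_status["cluster"]["collections"][collection]["shards"]
    return all(
        replica["state"] == "active"
        for shard in shards.values()
        for replica in shard["replicas"].values()
    )


print(create_collection_url("http://localhost:8983/solr", "products", 3))
```

The second function expects the JSON body of an action=CLUSTERSTATUS request, already decoded into a dict.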

Thanks,
Shawn