Posted to solr-user@lucene.apache.org by sputul <sp...@gmail.com> on 2016/10/18 17:28:23 UTC

Migration from Solr 4

We are using Solr 4.3 with a single Zookeeper on development to manage a
SolrCloud cluster of one or two nodes. Will it be easier to migrate to Solr 5
first, or should I migrate to Solr 6 directly? I see the core definition has
changed. Anything else worth noting?

The goal is also to use HTTPS, perhaps after everything works in my local
environment using a single Zookeeper and one or more Solr nodes. Thanks.



--
View this message in context: http://lucene.472066.n3.nabble.com/Migration-from-Solr-4-tp4301788.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Migration from Solr 4

Posted by John Bickerstaff <jo...@johnbickerstaff.com>.
Grazie, Erick, for the correction on Zookeeper...


Re: Migration from Solr 4

Posted by sputul <sp...@gmail.com>.
Thanks, Erick. I will use the IndexUpgrader tool to upgrade the index per your
suggestion.
-- Putul




Re: Migration from Solr 4

Posted by Erick Erickson <er...@gmail.com>.
bq: ...whether it's OK to just copy the data files between 4.3 and 6.x

NOT ok. Solr (well, Lucene really) guarantees to read indexes from _one_
major version behind. So Solr 5.x will read a Solr 4.x index, but Solr 6.x is
not guaranteed at all to read a 4.x index. In fact, dropping that
back-compat complexity is one of the benefits of moving to a new major
version.

You can use the 5.x IndexUpgrader tool to migrate the index from 4.x to 5.x,
then run 6.x over the upgraded index.
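
As a rough illustration of that upgrade step, here is a minimal Python sketch
that only assembles the command line for Lucene's IndexUpgrader class
(org.apache.lucene.index.IndexUpgrader). The jar file names and the index path
are placeholders, not values from this thread, and the sketch assumes the 5.x
lucene-backward-codecs jar is needed on the classpath to read a 4.x index.
Stopping Solr and backing up the index directory before running the real
command is probably wise.

```python
import os
from typing import List


def index_upgrader_command(index_dir: str,
                           lucene_core_jar: str,
                           backward_codecs_jar: str,
                           delete_prior_commits: bool = False) -> List[str]:
    """Build (but do not run) a command line for Lucene's IndexUpgrader.

    All paths passed in are hypothetical; point them at the Lucene 5.x jars
    shipped with your own Solr 5.x download.
    """
    # Classpath separator is ':' on Unix-like systems and ';' on Windows.
    classpath = os.pathsep.join([lucene_core_jar, backward_codecs_jar])
    cmd = ["java", "-cp", classpath, "org.apache.lucene.index.IndexUpgrader"]
    if delete_prior_commits:
        # Optional IndexUpgrader flag: drop older commit points first.
        cmd.append("-delete-prior-commits")
    cmd.append(index_dir)
    return cmd


if __name__ == "__main__":
    # Placeholder jar names and index directory, for illustration only.
    print(" ".join(index_upgrader_command(
        "/path/to/core/data/index",
        "lucene-core-5.x.y.jar",
        "lucene-backward-codecs-5.x.y.jar",
    )))
```

The resulting list could be handed to subprocess.run once per core or replica
index directory; the sketch deliberately stops short of executing anything.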

All that said, I completely agree with John's advice to re-index entirely
from scratch if at all possible.

**********
It's perfectly possible to run with a single Zookeeper. The quorum
formula is N/2 + 1 (integer division). If you only have 1 ZK node, then 1
represents quorum. It's just not advisable. ZK is responsible for presenting
the cluster topology to the entire set of Solr instances, and you want it
to be robust, so 3 is the recommended minimum. BTW, it's quite rare
to need more than that unless you're at pretty massive scale (hundreds
of Solr instances and/or collections), and even in those cases it's
usually best to use Observers....
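
The quorum arithmetic can be checked with a few lines of Python. This is only
a sketch of the N/2 + 1 formula quoted above (with integer division); the
function names are made up for illustration.

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of ZooKeeper voters that forms a majority."""
    if ensemble_size < 1:
        raise ValueError("an ensemble needs at least one node")
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size: int) -> int:
    """How many voters can be lost while a quorum is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


if __name__ == "__main__":
    for n in (1, 2, 3, 5):
        print(n, quorum_size(n), tolerated_failures(n))
```

A single node is its own quorum but survives no failures, and two nodes
survive none either, which is why 3 is the smallest ensemble that gives any
redundancy.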

FWIW,
Erick


Re: Migration from Solr 4

Posted by sputul <sp...@gmail.com>.
Thanks for the quick reply and all the documents, John. I plan on placing our
index into the new Solr install to see if that works, and hope that the Solr 4
index will magically work with the solrconfig changes. Excuse my ignorance,
but is there a curl command or similar to reindex the documents in a
collection? We do this in code because the index needs to sync up with other
data.




Re: Migration from Solr 4

Posted by John Bickerstaff <jo...@johnbickerstaff.com>.
Good point on the chroot - I used one and at one point I had 3 different
versions of Solr running on 9 VMs - nary a problem...

Sputul, the chroot "instructions" are in my notes...  look for something
like "...../solr6.1" in the notes and you'll see what I mean...


Re: Migration from Solr 4

Posted by sputul <sp...@gmail.com>.
Thanks, Shawn. Sad, but good to know up front that reindexing is not magic.





Re: Migration from Solr 4

Posted by Shawn Heisey <ap...@elyograg.org>.
On 10/18/2016 12:28 PM, John Bickerstaff wrote:
> For what it's worth, (and it may not work for your situation) I
> decided not to upgrade, but to "upgrade by replacing". In other words,
> I just installed 6.x and because I had set up my configs for "include"
> I didn't have to worry about what would be different about the "new"
> solrconfig.xml and the managed-schema file. Instead, I was able to use
> a copy from one of the sample projects and add my "included" configs. 

John's way is the way that I would recommend doing it.  If you utilize a
chroot, you can even use the same zookeeper ensemble for the new cloud,
it will just go in a different location in the database. SolrCloud
evolves very quickly, so trying to use an existing zookeeper database
may result in less than optimal operation.  Solr itself doesn't evolve
nearly as fast, but if you go to 6.x, you're going to jump two major
versions -- even for Solr, that's a LOT of change.
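
To make the chroot idea concrete: the chroot is just a path suffix on the
ZooKeeper connection string that Solr is given as its zkHost. A minimal Python
sketch follows; the host names and the "/solr6" path are hypothetical, and the
helper function is not part of any Solr or ZooKeeper API.

```python
from typing import Iterable


def zk_connect_string(hosts: Iterable[str], chroot: str = "") -> str:
    """Join ZooKeeper host:port pairs and append an optional chroot path.

    The chroot is appended once, after the last host, not after each host.
    """
    host_part = ",".join(hosts)
    if not host_part:
        raise ValueError("at least one ZooKeeper host is required")
    if chroot and not chroot.startswith("/"):
        raise ValueError("a chroot must be an absolute path such as /solr6")
    return host_part + chroot


if __name__ == "__main__":
    ensemble = ["zk1:2181", "zk2:2181", "zk3:2181"]  # hypothetical hosts
    print(zk_connect_string(ensemble, "/solr4"))  # old cloud
    print(zk_connect_string(ensemble, "/solr6"))  # new cloud, same ensemble
```

Each Solr version then keeps its cluster state under its own path in the same
ensemble. The chroot path typically has to exist in ZooKeeper before Solr is
first started against it.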

> By the way - unless I misunderstand the Zookeeper docs, you can't get
> away with any less than 3 Zookeeper nodes. So keep that in mind. 

This is 100% correct.  If you want zookeeper redundancy, you must have
at least 3 zookeeper servers.  This is clear in the zookeeper documentation.

> I have my rough notes about what I did available online. You can go
> here for the link...
> https://www.linkedin.com/pulse/actual-solrcloud-vms-zookeeper-nodes-john-bickerstaff


I haven't gone over these notes, but if they work, awesome.

Regarding the latest question on the thread, there is no magic "reindex"
button:

https://wiki.apache.org/solr/HowToReindex
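
In practice, reindexing means re-sending every document from the original
data source to Solr's update handler. The Python sketch below only builds such
a request, assuming a JSON update to a collection's /update endpoint; the base
URL, collection name, and document fields are placeholders, and fetching the
documents from the source system is left out entirely.

```python
import json
from typing import Dict, Iterable
from urllib.request import Request


def build_update_request(solr_base_url: str,
                         collection: str,
                         docs: Iterable[Dict[str, object]],
                         commit: bool = True) -> Request:
    """Build a POST request that sends a batch of documents to /update."""
    url = f"{solr_base_url.rstrip('/')}/{collection}/update"
    if commit:
        # Ask Solr to commit after this batch so the documents are searchable.
        url += "?commit=true"
    body = json.dumps(list(docs)).encode("utf-8")
    return Request(url, data=body,
                   headers={"Content-Type": "application/json"},
                   method="POST")


if __name__ == "__main__":
    # Hypothetical documents; in a real reindex these come from the system
    # of record (database, files, and so on), usually in many batches.
    batch = [{"id": "1", "title": "first"}, {"id": "2", "title": "second"}]
    req = build_update_request("http://localhost:8983/solr", "mycollection", batch)
    print(req.get_method(), req.full_url)
    # urllib.request.urlopen(req) would actually send it to a running Solr.
```

For a large reindex it is usually better to commit once at the end, or rely on
autoCommit, than to commit on every batch.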

Thanks,
Shawn


Re: Migration from Solr 4

Posted by John Bickerstaff <jo...@johnbickerstaff.com>.
For what it's worth (and it may not work for your situation), I decided not
to upgrade, but to "upgrade by replacing".  In other words, I just
installed 6.x, and because I had set up my configs for "include" I didn't
have to worry about what would be different about the "new" solrconfig.xml
and the managed-schema file.  Instead, I was able to use a copy from one of
the sample projects and add my "included" configs.

Then, after creating the same collection structure in 6.x that I had in my
5.x instances, I just re-indexed everything into the new 6.x Solr.
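
Recreating the collection structure on the new cluster can be done through the
Collections API. As a hedged sketch, the Python below builds a CREATE request
URL; the base URL, collection name, shard and replica counts, and configset
name are all placeholders and should mirror whatever the old cluster used.

```python
from urllib.parse import urlencode


def create_collection_url(solr_base_url: str,
                          name: str,
                          num_shards: int,
                          replication_factor: int,
                          config_name: str) -> str:
    """Build a Collections API CREATE URL for a SolrCloud cluster."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "replicationFactor": replication_factor,
        # Name of a config set already uploaded to ZooKeeper.
        "collection.configName": config_name,
    }
    return f"{solr_base_url.rstrip('/')}/admin/collections?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(create_collection_url("http://localhost:8983/solr",
                                "products", 2, 2, "products_conf"))
```

Issuing a GET against the printed URL on a running 6.x node would create the
empty collection, ready to be re-indexed into.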

The big deal (probably) is whether it will cost you days to re-index and
whether you have the resources to do that.

I don't know if the index remained the same because I didn't have to
trouble myself with that due to the replacement.  I'm sure others on the
list can tell us whether it's OK to just copy the data files between 4.3
and 6.x  (I'd guess not...)

By the way - unless I misunderstand the Zookeeper docs, you can't get away
with any less than 3 Zookeeper nodes. So keep that in mind.

I have my rough notes about what I did available online.  You can go here
for the link...

https://www.linkedin.com/pulse/actual-solrcloud-vms-zookeeper-nodes-john-bickerstaff





