Posted to solr-user@lucene.apache.org by feroz_kh <fe...@gmail.com> on 2013/03/11 18:56:24 UTC

Upgrade Solr3.5 to Solr4.1 - Index Reformat ?

Hello,

We are planning to upgrade our Solr servers from version 3.5 to 4.1.
We have a master/slave configuration and the index is quite big (around
14 GB).
1. Do we really need to re-format the whole index when we upgrade to 4.1?
2. What will be the consequences if we do not re-format and simply upgrade
the war file and config files (solrconfig.xml, schema.xml) on all slaves and
the master together (shut down the master and all slaves, then upgrade and
start up)?
3. If re-formatting is necessary, what is the best tool to achieve it, and
how long does it usually take to re-format an index of around 14 GB?

Thanks,
Feroz




--
View this message in context: http://lucene.472066.n3.nabble.com/Upgrade-Solr3-5-to-Solr4-1-Index-Reformat-tp4046391.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Upgrade Solr3.5 to Solr4.1 - Index Reformat ?

Posted by Shawn Heisey <so...@elyograg.org>.
On 3/11/2013 3:43 PM, feroz_kh wrote:
> Thanks Tomas!
> I see the latest available version is 4.1 - but you have suggested a 4.2
> version, where can i grab 4.2 version from?

It is already accessible from many mirrors.  Because it is not yet 
accessible from a large enough percentage of mirrors, the URL hasn't 
been updated on the main website yet.  Here is the URL:

http://www.apache.org/dyn/closer.cgi/lucene/solr/4.2.0

If the mirror that gets chosen for you automatically does not yet have 
it, just try another mirror.  There is no information on the download 
list about where each mirror is, so you'll just have to guess, or look 
them up to see where they are.

Thanks,
Shawn


Re: Upgrade Solr3.5 to Solr4.1 - Index Reformat ?

Posted by feroz_kh <fe...@gmail.com>.
Thanks Tomas!
I see the latest available version is 4.1, but you have suggested 4.2.
Where can I get version 4.2?




Re: Upgrade Solr3.5 to Solr4.1 - Index Reformat ?

Posted by Shawn Heisey <so...@elyograg.org>.
On 3/11/2013 5:59 PM, feroz_kh wrote:
> One more question related to backward compatibilty.
> Previously we had upgraded our solr master/slaves from 1.4 version to 3.5
> version - We didn't reformat the whole index then. So i believe there will
> be some files with 1.4 format present in our index.
>
> Now when we upgrade from 3.5 to 4.1/or4.2  - Can we expect solr slave
> version 4.x to read both 1.4 and 3.5 formatted indices, without any issues ?

If you think that you've got index files from 1.4 still hanging around, 
you should optimize the indexes in 3.5 before upgrading further, to 
convert the index.  The new version will NOT read index segments that old.
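
A minimal sketch of the optimize request discussed in this thread, written as a small shell helper. The base URL and port are placeholders (assumptions, not taken from the thread), and the query parameters are simply the ones used elsewhere in this thread; check them against the documentation for your Solr version.

```shell
#!/bin/sh
# Build the optimize URL for a Solr instance.
# $1: base URL of the Solr webapp (placeholder, e.g. http://localhost:8983/solr)
# $2: optional maxSegments value; when omitted the parameter is left off,
#     so Solr optimizes down to its default of one segment.
optimize_url() {
  url="$1/update?optimize=true&waitFlush=true"
  if [ -n "${2:-}" ]; then
    url="$url&maxSegments=$2"
  fi
  printf '%s\n' "$url"
}

# Example (not run here): send the request to the 3.5 master before upgrading.
# curl "$(optimize_url http://localhost:8983/solr)"
```

The helper only builds the URL, so it can be inspected before anything is sent to a live server.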

Thanks,
Shawn


Re: Upgrade Solr3.5 to Solr4.1 - Index Reformat ?

Posted by feroz_kh <fe...@gmail.com>.
Thanks Tomas/Shawn!

One more question, related to backward compatibility.
We previously upgraded our Solr master/slaves from version 1.4 to version
3.5, and we didn't reformat the whole index then. So I believe some files
in our index are still in the 1.4 format.

When we now upgrade from 3.5 to 4.1 or 4.2, can we expect a Solr 4.x slave
to read both 1.4-format and 3.5-format index files without any issues?

Thanks,
Feroz




Re: Upgrade Solr3.5 to Solr4.1 - Index Reformat ?

Posted by Tomás Fernández Löbbe <to...@gmail.com>.
Hi Feroz, due to Lucene's backward compatibility policy (
http://wiki.apache.org/lucene-java/BackwardsCompatibility ), a Solr 4.1
instance should be able to read an index generated by a Solr 3.5 instance.
This would not be true if you need to change the schema. Also, be careful:
Solr 4.1 will change the index files and make them unreadable by Solr 3.5,
so you should make a backup in case you need to revert to 3.5 for some
reason.
This means that if you can't shut down your whole application at once,
you can update the slaves first and then the masters. Replacing all
servers together will also work.
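
As a rough illustration of the backup step, here is a minimal sketch that copies an index directory to a timestamped location. It assumes Solr is stopped (or at least not writing to the index) while the copy runs; the function name and both paths are hypothetical.

```shell
#!/bin/sh
# Copy an index directory to a timestamped backup location.
# $1: index directory (e.g. the data/index directory of a stopped Solr instance)
# $2: directory under which backups are kept
# Prints the path of the backup that was created.
backup_index() {
  src="$1"
  dest="$2/index-backup-$(date +%Y%m%d%H%M%S)"
  mkdir -p "$2" || return 1
  cp -Rp "$src" "$dest" || return 1
  printf '%s\n' "$dest"
}

# Example (not run here; both paths are placeholders):
# backup_index /var/solr/data/index /var/backups/solr
```

Reverting to 3.5 would then mean restoring this copy in place of the index that 4.x has written to.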


That said, you should not use 4.1 if you are using Master/Slave: there are
some known bugs in that specific feature in 4.1 that were fixed in 4.2.

Tomás



Re: Upgrade Solr3.5 to Solr4.1 - Index Reformat ?

Posted by Shawn Heisey <so...@elyograg.org>.
On 3/11/2013 3:39 PM, feroz_kh wrote:
> Thanks Shawn.
> So if we have new segments in 4.1 format and all old files in 3.5 format at
> the same time, then will it cause any performance degradation on slaves
> while reading index files ( which will contain both 3.5 formatted and 4.1
> formatted files)?

There should be no performance degradation.  Solr 4.1 should perform at 
least as well as 3.5 and in many cases it will perform better.  Your 
index on disk will get smaller when converted to 4.1 format, and may 
become faster.

Thanks,
Shawn


Re: Upgrade Solr3.5 to Solr4.1 - Index Reformat ?

Posted by feroz_kh <fe...@gmail.com>.
Thanks Shawn.
So if we have new segments in 4.1 format and all old files in 3.5 format at
the same time, then will it cause any performance degradation on slaves
while reading index files ( which will contain both 3.5 formatted and 4.1
formatted files)?





Re: Upgrade Solr3.5 to Solr4.1 - Index Reformat ?

Posted by feroz_kh <fe...@gmail.com>.
Hi Shawn,

I tried optimizing using this command:

curl 'http://10.7.233.54:8088/solr/update?optimize=true&maxSegments=10&waitFlush=true'

And I got this response within seconds:

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">840</int></lst>
</response>

Is this a valid response?
I checked the statistics link on the /solr/admin page and it shows that the
number of segments was updated.
Is that a good indication that the optimization is complete?
At the same time, I noticed that the number of files in the data/index
directory hasn't gone down and not all files were updated.
Since the response took just a couple of seconds (even with
waitFlush=true), I doubt that the optimization really happened, but the
statistics page shows the correct number of segments.





Re: Upgrade Solr3.5 to Solr4.1 - Index Reformat ?

Posted by feroz_kh <fe...@gmail.com>.
Also, is it absolutely necessary to set maxSegments=1 if we need to
reformat the whole index?





Re: Upgrade Solr3.5 to Solr4.1 - Index Reformat ?

Posted by Shawn Heisey <so...@elyograg.org>.
On 4/1/2013 12:19 PM, feroz_kh wrote:
> Hi Shawn,
> 
> I tried optimizing using this command...
> 
> curl 'http://localhost:XXXX/solr/update?optimize=true&maxSegments=10&waitFlush=true'
> 
> And i got this response within secs...
> 
> <?xml version="1.0" encoding="UTF-8"?>
> <response>
> <lst name="responseHeader"><int name="status">0</int><int
> name="QTime">840</int></lst>
> </response>
> 
> Is this a valid response that one should get ?
> I checked the statistics link from  /solr/admin page and it shows the number
> segments got updated.
> Would this be a good indication that optimization is complete ?
> At the same time - I even noticed the number of files in data/index
> directory hasn't reduced & all files are not updated.
> Since it took just couple of secs for the response(even with waitFlush=true)
> - i am doubting if optimization really happened , but details on statistics
> page shows me correct number of segments.

That looks like a valid success response.  An optimize in Solr defaults
to one segment.  You asked it to do ten segments.  Either you already
had fewer than 10 segments, or it was able to find some very small
segments to merge in order to get below 10.

When you are optimizing in order to upgrade the index format, you should
leave maxSegments off or set it to 1.
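
To check the response from a script rather than by eye, the status value can be pulled out of the XML. This is a minimal sketch that assumes the response has the same shape as the one quoted in this thread; the function name is hypothetical, and a real XML parser would be more robust than sed.

```shell
#!/bin/sh
# Extract the value of <int name="status">...</int> from a Solr XML
# response read on standard input. Prints nothing if no status is found.
solr_status() {
  # Join lines first, because the response may wrap in the middle of a tag.
  tr '\n' ' ' |
    sed -n 's/.*<int[[:space:]]*name="status">\([0-9][0-9]*\)<\/int>.*/\1/p'
}

# Example (not run here; the URL is a placeholder):
# curl -s 'http://localhost:8983/solr/update?optimize=true' | solr_status
```

A status of 0 means the request succeeded. It does not say how many segments were merged, which is why a quick response with maxSegments=10 is not surprising.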

Thanks,
Shawn


Re: Upgrade Solr3.5 to Solr4.1 - Index Reformat ?

Posted by feroz_kh <fe...@gmail.com>.
Hi Shawn,

I tried optimizing using this command:

curl 'http://localhost:XXXX/solr/update?optimize=true&maxSegments=10&waitFlush=true'

And I got this response within seconds:

<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="responseHeader"><int name="status">0</int><int name="QTime">840</int></lst>
</response>

Is this a valid response?
I checked the statistics link on the /solr/admin page and it shows that the
number of segments was updated.
Is that a good indication that the optimization is complete?
At the same time, I noticed that the number of files in the data/index
directory hasn't gone down and not all files were updated.
Since the response took just a couple of seconds (even with waitFlush=true),
I doubt that the optimization really happened, but the statistics page shows
the correct number of segments.






Re: Upgrade Solr3.5 to Solr4.1 - Index Reformat ?

Posted by Shawn Heisey <so...@elyograg.org>.
On 3/12/2013 4:17 PM, feroz_kh wrote:
> Do we really need to optimize in order to reformat ?

The alternative would be to start with an empty index and just reindex 
your data.  That is actually the best way to go, if that option is 
available to you.

> If yes, What is the best way of optimizing index - Online or Offline ?
> Can we do it online ? If yes -
> 1. What is the http request which we can use to invoke optimization - How
> long it takes ?
> 2. What is the command line command to invoked optimization - How long this
> one takes ?

The only way I know of to optimize an index that's offline is using
Luke, but it is difficult to find versions of Luke that work with
indexes after 4.0-ALPHA: the official Luke page doesn't have any newer
versions, and I have no idea why.  Online is better.  Solr 4.2 was just
released, so you may want to consider skipping 4.1 and going with 4.2.

There would be no major speed difference between doing it offline or
online.  Whatever else the machine is doing might be a factor.  I can
only guess at how long it will take.  You say your index in 3.5 is
14GB.  I have experience with indexes that are 22GB in 3.5, which take
11 minutes to optimize.  The equivalent index in 4.2 is 14GB and takes
14 minutes, because of the extra compression/decompression step.
This is on RAID10; volumes with no RAID or with other RAID levels would
be slower.  Also, if the structure of your index is significantly
different from mine, yours might go faster or slower than the size alone
would suggest.
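
For a very rough starting figure, the 3.5 numbers above can be scaled by index size. This sketch assumes optimize time grows linearly with size on comparable hardware, which, as noted above, may not hold for an index with a different structure; the function name is hypothetical.

```shell
#!/bin/sh
# Scale a known optimize time linearly by index size.
# $1: reference index size in GB, $2: reference time in minutes,
# $3: size of your index in GB. Prints whole minutes (integer division).
estimate_minutes() {
  echo $(( $3 * $2 / $1 ))
}

# Using the 3.5 figures quoted above: 22 GB took 11 minutes.
estimate_minutes 22 11 14   # prints 7
```

Treat the result as an order-of-magnitude guess, not a measurement.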

There is a curl command that optimizes the index in the wiki:

http://wiki.apache.org/solr/UpdateXmlMessages#Passing_commit_and_commitWithin_parameters_as_part_of_the_URL

You would want to leave off the "maxSegments" option so it optimizes 
down to one segment.  Whether to include waitFlush is up to you, but if 
you don't include it, you won't know exactly when it finishes unless you 
are looking at the index directory.
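
One simple way to watch the index directory is to count its files at intervals. This is a minimal sketch: the function name and the path in the example are placeholders, and a changing file count is only a rough hint that merging is still in progress.

```shell
#!/bin/sh
# Print the number of regular files directly inside an index directory.
# $1: path to the index directory (placeholder, e.g. /var/solr/data/index)
index_file_count() {
  find "$1" -maxdepth 1 -type f | wc -l | tr -d ' '
}

# Example (not run here): print the count every 10 seconds.
# while :; do index_file_count /var/solr/data/index; sleep 10; done
```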

Thanks,
Shawn


Re: Upgrade Solr3.5 to Solr4.1 - Index Reformat ?

Posted by feroz_kh <fe...@gmail.com>.
Hi Shawn,

Do we really need to optimize in order to reformat?
If yes, what is the best way of optimizing the index: online or offline?
Can we do it online? If yes:
1. What is the HTTP request we can use to invoke optimization, and how
long does it take?
2. What is the command-line command to invoke optimization, and how long
does that take?

Offline:
1. What is the offline way/tool for invoking index optimization?

If no, what is the best way to only reformat the index: online or offline?
How long does each take?
What are the online and offline ways/tools to reformat the index?

Thanks,
Feroz.




Re: Upgrade Solr3.5 to Solr4.1 - Index Reformat ?

Posted by Shawn Heisey <so...@elyograg.org>.
On 3/11/2013 11:56 AM, feroz_kh wrote:
> We are planning to upgrade our solr servers from version 3.5 to 4.1.
> We have master slave configuration and the index size is quite big (i.e.
> around 14 GB ).
> 1. Do we really need to re-format the whole index , when we upgrade to 4.1 ?
> 2. What will be the consequences - if we do not re-format and simply upgrade
> war file and config files ( solrconfig.xml, schema.xml ) on all slaves and
> master together. (Shutdown all master & slaves and then upgrade & startup) ?
> 3. If re-formatting is neccessary - then what is the best tool to achieve
> it. ( How long does it usually take to re-format the index of size around
> 14GB ) ?

If you are replicating from 3.5 to 4.1, then your index will be in the 
3.5 format.  If you upgrade both the master where you index and the 
slave(s), existing index files will be in the old format, new index 
segments will be in the new format.  If you were to optimize your index 
after upgrading, it would completely replace it with the new format.

For me on a fast I/O subsystem (six 1TB SATA drives in RAID10), it takes 
about ten minutes to optimize a 22GB index on Solr 3.5.  Solr 4.1 needs 
to compress stored fields, which means extra CPU time, but less time 
actually writing to disk, so it would be about the same or possibly less.

Thanks,
Shawn