Posted to announce@apache.org by Ishan Chattopadhyaya <is...@apache.org> on 2017/03/07 19:32:21 UTC

[ANNOUNCE] Apache Solr 6.4.2 released

7 March 2017, Apache Solr 6.4.2 available

Solr is the popular, blazing fast, open source NoSQL search platform from
the Apache Lucene project. Its major features include powerful full-text
search, hit highlighting, faceted search and analytics, rich document
parsing, geospatial search, extensive REST APIs as well as parallel SQL.
Solr is enterprise grade, secure and highly scalable, providing fault
tolerant distributed search and indexing, and powers the search and
navigation features of many of the world's largest internet sites.

Solr 6.4.2 is available for immediate download at:

   - http://lucene.apache.org/solr/mirrors-solr-latest-redir.html

Please read CHANGES.txt for a full list of new features and changes:

   - https://lucene.apache.org/solr/6_4_2/changes/Changes.html

Solr 6.4.2 contains 4 bug fixes since the 6.4.1 release:

   - Serious performance degradation in Solr 6.4 due to the metrics
     collection. IndexWriter metrics collection is now turned off by
     default, and directory-level metrics collection has been removed
     completely (until a better design is found)
   - Transaction log replay can hit a NullPointerException due to new
     Metrics code
   - NullPointerException in CloudSolrClient when reading stale alias
   - UnifiedHighlighter and PostingsHighlighter bug in PrefixQuery and
     TermRangeQuery for multi-byte text

Further details of the changes are available in the change log at:
http://lucene.apache.org/solr/6_4_2/changes/Changes.html

Please report any feedback to the mailing lists
(http://lucene.apache.org/solr/discussion.html)

Note: The Apache Software Foundation uses an extensive mirroring network
for distributing releases. It is possible that the mirror you are using may
not have replicated the release yet. If that is the case, please try
another mirror. This also applies to Maven access.

Re: [ANNOUNCE] Apache Solr 6.4.2 released

Posted by "Caruana, Matthew" <mc...@icij.org>.
Hi Shawn,

These are the facts:

With Solr 6.4.1, we started the optimisation of a 200gb index with 67 segments. This did not trigger replication. It took a few days. We confirmed that the bottleneck was the CPU (optimisation is not parallelised).

We manually triggered replication of the optimised index to another Solr 6.4.1 instance, over a gigabit LAN. This took 45 hours before failing on the final file (the schema).

We upgraded both instances to 6.4.2 and started replication again. This took about 1.5 hours. Same index, same disks, same configuration, same network.
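
To put those two replication times in perspective, here is a rough sketch of the average throughput they imply. It assumes decimal units (1 GB = 1000 MB), an index of exactly 200 GB, and a theoretical gigabit ceiling of 125 MB/s with no protocol overhead; the function and variable names are only illustrative.

```python
# Effective throughput implied by the replication times reported above.
# Assumption: decimal units (1 GB = 1000 MB) and an index of exactly 200 GB.
INDEX_GB = 200


def throughput_mb_per_s(size_gb: float, hours: float) -> float:
    """Average transfer rate in megabytes per second."""
    return size_gb * 1000 / (hours * 3600)


before = throughput_mb_per_s(INDEX_GB, 45)   # Solr 6.4.1, 45 hours
after = throughput_mb_per_s(INDEX_GB, 1.5)   # Solr 6.4.2, 1.5 hours

# Assumption: theoretical ceiling of a gigabit link, ignoring overhead.
GIGABIT_MB_PER_S = 1000 / 8

print(f"6.4.1: {before:.2f} MB/s")
print(f"6.4.2: {after:.2f} MB/s")
print(f"speed-up: {after / before:.0f}x")
print(f"share of gigabit ceiling after upgrade: {after / GIGABIT_MB_PER_S:.0%}")
```

That works out to roughly 1.2 MB/s before the upgrade and roughly 37 MB/s after it, a factor of 30, and still well below what the link could carry in theory.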

Matthew

> On 8 Mar 2017, at 5:25 pm, Shawn Heisey <ap...@elyograg.org> wrote:
> 
>> On 3/8/2017 5:30 AM, Caruana, Matthew wrote:
>> After upgrading to 6.4.2 from 6.4.1, we’ve seen replication time for a
>> 200gb index decrease from 45 hours to 1.5 hours. 
> 
> Just to check how long it takes to move a large amount of data over a
> network, I started a copy of a 32GB directory over a 100Mb/s network
> using a Windows client and a Samba server.  It said it would take 50
> minutes.  At this rate, copying 200GB would take over five hours.  This
> is quite a bit longer than I expected, but I hadn't done the math to
> check transfer rate against size.
> 
> Assuming that you actually intended to use the word "replication" there
> (and not something like "rebuild"), this tells me that your network is
> considerably faster than 100 megabits per second, probably gigabit, and
> that the bottleneck is the speed of the disks.
> 
> I see a previous thread where you asked about optimization performance,
> so it sounds like you are optimizing the master index which causes a
> full replication to slaves.  This is one of the reasons that
> optimization is generally not recommended except on very small indexes
> or indexes that do not change very often.
> 
> Thanks,
> Shawn
> 

Re: [ANNOUNCE] Apache Solr 6.4.2 released

Posted by Walter Underwood <wu...@wunderwood.org>.
During the replication, check the disk, network, and CPU utilization. One of them is the bottleneck.

If the disk is at 100%, you are OK. If the network is at 100%, you are OK. If neither of them is at 100% and there is lots of CPU used (up to 100% of one core), then Solr is the bottleneck and it needs more performance work.
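
As a minimal sketch, the rule of thumb above can be written as a small function. The utilization figures would come from whatever monitoring you use; the function name, the return labels, and the 95% "saturated" threshold are my own assumptions, not anything Solr or New Relic provides.

```python
def find_bottleneck(disk_pct: float, network_pct: float,
                    busiest_core_pct: float, saturated: float = 95.0) -> str:
    """Classify the replication bottleneck from utilization percentages.

    `saturated` is a hypothetical threshold standing in for "at 100%".
    """
    if disk_pct >= saturated:
        return "disk"       # hardware-bound: replication is as fast as the disk allows
    if network_pct >= saturated:
        return "network"    # hardware-bound: replication is as fast as the link allows
    if busiest_core_pct >= saturated:
        return "solr"       # one core pegged while disk and network idle
    return "unclear"        # nothing saturated; keep measuring


# Example: idle disk and network, one CPU core pegged.
print(find_bottleneck(disk_pct=20, network_pct=5, busiest_core_pct=100))
```

The disk and network checks come first because saturation there means the hardware is the limit; only when both are idle does a pegged core point at Solr itself.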

We are using New Relic for monitoring. That makes this sort of check very easy.

wunder
Walter Underwood
wunder@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Mar 8, 2017, at 8:24 AM, Shawn Heisey <ap...@elyograg.org> wrote:
> 
> On 3/8/2017 5:30 AM, Caruana, Matthew wrote:
>> After upgrading to 6.4.2 from 6.4.1, we’ve seen replication time for a
>> 200gb index decrease from 45 hours to 1.5 hours. 
> 
> Just to check how long it takes to move a large amount of data over a
> network, I started a copy of a 32GB directory over a 100Mb/s network
> using a Windows client and a Samba server.  It said it would take 50
> minutes.  At this rate, copying 200GB would take over five hours.  This
> is quite a bit longer than I expected, but I hadn't done the math to
> check transfer rate against size.
> 
> Assuming that you actually intended to use the word "replication" there
> (and not something like "rebuild"), this tells me that your network is
> considerably faster than 100 megabits per second, probably gigabit, and
> that the bottleneck is the speed of the disks.
> 
> I see a previous thread where you asked about optimization performance,
> so it sounds like you are optimizing the master index which causes a
> full replication to slaves.  This is one of the reasons that
> optimization is generally not recommended except on very small indexes
> or indexes that do not change very often.
> 
> Thanks,
> Shawn
> 


Re: [ANNOUNCE] Apache Solr 6.4.2 released

Posted by Shawn Heisey <ap...@elyograg.org>.
On 3/8/2017 5:30 AM, Caruana, Matthew wrote:
> After upgrading to 6.4.2 from 6.4.1, we’ve seen replication time for a
> 200gb index decrease from 45 hours to 1.5 hours. 

Just to check how long it takes to move a large amount of data over a
network, I started a copy of a 32GB directory over a 100Mb/s network
using a Windows client and a Samba server.  It said it would take 50
minutes.  At this rate, copying 200GB would take over five hours.  This
is quite a bit longer than I expected, but I hadn't done the math to
check transfer rate against size.
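
For anyone who wants to redo the math, here is a rough sketch of the estimate. It assumes decimal units (1 GB = 1000 MB) and a constant transfer rate; the function names are only illustrative.

```python
# Back-of-the-envelope check of the copy-time estimate above.
# Assumptions: decimal units (1 GB = 1000 MB) and a constant transfer rate.


def observed_rate_mb_per_s(size_gb: float, minutes: float) -> float:
    """Rate implied by copying size_gb gigabytes in the given minutes."""
    return size_gb * 1000 / (minutes * 60)


def transfer_hours(size_gb: float, rate_mb_per_s: float) -> float:
    """Hours needed to move size_gb gigabytes at rate_mb_per_s."""
    return size_gb * 1000 / rate_mb_per_s / 3600


rate = observed_rate_mb_per_s(32, 50)        # the 32GB copy estimated at 50 minutes
hours_for_index = transfer_hours(200, rate)  # the 200GB index at the same rate

print(f"observed rate: {rate:.1f} MB/s ({rate * 8:.0f} Mb/s)")
print(f"200 GB at that rate: {hours_for_index:.1f} hours")
```

The implied rate is about 10.7 MB/s (about 85 Mb/s), which is plausible for a 100Mb/s link, and 200GB at that rate comes to about 5.2 hours.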

Assuming that you actually intended to use the word "replication" there
(and not something like "rebuild"), this tells me that your network is
considerably faster than 100 megabits per second, probably gigabit, and
that the bottleneck is the speed of the disks.

I see a previous thread where you asked about optimization performance,
so it sounds like you are optimizing the master index which causes a
full replication to slaves.  This is one of the reasons that
optimization is generally not recommended except on very small indexes
or indexes that do not change very often.

Thanks,
Shawn


Re: [ANNOUNCE] Apache Solr 6.4.2 released

Posted by Erick Erickson <er...@gmail.com>.
Caruana:

Thanks for that info.

Do you know offhand how that 1.5 hours compares to earlier versions?
I'm wondering if there is further work to be done here or are we back
to previous speeds.

Thanks
Erick

On Wed, Mar 8, 2017 at 4:30 AM, Caruana, Matthew <mc...@icij.org> wrote:
> After upgrading to 6.4.2 from 6.4.1, we’ve seen replication time for a 200gb index decrease from 45 hours to 1.5 hours.

Re: [ANNOUNCE] Apache Solr 6.4.2 released

Posted by "Caruana, Matthew" <mc...@icij.org>.
After upgrading to 6.4.2 from 6.4.1, we’ve seen replication time for a 200gb index decrease from 45 hours to 1.5 hours.



Re: [ANNOUNCE] Apache Solr 6.4.2 released

Posted by Ishan Chattopadhyaya <ic...@gmail.com>.
Hi Bernd,
Can you please double check?

I downloaded the 6.4.2 tarball and see that they have 6.4.2:

[ishan@ishanvps solr-6.4.2]$ grep -rn "luceneMatchVersion" *|grep solrconfig.xml
CHANGES.txt:1474:  <schemaFactory class="ClassicIndexSchemaFactory"/> or your luceneMatchVersion in the solrconfig.xml is less than 6.0
docs/changes/Changes.html:1694:&lt;schemaFactory class="ClassicIndexSchemaFactory"/&gt; or your luceneMatchVersion in the solrconfig.xml is less than 6.0
example/files/conf/solrconfig.xml:38:  <luceneMatchVersion>6.4.2</luceneMatchVersion>
example/example-DIH/solr/tika/conf/solrconfig.xml:38:  <luceneMatchVersion>6.4.2</luceneMatchVersion>
example/example-DIH/solr/rss/conf/solrconfig.xml:38:  <luceneMatchVersion>6.4.2</luceneMatchVersion>
example/example-DIH/solr/mail/conf/solrconfig.xml:38:  <luceneMatchVersion>6.4.2</luceneMatchVersion>
example/example-DIH/solr/db/conf/solrconfig.xml:38:  <luceneMatchVersion>6.4.2</luceneMatchVersion>
example/example-DIH/solr/solr/conf/solrconfig.xml:38:  <luceneMatchVersion>6.4.2</luceneMatchVersion>
server/solr/configsets/basic_configs/conf/solrconfig.xml:38:  <luceneMatchVersion>6.4.2</luceneMatchVersion>
server/solr/configsets/sample_techproducts_configs/conf/solrconfig.xml:38:  <luceneMatchVersion>6.4.2</luceneMatchVersion>
server/solr/configsets/data_driven_schema_configs/conf/solrconfig.xml:38:  <luceneMatchVersion>6.4.2</luceneMatchVersion>


Maybe you downloaded the 6.4.1 version by mistake?
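
If grep is not handy, the same check can be sketched in a few lines of Python. This is only an illustration: the helper names `find_match_versions` and `stale_configs` are made up for this example, and the regular expression assumes the version is written literally inside the `<luceneMatchVersion>` element, as in the shipped configs above.

```python
import os
import re

# Assumption: the version appears literally between the tags.
PATTERN = re.compile(r"<luceneMatchVersion>([^<]+)</luceneMatchVersion>")


def find_match_versions(root: str) -> dict:
    """Map each solrconfig.xml under root to its luceneMatchVersion (or None)."""
    found = {}
    for dirpath, _dirs, files in os.walk(root):
        if "solrconfig.xml" in files:
            path = os.path.join(dirpath, "solrconfig.xml")
            with open(path, encoding="utf-8") as fh:
                match = PATTERN.search(fh.read())
            version = match.group(1).strip() if match else None
            found[os.path.relpath(path, root)] = version
    return found


def stale_configs(root: str, expected: str) -> list:
    """Relative paths of solrconfig.xml files whose version differs from expected."""
    versions = find_match_versions(root)
    return sorted(path for path, version in versions.items() if version != expected)


# Usage (the directory name is an assumption about where the tarball was unpacked):
#   stale_configs("solr-6.4.2", "6.4.2")
```

An empty list from `stale_configs` would mean every solrconfig.xml under the directory already carries the expected version; any path it returns is a candidate for a file left over from an older install.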
Thanks,
Ishan


On Thu, Mar 9, 2017 at 10:19 AM, Shawn Heisey <ap...@elyograg.org> wrote:

> On 3/8/2017 2:36 AM, Bernd Fehling wrote:
> > Shouldn't in server/solr/configsets/.../solrconfig.xml
> > <luceneMatchVersion>6.4.1</luceneMatchVersion>
> > really read
> > <luceneMatchVersion>6.4.2</luceneMatchVersion>
> >
> > May be something for package builder for future releases?
>
> That does look like it got overlooked, and is generally something that
> SHOULD be changed with each new version, but in this case, changing
> between those two version numbers will have zero effect.  It is against
> project policy to make significant changes in a bugfix release (where
> the third version number changes).
>
> Any change that's significant enough to be controlled by a
> luceneMatchVersion check would only be allowed in a minor or major release.
>
> Thanks,
> Shawn
>
>

Re: [ANNOUNCE] Apache Solr 6.4.2 released

Posted by Zheng Lin Edwin Yeo <ed...@gmail.com>.
Hi Shawn,

Thanks for the info.

Regards,
Edwin


On 9 March 2017 at 23:09, Shawn Heisey <ap...@elyograg.org> wrote:

> On 3/8/2017 10:46 PM, Zheng Lin Edwin Yeo wrote:
> > Just to check, is an index that was built with Solr 6.4.1 affected
> > by the bug? Do we have to re-index those records when we move to Solr
> > 6.4.2?
>
> None of the bugs fixed between 6.4.1 and 6.4.2 should affect your index
> contents at all.  That's really the point of a bugfix release -- to fix
> problems that have been found without causing new ones.
>
> Thanks,
> Shawn
>
>

Re: [ANNOUNCE] Apache Solr 6.4.2 released

Posted by Shawn Heisey <ap...@elyograg.org>.
On 3/8/2017 10:46 PM, Zheng Lin Edwin Yeo wrote:
> Just to check, is an index that was built with Solr 6.4.1 affected
> by the bug? Do we have to re-index those records when we move to Solr
> 6.4.2?

None of the bugs fixed between 6.4.1 and 6.4.2 should affect your index
contents at all.  That's really the point of a bugfix release -- to fix
problems that have been found without causing new ones.

Thanks,
Shawn


Re: [ANNOUNCE] Apache Solr 6.4.2 released

Posted by Zheng Lin Edwin Yeo <ed...@gmail.com>.
Hi,

Just to check, is an index that was built with Solr 6.4.1 affected by the
bug? Do we have to re-index those records when we move to Solr 6.4.2?

Regards,
Edwin


On 9 March 2017 at 12:49, Shawn Heisey <ap...@elyograg.org> wrote:

> On 3/8/2017 2:36 AM, Bernd Fehling wrote:
> > Shouldn't in server/solr/configsets/.../solrconfig.xml
> > <luceneMatchVersion>6.4.1</luceneMatchVersion>
> > really read
> > <luceneMatchVersion>6.4.2</luceneMatchVersion>
> >
> > May be something for package builder for future releases?
>
> That does look like it got overlooked, and is generally something that
> SHOULD be changed with each new version, but in this case, changing
> between those two version numbers will have zero effect.  It is against
> project policy to make significant changes in a bugfix release (where
> the third version number changes).
>
> Any change that's significant enough to be controlled by a
> luceneMatchVersion check would only be allowed in a minor or major release.
>
> Thanks,
> Shawn
>
>

Re: [ANNOUNCE] Apache Solr 6.4.2 released

Posted by Bernd Fehling <be...@uni-bielefeld.de>.
Hi Ishan,

yes you are right!

It was left over from the previous version during the replacement :-(

Thanks for your help,
Bernd


On 09.03.2017 at 06:49, Ishan Chattopadhyaya wrote:
> Hi Bernd,
> Can you please double check?
> 
> I downloaded the 6.4.2 tarball and see that they have 6.4.2:
> 
> [ishan@ishanvps solr-6.4.2]$ grep -rn "luceneMatchVersion" *|grep
> solrconfig.xml
> CHANGES.txt:1474:  <schemaFactory class="ClassicIndexSchemaFactory"/> or
> your luceneMatchVersion in the solrconfig.xml is less than 6.0
> docs/changes/Changes.html:1694:&lt;schemaFactory
> class="ClassicIndexSchemaFactory"/&gt; or your luceneMatchVersion in the
> solrconfig.xml is less than 6.0
> example/files/conf/solrconfig.xml:38:
> <luceneMatchVersion>6.4.2</luceneMatchVersion>
> example/example-DIH/solr/tika/conf/solrconfig.xml:38:
> <luceneMatchVersion>6.4.2</luceneMatchVersion>
> example/example-DIH/solr/rss/conf/solrconfig.xml:38:
> <luceneMatchVersion>6.4.2</luceneMatchVersion>
> example/example-DIH/solr/mail/conf/solrconfig.xml:38:
> <luceneMatchVersion>6.4.2</luceneMatchVersion>
> example/example-DIH/solr/db/conf/solrconfig.xml:38:
> <luceneMatchVersion>6.4.2</luceneMatchVersion>
> example/example-DIH/solr/solr/conf/solrconfig.xml:38:
> <luceneMatchVersion>6.4.2</luceneMatchVersion>
> server/solr/configsets/basic_configs/conf/solrconfig.xml:38:
> <luceneMatchVersion>6.4.2</luceneMatchVersion>
> server/solr/configsets/sample_techproducts_configs/conf/solrconfig.xml:38:
> <luceneMatchVersion>6.4.2</luceneMatchVersion>
> server/solr/configsets/data_driven_schema_configs/conf/solrconfig.xml:38:
> <luceneMatchVersion>6.4.2</luceneMatchVersion>
> 
> 
> Maybe you downloaded the 6.4.1 version by mistake?
> Thanks,
> Ishan
> 
> 
> On Thu, Mar 9, 2017 at 10:19 AM, Shawn Heisey <ap...@elyograg.org> wrote:
> 
>> On 3/8/2017 2:36 AM, Bernd Fehling wrote:
>>> Shouldn't in server/solr/configsets/.../solrconfig.xml
>>> <luceneMatchVersion>6.4.1</luceneMatchVersion>
>>> really read
>>> <luceneMatchVersion>6.4.2</luceneMatchVersion>
>>>
>>> May be something for package builder for future releases?
>>


Re: [ANNOUNCE] Apache Solr 6.4.2 released

Posted by Shawn Heisey <ap...@elyograg.org>.
On 3/8/2017 2:36 AM, Bernd Fehling wrote:
> Shouldn't in server/solr/configsets/.../solrconfig.xml
> <luceneMatchVersion>6.4.1</luceneMatchVersion>
> really read
> <luceneMatchVersion>6.4.2</luceneMatchVersion>
>
> May be something for package builder for future releases?

That does look like it got overlooked, and is generally something that
SHOULD be changed with each new version, but in this case, changing
between those two version numbers will have zero effect.  It is against
project policy to make significant changes in a bugfix release (where
the third version number changes).

Any change that's significant enough to be controlled by a
luceneMatchVersion check would only be allowed in a minor or major release.

Thanks,
Shawn


Re: [ANNOUNCE] Apache Solr 6.4.2 released

Posted by Bernd Fehling <be...@uni-bielefeld.de>.
In server/solr/configsets/.../solrconfig.xml, shouldn't
<luceneMatchVersion>6.4.1</luceneMatchVersion>
really read
<luceneMatchVersion>6.4.2</luceneMatchVersion>?

Maybe this is something for the package builder to handle in future releases?

Regards
Bernd


-- 
*************************************************************
Bernd Fehling                    Bielefeld University Library
Dipl.-Inform. (FH)                LibTec - Library Technology
Universitätsstr. 25                  and Knowledge Management
33615 Bielefeld
Tel. +49 521 106-4060       bernd.fehling(at)uni-bielefeld.de

BASE - Bielefeld Academic Search Engine - www.base-search.net
*************************************************************