Posted to java-user@lucene.apache.org by Andrzej Bialecki <ab...@getopt.org> on 2009/08/26 23:44:40 UTC

Lucene Search Performance Analysis Workshop

Hi all,

I am giving a free talk/workshop next week on how to analyze and 
improve Lucene search performance for native Lucene apps. If you've ever 
been challenged to get your Java Lucene search apps running faster, I 
think you might find the talk of interest.

Free online workshop:
Thursday, September 3rd 2009
11:00-11:30AM PDT / 14:00-14:30 EDT

Follow this link to sign up:
http://www2.eventsvc.com/lucidimagination/event/ff97623d-3fd5-43ba-a69d-650dcb1d6bbc?trk=WR-SEP2009-AP

About:
Lucene Performance Workshop:
Understanding Lucene Search Performance
with Andrzej Bialecki

Experienced Java developers know how to use the Apache Lucene library to 
build powerful search applications natively in Java.
LucidGaze for Lucene from Lucid Imagination, just released this week, 
provides a powerful utility for making transparent the underlying 
indexing and search operations, and analyzing their impact on search 
performance.

Agenda:
* Understanding sources of variability in Lucene search performance
* LucidGaze for Lucene APIs for performance statistics
* Applying LucidGaze for Lucene performance statistics to real-world 
performance problems

Join us for a free online workshop. Sign up via the link below:
http://www2.eventsvc.com/lucidimagination/event/ff97623d-3fd5-43ba-a69d-650dcb1d6bbc?trk=WR-SEP2009-AP

About the Presenter:
Andrzej Bialecki, Apache Lucene PMC Member, is on the Lucid Imagination 
Technical Advisory Board; he also serves as the project lead for Nutch, 
and as committer in the Lucene-java, Nutch and Hadoop projects. He has 
broad expertise, across domains as diverse as information retrieval, 
systems architecture, embedded systems kernels, networking and business 
process/e-commerce modeling. He's also the author of the popular Luke 
index inspection utility. Andrzej holds a master's degree in Electronics 
from Warsaw Technical University, speaks four languages and programs in 
many, many more.


-- 
Best regards,
Andrzej Bialecki     <><
  ___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Lucene Search Performance Analysis Workshop

Posted by Erik Hatcher <er...@gmail.com>.
Fuad -

http://www.lucidimagination.com/blog/2009/05/27/filtered-query-performance-increases-for-solr-14/

Use fq=filter instead, generally speaking.
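
To make the difference concrete, here is a minimal sketch of the two request
styles. The field names (title, category) are made up for illustration, and
the helper class is not part of Solr or SolrJ; it only builds the query
string. The point is that fq is sent as its own parameter, so Solr can cache
the filter separately and it does not affect scoring, whereas
{query} AND {filter} makes the restriction part of the scored query.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only: builds Solr request parameters.
class SolrFilterQueryExample {

    // Folds the restriction into the main query, so it is parsed and scored with q.
    static String combined(String query, String filter) {
        return "q=" + enc("(" + query + ") AND (" + filter + ")");
    }

    // Sends the restriction as a separate fq parameter alongside q.
    static String withFq(String query, String filter) {
        return "q=" + enc(query) + "&fq=" + enc(filter);
    }

    // URL-encodes a parameter value as UTF-8 (requires Java 10 or later).
    private static String enc(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Field names below are assumptions, not taken from any real schema.
        System.out.println(combined("title:lucene", "category:books"));
        System.out.println(withFq("title:lucene", "category:books"));
    }
}
```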

	Erik


On Aug 26, 2009, at 10:24 PM, Fuad Efendi wrote:

> I am wondering... are new SOLR filtering features faster than standard
> Lucene queries like
> {query} AND {filter}???
>
> Why can't we improve Lucene then?
>
> 	Fuad
>
>
> P.S.
> https://issues.apache.org/jira/browse/SOLR-1169
> https://issues.apache.org/jira/browse/SOLR-1179


Re: Lucene Search Performance Analysis Workshop

Posted by Jason Rutherglen <ja...@gmail.com>.
Agreed, Solr uses random access bitsets everywhere so I'm thinking
this could be an improvement or at least a great option to enable and
try out. I'll update LUCENE-1536 so we can benchmark.

On Thu, Aug 27, 2009 at 4:06 AM, Michael
McCandless<lu...@mikemccandless.com> wrote:
> On Thu, Aug 27, 2009 at 6:30 AM, Grant Ingersoll<gs...@apache.org> wrote:
>
>>> I am wondering... are new SOLR filtering features faster than standard
>>> Lucene queries like
>>> {query} AND {filter}???
>>
>> The new filtering features in Solr are just doing what Lucene started doing
>> in 2.4 and that is using skipping when possible.  It used to be the case in
>> both Lucene and Solr that the filter was only ever applied after scoring
>> but before insertion into the Priority Queue.  That is now fixed.
>
> I think performance of filtering can still be further improved, within
> Lucene... it's still very much a work in progress.
>
> EG if a filter is random access (eg RAM resident as a bit set), which
> I think for Solr is frequently the case (?), it ought to be applied
> just like we now apply deleted documents (LUCENE-1536 is opened for
> this).  This can result in sizable performance gains, especially for
> more complex queries and not-so-dense filters.
>
> Mike
>

Re: Lucene Search Performance Analysis Workshop

Posted by Michael McCandless <lu...@mikemccandless.com>.
On Thu, Aug 27, 2009 at 6:30 AM, Grant Ingersoll<gs...@apache.org> wrote:

>> I am wondering... are new SOLR filtering features faster than standard
>> Lucene queries like
>> {query} AND {filter}???
>
> The new filtering features in Solr are just doing what Lucene started doing
> in 2.4 and that is using skipping when possible.  It used to be the case in
> both Lucene and Solr that the filter was only ever applied after scoring
> but before insertion into the Priority Queue.  That is now fixed.

I think performance of filtering can still be further improved, within
Lucene... it's still very much a work in progress.

E.g., if a filter is random access (e.g., RAM-resident as a bit set), which
I think for Solr is frequently the case (?), it ought to be applied
just like we now apply deleted documents (LUCENE-1536 is opened for
this).  This can result in sizable performance gains, especially for
more complex queries and not-so-dense filters.
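
A rough sketch of the idea, in plain Java rather than Lucene's actual
classes: postings are modeled as sorted int arrays and both the deleted docs
and the filter as java.util.BitSet. All names here are invented for
illustration and this is not the LUCENE-1536 patch. It only shows where the
check sits: the filter bit is tested at the same low level as the
deleted-docs bit, so the query logic above it (an OR of two terms here)
never sees a rejected document.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;
import java.util.TreeSet;

// Illustrative model only; these are not Lucene classes or APIs.
class RandomAccessFilterSketch {

    // Builds a bit set with the given document ids set.
    static BitSet bits(int... docs) {
        BitSet set = new BitSet();
        for (int doc : docs) {
            set.set(doc);
        }
        return set;
    }

    // Walks one term's postings, dropping deleted docs and, in the same
    // place, docs that the random-access filter rejects.
    static List<Integer> livePostings(int[] postings, BitSet deleted, BitSet filter) {
        List<Integer> live = new ArrayList<>();
        for (int doc : postings) {
            if (deleted.get(doc)) {
                continue; // deleted-docs check
            }
            if (!filter.get(doc)) {
                continue; // filter check, applied at the same level
            }
            live.add(doc);
        }
        return live;
    }

    // A two-term OR query: the union only ever handles docs that survived
    // both checks, so no rejected doc reaches matching or scoring.
    static List<Integer> orQuery(int[] termA, int[] termB, BitSet deleted, BitSet filter) {
        TreeSet<Integer> union = new TreeSet<>(livePostings(termA, deleted, filter));
        union.addAll(livePostings(termB, deleted, filter));
        return new ArrayList<>(union);
    }

    public static void main(String[] args) {
        System.out.println(orQuery(
                new int[] {0, 2, 4, 6}, new int[] {1, 2, 5, 6, 8},
                bits(4), bits(2, 5, 6, 7)));
    }
}
```

The sketch says nothing about how large the gain is; that is what the
benchmark on the issue would have to show.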

Mike

Re: Lucene Search Performance Analysis Workshop

Posted by Grant Ingersoll <gs...@apache.org>.
On Aug 26, 2009, at 10:24 PM, Fuad Efendi wrote:

> I am wondering... are new SOLR filtering features faster than standard
> Lucene queries like
> {query} AND {filter}???

The new filtering features in Solr are just doing what Lucene started
doing in 2.4, which is to use skipping when possible.  It used to be
the case in both Lucene and Solr that the filter was only ever
applied after scoring but before insertion into the priority queue.
That is now fixed.
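
As a minimal sketch of the difference (plain Java, not the Lucene
implementation; the class and method names are made up), assume the query
matches and the filter are both sorted arrays of distinct doc ids. The first
method does the work of scoring every query match and only then consults the
filter. The second lets each side skip ahead to the other's current doc, so
only docs present in both are ever scored. The scoredCount field is just a
stand-in for the scoring work.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative model only; these are not Lucene classes or APIs.
class FilterSkippingSketch {

    // Number of docs "scored" by the most recent call (stand-in for real scoring).
    static int scoredCount;

    // Old behaviour as described above: score each query match, then check
    // the filter before the doc would go into the priority queue.
    static List<Integer> filterAfterScoring(int[] queryDocs, int[] filterDocs) {
        scoredCount = 0;
        List<Integer> hits = new ArrayList<>();
        for (int doc : queryDocs) {
            scoredCount++;
            if (Arrays.binarySearch(filterDocs, doc) >= 0) {
                hits.add(doc);
            }
        }
        return hits;
    }

    // Skipping: leapfrog the two sorted lists and score only docs on both.
    static List<Integer> filterWithSkipping(int[] queryDocs, int[] filterDocs) {
        scoredCount = 0;
        List<Integer> hits = new ArrayList<>();
        int q = 0;
        int f = 0;
        while (q < queryDocs.length && f < filterDocs.length) {
            if (queryDocs[q] < filterDocs[f]) {
                q = advance(queryDocs, q, filterDocs[f]);
            } else if (queryDocs[q] > filterDocs[f]) {
                f = advance(filterDocs, f, queryDocs[q]);
            } else {
                scoredCount++;
                hits.add(queryDocs[q]);
                q++;
                f++;
            }
        }
        return hits;
    }

    // Returns the index of the first entry >= target, searching from 'from'.
    static int advance(int[] docs, int from, int target) {
        int pos = Arrays.binarySearch(docs, from, docs.length, target);
        return pos >= 0 ? pos : -pos - 1;
    }

    public static void main(String[] args) {
        int[] query = {1, 3, 5, 7, 9, 11};
        int[] filter = {5, 11, 20};
        System.out.println(filterAfterScoring(query, filter) + " scored=" + scoredCount);
        System.out.println(filterWithSkipping(query, filter) + " scored=" + scoredCount);
    }
}
```

Both methods return the same hits; only the number of docs scored differs.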


>
> Why can't we improve Lucene then?
>
> 	Fuad
>
>
> P.S.
> https://issues.apache.org/jira/browse/SOLR-1169
> https://issues.apache.org/jira/browse/SOLR-1179

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:
http://www.lucidimagination.com/search


RE: Lucene Search Performance Analysis Workshop

Posted by Fuad Efendi <fu...@efendi.ca>.
I am wondering... are new SOLR filtering features faster than standard
Lucene queries like
{query} AND {filter}???

Why can't we improve Lucene then?

	Fuad


P.S. 
https://issues.apache.org/jira/browse/SOLR-1169
https://issues.apache.org/jira/browse/SOLR-1179









Fwd: Lucene Search Performance Analysis Workshop

Posted by Erik Hatcher <er...@gmail.com>.
While Andrzej's talk will focus on things at the Lucene layer, I'm  
sure there'll be some great tips and tricks useful to Solrians too.   
Andrzej is one of the sharpest folks I've met, and he's also a very  
impressive presenter.  Tune in if you can.

	Erik

