Posted to java-user@lucene.apache.org by Liviu Matei <li...@gmail.com> on 2014/05/19 22:21:17 UTC

Performance issue when using multiple PhraseQueries against a 1+ million entries index

Hi,

To get a somewhat "smarter" search that also takes context into account, I
decided to use PhraseQuery. I create ~100 phrase queries from the input text,
combine them with a boolean query into a single query, and run that query
against the index.
When the index is large (1+ million entries with a lot of content), performance
suffers: response time is ~30 seconds, which is not acceptable. Is there a way
to tune the PhraseQueries? Or is there another way to improve performance
besides reducing the number of queries? I've read a little about N-gram
queries, but I'm not sure whether they are suitable in this scenario.

Thanks and regards,
Liviu

Re: Performance issue when using multiple PhraseQueries against a 1+ million entries index

Posted by Liviu Matei <li...@gmail.com>.
Thanks for the reply.
When you mention system memory, are you referring to RAM (or to the heap,
since this runs as a Java process)?
The index size is around 13G, and the Java process is not given that much
memory (in terms of -Xmx).
Could this be the cause? My understanding, from reading some articles on the
internet, is that when using MMapDirectory (as I do) it is not a good idea to
allocate all of the RAM to the Java process.

Thanks,
Liviu
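
The question in this exchange comes down to simple arithmetic: whatever RAM is
not taken by the Java heap (and by other processes) is what the OS can use as
file cache for the memory-mapped index. The following is only a rough sketch of
that check. The class and method names are hypothetical, the 13 GiB index size
is taken from the message above, and the RAM, heap, and "other processes"
figures are made-up placeholders, not values from the thread.

```java
public class PageCacheCheck {
    static final long GIB = 1024L * 1024L * 1024L;

    /** RAM left over for the OS file cache after the heap and other processes (never negative). */
    static long cacheBudgetBytes(long physicalRamBytes, long heapBytes, long otherBytes) {
        return Math.max(0L, physicalRamBytes - heapBytes - otherBytes);
    }

    /** True if an index of the given size could, in principle, stay fully cached. */
    static boolean indexFitsInCache(long indexBytes, long physicalRamBytes,
                                    long heapBytes, long otherBytes) {
        return indexBytes <= cacheBudgetBytes(physicalRamBytes, heapBytes, otherBytes);
    }

    public static void main(String[] args) {
        long index = 13 * GIB; // index size mentioned in the thread (assumed to mean GiB)
        long ram   = 16 * GIB; // hypothetical machine size
        long heap  = 4 * GIB;  // hypothetical -Xmx setting
        long other = 1 * GIB;  // hypothetical memory used by other processes

        System.out.println("Cache budget (GiB): " + cacheBudgetBytes(ram, heap, other) / GIB);
        System.out.println("Index fits in file cache: " + indexFitsInCache(index, ram, heap, other));
    }
}
```

With these placeholder numbers the index does not fit, which would be consistent
with heavy I/O during searches. The real figures have to come from the machine
itself, and a smaller heap leaves more room for the file cache.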




On Mon, May 19, 2014 at 11:28 PM, Jack Krupansky <ja...@basetechnology.com> wrote:

> Does your index fit fully in system memory - the OS file cache? If not,
> there could be a lot of thrashing (I/O) as Lucene accesses the index.
>
> -- Jack Krupansky

Re: Performance issue when using multiple PhraseQueries against a 1+ million entries index

Posted by Jack Krupansky <ja...@basetechnology.com>.
Does your index fit fully in system memory - the OS file cache? If not, 
there could be a lot of thrashing (I/O) as Lucene accesses the index.

-- Jack Krupansky
