Posted to java-user@lucene.apache.org by MC Moisei <mc...@comcast.net> on 2007/03/04 22:25:40 UTC

IndexSearcher cache

Hi to all members of the user group!

Let me get to my problem. I use Lucene in two different parts of the
application. One is the SearchService and the other is an AOP interceptor that
intercepts any changes to the Searcheable entities. The interceptor
removes the document from the index and adds it again.

That being said, here's my test case.

My searchable item has the content "apple banana". If I search for apple
or banana, I get it back among the results.
If I modify it and remove banana from the content, then search for apple or
banana, I get the same results as above (!?)
If I restart my application so the IndexSearcher is recreated and run the
test above again, I only get my document if I search for apple. That leads me
to conclude that the IndexSearcher caches the results.

Is there a way to clear the IndexSearcher when I do the reindexing (I use
IndexModifier for the AOP interceptor)?

MC

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: IndexSearcher cache

Posted by Erick Erickson <er...@gmail.com>.
<1>. Every time you close/open a reader, you pay a significant penalty
to warm up caches, etc. You may have to do some tricky dancing
to coordinate among the sessions to be able to close/reopen
the reader to allow updates to show up though.
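
For the pagination case, one way to make option 1 work is to keep only the query and the page number in each session, re-run the search against the shared reader on every request, and slice out the requested page. The thread does not spell this out, so treat the following as a minimal sketch under that assumption. `Pager` and `page` are hypothetical names, and the list stands in for the ranked hits a search would return.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: slices one page out of a ranked result list. */
final class Pager {
    private Pager() {}

    /**
     * Returns the hits for the given zero-based page, or an empty list
     * when the page lies past the end of the results.
     */
    static <T> List<T> page(List<T> rankedHits, int pageIndex, int pageSize) {
        if (pageIndex < 0 || pageSize <= 0) {
            throw new IllegalArgumentException("pageIndex must be >= 0 and pageSize > 0");
        }
        // Use long arithmetic so a large page index cannot overflow.
        long start = (long) pageIndex * pageSize;
        int from = (int) Math.min(start, rankedHits.size());
        int to = (int) Math.min(start + pageSize, rankedHits.size());
        // Copy the slice so the caller does not hold a view of the full list.
        return new ArrayList<T>(rankedHits.subList(from, to));
    }
}
```

With this approach a session holds no reader at all, so closing and reopening the shared reader does not invalidate anything stored in the session.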

Erick


On 3/5/07, Mohammad Norouzi <mn...@gmail.com> wrote:
>
> Hi Erick
> I am completely confused about this IndexReader.
> In my case, I have to keep the reader open because of pagination of the
> results, so I have to have a reader per session. The thing that baffles me is:
> can a single reader serve all the sessions at the same time?
>
> I mean
> 1- having one reader for all sessions and having a Hits for each session.
> 2- one reader per session.
> which one is right?
>

Re: IndexSearcher cache

Posted by Mohammad Norouzi <mn...@gmail.com>.
Hi Erick
I am completely confused about this IndexReader.
In my case, I have to keep the reader open because of pagination of the
results, so I have to have a reader per session. The thing that baffles me is:
can a single reader serve all the sessions at the same time?

I mean
1- having one reader for all sessions and having a Hits for each session.
2- one reader per session.
which one is right?

On 3/5/07, Erick Erickson <er...@gmail.com> wrote:
>
> There was quite a long discussion thread on this topic relatively
> recently. Try searching the archive for concurrency, perhaps
> IndexReader, etc.
>
> The short take-away is that you should share a single instance
> of the reader, since opening one is an expensive operation, and
> the first searches you perform will incur some overhead while
> underlying caches are built. If the reader is shared, you have to
> build in some mechanism for gracefully shutting it down when you
> need to re-open it.
>
> Erick



-- 
Regards,
Mohammad

Re: IndexSearcher cache

Posted by Erick Erickson <er...@gmail.com>.
There was quite a long discussion thread on this topic relatively
recently. Try searching the archive for concurrency, perhaps
IndexReader, etc.

The short take-away is that you should share a single instance
of the reader, since opening one is an expensive operation, and
the first searches you perform will incur some overhead while
underlying caches are built. If the reader is shared, you have to
build in some mechanism for gracefully shutting it down when you
need to re-open it.
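
One possible mechanism for the graceful shutdown described above is reference counting: searches borrow the shared instance, and a replaced instance is closed only after its last borrower returns it. This is a minimal sketch under that assumption. `SharedSearcherManager` and `CountedSearcher` are hypothetical names, not Lucene classes, and the type parameter `S` stands in for whatever searcher or reader type is being shared.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

/** A shared searcher plus a count of the parties currently holding it. */
final class CountedSearcher<S extends Closeable> {
    final S searcher;
    private int users = 1; // the manager itself holds the first reference

    CountedSearcher(S searcher) {
        this.searcher = searcher;
    }

    synchronized void retain() {
        users++;
    }

    /** Gives back one reference and closes the searcher when none are left. */
    synchronized void release() {
        users--;
        if (users == 0) {
            try {
                searcher.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
    }
}

/** Hands out the current searcher and swaps in a fresh one on request. */
final class SharedSearcherManager<S extends Closeable> {
    private CountedSearcher<S> current;

    SharedSearcherManager(S initial) {
        current = new CountedSearcher<S>(initial);
    }

    /** Borrows the current searcher; the caller must call release() when done. */
    synchronized CountedSearcher<S> acquire() {
        current.retain();
        return current;
    }

    /** Installs a freshly opened searcher; the old one closes once it is idle. */
    synchronized void swap(S fresh) {
        CountedSearcher<S> old = current;
        current = new CountedSearcher<S>(fresh);
        old.release(); // drops the manager's own reference to the old searcher
    }
}
```

Each search would call `acquire()`, run its query, and call `release()` in a `finally` block, while the indexing side calls `swap()` with a newly opened searcher after a batch of updates. Searches already in flight finish against the old instance.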

Erick


On 3/4/07, MC Moisei <mc...@comcast.net> wrote:
>
> Daniel,
>
> Thanks for replying. The only reason I don't close my indexSearcher is
> that I got the "Too Many File Open Exception", so I decided to make my
> searcher static in the SearchService.
>
> If I close it and then open a new one, I may expose myself to some
> concurrent access issues: while some people update their files, others may
> be searching for those files (my searchables are file entities).
>
> Since the IndexSearcher has an underlying IndexReader, I should use that
> one to handle the reads in the Interceptor, but that won't do me much
> good since I need certain methods that are not in IndexReader (and BTW,
> I thought I had mentioned that I use the IndexModifier, which is a mixture
> of the IndexReader and IndexWriter).
>
> Any ideas ?
>
> MC

Re: IndexSearcher cache

Posted by MC Moisei <mc...@comcast.net>.
Daniel,

Thanks for replying. The only reason I don't close my indexSearcher is
that I got the "Too Many File Open Exception", so I decided to make my
searcher static in the SearchService.

If I close it and then open a new one, I may expose myself to some
concurrent access issues: while some people update their files, others may
be searching for those files (my searchables are file entities).

Since the IndexSearcher has an underlying IndexReader, I should use that
one to handle the reads in the Interceptor, but that won't do me much
good since I need certain methods that are not in IndexReader (and BTW,
I thought I had mentioned that I use the IndexModifier, which is a mixture
of the IndexReader and IndexWriter).

Any ideas ?

MC





Re: IndexSearcher cache

Posted by Daniel Noll <da...@nuix.com>.
MC Moisei wrote:
> Hi to all members of the user group!
> 
> Let me get to my problem. I use Lucene in two different parts of the
> application. One is the SearchService and the other is an AOP interceptor that
> intercepts any changes to the Searcheable entities. The interceptor
> removes the document from the index and adds it again.
> 
> That being said, here's my test case.
> 
> My searchable item has the content "apple banana". If I search for apple
> or banana, I get it back among the results.
> If I modify it and remove banana from the content, then search for apple or
> banana, I get the same results as above (!?)
> If I restart my application so the IndexSearcher is recreated and run the
> test above again, I only get my document if I search for apple. That leads me
> to conclude that the IndexSearcher caches the results.

It doesn't "cache" the results. What happens is that the underlying
IndexReader effectively sees no changes for its lifetime (presumably
for safety, since you usually don't want the index changing
underneath you while you are running queries).
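
The point-in-time behaviour can be pictured with a toy model that replays the apple/banana test case. This is only an illustration: `ToyIndex` and `ToyReader` are hypothetical classes, and the copy made in `openReader()` merely models the fixed view a reader has; it is not a claim about how Lucene stores or reads its index.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy stand-in for an index: a mutable list of document texts. */
final class ToyIndex {
    private final List<String> docs = new ArrayList<String>();

    synchronized void add(String text) {
        docs.add(text);
    }

    synchronized void remove(String text) {
        docs.remove(text);
    }

    /** The reader gets a copy of the current contents, so later writes are invisible to it. */
    synchronized ToyReader openReader() {
        return new ToyReader(new ArrayList<String>(docs));
    }
}

/** Toy reader: searches only the contents that existed when it was opened. */
final class ToyReader {
    private final List<String> snapshot;

    ToyReader(List<String> snapshot) {
        this.snapshot = snapshot;
    }

    /** Counts the documents whose text contains the given word. */
    int count(String word) {
        int matches = 0;
        for (String doc : snapshot) {
            if (doc.contains(word)) {
                matches++;
            }
        }
        return matches;
    }
}
```

A reader opened before the document is changed from "apple banana" to "apple" still finds it under "banana"; only a reader opened afterwards reflects the update, which matches what the restart showed.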

> Is there a way to clear the IndexSearcher when I do the reindexing (I use
> IndexModifier for the AOP interceptor)?

Yep.  Close the IndexSearcher and the IndexReader, and reopen them.

If you need to do queries quite often then it probably makes sense to do 
this only every now and then.
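
A minimal sketch of "close and reopen, but only every now and then", assuming the indexing side can flag the index as changed. `ThrottledSearcherHolder` and `SearcherOpener` are hypothetical names, the opener stands in for whatever code constructs the real searcher, and the clock value is passed in explicitly to keep the sketch easy to follow. The old searcher is closed immediately on reopen, so this version assumes no other thread is in the middle of a search at that moment.

```java
import java.io.Closeable;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical factory that opens a fresh searcher over the current index. */
interface SearcherOpener<S extends Closeable> {
    S open() throws IOException;
}

/** Holds one shared searcher and reopens it at most once per interval. */
final class ThrottledSearcherHolder<S extends Closeable> {
    private final SearcherOpener<S> opener;
    private final long minIntervalMillis;
    private S current;
    private long lastOpenedMillis;
    private boolean dirty;

    ThrottledSearcherHolder(SearcherOpener<S> opener, long minIntervalMillis, long nowMillis) {
        this.opener = opener;
        this.minIntervalMillis = minIntervalMillis;
        this.current = openUnchecked();
        this.lastOpenedMillis = nowMillis;
    }

    /** Called by the indexing side after it has changed the index. */
    synchronized void markDirty() {
        dirty = true;
    }

    /** Returns the shared searcher, reopening it first if it is stale and old enough. */
    synchronized S get(long nowMillis) {
        if (dirty && nowMillis - lastOpenedMillis >= minIntervalMillis) {
            S old = current;
            current = openUnchecked();
            lastOpenedMillis = nowMillis;
            dirty = false;
            try {
                old.close();
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }
        return current;
    }

    private S openUnchecked() {
        try {
            return opener.open();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The interceptor would call `markDirty()` after each reindex, and the search service would call `get(System.currentTimeMillis())` before each query. Updates then become visible after at most one interval plus the time until the next search.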

Daniel


-- 
Daniel Noll

Nuix Pty Ltd
Suite 79, 89 Jones St, Ultimo NSW 2007, Australia    Ph: +61 2 9280 0699
Web: http://nuix.com/                               Fax: +61 2 9212 6902
