Posted to java-user@lucene.apache.org by Scott Smith <ss...@mainstreamdata.com> on 2013/03/15 23:15:28 UTC

Lucene slow performance

We have a system that uses Lucene, and searches are very slow.  The number of documents is fairly small (fewer than 30,000), and each document is typically only 2 to 10 kilo-characters.  Yet searches are taking 15-16 seconds.

One of the things I noticed is that the index directory has several thousand (3000+) .cfs files.  We optimize the index once per day.  This system probably gets several thousand document deletes and additions per day (spread out across the day).

Any thoughts?  We didn't really notice this until we went to 4.x.

Scott



RE: Lucene slow performance

Posted by Scott Smith <ss...@mainstreamdata.com>.
Here's the code for the writer:

        IndexWriterConfig iwc = new IndexWriterConfig(Constants.LUCENE_VERSION, _analyzer);
        LogByteSizeMergePolicy lbsm = new LogByteSizeMergePolicy();
        lbsm.setUseCompoundFile(true);
        iwc.setMergePolicy(lbsm);
        Directory fsDir = FSDirectory.open(new File(_IndexDirectory));
        writer = new IndexWriter(fsDir, iwc);

I don't use NRT.  I share an IndexReader among the web sessions.  However, if the index has been updated since the last time I used it, I will go get another reader.  For this particular server, the documents are getting updated very frequently.  It would not be "strange" (from the customer's perspective) if a document received today was updated 10-20 times before the end of the day and we probably get 2k-3k documents per day.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Lucene slow performance

Posted by Simon Willnauer <si...@gmail.com>.
Can you tell us a little more about how you use Lucene? How do you
index? Do you use NRT, or do you open an IndexReader for every request?
Do you use a custom merge policy or something like that, or any
special IndexWriter settings?

On Fri, Mar 15, 2013 at 11:15 PM, Scott Smith <ss...@mainstreamdata.com> wrote:
> We have a system that is using lucene and the searches are very slow.  The number of documents is fairly small (less than 30,000) and each document is typically only 2 to 10 kilo-characters.  Yet, searches are taking 15-16 seconds.
>
> One of the things I noticed was that the index directory has several thousand (3000+) .cfs files.  We do optimize the index once per day.  This is a system that probably gets several thousand document deletes and additions per day (spread out across the day).
>
> Any thoughts.  We didn't really notice this until we went to 4.x.

What exactly didn't you notice before: the slowness or the CFS files?

simon


RE: Lucene slow performance

Posted by Scott Smith <ss...@mainstreamdata.com>.
To answer your first question: "good guess" :-). Yes, this is running on Windows.  Sorry, I should have mentioned this.

Your second point was very interesting.  My assumption was that the IndexReader would get closed when the garbage collector realized that these objects were no longer being used.  I use openIfChanged() to get the new IndexReader, but I don't call close() on the previous reader.


RE: Lucene slow performance

Posted by Uwe Schindler <uw...@thetaphi.de>.
OK, your configuration seems fine. I have the following ideas:
- Are you using Windows? If yes, then IndexWriter cannot remove unused files while they are still in use (e.g. held by an open IndexReader).
- When you get a new IndexReader after changes to the index, do you close the old ones? If not, the above will prevent IndexWriter from removing older cfs files. They are no longer used by the index, but they linger in the filesystem. Because the older IndexReaders stay open forever (if you forgot to close them), IndexWriter tries several times to delete the files but never succeeds. On Unix/Linux, open files can be deleted; on Windows they cannot.

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de
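
The point about closing old readers can be sketched in plain Java. This is only an illustration under assumptions: RefreshableReader and SharedReaderHolder are hypothetical names that do not exist in Lucene. In Lucene 4.x the corresponding calls would be DirectoryReader.openIfChanged(oldReader), which returns null when nothing has changed, and close() on the superseded reader.

```java
import java.io.Closeable;
import java.io.IOException;

/** Hypothetical stand-in for an index reader (DirectoryReader plays this role in Lucene 4.x). */
interface RefreshableReader extends Closeable {
    /** Returns a new reader if the index changed, or null if this reader is still current. */
    RefreshableReader openIfChanged() throws IOException;
}

/** Hypothetical holder for the shared reader; closes the old reader whenever a newer one replaces it. */
final class SharedReaderHolder {
    private RefreshableReader current;

    SharedReaderHolder(RefreshableReader initial) {
        this.current = initial;
    }

    /** Returns an up-to-date reader, closing the superseded one if a refresh happened. */
    synchronized RefreshableReader get() throws IOException {
        RefreshableReader newer = current.openIfChanged();
        if (newer != null) {
            RefreshableReader old = current;
            current = newer;
            // Without this close(), the old reader keeps its files open,
            // and on Windows those files cannot be deleted.
            old.close();
        }
        return current;
    }
}
```

If other threads may still be searching with the old reader, closing it immediately is not safe. Lucene's SearcherManager (acquire(), release(), maybeRefresh()) reference-counts readers for that situation.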


RE: Lucene slow performance

Posted by Scott Smith <ss...@mainstreamdata.com>.
Thanks for the help.

The reindex was done this morning and searches now take less than a second.

I will make the change to the code.

Cheers

Scott


RE: Lucene slow performance

Posted by Uwe Schindler <uw...@thetaphi.de>.
Please call forceMerge only once, not every time (only to clean up your index)! If you are already doing a reindex, just fix your close logic as discussed before.



--
Uwe Schindler
H.-H.-Meier-Allee 63, 28213 Bremen
http://www.thetaphi.de

RE: Lucene slow performance

Posted by Scott Smith <ss...@mainstreamdata.com>.
Unfortunately, this is a production system which I can't touch (though I was able to get a full reindex scheduled for tomorrow morning).  

Are you suggesting that I do:

writer.forceMerge(1);
writer.close();

instead of just doing the close()?


Re: Lucene slow performance

Posted by Simon Willnauer <si...@gmail.com>.
On Sat, Mar 16, 2013 at 12:02 AM, Scott Smith <ss...@mainstreamdata.com> wrote:
> " Do you always close IndexWriter after adding few documents and when closing, disable "wait for merge"? In that case, all merges are interrupted and the merge policy never has a chance to merge at all (because you are opening and closing IndexWriter all the time with cancelling all merges)?"
>
> Frankly I don't quite understand what this means.  When I "close" the indexwriter, I simply call close().  Is that the wrong thing?
That should be fine...

This sounds very odd, though. Do you see files actually getting removed
/ merged if you call IndexWriter#forceMerge(1)?

simon


RE: Lucene slow performance

Posted by Scott Smith <ss...@mainstreamdata.com>.
"Do you always close IndexWriter after adding a few documents and, when closing, disable "wait for merge"? In that case, all merges are interrupted and the merge policy never has a chance to merge at all (because you are opening and closing IndexWriter all the time and cancelling all merges)?"

Frankly, I don't quite understand what this means.  When I "close" the IndexWriter, I simply call close().  Is that the wrong thing to do?

Thanks

Scott


RE: Lucene slow performance

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi,

With the standard configuration, this cannot happen. What merge policy do you use? This looks to me like a misconfigured merge policy or use of NoMergePolicy. With 3,000 segments, searches will be slow; the question is, why do you get so many?

Another possibility: do you always close the IndexWriter after adding a few documents and, when closing, disable "wait for merge"? In that case, all merges are interrupted and the merge policy never gets a chance to merge at all (because you are opening and closing the IndexWriter all the time and cancelling all merges).

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


RE: Lucene slow performance

Posted by Scott Smith <ss...@mainstreamdata.com>.
A little more data: of the 3330 files in the index, 2173 are CFS files averaging 120k.  Another 1116 files are .del files averaging about 4kB.  The remaining 41 files (.prx, .frq, etc.) total only 101MB.  The largest are 3 .prx files, which total less than 60MB, and 2 .frq files of about 10MB each.

I also noticed that some of the .cfs and .del files date back to July of last year (probably the last time we did a full reindex of the system).  I would have thought that running an optimization (which we do daily) would have gotten rid of them.  I know optimization has changed since 1.4, but does it not merge all of the various files into a few files?
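
A breakdown like the one above can be gathered with a few lines of plain Java. The following is a minimal sketch, not part of Lucene: the class name IndexDirStats and the "(none)" label for files without an extension (such as segments_N) are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: counts the files in an index directory by extension. */
final class IndexDirStats {
    /** Maps each extension (text after the last '.') to the number of regular files that have it. */
    static Map<String, Integer> countByExtension(Path dir) throws IOException {
        Map<String, Integer> counts = new TreeMap<>();
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue; // skip subdirectories and other non-files
                }
                String name = file.getFileName().toString();
                int dot = name.lastIndexOf('.');
                String ext = dot >= 0 ? name.substring(dot + 1) : "(none)";
                counts.merge(ext, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) throws IOException {
        // Pass the index directory as the first argument.
        countByExtension(java.nio.file.Paths.get(args[0]))
            .forEach((ext, n) -> System.out.println(ext + "\t" + n));
    }
}
```

A count of .cfs files that keeps growing between runs would be consistent with old segment files not being deleted.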
