Posted to dev@lucene.apache.org by Doug Cutting <DC...@grandcentral.com> on 2001/12/03 17:49:03 UTC

RE: Query performance with DateFilter

I have a guess about what the problem is.  Lucene used to do a better job of
re-using TermFreq input streams.  I've attached new versions of a few files
which should restore the earlier behavior.  Try running with these.

This isn't actually a very good fix, since it uses a single element cache
(as was done before).  For example, performance will suffer again if more
than one thread uses a DateFilter at the same time.  A scalable fix would
not be much harder to implement.  So if this fixes your problem, I will
check in the more scalable version.
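
To illustrate the limitation described above, here is a minimal sketch of
a single-element cache. It is not the attached code: StreamCacheSketch,
FakeStream and SingleSlotCache are hypothetical names, and FakeStream only
stands in for an input stream that is costly to open. The overlapping case
is simulated in one thread by holding two streams at once, which is what
two concurrent users of the cache would do.

```java
public class StreamCacheSketch {

    /** Hypothetical stand-in for an index input stream that is costly to open. */
    static final class FakeStream { }

    /** Single-element cache: at most one idle stream is kept for reuse. */
    static final class SingleSlotCache {
        private FakeStream slot;   // the one cached idle stream, or null
        private int opens;         // number of streams that had to be opened

        synchronized FakeStream acquire() {
            FakeStream s = slot;
            slot = null;                       // the slot is now empty
            if (s == null) {                   // cache miss: open a new stream
                s = new FakeStream();
                opens++;
            }
            return s;
        }

        synchronized void release(FakeStream s) {
            slot = s;                          // overwrites any stream already idle
        }

        synchronized int opens() {
            return opens;
        }
    }

    /** One user at a time: every acquire after the first hits the cache. */
    public static int opensForSequentialUse(int rounds) {
        SingleSlotCache cache = new SingleSlotCache();
        for (int i = 0; i < rounds; i++) {
            FakeStream s = cache.acquire();
            cache.release(s);
        }
        return cache.opens();
    }

    /** Two users holding streams at once: the second acquire always misses. */
    public static int opensForOverlappingUse(int rounds) {
        SingleSlotCache cache = new SingleSlotCache();
        for (int i = 0; i < rounds; i++) {
            FakeStream a = cache.acquire();
            FakeStream b = cache.acquire();    // slot is empty here, so this opens
            cache.release(a);
            cache.release(b);                  // replaces a; a is dropped
        }
        return cache.opens();
    }

    public static void main(String[] args) {
        System.out.println("sequential opens:  " + opensForSequentialUse(1000));
        System.out.println("overlapping opens: " + opensForOverlappingUse(1000));
    }
}
```

With sequential use, one stream is opened and then reused for every round.
With overlapping use, a new stream is opened in every round, because the
single slot can only ever hand back one of the two streams.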

Doug

> -----Original Message-----
> From: Scott Stanley [mailto:sastanley3@yahoo.com]
> Sent: Friday, November 30, 2001 2:58 PM
> To: lucene-dev
> Subject: Query performance with DateFilter
> 
> 
> I have found that searching with date filtering is much slower since
> shifting from Lucene 1.1b to Lucene 1.2 rc2 (basically from com.lucene
> to org.apache.lucene).
>  
> With 1.1b, search time was : 700ms
> With 1.2rc2 : 11,000 ms!
> (15 times slower)
> (with  50,000 files indexed)
>  
> However, searching  with no filtering seems to be a bit faster with
> 1.2rc2.
>  
> To be sure  that the DateFilter was responsible for the performance
> hit, I tested this:
>  
>     DateFilter df = new DateFilter("DOC_DATE", 1000087883595L,
>                                    1009087883595L);
>     BitSet bs = df.bits(IndexReader.open("/index"));
>  
> With Lucene 1.1b : 668 ms
> With Lucene 1.2 rc2 : 9000 ms
>  
> Running this under JProbe, I noticed that the performance difference
> was coming from the call to SegmentTermDocs.next().  This method call
> seems to be much slower because InputStream.readByte() is slower...
>  
> I noticed that InputStream.refill() and InputStream.readInternal() take
> much more time.  I finally narrowed it down to
> RandomAccessFile.read(byte[], int, int), which is called around 50 times
> more often in 1.2 RC2 than in the earlier version.
>  
> Is there an issue with the way FSDirectory handles buffering of the
> bytes read from the index files?  Is all of this related to the thread
> safety fix?  I guess the bottom line is: is there anything we can do
> to bring the performance back up with the DateFilter?
> 
> Scott
> 
> --
> To unsubscribe, e-mail:   <ma...@jakarta.apache.org>
> For additional commands, e-mail: <ma...@jakarta.apache.org>


Problem searching for accent characters

Posted by "Kiran Kumar K.G" <ki...@net-kraft.com>.
I am trying to search for words that start with accented characters.
I am able to search for a word that starts with á, but I am unable to search
for words that start with à.

Can someone help me?

Regards,
Kiran


