You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Greg Gershman <gr...@yahoo.com> on 2004/07/21 15:22:13 UTC

Sort: 1.4-rc3 vs. 1.4-final

When rc3 came out, I modified the classes used for
Sorting to, in addition to Integer, Float and
String-based sort keys, use Long values.  All I did
was add extra statements in 2 classes (SortField and
FieldSortedHitQueue) that made a special case for
longs, and created a LongSortedHitQueue identical to
the IntegerSortedHitQueue, only using longs.  

This worked as expected; Long values converted to
strings and stored in Field.Keyword type fields would
be sorted according to Long order.  The initial query
would take a while, to build the sorted array, but
subsequent queries would take little to no time at
all.

I went back to look at 1.4 final, and noticed the Sort
implementation has changed quite a bit.  I tried the
same type of modifications to the existing source
files, but was unable to achieve similiar results. 
Each subsequent query seems to take a significant
amount of time, as if the Sorted array is being
rebuilt each time.  Also, I tried sorting on an
Integer fields and got similar results, which leads me
to believe there might be a caching problem somewhere.

Has anyone else seen this in 1.4-final?  Also, I would
like it if Long sorted fields could become a part of
the API; it makes sorting by date a breeze.

Thanks!

Greg Gershman


		
__________________________________
Do you Yahoo!?
New and Improved Yahoo! Mail - Send 10MB messages!
http://promotions.yahoo.com/new_mail 

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Slightly off topic, I need to have luke use my Analyzer

Posted by Rob Jose <rj...@comcast.net>.
Sorry for the slightly off topic post, but I have a need to use luke with my
Analyzer.  Has anyone done this?  I have added a jar file to my classpath,
but that didn't help.

Thanks in advance
Rob


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


RE: Sort: 1.4-rc3 vs. 1.4-final

Posted by Aviran <am...@infosciences.com>.
I will post a patch soon

Aviran

-----Original Message-----
From: Doug Cutting [mailto:cutting@apache.org] 
Sent: Wednesday, July 21, 2004 13:56 PM
To: Lucene Users List
Subject: Re: Sort: 1.4-rc3 vs. 1.4-final


The key in the WeakHashMap should be the IndexReader, not the Entry.  I 
think this should become a two-level cache, a WeakHashMap of HashMaps, 
the WeakHashMap keyed by IndexReader, the HashMap keyed by Entry.  I 
think the Entry class can also be changed to not include an IndexReader 
field.  Does this make sense?  Would someone like to construct a patch 
and submit it to the developer list?

Doug

Aviran wrote:
> I think I found the problem
> FieldCacheImpl uses WeakHashMap to store the cached objects, but since 
> there is no other reference to this cache it is getting released. 
> Switching to HashMap solves it. The only problem is that I don't see 
> anywhere where the cached object will get released if you open a new 
> IndexReader.
> 
> Aviran
> 
> -----Original Message-----
> From: Greg Gershman [mailto:greggersh@yahoo.com]
> Sent: Wednesday, July 21, 2004 13:13 PM
> To: Lucene Users List
> Subject: RE: Sort: 1.4-rc3 vs. 1.4-final
> 
> 
> I've done a bit more snooping around; it seems that in 
> FieldSortedHitQueue.getCachedComparator(line 153), calls to lookup a 
> stored comparator in the cache always return null.  This occurs even 
> for the built-in sort types (I tested it on integers and my code for 
> longs).  The comparators don't even appear to be being stored in the 
> HashMap to begin with.
> 
> Any ideas?
> 
> Greg
> 
>  
> 
> --- Aviran <am...@infosciences.com> wrote:
> 
>>Since I had to implement sorting in lucene 1.2 I had
>>to write my own sorting
>>using something similar to a lucene's contribution
>>called SortField.
>>Yesterday I did some tests, trying to use lucene 1.4
>>Sort objects and I
>>realized that my old implementation works 40% faster
>>then Lucene's
>>implementation. My guess is that you are right and
>>there is a problem with
>>the cache although I couldn't find what that is yet.
>>
>>Aviran
>>
>>-----Original Message-----
>>From: Greg Gershman [mailto:greggersh@yahoo.com]
>>Sent: Wednesday, July 21, 2004 9:22 AM
>>To: lucene-user@jakarta.apache.org
>>Subject: Sort: 1.4-rc3 vs. 1.4-final
>>
>>
>>When rc3 came out, I modified the classes used for
>>Sorting to, in addition to Integer, Float and
>>String-based sort keys, use Long values.  All I did
>>was add extra statements in 2 classes (SortField and
>>FieldSortedHitQueue) that made a special case for
>>longs, and created a LongSortedHitQueue identical to
>>the IntegerSortedHitQueue, only using longs.
>>
>>This worked as expected; Long values converted to
>>strings and stored in Field.Keyword type fields
>>would
>>be sorted according to Long order.  The initial
>>query
>>would take a while, to build the sorted array, but
>>subsequent queries would take little to no time at
>>all.
>>
>>I went back to look at 1.4 final, and noticed the
>>Sort implementation has
>>changed quite a bit.  I tried the same type of
>>modifications to the existing
>>source files, but was unable to achieve similiar
>>results.
>>Each subsequent query seems to take a significant
>>amount of time, as if the Sorted array is being
>>rebuilt each time.  Also, I tried sorting on an
>>Integer fields and got similar results, which leads
>>me
>>to believe there might be a caching problem
>>somewhere.
>>
>>Has anyone else seen this in 1.4-final?  Also, I
>>would
>>like it if Long sorted fields could become a part of
>>the API; it makes sorting by date a breeze.
>>
>>Thanks!
>>
>>Greg Gershman
>>
>>
>>		
>>__________________________________
>>Do you Yahoo!?
>>New and Improved Yahoo! Mail - Send 10MB messages! 
>>http://promotions.yahoo.com/new_mail
>>
>>
> 
> ---------------------------------------------------------------------
> 
>>To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>>For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>>
>>
>>
>>
>>
> 
> ---------------------------------------------------------------------
> 
>>To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>>For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>>
>>
> 
> 
> 
> 
> 	
> 		
> __________________________________
> Do you Yahoo!?
> Vote for the stars of Yahoo!'s next ad campaign! 
> http://advision.webevents.yahoo.com/yahoo/votelifeengine/
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org




---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Re: Sort: 1.4-rc3 vs. 1.4-final

Posted by Doug Cutting <cu...@apache.org>.
The key in the WeakHashMap should be the IndexReader, not the Entry.  I 
think this should become a two-level cache, a WeakHashMap of HashMaps, 
the WeakHashMap keyed by IndexReader, the HashMap keyed by Entry.  I 
think the Entry class can also be changed to not include an IndexReader 
field.  Does this make sense?  Would someone like to construct a patch 
and submit it to the developer list?

Doug

Aviran wrote:
> I think I found the problem
> FieldCacheImpl uses WeakHashMap to store the cached objects, but since there
> is no other reference to this cache it is getting released.
> Switching to HashMap solves it.
> The only problem is that I don't see anywhere where the cached object will
> get released if you open a new IndexReader.
> 
> Aviran
> 
> -----Original Message-----
> From: Greg Gershman [mailto:greggersh@yahoo.com] 
> Sent: Wednesday, July 21, 2004 13:13 PM
> To: Lucene Users List
> Subject: RE: Sort: 1.4-rc3 vs. 1.4-final
> 
> 
> I've done a bit more snooping around; it seems that in
> FieldSortedHitQueue.getCachedComparator(line 153), calls to lookup a stored
> comparator in the cache always return null.  This occurs even for the
> built-in sort types (I tested it on integers and my code for longs).  The
> comparators don't even appear to be being stored in the HashMap to begin
> with.
> 
> Any ideas?
> 
> Greg
> 
>  
> 
> --- Aviran <am...@infosciences.com> wrote:
> 
>>Since I had to implement sorting in lucene 1.2 I had
>>to write my own sorting
>>using something similar to a lucene's contribution
>>called SortField.
>>Yesterday I did some tests, trying to use lucene 1.4
>>Sort objects and I
>>realized that my old implementation works 40% faster
>>then Lucene's
>>implementation. My guess is that you are right and
>>there is a problem with
>>the cache although I couldn't find what that is yet.
>>
>>Aviran
>>
>>-----Original Message-----
>>From: Greg Gershman [mailto:greggersh@yahoo.com]
>>Sent: Wednesday, July 21, 2004 9:22 AM
>>To: lucene-user@jakarta.apache.org
>>Subject: Sort: 1.4-rc3 vs. 1.4-final
>>
>>
>>When rc3 came out, I modified the classes used for
>>Sorting to, in addition to Integer, Float and
>>String-based sort keys, use Long values.  All I did
>>was add extra statements in 2 classes (SortField and
>>FieldSortedHitQueue) that made a special case for
>>longs, and created a LongSortedHitQueue identical to
>>the IntegerSortedHitQueue, only using longs.
>>
>>This worked as expected; Long values converted to
>>strings and stored in Field.Keyword type fields
>>would
>>be sorted according to Long order.  The initial
>>query
>>would take a while, to build the sorted array, but
>>subsequent queries would take little to no time at
>>all.
>>
>>I went back to look at 1.4 final, and noticed the
>>Sort implementation has
>>changed quite a bit.  I tried the same type of
>>modifications to the existing
>>source files, but was unable to achieve similiar
>>results.
>>Each subsequent query seems to take a significant
>>amount of time, as if the Sorted array is being
>>rebuilt each time.  Also, I tried sorting on an
>>Integer fields and got similar results, which leads
>>me
>>to believe there might be a caching problem
>>somewhere.
>>
>>Has anyone else seen this in 1.4-final?  Also, I
>>would
>>like it if Long sorted fields could become a part of
>>the API; it makes sorting by date a breeze.
>>
>>Thanks!
>>
>>Greg Gershman
>>
>>
>>		
>>__________________________________
>>Do you Yahoo!?
>>New and Improved Yahoo! Mail - Send 10MB messages!
>>http://promotions.yahoo.com/new_mail
>>
>>
> 
> ---------------------------------------------------------------------
> 
>>To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>>For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>>
>>
>>
>>
>>
> 
> ---------------------------------------------------------------------
> 
>>To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
>>For additional commands, e-mail: lucene-user-help@jakarta.apache.org
>>
>>
> 
> 
> 
> 
> 	
> 		
> __________________________________
> Do you Yahoo!?
> Vote for the stars of Yahoo!'s next ad campaign!
> http://advision.webevents.yahoo.com/yahoo/votelifeengine/
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


RE: Sort: 1.4-rc3 vs. 1.4-final

Posted by Aviran <am...@infosciences.com>.
I think I found the problem
FieldCacheImpl uses WeakHashMap to store the cached objects, but since there
is no other reference to this cache it is getting released.
Switching to HashMap solves it.
The only problem is that I don't see anywhere where the cached object will
get released if you open a new IndexReader.

Aviran

-----Original Message-----
From: Greg Gershman [mailto:greggersh@yahoo.com] 
Sent: Wednesday, July 21, 2004 13:13 PM
To: Lucene Users List
Subject: RE: Sort: 1.4-rc3 vs. 1.4-final


I've done a bit more snooping around; it seems that in
FieldSortedHitQueue.getCachedComparator(line 153), calls to lookup a stored
comparator in the cache always return null.  This occurs even for the
built-in sort types (I tested it on integers and my code for longs).  The
comparators don't even appear to be being stored in the HashMap to begin
with.

Any ideas?

Greg

 

--- Aviran <am...@infosciences.com> wrote:
> Since I had to implement sorting in lucene 1.2 I had
> to write my own sorting
> using something similar to a lucene's contribution
> called SortField.
> Yesterday I did some tests, trying to use lucene 1.4
> Sort objects and I
> realized that my old implementation works 40% faster
> then Lucene's
> implementation. My guess is that you are right and
> there is a problem with
> the cache although I couldn't find what that is yet.
> 
> Aviran
> 
> -----Original Message-----
> From: Greg Gershman [mailto:greggersh@yahoo.com]
> Sent: Wednesday, July 21, 2004 9:22 AM
> To: lucene-user@jakarta.apache.org
> Subject: Sort: 1.4-rc3 vs. 1.4-final
> 
> 
> When rc3 came out, I modified the classes used for
> Sorting to, in addition to Integer, Float and
> String-based sort keys, use Long values.  All I did
> was add extra statements in 2 classes (SortField and
> FieldSortedHitQueue) that made a special case for
> longs, and created a LongSortedHitQueue identical to
> the IntegerSortedHitQueue, only using longs.
> 
> This worked as expected; Long values converted to
> strings and stored in Field.Keyword type fields
> would
> be sorted according to Long order.  The initial
> query
> would take a while, to build the sorted array, but
> subsequent queries would take little to no time at
> all.
> 
> I went back to look at 1.4 final, and noticed the
> Sort implementation has
> changed quite a bit.  I tried the same type of
> modifications to the existing
> source files, but was unable to achieve similiar
> results.
> Each subsequent query seems to take a significant
> amount of time, as if the Sorted array is being
> rebuilt each time.  Also, I tried sorting on an
> Integer fields and got similar results, which leads
> me
> to believe there might be a caching problem
> somewhere.
> 
> Has anyone else seen this in 1.4-final?  Also, I
> would
> like it if Long sorted fields could become a part of
> the API; it makes sorting by date a breeze.
> 
> Thanks!
> 
> Greg Gershman
> 
> 
> 		
> __________________________________
> Do you Yahoo!?
> New and Improved Yahoo! Mail - Send 10MB messages!
> http://promotions.yahoo.com/new_mail
> 
>
---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 
> 
> 
> 
>
---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-user-help@jakarta.apache.org
> 
> 



	
		
__________________________________
Do you Yahoo!?
Vote for the stars of Yahoo!'s next ad campaign!
http://advision.webevents.yahoo.com/yahoo/votelifeengine/

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org




---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


RE: Sort: 1.4-rc3 vs. 1.4-final

Posted by Aviran <am...@infosciences.com>.
I just saw this post, I guess we both came to the same conclusion. 
The only problem is that the cached object never gets released, and a new
one will get created every time you open a new IndexReader

Aviran

-----Original Message-----
From: Greg Gershman [mailto:greggersh@yahoo.com] 
Sent: Wednesday, July 21, 2004 13:30 PM
To: Lucene Users List
Subject: RE: Sort: 1.4-rc3 vs. 1.4-final


I switched the Comparators and FieldCache classes to
use java.util.HashMap instead of
java.util.WeakHashMap, and got the performance boost I
was looking for (test index of 100K documents; initial
search took 991 ms, all subsequent searchs took <
90ms.  Before, I was seeing initial query of ~1sec,
subsequent queries between 500 and 700 ms, with
comparator and field lookup table computed each time).

I guess the question is why use a WeakHashMap here as
opposed to a HashMap?

Greg

--- Greg Gershman <gr...@yahoo.com> wrote:
> I've done a bit more snooping around; it seems that
> in
> FieldSortedHitQueue.getCachedComparator(line 153),
> calls to lookup a stored comparator in the cache
> always return null.  This occurs even for the
> built-in
> sort types (I tested it on integers and my code for
> longs).  The comparators don't even appear to be
> being
> stored in the HashMap to begin with.
> 
> Any ideas?
> 
> Greg
> 
>  
> 
> --- Aviran <am...@infosciences.com> wrote:
> > Since I had to implement sorting in lucene 1.2 I
> had
> > to write my own sorting
> > using something similar to a lucene's contribution
> > called SortField.
> > Yesterday I did some tests, trying to use lucene
> 1.4
> > Sort objects and I
> > realized that my old implementation works 40%
> faster
> > then Lucene's
> > implementation. My guess is that you are right and
> > there is a problem with
> > the cache although I couldn't find what that is
> yet.
> > 
> > Aviran
> > 
> > -----Original Message-----
> > From: Greg Gershman [mailto:greggersh@yahoo.com]
> > Sent: Wednesday, July 21, 2004 9:22 AM
> > To: lucene-user@jakarta.apache.org
> > Subject: Sort: 1.4-rc3 vs. 1.4-final
> > 
> > 
> > When rc3 came out, I modified the classes used for
> > Sorting to, in addition to Integer, Float and
> > String-based sort keys, use Long values.  All I
> did
> > was add extra statements in 2 classes (SortField
> and
> > FieldSortedHitQueue) that made a special case for
> > longs, and created a LongSortedHitQueue identical
> to
> > the IntegerSortedHitQueue, only using longs.
> > 
> > This worked as expected; Long values converted to
> > strings and stored in Field.Keyword type fields
> > would
> > be sorted according to Long order.  The initial
> > query
> > would take a while, to build the sorted array, but subsequent 
> > queries would take little to no time at all.
> > 
> > I went back to look at 1.4 final, and noticed the
> > Sort implementation has
> > changed quite a bit.  I tried the same type of modifications to the 
> > existing source files, but was unable to achieve similiar
> > results. 
> > Each subsequent query seems to take a significant
> > amount of time, as if the Sorted array is being
> > rebuilt each time.  Also, I tried sorting on an
> > Integer fields and got similar results, which
> leads
> > me
> > to believe there might be a caching problem
> > somewhere.
> > 
> > Has anyone else seen this in 1.4-final?  Also, I
> > would
> > like it if Long sorted fields could become a part
> of
> > the API; it makes sorting by date a breeze.
> > 
> > Thanks!
> > 
> > Greg Gershman
> > 
> > 
> > 		
> > __________________________________
> > Do you Yahoo!?
> > New and Improved Yahoo! Mail - Send 10MB messages! 
> > http://promotions.yahoo.com/new_mail
> > 
> >
>
---------------------------------------------------------------------
> > To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> > For additional commands, e-mail:
> > lucene-user-help@jakarta.apache.org
> > 
> > 
> > 
> > 
> >
>
---------------------------------------------------------------------
> > To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> > For additional commands, e-mail:
> > lucene-user-help@jakarta.apache.org
> > 
> > 
> 
> 
> 
> 	
> 		
> __________________________________
> Do you Yahoo!?
> Vote for the stars of Yahoo!'s next ad campaign!
>
http://advision.webevents.yahoo.com/yahoo/votelifeengine/
> 
>
---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail:
> lucene-user-help@jakarta.apache.org
> 
> 



	
		
__________________________________
Do you Yahoo!?
Vote for the stars of Yahoo!'s next ad campaign!
http://advision.webevents.yahoo.com/yahoo/votelifeengine/

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org




---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


RE: Sort: 1.4-rc3 vs. 1.4-final

Posted by Greg Gershman <gr...@yahoo.com>.
I switched the Comparators and FieldCache classes to
use java.util.HashMap instead of
java.util.WeakHashMap, and got the performance boost I
was looking for (test index of 100K documents; initial
search took 991 ms, all subsequent searchs took <
90ms.  Before, I was seeing initial query of ~1sec,
subsequent queries between 500 and 700 ms, with
comparator and field lookup table computed each time).

I guess the question is why use a WeakHashMap here as
opposed to a HashMap?

Greg

--- Greg Gershman <gr...@yahoo.com> wrote:
> I've done a bit more snooping around; it seems that
> in
> FieldSortedHitQueue.getCachedComparator(line 153),
> calls to lookup a stored comparator in the cache
> always return null.  This occurs even for the
> built-in
> sort types (I tested it on integers and my code for
> longs).  The comparators don't even appear to be
> being
> stored in the HashMap to begin with.
> 
> Any ideas?
> 
> Greg
> 
>  
> 
> --- Aviran <am...@infosciences.com> wrote:
> > Since I had to implement sorting in lucene 1.2 I
> had
> > to write my own sorting
> > using something similar to a lucene's contribution
> > called SortField. 
> > Yesterday I did some tests, trying to use lucene
> 1.4
> > Sort objects and I
> > realized that my old implementation works 40%
> faster
> > then Lucene's
> > implementation. My guess is that you are right and
> > there is a problem with
> > the cache although I couldn't find what that is
> yet.
> > 
> > Aviran
> > 
> > -----Original Message-----
> > From: Greg Gershman [mailto:greggersh@yahoo.com] 
> > Sent: Wednesday, July 21, 2004 9:22 AM
> > To: lucene-user@jakarta.apache.org
> > Subject: Sort: 1.4-rc3 vs. 1.4-final
> > 
> > 
> > When rc3 came out, I modified the classes used for
> > Sorting to, in addition to Integer, Float and
> > String-based sort keys, use Long values.  All I
> did
> > was add extra statements in 2 classes (SortField
> and
> > FieldSortedHitQueue) that made a special case for
> > longs, and created a LongSortedHitQueue identical
> to
> > the IntegerSortedHitQueue, only using longs.  
> > 
> > This worked as expected; Long values converted to
> > strings and stored in Field.Keyword type fields
> > would
> > be sorted according to Long order.  The initial
> > query
> > would take a while, to build the sorted array, but
> > subsequent queries would take little to no time at
> > all.
> > 
> > I went back to look at 1.4 final, and noticed the
> > Sort implementation has
> > changed quite a bit.  I tried the same type of
> > modifications to the existing
> > source files, but was unable to achieve similiar
> > results. 
> > Each subsequent query seems to take a significant
> > amount of time, as if the Sorted array is being
> > rebuilt each time.  Also, I tried sorting on an
> > Integer fields and got similar results, which
> leads
> > me
> > to believe there might be a caching problem
> > somewhere.
> > 
> > Has anyone else seen this in 1.4-final?  Also, I
> > would
> > like it if Long sorted fields could become a part
> of
> > the API; it makes sorting by date a breeze.
> > 
> > Thanks!
> > 
> > Greg Gershman
> > 
> > 
> > 		
> > __________________________________
> > Do you Yahoo!?
> > New and Improved Yahoo! Mail - Send 10MB messages!
> > http://promotions.yahoo.com/new_mail 
> > 
> >
>
---------------------------------------------------------------------
> > To unsubscribe, e-mail:
> > lucene-user-unsubscribe@jakarta.apache.org
> > For additional commands, e-mail:
> > lucene-user-help@jakarta.apache.org
> > 
> > 
> > 
> > 
> >
>
---------------------------------------------------------------------
> > To unsubscribe, e-mail:
> > lucene-user-unsubscribe@jakarta.apache.org
> > For additional commands, e-mail:
> > lucene-user-help@jakarta.apache.org
> > 
> > 
> 
> 
> 
> 	
> 		
> __________________________________
> Do you Yahoo!?
> Vote for the stars of Yahoo!'s next ad campaign!
>
http://advision.webevents.yahoo.com/yahoo/votelifeengine/
> 
>
---------------------------------------------------------------------
> To unsubscribe, e-mail:
> lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail:
> lucene-user-help@jakarta.apache.org
> 
> 



	
		
__________________________________
Do you Yahoo!?
Vote for the stars of Yahoo!'s next ad campaign!
http://advision.webevents.yahoo.com/yahoo/votelifeengine/

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


RE: Sort: 1.4-rc3 vs. 1.4-final

Posted by Greg Gershman <gr...@yahoo.com>.
I've done a bit more snooping around; it seems that in
FieldSortedHitQueue.getCachedComparator(line 153),
calls to lookup a stored comparator in the cache
always return null.  This occurs even for the built-in
sort types (I tested it on integers and my code for
longs).  The comparators don't even appear to be being
stored in the HashMap to begin with.

Any ideas?

Greg

 

--- Aviran <am...@infosciences.com> wrote:
> Since I had to implement sorting in lucene 1.2 I had
> to write my own sorting
> using something similar to a lucene's contribution
> called SortField. 
> Yesterday I did some tests, trying to use lucene 1.4
> Sort objects and I
> realized that my old implementation works 40% faster
> then Lucene's
> implementation. My guess is that you are right and
> there is a problem with
> the cache although I couldn't find what that is yet.
> 
> Aviran
> 
> -----Original Message-----
> From: Greg Gershman [mailto:greggersh@yahoo.com] 
> Sent: Wednesday, July 21, 2004 9:22 AM
> To: lucene-user@jakarta.apache.org
> Subject: Sort: 1.4-rc3 vs. 1.4-final
> 
> 
> When rc3 came out, I modified the classes used for
> Sorting to, in addition to Integer, Float and
> String-based sort keys, use Long values.  All I did
> was add extra statements in 2 classes (SortField and
> FieldSortedHitQueue) that made a special case for
> longs, and created a LongSortedHitQueue identical to
> the IntegerSortedHitQueue, only using longs.  
> 
> This worked as expected; Long values converted to
> strings and stored in Field.Keyword type fields
> would
> be sorted according to Long order.  The initial
> query
> would take a while, to build the sorted array, but
> subsequent queries would take little to no time at
> all.
> 
> I went back to look at 1.4 final, and noticed the
> Sort implementation has
> changed quite a bit.  I tried the same type of
> modifications to the existing
> source files, but was unable to achieve similiar
> results. 
> Each subsequent query seems to take a significant
> amount of time, as if the Sorted array is being
> rebuilt each time.  Also, I tried sorting on an
> Integer fields and got similar results, which leads
> me
> to believe there might be a caching problem
> somewhere.
> 
> Has anyone else seen this in 1.4-final?  Also, I
> would
> like it if Long sorted fields could become a part of
> the API; it makes sorting by date a breeze.
> 
> Thanks!
> 
> Greg Gershman
> 
> 
> 		
> __________________________________
> Do you Yahoo!?
> New and Improved Yahoo! Mail - Send 10MB messages!
> http://promotions.yahoo.com/new_mail 
> 
>
---------------------------------------------------------------------
> To unsubscribe, e-mail:
> lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail:
> lucene-user-help@jakarta.apache.org
> 
> 
> 
> 
>
---------------------------------------------------------------------
> To unsubscribe, e-mail:
> lucene-user-unsubscribe@jakarta.apache.org
> For additional commands, e-mail:
> lucene-user-help@jakarta.apache.org
> 
> 



	
		
__________________________________
Do you Yahoo!?
Vote for the stars of Yahoo!'s next ad campaign!
http://advision.webevents.yahoo.com/yahoo/votelifeengine/

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


RE: Sort: 1.4-rc3 vs. 1.4-final

Posted by Aviran <am...@infosciences.com>.
Since I had to implement sorting in lucene 1.2 I had to write my own sorting
using something similar to a lucene's contribution called SortField. 
Yesterday I did some tests, trying to use lucene 1.4 Sort objects and I
realized that my old implementation works 40% faster then Lucene's
implementation. My guess is that you are right and there is a problem with
the cache although I couldn't find what that is yet.

Aviran

-----Original Message-----
From: Greg Gershman [mailto:greggersh@yahoo.com] 
Sent: Wednesday, July 21, 2004 9:22 AM
To: lucene-user@jakarta.apache.org
Subject: Sort: 1.4-rc3 vs. 1.4-final


When rc3 came out, I modified the classes used for
Sorting to, in addition to Integer, Float and
String-based sort keys, use Long values.  All I did
was add extra statements in 2 classes (SortField and
FieldSortedHitQueue) that made a special case for
longs, and created a LongSortedHitQueue identical to
the IntegerSortedHitQueue, only using longs.  

This worked as expected; Long values converted to
strings and stored in Field.Keyword type fields would
be sorted according to Long order.  The initial query
would take a while, to build the sorted array, but
subsequent queries would take little to no time at
all.

I went back to look at 1.4 final, and noticed the Sort implementation has
changed quite a bit.  I tried the same type of modifications to the existing
source files, but was unable to achieve similiar results. 
Each subsequent query seems to take a significant
amount of time, as if the Sorted array is being
rebuilt each time.  Also, I tried sorting on an
Integer fields and got similar results, which leads me
to believe there might be a caching problem somewhere.

Has anyone else seen this in 1.4-final?  Also, I would
like it if Long sorted fields could become a part of
the API; it makes sorting by date a breeze.

Thanks!

Greg Gershman


		
__________________________________
Do you Yahoo!?
New and Improved Yahoo! Mail - Send 10MB messages!
http://promotions.yahoo.com/new_mail 

---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org




---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org