Posted to dev@lucene.apache.org by Erik Hatcher <er...@ehatchersolutions.com> on 2003/09/13 12:34:26 UTC

DateFilter.Before/After

Is it odd to anyone that DateFilter.Before/After is *inclusive* of the 
date you specify?  Seems counter-intuitive to me, but I'm going to at 
least add "on or before/after" to the Javadocs.  The methods probably 
should have been called OnOrBefore/After :)

	Erik


RE: need help on updating index

Posted by samir <sa...@i-link.co.in>.
Hi Erik,
Thank you very much for a prompt reply
Regards
Samir


-----Original Message-----
From: Erik Hatcher [mailto:erik@ehatchersolutions.com] 
Sent: 23 February 2006 13:52
To: java-dev@lucene.apache.org
Subject: Re: need help on updating index


On Feb 23, 2006, at 2:53 AM, samir wrote:
> I have one document in the index.
> If that document is only renamed but its content is the same as before,
> can I update the index?
> The book Lucene in Action says that Lucene doesn't have anything like an
> update(...) method, and that to update one needs to delete the old
> document and add the new document.
> Is it the only solution?

Yes, delete/add is the only solution to "update".

	Erik


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-dev-help@lucene.apache.org




Re: need help on updating index

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Feb 23, 2006, at 2:53 AM, samir wrote:
> I have one document in the index.
> If that document is only renamed but its content is the same as before,
> can I update the index?
> The book Lucene in Action says that Lucene doesn't have anything like an
> update(...) method, and that to update one needs to delete the old
> document and add the new document.
> Is it the only solution?

Yes, delete/add is the only solution to "update".

	Erik
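Since Lucene has no in-place update, the delete-then-add pattern Erik describes can be modeled with a toy in-memory index. This is only an illustrative sketch: the Map stands in for the real index, and updateRenamed() is a hypothetical helper (real code of this era would do roughly IndexReader.delete(Term) followed by IndexWriter.addDocument()).

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of delete-then-add: "updating" a document means removing the
// old one by its key and adding the replacement. The Map is a stand-in
// for the real Lucene index; field/path names are invented.
final class ToyIndex {
    private final Map<String, String> docsByPath = new HashMap<>();

    void add(String path, String content) { docsByPath.put(path, content); }
    void delete(String path) { docsByPath.remove(path); }

    // "Update" for a renamed file: delete under the old path, add the
    // unchanged content back under the new path.
    void updateRenamed(String oldPath, String newPath) {
        String content = docsByPath.get(oldPath);
        delete(oldPath);
        add(newPath, content);
    }

    String get(String path) { return docsByPath.get(path); }
}
```

The point of the sketch is only that the two steps are separate operations; there is no single rename/update call.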




need help on updating index

Posted by samir <sa...@i-link.co.in>.
Hello,
I have one document in the index.
If that document is only renamed but its content is the same as before, can
I update the index?
The book Lucene in Action says that Lucene doesn't have anything like an
update(...) method, and that to update one needs to delete the old document
and add the new document.
Is it the only solution?


Regards,
Samir








Re[2]: RE : DateFilter.Before/After

Posted by Maxim Patramanskij <ma...@osua.de>.
Hello Erik,

Monday, September 15, 2003, 4:27:27 PM, you wrote:

EH> On Monday, September 15, 2003, at 09:45 AM, Bruce Ritchie wrote:
>> I would suggest *not* using caching inside of filters provided by 
>> lucene but rather provide a wrapper to do the caching. The reason is 
>> that some applications really don't want the libraries they use to be 
>> a source of concern for memory usage. I.e., if I search for a string 
>> using 10,000 different date filters (an extreme example, but possible), 
>> I want the ability to control how those bitsets are going to be cached.

EH> In the case of QueryFilter, simply construct a new one to avoid caching 
EH> rather than reusing the same instance.  So you have control there as 
EH> well.  The only thing that is cached is a BitSet, so it shouldn't be 
EH> much of a memory usage concern.

>> public class CachedFilter extends Filter {
>>     BitSet bits;
>>     Filter filter;
>>
>>     public CachedFilter(Filter filter) {
>>         this.filter = filter;
>>         this.bits = null;
>>     }
>>
>>     public BitSet bits(IndexReader reader) throws IOException {
>>         if (bits != null) {
>>             return bits;
>>         }
>>
>>         bits = filter.bits(reader);
>>         return bits;
>>     }
>> }

EH> You would have problems if you searched a different index or different 
EH> instance of IndexReader even with your caching here.  You should cache 
EH> like QueryFilter does to avoid a potential mismatch with IndexReader 
EH> instances.

EH> But your implementation is exactly what I was envisioning, with the 
EH> added WeakHashMap of QueryFilter.

EH>         Erik






-- 
Best regards,
 Maxim                            mailto:max@osua.de




Re: Caching filter wrapper (was Re: RE : DateFilter.Before/After)

Posted by Doug Cutting <cu...@lucene.com>.
Erik Hatcher wrote:
> So, if there was a caching filter implemented like yours, but with the 
> WeakHashMap cache like QueryFilter, would you use it instead of what 
> you've done?  I'm in agreement with you about where the caching should 
> be.  Would anyone object to such an implementation added to Lucene's core?

+1

Doug
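A minimal JDK-only sketch of such a caching wrapper, with Lucene's Filter and IndexReader types replaced by stand-ins: the BitSet is cached per reader instance in a WeakHashMap, which addresses both Erik's reader-mismatch concern and Bruce's memory concern.

```java
import java.util.BitSet;
import java.util.Map;
import java.util.WeakHashMap;

// Stand-in for org.apache.lucene.search.Filter: anything that can
// produce a BitSet of matching document ids for a given reader.
interface Filter {
    BitSet bits(Object reader);
}

// The wrapper discussed in this thread: caches the BitSet per reader
// instance (as QueryFilter does), so a filter reused across different
// IndexReaders never hands back a mismatched BitSet. The WeakHashMap
// lets a cached BitSet be collected once its reader is unreachable.
final class CachingWrapperFilter implements Filter {
    private final Filter filter;
    private final Map<Object, BitSet> cache = new WeakHashMap<>();

    CachingWrapperFilter(Filter filter) { this.filter = filter; }

    public synchronized BitSet bits(Object reader) {
        BitSet cached = cache.get(reader);
        if (cached == null) {
            cached = filter.bits(reader);   // compute once per reader
            cache.put(reader, cached);
        }
        return cached;
    }
}
```

Usage: wrap any filter once and reuse it freely; the underlying filter runs once per reader, and dropping the reader drops its cache entry.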


Re: Caching filter wrapper (was Re: RE : DateFilter.Before/After)

Posted by Bruce Ritchie <br...@jivesoftware.com>.
Doug Cutting wrote:
> FSInputStream.file is shared between all clones of an open file.  Access 
> to it is synchronized.  This works correctly.  Cloned input streams are 
> already used extensively.
> 
>> The solution I think is to recreate the FSInputStream.file object 
>> whenever a FSInputStream is cloned. I've attached what I think is a 
>> fix for the issue below.
> 
> 
> This would cause lots more file handles to be opened, and would also 
> fail once a file has been deleted.  On UNIX, an open file handle may be 
> used after it is deleted, and Lucene leverages this so that updates can 
> occur while an index is still being searched.  I don't think there's a 
> bug here to be fixed.

You're correct of course. Note to self: Don't report bugs at the end of the day ;)


Regards,

Bruce Ritchie

Re: Caching filter wrapper (was Re: RE : DateFilter.Before/After)

Posted by Doug Cutting <cu...@lucene.com>.
Bruce Ritchie wrote:
> One thing I saw while tracing back the clone() call is that 
> InputStream.clone() has this remark:
> 
> Expert: Subclasses must ensure that clones may be positioned at
> different points in the input from each other and from the stream they
> were cloned from.
> 
> I'm not actually certain if that's the case for FSInputStream. As I see 
> it I don't think the file variable in the FSInputStream class will be 
> cloned correctly and will cause issues when cloned InputStreams are used.

FSInputStream.file is shared between all clones of an open file.  Access 
to it is synchronized.  This works correctly.  Cloned input streams are 
already used extensively.

> The solution I think is to recreate the FSInputStream.file object 
> whenever a FSInputStream is cloned. I've attached what I think is a fix 
> for the issue below.

This would cause lots more file handles to be opened, and would also 
fail once a file has been deleted.  On UNIX, an open file handle may be 
used after it is deleted, and Lucene leverages this so that updates can 
occur while an index is still being searched.  I don't think there's a 
bug here to be fixed.

Doug
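The POSIX behavior Doug describes can be demonstrated with plain JDK I/O (the file name and helper method are invented for illustration; this relies on POSIX delete semantics, so it won't behave this way on Windows):

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.RandomAccessFile;

// On POSIX systems an open file handle stays readable after the file is
// deleted, which is what lets Lucene keep searching segment files that an
// update has already removed.
final class DeletedFileRead {
    static String readAfterDelete() {
        try {
            File f = File.createTempFile("lucene-demo", ".txt");
            try (FileOutputStream out = new FileOutputStream(f)) {
                out.write("still here".getBytes("US-ASCII"));
            }
            RandomAccessFile in = new RandomAccessFile(f, "r");
            if (!f.delete()) throw new IOException("delete failed");
            // The directory entry is gone, but the open handle still works.
            byte[] buf = new byte[(int) in.length()];
            in.readFully(buf);
            in.close();
            return new String(buf, "US-ASCII");
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }
}
```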


Re: Caching filter wrapper (was Re: RE : DateFilter.Before/After)

Posted by Bruce Ritchie <br...@jivesoftware.com>.
Doug Cutting wrote:
>> I think that using a pool of cloned inputStreams would be the best 
>> solution. I've implemented such a solution locally using two pools of 
>> 3 readers each (configurable via system properties) and will post the 
>> diff after I do some testing to confirm accuracy and speed improvements.
> 
> 
> Could you also benchmark this against a version that clones new streams 
> for each call?  That sounds extravagant, but it removes a configuration 
> parameter, always a good thing.

Sure. On some thought I expect them to perform about the same, so I may just go with the simpler 
method if testing supports that thought.


One thing I saw while tracing back the clone() call is that InputStream.clone() has this remark:

Expert: Subclasses must ensure that clones may be positioned at
different points in the input from each other and from the stream they
were cloned from.

I'm not actually certain if that's the case for FSInputStream. As I see it I don't think the file 
variable in the FSInputStream class will be cloned correctly and will cause issues when cloned 
InputStreams are used.

From Object.clone():

... this method creates a new instance of the class of this
object and initializes all its fields with exactly the contents of
the corresponding fields of this object, as if by assignment; the
contents of the fields are not themselves cloned. Thus, this method
performs a "shallow copy" of this object, not a "deep copy" operation.
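The shallow-copy behavior quoted above can be seen in a minimal stand-alone example (Stream and Descriptor here are simplified stand-ins for FSInputStream and its file descriptor):

```java
// Object.clone() copies field *references*, so a cloned object shares the
// same mutable Descriptor with the original unless clone() explicitly
// replaces it.
final class Stream implements Cloneable {
    static final class Descriptor { long position; }

    Descriptor file = new Descriptor(); // shared by shallow clones

    @Override
    public Stream clone() {
        try {
            return (Stream) super.clone(); // shallow: clone.file == this.file
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);   // cannot happen: we are Cloneable
        }
    }
}
```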


The solution I think is to recreate the FSInputStream.file object whenever a FSInputStream is 
cloned. I've attached what I think is a fix for the issue below.

diff -u -r1.20 FSDirectory.java
--- src/java/org/apache/lucene/store/FSDirectory.java	29 May 2003 20:18:18 -0000	1.20
+++ src/java/org/apache/lucene/store/FSDirectory.java	15 Sep 2003 22:14:36 -0000
@@ -381,9 +381,11 @@

  final class FSInputStream extends InputStream {
    private class Descriptor extends RandomAccessFile {
+    File f = null;
      public long position;
      public Descriptor(File file, String mode) throws IOException {
        super(file, mode);
+      this.f = file;
      }
    }

@@ -431,6 +433,8 @@
    public Object clone() {
      FSInputStream clone = (FSInputStream)super.clone();
      clone.isClone = true;
+    try { clone.file = new Descriptor(file.f, "r"); }
+    catch (IOException e) { /* umm? */ }
      return clone;
    }
  }



Regards,

Bruce Ritchie

Re: Caching filter wrapper (was Re: RE : DateFilter.Before/After)

Posted by Bruce Ritchie <br...@jivesoftware.com>.
Doug Cutting wrote:
> Couldn't you use a custom HitCollector?
> 
> For example, you could maintain an array of floats which is the current 
> rating for each document.  You'd need to rebuild this array each time 
> the index is altered, but you could maintain it incrementally as 
> documents are viewed.  Then your HitCollector can multiply this into the 
> score or somesuch.  Similarly, for external sort criteria, you can keep 
> an array of the sort value for each document that is used by a 
> HitCollector that only collects values in the desired range.  The same 
> technique should be usable for permissions too.
> 
> These are much like Filters, a cached array indexed by document id, but 
> are instead explicitly used by application logic in a HitCollector. 
> Could such a technique be applicable?  Or would it be too hard to 
> maintain these arrays?

Hmm. It may be possible to do this in combination with the SearchBean or something similar. We'd 
have to maintain a reverse documentID -> UID map for every document for this to work (otherwise it 
would still require calling doc(i) for every (uncached) document to look up our stored UID for the 
document). The amount of work required to make this work would not be insignificant, and given our 
time constraints I don't think I'll have time to test this idea out until our next major release(s). 
I'll have to stick with our current approach, but using a single IndexReader and my suggested 
change to the FieldsReader class for the time being.

Thanks for the ideas, they are very much appreciated.


Regards,

Bruce Ritchie

Re: Caching filter wrapper (was Re: RE : DateFilter.Before/After)

Posted by Doug Cutting <cu...@lucene.com>.
Bruce Ritchie wrote:
> We would dearly love to not have to post-process results returned from 
> lucene. Unfortunately, we can't foresee a way to do this given the 
> current architecture of our applications and Lucene. The issue is that 
> we must both exclude search results based upon an external (to lucene) 
> permission system and be able to sort results based upon criteria(s) 
> that again can't be stored inside lucene (document rating is an 
> example). Neither the permissions nor the external sort criteria(s) can 
> be stored in lucene because they can impact too many documents when they 
> change (1 permission change could require 'updating' a field in every 
> document in the lucene store) or change too often (it's quite probable 
> that a document rating will change every time a document is viewed for 
> example).
> 
> The only way I foresee that we could internalize both of these factors 
> into lucene is if it was possible to modify a document inside of lucene 
> at basically no cost. Since that's not currently possible, we are stuck 
> with retrieving all the documents from lucene and post-processing them. 
> Even if updating a document was possible we might decide that it's just 
> not worth it to store some document attributes in lucene from an overall 
> performance perspective. There may of course be other possible solutions 
> however, we haven't thought of them yet.

Couldn't you use a custom HitCollector?

For example, you could maintain an array of floats which is the current 
rating for each document.  You'd need to rebuild this array each time 
the index is altered, but you could maintain it incrementally as 
documents are viewed.  Then your HitCollector can multiply this into the 
score or somesuch.  Similarly, for external sort criteria, you can keep 
an array of the sort value for each document that is used by a 
HitCollector that only collects values in the desired range.  The same 
technique should be usable for permissions too.

These are much like Filters, a cached array indexed by document id, but 
are instead explicitly used by application logic in a HitCollector. 
Could such a technique be applicable?  Or would it be too hard to 
maintain these arrays?

Cheers,

Doug
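Doug's rating idea can be sketched with plain JDK types; HitCollector here is a stand-in for Lucene's callback interface, and the ratings array is the application-maintained state he describes:

```java
// Stand-in for org.apache.lucene.search.HitCollector: called once per
// matching document with its id and raw score.
interface HitCollector {
    void collect(int doc, float score);
}

// Keep an application-maintained array of per-document ratings (rebuilt
// when the index changes, updated incrementally as documents are viewed)
// and fold it into the score at collect time, before results are ranked.
final class RatingHitCollector implements HitCollector {
    private final float[] ratings;   // indexed by Lucene document id
    private final HitCollector next; // where adjusted hits are forwarded

    RatingHitCollector(float[] ratings, HitCollector next) {
        this.ratings = ratings;
        this.next = next;
    }

    public void collect(int doc, float score) {
        next.collect(doc, score * ratings[doc]); // multiply rating into score
    }
}
```

The same shape works for the permission case: a collector that simply drops documents whose entry in an allowed[] array is false.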




Re: Caching filter wrapper (was Re: RE : DateFilter.Before/After)

Posted by Bruce Ritchie <br...@jivesoftware.com>.
Doug Cutting wrote:
>> for (int i = 0; i < numResults; i++) {
>>    ids[i] = Long.parseLong((hits.doc(i)).get("messageID"));
>> }
> 
> This is not a recommended way to use Lucene.  The intent is that you 
> should only have to call Hits.doc() for documents that you actually 
> display, usually around 10 per query.  Is this still a bottleneck when 
> you fetch a max of 10 or 20 documents?

I didn't test this case.

> So I'd be interested to hear why you need 1500 hits.  My guess is that 
> you're doing post-processing of hits, then selecting 10 or so to 
> actually display.  If you can figure out a way to do this post 
> processing without accessing the document object, i.e., through the 
> query, a custom HitCollector, or the SearchBean, then this optimization 
> is probably not needed.

We would dearly love to not have to post-process results returned from lucene. Unfortunately, we 
can't foresee a way to do this given the current architecture of our applications and Lucene. The 
issue is that we must both exclude search results based upon an external (to lucene) permission 
system and be able to sort results based upon criteria(s) that again can't be stored inside lucene 
(document rating is an example). Neither the permissions nor the external sort criteria(s) can be 
stored in lucene because they can impact too many documents when they change (1 permission change 
could require 'updating' a field in every document in the lucene store) or change too often (it's 
quite probable that a document rating will change every time a document is viewed for example).

The only way I foresee that we could internalize both of these factors into lucene is if it was 
possible to modify a document inside of lucene at basically no cost. Since that's not currently 
possible, we are stuck with retrieving all the documents from lucene and post-processing them. Even 
if updating a document was possible we might decide that it's just not worth it to store some 
document attributes in lucene from an overall performance perspective. There may of course be other 
possible solutions; however, we haven't thought of them yet.

> A 30% optimization to a slow algorithm is better than nothing, but it 
> would be better yet to improve the algorithm.  That said, this sort of 
> improvement is not always trivial, and lots of people use Lucene in the 
> way that you have, so it's still may be worth optimizing this.

30% on my machine - I think it's likely to be quite a bit faster when the lucene files are striped 
across multiple disks. I can't test that assumption though as I don't have the hardware available. I 
believe the speedup is beneficial in almost all situations and the cost associated with the 
optimization is quite minimal, especially when compared to the alternative (slow searches under 
heavy load or more memory usage/file descriptors through multiple readers).


Regards,

Bruce Ritchie

Re: Caching filter wrapper (was Re: RE : DateFilter.Before/After)

Posted by Bruce Ritchie <br...@jivesoftware.com>.
Doug Cutting wrote:
> I wonder if SearchBean, or something like it, should be added to the 
> core?  This is something lots of folks ask for.  SearchBean's technique 
> can use a fair amount of memory, but most folks are not short on RAM 
> these days.  One could optimize SearchBean's sorting for integer-valued 
> fields, but that could also be done after it is added to the core.
> 
> What do folks think about adding SearchBean to the core?  Perhaps it 
> could be merged with the existing Hits code, as a primary API for 
> accessing search results?

+1 for this as well.

I may not use it immediately, but it sure would be nice to have something like this in the core of 
lucene.


Regards,

Bruce Ritchie

Re: Caching filter wrapper (was Re: RE : DateFilter.Before/After)

Posted by Terry Steichen <te...@net-frame.com>.
----- Original Message ----- 
From: "Doug Cutting" <cu...@lucene.com>
To: "Lucene Developers List" <lu...@jakarta.apache.org>
Sent: Tuesday, September 16, 2003 12:26 PM
Subject: Re: Caching filter wrapper (was Re: RE : DateFilter.Before/After)
 
> What do folks think about adding SearchBean to the core?  Perhaps it 
> could be merged with the existing Hits code, as a primary API for 
> accessing search results?

++1





Sorting via SearchBean (was Re: Caching filter wrapper)

Posted by Terry Steichen <te...@net-frame.com>.
What I believe SearchBean does is create a complete replica only of the
contents of each field (not the entire document) that is to be used as the
sort key (typically only one field, such as a date field).

Regards,

Terry

----- Original Message -----
From: "Barry Kaplan" <ba...@livespark.com>
To: "Lucene Developers List" <lu...@jakarta.apache.org>
Sent: Tuesday, September 16, 2003 9:02 PM
Subject: RE: Caching filter wrapper (was Re: RE : DateFilter.Before/After)


> Doug Cutting wrote:
>
> >I wonder if SearchBean, or something like it, should be added to the
> >core?  This is something lots of folks ask for.  SearchBean's technique
> >can use a fair amount of memory, but most folks are not short on RAM
> >these days.  One could optimize SearchBean's sorting for integer-valued
> >fields, but that could also be done after it is added to the core.
> >
> >What do folks think about adding SearchBean to the core?  Perhaps it
> >could be merged with the existing Hits code, as a primary API for
> >accessing search results?
> >
> >Doug
>
>
> I think there could be some other optimization with the SearchBean, if
> added to the core, in how it builds its cache. Currently it looks like the
> SearchBean will create the entire document once per sorted field needed
> from the document; this could be changed. Also, if you only need one field
> from the doc it would be nice if you only read that one field from the
> index instead of building the entire document and then selecting out the
> one field needed.
>
> -Barry



RE: Caching filter wrapper (was Re: RE : DateFilter.Before/After)

Posted by Barry Kaplan <ba...@livespark.com>.
Doug Cutting wrote:

>I wonder if SearchBean, or something like it, should be added to the
>core?  This is something lots of folks ask for.  SearchBean's technique
>can use a fair amount of memory, but most folks are not short on RAM
>these days.  One could optimize SearchBean's sorting for integer-valued
>fields, but that could also be done after it is added to the core.
>
>What do folks think about adding SearchBean to the core?  Perhaps it
>could be merged with the existing Hits code, as a primary API for
>accessing search results?
>
>Doug


I think there could be some other optimization with the SearchBean, if added
to the core, in how it builds its cache. Currently it looks like the
SearchBean will create the entire document once per sorted field needed from
the document; this could be changed. Also, if you only need one field from
the doc it would be nice if you only read that one field from the index
instead of building the entire document and then selecting out the one field
needed.

-Barry
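The per-field cache Barry and Terry are discussing can be sketched in plain Java: only the sort key for each document id is cached, and hits are ordered by array lookup rather than by loading whole documents (FieldSorter and its field layout are invented for illustration):

```java
import java.util.Arrays;
import java.util.Comparator;

// Sketch of the SearchBean technique: cache just the sort field's value per
// document id, built once per index version, then order hit ids by looking
// values up in that array instead of materializing each Document.
final class FieldSorter {
    private final String[] sortKeyByDoc; // e.g. a date string per doc id

    FieldSorter(String[] sortKeyByDoc) { this.sortKeyByDoc = sortKeyByDoc; }

    // Return the hit ids reordered by their cached sort-key values.
    Integer[] sortHits(Integer[] hitDocIds) {
        Integer[] sorted = hitDocIds.clone();
        Arrays.sort(sorted, Comparator.comparing((Integer doc) -> sortKeyByDoc[doc]));
        return sorted;
    }
}
```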



Re: Caching filter wrapper (was Re: RE : DateFilter.Before/After)

Posted by Doug Cutting <cu...@lucene.com>.
Bruce Ritchie wrote:
> The times shown above are only the time taken to call the following code 
> (numResults is a max of 1500 or hits.length(), whichever is smaller):
> 
> for (int i = 0; i < numResults; i++) {
>    ids[i] = Long.parseLong((hits.doc(i)).get("messageID"));
> }

This is not a recommended way to use Lucene.  The intent is that you 
should only have to call Hits.doc() for documents that you actually 
display, usually around 10 per query.  Is this still a bottleneck when 
you fetch a max of 10 or 20 documents?

So I'd be interested to hear why you need 1500 hits.  My guess is that 
you're doing post-processing of hits, then selecting 10 or so to 
actually display.  If you can figure out a way to do this post 
processing without accessing the document object, i.e., through the 
query, a custom HitCollector, or the SearchBean, then this optimization 
is probably not needed.

A 30% optimization to a slow algorithm is better than nothing, but it 
would be better yet to improve the algorithm.  That said, this sort of 
improvement is not always trivial, and lots of people use Lucene in the 
way that you have, so it still may be worth optimizing this.

If your post-processing is done in order to sort the results, then I 
recommend trying the SearchBean, in the Lucene sandbox.  I've never used 
it myself, but it is able to provide results sorted by any field without 
accessing the document object of each hit while the query is processed 
(it caches tables of field values when constructed).  Examining the 
SearchBean code, I see an optimization: it would be more efficient if it 
used a HitCollector rather than a Hits when sorting, as the Hits may 
have to re-query a few times to get the full set of results, but even 
with that, I suspect you'd see a speedup.

I wonder if SearchBean, or something like it, should be added to the 
core?  This is something lots of folks ask for.  SearchBean's technique 
can use a fair amount of memory, but most folks are not short on RAM 
these days.  One could optimize SearchBean's sorting for integer-valued 
fields, but that could also be done after it is added to the core.

What do folks think about adding SearchBean to the core?  Perhaps it 
could be merged with the existing Hits code, as a primary API for 
accessing search results?

Doug
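Doug's advice - keep only the hits you will display - can be sketched with a small top-N collector in plain JDK Java (ScoredDoc and collect() are stand-ins for Lucene's HitCollector machinery):

```java
import java.util.Arrays;
import java.util.PriorityQueue;

// Gather only the top-N (doc, score) pairs via a collector-style callback,
// then load just those N documents for display, instead of calling
// Hits.doc(i) for 1500 hits.
final class TopDocsCollector {
    static final class ScoredDoc {
        final int doc; final float score;
        ScoredDoc(int doc, float score) { this.doc = doc; this.score = score; }
    }

    private final int n;
    // Min-heap on score: the root is always the weakest retained hit.
    private final PriorityQueue<ScoredDoc> heap =
        new PriorityQueue<>((a, b) -> Float.compare(a.score, b.score));

    TopDocsCollector(int n) { this.n = n; }

    void collect(int doc, float score) {
        heap.offer(new ScoredDoc(doc, score));
        if (heap.size() > n) heap.poll(); // drop the current lowest score
    }

    // Doc ids of the best n hits, highest score first.
    int[] topDocs() {
        ScoredDoc[] sorted = heap.toArray(new ScoredDoc[0]);
        Arrays.sort(sorted, (a, b) -> Float.compare(b.score, a.score));
        int[] ids = new int[sorted.length];
        for (int i = 0; i < ids.length; i++) ids[i] = sorted[i].doc;
        return ids;
    }
}
```

Only the ids returned by topDocs() would then be fetched as documents, keeping the expensive doc(i) calls at display size.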




Re: Caching filter wrapper (was Re: RE : DateFilter.Before/After)

Posted by Doug Cutting <cu...@lucene.com>.
Bruce Ritchie wrote:
> The times shown above is only the time taken to call the following code 
> (numResults is a max of 1500 or hits.length(), whichever is smaller):
> 
> for (int i = 0; i < numResults; i++) {
>    ids[i] = Long.parseLong((hits.doc(i)).get("messageID"));
> }

This is not a recommended way to use Lucene.  The intent is that you 
should only have to call Hits.doc() for documents that you actually 
display, usually around 10 per query.  Is this still a bottleneck when 
you fetch a max of 10 or 20 documents?

So I'd be interested to hear why you need 1500 hits.  My guess is that 
you're doing post-processing of hits, then selecting 10 or so to 
actually display.  If you can figure out a way to do this post 
processing without accessing the document object, i.e., through the 
query, a custom HitCollector, or the SearchBean, then this optimization 
is probably not needed.
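
[A sketch of the custom-HitCollector idea Doug describes above. The interface and class names here are hypothetical stand-ins, not Lucene's actual API: the point is only that the searcher hands the collector (doc, score) pairs, and the collector post-processes hits without ever fetching a stored Document.]

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Lucene's HitCollector contract: the searcher
// calls collect(doc, score) once per matching document, and the collector
// decides what to keep -- no stored fields are ever fetched.
interface HitCollectorSketch {
    void collect(int doc, float score);
}

public class TopDocsCollectorDemo {
    // Keeps only the doc ids whose score passes a cutoff; this is the kind
    // of post-filtering that would otherwise force a hits.doc(i) call per hit.
    static class ScoreCutoffCollector implements HitCollectorSketch {
        final float cutoff;
        final List<Integer> kept = new ArrayList<Integer>();
        ScoreCutoffCollector(float cutoff) { this.cutoff = cutoff; }
        public void collect(int doc, float score) {
            if (score >= cutoff) kept.add(doc);
        }
    }

    // Simulated search: feed (doc, score) pairs to the collector.
    static List<Integer> run(float[] scores, float cutoff) {
        ScoreCutoffCollector c = new ScoreCutoffCollector(cutoff);
        for (int doc = 0; doc < scores.length; doc++) {
            c.collect(doc, scores[doc]);
        }
        return c.kept;
    }

    public static void main(String[] args) {
        List<Integer> kept = run(new float[] {0.9f, 0.2f, 0.7f}, 0.5f);
        System.out.println(kept); // docs 0 and 2 pass the 0.5 cutoff
    }
}
```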

A 30% optimization to a slow algorithm is better than nothing, but it 
would be better yet to improve the algorithm.  That said, this sort of 
improvement is not always trivial, and lots of people use Lucene in the 
way that you have, so it still may be worth optimizing this.

If your post-processing is done in order to sort the results, then I 
recommend trying the SearchBean, in the Lucene sandbox.  I've never used 
it myself, but it is able to provide results sorted by any field without 
accessing the document object of each hit while the query is processed 
(it caches tables of field values when constructed).  Examining the 
SearchBean code, I see an optimization: it would be more efficient if it 
used a HitCollector rather than a Hits when sorting, as the Hits may 
have to re-query a few times to get the full set of results, but even 
with that, I suspect you'd see a speedup.
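
[The SearchBean technique mentioned above can be illustrated like this. This is a minimal sketch, not SearchBean's actual code: a per-document table of field values is cached once, and hit doc ids are then sorted by table lookup instead of by loading each hit's document.]

```java
import java.util.Arrays;
import java.util.Comparator;

public class FieldTableSortDemo {
    // Sort matching doc ids by a field value looked up in a precomputed
    // per-document table (one entry per doc id), never touching stored fields.
    static Integer[] sortByField(Integer[] docIds, final String[] fieldTable) {
        Integer[] sorted = docIds.clone();
        Arrays.sort(sorted, new Comparator<Integer>() {
            public int compare(Integer a, Integer b) {
                return fieldTable[a].compareTo(fieldTable[b]);
            }
        });
        return sorted;
    }

    public static void main(String[] args) {
        // fieldTable[doc] holds the cached field value for doc.
        String[] fieldTable = {"cherry", "apple", "banana"};
        Integer[] hits = {0, 1, 2};
        System.out.println(Arrays.toString(sortByField(hits, fieldTable)));
        // prints [1, 2, 0]: apple < banana < cherry
    }
}
```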

I wonder if SearchBean, or something like it, should be added to the 
core?  This is something lots of folks ask for.  SearchBean's technique 
can use a fair amount of memory, but most folks are not short on RAM 
these days.  One could optimize SearchBean's sorting for integer-valued 
fields, but that could also be done after it is added to the core.

What do folks think about adding SearchBean to the core?  Perhaps it 
could be merged with the existing Hits code, as a primary API for 
accessing search results?

Doug


Re: Caching filter wrapper (was Re: RE : DateFilter.Before/After)

Posted by Bruce Ritchie <br...@jivesoftware.com>.
Doug Cutting wrote:
>> I think that using a pool of cloned inputStreams would be the best 
>> solution. I've implemented such a solution locally using two pools of 
>> 3 readers each (configurable via system properties) and will post the 
>> diff after I do some testing to confirm accuracy and speed improvements.
> 
> 
> Could you also benchmark this against a version that clones new streams 
> for each call?  That sounds extravagant, but it removes a configuration 
> parameter, always a good thing.

After doing some testing tonight I've come up with the following numbers after running some tests on 
my personal workstation.

Dell 2 ghz with single HD, Win XP SP1, 768 Mb ram
Sun JDK 1.4.2
Resin 2.1.0 (-DdisableLuceneLocks=true -J-server)
Lucene index with 474128 documents, not completely optimized (most content is in 1 segment)
First run after startup discarded
Tested with 5 simultaneous threads


Unmodified CVS source

Run  Count Time (ms) Queries/Second
  1   1001  542050    1.85
  2   1001  508458    1.97
  3   1001  524396    1.93

CVS source using suggested clone solution for FieldsReader and removing synchronized from 
SegmentReader.document(i)

Run  Count Time (ms) Queries/Second
  1   1008  674123    1.495
  2   1017  675363    1.51
  3   1005  655551    1.53

CVS source using pool of 3 input streams for the previous fieldsStream and indexStream variables in 
FieldsReader and removing synchronized from SegmentReader.document(i)

Run  Count Time (ms) Queries/Second
  1   1009  392536    2.57
  2   999   364783    2.74
  3   995   386501    2.57



The times shown above are only the time taken to call the following code (numResults is a max of 1500 
or hits.length(), whichever is smaller):

for (int i = 0; i < numResults; i++) {
    ids[i] = Long.parseLong((hits.doc(i)).get("messageID"));
}


I've uploaded my testing app with 3 prebuilt lucene libs (unmodified, clone and pool) and my source 
modifications to FieldsReader to http://www.jivesoftware.com/~bruce/lucene/lucene-test.zip (624K) if 
anyone else wants to run the tests on their hardware. You'll have to edit the 
lucenetest.LuceneTestThread class as it has the location of the search directory hardcoded, but it 
should be pretty easy to understand what is going on.


Regards,

Bruce Ritchie

Re: Caching filter wrapper (was Re: RE : DateFilter.Before/After)

Posted by Doug Cutting <cu...@lucene.com>.
Bruce Ritchie wrote:
> I think that using a pool of cloned inputStreams would be the best 
> solution. I've implemented such a solution locally using two pools of 3 
> readers each (configurable via system properties) and will post the diff 
> after I do some testing to confirm accuracy and speed improvements.

Could you also benchmark this against a version that clones new streams 
for each call?  That sounds extravagant, but it removes a configuration 
parameter, always a good thing.

Thanks,

Doug


Re: Caching filter wrapper (was Re: RE : DateFilter.Before/After)

Posted by Bruce Ritchie <br...@jivesoftware.com>.
Doug Cutting wrote:
> If you do get a chance to look into this, I'd love to hear more.
> 
> FieldsReader.doc() could easily be re-written to be re-entrant.  For a 
> start, it could synchronize separately on fieldStream and indexStream, 
> which would let two threads use it at once.  (If an index is not 
> optimized, the situtation would be even better, since there would be a 
> fieldStream and indexStream per index segment.)
> 
> If that's not enough, then it could be re-written to use either a pool 
> of cloned input streams, or just to clone a new stream for each call. 
> (The primary expense of cloning a stream is allocating a 1k buffer.)

I think that using a pool of cloned inputStreams would be the best solution. I've implemented such a 
solution locally using two pools of 3 readers each (configurable via system properties) and will 
post the diff after I do some testing to confirm accuracy and speed improvements.

I'm a little unsure of how exactly to best write JUnit test(s) for these changes as any tests should 
take into account multiple threads and simultaneous reads (to test that the object pool I wrote 
actually works as advertised). Any ideas would be appreciated.
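
[One way to exercise a pool from multiple threads, sketched with a hypothetical minimal pool rather than the actual implementation being discussed: the invariant to assert is that the number of objects checked out at once never exceeds the pool size.]

```java
import java.util.LinkedList;

public class PoolSmokeTest {
    // Minimal blocking pool: checkout blocks until an object is free.
    static class Pool<T> {
        private final LinkedList<T> free = new LinkedList<T>();
        Pool(java.util.Collection<T> items) { free.addAll(items); }
        synchronized T checkout() throws InterruptedException {
            while (free.isEmpty()) wait();
            return free.removeFirst();
        }
        synchronized void checkin(T item) {
            free.addLast(item);
            notifyAll();
        }
    }

    // Hammer the pool from several threads and track the high-water mark of
    // simultaneous checkouts; a correct pool never exceeds its size.
    static int maxInUse(int poolSize, int threads, final int iterations) {
        LinkedList<int[]> items = new LinkedList<int[]>();
        for (int i = 0; i < poolSize; i++) items.add(new int[1]);
        final Pool<int[]> pool = new Pool<int[]>(items);
        final int[] inUse = {0}, max = {0};
        Thread[] ts = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            ts[t] = new Thread() {
                public void run() {
                    try {
                        for (int i = 0; i < iterations; i++) {
                            int[] item = pool.checkout();
                            synchronized (inUse) {
                                inUse[0]++;
                                if (inUse[0] > max[0]) max[0] = inUse[0];
                            }
                            synchronized (inUse) { inUse[0]--; }
                            pool.checkin(item);
                        }
                    } catch (InterruptedException ignored) {}
                }
            };
            ts[t].start();
        }
        for (Thread th : ts) {
            try { th.join(); } catch (InterruptedException ignored) {}
        }
        return max[0];
    }

    public static void main(String[] args) {
        int max = maxInUse(3, 5, 1000);
        System.out.println("max simultaneous checkouts: " + max);
        assert max <= 3;
    }
}
```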


Regards,

Bruce Ritchie

Re: Caching filter wrapper (was Re: RE : DateFilter.Before/After)

Posted by Doug Cutting <cu...@lucene.com>.
Bruce Ritchie wrote:
> The source reason why I'm using multiple readers was that I was hitting 
> a synchronization issue with hits.doc(i) blocking across multiple 
> threads on a busy customer site causing searches to become slower and 
> slower as more searches were attempted simultaneously. I believe the 
> root cause was that SegmentReader.document(i) was synchronized (I could 
> be wrong, it's been a while), however I didn't have time to look into 
> the core code of Lucene when opening multiple readers was such a simple 
> solution and proved to solve the issue. Of course, now that I've got a 
> (bit) more time it might be worthwhile to investigate alternatives :)

If you do get a chance to look into this, I'd love to hear more.

FieldsReader.doc() could easily be re-written to be re-entrant.  For a 
start, it could synchronize separately on fieldStream and indexStream, 
which would let two threads use it at once.  (If an index is not 
optimized, the situation would be even better, since there would be a 
fieldStream and indexStream per index segment.)

If that's not enough, then it could be re-written to use either a pool 
of cloned input streams, or just to clone a new stream for each call. 
(The primary expense of cloning a stream is allocating a 1k buffer.)
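
[The clone-a-stream idea can be sketched as follows. This is a hypothetical stand-in, not Lucene's InputStream class: clones share the underlying data but each carries its own position (and, in the real class, its own buffer), so concurrent readers need not synchronize on a shared file pointer.]

```java
public class CloneableInputDemo {
    // Stand-in for a cloneable input stream: the data is shared read-only,
    // the pointer is per-clone, so two threads can read concurrently.
    static class CloneableInput implements Cloneable {
        private final byte[] data;   // shared, read-only
        private int pointer = 0;     // per-clone position

        CloneableInput(byte[] data) { this.data = data; }

        byte readByte() { return data[pointer++]; }
        void seek(int pos) { pointer = pos; }

        public CloneableInput clone() {
            CloneableInput c = new CloneableInput(data);
            c.pointer = this.pointer; // clone starts where the original is
            return c;
        }
    }

    // Reads through one clone do not move the other clone's pointer.
    static String demo() {
        CloneableInput a = new CloneableInput(new byte[] {10, 20, 30});
        CloneableInput b = a.clone();
        b.seek(2);
        return a.readByte() + " " + b.readByte();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // prints "10 30"
    }
}
```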

Cheers,

Doug


Re: Caching filter wrapper (was Re: RE : DateFilter.Before/After)

Posted by Bruce Ritchie <br...@jivesoftware.com>.
Doug Cutting wrote:
> Why do you open multiple IndexReaders against a single index? Ordinarily 
> I would only expect an application to open a new index reader when the 
> index has changed, or in order to do deletions.  In both of these cases, 
> the cache would work correctly.

The source reason why I'm using multiple readers was that I was hitting a synchronization issue with 
hits.doc(i) blocking across multiple threads on a busy customer site causing searches to become 
slower and slower as more searches were attempted simultaneously. I believe the root cause was that 
SegmentReader.document(i) was synchronized (I could be wrong, it's been a while), however I didn't 
have time to look into the core code of Lucene when opening multiple readers was such a simple 
solution and proved to solve the issue. Of course, now that I've got a (bit) more time it might be 
worthwhile to investigate alternatives :)

> Note that an open index reader uses much more memory than another bit 
> vector in a cache will.  It caches a byte per document for each field 
> you've searched, plus 1/128th of all the terms in the index.  So, e.g., 
> the cached bit vectors could become dominant if you use more than eight 
> caches and only search a single field.

I knew that readers are relatively heavy however the real issue with using multiple readers proved 
to be file descriptors, not memory usage (I'd really love a performant solution to that issue). I've 
got the number of readers set to a max of 3 by default and configurable if need be.

In this case it's not the fact that the cache may have 'duplicate' values in it for the same filter 
that I'm concerned about, but rather that a cache miss can be so painful (the slowness of DateFilter 
over a large index impacting search performance to the order of seconds is an example).


Regards,

Bruce Ritchie

Re: Caching filter wrapper (was Re: RE : DateFilter.Before/After)

Posted by Doug Cutting <cu...@lucene.com>.
Bruce Ritchie wrote:
> I think it would depend on whether the cache keys are independent of 
> IndexReaders (i.e. an implementation that, unlike QueryFilter, does not 
> use an IndexReader as a cache key or part thereof). This is because I 
> open multiple IndexReaders against a single index, which would cause 
> (false) cache misses.

Why do you open multiple IndexReaders against a single index? 
Ordinarily I would only expect an application to open a new index reader 
when the index has changed, or in order to do deletions.  In both of 
these cases, the cache would work correctly.

Note that an open index reader uses much more memory than another bit 
vector in a cache will.  It caches a byte per document for each field 
you've searched, plus 1/128th of all the terms in the index.  So, e.g., 
the cached bit vectors could become dominant if you use more than eight 
caches and only search a single field.
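
[Doug's arithmetic, worked through with the figures from this thread (one bit per document for a cached filter, one byte per document per searched field for a reader's norms); the class and method names are illustrative only.]

```java
public class CacheMemoryMath {
    // A cached filter's bit vector costs one bit per document.
    static long bitVectorBytes(long numDocs) { return numDocs / 8; }

    // An open reader caches one byte (a norm) per document per searched field.
    static long normBytes(long numDocs, int searchedFields) {
        return numDocs * searchedFields;
    }

    public static void main(String[] args) {
        long docs = 474128; // index size from Bruce's benchmark
        System.out.println("one cached filter: " + bitVectorBytes(docs) + " bytes");
        System.out.println("norms, one field:  " + normBytes(docs, 1) + " bytes");
        // Eight bit vectors equal one field's norms, matching Doug's point
        // that the cache dominates only past about eight entries per field.
        System.out.println("eight filters:     " + 8 * bitVectorBytes(docs) + " bytes");
    }
}
```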

Doug


Re: Caching filter wrapper

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
I've added the proposed CachingWrapperFilter using QueryFilter's method 
of caching along with the fix Otis put in this morning to avoid an NPE 
when using remote searching.

See the @todo's there to see if there is more work to be done.  Using 
QueryFilter or this caching one through remote searcher will void any 
caching - is this of concern here?  Do we need to provide a 
user-definable Map implementation for the cache store as an option?

I'll write up some javadocs for this once we've ironed out the 
implementation.

	Erik


On Tuesday, September 16, 2003, at 03:28  PM, Bruce Ritchie wrote:

> Erik Hatcher wrote:
>> Cool.... I'll work on adding an implementation then.  But what would 
>> be the key to the map if not the IndexReader instance?  It ought to 
>> be something related to that at least for the scenarios where a 
>> single filter instance is being used over multiple indices.  Or would 
>> simply two different constructors be enough (one taking a Filter and 
>> defaulting to a WeakHashMap, and the other taking a Filter and a Map 
>> to use), and still use IndexReader as the key?
>
> Well, seeing as how things are resolving on the other half of this 
> thread, I'll take back my concern about tying the caching to a reader 
> instance (since I've come up with a decent solution which will allow 
> me to use a single reader).
>
>
> Regards,
>
> Bruce Ritchie
> <smime.p7s>


---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-dev-help@jakarta.apache.org



Re: Caching filter wrapper

Posted by Bruce Ritchie <br...@jivesoftware.com>.
Erik Hatcher wrote:
> Cool.... I'll work on adding an implementation then.  But what would be 
> the key to the map if not the IndexReader instance?  It ought to be 
> something related to that at least for the scenarios where a single 
> filter instance is being used over multiple indices.  Or would simply 
> two different constructors be enough (one taking a Filter and defaulting 
> to a WeakHashMap, and the other taking a Filter and a Map to use), and 
> still use IndexReader as the key?

Well, seeing as how things are resolving on the other half of this thread, I'll take back my concern 
about tying the caching to a reader instance (since I've come up with a decent solution which will 
allow me to use a single reader).


Regards,

Bruce Ritchie

Re: Caching filter wrapper

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Monday, September 15, 2003, at 12:46  PM, Bruce Ritchie wrote:
> Erik Hatcher wrote:
>> So, if there was a caching filter implemented like yours, but with 
>> the WeakHashMap cache like QueryFilter, would you use it instead of 
>> what you've done?
>
> I think it would depend on whether the cache keys are independent of 
> IndexReaders (i.e. an implementation that, unlike QueryFilter, does 
> not use an IndexReader as a cache key or part thereof). This is 
> because I open multiple IndexReaders against a single index, which 
> would cause (false) cache misses. If that wasn't the case then I 
> think I'd be ok with using it, regardless of my preference to use 
> our own cache architecture. I'd definitely use it if I could provide 
> the backing map via a setMap() method or the like.

Cool.... I'll work on adding an implementation then.  But what would be 
the key to the map if not the IndexReader instance?  It ought to be 
something related to that at least for the scenarios where a single 
filter instance is being used over multiple indices.  Or would simply 
two different constructors be enough (one taking a Filter and 
defaulting to a WeakHashMap, and the other taking a Filter and a Map to 
use), and still use IndexReader as the key?

	Erik


Re: Caching filter wrapper (was Re: RE : DateFilter.Before/After)

Posted by Bruce Ritchie <br...@jivesoftware.com>.
Erik Hatcher wrote:
> So, if there was a caching filter implemented like yours, but with the 
> WeakHashMap cache like QueryFilter, would you use it instead of what 
> you've done?  

I think it would depend on whether the cache keys are independent of IndexReaders (i.e. an 
implementation that, unlike QueryFilter, does not use an IndexReader as a cache key or part 
thereof). This is because I open multiple IndexReaders against a single index, which would cause 
(false) cache misses. If that wasn't the case then I think I'd be ok with using it, regardless of 
my preference to use our own cache architecture. I'd definitely use it if I could provide the 
backing map via a setMap() method or the like.

> I'm in agreement with you about where the caching should 
> be.  Would anyone object to such an implementation added to Lucene's core?

It's fine by me but I'm only one user :)


Regards,

Bruce Ritchie

Caching filter wrapper (was Re: RE : DateFilter.Before/After)

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Monday, September 15, 2003, at 10:47  AM, Bruce Ritchie wrote:
> Well, there's a part of the application that you're not seeing that 
> does the checking of the index state. This is done inside of what is 
> essentially a wrapper to a map which clears all filters upon a state 
> change in the index. Your point is perfectly valid however for anyone 
> who does not have such a wrapper and one that I should have mentioned 
> in my previous email.

So, if there was a caching filter implemented like yours, but with the 
WeakHashMap cache like QueryFilter, would you use it instead of what 
you've done?  I'm in agreement with you about where the caching should 
be.  Would anyone object to such an implementation added to Lucene's 
core?

	Erik


Re: RE : DateFilter.Before/After

Posted by Bruce Ritchie <br...@jivesoftware.com>.
Erik Hatcher wrote:

> On Monday, September 15, 2003, at 09:45  AM, Bruce Ritchie wrote:
> 
>> I would suggest *not* using caching inside of filters provided by 
>> lucene but rather provide a wrapper to do the caching. The reason is 
>> that some applications really don't want the libraries they use to be 
>> a source of concern for memory usage. i.e. if I search for a string 
>> using 10,000 different date filters (an extreme example, but possible) 
>> I want the ability to control how those bitsets are going to be cached.
> 
> 
> In the case of QueryFilter, simply construct a new one to avoid caching 
> rather than reuse the same instance.  So you have control there as 
> well.  The only thing that is cached is a BitSet, so it should be much 
> of a memory usage concern.

Perhaps. I guess my point is that I would prefer a wrapper architecture for caching rather that 
having it built-in directly in the filters. Having it built-in would require me to rip it out when I 
upgraded our applications to the latest release. Implementing it as a wrapper allows me to bypass 
the caching built-in and decide what I want cached, how many I want cached and for how long (our 
caches can be size and time limited). Having it as a wrapper also allows other people to make the 
same sort of decisions as I require.

>> public class CachedFilter extends Filter {
>>     BitSet bits;
>>     Filter filter;
>>
>>     public CachedFilter(Filter filter) {
>>         this.filter = filter;
>>         this.bits = null;
>>     }
>>
>>     public BitSet bits(IndexReader reader) throws IOException {
>>         if (bits != null) {
>>             return bits;
>>         }
>>
>>         bits = filter.bits(reader);
>>         return bits;
>>     }
>> }
> 
> 
> You would have problems if you searched a different index or different 
> instance of IndexReader even with your caching here.  You should cache 
> like QueryFilter does to avoid a potential mismatch with IndexReader 
> instances.

Well, there's a part of the application that you're not seeing that does the checking of the index 
state. This is done inside of what is essentially a wrapper to a map which clears all filters upon a 
state change in the index. Your point is perfectly valid however for anyone who does not have such a 
wrapper and one that I should have mentioned in my previous email.


Regards,

Bruce Ritchie

Re: RE : DateFilter.Before/After

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Monday, September 15, 2003, at 10:27  AM, Erik Hatcher wrote:
> On Monday, September 15, 2003, at 09:45  AM, Bruce Ritchie wrote:
>> I would suggest *not* using caching inside of filters provided by 
>> lucene but rather provide a wrapper to do the caching. The reason is 
>> that some applications really don't want the libraries they use to be 
>> a source of concern for memory usage. i.e. if I search for a string 
>> using 10,000 different date filters (an extreme example, but 
>> possible) I want the ability to control how those bitsets are going 
>> to be cached.
>
> In the case of QueryFilter, simply construct a new one to avoid 
> caching rather than reuse the same instance.  So you have control 
> there as well.  The only thing that is cached is a BitSet, so it 
> should be much of a memory usage concern.

oops... typo... should NOT


Re: RE : DateFilter.Before/After

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
On Monday, September 15, 2003, at 09:45  AM, Bruce Ritchie wrote:
> I would suggest *not* using caching inside of filters provided by 
> lucene but rather provide a wrapper to do the caching. The reason is 
> that some applications really don't want the libraries they use to be 
> a source of concern for memory usage. i.e. if I search for a string 
> using 10,000 different date filters (an extreme example, but possible) 
> I want the ability to control how those bitsets are going to be cached.

In the case of QueryFilter, simply construct a new one to avoid caching 
rather than reuse the same instance.  So you have control there as 
well.  The only thing that is cached is a BitSet, so it should be much 
of a memory usage concern.

> public class CachedFilter extends Filter {
>     BitSet bits;
>     Filter filter;
>
>     public CachedFilter(Filter filter) {
>         this.filter = filter;
>         this.bits = null;
>     }
>
>     public BitSet bits(IndexReader reader) throws IOException {
>         if (bits != null) {
>             return bits;
>         }
>
>         bits = filter.bits(reader);
>         return bits;
>     }
> }

You would have problems if you searched a different index or different 
instance of IndexReader even with your caching here.  You should cache 
like QueryFilter does to avoid a potential mismatch with IndexReader 
instances.
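
[The QueryFilter-style caching Erik refers to can be sketched like this. The Reader and Filter types below are hypothetical stand-ins, not the real Lucene classes: the point is that the cache key is the reader itself, held in a WeakHashMap so entries disappear when a reader is garbage collected, and a new reader (changed index) naturally misses the cache.]

```java
import java.util.BitSet;
import java.util.Map;
import java.util.WeakHashMap;

public class CachingWrapperDemo {
    // Minimal stand-ins for the types under discussion (hypothetical names).
    interface Reader {}
    interface Filter {
        BitSet bits(Reader reader);
    }

    // Caches the wrapped filter's BitSet per reader, the way QueryFilter
    // does: keyed on the reader instance, weakly, so entries are reclaimed
    // along with the reader.
    static class CachingWrapperFilter implements Filter {
        private final Filter filter;
        private final Map<Reader, BitSet> cache = new WeakHashMap<Reader, BitSet>();

        CachingWrapperFilter(Filter filter) { this.filter = filter; }

        public synchronized BitSet bits(Reader reader) {
            BitSet cached = cache.get(reader);
            if (cached != null) return cached;
            BitSet bits = filter.bits(reader);
            cache.put(reader, bits);
            return bits;
        }
    }

    static int calls = 0;

    // Exercise the wrapper: two lookups on one reader, one on another.
    static int demoCalls() {
        calls = 0;
        Filter counting = new Filter() {
            public BitSet bits(Reader reader) { calls++; return new BitSet(8); }
        };
        CachingWrapperFilter caching = new CachingWrapperFilter(counting);
        Reader r1 = new Reader() {}, r2 = new Reader() {};
        caching.bits(r1);
        caching.bits(r1); // cache hit: no extra call to the wrapped filter
        caching.bits(r2); // different reader: cache miss
        return calls;
    }

    public static void main(String[] args) {
        System.out.println("underlying filter called " + demoCalls() + " times"); // 2
    }
}
```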

But your implementation is exactly what I was envisioning with the 
added WeakHashMap of QueryFilter.

	Erik


Re: RE : DateFilter.Before/After

Posted by Bruce Ritchie <br...@jivesoftware.com>.

Erik Hatcher wrote:

> Also, regarding DateFilter.... would it be reasonable to apply the same 
> bitset caching that QueryFilter uses?
> 
> what about a CachingFilterWrapper implementation that implements Filter 
> and the QueryFilter-like caching, and passes through to another filter 
> for the actual gathering of the bitset?


I would suggest *not* using caching inside of filters provided by lucene but rather provide a 
wrapper to do the caching. The reason is that some applications really don't want the libraries they 
use to be a source of concern for memory usage. i.e. if I search for a string using 10,000 different 
date filters (an extreme example, but possible) I want the ability to control how those bitsets are 
going to be cached. Yes, I realize that a WeakHashMap can be and is used by QueryFilter; however, I 
think it's best if the client application does the caching, as only it is best suited to know what 
should and should not be cached. It's dead simple for the client app to just do the lookup of the 
filter against a key in the hashmap.

I use the following - it's simple and effective.


public class CachedFilter extends Filter {
     BitSet bits;
     Filter filter;

     public CachedFilter(Filter filter) {
         this.filter = filter;
         this.bits = null;
     }

     public BitSet bits(IndexReader reader) throws IOException {
         if (bits != null) {
             return bits;
         }

         bits = filter.bits(reader);
         return bits;
     }
}


Regards,

Bruce Ritchie

Re: RE : DateFilter.Before/After

Posted by Erik Hatcher <er...@ehatchersolutions.com>.
Also, regarding DateFilter.... would it be reasonable to apply the same 
bitset caching that QueryFilter uses?

what about a CachingFilterWrapper implementation that implements Filter 
and the QueryFilter-like caching, and passes through to another filter 
for the actual gathering of the bitset?

On Monday, September 15, 2003, at 06:58  AM, Rasik Pandey wrote:

> Hello,
>>
>> Is it odd to anyone that DateFilter.Before/After is
>> *inclusive* of the
>> date you specify?  Seems counter-intuitive to me, but I'm
>> going to add
>> "on or before/after" to the Javadocs at least.  The methods
>> should have
>> probably been called OnOrBefore/After :)
>>
>
> I was stumped by this recently. I agree the documentation could be
> better....
>
>
> Rasik Pandey
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: lucene-dev-unsubscribe@jakarta.apache.org
> For additional commands, e-mail: lucene-dev-help@jakarta.apache.org


RE : DateFilter.Before/After

Posted by Rasik Pandey <ra...@ajlsm.com>.
Hello,
 >
 >Is it odd to anyone that DateFilter.Before/After is 
 >*inclusive* of the 
 >date you specify?  Seems counter-intuitive to me, but I'm 
 >going to add 
 >"on or before/after" to the Javadocs at least.  The methods 
 >should have 
 >probably been called OnOrBefore/After :)
 >
 
I was stumped by this recently. I agree the documentation could be
better....


Rasik Pandey