Posted to java-user@lucene.apache.org by Dragon Fly <dr...@hotmail.com> on 2012/02/26 14:30:32 UTC

Most recent document within a group ...

Hi,

Let's say I have 6 documents and each document has 2 fields (i.e. CustomerName and OrderDate).  For example:

Doc 1    John    20120115
Doc 2    Mary    20120113
Doc 3    Peter   20120117
Doc 4    Kate    20120208
Doc 5    John    20120211
Doc 6    Alan    20110423

Is there a way to execute a search to return the document with the most recent OrderDate for a CustomerName? For instance, if I search for John, it should return Doc 5 (because Doc 5 is more recent than Doc 1).  Thank you.

RE: Most recent document within a group ...

Posted by Dragon Fly <dr...@hotmail.com>.
I'll give it a try, thanks.

> Date: Mon, 27 Feb 2012 08:29:57 -0500
> Subject: Re: Most recent document within a group ...
> From: erickerickson@gmail.com
> To: java-user@lucene.apache.org
> 
> Just try it. Sorting doesn't load the document, it does load
> the unique values for the sort field. Which is why indexing
> dates benefits from using the coarsest resolution you can,
> i.e. don't store millisecond resolution if all you care about
> is the day something was published.
> 
> In fact, sorting doesn't load the documents at all, the values
> are read from the inverted index.
> 
> Best
> Erick

Re: Most recent document within a group ...

Posted by Erick Erickson <er...@gmail.com>.
Just try it. Sorting doesn't load the documents; it only loads
the unique values for the sort field. That's why indexing
dates benefits from using the coarsest resolution you can,
i.e. don't store millisecond resolution if all you care about
is the day something was published.

In fact, sorting doesn't load the documents at all; the values
are read from the inverted index.

Best
Erick
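
To make the point about sorting concrete, here is a minimal sketch in plain Java. It is deliberately not the Lucene API: the class TopHitSketch, its arrays, and the loadDocument counter are all hypothetical stand-ins. It assumes the sort values sit in a per-document array, picks the most recent matching doc id from that array alone, and only then fetches a single document. In Lucene 3.x the corresponding fetch would presumably be IndexSearcher.doc(int), called on the top hit only.

```java
// Plain-Java model of "sort on a field, then load only the top hit".
// NOT the Lucene API; every name here is hypothetical.
class TopHitSketch {
    // Per-document values, indexed by 0-based doc id, as they might be held in memory for sorting.
    static final String[] NAMES = {"John", "Mary", "Peter", "Kate", "John", "Alan"};
    static final int[] ORDER_DATES = {20120115, 20120113, 20120117, 20120208, 20120211, 20110423};

    // Counts how many full documents were fetched (a stand-in for the slow disk read).
    static int documentLoads = 0;

    // Returns the id of the matching doc with the largest OrderDate, or -1 if nothing matches.
    // Only the in-memory arrays are consulted; no document is loaded here.
    static int mostRecentDocId(String customerName) {
        int best = -1;
        for (int docId = 0; docId < NAMES.length; docId++) {
            if (!NAMES[docId].equals(customerName)) {
                continue;
            }
            if (best == -1 || ORDER_DATES[docId] > ORDER_DATES[best]) {
                best = docId;
            }
        }
        return best;
    }

    // Stand-in for fetching the stored document from disk.
    static String loadDocument(int docId) {
        documentLoads++;
        return "Doc " + (docId + 1) + " " + NAMES[docId] + " " + ORDER_DATES[docId];
    }

    public static void main(String[] args) {
        int top = mostRecentDocId("John");
        System.out.println(loadDocument(top));
        System.out.println("documents loaded: " + documentLoads);
    }
}
```

However many hits match, the sketch loads exactly one document; the comparison work happens on the date values only.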

On Mon, Feb 27, 2012 at 8:04 AM, Dragon Fly <dr...@hotmail.com> wrote:
>
> Erick, what if the search returns 100,000 hits? I'm trying to avoid loading a large number of documents from disk (i.e. a slow operation) and then pick up the top one.  I know how to execute a search (sorted by date).  Is there a way to just load the first hit from disk? I don't know which Lucene method call would actually load the documents from disk.  searcher.doc () maybe? Thanks.

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


RE: Most recent document within a group ...

Posted by Dragon Fly <dr...@hotmail.com>.
Erick, what if the search returns 100,000 hits? I'm trying to avoid loading a large number of documents from disk (i.e. a slow operation) just to pick the top one.  I know how to execute a search (sorted by date).  Is there a way to load only the first hit from disk? I don't know which Lucene method call actually loads the documents from disk.  searcher.doc() maybe? Thanks.

> Date: Sun, 26 Feb 2012 15:39:21 -0500
> Subject: Re: Most recent document within a group ...
> From: erickerickson@gmail.com
> To: java-user@lucene.apache.org
> 
> Have you looked at the Searcher.search variant
> that takes a Sort parameter?
> 
> Best
> Erick

Re: Most recent document within a group ...

Posted by Erick Erickson <er...@gmail.com>.
Have you looked at the Searcher.search variant
that takes a Sort parameter?

Best
Erick
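
As a rough illustration of what a sorted top-n search gives you, the sketch below models it in plain Java rather than with Lucene classes (SortedSearchSketch, Order, and topN are hypothetical names). It assumes OrderDate is stored as a yyyyMMdd string, so descending string order is also newest-first, and it asks for n = 1. With Lucene 3.x the real call would be something along the lines of IndexSearcher.search(query, null, 1, new Sort(new SortField("OrderDate", SortField.STRING, true))); check the javadocs for your version.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Plain-Java model of a sorted top-n search. NOT the Lucene API; all names are hypothetical.
class SortedSearchSketch {
    static final class Order {
        final int docNumber;
        final String customerName;
        final String orderDate; // yyyyMMdd, so string order equals chronological order

        Order(int docNumber, String customerName, String orderDate) {
            this.docNumber = docNumber;
            this.customerName = customerName;
            this.orderDate = orderDate;
        }
    }

    // The six example documents from the question.
    static final List<Order> INDEX = Arrays.asList(
            new Order(1, "John", "20120115"),
            new Order(2, "Mary", "20120113"),
            new Order(3, "Peter", "20120117"),
            new Order(4, "Kate", "20120208"),
            new Order(5, "John", "20120211"),
            new Order(6, "Alan", "20110423"));

    // Returns at most n matching orders, newest OrderDate first.
    // Simplification: this sorts every hit, whereas a real engine would
    // typically keep only the best n in a bounded queue.
    static List<Order> topN(String customerName, int n) {
        List<Order> hits = new ArrayList<>();
        for (Order o : INDEX) {
            if (o.customerName.equals(customerName)) {
                hits.add(o);
            }
        }
        hits.sort(Comparator.comparing((Order o) -> o.orderDate).reversed());
        return new ArrayList<>(hits.subList(0, Math.min(n, hits.size())));
    }

    public static void main(String[] args) {
        for (Order o : topN("John", 1)) {
            System.out.println("Doc " + o.docNumber + " " + o.customerName + " " + o.orderDate);
        }
    }
}
```

For the example data, topN("John", 1) yields Doc 5, which is the behaviour asked for in the original question.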
