Posted to java-user@lucene.apache.org by "Dr. Fish" <sp...@thetunaman.com> on 2008/06/22 18:29:46 UTC

How to make mutually exclusive lists of results

Hi,

I am currently using Lucene to index documents. I index 4 fields: the body
of the document, the city it is related to, the state it is related to, and
the country it is related to.

I have a Java web application where the user types in some search text, and
it searches the body of the document for matches. This works fine.

However, what I am looking to do is generate 3 separate, mutually exclusive
lists of search results.

List 1) Results from text search, but also matching current user city,
state, and country

List 2) Results from text search, but also matching current state and
country

List 3) Results from text search, but also matching current country

The idea would be for each search to be mutually exclusive.

The naive way I know to do this is to run 3 searches. For example, if my
search text was "bacon" and my city was Chicago, IL, I could do something
like


List 1 -> body:"bacon" AND city:"chicago" AND state:"illinois" AND
country:"united states"

List 2 -> body:"bacon" AND NOT city:"chicago" AND state:"illinois" AND
country:"united states"

List 3 -> body:"bacon" AND NOT city:"chicago" AND NOT state:"illinois" AND
country:"united states"


So this gives me my 3 mutually exclusive lists... but it makes me search the
index 3 times for each search I want to do. This seems rather jank.  Is
there some fancy Lucene tool I am missing that would let me do this?  I
don't have a ton of Lucene experience, so I think I am missing something
obvious.
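
For reference, here is a minimal sketch of how the three query strings from the example could be assembled in plain Java. The class and method names (ExclusiveQueries, phrase, build) are made up for illustration and are not part of Lucene. It writes AND NOT rather than a bare NOT, on the assumption that the default operator is OR, where a bare NOT can leave the body clause optional. This is still the three-searches approach; it only makes the queries explicit.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: builds the three mutually exclusive query strings. */
final class ExclusiveQueries {

    /** Wraps a value in double quotes, escaping backslashes and embedded quotes. */
    static String phrase(String value) {
        return "\"" + value.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Returns the query strings for list 1, list 2 and list 3, in that order. */
    static List<String> build(String text, String city, String state, String country) {
        String body = "body:" + phrase(text);
        String inCity = "city:" + phrase(city);
        String inState = "state:" + phrase(state);
        String inCountry = "country:" + phrase(country);

        List<String> queries = new ArrayList<String>();
        // List 1: same city, state and country.
        queries.add(body + " AND " + inCity + " AND " + inState + " AND " + inCountry);
        // List 2: same state and country, but a different city.
        queries.add(body + " AND NOT " + inCity + " AND " + inState + " AND " + inCountry);
        // List 3: same country, but a different city and state.
        queries.add(body + " AND NOT " + inCity + " AND NOT " + inState + " AND " + inCountry);
        return queries;
    }
}
```

Calling ExclusiveQueries.build("bacon", "chicago", "illinois", "united states") gives three strings that can each be handed to the query parser.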




-- 
View this message in context: http://www.nabble.com/How-to-make-mutually-exclusive-lists-of-results-tp18056289p18056289.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: How to make mutually exclusive lists of results

Posted by "Dr. Fish" <sp...@thetunaman.com>.
Ah, great, that was the kind of stuff I was looking for.

Right now I have < 1000 results, so it won't kill me to use Hits. I will
look into HitCollector for my next release to fix that...






Re: How to make mutually exclusive lists of results

Posted by Erick Erickson <er...@gmail.com>.
Watch out, though. By "hit" I assume you mean a Hits object. Iterating
over a large result set with a Hits object can be very inefficient because
the query is re-executed every 100 or so documents. Think about a HitCollector
instead.

Best
Erick.
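
To illustrate the callback style being suggested, here is a rough sketch in plain Java. DocCallback is a stand-in written for this example and is NOT a Lucene class; it only mimics the shape of a per-hit hook that receives a document id and a score. The per-document city and state arrays are also an assumption: they stand for field values looked up by document id without loading each stored document.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in for a per-hit callback; hypothetical, not a Lucene class. */
abstract class DocCallback {
    abstract void collect(int doc, float score);
}

/** Sorts matching document ids into three lists as they are reported. */
final class BucketingCallback extends DocCallback {
    final List<Integer> cityHits = new ArrayList<Integer>();
    final List<Integer> stateHits = new ArrayList<Integer>();
    final List<Integer> countryHits = new ArrayList<Integer>();

    private final String[] cityByDoc;   // assumed: city value indexed by document id
    private final String[] stateByDoc;  // assumed: state value indexed by document id
    private final String userCity;
    private final String userState;

    BucketingCallback(String[] cityByDoc, String[] stateByDoc,
                      String userCity, String userState) {
        this.cityByDoc = cityByDoc;
        this.stateByDoc = stateByDoc;
        this.userCity = userCity;
        this.userState = userState;
    }

    @Override
    void collect(int doc, float score) {
        if (!userState.equals(stateByDoc[doc])) {
            countryHits.add(doc);   // different state: country-level list
        } else if (userCity.equals(cityByDoc[doc])) {
            cityHits.add(doc);      // same state and same city
        } else {
            stateHits.add(doc);     // same state, different city
        }
    }
}
```

Each hit is seen exactly once, in whatever order the search reports it, so the lists are not sorted by score unless the caller sorts them afterwards.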




RE: How to make mutually exclusive lists of results

Posted by "Dr. Fish" <sp...@thetunaman.com>.
Ah I think I got it


hit.getDocument().getField("CityID").stringValue() seems to be what I wanted


Thanks!





RE: How to make mutually exclusive lists of results

Posted by "Dr. Fish" <sp...@thetunaman.com>.
I tried this first, and I was having trouble iterating over them.


If I do something like this

hit.getDocument().getField("CountryID").toString()

I get a big long BS string... not just the value I stored.

I also tried messing with tokenStreamValue(), and had the same result.
How do I just get the value of "CountryID" out of the document?




RE: How to make mutually exclusive lists of results

Posted by Steven A Rowe <sa...@syr.edu>.
Hi Dr. Fish,

You could make just a single query with the broadest query possible - e.g. 

  bacon AND country:"united states"

and then iterate over all results, dividing them into your three buckets based on the values of the other two fields.

Steve
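
A minimal sketch of the bucketing step described above, in plain Java with no Lucene types. The Result class is a hypothetical holder for the city and state values read from each hit of the single broad query, and ResultBuckets, classify and partition are names invented here. One assumption to note: a result whose state does not match always goes to the country list, even if its city name happens to match.

```java
import java.util.ArrayList;
import java.util.EnumMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that splits one result list into three exclusive lists. */
final class ResultBuckets {

    enum Bucket { CITY, STATE, COUNTRY }

    /** Hypothetical holder for the values read from one search result. */
    static final class Result {
        final String id;
        final String city;
        final String state;

        Result(String id, String city, String state) {
            this.id = id;
            this.city = city;
            this.state = state;
        }
    }

    /** Picks exactly one bucket per result, so the lists cannot overlap. */
    static Bucket classify(Result r, String userCity, String userState) {
        if (!userState.equals(r.state)) {
            return Bucket.COUNTRY;
        }
        return userCity.equals(r.city) ? Bucket.CITY : Bucket.STATE;
    }

    /** Splits the results into the three lists, preserving their original order. */
    static Map<Bucket, List<Result>> partition(List<Result> results,
                                               String userCity, String userState) {
        Map<Bucket, List<Result>> buckets =
                new EnumMap<Bucket, List<Result>>(Bucket.class);
        for (Bucket b : Bucket.values()) {
            buckets.put(b, new ArrayList<Result>());
        }
        for (Result r : results) {
            buckets.get(classify(r, userCity, userState)).add(r);
        }
        return buckets;
    }
}
```

Because every result lands in exactly one bucket, the three lists are mutually exclusive by construction, and if the input is in relevance order each list keeps that order.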
