Posted to java-user@lucene.apache.org by Chris Schilling <ch...@cellixis.com> on 2011/01/24 22:02:17 UTC

Indexing with weights

Hello,

I have a bunch of text documents formatted like so:

keyword1 wt1
keyword2 wt2
keyword3 wt3

I would like to index the documents based on the keywords.  When I search for a keyword, I would like the list of matching documents to be sorted by the weight for that keyword.  Is there an example anywhere of how to do this?  I own LIA (Lucene in Action), but have not made it through the entire book yet.  Apologies if this is addressed there.
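
For illustration, a file in this format could be read along the following lines before anything is handed to Lucene. This is a minimal plain-Java sketch, not Lucene code: the class name KeywordWeightParser and its parse method are hypothetical, and it assumes one whitespace-separated keyword/weight pair per line with a numeric weight.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Lucene: parses "keyword weight" lines.
class KeywordWeightParser {

    // Returns keyword -> weight in file order. Blank lines are skipped and
    // a repeated keyword keeps the last weight seen.
    static Map<String, Double> parse(List<String> lines) {
        Map<String, Double> weights = new LinkedHashMap<>();
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            // Assumption: exactly two whitespace-separated tokens per line.
            String[] parts = trimmed.split("\\s+");
            if (parts.length != 2) {
                throw new IllegalArgumentException("Expected 'keyword weight', got: " + line);
            }
            weights.put(parts[0], Double.parseDouble(parts[1]));
        }
        return weights;
    }

    public static void main(String[] args) {
        // Made-up sample weights, only to show the call.
        System.out.println(parse(Arrays.asList("keyword1 0.9", "keyword2 0.4", "keyword3 0.1")));
    }
}
```

Each resulting keyword/weight pair is what would then be added to the Lucene document for that file.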

Thank you!
Chris S.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org


Re: Indexing with weights

Posted by ba...@gmail.com.

-----Original Message-----
From: Erick Erickson <er...@gmail.com>
Date: Mon, 24 Jan 2011 16:16:54 
To: <ja...@lucene.apache.org>
Reply-To: java-user@lucene.apache.org
Subject: Re: Indexing with weights

I think all you need to do is index the keywords in one field and weights in
another.
Then just search on keywords and sort on weight.

Note: the field you sort on should NOT be tokenized.

Best
Erick



Re: Indexing with weights

Posted by Chris Schilling <ch...@cellixis.com>.
Thanks Erick,

So something like:

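while (parseFile) {
	String keyword = ...;  // keyword parsed from the current line
	String score = ...;    // its weight, still as text
	// Searchable (analyzed) but not stored.
	doc.add(new Field("keywords", keyword, Field.Store.NO, Field.Index.ANALYZED));
	// Stored and indexed as a single untokenized term.
	doc.add(new Field("scores", score, Field.Store.YES, Field.Index.NOT_ANALYZED));
}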

How would I then retrieve the score associated with the keyword I searched for?





Re: Indexing with weights

Posted by Chris Schilling <ch...@cellixis.com>.
Well, maybe this trick is better?

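while (parseFile) {
	String keyword = ...;
	String score = ...;
	doc.add(new Field("keywords", keyword, Field.Store.NO, Field.Index.ANALYZED));
	// One numeric field per keyword, named after the keyword itself.
	// NumericField takes a double, so the parsed text has to be converted.
	doc.add(new NumericField(keyword).setDoubleValue(Double.parseDouble(score)));
}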

Then, I guess I can sort based on the value of the field corresponding to the keyword that I search for.  
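
To make the idea concrete, here is a small plain-Java model of it (no Lucene involved): every document carries one numeric value per keyword, and a search for a keyword sorts the matching documents by the value stored under that same keyword. The class PerKeywordWeightIndex and its methods are hypothetical, and "highest weight first" is an assumption about the desired order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model, not Lucene: one numeric "field" per keyword in each document.
class PerKeywordWeightIndex {

    // docId -> (keyword -> weight); the inner map stands in for the per-keyword fields.
    private final Map<String, Map<String, Double>> docs = new HashMap<>();

    void addDocument(String docId, Map<String, Double> keywordWeights) {
        docs.put(docId, new HashMap<>(keywordWeights));
    }

    // Returns the ids of documents containing the keyword, highest weight first.
    // The order of documents with equal weights is unspecified.
    List<String> search(String keyword) {
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, Map<String, Double>> entry : docs.entrySet()) {
            if (entry.getValue().containsKey(keyword)) {
                hits.add(entry.getKey());
            }
        }
        // The sort key is the weight stored under the searched keyword itself.
        hits.sort((a, b) -> Double.compare(docs.get(b).get(keyword), docs.get(a).get(keyword)));
        return hits;
    }
}
```

With Lucene 3.x, the same ordering would presumably be requested by sorting on the numeric field named after the searched keyword, e.g. new Sort(new SortField(keyword, SortField.DOUBLE, true)); that call is not exercised here. One thing to watch: the field name has to match the search term exactly, which may not hold if the analyzer changes the keyword (lowercasing, for instance).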

I'll run with this for now to see if it works.

Thanks
C



Re: Indexing with weights

Posted by Erick Erickson <er...@gmail.com>.
I think all you need to do is index the keywords in one field and weights in
another.
Then just search on keywords and sort on weight.

Note: the field you sort on should NOT be tokenized.

Best
Erick
