Posted to java-user@lucene.apache.org by Christopher Condit <co...@sdsc.edu> on 2009/08/05 23:11:27 UTC

RE: Analysis Question

Perhaps a better question: let's say I have a few thousand terms or phrases. I want to prefer documents with these phrases in my search results over documents that do not have these terms or phrases. What's the best way to accomplish this?
Thanks,
-Chris

> -----Original Message-----
> From: Christopher Condit [mailto:condit@sdsc.edu]
> Sent: Tuesday, July 21, 2009 2:48 PM
> To: java-user@lucene.apache.org
> Subject: Analysis Question
> 
> I'm trying to implement an analyzer that will compute a score based on
> vocabulary terms in the indexed content (i.e., a document field with more
> terms in the vocabulary will score higher). Although I can see the
> tokens, I can't seem to access the document from the analyzer to set a
> new field on it after I compute the value. Is there a way to do this
> from an Analyzer? Or is there another preferred way to do this?
> Thanks,
> -Chris
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
> For additional commands, e-mail: java-user-help@lucene.apache.org




Re: Analysis Question

Posted by Ian Lea <ia...@gmail.com>.
You could write your own analyzer that works out a boost as it
analyzes the document fields and exposes a getBoost() method. You
would call that method to get the value to add to the document as a
separate field. If you write your own analyzer, you can pass it
whatever you like and have it do whatever you want.
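
A minimal sketch of that idea, assuming the tokens can be fed to a small
helper as they are produced. VocabularyBoostCounter, accept() and reset()
are hypothetical names invented for illustration, not part of Lucene, and
the boost formula (1.0 plus the fraction of tokens found in the vocabulary)
is only one possible choice. It matches single terms only; phrases would
need extra handling.

```java
import java.util.HashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper (not a Lucene class): counts how many tokens of one
// document are in a fixed vocabulary and derives a boost from that count.
class VocabularyBoostCounter {
    private final Set<String> vocabulary = new HashSet<>();
    private int totalTokens;
    private int vocabularyHits;

    VocabularyBoostCounter(Iterable<String> terms) {
        for (String term : terms) {
            vocabulary.add(term.toLowerCase(Locale.ROOT));
        }
    }

    // Call once per token, for example from a custom filter in the
    // analysis chain as each token passes through.
    void accept(String token) {
        totalTokens++;
        if (vocabulary.contains(token.toLowerCase(Locale.ROOT))) {
            vocabularyHits++;
        }
    }

    // Assumed formula: 1.0 when no token is in the vocabulary,
    // rising to 2.0 when every token is.
    float getBoost() {
        if (totalTokens == 0) {
            return 1.0f;
        }
        return 1.0f + (float) vocabularyHits / totalTokens;
    }

    // Clear the counts before starting the next document.
    void reset() {
        totalTokens = 0;
        vocabularyHits = 0;
    }
}
```

The caller would feed each token to accept(), read getBoost() once the
field has been tokenized, and then store the value on the document, for
example as a separate field or through Document.setBoost(float) in the
Lucene 2.x API.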


--
Ian.

On Thu, Aug 6, 2009 at 8:37 PM, Christopher Condit<co...@sdsc.edu> wrote:
> Hi Anshum-
>> You might want to look at writing a custom analyzer or something and
>> add a document boost (while indexing) for documents containing those
>> terms.
>
> Do you know how to access the document from an analyzer? It seems to only have access to the field...
>
> Thanks,
> -Chris



RE: Analysis Question

Posted by Christopher Condit <co...@sdsc.edu>.
Hi Anshum-
> You might want to look at writing a custom analyzer or something and
> add a document boost (while indexing) for documents containing those
> terms.

Do you know how to access the document from an analyzer? It seems to only have access to the field...

Thanks,
-Chris




Re: Analysis Question

Posted by Anshum <an...@gmail.com>.
Hi Christopher,
You might want to look at writing a custom analyzer or something and add a
document boost (while indexing) for documents containing those terms.

--
Anshum Gupta
Naukri Labs!
http://ai-cafe.blogspot.com

The facts expressed here belong to everybody, the opinions to me. The
distinction is yours to draw............


On Thu, Aug 6, 2009 at 2:41 AM, Christopher Condit <co...@sdsc.edu> wrote:

> Perhaps a better question: let's say I have a few thousand terms or
> phrases. I want to prefer documents with these phrases in my search results
> over documents that do not have these terms or phrases. What's the best way
> to accomplish this?
> Thanks,
> -Chris