Posted to java-user@lucene.apache.org by Michael Rusch <mc...@facstaff.wisc.edu> on 2006/11/16 02:15:22 UTC

newbie scoring question

I have documents that have a variety of IDs and names by which people may
commonly refer to them.  There is one guaranteed unique ID and multiple
other names, synonyms, etc. that are "almost unique".  Furthermore, any
document may have text that refers to another document by any of these
various identifiers.

What I want is that when somebody does a search using one of these
identifiers, the document(s) identified by that identifier score the best.

For example, there could be document A identified by ABE-0000008, and
document B that has within its content "this is kinda like ABE-0000008."  I
want a search for ABE-0000008 to hit both of these, but for A to score
better.

I am already indexing the identifier in the same field with the rest
of the content for the docs, so I'm getting hits to both; I'm just not sure
how to achieve the scoring I want.

Thanks,
Michael.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org

RE: newbie scoring question

Posted by Chris Hostetter <ho...@fucit.org>.

: So you can make a specific field's relevancy for a given term higher
: compared to another term using something like
:
: Id_field:someterm^2 || blob_field:someterm
:
: I'm kind of a newb myself but I think this should work for you.

Indeed, that is the approach I would recommend for this problem (I use it
frequently).



-Hoss



RE: newbie scoring question

Posted by Phil Rosen <pr...@optaros.com>.

You can set the boost on specific terms within your query; I believe the
syntax is:

Id_field:someterm^2

So you can make a specific field's relevancy for a given term higher 
compared to another term using something like

Id_field:someterm^2 || blob_field:someterm

I'm kind of a newb myself but I think this should work for you.
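
For illustration, here is a minimal sketch of building that boosted query
string in Java. It assumes the identifier is also indexed in its own field
(Id_field here) in addition to the content field (blob_field), since a
per-field boost cannot separate the two while everything is indexed in a
single field. The class and method names are hypothetical, and quoting the
identifier as a phrase is an assumption; how the term is actually matched
depends on the analyzer in use. The resulting string would be handed to
Lucene's QueryParser.

```java
public class BoostedQueryBuilder {

    // Wrap the raw term in double quotes so the whole identifier is treated
    // as one phrase; escape any backslashes and quotes it contains.
    public static String quote(String term) {
        String escaped = term.replace("\\", "\\\\").replace("\"", "\\\"");
        return "\"" + escaped + "\"";
    }

    // Build "idField:term^boost || contentField:term", so that a hit in the
    // identifier field contributes more to the score than a hit in the content.
    public static String build(String idField, String contentField,
                               String term, float boost) {
        String quoted = quote(term);
        return idField + ":" + quoted + "^" + boost
                + " || " + contentField + ":" + quoted;
    }

    public static void main(String[] args) {
        // Field names are taken from the example above; the boost of 2 is arbitrary.
        System.out.println(build("Id_field", "blob_field", "ABE-0000008", 2.0f));
    }
}
```

Running main prints Id_field:"ABE-0000008"^2.0 || blob_field:"ABE-0000008".
The boost value is only a starting point and may need tuning against real
data.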


