You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Kalvir Sandhu <ka...@kalv.co.uk> on 2007/08/31 18:08:23 UTC
Weighting Issue
Hi all.
I am working on building a lucene index to search names of people. I want to
be able to score things differently. Here is an example of the behaviour i
need.
Doc 1 with aliases
name: Bob Jones
alias: John Smith Andrew Jones
Doc 2 without aliases
name: John Andrew Smith
alias: none
When i run a search with the lucene query:
name:(John Smith) alias:(John Smith)
I get Doc 2 as higher scored result than Doc 1. And the score of Doc 2 is
quite low. I need the score to not reflect how many names were assigned to
the document. I have been playing with the DefaultSimilarity to override
certain fields but not getting anywhere.
I could use a ConstantScoreQuery but i want to be able to perfom Fuzzy query
options sometimes too.
Any Ideas?
Kalv.
Re: Weighting Issue
Posted by Chris Hostetter <ho...@fucit.org>.
> Have you tried giving the name field a boost? E.g. name:(John Smith)^10
> alias:(John Smith)
i'm also guessing youd be much happier with a sloppy phrase query then
with the boolean queries you are currently using..
name:"John Smith"~3^10 alias:"John Smith"~3
-Hoss
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: Weighting Issue
Posted by Kalvir Sandhu <ka...@kalv.co.uk>.
Thanks for the reply - i have tried boosting but not like you stated. I have
tried to boost the Alias field so that it would score as high as a match on
the name field. But it didn't increase enough. like :
name:(John Smith) alias:(John Smith)^10
I think it has something to do with the fact that there is a lot of terms
stored in that document for alias, therefore weighting lower.
On 8/31/07, Michael Stoppelman <st...@gmail.com> wrote:
>
> Kalvir,
>
> Have you tried giving the name field a boost? E.g. name:(John Smith)^10
> alias:(John Smith)
>
> -M
>
> On 8/31/07, Kalvir Sandhu <ka...@kalv.co.uk> wrote:
> >
> > Hi all.
> >
> > I am working on building a lucene index to search names of people. I
> want
> > to
> > be able to score things differently. Here is an example of the behaviour
> i
> > need.
> >
> > Doc 1 with aliases
> > name: Bob Jones
> > alias: John Smith Andrew Jones
> >
> > Doc 2 without aliases
> > name: John Andrew Smith
> > alias: none
> >
> > When i run a search with the lucene query:
> > name:(John Smith) alias:(John Smith)
> >
> > I get Doc 2 as higher scored result than Doc 1. And the score of Doc 2
> is
> > quite low. I need the score to not reflect how many names were assigned
> to
> > the document. I have been playing with the DefaultSimilarity to override
> > certain fields but not getting anywhere.
> >
> > I could use a ConstantScoreQuery but i want to be able to perfom Fuzzy
> > query
> > options sometimes too.
> >
> > Any Ideas?
> >
> > Kalv.
> >
>
Re: Weighting Issue
Posted by Michael Stoppelman <st...@gmail.com>.
Kalvir,
Have you tried giving the name field a boost? E.g. name:(John Smith)^10
alias:(John Smith)
-M
On 8/31/07, Kalvir Sandhu <ka...@kalv.co.uk> wrote:
>
> Hi all.
>
> I am working on building a lucene index to search names of people. I want
> to
> be able to score things differently. Here is an example of the behaviour i
> need.
>
> Doc 1 with aliases
> name: Bob Jones
> alias: John Smith Andrew Jones
>
> Doc 2 without aliases
> name: John Andrew Smith
> alias: none
>
> When i run a search with the lucene query:
> name:(John Smith) alias:(John Smith)
>
> I get Doc 2 as higher scored result than Doc 1. And the score of Doc 2 is
> quite low. I need the score to not reflect how many names were assigned to
> the document. I have been playing with the DefaultSimilarity to override
> certain fields but not getting anywhere.
>
> I could use a ConstantScoreQuery but i want to be able to perfom Fuzzy
> query
> options sometimes too.
>
> Any Ideas?
>
> Kalv.
>