Posted to java-user@lucene.apache.org by Samuel García Martínez <sa...@gmail.com> on 2011/11/14 10:40:30 UTC

How to determine result quality

Hi list,

I have spent a few days reading about score normalization in Lucene (which
I now know can't be done) on this list, the wiki, blog posts, etc. I'm going
to describe my problem because I'm not sure that score normalization is what
our project needs.

*Background*:
  In our project, we are using Solr on top of Lucene with custom
RequestHandlers and SearchComponents. For a given query, we need to detect
when it returns poor results so that we can trigger different actions.

*Assumptions*:
  Immutable index (once indexed, it is not updated) and the same query type
every time (dismax qparser with the same field boosts, and no boost
functions or boost queries).

*Problem*:
  We know that score normalization is not feasible. But is there any way
to determine (given TF/IDF scoring and fixed field boosts) when the match
quality of the search results is poor?

*Example:* We have one index of scientific papers and another with medical
care centre information. When a user queries the first index and gets poor
results (inferred from the score?), we want to query the second index and
merge the results using some threshold (a score threshold?).

Thanks in advance
-- 
Regards,
Samuel García.

Re: How to determine result quality

Posted by Ian Lea <ia...@gmail.com>.
While you are correct that as a general rule it can't be done, there
is nothing stopping you from trying in your particular circumstances,
with an unchanging index and standard queries.  Look at the scores and
the matches, decide where the good/bad threshold lies, and hard-code
that in your application.  Just remember that if you change things you
are likely to need to change the threshold too.
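
As a rough illustration of the hard-coded threshold idea, here is a minimal
sketch in plain Java. It assumes the scores of the top hits have already been
copied out of the search response into an array sorted from best to worst. The
class name ResultQuality, the method isPoor, and the cutoff value 0.35 are all
hypothetical; the real cutoff would have to come from inspecting scores for
queries you already know to be good or bad on your own index.

```java
public class ResultQuality {

    // Hypothetical cutoff, chosen by hand after inspecting real scores.
    // It is only meaningful for one index and one query configuration.
    static final float POOR_SCORE_THRESHOLD = 0.35f;

    /**
     * Returns true when the result list looks poor: either there are no
     * hits at all, or the best hit scores below the threshold.
     * The scores are assumed to be sorted from best to worst.
     */
    static boolean isPoor(float[] scoresDescending, float threshold) {
        return scoresDescending.length == 0 || scoresDescending[0] < threshold;
    }

    public static void main(String[] args) {
        // Made-up scores standing in for the top hits of the first index.
        float[] firstIndexScores = {0.21f, 0.12f, 0.05f};

        if (isPoor(firstIndexScores, POOR_SCORE_THRESHOLD)) {
            System.out.println("Poor results: fall back to the second index");
        } else {
            System.out.println("Results look acceptable");
        }
    }
}
```

Scores from two different indexes are not directly comparable, so a second
index would most likely need its own cutoff, found the same way.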


--
Ian.


