Posted to java-user@lucene.apache.org by Ivan Brusic <iv...@brusic.com> on 2014/06/12 19:47:04 UTC

Relevancy tests

Perhaps more of an NLP question, but are there any tests regarding
relevance for Lucene? Given an example corpus of documents, what are the
golden sets for specific queries? The Wikipedia dump is used as a
benchmarking corpus for both indexing and querying in Lucene, but there are
no accompanying relevance metrics such as precision.

The Open Relevance project was closed yesterday (
http://lucene.apache.org/openrelevance/), which is what prompted me to ask
this question. Was the sub-project closed because others have found
alternate solutions?

Relevancy is of course extremely context-dependent and subjective, but my
hope is that there is an example catalog somewhere with defined golden sets.
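
To make the question concrete: given a judged golden set, I would expect to
be able to compute something like precision@k against it. A minimal sketch
of what I mean (the golden-set map and the SearchFn interface are
hypothetical placeholders of mine, not anything in Lucene):

import java.util.List;
import java.util.Map;
import java.util.Set;

public class PrecisionAtK {

    /** Stand-in for whatever actually runs the query against the index. */
    interface SearchFn {
        List<String> topDocs(String query, int k);
    }

    /** Fraction of the top-k results that the golden set judges relevant. */
    static double precisionAtK(List<String> ranked, Set<String> relevant, int k) {
        int hits = 0;
        for (int i = 0; i < Math.min(k, ranked.size()); i++) {
            if (relevant.contains(ranked.get(i))) {
                hits++;
            }
        }
        return (double) hits / k;
    }

    /** Mean precision@k over every judged query in the golden set. */
    static double meanPrecisionAtK(Map<String, Set<String>> goldenSet,
                                   SearchFn search, int k) {
        double sum = 0.0;
        for (Map.Entry<String, Set<String>> entry : goldenSet.entrySet()) {
            sum += precisionAtK(search.topDocs(entry.getKey(), k),
                                entry.getValue(), k);
        }
        return goldenSet.isEmpty() ? 0.0 : sum / goldenSet.size();
    }
}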

Cheers,

Ivan

Re: Relevancy tests

Posted by Ahmet Arslan <io...@yahoo.com.INVALID>.
Hi,

Relevance judgments are labor-intensive and expensive to produce. Some information retrieval evaluation campaigns (TREC, CLEF, etc.) provide such golden sets, but they are not freely public.

http://rosenfeldmedia.com/books/search-analytics/ talks about how to create a "golden set" for your top n queries.
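
For what it's worth, the TREC judgments themselves are just plain-text "qrels" files, one judged document per line: topic id, an unused iteration field, document id, and a relevance grade (0 = not relevant, higher = more relevant). The document ids below are made up for illustration:

301 0 DOC-00042 1
301 0 DOC-00107 0
302 0 DOC-00986 2

If you build your own golden set for your top n queries, the same trivial format works fine as a starting point.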


There is also published work describing how to tune the parameters of a search system using click-through data.





Re: Relevancy tests

Posted by Doug Turnbull <dt...@opensourceconnections.com>.
Relevancy judgement lists ARE very context sensitive. For example, in a
medical search application you'll have very different relevancy
requirements between a point-of-care application and an application being
used to perform general "sit at your desk" research, *even if the content
being served is identical*.

Point-of-care search is about getting to a solution fast. It's targeted. Recency
may be more of a factor. Specific solutions to medical problems may be more
important.

Sit-at-your-desk research may be more about futzing around with general
knowledge and the "discovery" aspect of search.

Even IF the data sets for the two applications were 100% identical, you
would almost certainly provide different relevancy rules based on the
different use cases.

We do a lot of testing with judgement lists (mostly through our product
Quepid <http://quepid.com>, but there are other home-grown scripted tools
people use too). Judgement lists are great for collaborating closely with
your client on what you expect search to do -- i.e. capturing informal use
cases. They let the client make assertions about what the correct order of
search results should be. This allows you to optimize for a reasonable set
of use cases.

We've had judgement lists work well as long as the cases are representative
in nature. For example, in a name search application you don't need both a
"D. Turnbull" and a "Y. Seeley" to test the case of "first initial/last
name" search. You often just need one exemplar to test against to prove
you've solved (and continue to solve) that problem.
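
In home-grown scripted form, those assertions can be as simple as one unit
test per exemplar. A rough sketch (the search() helper is a hypothetical
stand-in for however you query your engine, and the document id is made up):

import static org.junit.Assert.assertEquals;

import java.util.List;

import org.junit.Test;

public class NameSearchRelevancyTest {

    @Test
    public void firstInitialLastNameFindsThePersonFirst() {
        // One exemplar stands in for the whole "first initial/last name" case.
        List<String> results = search("D. Turnbull");
        assertEquals("person:doug-turnbull", results.get(0));
    }

    /** Hypothetical helper: run the query and return ranked document ids. */
    private List<String> search(String query) {
        throw new UnsupportedOperationException("wire this up to your engine");
    }
}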

Judgement lists based on "experts" tend to break down occasionally when the
person you're collaborating with does not actually reflect the behavior of
real users. So we'll also work on relevancy in the context of judgement
lists generated programmatically from user behavior (i.e. query logs), not
just what the expert says. That means more integration work and requires
more data, but it is potentially more beneficial for relevancy tuning.
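
A deliberately crude sketch of that mining step, just to show the shape of
it (the Click type and log handling are made up; a real approach would
correct for position bias, dwell time, etc., which this does not):

import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class ClickDerivedJudgments {

    /** One (query, clicked doc) pair parsed from the query logs. */
    static class Click {
        final String query;
        final String docId;
        Click(String query, String docId) { this.query = query; this.docId = docId; }
    }

    /**
     * Count clicks per (query, doc) and treat docs clicked at least
     * minClicks times as "relevant" for that query. Deliberately naive:
     * no position-bias correction, no dwell time, no graded levels.
     */
    static Map<String, Set<String>> judgmentsFromClicks(List<Click> clicks, int minClicks) {
        Map<String, Map<String, Integer>> counts = new HashMap<String, Map<String, Integer>>();
        for (Click c : clicks) {
            Map<String, Integer> perDoc = counts.get(c.query);
            if (perDoc == null) {
                perDoc = new HashMap<String, Integer>();
                counts.put(c.query, perDoc);
            }
            Integer n = perDoc.get(c.docId);
            perDoc.put(c.docId, n == null ? 1 : n + 1);
        }
        Map<String, Set<String>> judgments = new HashMap<String, Set<String>>();
        for (Map.Entry<String, Map<String, Integer>> e : counts.entrySet()) {
            Set<String> relevant = new HashSet<String>();
            for (Map.Entry<String, Integer> d : e.getValue().entrySet()) {
                if (d.getValue() >= minClicks) relevant.add(d.getKey());
            }
            if (!relevant.isEmpty()) judgments.put(e.getKey(), relevant);
        }
        return judgments;
    }
}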

We blog a fair amount about relevancy and regression testing.
You can read more here
<http://www.opensourceconnections.com/2013/10/21/search-quality-is-about-effective-collaboration/>
, here
<http://www.opensourceconnections.com/blog/2014/06/10/what-is-search-relevancy/>,
and here
<http://www.opensourceconnections.com/2013/10/14/what-is-test-driven-search-relevancy/>.
Hope it's helpful to you.

Good luck
-Doug
Search Relevancy Consultant
OpenSource Connections








-- 
Doug Turnbull
Search & Big Data Architect
OpenSource Connections <http://o19s.com>