Posted to java-user@lucene.apache.org by Karl Koch <Th...@gmx.net> on 2004/01/17 15:33:38 UTC

Relevance Feedback (2)

Hello group,

I would like to implement Relevance Feedback functionality for my system.
From the previous discussion in this group I know that this is not implemented
in Lucene.

We all know that Relevance Feedback has two parts, which are
1) Term Reweighting
2) Query Expansion

I am interested in doing both.

My first thought was that Term Reweighting can be solved with term boosting,
and Query Expansion, well, basically by generating a new query. A closer look at
one of the classic term reweighting formulas (Rocchio), however, reveals that I
need access to the term vectors of the relevant documents as well as the term
vectors of the non-relevant documents. For Lucene, this would mean that I
need the score of each term in the relevant and non-relevant documents
to evaluate the reweighting formula.
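
For reference, the Rocchio update mentioned above can be sketched without any
Lucene calls. This is only a minimal sketch under assumptions: each document
and the query are represented as a plain map from term to weight (for example
a tf-idf weight, which is an assumption here), and the class name, method
names, and the alpha/beta/gamma values are illustrative, not part of Lucene.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: Rocchio reweighting over plain term-weight maps.
// Nothing here is a Lucene API; the maps stand in for term vectors.
class RocchioSketch {

    // Adds scale * centroid(docs) into target; an empty list contributes nothing.
    static void addCentroid(Map<String, Double> target,
                            List<Map<String, Double>> docs, double scale) {
        if (docs.isEmpty()) {
            return;
        }
        for (Map<String, Double> doc : docs) {
            for (Map.Entry<String, Double> e : doc.entrySet()) {
                target.merge(e.getKey(), scale * e.getValue() / docs.size(), Double::sum);
            }
        }
    }

    // q' = alpha * q + beta * centroid(relevant) - gamma * centroid(nonRelevant)
    static Map<String, Double> reweight(Map<String, Double> query,
                                        List<Map<String, Double>> relevant,
                                        List<Map<String, Double>> nonRelevant,
                                        double alpha, double beta, double gamma) {
        Map<String, Double> result = new HashMap<>();
        for (Map.Entry<String, Double> e : query.entrySet()) {
            result.put(e.getKey(), alpha * e.getValue());
        }
        addCentroid(result, relevant, beta);
        addCentroid(result, nonRelevant, -gamma);
        // Terms whose weight ends up zero or negative are dropped here
        // (a common simplification, assumed for this sketch).
        result.values().removeIf(w -> w <= 0.0);
        return result;
    }

    public static void main(String[] args) {
        Map<String, Double> query = Map.of("lucene", 1.0);
        List<Map<String, Double>> relevant = List.of(Map.of("lucene", 1.0, "vector", 2.0));
        List<Map<String, Double>> nonRelevant = List.of(Map.of("hits", 4.0, "vector", 1.0));
        System.out.println(reweight(query, relevant, nonRelevant, 1.0, 0.5, 0.25));
    }
}
```

The resulting map covers both parts at once: terms already in the query get
new weights (Term Reweighting), and terms that only occur in the relevant
documents enter the map as new query terms (Query Expansion).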

Coming back to Lucene, this would mean that I need to extract the Documents from
the Hits object after the search. From these Documents I would need to get
all terms and their scores.

However, Lucene does not provide this. Only Documents and their scores can be
retrieved. It does not provide access to their terms, and therefore no access to
term scores.
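
One conceivable way around this, assuming the original text of each hit is
still available somewhere (for example in a stored field, which is an
assumption about the index), would be to re-tokenize that text and count the
terms oneself. The sketch below uses no Lucene classes; the class and method
names are hypothetical, and the lower-casing split is only a crude stand-in
for whatever Analyzer was used at indexing time.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: builds a raw term-frequency vector from document text.
class TermFrequencySketch {

    // Lower-cases the text, splits on anything that is not a letter or digit,
    // and counts how often each remaining token occurs.
    static Map<String, Integer> termFrequencies(String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(termFrequencies("Lucene hits; lucene Terms."));
    }
}
```

These are raw counts, not Lucene scores; turning them into weights (for
instance by combining them with document frequencies) is left open here.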

Does somebody have ideas for a workaround for Term Reweighting and Query
Expansion that does not go through Hits? Has somebody produced such a workaround
and could provide it to me?

Thank you very much in advance,
Karl




---------------------------------------------------------------------
To unsubscribe, e-mail: lucene-user-unsubscribe@jakarta.apache.org
For additional commands, e-mail: lucene-user-help@jakarta.apache.org


Re: Relevance Feedback (2)

Posted by Karl Koch <Th...@gmx.net>.
Hello all,

oh, I just found a mail from Doug in which he wrote that Dmitry Serebrennikov
developed something that provides Document vector access:

> Dmitry Serebrennikov [dmitrys@earthlink.net] has implemented a substantial
> extension to Lucene which should help folks doing this sort of research.  It
> provides an explicit vector representation for documents.  This way you can,
> e.g., retrieve a number of documents, efficiently sum their vectors, then
> derive a new query from the sum.  This code was posted to the list a long
> while back, but is now out of date.  As soon as the 1.2 release is final,
> and Dmitry has time, he intends to merge it into Lucene.
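
The "sum their vectors, then derive a new query from the sum" idea in the
quote could look roughly like the sketch below. It is not Dmitry's code,
which I have not seen. It assumes document vectors are plain term-to-weight
maps, and all class and method names are made up for illustration. The only
Lucene-related detail is the `term^boost` form of the query syntax; terms
containing query-syntax special characters would still need escaping.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: sums document vectors and derives a boosted query string.
class QueryFromVectorsSketch {

    // Adds up the weight of every term across all given vectors.
    static Map<String, Double> sum(List<Map<String, Double>> vectors) {
        Map<String, Double> total = new HashMap<>();
        for (Map<String, Double> vector : vectors) {
            for (Map.Entry<String, Double> e : vector.entrySet()) {
                total.merge(e.getKey(), e.getValue(), Double::sum);
            }
        }
        return total;
    }

    // Keeps the maxTerms heaviest terms and writes them as "term^weight".
    static String topTermsQuery(Map<String, Double> summed, int maxTerms) {
        List<Map.Entry<String, Double>> entries = new ArrayList<>(summed.entrySet());
        // Highest weight first; ties are broken alphabetically for stable output.
        entries.sort((a, b) -> {
            int byWeight = Double.compare(b.getValue(), a.getValue());
            return byWeight != 0 ? byWeight : a.getKey().compareTo(b.getKey());
        });
        StringBuilder query = new StringBuilder();
        for (int i = 0; i < Math.min(maxTerms, entries.size()); i++) {
            if (i > 0) {
                query.append(' ');
            }
            query.append(entries.get(i).getKey())
                 .append('^')
                 .append(String.format(Locale.ROOT, "%.2f", entries.get(i).getValue()));
        }
        return query.toString();
    }

    public static void main(String[] args) {
        List<Map<String, Double>> vectors = List.of(
                Map.of("feedback", 1.0, "lucene", 2.0),
                Map.of("lucene", 1.0, "vector", 0.5));
        System.out.println(topTermsQuery(sum(vectors), 2));
    }
}
```

Cutting the sum down to the few heaviest terms keeps the derived query short;
how many terms to keep is a tuning choice this sketch does not settle.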

Who has this code? Could somebody email it to me? I would highly appreciate
it.

Has Dmitry or somebody else made any attempt to adapt it to Lucene 1.3?


I wish you all a nice weekend,
Karl
