You are viewing a plain text version of this content. The canonical link for it is here.
Posted to java-user@lucene.apache.org by Hemant Joshi <he...@gmail.com> on 2006/01/25 15:32:51 UTC

MoreDisLikeThis query

Hi,
           I have looked at MoreLikeThis functionality. I would like to 
add moreDisLikeThis functionality as well. It is important for me to 
learn from similarity as well as dissimilarity with other documents. I 
have done the basic ground work of forming two queries (one with 
MoreLikeThis class for similarity to given documents - query1) and other 
to include high TF terms from documents to avoid (or dislike) as query2.

I combine the 2 queries as
query1 NOT query2 and obtain Hits through searcher.

Is this the way to do it? Does high term frequency indicate most likely 
terms of the document to be represented in the query?
For example, if I would like to search for all documents with term 
cancer, then I can simply search for query +(contents:cancer). If I have 
to look for all documents with cancer as disease and not as a sun sign, 
then probably my query should be +(contents:cancer contents:disease) NOT 
(contents:astrology contents:zodiac)

Any comments would be appreciated.
-Hemant

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org