Posted to java-user@lucene.apache.org by Diaa Abdallah <di...@gmail.com> on 2014/05/18 15:34:54 UTC

How to use query terms' TF-IDF as a factor in document similarity calculation

Hi,
I'm trying to implement Explicit Semantic Analysis (ESA) with Lucene.

How do I take a term's TF-IDF weight in the query into account when
matching documents?

For example:
Query: "a b c a d a"
Doc1: "a b a"
Doc2: "a b c"

The query should match Doc1 better than Doc2, because "a" occurs three
times in the query and twice in Doc1 but only once in Doc2.
I'd like this to work without hurting search performance.
I'm currently doing this through query boosting. Is there a better way?
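
A minimal sketch of what such query boosting could look like, assuming
the boost of each term is simply its raw frequency in the query text.
The class and method names (QueryTermBoosts, termFrequencies,
toBoostedQuery) are hypothetical and not part of Lucene; the only
Lucene-specific assumption is the classic query parser's term^boost
syntax. Only the query-side term frequency is computed here; any IDF
weighting is left to whatever Similarity the index searcher uses.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper for illustration only; not a Lucene class.
final class QueryTermBoosts {

    // Counts how often each whitespace-separated term occurs in the query text,
    // keeping the terms in order of first appearance.
    static Map<String, Integer> termFrequencies(String queryText) {
        Map<String, Integer> tf = new LinkedHashMap<>();
        for (String term : queryText.trim().split("\\s+")) {
            if (!term.isEmpty()) {
                tf.merge(term, 1, Integer::sum);
            }
        }
        return tf;
    }

    // Emits each distinct term once as term^boost, where the boost is the
    // term's frequency in the query. Terms are assumed to contain no
    // query-syntax characters, so no escaping is done here.
    static String toBoostedQuery(String queryText) {
        StringJoiner joined = new StringJoiner(" ");
        for (Map.Entry<String, Integer> e : termFrequencies(queryText).entrySet()) {
            joined.add(e.getKey() + "^" + e.getValue());
        }
        return joined.toString();
    }

    public static void main(String[] args) {
        // prints: a^3 b^1 c^1 d^1
        System.out.println(toBoostedQuery("a b c a d a"));
    }
}
```

Collapsing the repeated "a" into a single boosted clause keeps the
number of query clauses equal to the number of distinct terms, so the
extra weighting should add little cost at search time.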

Thanks,
Diaa