Posted to java-user@lucene.apache.org by Diaa Abdallah <di...@gmail.com> on 2014/05/18 15:34:54 UTC
How to use query terms' TF-IDF as a factor in document similarity calculation
Hi,
I'm trying to implement Explicit Semantic Analysis (ESA) with Lucene.
How do I take a term's TF-IDF weight in the query into account when
scoring matching documents?
For example:
Query: "a b c a d a"
Doc1: "a b a"
Doc2: "a b c"
The query should match Doc1 better than Doc2, since "a" occurs three times
in the query and twice in Doc1 but only once in Doc2.
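To make the expected ranking concrete, here is a minimal sketch in plain Java (no Lucene involved). It assumes whitespace tokenization and raw term counts as weights, and it leaves IDF out entirely; the names QueryTfSketch, termFreqs and score are made up for illustration only.

```java
import java.util.HashMap;
import java.util.Map;

class QueryTfSketch {
    // Count how often each whitespace-separated term occurs in the text.
    static Map<String, Integer> termFreqs(String text) {
        Map<String, Integer> tf = new HashMap<>();
        for (String term : text.trim().split("\\s+")) {
            if (!term.isEmpty()) {
                tf.merge(term, 1, Integer::sum);
            }
        }
        return tf;
    }

    // Dot product of query and document term frequencies.
    // Assumption: raw counts only; a real TF-IDF score would also
    // multiply each term by an IDF factor and usually normalize.
    static int score(String query, String doc) {
        Map<String, Integer> q = termFreqs(query);
        Map<String, Integer> d = termFreqs(doc);
        int sum = 0;
        for (Map.Entry<String, Integer> e : q.entrySet()) {
            sum += e.getValue() * d.getOrDefault(e.getKey(), 0);
        }
        return sum;
    }

    public static void main(String[] args) {
        String query = "a b c a d a";
        System.out.println("Doc1: " + score(query, "a b a")); // 3*2 + 1*1 = 7
        System.out.println("Doc2: " + score(query, "a b c")); // 3*1 + 1*1 + 1*1 = 5
    }
}
```

Under these assumptions Doc1 scores 7 and Doc2 scores 5, which is the ordering I'm after.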
I'd like this to work without impacting performance.
I'm doing this through query boosting. Is there a better way?
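For illustration, one possible form of such boosting is to collapse repeated query terms and attach the count as a boost using the classic query parser's term^boost syntax. This is only a sketch under assumptions: BoostedQuerySketch and boostedQuery are hypothetical names, the boost is the raw query term frequency (no IDF), and terms are not escaped for query parser special characters.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BoostedQuerySketch {
    // Turn "a b c a d a" into "a^3 b c d": each distinct term once,
    // with its query term frequency as the boost when it is above 1.
    static String boostedQuery(String rawQuery) {
        // LinkedHashMap keeps terms in order of first appearance.
        Map<String, Integer> tf = new LinkedHashMap<>();
        for (String term : rawQuery.trim().split("\\s+")) {
            if (!term.isEmpty()) {
                tf.merge(term, 1, Integer::sum);
            }
        }
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Integer> e : tf.entrySet()) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(e.getKey());
            if (e.getValue() > 1) {
                sb.append('^').append(e.getValue());
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(boostedQuery("a b c a d a")); // a^3 b c d
    }
}
```

The resulting string could then be handed to a query parser, so the per-term weighting happens once at query construction time rather than per document.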
Thanks,
Diaa