
Posted to general@lucene.apache.org by Paul Bedaride <pa...@xilopix.com> on 2015/07/08 13:47:40 UTC

Token type similarity

Hello,

I wonder how token types are taken into account in similarity scoring.

From my tests, it appears that Lucene scores the term text
and the term type separately.

For instance, with the documents (with term text/type)
d1: w1/t1 w2/t1 w3/t2
d2: w1/t2 w2/t1 w3/t1

and the search w1/t1, I get the same score for d1 and d2.
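
To make the example concrete, here is a toy sketch in plain Python. It is not Lucene code and does not reproduce Lucene's scoring formula; the function names separate_match and joint_match are made up for illustration. It only shows why counting text matches and type matches independently cannot tell d1 from d2, while requiring both on the same token can.

```python
# Toy model of the example above, NOT Lucene code.
# Each document is a list of (term text, term type) tokens.
docs = {
    "d1": [("w1", "t1"), ("w2", "t1"), ("w3", "t2")],
    "d2": [("w1", "t2"), ("w2", "t1"), ("w3", "t1")],
}


def separate_match(tokens, text, typ):
    """Count text hits and type hits independently of each other."""
    text_hits = sum(1 for t, _ in tokens if t == text)
    type_hits = sum(1 for _, ty in tokens if ty == typ)
    return text_hits + type_hits


def joint_match(tokens, text, typ):
    """Count only tokens whose text AND type both match."""
    return sum(1 for t, ty in tokens if t == text and ty == typ)


for name, tokens in docs.items():
    print(name,
          "separate:", separate_match(tokens, "w1", "t1"),
          "joint:", joint_match(tokens, "w1", "t1"))
```

With separate counting both documents get the same value, which matches what I observe; with joint counting only d1 matches, which is the behaviour I am looking for.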

Is there a way to boost the score of d1, given that one of its tokens
has both the right text and the right type?

Thanks

Paul Bédaride