Posted to java-user@lucene.apache.org by Damerian <da...@gmail.com> on 2012/02/23 15:10:19 UTC

Custom scoring

Hello,
  I am trying to implement my own Jaccard similarity for Lucene.
So far I have the following code:
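import org.apache.lucene.index.FieldInvertState;
import org.apache.lucene.search.DefaultSimilarity;

public class JaccardSimilarity extends DefaultSimilarity {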
     int numberOfDocumentTerms;
     //String field="contents"; // Should the Jaccard similarity be based only on the contents field?

     @Override
     public float idf(int docFreq, int numDocs) {
         return 1; // ignore inverse document frequency
     }

     @Override
     public float tf(int freq) {
         return 1; // ignore term frequency
     }

     public int getNumberOfDocumentTerms() {
         return numberOfDocumentTerms;
     }

     public void setNumberOfDocumentTerms(int numberOfDocumentTerms) {
         this.numberOfDocumentTerms = numberOfDocumentTerms;
     }

     @Override
     public float queryNorm(float sumOfSquaredWeights) {
         return 1.0f;
     }

     @Override
     public float computeNorm(String field, FieldInvertState state) {
         // Called at index time for each field; each call overwrites the stored value.
         setNumberOfDocumentTerms(state.getLength()); // number of terms in this field
         System.out.println("numberOfDocumentTerms from compute : " + numberOfDocumentTerms);
         return 1.0f;
     }

     @Override
     public float coord(int overlap, int maxOverlap) {
         System.out.println("numberOfDocumentTerms : " + getNumberOfDocumentTerms());
         // Cast to float: plain int division would truncate the result to an integer.
         return overlap / (float) (numberOfDocumentTerms + (maxOverlap - overlap));
     }
}

The problem is that the coord() method does not seem to be called (as far
as I can tell), neither during searching nor during indexing.
What am I doing wrong? I need the

    |overlap| - the number of query terms matched in the document
    |maxOverlap| - the total number of terms in the query
to implement my scoring.
Any help would be highly appreciated
Thank you in advance!
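
For reference, the value that the coord() override above is aiming for can be sketched in plain Java, without Lucene. This is only an illustration under assumptions: JaccardSketch, fromCounts and fromSets are hypothetical names, and the document is treated as a set of unique terms, which may not match what state.getLength() reports if terms repeat.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical helper for illustration only; not part of Lucene.
final class JaccardSketch {

    // Jaccard coefficient |A n B| / |A u B| computed from counts:
    // overlap    = number of query terms matched in the document
    // queryTerms = total number of terms in the query (maxOverlap)
    // docTerms   = number of terms in the document
    static float fromCounts(int overlap, int queryTerms, int docTerms) {
        int union = docTerms + queryTerms - overlap;
        // The float cast matters: int / int would truncate to an integer.
        return union == 0 ? 0f : overlap / (float) union;
    }

    // The same value computed directly from two term sets.
    static float fromSets(Set<String> query, Set<String> doc) {
        Set<String> intersection = new HashSet<String>(query);
        intersection.retainAll(doc);
        return fromCounts(intersection.size(), query.size(), doc.size());
    }

    public static void main(String[] args) {
        Set<String> query = new HashSet<String>(
                Arrays.asList("lucene", "custom", "scoring"));
        Set<String> doc = new HashSet<String>(
                Arrays.asList("custom", "scoring", "in", "lucene", "similarity"));
        // 3 shared terms out of 5 distinct terms overall
        System.out.println(fromSets(query, doc));
    }
}
```

Comparing fromCounts(2, 3, 4) with the plain integer expression 2 / (4 + 3 - 2) shows why the cast is needed: the integer version evaluates to 0.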


Re: Custom scoring

Posted by Ahmet Arslan <io...@yahoo.com>.
> The problem is that coord() method is not used (or at least
> so that i understand) neither in searching nor in indexing
> What do i do wrong? 

If you want to see coord() values, use a multi-word query (two or more query terms) and go to the last page of the result set.
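
To illustrate why the number of query terms matters, here is a small plain-Java sketch (no Lucene needed) that tabulates the coordination factor for every possible overlap, using the overlap / maxOverlap formula of DefaultSimilarity.coord(). CoordSketch and coordTable are hypothetical names introduced only for this example. With a single query term every hit matches one term out of one, so the factor cannot vary; with several terms, documents that match only some of them get a smaller factor and tend to end up toward the end of the results.

```java
// Hypothetical helper for illustration only; not part of Lucene.
final class CoordSketch {

    // Same formula as DefaultSimilarity.coord(overlap, maxOverlap).
    static float defaultCoord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    // Coordination factor for each overlap from 0 to maxOverlap inclusive.
    static float[] coordTable(int maxOverlap) {
        float[] table = new float[maxOverlap + 1];
        for (int overlap = 0; overlap <= maxOverlap; overlap++) {
            table[overlap] = defaultCoord(overlap, maxOverlap);
        }
        return table;
    }

    public static void main(String[] args) {
        // Compare a one-term query with a three-term query.
        for (int maxOverlap : new int[] {1, 3}) {
            float[] table = coordTable(maxOverlap);
            for (int overlap = 1; overlap <= maxOverlap; overlap++) {
                System.out.println("maxOverlap=" + maxOverlap
                        + " overlap=" + overlap
                        + " coord=" + table[overlap]);
            }
        }
    }
}
```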
