Posted to java-user@lucene.apache.org by Sethu_424 <se...@gmail.com> on 2010/05/26 16:15:31 UTC
Re: Getting DF & IDF
Hi,
I am not sure whether you are still looking for an answer to your question.
If so, please read on.
You can get the DF and IDF for each term in the query as shown below.
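import java.io.File;
import java.io.IOException;
import java.util.logging.Level;
import org.apache.lucene.index.FilterIndexReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.Term;
import org.apache.lucene.store.FSDirectory;

// Open the index read-only
IndexReader reader = IndexReader.open(FSDirectory.open(new File(indexDir)), true);
// Wrap the reader in a FilterIndexReader; it delegates numDocs() and
// docFreq() to the wrapped reader
FilterIndexReader filterIndexReader = new FilterIndexReader(reader);
// Number of documents in the index
int numDocs = filterIndexReader.numDocs();
// Iterate over each of the query words
for (String queryWord : queryWords) {
    Term term = new Term(searchField, queryWord.toLowerCase());
    // DF: number of documents that contain the term
    int docFreq = 0;
    try {
        docFreq = filterIndexReader.docFreq(term);
    } catch (IOException e) {
        logger.log(Level.SEVERE, null, e);
    }
    // Calculate IDF; it stays 0.0 when the term is not in the index
    double idf = 0.0;
    if (docFreq > 0) {
        idf = Math.log10((double) numDocs / docFreq);
    }
    System.out.println(queryWord + "\tDF -" + docFreq + "\tIDF -" + idf);
}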
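
To check the arithmetic without an index, the IDF step above can be pulled out into a small standalone sketch. The class and method names (IdfSketch, idf, smoothedIdf) are made up for illustration and are not part of Lucene. The smoothed variant, 1 + ln(numDocs / (docFreq + 1)), is how I recall Lucene's DefaultSimilarity computing IDF at the time; treat that as an assumption and check the Javadoc for your version.

```java
// Hypothetical helper class, not part of Lucene: it only reproduces the
// arithmetic from the snippet above so it can be run on plain numbers.
final class IdfSketch {

    // Plain IDF as used above: log10(numDocs / docFreq).
    // Returns 0.0 when the term occurs in no document.
    static double idf(int numDocs, int docFreq) {
        if (docFreq <= 0) {
            return 0.0;
        }
        return Math.log10((double) numDocs / docFreq);
    }

    // Smoothed variant: 1 + ln(numDocs / (docFreq + 1)).
    // Assumption: this mirrors Lucene's DefaultSimilarity.idf of that era.
    static double smoothedIdf(int numDocs, int docFreq) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    public static void main(String[] args) {
        int numDocs = 1000; // made-up index size
        for (int docFreq : new int[] {1, 10, 1000}) {
            System.out.println("DF - " + docFreq
                    + "\tIDF - " + idf(numDocs, docFreq)
                    + "\tsmoothed IDF - " + smoothedIdf(numDocs, docFreq));
        }
    }
}
```

With numDocs = 1000, a term found in 10 documents gets a plain IDF of 2.0, and a term found in every document gets 0.0, so rarer terms score higher.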
--
View this message in context: http://lucene.472066.n3.nabble.com/Getting-DF-IDF-tp547386p844962.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscribe@lucene.apache.org
For additional commands, e-mail: java-user-help@lucene.apache.org
Re: Getting DF & IDF
Posted by "yura.minsk" <yu...@gmail.com>.
Sethu_424 wrote
> int numDocs = filterIndexReader.numDocs();
> ...
> idf = Math.log10((double) numDocs / docFreq);

Wrong formula. numDocs should not be the count of documents in the index,
but the count of documents containing the search term.
We need something like IndexReader.docFreq(term);
--
View this message in context: http://lucene.472066.n3.nabble.com/Getting-DF-IDF-tp547386p3984938.html