Posted to dev@lucene.apache.org by "Robert Tupelo-Schneck (JIRA)" <ji...@apache.org> on 2016/02/18 22:52:18 UTC

[jira] [Commented] (SOLR-3818) TermVectorComponent tfidf is not tf/idf by anyone's definition

    [ https://issues.apache.org/jira/browse/SOLR-3818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15153187#comment-15153187 ] 

Robert Tupelo-Schneck commented on SOLR-3818:
---------------------------------------------

This is at least a documentation bug in https://cwiki.apache.org/confluence/display/solr/The+Term+Vector+Component and https://wiki.apache.org/solr/TermVectorComponent.

The source code makes it clear:

    // TODO: this is not TF/IDF by anyone's definition!

The documentation should make it just as clear, so people don't stumble into using it incorrectly.

> TermVectorComponent tfidf is not tf/idf by anyone's definition
> --------------------------------------------------------------
>
>                 Key: SOLR-3818
>                 URL: https://issues.apache.org/jira/browse/SOLR-3818
>             Project: Solr
>          Issue Type: Bug
>            Reporter: Robert Muir
>
> {quote}
> tv.tf_idf - Calculates tf*idf for each term. Requires the parameters tv.tf and tv.df to be "true". This can be expensive. (not shown in example output) 
> {quote}
> But the current computation is tf/docFreq.
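
To make the gap concrete, here is a minimal sketch comparing the two quantities. It is not the TermVectorComponent code: the class and method names are hypothetical, the sample numbers are made up, and tf * ln(numDocs / docFreq) is only one common textbook form of tf-idf (other variants exist). The first method simply restates the tf/docFreq computation described in the issue.

```java
public class TfIdfSketch {

    // The computation the issue describes: term frequency divided by document frequency.
    // (Hypothetical helper, not Solr's actual method.)
    static double tfOverDocFreq(int tf, int docFreq) {
        return (double) tf / docFreq;
    }

    // One common textbook definition, assumed here for comparison:
    // tf * ln(numDocs / docFreq). Note that it needs the total document count.
    static double textbookTfIdf(int tf, int docFreq, int numDocs) {
        return tf * Math.log((double) numDocs / docFreq);
    }

    public static void main(String[] args) {
        // Made-up numbers: a term occurring 3 times in a document,
        // present in 10 of 1000 documents.
        int tf = 3, docFreq = 10, numDocs = 1000;
        System.out.println("tf/docFreq      = " + tfOverDocFreq(tf, docFreq));
        System.out.println("tf*ln(N/docFreq) = " + textbookTfIdf(tf, docFreq, numDocs));
    }
}
```

With these inputs the first value is 0.3 and the second is about 13.8. More to the point, tf/docFreq does not depend on the collection size at all, so a client that wants a conventional tf-idf weight would have to compute it from tv.tf, tv.df, and a document count obtained separately.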


