Posted to dev@lucene.apache.org by "Robert Tupelo-Schneck (JIRA)" <ji...@apache.org> on 2016/02/18 22:52:18 UTC
[jira] [Commented] (SOLR-3818) TermVectorComponent tfidf is not tf/idf by anyone's definition
[ https://issues.apache.org/jira/browse/SOLR-3818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15153187#comment-15153187 ]
Robert Tupelo-Schneck commented on SOLR-3818:
---------------------------------------------
This is at least a documentation bug in https://cwiki.apache.org/confluence/display/solr/The+Term+Vector+Component and https://wiki.apache.org/solr/TermVectorComponent .
The source code makes it clear: // TODO: this is not TF/IDF by anyone's definition!
The documentation should make it just as clear, so people don't stumble into using it incorrectly.
> TermVectorComponent tfidf is not tf/idf by anyone's definition
> --------------------------------------------------------------
>
> Key: SOLR-3818
> URL: https://issues.apache.org/jira/browse/SOLR-3818
> Project: Solr
> Issue Type: Bug
> Reporter: Robert Muir
>
> {quote}
> tv.tf_idf - Calculates tf*idf for each term. Requires the parameters tv.tf and tv.df to be "true". This can be expensive. (not shown in example output)
> {quote}
> But the current computation is tf/docFreq
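
To make the gap concrete, here is a minimal sketch comparing the value described above (tf/docFreq) with one common textbook form of tf*idf. The function names, the `num_docs` parameter, and the choice of `tf * log(N / df)` are assumptions for illustration only; this is not the Solr implementation, and other tf-idf variants exist.

```python
import math


def tf_over_doc_freq(tf: int, doc_freq: int) -> float:
    """The computation described in the issue: term frequency / document frequency."""
    return tf / doc_freq


def textbook_tf_idf(tf: int, doc_freq: int, num_docs: int) -> float:
    """One common textbook variant (an assumption here): tf * log(N / df)."""
    return tf * math.log(num_docs / doc_freq)


if __name__ == "__main__":
    tf, doc_freq = 3, 10
    # tf/docFreq never looks at the collection size, so it is the same
    # for a 1,000-document index and a 1,000,000-document index.
    print("tf/docFreq:", tf_over_doc_freq(tf, doc_freq))
    for num_docs in (1_000, 1_000_000):
        print("tf*idf with N =", num_docs, ":", textbook_tf_idf(tf, doc_freq, num_docs))
```

In this sketch the first value stays fixed while the second grows with the size of the collection, which is one way to see why the two quantities should not be treated as interchangeable.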