Posted to issues@solr.apache.org by "Cassandra Targett (Jira)" <ji...@apache.org> on 2021/08/11 19:08:00 UTC

[jira] [Resolved] (SOLR-3818) TermVectorComponent tfidf is not tf/idf by anyone's definition

     [ https://issues.apache.org/jira/browse/SOLR-3818?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cassandra Targett resolved SOLR-3818.
-------------------------------------
    Resolution: Fixed

This was fixed at some point: the current description of the {{tv.tf_idf}} parameter now explains that it is not a true TF-IDF:

{noformat}
If `true`, calculates TF / DF (i.e.,: TF * IDF) for each term.
Please note that this is a _literal_ calculation of "Term Frequency multiplied by Inverse Document Frequency" and *not* a classical TF-IDF similarity measure.
{noformat}
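
To make the difference concrete, here is a rough sketch (not Solr's code) comparing the literal calculation described above with one common textbook TF-IDF variant. The function names and the {{num_docs}} value are hypothetical and only for illustration; the log-scaled formula is just one of several classical definitions and is not claimed to be what any Lucene similarity uses.

```python
import math


def literal_tf_over_df(tf: int, df: int) -> float:
    """TF multiplied by 1/DF, i.e. the literal calculation the docs describe."""
    return tf / df


def textbook_tf_idf(tf: int, df: int, num_docs: int) -> float:
    """One common textbook variant (an assumption here): tf * ln(N / df)."""
    return tf * math.log(num_docs / df)


# Hypothetical numbers: a term occurring 3 times in a document,
# present in 10 documents of a 1000-document index.
tf, df, num_docs = 3, 10, 1000

print(literal_tf_over_df(tf, df))         # 0.3
print(textbook_tf_idf(tf, df, num_docs))  # 3 * ln(100), roughly 13.8
```

The literal value ignores the size of the index entirely, which is why it should not be read as a classical TF-IDF score.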

> TermVectorComponent tfidf is not tf/idf by anyone's definition
> --------------------------------------------------------------
>
>                 Key: SOLR-3818
>                 URL: https://issues.apache.org/jira/browse/SOLR-3818
>             Project: Solr
>          Issue Type: Bug
>          Components: documentation
>            Reporter: Robert Muir
>            Priority: Major
>
> {quote}
> tv.tf_idf - Calculates tf*idf for each term. Requires the parameters tv.tf and tv.df to be "true". This can be expensive. (not shown in example output) 
> {quote}
> But the current computation is tf/docFreq



--
This message was sent by Atlassian Jira
(v8.3.4#803005)