Posted to issues@commons.apache.org by "Pascal Schumacher (JIRA)" <ji...@apache.org> on 2018/10/18 07:00:00 UTC
[jira] [Closed] (TEXT-130) JaroWinklerDistance: Wrong results due to precision of transpositions
[ https://issues.apache.org/jira/browse/TEXT-130?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Pascal Schumacher closed TEXT-130.
----------------------------------
Resolution: Fixed
Fix Version/s: 1.5
> JaroWinklerDistance: Wrong results due to precision of transpositions
> ---------------------------------------------------------------------
>
> Key: TEXT-130
> URL: https://issues.apache.org/jira/browse/TEXT-130
> Project: Commons Text
> Issue Type: Bug
> Affects Versions: 1.4
> Reporter: Jan Martin Keil
> Assignee: Rob Tompkins
> Priority: Major
> Fix For: 1.5
>
>
> The method {{JaroWinklerDistance#matches}} returns {{transpositions / 2}} as an integer. However, {{transpositions}} is not guaranteed to be even. For example, comparing "aaabcd" and "aaacdb" results in {{transpositions}} = 3, so the method should return 1.5, not 1. With the truncated value, the similarity is 0.9611111111111111 instead of the correct 0.9416666666666667.
> I recommend returning {{halfTranspositions}} instead of {{transpositions}} and doing the cast and division ({{(double) mtp[1] / 2}}) in {{JaroWinklerDistance#apply}}.
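
The arithmetic in the report can be reproduced with a small standalone sketch. This is not the Commons Text source: the class name JaroWinklerSketch, the method signatures, and the truncate flag are hypothetical, and the sketch assumes the usual Jaro-Winkler conventions (match window of max(len)/2 - 1, common prefix capped at 4, scaling factor 0.1, boost applied unconditionally). It only illustrates how integer division of an odd count changes the result.

```java
final class JaroWinklerSketch {

    /** Returns {matches, halfTranspositions, commonPrefixLength} (hypothetical layout). */
    static int[] matches(String a, String b) {
        int window = Math.max(Math.max(a.length(), b.length()) / 2 - 1, 0);
        boolean[] usedA = new boolean[a.length()];
        boolean[] usedB = new boolean[b.length()];
        int matches = 0;
        for (int i = 0; i < a.length(); i++) {
            int lo = Math.max(0, i - window);
            int hi = Math.min(b.length() - 1, i + window);
            for (int j = lo; j <= hi; j++) {
                if (!usedB[j] && a.charAt(i) == b.charAt(j)) {
                    usedA[i] = true;
                    usedB[j] = true;
                    matches++;
                    break;
                }
            }
        }
        // Walk the matched characters of both strings in order; every
        // position where they differ counts as half a transposition.
        int halfTranspositions = 0;
        int j = 0;
        for (int i = 0; i < a.length(); i++) {
            if (!usedA[i]) {
                continue;
            }
            while (!usedB[j]) {
                j++;
            }
            if (a.charAt(i) != b.charAt(j)) {
                halfTranspositions++;
            }
            j++;
        }
        int prefix = 0;
        int limit = Math.min(4, Math.min(a.length(), b.length()));
        while (prefix < limit && a.charAt(prefix) == b.charAt(prefix)) {
            prefix++;
        }
        return new int[] {matches, halfTranspositions, prefix};
    }

    /** truncate = true mimics the reported integer division; false divides as double. */
    static double similarity(String a, String b, boolean truncate) {
        int[] mtp = matches(a, b);
        double m = mtp[0];
        if (m == 0) {
            return 0.0;
        }
        double t = truncate ? (double) (mtp[1] / 2) : (double) mtp[1] / 2;
        double jaro = (m / a.length() + m / b.length() + (m - t) / m) / 3;
        return jaro + 0.1 * mtp[2] * (1.0 - jaro);
    }

    public static void main(String[] args) {
        System.out.println(similarity("aaabcd", "aaacdb", true));   // integer division
        System.out.println(similarity("aaabcd", "aaacdb", false));  // division as double
    }
}
```

For "aaabcd" and "aaacdb" the sketch counts 6 matches, 3 half transpositions, and a common prefix of 3. Up to floating-point rounding, the truncated variant should give the 0.9611... value and the double variant the 0.9416... value quoted in the report.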