Posted to issues@commons.apache.org by "Arturo Bernal (Jira)" <ji...@apache.org> on 2021/01/15 12:33:01 UTC
[jira] [Commented] (CODEC-249) Incorrect transform of CH digraph according basic rules
[ https://issues.apache.org/jira/browse/CODEC-249?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17266005#comment-17266005 ]
Arturo Bernal commented on CODEC-249:
-------------------------------------
Hi [~Kanaduchi]
IMO you're right. There is a problem, but I think it is not only in how CH is calculated.
Also, for some reason that I don't know, the result is limited to 4 characters.
assertEquals( "SKMT", this.getStringEncoder().metaphone("SCHEMATIC") ); should be --> SXMTK
assertEquals( "KRKT", this.getStringEncoder().metaphone("CHARACTER") ); should be --> XRKTR
assertEquals( "AKSK", this.getStringEncoder().metaphone("AXEAXE") ); should be --> AKSKS
> Incorrect transform of CH digraph according basic rules
> -------------------------------------------------------
>
> Key: CODEC-249
> URL: https://issues.apache.org/jira/browse/CODEC-249
> Project: Commons Codec
> Issue Type: Bug
> Reporter: Andrey
> Priority: Major
>
> I detected an incorrect transform of the CH digraph by the Metaphone algorithm.
> According to _Lawrence_ _Philips_, CH should be transformed to 'X':
> {code:java}
> 'C' transforms to 'X' if followed by 'IA' or 'H' (unless in latter case, it is part of '-SCH-', in which case it transforms to 'K'). 'C' transforms to 'S' if followed by 'I', 'E', or 'Y'. Otherwise, 'C' transforms to 'K'.
> {code}
> But in the Apache implementation I see
> {code:java}
> if (isNextChar(local, n, 'H')) { // detect CH
>     if (n == 0 &&
>         wdsz >= 3 &&
>         isVowel(local, 2)) { // CH consonant -> K consonant
>         code.append('K');
>     } else {
>         code.append('X'); // CHvowel -> X
>     }
> {code}
> So after the transformation I get 'K' instead of 'X'.
> *Example*: CHERI should be transformed to 'XR', but I get 'KR', which is wrong.
> This bug has major priority because of its big impact on the results of the Metaphone algorithm.
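
For reference, the 'C' rule quoted in the issue description can be sketched as a small standalone method. This is only an illustration of the quoted rule text, not the Commons Codec code; the class name MetaphoneCSketch and the method encodeC are hypothetical, and the input is assumed to be upper case already.

```java
// Hypothetical sketch: encodes a single 'C' according to the rule text
// quoted in the issue. It is not taken from Commons Codec.
final class MetaphoneCSketch {

    private MetaphoneCSketch() { }

    /**
     * Returns the code for the 'C' at index i of an upper-case word,
     * using only the rule quoted in the issue description.
     */
    static char encodeC(String word, int i) {
        if (word.charAt(i) != 'C') {
            throw new IllegalArgumentException("no 'C' at index " + i);
        }
        // 'C' followed by "IA" -> 'X'
        if (word.startsWith("IA", i + 1)) {
            return 'X';
        }
        // 'C' followed by 'H' -> 'X', unless it is part of "-SCH-" -> 'K'
        if (word.startsWith("H", i + 1)) {
            boolean afterS = i > 0 && word.charAt(i - 1) == 'S';
            return afterS ? 'K' : 'X';
        }
        // 'C' followed by 'I', 'E' or 'Y' -> 'S'
        if (i + 1 < word.length() && "IEY".indexOf(word.charAt(i + 1)) >= 0) {
            return 'S';
        }
        // otherwise -> 'K'
        return 'K';
    }

    public static void main(String[] args) {
        System.out.println(encodeC("CHERI", 0));     // X (leading CH, no preceding S)
        System.out.println(encodeC("SCHEMATIC", 1)); // K (part of -SCH-)
        System.out.println(encodeC("CELL", 0));      // S (followed by E)
    }
}
```

Under this reading of the rule, the leading CH in CHERI maps to 'X', which is the result the reporter expects.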
--
This message was sent by Atlassian Jira
(v8.3.4#803005)