Posted to dev@pdfbox.apache.org by "John Hewson (JIRA)" <ji...@apache.org> on 2014/06/17 22:55:07 UTC
[jira] [Commented] (PDFBOX-755) Wrong translation of capital letters with combining diacritics
[ https://issues.apache.org/jira/browse/PDFBOX-755?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14034361#comment-14034361 ]
John Hewson commented on PDFBOX-755:
------------------------------------
Confirmed, here's what I get with PDFBox 2.0 trunk:
{code}
S. KALABUSˇIC´ ANDM. R. S. KULENOVIC´
{code}
and with Adobe Acrobat:
{code}
S. KALABUˇSI´C AND M. R. S. KULENOVI´C
{code}
> Wrong translation of capital letters with combining diacritics
> --------------------------------------------------------------
>
> Key: PDFBOX-755
> URL: https://issues.apache.org/jira/browse/PDFBOX-755
> Project: PDFBox
> Issue Type: Bug
> Components: Text extraction
> Affects Versions: 1.2.0
> Environment: Mac OS X 10.6.4
> Reporter: Thomas Fischer
> Attachments: 139-p.1+3.pdf, 139-p.1+3.txt
>
>
> S. KALABUˇSI ´C ANDM. R. S. KULENOVI ´C
> vs.
> S. KALABUŠIĆ AND M. R. S. KULENOVIĆ
> 1. ´ before vs. ́ after the letter (\x20 \xB4 vs. \x301)
> 2. ˇ before vs. ̌ after the letter (\x2C7 vs. \x30C)
> 3. ANDM.: the space is missing
> Note:
> S. Kalabušić is translated correctly
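
For anyone who needs a workaround on the extracted text, here is a minimal post-processing sketch. It is not how PDFBox handles this and not a proposed patch. It assumes the spacing diacritic (U+00B4 or U+02C7) sits directly before the letter it belongs to, as in the Acrobat output above. The class name DiacriticFixer and the method fix are hypothetical. It moves the diacritic behind the letter as the combining form (U+0301 or U+030C) and then composes with java.text.Normalizer.

```java
import java.text.Normalizer;
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper, not part of PDFBox.
final class DiacriticFixer {
    // Spacing diacritic mapped to its combining equivalent.
    private static final Map<Character, Character> SPACING_TO_COMBINING = new HashMap<>();
    static {
        SPACING_TO_COMBINING.put('\u00B4', '\u0301'); // acute accent, U+00B4 to U+0301
        SPACING_TO_COMBINING.put('\u02C7', '\u030C'); // caron, U+02C7 to U+030C
    }

    // Assumption: the diacritic immediately precedes its base letter.
    static String fix(String text) {
        StringBuilder out = new StringBuilder(text.length());
        int i = 0;
        while (i < text.length()) {
            char c = text.charAt(i);
            Character combining = SPACING_TO_COMBINING.get(c);
            boolean letterFollows = i + 1 < text.length()
                    && Character.isLetter(text.charAt(i + 1));
            if (combining != null && letterFollows) {
                // Emit base letter first, then the combining mark.
                out.append(text.charAt(i + 1)).append(combining);
                i += 2;
            } else {
                out.append(c);
                i++;
            }
        }
        // Compose base letter + combining mark into one code point where possible.
        return Normalizer.normalize(out, Normalizer.Form.NFC);
    }
}
```

This does not cover the trunk output, where the diacritic trails the letter, the 1.2.0 output, where a space separates the diacritic from its letter, or the missing space in "ANDM.".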