Posted to dev@pdfbox.apache.org by "Keiji Suzuki (JIRA)" <ji...@apache.org> on 2010/08/28 09:04:53 UTC

[jira] Updated: (PDFBOX-805) Extracted ASCII text in CJK document is malformed

     [ https://issues.apache.org/jira/browse/PDFBOX-805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Keiji Suzuki updated PDFBOX-805:
--------------------------------

    Attachment: cjk.pdf
                CMapParser.java.patch

The patch is for org/apache/fontbox/cmap/CMapParser.java in trunk. The sample PDF was generated with iText sample code
(http://www.1t3xt.info/examples/browse/?page=example&id=142).


> Extracted ASCII text in CJK document is malformed
> -------------------------------------------------
>
>                 Key: PDFBOX-805
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-805
>             Project: PDFBox
>          Issue Type: Bug
>          Components: FontBox
>    Affects Versions: 1.2.1
>            Reporter: Keiji Suzuki
>         Attachments: cjk.pdf, CMapParser.java.patch
>
>
> When I run ExtractText on a CJK PDF document that also contains ASCII text, only the ASCII text is malformed. This does not occur in version 1.1.0.
> I can fix it with the attached patch. I have also attached an example PDF.
