Posted to dev@pdfbox.apache.org by "Keiji Suzuki (JIRA)" <ji...@apache.org> on 2010/08/28 09:04:53 UTC
[jira] Updated: (PDFBOX-805) Extracted ASCII text in CJK document is malformed
[ https://issues.apache.org/jira/browse/PDFBOX-805?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Keiji Suzuki updated PDFBOX-805:
--------------------------------
Attachment: cjk.pdf
CMapParser.java.patch
The patch is for org/apache/fontbox/cmap/CMapParser.java in trunk. The sample PDF was generated with the iText sample code (http://www.1t3xt.info/examples/browse/?page=example&id=142).
> Extracted ASCII text in CJK document is malformed
> -------------------------------------------------
>
> Key: PDFBOX-805
> URL: https://issues.apache.org/jira/browse/PDFBOX-805
> Project: PDFBox
> Issue Type: Bug
> Components: FontBox
> Affects Versions: 1.2.1
> Reporter: Keiji Suzuki
> Attachments: cjk.pdf, CMapParser.java.patch
>
>
> When I run ExtractText on a CJK PDF document that also contains ASCII text, only the ASCII text comes out malformed. This does not occur in version 1.1.0.
> The attached patch fixes the problem. I have also attached an example PDF.
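One way to confirm the regression between 1.1.0 and 1.2.1 is to check the text that ExtractText produces for ASCII phrases known to be in the PDF. The following is only a minimal sketch, assuming the extracted text has already been read into a string. `AsciiRunCheck` and `missingAsciiRuns` are hypothetical names that are not part of PDFBox or FontBox, and the sample strings are made up rather than taken from cjk.pdf.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of PDFBox or FontBox.
class AsciiRunCheck {

    // Returns the expected ASCII phrases that do not appear verbatim
    // in the extracted text.
    static List<String> missingAsciiRuns(String extractedText, List<String> expectedPhrases) {
        List<String> missing = new ArrayList<String>();
        for (String phrase : expectedPhrases) {
            if (!extractedText.contains(phrase)) {
                missing.add(phrase);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Stand-in strings (assumptions); in practice, pass the text
        // written by ExtractText for each PDFBox version under test.
        String intact = "\u65E5\u672C\u8A9E Hello PDFBox";
        String mangled = "\u65E5\u672C\u8A9E H e l l o";
        List<String> expected = Arrays.asList("Hello", "PDFBox");

        System.out.println("intact output is missing:  " + missingAsciiRuns(intact, expected));
        System.out.println("mangled output is missing: " + missingAsciiRuns(mangled, expected));
    }
}
```

Running the same check against the output of both versions shows whether the ASCII runs survive extraction, without depending on how the CJK text itself is rendered.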