Posted to dev@pdfbox.apache.org by "Andreas Lehmkühler (JIRA)" <ji...@apache.org> on 2011/03/30 20:02:06 UTC

[jira] [Resolved] (PDFBOX-147) Can't read Japanese fonts

     [ https://issues.apache.org/jira/browse/PDFBOX-147?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Lehmkühler resolved PDFBOX-147.
---------------------------------------

       Resolution: Fixed
    Fix Version/s: 1.5.0
         Assignee: Andreas Lehmkühler

Text extraction works fine with PDFBox 1.5.0 (see the attached files), except for some garbage on the first and last pages. Those parts of the PDF use Type 3 fonts, whose text can't be extracted.

> Can't read Japanese fonts
> -------------------------
>
>                 Key: PDFBOX-147
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-147
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>            Assignee: Andreas Lehmkühler
>             Fix For: 1.5.0
>
>         Attachments: PDFBOX147-corporate_guide.pdf, PDFBOX147-corporate_guide.txt
>
>
> [imported from SourceForge]
> http://sourceforge.net/tracker/index.php?group_id=78314&atid=552832&aid=1464488
> Originally submitted by nobody on 2006-04-04 13:55.
> When I try to run ExtractText, the output is garbage.
> I'm running the latest PDFBox-0.7.3-dev-20060404. I
> run it with these parameters:
> -encoding utf-8 jp.pdf jp.txt
> SourceForge won't let me upload a file, so here is a
> URL:
> http://www.denso.co.jp/ja/aboutdenso/download/pdf/corporate_guide.pdf
> I've seen another discussion thread about this issue
> too. Can you explain how to enable PDFBox to work with
> Japanese fonts? (Are there properties that need to be
> set, or are additional resources needed?)
> Domo arigato ;)
> sunfishy (at) gmail

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira