
Posted to dev@pdfbox.apache.org by "John Hewson (JIRA)" <ji...@apache.org> on 2015/02/03 19:27:35 UTC

[jira] [Updated] (PDFBOX-2532) Text extraction fails due to the usage of the internal font mapping

     [ https://issues.apache.org/jira/browse/PDFBOX-2532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Hewson updated PDFBOX-2532:
--------------------------------
    Fix Version/s:     (was: 2.0.0)
                   2.1.0

> Text extraction fails due to the usage of the internal font mapping
> -------------------------------------------------------------------
>
>                 Key: PDFBOX-2532
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2532
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 2.0.0
>            Reporter: Andreas Lehmkühler
>             Fix For: 2.1.0
>
>         Attachments: PDFBOX2247-701542.pdf, PDFBOX2247-701542_cp_acrobat.txt, PDFBOX2247-701542_sa_acrobat.txt, PDFBOX2247-701542_sa_acrobat_osx.txt, PDFBOX2247-701542_sa_reader_osx.txt, PDFBOX2247-Debugger.png
>
>
> If a PDF doesn't provide any mapping (neither an encoding nor a ToUnicode mapping), we have to decide for ourselves where to get a suitable mapping. We can't use the internal font mapping of the Type1C font, as it doesn't work in every case; see PDFBOX-2377, which provides a solution for the 1.8 branch.



