Posted to dev@pdfbox.apache.org by "Joscha Feth (JIRA)" <ji...@apache.org> on 2011/06/24 08:16:47 UTC

[jira] [Updated] (PDFBOX-1048) Extracted PDF (text) partially garbled

     [ https://issues.apache.org/jira/browse/PDFBOX-1048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joscha Feth updated PDFBOX-1048:
--------------------------------

    Attachment: output.nfo
                PDFTest.java
                agb_de.pdf

agb_de.pdf is the PDF in question.
PDFTest.java prints the text extracted from the PDF.
output.nfo contains the garbled text extracted from the PDF.

> Extracted PDF (text) partially garbled
> --------------------------------------
>
>                 Key: PDFBOX-1048
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1048
>             Project: PDFBox
>          Issue Type: Bug
>         Environment: OSX 10.6
>            Reporter: Joscha Feth
>         Attachments: PDFTest.java, agb_de.pdf, output.nfo
>
>
> When using Tika 0.9 to extract text from the attached PDF (agb_de.pdf), parts of the extracted text are garbled.

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira