Posted to dev@pdfbox.apache.org by "John Hewson (JIRA)" <ji...@apache.org> on 2014/06/17 22:43:05 UTC

[jira] [Resolved] (PDFBOX-861) german umlaute are not recognized

     [ https://issues.apache.org/jira/browse/PDFBOX-861?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

John Hewson resolved PDFBOX-861.
--------------------------------

    Resolution: Not a Problem

Adobe Acrobat is not able to extract the umlauts from this PDF, nor are any of the other PDF readers I've tried. It looks like there's nothing we can do.

> german umlaute are not recognized
> ---------------------------------
>
>                 Key: PDFBOX-861
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-861
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 1.3.1
>         Environment: tika-0.8
>            Reporter: Reinhard Schwab
>         Attachments: stts-guide.pdf
>
>
> German umlauts are not recognized in this document:
> http://www.computing.dcu.ie/~irehbein/SS08/uebung1/stts-guide.pdf
> Guidelines f
> 
> ur das Tagging deutscher Textcorpora



--
This message was sent by Atlassian JIRA
(v6.2#6252)