Posted to dev@pdfbox.apache.org by "Andreas Lehmkühler (JIRA)" <ji...@apache.org> on 2010/03/15 07:47:27 UTC

[jira] Commented: (PDFBOX-624) Misplaced text

    [ https://issues.apache.org/jira/browse/PDFBOX-624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12845202#action_12845202 ] 

Andreas Lehmkühler commented on PDFBOX-624:
-------------------------------------------

Sounds curious. A 4-byte int is read and then cast to a short value. Whether you keep the lowest or the highest bytes, half of the value is always lost. Is there a specification available for that font type?
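
To illustrate the concern, here is a minimal standalone sketch of what an int-to-short narrowing does in Java. It is not the FontBox code or the attached patch; the class name, the method names, and the sample value 0x12345678 are made up for the example.

```java
public class IntToShortDemo {

    // A plain narrowing cast keeps only the low-order 16 bits of the int.
    static short lowHalf(int value) {
        return (short) value;
    }

    // Shifting right by 16 first keeps the high-order 16 bits instead.
    static short highHalf(int value) {
        return (short) (value >>> 16);
    }

    public static void main(String[] args) {
        // Hypothetical 4-byte value, standing in for one read from a font file.
        int raw = 0x12345678;
        System.out.printf("raw  = 0x%08X%n", raw);
        // Mask with 0xFFFF so a negative short is not sign-extended when printed.
        System.out.printf("low  = 0x%04X%n", lowHalf(raw) & 0xFFFF);
        System.out.printf("high = 0x%04X%n", highHalf(raw) & 0xFFFF);
    }
}
```

Either way only 16 of the 32 bits survive, so the result is only meaningful if the discarded half is known to carry no information (for example, if it is always zero or a pure sign extension).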

> Misplaced text
> --------------
>
>                 Key: PDFBOX-624
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-624
>             Project: PDFBox
>          Issue Type: Bug
>          Components: FontBox, Text extraction, Utilities
>    Affects Versions: 1.0.0
>            Reporter: Villu Ruusmann
>            Priority: Critical
>             Fix For: 1.1.0
>
>         Attachments: documenta_math-fixed.txt, documenta_math.pdf, documenta_math.txt, documenta_math_page4-fixed.png, documenta_math_page4.png, FontBox.patch
>
>
> Thomas Fischer reported to users@pdfbox.apache.org that org.apache.pdfbox.ExtractText interchanges typographic ligatures "fi" and "fl". The sample document "documenta_math.pdf" was created using TeX and AFPL Ghostscript 6.50.
> I used PDFBox 1.0.1-SNAPSHOT to verify this problem. The "fi" ligature behaves correctly (i.e. text extraction yields "finite" and "infinite", not "flnite" and "inflnite"), but the overall text layout is a complete mess. Please see the PDF text extraction result "documenta_math.txt" and the PDF rendering result "documenta_math_page4.png".
> The cause of the horizontal text misplacement is not yet known. This could affect all PDF documents which have been created using TeX.
