Posted to dev@pdfbox.apache.org by "Jukka Zitting (JIRA)" <ji...@apache.org> on 2010/06/21 17:40:23 UTC

[jira] Resolved: (PDFBOX-684) Incorrect ordering of compound Arabic glyphs

     [ https://issues.apache.org/jira/browse/PDFBOX-684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jukka Zitting resolved PDFBOX-684.
----------------------------------

         Assignee: Jukka Zitting
    Fix Version/s: 1.2.0
       Resolution: Fixed

Patch committed in revision 956624. Thanks!

> Incorrect ordering of compound Arabic glyphs
> --------------------------------------------
>
>                 Key: PDFBOX-684
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-684
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 1.0.0, 1.1.0
>            Reporter: Yigal Dayan
>            Assignee: Jukka Zitting
>            Priority: Minor
>             Fix For: 1.2.0
>
>         Attachments: PDFStreamEngine.patch, PDFTextStripper.patch, zzz.after_fix.txt, zzz.before_fix.txt, zzz.pdf
>
>   Original Estimate: 3h
>  Remaining Estimate: 3h
>
> Some Arabic PDFs contain compound glyphs for stylistic reasons.
> Each such glyph encodes two letters: FI, SI, LI, LJ, LM, etc.
> Before a line is sent to the bidirectional algorithm, all characters have already been sorted into visual order, except for these pairs: each pair is handled as a single unit, so its two letters keep their original (logical) order. The bidi algorithm therefore straightens out most characters but reverses the letters within each glyph pair.
> To fix this, the output of font.encode() should be examined and reversed on the spot if it contains a pair of Arabic characters. This may require adding a stub method to PDFStreamEngine (in method processEncodedText) that PDFTextStripper can override (in sort mode only).
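
The check described above could be sketched roughly as follows. This is a minimal illustration, not the committed patch: the class name CompoundGlyphFix and both method names are hypothetical, and the sketch assumes that "a pair of Arabic characters" means a decoded string of two or more code points that all fall in the Arabic Unicode blocks. It uses only standard JDK calls and does not touch PDFBox itself.

```java
// Hypothetical helper, not part of PDFBox: reverses the decoded text of a
// single glyph when that text consists of several Arabic code points.
final class CompoundGlyphFix {

    private CompoundGlyphFix() {
    }

    // Assumption: "Arabic" means the basic Arabic block or either
    // Arabic presentation-forms block.
    static boolean isArabic(int codePoint) {
        Character.UnicodeBlock block = Character.UnicodeBlock.of(codePoint);
        return block == Character.UnicodeBlock.ARABIC
                || block == Character.UnicodeBlock.ARABIC_PRESENTATION_FORMS_A
                || block == Character.UnicodeBlock.ARABIC_PRESENTATION_FORMS_B;
    }

    // Returns the input unchanged unless it holds at least two code points
    // and every one of them is Arabic; in that case the code points are
    // returned in reverse order.
    static String reverseIfArabicCompound(String decoded) {
        if (decoded == null || decoded.codePointCount(0, decoded.length()) < 2) {
            return decoded;
        }
        boolean allArabic = decoded.codePoints().allMatch(CompoundGlyphFix::isArabic);
        // StringBuilder.reverse() keeps surrogate pairs intact.
        return allArabic ? new StringBuilder(decoded).reverse().toString() : decoded;
    }
}
```

In a sort-mode text stripper, a helper like this would presumably be applied to the string decoded for each glyph before the line is handed to the bidi step, so that single letters and non-Arabic text pass through untouched.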

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.