Posted to dev@pdfbox.apache.org by "Sean Bridges (JIRA)" <ji...@apache.org> on 2009/05/16 01:28:45 UTC

[jira] Created: (PDFBOX-477) extra spaces added to rotated text

extra spaces added to rotated text
----------------------------------

                 Key: PDFBOX-477
                 URL: https://issues.apache.org/jira/browse/PDFBOX-477
             Project: PDFBox
          Issue Type: Bug
    Affects Versions: 0.8.0-incubator
            Reporter: Sean Bridges
             Fix For: 0.8.0-incubator
         Attachments: rotated.pdf

Rotated text is not properly extracted.  Extra line breaks are inserted.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (PDFBOX-477) extra spaces added to rotated text

Posted by "Brian Carrier (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PDFBOX-477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Brian Carrier resolved PDFBOX-477.
----------------------------------

    Resolution: Invalid

By default, PDFBox does not sort extracted text by its position on the page, which can produce garbled output (especially with rotated text). If you enable sorting, the output is correct.
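
To illustrate why sorting by coordinate matters, here is a minimal, self-contained sketch. It does not use PDFBox and is not PDFBox's implementation: the `Fragment` type, the `extract` and `sample` helpers, and the coordinates are all hypothetical, and the y axis is assumed to grow downward. In PDFBox itself the corresponding switch should be `PDFTextStripper.setSortByPosition(true)`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class SortByPositionSketch {

    // Hypothetical fragment type (not a PDFBox class): a run of text and where it is drawn.
    static final class Fragment {
        final String text;
        final double x; // horizontal position, growing to the right
        final double y; // vertical position, growing downward in this sketch

        Fragment(String text, double x, double y) {
            this.text = text;
            this.x = x;
            this.y = y;
        }
    }

    // Joins fragments into text, starting a new line whenever the y coordinate changes.
    // With sortByPosition the fragments are first ordered top to bottom, then left to right.
    static String extract(List<Fragment> fragments, boolean sortByPosition) {
        List<Fragment> ordered = new ArrayList<>(fragments);
        if (sortByPosition) {
            ordered.sort(Comparator
                    .comparingDouble((Fragment f) -> f.y)
                    .thenComparingDouble(f -> f.x));
        }
        StringBuilder out = new StringBuilder();
        Fragment previous = null;
        for (Fragment f : ordered) {
            // Exact comparison of y is a simplification; a real extractor would likely need a tolerance.
            if (previous != null && f.y != previous.y) {
                out.append('\n');
            }
            out.append(f.text);
            previous = f;
        }
        return out.toString();
    }

    // Two lines of text whose fragments arrive interleaved (assumed stream order).
    static List<Fragment> sample() {
        return Arrays.asList(
                new Fragment("Hello ", 0, 10),
                new Fragment("second ", 0, 20),
                new Fragment("world", 30, 10),
                new Fragment("line", 35, 20));
    }

    public static void main(String[] args) {
        System.out.println(extract(sample(), false)); // fragments in stream order
        System.out.println("---");
        System.out.println(extract(sample(), true));  // fragments sorted by position
    }
}
```

Without sorting, every fragment in this sample lands on its own line because consecutive fragments sit at different heights; with sorting, fragments that share a line are joined again. The sketch ignores rotation entirely, so it only shows the general effect of ordering by coordinate, not the handling of rotated.pdf.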



[jira] Updated: (PDFBOX-477) extra spaces added to rotated text

Posted by "Sean Bridges (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/PDFBOX-477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Bridges updated PDFBOX-477:
--------------------------------

    Attachment: rotated.pdf



The text extracted from this file is:

Rotated t
ext is br
ok
en i
nto s
everal pe
ices


It should be:

Rotated text is broken into several peices



