Posted to dev@pdfbox.apache.org by "Andreas Lehmkühler (JIRA)" <ji...@apache.org> on 2009/01/10 10:59:59 UTC

[jira] Commented: (PDFBOX-400) TextExtractor do not extract complete text

    [ https://issues.apache.org/jira/browse/PDFBOX-400?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12662646#action_12662646 ] 

Andreas Lehmkühler commented on PDFBOX-400:
-------------------------------------------

Have you tried the upcoming version 0.8?
Do you get an error message, or is only the mentioned part of the text missing?
Please attach a sample document to this issue if possible.

> TextExtractor do not extract complete text
> ------------------------------------------
>
>                 Key: PDFBOX-400
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-400
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 0.7.3
>         Environment: Win Xp Professional SP2
>            Reporter: prashant dhaka
>            Priority: Critical
>
> Hi,
> I need your help and advice.
>  
> When we extract text from a PDF, acrylic text is not extracted.
> Example: 
>  
> Text in PDF:
> "Th" of "The" is acrylic 
>  
> The remaining text  
>  
> Extracted Text: 
> e remaining text  
>  
> "Th" is missing.
>
> Note: Acrobat extracts the complete text.
>
> Please provide your suggestions to resolve this issue.
>
> Thanks,
> Prashant 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.