Posted to dev@pdfbox.apache.org by "Andreas Lehmkühler (Resolved JIRA)" <ji...@apache.org> on 2011/11/09 08:13:51 UTC

[jira] [Resolved] (PDFBOX-895) Infinite recursion when trying to extract text from specific types of PDFs

     [ https://issues.apache.org/jira/browse/PDFBOX-895?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Lehmkühler resolved PDFBOX-895.
---------------------------------------

       Resolution: Fixed
    Fix Version/s: 1.7.0

I found a suitable solution for PDFBOX-956 and now the performance is back.
Set to resolved.
                
> Infinite recursion when trying to extract text from specific types of PDFs
> --------------------------------------------------------------------------
>
>                 Key: PDFBOX-895
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-895
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Text extraction
>    Affects Versions: 1.3.1
>            Reporter: Panayiotis Vlissidis
>            Assignee: Andreas Lehmkühler
>            Priority: Critical
>             Fix For: 1.7.0
>
>         Attachments: test.pdf
>
>
> Hello and thanks for PDFBox.
> We just started using PDFBox for text extraction (through Tika),
> and it fails to finish text extraction: it falls into an infinite loop
> and never returns the text.
> Please note that this happens only for a specific type of PDF
> document (used for handwriting recognition), such as the one attached.
> I am not sure whether this is a bug in PDFBox or due to the nature of the PDFs,
> but I think that PDFBox should at least break out if extraction is not possible.
> I wish I could give you more information, but I know nothing about the PDF format, parsing, etc.
> Please let me know if you need any information or my help in any way.
> Thanks a lot for your time.
> Please let me know if you need any information or my help in any way.
> Thanks a lot for your time.
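
For callers hit by a hang like the one described above, one possible caller-side guard is to run the extraction under a timeout. The following is only a minimal sketch and not the fix that went into PDFBox: the class BoundedExtraction, the method extractWithTimeout and the Callable stand-ins are hypothetical names, and no PDFBox or Tika API is used. It assumes the extraction can be wrapped in a Callable<String>.

```java
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of PDFBox or Tika.
public class BoundedExtraction {

    // Runs the given extraction task and gives up after timeoutMillis.
    // Returns Optional.empty() on timeout (or if the task returns null)
    // instead of blocking the caller forever.
    public static Optional<String> extractWithTimeout(Callable<String> extraction, long timeoutMillis)
            throws ExecutionException, InterruptedException {
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r, "bounded-extraction");
            t.setDaemon(true); // a stuck worker must not keep the JVM alive
            return t;
        });
        Future<String> future = executor.submit(extraction);
        try {
            return Optional.ofNullable(future.get(timeoutMillis, TimeUnit.MILLISECONDS));
        } catch (TimeoutException e) {
            // cancel(true) only interrupts the worker; a loop that never
            // checks for interruption keeps running in the background.
            future.cancel(true);
            return Optional.empty();
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for an extraction that completes normally.
        System.out.println(extractWithTimeout(() -> "some text", 1000).isPresent());

        // Stand-in for an extraction that never finishes.
        Callable<String> stuck = () -> { while (true) { Thread.sleep(50); } };
        System.out.println(extractWithTimeout(stuck, 200).isPresent());
    }
}
```

Note that a timeout only releases the caller. If the worker never reacts to interruption, it keeps consuming CPU until the process exits, so a real deployment might prefer running the extraction in a separate process.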

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira