Posted to dev@pdfbox.apache.org by "Andreas Lehmkühler (JIRA)" <ji...@apache.org> on 2013/10/31 11:53:18 UTC

[jira] [Closed] (PDFBOX-100) NPE with corrupt pdf-document

     [ https://issues.apache.org/jira/browse/PDFBOX-100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Lehmkühler closed PDFBOX-100.
-------------------------------------

    Resolution: Cannot Reproduce
      Assignee: Andreas Lehmkühler

Cannot reproduce because no sample PDF was provided.

Setting to closed.

> NPE with corrupt pdf-document
> -----------------------------
>
>                 Key: PDFBOX-100
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-100
>             Project: PDFBox
>          Issue Type: New Feature
>          Components: Text extraction
>            Assignee: Andreas Lehmkühler
>
> [imported from SourceForge]
> http://sourceforge.net/tracker/index.php?group_id=78314&atid=552835&aid=1326521
> Originally submitted by pjwalstrom on 2005-10-14 01:35.
> I have a corrupted pdf-document, which displays the
> following error when opened in Acrobat: "There was an
> error opening this document. The file is damaged and
> could not be repaired".
>  
> I am using PDFBox 0.7.2 with Lucene, and when I try to
> index the document with the following command
> org.apache.lucene.document.Document doc =
> LucenePDFDocument.getDocument(fileToIndex);
> I get an NPE:
>  
> 2005-10-12 14:33:16,041 ERROR [STDERR]
> java.lang.NullPointerException 
> 2005-10-12 14:33:16,041 ERROR [STDERR] at
> org.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:194)
> 2005-10-12 14:33:16,041 ERROR [STDERR] at
> org.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:182)
> 2005-10-12 14:33:16,041 ERROR [STDERR] at
> org.pdfbox.pdmodel.PDDocumentCatalog.getAllPages(PDDocumentCatalog.java:162)
> 2005-10-12 14:33:16,042 ERROR [STDERR] at
> org.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:220)
> 2005-10-12 14:33:16,042 ERROR [STDERR] at
> org.pdfbox.searchengine.lucene.LucenePDFDocument.addContent(LucenePDFDocument.java:278)
> 2005-10-12 14:33:16,042 ERROR [STDERR] at
> org.pdfbox.searchengine.lucene.LucenePDFDocument.getDocument(LucenePDFDocument.java:187)
>  
> Shouldn't PDFBox throw something other than an NPE when
> dealing with corrupt documents?
> A CorruptPDFException would do.
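
As a caller-side illustration of the reporter's suggestion, the following is a minimal sketch that converts a NullPointerException raised during extraction into a descriptive checked exception. CorruptPDFException and SafeExtraction are hypothetical names used only for this example; neither is part of PDFBox, and the sketch does not change how PDFBox itself handles corrupt files.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical exception type, named after the reporter's suggestion.
// It is NOT a PDFBox class.
class CorruptPDFException extends IOException {
    CorruptPDFException(String message, Throwable cause) {
        super(message, cause);
    }
}

// Hypothetical helper that wraps any extraction step.
final class SafeExtraction {
    private SafeExtraction() {}

    // Runs the given step and translates a NullPointerException (as in the
    // stack trace above) into a CorruptPDFException that names the file.
    static <T> T run(String fileName, Callable<T> step) throws IOException {
        try {
            return step.call();
        } catch (NullPointerException e) {
            throw new CorruptPDFException(
                "Could not read " + fileName + ": document appears to be corrupt", e);
        } catch (IOException e) {
            // I/O failures are passed through unchanged.
            throw e;
        } catch (Exception e) {
            // Any other failure is wrapped so callers handle a single type.
            throw new IOException("Unexpected failure while reading " + fileName, e);
        }
    }
}
```

Under these assumptions, the indexing call from the report could be wrapped as SafeExtraction.run(fileToIndex.getName(), () -> LucenePDFDocument.getDocument(fileToIndex)), so that the indexer can log and skip the damaged file. Treating every NPE as a sign of corruption is a simplification; an NPE can also indicate a genuine bug.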



--
This message was sent by Atlassian JIRA
(v6.1#6144)