Posted to dev@pdfbox.apache.org by "Andreas Lehmkühler (JIRA)" <ji...@apache.org> on 2013/10/31 11:53:18 UTC
[jira] [Closed] (PDFBOX-100) NPE with corrupt pdf-document
[ https://issues.apache.org/jira/browse/PDFBOX-100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Andreas Lehmkühler closed PDFBOX-100.
-------------------------------------
Resolution: Cannot Reproduce
Assignee: Andreas Lehmkühler
Cannot reproduce, as no sample PDF was provided.
Setting to closed.
> NPE with corrupt pdf-document
> -----------------------------
>
> Key: PDFBOX-100
> URL: https://issues.apache.org/jira/browse/PDFBOX-100
> Project: PDFBox
> Issue Type: New Feature
> Components: Text extraction
> Assignee: Andreas Lehmkühler
>
> [imported from SourceForge]
> http://sourceforge.net/tracker/index.php?group_id=78314&atid=552835&aid=1326521
> Originally submitted by pjwalstrom on 2005-10-14 01:35.
> I have a corrupted PDF document; Acrobat displays the
> following error when opening it: "There was an
> error opening this document. The file is damaged and
> could not be repaired".
>
> I am using PDFBox 0.7.2 with Lucene, and when I try to
> index the document with the following command
> org.apache.lucene.document.Document doc =
> LucenePDFDocument.getDocument(fileToIndex);
> I get an NPE:
>
> 2005-10-12 14:33:16,041 ERROR [STDERR] java.lang.NullPointerException
> 2005-10-12 14:33:16,041 ERROR [STDERR] at org.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:194)
> 2005-10-12 14:33:16,041 ERROR [STDERR] at org.pdfbox.pdmodel.PDPageNode.getAllKids(PDPageNode.java:182)
> 2005-10-12 14:33:16,041 ERROR [STDERR] at org.pdfbox.pdmodel.PDDocumentCatalog.getAllPages(PDDocumentCatalog.java:162)
> 2005-10-12 14:33:16,042 ERROR [STDERR] at org.pdfbox.util.PDFTextStripper.writeText(PDFTextStripper.java:220)
> 2005-10-12 14:33:16,042 ERROR [STDERR] at org.pdfbox.searchengine.lucene.LucenePDFDocument.addContent(LucenePDFDocument.java:278)
> 2005-10-12 14:33:16,042 ERROR [STDERR] at org.pdfbox.searchengine.lucene.LucenePDFDocument.getDocument(LucenePDFDocument.java:187)
>
> Shouldn't PDFBox throw something other than an NPE when
> dealing with corrupt documents?
> A CorruptPDFException would do.
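
For illustration only, here is a minimal Java sketch of the behaviour the report asks for: a page-tree walk that raises a descriptive checked exception instead of a NullPointerException when a node's kids are missing. Everything in it is an assumption made for the example. CorruptPDFException is the name suggested in the report and is not a PDFBox class, and PageNode and PageTreeWalker are simplified stand-ins, not the real PDPageNode or PDDocumentCatalog code.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical exception type, named after the reporter's suggestion; not part of PDFBox.
class CorruptPDFException extends IOException {
    CorruptPDFException(String message) {
        super(message);
    }
}

// Hypothetical stand-in for a page-tree node. A null "kids" list models a
// damaged file in which an intermediate node has no kids array.
class PageNode {
    final List<PageNode> kids;
    final boolean isPage;

    PageNode(List<PageNode> kids, boolean isPage) {
        this.kids = kids;
        this.isPage = isPage;
    }
}

class PageTreeWalker {
    // Collects all leaf pages below root, or throws CorruptPDFException
    // if the tree is structurally incomplete.
    static List<PageNode> collectPages(PageNode root) throws CorruptPDFException {
        if (root == null) {
            throw new CorruptPDFException("page tree root is missing");
        }
        List<PageNode> pages = new ArrayList<>();
        collect(root, pages);
        return pages;
    }

    private static void collect(PageNode node, List<PageNode> pages) throws CorruptPDFException {
        if (node.isPage) {
            pages.add(node);
            return;
        }
        // Check explicitly instead of letting the loop below fail with an NPE.
        if (node.kids == null) {
            throw new CorruptPDFException("page tree node has no kids array");
        }
        for (PageNode kid : node.kids) {
            if (kid == null) {
                throw new CorruptPDFException("page tree contains a null kid");
            }
            collect(kid, pages);
        }
    }
}
```

Because the hypothetical exception extends IOException, a caller such as an indexing loop could catch it alongside other I/O failures and skip the damaged file rather than abort.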
--
This message was sent by Atlassian JIRA
(v6.1#6144)