Posted to dev@pdfbox.apache.org by "Andreas Lehmkühler (JIRA)" <ji...@apache.org> on 2013/10/28 18:58:38 UTC

[jira] [Updated] (PDFBOX-1716) PDDocument.getNumberOfPages() return 0 for certain PDF document

     [ https://issues.apache.org/jira/browse/PDFBOX-1716?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Lehmkühler updated PDFBOX-1716:
---------------------------------------

    Fix Version/s:     (was: 1.8.2)

> PDDocument.getNumberOfPages() return 0 for certain PDF document
> ---------------------------------------------------------------
>
>                 Key: PDFBOX-1716
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-1716
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 1.8.2
>            Reporter: Tom
>
> A sample document (https://issues.apache.org/jira/secure/attachment/12430914/FormI-9-English.pdf) is attached to https://issues.apache.org/jira/browse/PDFBOX-578. It looks like the NPE fix made in PDFBOX-578 is only a workaround.
> When I try to extract the text content from FormI-9-English.pdf, PDDocument.getNumberOfPages() returns 0, which makes extracting the text content impossible:
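> // PDFBox 1.8.x imports:
> // import org.apache.pdfbox.cos.COSDocument;
> // import org.apache.pdfbox.pdfparser.PDFParser;
> // import org.apache.pdfbox.pdmodel.PDDocument;
> // import org.apache.pdfbox.util.PDFTextStripper;
> InputStream in = <PDF InputStream>
> PDFParser parser = new PDFParser(in);
> parser.parse();
> COSDocument cosDoc = parser.getDocument();
> PDDocument pdDoc = new PDDocument(cosDoc);
> PDFTextStripper pdfStripper = new PDFTextStripper();
> String parsedText = null;
>
> // pdDoc.getNumberOfPages() returns 0 here, which is incorrect,
> // so the loop body is never executed
> for (int i = 1; i <= pdDoc.getNumberOfPages(); i++) {
>     // per-page text extraction would go here
> }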
> Note:
> 1. This problem was found in the latest PDFBox version, 1.8.2.
> 2. I didn't know which component to file this defect under, so please assign it to the correct component if needed. Thanks.



--
This message was sent by Atlassian JIRA
(v6.1#6144)