Posted to dev@pdfbox.apache.org by "Alistair Oldfield (Jira)" <ji...@apache.org> on 2021/09/07 19:57:00 UTC
[jira] [Created] (PDFBOX-5278) PDPage.getAnnotations() causes
subsequent calls to PDDocument.getPages() to fail
Alistair Oldfield created PDFBOX-5278:
-----------------------------------------
Summary: PDPage.getAnnotations() causes subsequent calls to PDDocument.getPages() to fail
Key: PDFBOX-5278
URL: https://issues.apache.org/jira/browse/PDFBOX-5278
Project: PDFBox
Issue Type: Bug
Components: Text extraction
Affects Versions: 2.0.24
Reporter: Alistair Oldfield
I have stumbled across a strange issue with a certain PDF where PDPage.getAnnotations() causes subsequent calls to PDDocument.getPages() to fail.
I am not at liberty to share the PDF publicly, but I am happy to send it privately if that helps.
The code to reproduce is pretty straightforward:
{code:java}
import java.io.File;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.pdmodel.PDPage;

public class AnnotationsTest {
    public static void main(String[] args) throws Exception {
        try (PDDocument doc = PDDocument.load(new File(args[0]))) {
            for (PDPage page : doc.getPages()) {
                // This call makes the second loop below fail; commenting it out lets the second loop run.
                page.getAnnotations();
            }
            System.out.println("We get here, no problem - not sure why we can't re-iterate again...");
            // Iterating doc.getPages() a second time throws IllegalStateException.
            for (PDPage page : doc.getPages()) {
                // do something
            }
        }
    }
}
{code}
The Exception:
Exception in thread "main" java.lang.IllegalStateException: Expected 'Page' but found COSName{Annot}
    at org.apache.pdfbox.pdmodel.PDPageTree.sanitizeType(PDPageTree.java:266)
    at org.apache.pdfbox.pdmodel.PDPageTree.access$400(PDPageTree.java:43)
    at org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.next(PDPageTree.java:224)
    at org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.next(PDPageTree.java:172)
    at com.onlinedoctranslator.test.AnnotationsTest.main(AnnotationsTest.java:28)
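The trace shows the failure coming from the type check in PDPageTree.sanitizeType, so a dictionary reached during the second iteration reports its type as Annot rather than Page. One possible explanation, which is only a guess and is not confirmed by anything in this report, is that an entry in the page's annotation array refers to a page dictionary, and that wrapping the entry as an annotation overwrites the shared dictionary's Type. The sketch below models that hypothesis with plain Java maps. It does not use PDFBox, and every name in it (SharedDictionarySketch, AnnotationWrapper, malformedPage, and so on) is hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: plain maps stand in for PDF dictionaries. None of these
// names are PDFBox classes, and the mechanism shown is an assumption.
class SharedDictionarySketch {

    // Hypothetical wrapper that stamps Type=Annot onto whatever dictionary it wraps.
    static final class AnnotationWrapper {
        final Map<String, Object> dict;

        AnnotationWrapper(Map<String, Object> dict) {
            this.dict = dict;
            dict.put("Type", "Annot"); // assumed side effect of wrapping
        }
    }

    // Wraps every dictionary found under "Annots", mutating each one as it goes.
    static List<AnnotationWrapper> getAnnotations(Map<String, Object> page) {
        List<AnnotationWrapper> result = new ArrayList<>();
        Object annots = page.get("Annots");
        if (annots instanceof List<?>) {
            for (Object item : (List<?>) annots) {
                if (item instanceof Map<?, ?>) {
                    @SuppressWarnings("unchecked")
                    Map<String, Object> dict = (Map<String, Object>) item;
                    result.add(new AnnotationWrapper(dict));
                }
            }
        }
        return result;
    }

    // Stand-in for the type check that a page iterator might perform.
    static void checkPageType(Map<String, Object> page) {
        Object type = page.get("Type");
        if (!"Page".equals(type)) {
            throw new IllegalStateException("Expected 'Page' but found " + type);
        }
    }

    // Builds a page whose "Annots" list points back at the page dictionary itself.
    static Map<String, Object> malformedPage() {
        Map<String, Object> page = new HashMap<>();
        page.put("Type", "Page");
        List<Object> annots = new ArrayList<>();
        annots.add(page); // malformed: the entry is the page, not an annotation
        page.put("Annots", annots);
        return page;
    }

    public static void main(String[] args) {
        Map<String, Object> page = malformedPage();
        checkPageType(page);  // first pass succeeds
        getAnnotations(page); // wrapping mutates the shared dictionary
        try {
            checkPageType(page);
            System.out.println("second pass ok");
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage()); // prints: Expected 'Page' but found Annot
        }
    }
}
```

If something along these lines is happening, inspecting the Type entry of each page dictionary before and after the getAnnotations() call on the real file should show it changing. That check has not been run here.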