Posted to dev@pdfbox.apache.org by "Tilman Hausherr (Jira)" <ji...@apache.org> on 2021/09/08 03:36:00 UTC
[jira] [Commented] (PDFBOX-5278) PDPage.getAnnotations() causes
subsequent calls to PDDocument.getPages() to fail
[ https://issues.apache.org/jira/browse/PDFBOX-5278?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17411656#comment-17411656 ]
Tilman Hausherr commented on PDFBOX-5278:
-----------------------------------------
You can send the file to me here: tilman at snafu dot de.
> PDPage.getAnnotations() causes subsequent calls to PDDocument.getPages() to fail
> --------------------------------------------------------------------------------
>
> Key: PDFBOX-5278
> URL: https://issues.apache.org/jira/browse/PDFBOX-5278
> Project: PDFBox
> Issue Type: Bug
> Components: Text extraction
> Affects Versions: 2.0.24
> Reporter: Alistair Oldfield
> Priority: Major
>
> I have stumbled across a strange issue with a certain PDF where PDPage.getAnnotations() causes subsequent calls to PDDocument.getPages() to fail.
>
> I am not at liberty to share the PDF publicly, but am happy to DM the PDF privately if it helps.
>
> The code to reproduce is pretty straightforward:
>
>
> {code:java}
> import java.io.File;
> import org.apache.pdfbox.pdmodel.PDDocument;
> import org.apache.pdfbox.pdmodel.PDPage;
>
> public class AnnotationsTest {
>
>     public static void main(String[] args) throws Exception {
>         try (PDDocument doc = PDDocument.load(new File(args[0]))) {
>             for (PDPage page : doc.getPages()) {
>                 // This call makes the next iteration over doc.getPages() fail;
>                 // commenting it out lets the second loop pass.
>                 page.getAnnotations();
>             }
>
>             System.out.println("We get here, no problem - not sure why we can't re-iterate again...");
>
>             // doc.getPages() fails here.
>             for (PDPage page : doc.getPages()) {
>                 // do something
>             }
>         }
>     }
> }
> {code}
> The Exception:
>
> Exception in thread "main" java.lang.IllegalStateException: Expected 'Page' but found COSName{Annot}
>     at org.apache.pdfbox.pdmodel.PDPageTree.sanitizeType(PDPageTree.java:266)
>     at org.apache.pdfbox.pdmodel.PDPageTree.access$400(PDPageTree.java:43)
>     at org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.next(PDPageTree.java:224)
>     at org.apache.pdfbox.pdmodel.PDPageTree$PageIterator.next(PDPageTree.java:172)
>     at AnnotationsTest.main(AnnotationsTest.java:28)
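The stack trace shows the exception coming out of the page iterator's next() call, so one way to narrow this down might be to find out which page index the second iteration fails on. Below is a minimal sketch of such a probe. IterationProbe and firstFailingIndex are hypothetical names, not part of PDFBox, and the helper uses only the standard library so it can be tried on any Iterable.

```java
import java.util.Iterator;

// Hypothetical diagnostic helper (an assumption, not a PDFBox class).
final class IterationProbe {

    private IterationProbe() {
    }

    /**
     * Walks the iterable and returns the zero-based index of the first
     * element whose retrieval throws IllegalStateException, or -1 if the
     * iteration completes without that exception.
     */
    static <T> int firstFailingIndex(Iterable<T> items) {
        Iterator<T> it = items.iterator();
        int index = 0;
        while (true) {
            try {
                if (!it.hasNext()) {
                    return -1; // reached the end without a failure
                }
                it.next();
            } catch (IllegalStateException e) {
                return index; // this element could not be produced
            }
            index++;
        }
    }
}
```

With PDFBox on the classpath, calling IterationProbe.firstFailingIndex(doc.getPages()) after the first loop should report the index of the page that triggers "Expected 'Page' but found ...", assuming the failure happens in next() as in the trace above.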
--
This message was sent by Atlassian Jira
(v8.3.4#803005)