Posted to dev@pdfbox.apache.org by "Dave Smith (JIRA)" <ji...@apache.org> on 2014/06/04 22:13:02 UTC

[jira] [Commented] (PDFBOX-2114) ObjStm is being processed too late

    [ https://issues.apache.org/jira/browse/PDFBOX-2114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14018096#comment-14018096 ] 

Dave Smith commented on PDFBOX-2114:
------------------------------------

Ahh, I see:

    if( !document.isEncrypted() )
    {
        document.dereferenceObjectStreams();
    }

I guess the question is: why do we care whether the document is encrypted?

> ObjStm is being processed too late
> ----------------------------------
>
>                 Key: PDFBOX-2114
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2114
>             Project: PDFBox
>          Issue Type: Bug
>            Reporter: Dave Smith
>
> I have a PDF that contains the following:
> 1 0 obj
> <</Type/Catalog/Pages 5 0 R/Metadata 8 0 R/AcroForm<</Fields[]>>>>
> and
> 22 0 obj
> <</Type /ObjStm /N 2/First 10/Length 175/Filter /FlateDecode>>stream
> Object 5 0, which holds the pages, is stored inside 22 0 obj. When 1 0 obj is parsed, a placeholder is created for 5 0 obj with its value set to null. When 22 0 obj is parsed it is not expanded, so 5 0 is always null.
> When I go to get all the pages,
> document.getDocumentCatalog().getAllPages() returns no pages, since
> (COSDictionary)root.getDictionaryObject( COSName.PAGES ) is null.
> Should the ObjStm not be processed immediately, so that the objects it contains are filled in?
> I have an example PDF, but it is confidential, so I can only send it to someone off-list.
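
The placeholder behaviour described in the report can be illustrated with a small toy model. This is only a sketch and does not use PDFBox: the names ObjStmSketch, Pool, IndirectObject, getOrCreate and expandObjectStream are hypothetical stand-ins for the parser's object pool. It shows that a reference to "5 0" stays null until the object stream that contains it is expanded.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model only: none of these types are PDFBox classes.
public class ObjStmSketch {

    /** Stand-in for an indirect object: a key plus a value that may not be known yet. */
    static final class IndirectObject {
        final String key;   // object number and generation, e.g. "5 0"
        Object value;       // stays null until the object body is actually parsed

        IndirectObject(String key) {
            this.key = key;
        }
    }

    /** Stand-in for the document-wide object pool. */
    static final class Pool {
        private final Map<String, IndirectObject> objects = new HashMap<>();

        /** Returns the existing entry, or creates a placeholder whose value is null. */
        IndirectObject getOrCreate(String key) {
            return objects.computeIfAbsent(key, IndirectObject::new);
        }

        /** Fills in every object carried by an (already decoded) object stream. */
        void expandObjectStream(Map<String, Object> contents) {
            for (Map.Entry<String, Object> entry : contents.entrySet()) {
                getOrCreate(entry.getKey()).value = entry.getValue();
            }
        }
    }

    public static void main(String[] args) {
        Pool pool = new Pool();

        // Parsing "1 0 obj" meets the reference "5 0 R" and creates a placeholder.
        IndirectObject pages = pool.getOrCreate("5 0");
        System.out.println("before expansion: " + pages.value);

        // "22 0 obj" is the ObjStm that carries object 5 0.
        Map<String, Object> objStm = new HashMap<>();
        objStm.put("5 0", "pages dictionary");

        // Only once the stream is expanded does the placeholder get its value.
        pool.expandObjectStream(objStm);
        System.out.println("after expansion: " + pages.value);
    }
}
```

In this model, skipping the expandObjectStream call leaves the placeholder null for good, which matches the symptom reported above.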


