Posted to dev@pdfbox.apache.org by "Andreas Lehmkühler (JIRA)" <ji...@apache.org> on 2014/10/13 22:38:34 UTC

[jira] [Updated] (PDFBOX-2015) Hybrid reference PDF still contains XRefStm info in the trailer dictionary after PDDocument#save

     [ https://issues.apache.org/jira/browse/PDFBOX-2015?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Lehmkühler updated PDFBOX-2015:
---------------------------------------
    Affects Version/s:     (was: 1.8.4)
                       2.0.0
                       1.8.7

> Hybrid reference PDF still contains XRefStm info in the trailer dictionary after PDDocument#save
> ------------------------------------------------------------------------------------------------
>
>                 Key: PDFBOX-2015
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-2015
>             Project: PDFBox
>          Issue Type: Bug
>          Components: PDModel
>    Affects Versions: 1.8.7, 2.0.0
>            Reporter: Tim Costermans
>             Fix For: 1.8.8, 2.0.0
>
>         Attachments: Word2010.pdf, XRefStm_not_updated.patch, modified_Word2010.pdf
>
>
> Word2010.pdf is the input PDF. I open the document with PDFBox and add a string to it, in this case ‘Hello world!’.
> Afterwards I save the PDF.
> If I look at the content of the pdf before and after I modified it (using Notepad++) I see this:
> Word2010.pdf:
> Line 647: <</Size 18/Root 1 0 R/Info 7 0 R/ID[<AE9AF29D5A22AE47B47C4DA29170BE64><AE9AF29D5A22AE47B47C4DA29170BE64>] /Prev 81972/XRefStm 81702>>
> modified_Word2010.pdf:
> Line 791: /XRefStm 81702
> XRefStm is not updated, although the original PDF had multiple revisions that were merged into a new PDF document.
> A third-party library we use depends on this XRefStm value and cannot open the PDF after it was modified. (See the previous message for the stack trace.)
> Any help would be much appreciated.
> Maruan:
> That’s a bug.
> Explanation: The original file uses what’s called a hybrid reference. That’s for compatibility with readers which do not support cross-reference streams. The file generated by PDFBox doesn’t use hybrid references any more but still contains the XRefStm entry in the trailer dictionary, so the stored offset no longer points at a cross-reference stream.
> See http://mail-archives.apache.org/mod_mbox/pdfbox-users/201403.mbox/%3C4425DF0D5759D64AA8845AA3EC444E1D014AE30AB3%40EXCHANGE03.unifiedpost.com%3E for more info.
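
To check whether a saved file still carries a stale entry of this kind, one option is to scan it for /XRefStm and test whether the stored offset still points at an indirect object header. The following is a minimal sketch that uses only the Java standard library, not PDFBox. The class name XRefStmCheck and the method staleOffsets are hypothetical, and the check is a heuristic: it scans the raw bytes instead of parsing the trailer, and it does not confirm that the target object is actually a cross-reference stream.

```java
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of PDFBox.
final class XRefStmCheck {

    // Matches "/XRefStm <offset>" as it appears in a trailer dictionary.
    private static final Pattern XREF_STM = Pattern.compile("/XRefStm\\s+(\\d{1,10})");

    // Matches an indirect object header such as "12 0 obj".
    private static final Pattern OBJ_HEADER = Pattern.compile("\\d+\\s+\\d+\\s+obj");

    /** Returns every /XRefStm offset that does not point at an object header. */
    static List<Long> staleOffsets(byte[] pdf) {
        // ISO-8859-1 maps each byte to one char, so string indices equal byte offsets.
        String text = new String(pdf, StandardCharsets.ISO_8859_1);
        List<Long> stale = new ArrayList<>();
        Matcher entry = XREF_STM.matcher(text);
        while (entry.find()) {
            long offset = Long.parseLong(entry.group(1));
            if (offset >= text.length()) {
                // Offset lies beyond the end of the file.
                stale.add(offset);
                continue;
            }
            Matcher header = OBJ_HEADER.matcher(text);
            header.region((int) offset, text.length());
            if (!header.lookingAt()) {
                // No "N G obj" header starts at the stored offset.
                stale.add(offset);
            }
        }
        return stale;
    }

    public static void main(String[] args) throws Exception {
        byte[] pdf = Files.readAllBytes(Paths.get(args[0]));
        System.out.println("Stale /XRefStm offsets: " + staleOffsets(pdf));
    }
}
```

Running it against the saved file (for example modified_Word2010.pdf from the attachments) lists any /XRefStm offsets that no longer lead to an object; an empty list means the heuristic found nothing suspicious.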


