Posted to users@pdfbox.apache.org by Marc Kaufman <ka...@cs.stanford.edu> on 2023/10/06 17:50:21 UTC

Looking for a Debugger that can show which incremental save an object belongs to

I find myself debugging PDF files where Acrobat claims "Document has 
been altered or corrupted since it was signed." I would dearly love to 
see which objects belong to the last xref (color code is OK). Has anyone 
added that feature to PDF Debugger, or know where I can find one? Just 
comparing revisions is not enough, since sometimes the "changed" object 
is identical to the same object in the previous revision.

Thanks for any pointers. - Marc


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org


Re: Looking for a Debugger that can show which incremental save an object belongs to

Posted by Marc Kaufman <ka...@cs.stanford.edu>.
It's a lot more subtle than that. I wrote the MDP code for Adobe Acrobat 
9 (June 2008). Certainly there are the obvious changes: something in a 
content stream changed, but then there are less obvious changes 
(appearance string changes for annotations that are locked). One which I 
just ran into is a MarkInfo dictionary in the latest incremental save 
that is identical to the MarkInfo dictionary in the previous save. The 
fact that it is in the latest incremental save triggered the change 
notice. Back when I had access to Acrobat source code I could debug and 
discover what was being changed. But that was many years ago and I don't 
know what modifications to the original code have been made. It's my own 
fault for not spitting out more detailed debug information, I guess, but 
that was 15 years ago.

MDP analysis is between a signed version and subsequent versions, 
pairwise. Not every change is invalidating.

On 10/6/2023 9:43 PM, John Lussmyer wrote:
> I doubt there is a way.
> It's most likely that the signing code makes an MD5 checksum (or 
> similar) of the file when it is signed.
> If the file is changed, checking the signing will re-calculate the 
> checksum and find that it is different.  There isn't any info on what 
> changed, just that SOMETHING changed.
>
> On 10/6/2023 8:50 PM, Tilman Hausherr wrote:
>> On 06.10.2023 19:50, Marc Kaufman wrote:
>>> I find myself debugging PDF files where Acrobat claims "Document has 
>>> been altered or corrupted since it was signed." I would dearly love 
>>> to see which objects belong to the last xref (color code is OK). Has 
>>> anyone added that feature to PDF Debugger, or know where I can find 
>>> one? Just comparing revisions is not enough, since sometimes the 
>>> "changed" object is identical to the same object in the previous 
>>> revision. 
>>
>> I don't know of any. I research such questions the hard way, with 
>> NOTEPAD++.



Re: Looking for a Debugger that can show which incremental save an object belongs to

Posted by Andreas Lehmkühler <an...@lehmi.de.INVALID>.
Am 07.10.23 um 06:43 schrieb John Lussmyer:
> I doubt there is a way.
> It's most likely that the signing code makes an MD5 checksum (or similar) 
> of the file when it is signed.
> If the file is changed, checking the signing will re-calculate the 
> checksum and find that it is different.  There isn't any info on what 
> changed, just that SOMETHING changed.
IMHO there are two possible cases of manipulation ...

First, someone changed the signed part of a PDF, so that its checksum no 
longer matches the checksum computed when the PDF was signed. In such 
cases it is hard to say which object was altered without doing a diff.

Second, someone adds some content to a signed PDF using an incremental 
save. In such cases the signed part itself is still intact w.r.t. the 
signature, but the new part is not covered by it unless the PDF is signed 
a second time. In such cases the objects in question are at the end of 
the PDF, simply appended to the original PDF.
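
To illustrate the second case, here is a minimal Python sketch (not part 
of PDFBox) that reports how many bytes follow the range covered by each 
signature. It assumes the signature dictionaries are stored uncompressed 
and unencrypted, so that their /ByteRange arrays are visible in the raw 
bytes. The function name signature_coverage is hypothetical.

```python
import re
from typing import List, Tuple

# /ByteRange [start1 length1 start2 length2]; writers often pad with spaces.
BYTE_RANGE = re.compile(
    rb"/ByteRange\s*\[\s*(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s*\]"
)


def signature_coverage(data: bytes) -> List[Tuple[int, int]]:
    """For each /ByteRange found in the raw bytes, return a tuple of
    (offset where the signed range ends, bytes appended after it)."""
    result = []
    for m in BYTE_RANGE.finditer(data):
        _, _, second_start, second_len = (int(g) for g in m.groups())
        signed_end = second_start + second_len
        result.append((signed_end, len(data) - signed_end))
    return result
```

Calling signature_coverage(open("signed.pdf", "rb").read()) on a file 
(the file name is a placeholder) gives one tuple per signature. A nonzero 
second value means something was appended after that signature was 
applied.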

I guess Marc's question is about the second one.

PDFBox doesn't store which xref section an entry originated from, so we 
are not able to mark objects added by an incremental update.

For now, Tilman's suggestion to use an editor of your choice to inspect 
the PDF is the way to go. As I said, the objects you are looking for 
are at the end of the PDF, right after the end of the original PDF.
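
The manual inspection can be partly automated. The following is a rough 
Python sketch (not part of PDFBox) that splits the raw file at each 
%%EOF marker and lists the "N G obj" headers found in each segment. It 
is a heuristic under stated assumptions: objects packed into object 
streams are not visible, and a binary stream that happens to contain 
"%%EOF" or something resembling an object header will confuse it. The 
function name objects_by_revision is hypothetical.

```python
import re
from typing import List, Tuple

# "12 0 obj" style headers; the lookbehind avoids starting mid-number.
OBJ_HEADER = re.compile(rb"(?<![0-9])(\d+)\s+(\d+)\s+obj\b")
EOF_MARKER = re.compile(rb"%%EOF")


def objects_by_revision(data: bytes) -> List[List[Tuple[int, int, int]]]:
    """Return one list per revision, oldest first. Each entry is
    (object number, generation, byte offset of the object header)."""
    revisions = []
    start = 0
    for eof in EOF_MARKER.finditer(data):
        end = eof.end()
        segment = data[start:end]
        revisions.append([
            (int(m.group(1)), int(m.group(2)), start + m.start())
            for m in OBJ_HEADER.finditer(segment)
        ])
        start = end  # the next revision begins right after this %%EOF
    return revisions
```

The last list returned holds the objects written by the latest 
incremental save, including any that are byte-for-byte identical to 
their predecessor in an earlier revision.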


Andreas


Re: Looking for a Debugger that can show which incremental save an object belongs to

Posted by John Lussmyer <Co...@CasaDelGato.Com>.
I doubt there is a way.
It's most likely that the signing code makes an MD5 checksum (or similar) 
of the file when it is signed.
If the file is changed, checking the signing will re-calculate the 
checksum and find that it is different.  There isn't any info on what 
changed, just that SOMETHING changed.

On 10/6/2023 8:50 PM, Tilman Hausherr wrote:
> On 06.10.2023 19:50, Marc Kaufman wrote:
>> I find myself debugging PDF files where Acrobat claims "Document has 
>> been altered or corrupted since it was signed." I would dearly love 
>> to see which objects belong to the last xref (color code is OK). Has 
>> anyone added that feature to PDF Debugger, or know where I can find 
>> one? Just comparing revisions is not enough, since sometimes the 
>> "changed" object is identical to the same object in the previous 
>> revision. 
>
> I don't know of any. I research such questions the hard way, with 
> NOTEPAD++.
>



Re: Looking for a Debugger that can show which incremental save an object belongs to

Posted by Tilman Hausherr <TH...@t-online.de>.
On 06.10.2023 19:50, Marc Kaufman wrote:
> I find myself debugging PDF files where Acrobat claims "Document has 
> been altered or corrupted since it was signed." I would dearly love to 
> see which objects belong to the last xref (color code is OK). Has 
> anyone added that feature to PDF Debugger, or know where I can find 
> one? Just comparing revisions is not enough, since sometimes the 
> "changed" object is identical to the same object in the previous 
> revision. 

I don't know of any. I research such questions the hard way, with NOTEPAD++.

Tilman

