Posted to dev@pdfbox.apache.org by "Petras (JIRA)" <ji...@apache.org> on 2016/04/19 10:46:25 UTC

[jira] [Comment Edited] (PDFBOX-3321) ASCII stream data size is increased when written

    [ https://issues.apache.org/jira/browse/PDFBOX-3321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15247371#comment-15247371 ] 

Petras edited comment on PDFBOX-3321 at 4/19/16 8:45 AM:
---------------------------------------------------------

An example of the resulting content stream dictionary (generated by _PDVisibleSigBuilder_), with the trailing CRLF included by _BaseParser#readUntilEndStream_ when parsing:
{code}
<< ...
/Length 26
/Type /XObject
/Subtype /Form
 >>
stream
q 1 0 0 1 0 0 cm /n0 Do Q

endstream
{code}

Though the */Length* entry indicates 26 bytes, the actual length of the stream data is 28 bytes:
{code}
71 20 31 20 30 20 30 20 31 20 30 20 30 20 63 6D 20 2F 6E 30 20 44 6F 20 51 0A 0D 0A
{code}
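
The byte count above can be checked mechanically. The following is only a minimal sketch: the class and method names (_StreamLengthCheck_, _parseHex_, _lengthWithoutTrailingEol_) are made up for illustration and are not part of PDFBox. It decodes the hex dump, counts the bytes, and drops one trailing EOL marker in the way the PDF specification describes for the EOL preceding *endstream*.

```java
public class StreamLengthCheck {

    /** Parses a space-separated hex dump such as "71 20 0A" into bytes. */
    public static byte[] parseHex(String dump) {
        String[] tokens = dump.trim().split("\\s+");
        byte[] out = new byte[tokens.length];
        for (int i = 0; i < tokens.length; i++) {
            out[i] = (byte) Integer.parseInt(tokens[i], 16);
        }
        return out;
    }

    /** Length after dropping one trailing EOL marker (CRLF, LF or CR), if present. */
    public static int lengthWithoutTrailingEol(byte[] data) {
        int n = data.length;
        if (n >= 2 && data[n - 2] == 0x0D && data[n - 1] == 0x0A) {
            return n - 2; // CRLF
        }
        if (n >= 1 && (data[n - 1] == 0x0A || data[n - 1] == 0x0D)) {
            return n - 1; // single LF or CR
        }
        return n;
    }

    public static void main(String[] args) {
        // The stream data exactly as dumped in the comment above.
        byte[] data = parseHex(
            "71 20 31 20 30 20 30 20 31 20 30 20 30 20 63 6D "
          + "20 2F 6E 30 20 44 6F 20 51 0A 0D 0A");
        System.out.println("bytes kept as stream data: " + data.length);
        System.out.println("without the trailing CRLF: " + lengthWithoutTrailingEol(data));
    }
}
```

For the dump above this reports 28 bytes in total and 26 once the trailing CRLF is excluded, which matches the declared */Length*.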


was (Author: abyss):
An example of the resulting content stream dictionary (generated by _PDVisibleSigBuilder_), with the trailing CRLF added by _BaseParser#readUntilEndStream_ when parsing:
{code}
<< ...
/Length 26
/Type /XObject
/Subtype /Form
 >>
stream
q 1 0 0 1 0 0 cm /n0 Do Q

endstream
{code}

Though the */Length* entry indicates 26 bytes, the actual length of stream data is 28 bytes:
{code}
71 20 31 20 30 20 30 20 31 20 30 20 30 20 63 6D 20 2F 6E 30 20 44 6F 20 51 0A 0D 0A
{code}

> ASCII stream data size is increased when written
> ------------------------------------------------
>
>                 Key: PDFBOX-3321
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-3321
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Parsing
>    Affects Versions: 1.8.11
>            Reporter: Petras
>            Priority: Critical
>              Labels: signature, streams
>
> This bug is quite complicated and was discovered when visual signatures were used along with parsing of the document with Preflight before signing. 
> I dug a bit to investigate the nature of this bug, as it does not appear regularly. It seems to manifest itself under the following conditions:
> # The document is parsed when opened (e.g. by Preflight) and an entry with a numeric value is detected, which is marked as direct by _BaseParser.parseCOSDictionary(BaseParser.java:381)_;
> # A stream with an ASCII filter is created or is already present in the document, with the same length as the number found in step 1 (e.g. when a visual signature is created by calling _SignatureOptions#setVisualSignature()_);
> # When writing, _COSWriter_ checks the stream length by its _direct_ property. If */Length* is present and is flagged as direct, it is not recalculated when written.
> As a result, when the document is written, the stream length changes: the written stream grows by 2 bytes, while the */Length* entry still indicates the original length. That violates the PDF requirements for the */Length* entry:
> bq. The number of bytes from the beginning of the line following the keyword *stream* to the last byte just before the keyword *endstream*. (There may be an additional EOL marker, preceding *endstream*, that is not included in the count and is not logically part of the stream data.)
> These bugs contribute to this effect:
> * PDFBOX-3320 & PDFBOX-2685, as the number used for the stream length is marked as direct;
> * _BaseParser.parseCOSStream(BaseParser.java:490)_ parses the ASCII stream using the _EndstreamOutputStream_ class, which always includes all characters up to the *endstream* keyword, though the CRLF preceding *endstream* is not part of the stream data;
> * _COSWriter_ checks the stream length by its _direct_ property, even though it could be set as indirect via _COSObject_. As it is flagged as direct due to the mutability of the cached COSNumber, the stream length is not recalculated.
> As _COSWriter_ always adds CRLF at the end of the stream, the final stream data is increased by 2 bytes.
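
The interaction described in the issue can be reproduced with a small toy model. This is only a sketch under stated assumptions: _DirectFlagSketch_, _CachedNumber_, _get_ and _lengthToWrite_ are hypothetical names, not PDFBox classes or methods. The model assumes that number objects are cached and shared, that the _direct_ flag lives on the shared instance, and that the writer trusts a length flagged as direct instead of recalculating it.

```java
import java.util.HashMap;
import java.util.Map;

public class DirectFlagSketch {

    /** Hypothetical stand-in for a cached, mutable number object. */
    public static final class CachedNumber {
        public final long value;
        public boolean direct; // mutable flag, shared by every user of this instance

        CachedNumber(long value) {
            this.value = value;
        }
    }

    private static final Map<Long, CachedNumber> CACHE = new HashMap<>();

    /** Returns the single shared instance for the given value. */
    public static CachedNumber get(long value) {
        return CACHE.computeIfAbsent(value, v -> new CachedNumber(v));
    }

    /** Assumed writer rule: trust a direct length, otherwise recalculate it from the data. */
    public static long lengthToWrite(CachedNumber declared, byte[] data) {
        return declared.direct ? declared.value : data.length;
    }

    public static void main(String[] args) {
        // Step 1: the parser meets the number 26 in some dictionary and flags it as direct.
        get(26).direct = true;

        // Step 2: a stream whose length is also 26 receives the same cached instance.
        CachedNumber length = get(26);

        // Step 3: the stream data is now 28 bytes, but the writer skips recalculation.
        byte[] data = new byte[28];
        System.out.println("written /Length:  " + lengthToWrite(length, data));
        System.out.println("actual data size: " + data.length);
    }
}
```

In this model the written length stays at 26 while the data is 28 bytes long. A number that was never flagged as direct would be recalculated, which fits the observation that the bug only shows up when the stream length happens to match a number seen earlier.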


