You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@pdfbox.apache.org by "Andreas Lehmkühler (Jira)" <ji...@apache.org> on 2020/12/02 07:07:00 UTC

[jira] [Assigned] (PDFBOX-4999) Dangerous COSDictionary.addAll(COSDictionary) method

     [ https://issues.apache.org/jira/browse/PDFBOX-4999?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andreas Lehmkühler reassigned PDFBOX-4999:
------------------------------------------

    Assignee: Andreas Lehmkühler

> Dangerous COSDictionary.addAll(COSDictionary) method
> ----------------------------------------------------
>
>                 Key: PDFBOX-4999
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4999
>             Project: PDFBox
>          Issue Type: Bug
>    Affects Versions: 2.0.21, 3.0.0 PDFBox
>            Reporter: Michael Klink
>            Assignee: Andreas Lehmkühler
>            Priority: Critical
>             Fix For: 2.0.22, 3.0.0 PDFBox
>
>
> The method {{COSDictionary.addAll(COSDictionary)}} creates the impression, by name and by JavaDoc comment,
> {code:java}
> /**
>  * This will add all of the dictionaries keys/values to this dictionary.
> ...
> {code}
> that it can be used for exactly that, adding all key/value pairs from the argument dictionary to the current one, replacing old entries for the same keys.
>  If one looks at the implementation, though, one is in for a surprise:
> {code:java}
> /**
>  * This will add all of the dictionaries keys/values to this dictionary.
>  * Only called when adding keys to a trailer that already exists.
>  *
>  * @param dic The dictionaries to get the keys from.
>  */
> public void addAll(COSDictionary dic)
> {
>     dic.forEach((key, value) ->
>     {
>         /*
>          * If we're at a second trailer, we have a linearized pdf file, meaning that the first Size entry represents
>          * all of the objects so we don't need to grab the second.
>          */
>         if (!COSName.SIZE.equals(key) || !items.containsKey(COSName.SIZE))
>         {
>             setItem(key, value);
>         }
>     });
> }
> {code}
> Here existing *Size* entries explicitly are not replaced!
> This appears to be a relic from times when PDFBox parsed PDF documents front to back, ignoring cross reference streams, for improved results with linearized files when merging trailer dictionaries.
> Nowadays this exceptional treatment of *Size* does not make any sense anymore, see [this stack overflow answer|https://stackoverflow.com/a/64502740/1729265].
> Furthermore, this method is used in other contexts than creating trailer unions, even some PDFBox methods use it to create arbitrary dictionary unions:
> * {{org.apache.pdfbox.pdmodel.PDDocument.assignAcroFormDefaultResource(PDAcroForm, COSDictionary)}}
> * {{org.apache.pdfbox.filter.JPXFilter.decode(InputStream, OutputStream, COSDictionary, int, DecodeOptions)}}
> * {{org.apache.pdfbox.examples.interactive.form.FieldTriggers.main(String[])}}
> * {{org.apache.pdfbox.pdmodel.graphics.image.PDImageXObject.PDImageXObject(PDStream, PDResources)}}
> * {{org.apache.pdfbox.pdmodel.graphics.image.PDInlineImage.PDInlineImage(COSDictionary, byte[], PDResources)}}
> * {{org.apache.pdfbox.pdmodel.graphics.image.PDInlineImageTest.testInlineImage()}}
> * {{org.apache.pdfbox.pdfparser.XrefTrailerResolver.setStartxref(long)}}
> (This list is offered by eclipse as callers of that method. There may be other, hidden calls.)
> Thus, this exception should be removed after all usages of that method in PDFBox have been analyzed.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: dev-help@pdfbox.apache.org