Posted to dev@pdfbox.apache.org by "Jonathan (JIRA)" <ji...@apache.org> on 2019/05/09 12:22:00 UTC

[jira] [Comment Edited] (PDFBOX-4540) COSWriter sometimes retrieves wrong ObjectKey

    [ https://issues.apache.org/jira/browse/PDFBOX-4540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16836327#comment-16836327 ] 

Jonathan edited comment on PDFBOX-4540 at 5/9/19 12:21 PM:
-----------------------------------------------------------

Frankly, after revisiting the issue in more detail, I'm not sure the problem would ever arise during normal use of PDFBox. Ours is a rather special case: we use PDFBox's infrastructure to write linearized PDFs. The following method assigns an object number to every object contained in the PDF:
{code:java}
private void setObjectNumbers(final PDFObjectQueue queue, final LinearizedPDFWriter writer)
{
    for (final Map.Entry<COSBase, ObjectMetaData> entry : queue.entrySet())
    {
        // Every object is written with generation 0.
        final COSObjectKey key = new COSObjectKey(entry.getValue().objNumber, 0);
        writer.getObjectKeys().put(entry.getKey(), key);
        if (entry.getKey() instanceof COSObject)
        {
            // Register the wrapped object under the same key as its COSObject wrapper.
            writer.getObjectKeys().put(((COSObject) entry.getKey()).getObject(), key);
        }
    }
}
{code}
Later, when we write the PDF, `COSWriter.doWriteObject(COSBase)` calls `COSWriter.getObjectKey(COSBase)` to obtain the `currentObjectKey` for the element to be written. Without my fix, the object is assigned a new number even though one was already assigned to it previously.
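
To make the intended lookup order concrete, here is a minimal stand-alone sketch that does not depend on PDFBox. `Node` and `Ref` are hypothetical stand-ins for `COSBase` and `COSObject`, plain `long` values stand in for `COSObjectKey`, and `preassign` is an assumed simplification of our `setObjectNumbers`. It only mirrors the order proposed in the issue description (the object itself first, then the wrapped object, then a fresh number) and is not the actual `COSWriter` code:

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the proposed lookup order; not PDFBox code.
class ObjectKeySketch {
    /** Stand-in for COSBase; relies on identity equality. */
    static class Node { }

    /** Stand-in for COSObject: an indirect reference wrapping another node. */
    static class Ref extends Node {
        final Node target;
        Ref(Node target) { this.target = target; }
    }

    private final Map<Node, Long> objectKeys = new HashMap<>();
    private long number = 0;

    /** Pre-assigns a number to an object and, for a reference, to its target. */
    void preassign(Node obj, long objNumber) {
        objectKeys.put(obj, objNumber);
        if (obj instanceof Ref && ((Ref) obj).target != null) {
            objectKeys.put(((Ref) obj).target, objNumber);
        }
    }

    /** Looks up obj itself first, then its wrapped target, then allocates a number. */
    long getObjectKey(Node obj) {
        Node actual = (obj instanceof Ref) ? ((Ref) obj).target : obj;
        Long key = objectKeys.get(obj);
        if (key == null && actual != null) {
            key = objectKeys.get(actual);
        }
        if (key == null) {
            // Nothing registered yet: allocate the next number and remember it.
            key = ++number;
            objectKeys.put(obj, key);
            if (actual != null) {
                objectKeys.put(actual, key);
            }
        }
        return key;
    }

    public static void main(String[] args) {
        ObjectKeySketch writer = new ObjectKeySketch();
        Node dict = new Node();
        Ref ref = new Ref(dict);
        writer.preassign(ref, 7);
        System.out.println(writer.getObjectKey(ref));        // pre-assigned number
        System.out.println(writer.getObjectKey(dict));       // same number as its wrapper
        System.out.println(writer.getObjectKey(new Node())); // freshly allocated number
    }
}
```

In this model, an object that was registered in advance keeps its number whether it is looked up through the reference or directly, and only an object that was never registered receives a new number.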

I've uploaded a PDF that exhibits the issue when used as input.


> COSWriter sometimes retrieves wrong ObjectKey
> ---------------------------------------------
>
>                 Key: PDFBOX-4540
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4540
>             Project: PDFBox
>          Issue Type: Bug
>          Components: Writing
>    Affects Versions: 2.0.14
>            Reporter: Jonathan
>            Priority: Major
>              Labels: patch, pull-request-available
>         Attachments: sample.pdf
>
>
> If a COSBase is directly embedded in a COSObject, it should not be assigned a new object number by the writer. We suggest the following implementation for `COSWriter.getObjectKey(COSBase)`: 
> {code:java}
> /**
>  * This will get the object key for the object.
>  *
>  * @param obj The object to get the key for.
>  *
>  * @return The object key for the object.
> */
> protected COSObjectKey getObjectKey( COSBase obj )
> {
>     COSBase actual = obj;
>     if( actual instanceof COSObject )
>     {
>         actual = ((COSObject)obj).getObject();
>     }
>     COSObjectKey key = objectKeys.get(obj);
>     if( key == null && actual != null )
>     {
>         key = objectKeys.get(actual);
>     } 
>     if (key == null)
>     {
>         setNumber(getNumber()+1);
>         key = new COSObjectKey(getNumber(),0);
>         objectKeys.put(obj, key);
>         if( actual != null )
>         {
>             objectKeys.put(actual, key);
>         }
>     }
>     return key;
> }
> {code}


