Posted to dev@pdfbox.apache.org by GitBox <gi...@apache.org> on 2023/01/10 11:33:35 UTC

[GitHub] [pdfbox] bernhardf-ro opened a new pull request, #149: Allow negative struct parent keys in PDFMergerUtility

bernhardf-ro opened a new pull request, #149:
URL: https://github.com/apache/pdfbox/pull/149

   PDFMergerUtility expects structural parent tree keys to be non-negative.
   However, negative values don't seem to be forbidden by the specification and are accepted by validators.
   This patch adapts the class to handle negative struct parent keys correctly.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: dev-help@pdfbox.apache.org


[GitHub] [pdfbox] THausherr commented on pull request #149: Allow negative struct parent keys in PDFMergerUtility

Posted by GitBox <gi...@apache.org>.
THausherr commented on PR #149:
URL: https://github.com/apache/pdfbox/pull/149#issuecomment-1377581995

   Do you have a PDF where this happens?




[GitHub] [pdfbox] bernhardf-ro commented on a diff in pull request #149: Allow negative struct parent keys in PDFMergerUtility

Posted by GitBox <gi...@apache.org>.
bernhardf-ro commented on code in PR #149:
URL: https://github.com/apache/pdfbox/pull/149#discussion_r1082619424


##########
pdfbox/src/main/java/org/apache/pdfbox/multipdf/PDFMergerUtility.java:
##########
@@ -1498,8 +1499,8 @@ private void updateStructParentEntries(PDPage page, int structParentOffset) thro
         List<PDAnnotation> newannots = new ArrayList<>(annots.size());
         annots.forEach(annot ->
         {
-            int structParent = annot.getStructParent();
-            if (structParent >= 0)
+            int structParent = annot.getCOSObject().getInt(COSName.STRUCT_PARENT, Integer.MIN_VALUE); // allow for negative struct parent values

Review Comment:
   That would be an API change that would break every explicit check for -1 (i.e. `== -1` or `!= -1`) in all integrations.
   If a one-parameter method that defaults to `Integer.MIN_VALUE` is necessary (IMHO it is not), it would have to be a new one, e.g. `getIntSigned(COSName)` (and similar for array, possibly for other number types).
   The only change in this regard that I consider necessary is improving the API documentation of `getInt(COSName)` (and similar methods) to clarify that they cannot be used if the result may be negative.
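
   To illustrate the ambiguity discussed here, the following is a minimal sketch. `IntDictSketch` and its string keys are hypothetical stand-ins backed by a plain map, not the PDFBox classes; the only behaviour taken from this thread is that the one-parameter getter falls back to -1 while a two-parameter getter lets the caller choose the fallback.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a dictionary with integer entries; NOT the PDFBox API.
final class IntDictSketch
{
    private final Map<String, Integer> entries = new HashMap<>();

    public void setInt(String key, int value)
    {
        entries.put(key, value);
    }

    // One-parameter style: a missing key is reported as -1.
    public int getInt(String key)
    {
        return getInt(key, -1);
    }

    // Two-parameter style: the caller supplies the value returned for a missing key.
    public int getInt(String key, int defaultValue)
    {
        Integer value = entries.get(key);
        return value != null ? value : defaultValue;
    }

    public static void main(String[] args)
    {
        IntDictSketch missing = new IntDictSketch();
        IntDictSketch negative = new IntDictSketch();
        negative.setInt("StructParent", -1);

        // The one-parameter getter cannot tell "absent" from a stored -1.
        System.out.println(missing.getInt("StructParent") == negative.getInt("StructParent"));

        // With an explicit sentinel the two cases differ.
        System.out.println(missing.getInt("StructParent", Integer.MIN_VALUE)
                == negative.getInt("StructParent", Integer.MIN_VALUE));
    }
}
```

   In this sketch existing callers that compare against -1 keep working, because the one-parameter default is untouched; only code that needs to see negative values asks for a different sentinel.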





[GitHub] [pdfbox] bernhardf-ro commented on pull request #149: Allow negative struct parent keys in PDFMergerUtility

Posted by GitBox <gi...@apache.org>.
bernhardf-ro commented on PR #149:
URL: https://github.com/apache/pdfbox/pull/149#issuecomment-1378817161

   Here is a document to verify the issue. (Sorry, I forgot to add that to the issue in the first place.)
   [merge.pdf](https://github.com/apache/pdfbox/files/10393126/merge.pdf)
   It is a valid PDF/UA-1, according to PDF Accessibility Checker 2021.
   Appending it to itself using PDFMergerUtility results in a document that is not valid PDF/UA. ("Structural parent tree" issue)
   With the patch applied the merge result is valid PDF/UA-1.
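
   As a rough illustration of why a `>= 0` guard could matter when a document is appended to itself, here is a small sketch. Everything in it is an assumption made for illustration: the class, the method names, the sample keys and the offset are invented, and the actual renumbering logic in PDFMergerUtility may differ. It only contrasts a guard that skips negative keys with one that skips only an "absent" sentinel.

```java
// Hypothetical sketch; names and logic are illustrative, NOT PDFMergerUtility's code.
final class StructParentShiftSketch
{
    // Sentinel meaning "no StructParent entry", as used in the patched diff line.
    static final int ABSENT = Integer.MIN_VALUE;

    // Pre-patch style guard: only non-negative keys are renumbered.
    static int shiftNonNegativeOnly(int key, int offset)
    {
        return key >= 0 ? key + offset : key;
    }

    // Patched style guard: every key that is actually present is renumbered.
    static int shiftAnyPresent(int key, int offset)
    {
        return key != ABSENT ? key + offset : key;
    }

    public static void main(String[] args)
    {
        int[] keys = { -2, 0, 1 }; // made-up keys of the appended copy
        int offset = 10;           // made-up offset chosen by the merger

        for (int key : keys)
        {
            // With the first guard the negative key is left unchanged, so the
            // appended copy would reuse a key that the original already owns.
            System.out.println(key + " -> " + shiftNonNegativeOnly(key, offset)
                    + " vs " + shiftAnyPresent(key, offset));
        }
    }
}
```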




[GitHub] [pdfbox] lehmi commented on a diff in pull request #149: Allow negative struct parent keys in PDFMergerUtility

Posted by GitBox <gi...@apache.org>.
lehmi commented on code in PR #149:
URL: https://github.com/apache/pdfbox/pull/149#discussion_r1082157988


##########
pdfbox/src/main/java/org/apache/pdfbox/multipdf/PDFMergerUtility.java:
##########
@@ -1498,8 +1499,8 @@ private void updateStructParentEntries(PDPage page, int structParentOffset) thro
         List<PDAnnotation> newannots = new ArrayList<>(annots.size());
         annots.forEach(annot ->
         {
-            int structParent = annot.getStructParent();
-            if (structParent >= 0)
+            int structParent = annot.getCOSObject().getInt(COSName.STRUCT_PARENT, Integer.MIN_VALUE); // allow for negative struct parent values

Review Comment:
   So the question is, do we need to change the default value if the value isn't set?





[GitHub] [pdfbox] lehmi commented on a diff in pull request #149: Allow negative struct parent keys in PDFMergerUtility

Posted by GitBox <gi...@apache.org>.
lehmi commented on code in PR #149:
URL: https://github.com/apache/pdfbox/pull/149#discussion_r1068971320


##########
pdfbox/src/main/java/org/apache/pdfbox/multipdf/PDFMergerUtility.java:
##########
@@ -1498,8 +1499,8 @@ private void updateStructParentEntries(PDPage page, int structParentOffset) thro
         List<PDAnnotation> newannots = new ArrayList<>(annots.size());
         annots.forEach(annot ->
         {
-            int structParent = annot.getStructParent();
-            if (structParent >= 0)
+            int structParent = annot.getCOSObject().getInt(COSName.STRUCT_PARENT, Integer.MIN_VALUE); // allow for negative struct parent values

Review Comment:
   This changes nothing except the default value returned when STRUCT_PARENT is null. If Integer.MIN_VALUE makes more sense than -1, the getter should be changed rather than this piece of code.





[GitHub] [pdfbox] bernhardf-ro commented on a diff in pull request #149: Allow negative struct parent keys in PDFMergerUtility

Posted by GitBox <gi...@apache.org>.
bernhardf-ro commented on code in PR #149:
URL: https://github.com/apache/pdfbox/pull/149#discussion_r1069164981


##########
pdfbox/src/main/java/org/apache/pdfbox/multipdf/PDFMergerUtility.java:
##########
@@ -1498,8 +1499,8 @@ private void updateStructParentEntries(PDPage page, int structParentOffset) thro
         List<PDAnnotation> newannots = new ArrayList<>(annots.size());
         annots.forEach(annot ->
         {
-            int structParent = annot.getStructParent();
-            if (structParent >= 0)
+            int structParent = annot.getCOSObject().getInt(COSName.STRUCT_PARENT, Integer.MIN_VALUE); // allow for negative struct parent values

Review Comment:
   I intentionally kept the patch to one class and as simple and minimal as possible, e.g. not renaming variables. So feel free to adapt it and let me know if you have any questions.


