Posted to dev@pdfbox.apache.org by "Matthew Jung (Jira)" <ji...@apache.org> on 2021/06/07 22:44:00 UTC

[jira] [Updated] (PDFBOX-5198) When merging multiple pdf ua documents, Tags become nested

     [ https://issues.apache.org/jira/browse/PDFBOX-5198?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matthew Jung updated PDFBOX-5198:
---------------------------------
    Attachment: 1623105725988blob.jpg

Hi Hausherr
When I test using the Matterhorn Protocol document, it seems to work fine. I apologize for missing this, but when testing with ADA documents that contain Page Piece Dictionary tags, the tags were still nested. I am using version 024, but I will try to do some more testing and also see if I can send you some scrubbed files.
Matt


 
    On Sunday, June 6, 2021, 12:00:06 AM EDT, Tilman Hausherr (Jira) <ji...@apache.org> wrote:  
 
 
    [ https://issues.apache.org/jira/browse/PDFBOX-5198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17351460#comment-17351460 ] 

Tilman Hausherr edited comment on PDFBOX-5198 at 6/6/21, 3:59 AM:
------------------------------------------------------------------

I am not able to get the file from that URL. Can you email me the PDF?


was (Author: mcjung1):
Hi Hausherr
I am not able to get the file from that URL. Can you email me the PDF?

Matt
    On Tuesday, May 25, 2021, 12:12:03 PM EDT, Tilman Hausherr (Jira) <ji...@apache.org> wrote:  
 
 
    [ https://issues.apache.org/jira/browse/PDFBOX-5198?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17351170#comment-17351170 ] 

Tilman Hausherr commented on PDFBOX-5198:
-----------------------------------------

Here's a new attempt; now there are 4 top-level Documents. Please tell me if this looks better in Adobe.
http://www.filedropper.com/pdfua-merged-new
(I couldn't upload it, JIRA has an overzealous malware check)




--
This message was sent by Atlassian Jira
(v8.3.4#803005)


> When merging multiple pdf ua documents, Tags become nested
> ----------------------------------------------------------
>
>                 Key: PDFBOX-5198
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-5198
>             Project: PDFBox
>          Issue Type: Wish
>          Components: Utilities
>    Affects Versions: 2.0.21, 2.0.23
>            Reporter: Matthew Jung
>            Assignee: Tilman Hausherr
>            Priority: Major
>             Fix For: 2.0.24, 3.0.0 PDFBox
>
>         Attachments: 1622000586495blob.jpg, 1622120149457blob.jpg, 1622120149457blob.jpg, 1622123253165blob.jpg, 1622123790854blob.jpg, 1623105725988blob.jpg, Binder1.pdf, PDFA3A-merged-new.pdf, PDFUA-in-a-Nutshell-PDFUA_1.pdf, nested_tags_4documents_merged_using_pdfbox.tif, non_nested_tags_4documents_combined_using+adobe_pro.tif, screenshot-1.png
>
>
> When merging PDF/UA documents, the tags shown in Adobe Reader are nested. If 200 documents are merged, the tags are nested 200 deep. This does not appear to stop the JAWS screen reader from reading the document, but it may hurt performance when loading the document into a content repository.
> <DOCUMENT>
>           <DOCUMENT>
>                        <DOCUMENT>
> When using Adobe DC to merge multiple documents, the tags are flattened:
> <DOCUMENT>
>      <DOCUMENT>
>       <DOCUMENT>
>       <DOCUMENT>
>  
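
The two tag-tree shapes described in the issue can be compared with a small toy model. This is only a sketch in plain Java: the `Tag` class and the `nested`, `flat`, and `depth` helpers are hypothetical names made up for illustration and are not part of the PDFBox API. The single wrapper element in the flat shape is also an assumption based on the diagram in the description.

```java
import java.util.ArrayList;
import java.util.List;

public class TagTreeSketch {
    /** Hypothetical stand-in for a structure element; not a PDFBox class. */
    static final class Tag {
        final String type;
        final List<Tag> kids = new ArrayList<>();

        Tag(String type) {
            this.type = type;
        }
    }

    /** Chained shape: each merged document's tag hangs under the previous one. */
    static Tag nested(int documents) {
        Tag root = new Tag("Document");
        Tag current = root;
        for (int i = 1; i < documents; i++) {
            Tag child = new Tag("Document");
            current.kids.add(child);
            current = child;
        }
        return root;
    }

    /** Flat shape: one wrapper tag with every merged document as a direct child. */
    static Tag flat(int documents) {
        Tag root = new Tag("Document");
        for (int i = 0; i < documents; i++) {
            root.kids.add(new Tag("Document"));
        }
        return root;
    }

    /** Number of levels from this tag down to its deepest descendant. */
    static int depth(Tag tag) {
        int deepest = 0;
        for (Tag kid : tag.kids) {
            deepest = Math.max(deepest, depth(kid));
        }
        return 1 + deepest;
    }

    public static void main(String[] args) {
        System.out.println("nested depth: " + depth(nested(200)));
        System.out.println("flat depth:   " + depth(flat(200)));
    }
}
```

In this model, the depth of the chained shape grows with the number of merged documents (200 levels for 200 documents, as reported), while the flat shape stays at two levels no matter how many documents are merged.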


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: dev-help@pdfbox.apache.org