Posted to dev@pdfbox.apache.org by "DvdM (Jira)" <ji...@apache.org> on 2020/02/27 14:41:00 UTC

[jira] [Updated] (PDFBOX-4788) Flattening fields results in non-widget annotations being removed

     [ https://issues.apache.org/jira/browse/PDFBOX-4788?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

DvdM updated PDFBOX-4788:
-------------------------
    Description: 
I'm running into an issue when flattening form fields, using PDFBox version v2.0.19. When calling {{PDAcroForm.flatten()}}, all annotations on pages without form fields get removed.

 

I created a sample document to illustrate this issue. The document contains 2 pages:
 * page 1: a text field and a link annotation
 * page 2: only a link annotation

When you flatten this document, the link annotation on the 2nd page gets removed, while it shouldn't be.

PDF Documents and Java files to reproduce this are attached:
 * {{CreateDocument.java}} creates {{flatten.pdf}}
 * {{FlattenDocument.java}} flattens {{flatten.pdf}} and creates {{flattened.pdf}}

 
----
After debugging, I think I found the cause. In the {{PDAcroForm}} class, {{flatten(...)}} calls the {{buildPagesWidgetsMap(...)}} method, which iterates over the form fields and builds a map of pages and their widget annotations. Because the 2nd page doesn't contain widget annotations, it is not added to the map. Then {{flatten(...)}} iterates over the pages and looks up the widgets for each page in the resulting {{pagesWidgetsMap}}. Because the 2nd page has no entry in the map, {{widgetsForPageMap}} is {{null}} for that page.
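
The lookup behaviour described above can be modelled without PDFBox. The following is a minimal sketch, not the actual {{buildPagesWidgetsMap(...)}} implementation: the {{Widget}} class, the string ids and the integer page numbers are all illustrative assumptions. It only shows that a map built from the fields alone has no key for a page without widgets, so the lookup for that page returns {{null}}.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Plain-Java model: no PDFBox types are used, and all names here are illustrative.
class PagesWidgetsMapSketch
{
    // Hypothetical stand-in for a widget annotation: an id plus the page it is placed on.
    static final class Widget
    {
        final String id;
        final int page;

        Widget(String id, int page)
        {
            this.id = id;
            this.page = page;
        }
    }

    // Groups widget ids by page. A page only gets a key if at least one widget is on it.
    static Map<Integer, Set<String>> buildPagesWidgetsMap(List<Widget> widgets)
    {
        Map<Integer, Set<String>> pagesWidgetsMap = new HashMap<>();
        for (Widget widget : widgets)
        {
            pagesWidgetsMap.computeIfAbsent(widget.page, page -> new HashSet<>()).add(widget.id);
        }
        return pagesWidgetsMap;
    }

    public static void main(String[] args)
    {
        // The sample document: one text field on page 1, no fields on page 2.
        List<Widget> widgets = Arrays.asList(new Widget("textField", 1));
        Map<Integer, Set<String>> pagesWidgetsMap = buildPagesWidgetsMap(widgets);

        System.out.println(pagesWidgetsMap.get(1)); // [textField]
        System.out.println(pagesWidgetsMap.get(2)); // null
    }
}
```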

Next, for every annotation on this page, the following check is performed:
{code:java}
if (widgetsForPageMap != null && !widgetsForPageMap.contains(annotation.getCOSObject()))
{
    annotations.add(annotation);                 
}
{code}
Because {{widgetsForPageMap}} is {{null}}, the annotation is not added to the {{annotations}} list and is therefore not retained in the document. The first page does contain a field and is thus present in {{pagesWidgetsMap}}, so {{widgetsForPageMap}} is not {{null}}, the link annotation is added to the {{annotations}} list, and it is retained.

I think this is a regression from [https://svn.apache.org/r1828871] and could be solved by using:
{code:java}
if (widgetsForPageMap == null || !widgetsForPageMap.contains(annotation.getCOSObject()))
{
    annotations.add(annotation);                 
}
{code}
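
To compare the two conditions side by side, here is a small self-contained sketch. It is a plain-Java model under stated assumptions, not PDFBox code: strings stand in for the annotations' COS objects, and the method names {{retainOriginal}} and {{retainProposed}} are hypothetical. A {{null}} widget set represents a page that has no entry in {{pagesWidgetsMap}}.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java model of the annotation filter; strings stand in for COS objects.
class FlattenFilterSketch
{
    // Condition as quoted from 2.0.19: a null widget set rejects every annotation.
    static List<String> retainOriginal(List<String> pageAnnotations, Set<String> widgetsForPageMap)
    {
        List<String> annotations = new ArrayList<>();
        for (String annotation : pageAnnotations)
        {
            if (widgetsForPageMap != null && !widgetsForPageMap.contains(annotation))
            {
                annotations.add(annotation);
            }
        }
        return annotations;
    }

    // Proposed condition: a null widget set means the page has no widgets, so keep everything.
    static List<String> retainProposed(List<String> pageAnnotations, Set<String> widgetsForPageMap)
    {
        List<String> annotations = new ArrayList<>();
        for (String annotation : pageAnnotations)
        {
            if (widgetsForPageMap == null || !widgetsForPageMap.contains(annotation))
            {
                annotations.add(annotation);
            }
        }
        return annotations;
    }

    public static void main(String[] args)
    {
        // Page 1: a text field widget and a link. Page 2: only a link, hence no widget set.
        List<String> page1 = Arrays.asList("textField", "link1");
        List<String> page2 = Collections.singletonList("link2");
        Set<String> widgetsOnPage1 = new HashSet<>(Collections.singletonList("textField"));

        System.out.println(retainOriginal(page1, widgetsOnPage1)); // [link1]
        System.out.println(retainOriginal(page2, null));           // []
        System.out.println(retainProposed(page1, widgetsOnPage1)); // [link1]
        System.out.println(retainProposed(page2, null));           // [link2]
    }
}
```

In this model both conditions behave identically on a page that has widgets; they differ only when the widget set is {{null}}, which is the case for the 2nd page of the sample document.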
 

Please let me know if you have any questions!

  was:
I'm running into an issue when flattening form fields, using PDFBox version v2.0.19. When calling {{PDAcroForm.flatten()}}, all annotations on pages without form fields get removed.

 

I created a sample document to illustrate this issue. The document contains 2 pages:
 * page 1: a text field and a link annotation
 * page 2: only a link annotation

When you flatten this document, the link annotation on the 2nd page gets removed, while it shouldn't be.

PDF Documents and Java files to reproduce this are attached:
 * {{CreateDocument.java}} creates {{flatten.pdf}}
 * {{FlattenDocument.java}} flattens {{flatten.pdf}} and creates {{flattened.pdf}}

 
----
After debugging, I think I found the cause. In the {{PDAcroForm}} class, {{flatten(...)}} calls the {{buildPagesWidgetsMap(...)}} method, which iterates over the form fields and builds a map of pages and their widget annotations. Because the 2nd page doesn't contain widget annotations, it is not added to the map. Then {{flatten(...)}} iterates over the pages and looks up the widgets for each page in the resulting {{pagesWidgetsMap}}. Because the 2nd page has no entry in the map, {{widgetsForPageMap}} is {{null}} for that page.

Next, for every annotation on this page, the following check is performed:
{code:java}
if (widgetsForPageMap != null && !widgetsForPageMap.contains(annotation.getCOSObject()))
{
    annotations.add(annotation);                 
}
{code}
Because {{widgetsForPageMap}} is {{null}}, the annotation is not added to the {{annotations}} list and is therefore not retained in the document. The first page does contain a field and is thus present in {{pagesWidgetsMap}}, so {{widgetsForPageMap}} is not {{null}}, the link annotation is added to the {{annotations}} list, and it is retained.

I think this issue could be solved by using:
{code:java}
if (widgetsForPageMap == null || !widgetsForPageMap.contains(annotation.getCOSObject()))
{
    annotations.add(annotation);                 
}
{code}
 

Please let me know if you have any questions!


> Flattening fields results in non-widget annotations being removed
> -----------------------------------------------------------------
>
>                 Key: PDFBOX-4788
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4788
>             Project: PDFBox
>          Issue Type: Bug
>          Components: AcroForm
>    Affects Versions: 2.0.19
>            Reporter: DvdM
>            Priority: Major
>         Attachments: CreateDocument.java, FlattenDocument.java, flatten.pdf, flattened.pdf


