Posted to dev@pdfbox.apache.org by "Michael Klink (JIRA)" <ji...@apache.org> on 2019/03/29 09:20:00 UTC

[jira] [Comment Edited] (PDFBOX-4499) PDDocument.getSignatureFields() does not return all existing signature fields

    [ https://issues.apache.org/jira/browse/PDFBOX-4499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16804205#comment-16804205 ] 

Michael Klink edited comment on PDFBOX-4499 at 3/29/19 9:19 AM:
----------------------------------------------------------------

The cause is that the signature field is not referenced, directly or indirectly, from the *Fields* array of the *AcroForm* form definition; it is referenced only from the page's *Annots* array.

PDFBox relies on all fields being reachable from the *AcroForm* form definition *Fields* array, and that is justified because the specification defines the entry as follows:
|*Fields*|array|_(Required)_ An array of references to the document’s root fields (those with no ancestors in the field hierarchy).|

I.e. all of the document's root fields have to be in that array, and every non-root field has a chain of ancestors leading up to a root field, which in turn is in that array.

In your document, exactly one field is not referenced from the *Fields* array:
{noformat}
38 0 obj
<<
  /Rect [ 335.353 332.67 575.797 351.39 ] 
  /Border [ 0 0 0 ] 
  /T (EMPLOYEE SIGNATURE)
  /F 4 
  /Subtype /Widget 
  /TU (EMPLOYEE SIGNATURE) 
  /DA (/Helvetica 12 Tf 0 g)
  /AP 115 0 R 
  /Contents () 
  /AAPL:AKExtras 116 0 R 
  /FT /Sig 
  /Type /Annot 
  /Ff 0 
  /DR 117 0 R 
>> 
{noformat}
As you can see, it has no *Parent* entry and is therefore a root field. The *AcroForm* dictionary, on the other hand, looks like this:
{noformat}
2 0 obj
<<
  /Fields [ 33 0 R 34 0 R 35 0 R 41 0 R 42 0 R 36 0 R 39 0 R 40 0 R 22 0 R
    23 0 R 14 0 R 16 0 R 18 0 R 19 0 R 20 0 R 24 0 R 25 0 R 29 0 R 37 0 R 15 0 R
    17 0 R 21 0 R 26 0 R 27 0 R 28 0 R 209 0 R 30 0 R ]
  /NeedAppearances true
>>
endobj 
{noformat}
As you can see, it contains no reference to {{38 0 R}}.
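
To make the reasoning concrete, here is a minimal sketch of the lookup in plain Java. It does not use PDFBox classes and is not how PDFBox implements {{getSignatureFields()}}; it only models the rule that a field counts if it is reachable from *Fields* via *Kids*. The class and method names are hypothetical, and both the empty *Kids* map and the excerpt of the page's *Annots* array are assumptions made for illustration. Only the *Fields* numbers are taken from the dump.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Toy model of the field lookup; plain Java, no PDFBox classes involved. */
class OrphanFieldCheck {

    /**
     * Returns the object numbers reachable from the AcroForm Fields array,
     * descending through each field's Kids (a missing key means no Kids).
     */
    static Set<Integer> reachableFields(List<Integer> fields, Map<Integer, List<Integer>> kids) {
        Set<Integer> seen = new HashSet<>();
        Deque<Integer> todo = new ArrayDeque<>(fields);
        while (!todo.isEmpty()) {
            Integer obj = todo.pop();
            if (seen.add(obj)) { // guards against cycles in broken files
                todo.addAll(kids.getOrDefault(obj, List.of()));
            }
        }
        return seen;
    }

    /** Widgets listed in a page's Annots that the field tree never reaches. */
    static Set<Integer> orphans(List<Integer> fields, Map<Integer, List<Integer>> kids,
                                List<Integer> pageWidgets) {
        Set<Integer> result = new TreeSet<>(pageWidgets);
        result.removeAll(reachableFields(fields, kids));
        return result;
    }

    public static void main(String[] args) {
        // Fields array of object 2 0, as quoted in the dump.
        List<Integer> fields = List.of(33, 34, 35, 41, 42, 36, 39, 40, 22, 23, 14, 16, 18, 19,
                20, 24, 25, 29, 37, 15, 17, 21, 26, 27, 28, 209, 30);
        // Assumption: no field in this sketch has Kids.
        Map<Integer, List<Integer>> kids = Map.of();
        // Assumption: an illustrative excerpt of one page's Annots array.
        List<Integer> pageWidgets = List.of(36, 37, 38, 39);

        System.out.println(orphans(fields, kids, pageWidgets)); // prints [38]
    }
}
```

A processor that walks only the field tree never sees object 38, whereas one that also scans the page annotations does.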
----
{quote}The signature field contained in the document can be signed in Adobe Reader and in another closed-source commercial library.
{quote}
Some PDF processors look for fields not only in the *AcroForm* *Fields* array but also in the page *Annots* arrays. PDF viewers and libraries derived from such viewers in particular do so, especially when displaying the page on which the orphaned field appears as an annotation.

While this may be appropriate for a viewer (it would be odd if the viewer displayed the field as an annotation but did not allow filling it in), it is not necessarily a good idea for automated processing. After all, just as someone may have forgotten to add an orphaned field to the *AcroForm* *Fields*, someone may equally have forgotten to remove it from the page *Annots*. A human viewing the form can decide based on plausibility; an automatic form filler most likely cannot.



> PDDocument.getSignatureFields() does not return all existing signature fields
> -----------------------------------------------------------------------------
>
>                 Key: PDFBOX-4499
>                 URL: https://issues.apache.org/jira/browse/PDFBOX-4499
>             Project: PDFBox
>          Issue Type: Bug
>          Components: AcroForm, Signing
>    Affects Versions: 2.0.14
>            Reporter: Noé Beuret
>            Priority: Major
>         Attachments: Form_example.pdf
>
>
> Hello, 
> I'm having an issue with a PDF while trying to retrieve an existing signature field in the document.
> I'm using the method {{PDDocument.getSignatureFields()}}, which returns {{PDSignatureField}} objects for some PDFs but nothing for others that I'm sure contain signature fields.
> {noformat}
> PDDocument document = PDDocument.load(getDocumentData("Form_example.pdf"));
> assertNotEmpty(document.getSignatureFields());{noformat}
> One of those PDFs is attached to the ticket. The signature field contained in the document can be signed in Adobe Reader and in another closed-source commercial library. But in PDFBox, the signature field does not exist.
> Is this a bug or a known limitation of the library, or did I miss something?
>  
> Thanks


