Posted to users@pdfbox.apache.org by Dwayne Parks <d....@jplush.com> on 2024/02/02 22:28:25 UTC

Re: Modifying the order of AcroForm Fields and/or associated Widget Annotations...

Thank you very much for the detailed info!  I've been very busy with 
other work, but I will respond here once I have done some more testing 
and have uploaded a sample PDF and sample code on my personal Google 
Drive account.

It sounds like the proprietary web service should be using only the 
Annotation information to establish the starting order for what it calls 
"fields" in its configuration UI.  I'll do some testing to see if that 
is the case.  If so then I know at least the area to focus on.

Thank you for the info on the field tree.  I was aware of that but I now 
see how it is an even stronger reason for leaving the field tree 
(effectively the "data model" component of form fields) alone if at all 
possible and working exclusively with the Annotations that represent the 
"widgets" (that determine the "view or views" on the form fields).

And thank you for taking your time to explain how I went down the wrong 
path by focusing on the field tree and not the ("Widget") annotations!

Have a great weekend!!!

- Dwayne

On 1/31/2024 11:27 AM, sahyoun@fileaffairs.de wrote:
>
> Added note: as Tabs is defined at the page level, it clearly addresses
> the annotations and not the form fields, since a form field can have
> multiple representations on the same page and/or on multiple pages. The
> visual part of the form field is defined by the annotation. The form
> field itself, i.e. without the annotation(s), doesn't have a physical
> representation.
>
> On Wednesday, 31.01.2024 at 18:22 +0100, sahyoun@fileaffairs.de wrote:
>> Dear Dwayne,
>>
>> For a generic solution, reordering the fields won't help.
>>
>> A field can be nested inside a field tree, but let's say one of the
>> nested fields is at the top of the page and the other is at the bottom
>> of the page. Now another nested structure might have fields in between.
>> You cannot move fields out of a nested structure to match the physical
>> order, as this will have consequences for naming etc.
>>
>> E.g.
>>
>> Visual Order on the page
>>
>> 	[Policy.PolicyNumber]    [Date]
>> 	[Policy.PolicyName]
>>
>>
>> from that you'd need to prompt for Policy.PolicyNumber, Date,
>> Policy.PolicyName
>>
>>
>>
>> AcroForm Order
>>
>> 	Policy
>> 		PolicyNumber
>> 		PolicyName
>> 	Date
>>
>> You cannot move PolicyName below Date as it's nested inside Policy. If
>> you move PolicyName out, the structure would be
>>
>> 	Policy
>> 		PolicyNumber
>> 	Date
>> 	PolicyName
>>
>> But that changes a) the fully qualified name of the field, and b) as
>> children can inherit from parents, the moved field might lose
>> properties only defined in Policy.
>>
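To make the naming consequence concrete for myself, here is a minimal sketch in plain Java. The `Node` class is a hypothetical stand-in for a field-tree entry, not a PDFBox class; it only models the rule that a fully qualified name is the chain of partial names joined with periods.

```java
import java.util.ArrayList;
import java.util.List;

class FieldTreeSketch {

    /** Hypothetical stand-in for one node of the AcroForm field tree. */
    static final class Node {
        final String partialName;
        final List<Node> kids = new ArrayList<>();
        Node parent; // null for a top-level field

        Node(String partialName) {
            this.partialName = partialName;
        }

        /** Attaches a kid and returns this node so calls can be chained. */
        Node add(Node kid) {
            kid.parent = this;
            kids.add(kid);
            return this;
        }

        /** Partial names from the top-level ancestor down to this node, joined with '.'. */
        String fullyQualifiedName() {
            return parent == null
                    ? partialName
                    : parent.fullyQualifiedName() + "." + partialName;
        }
    }

    public static void main(String[] args) {
        Node policy = new Node("Policy");
        Node policyName = new Node("PolicyName");
        policy.add(new Node("PolicyNumber")).add(policyName);
        System.out.println(policyName.fullyQualifiedName()); // Policy.PolicyName

        // Moving PolicyName to the top level detaches it from Policy,
        // so its name changes and anything inherited from Policy is gone.
        policy.kids.remove(policyName);
        policyName.parent = null;
        System.out.println(policyName.fullyQualifiedName()); // PolicyName
    }
}
```

Anything that matches on "Policy.PolicyName" (such as the Type 1 field lookup) would stop matching after such a move, which is one more reason to leave the tree alone.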
>>
>> The UI application needs to follow the definition of the visual order
>> as specified in the PDF. This should depend on the location of the
>> (Widget) annotations on the page, as this defines the physical location
>> and is what you are looking for.
>>
>> There is also an (optional) Tabs key inside the Page dictionary which
>> can define the order the application should follow when tabbing through
>> the (visual appearance of the) fields.
>>
>> from the spec:
>>
>> "R (row order), C (column order), and S (structure
>> order). Beginning with PDF 2.0, additional values also include A
>> (annotations array order) and W (widget order). Annotations array
>> order refers to the order of the annotation enumerated in the Annots
>> entry of the Page dictionary (see "Table 31 — Entries in a page
>> object"). Widget order means using the same array ordering but
>> making two passes, the first only picking the widget annotations and
>> the second picking all other annotations."
>>
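As I read the quoted text, the W (widget order) value amounts to a stable two-pass walk over the Annots array. A minimal sketch in plain Java, where the `Annot` record is a hypothetical stand-in for an annotation dictionary (only its subtype matters here) and is not a PDFBox type:

```java
import java.util.ArrayList;
import java.util.List;

class WidgetOrderSketch {

    /** Hypothetical stand-in for one entry of a page's Annots array. */
    record Annot(String subtype, String label) {}

    /**
     * Same array order, two passes: widget annotations first,
     * then all other annotations.
     */
    static List<Annot> widgetOrder(List<Annot> annots) {
        List<Annot> result = new ArrayList<>();
        for (Annot a : annots) {
            if ("Widget".equals(a.subtype())) {
                result.add(a);
            }
        }
        for (Annot a : annots) {
            if (!"Widget".equals(a.subtype())) {
                result.add(a);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<Annot> annots = List.of(
                new Annot("Link", "help link"),
                new Annot("Widget", "Policy.PolicyNumber"),
                new Annot("Text", "sticky note"),
                new Annot("Widget", "Date"));
        for (Annot a : widgetOrder(annots)) {
            System.out.println(a.subtype() + " " + a.label());
        }
    }
}
```

With the sample data this lists the two widgets (Policy.PolicyNumber, then Date) before the link and the note. The relative order inside each pass is still whatever the Annots array says, so W alone would not give a top-left to bottom-right order.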
>>
>> Now, if the proprietary software doesn't follow these rules, what about
>> parsing the PDF and generating the "prompt" list instead of building it
>> manually? It can be generated by looking at the physical location of
>> the Widget annotations associated with a particular form field, so you'd
>> be able to generate the field list in the order the fields appear in
>> the PDF and feed that into your configuration for the form.
>>
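A rough sketch of how I understand this suggestion, in plain Java. The `Widget` record and the `rowTolerance` parameter are my own assumptions for illustration; with PDFBox the coordinates would presumably come from each widget's rectangle (`PDAnnotation.getRectangle()`). The sketch assumes default PDF user space (origin at the bottom-left, y growing upward) and ignores page rotation.

```java
import java.util.ArrayList;
import java.util.List;

class PromptOrderSketch {

    /** Hypothetical summary of one widget: field name, 0-based page, left and top edge. */
    record Widget(String fieldName, int page, double left, double top) {}

    /**
     * Orders widgets page by page, row by row from the top, left to right within a row.
     * A widget joins the current row if its top edge is at most rowTolerance
     * below the top edge of the first widget in that row.
     */
    static List<String> promptOrder(List<Widget> widgets, double rowTolerance) {
        List<Widget> sorted = new ArrayList<>(widgets);
        // Page first, then highest top edge first (larger y is nearer the top of the page).
        sorted.sort((a, b) -> {
            if (a.page() != b.page()) {
                return Integer.compare(a.page(), b.page());
            }
            return Double.compare(b.top(), a.top());
        });

        List<String> names = new ArrayList<>();
        List<Widget> row = new ArrayList<>();
        for (Widget w : sorted) {
            if (!row.isEmpty()) {
                Widget first = row.get(0);
                boolean sameRow = first.page() == w.page()
                        && first.top() - w.top() <= rowTolerance;
                if (!sameRow) {
                    flush(row, names);
                }
            }
            row.add(w);
        }
        flush(row, names);
        return names;
    }

    /** Emits the collected row from left to right and clears it. */
    private static void flush(List<Widget> row, List<String> names) {
        row.sort((a, b) -> Double.compare(a.left(), b.left()));
        for (Widget w : row) {
            names.add(w.fieldName());
        }
        row.clear();
    }

    public static void main(String[] args) {
        List<Widget> widgets = List.of(
                new Widget("Date", 0, 400, 700),
                new Widget("Policy.PolicyName", 0, 50, 660),
                new Widget("Policy.PolicyNumber", 0, 50, 702));
        System.out.println(promptOrder(widgets, 5.0));
        // [Policy.PolicyNumber, Date, Policy.PolicyName]
    }
}
```

Rows are grouped explicitly instead of folding the tolerance into a single comparator, because a "roughly equal y" comparison is not transitive and would make the sort result unreliable. A field with several widgets would show up once per widget, so the list may need de-duplicating before it is fed into the form configuration.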
>> BR
>> Maruan
>>
>>
>> On Tuesday, 30.01.2024 at 18:40 -0600, Dwayne Parks wrote:
>>> I am almost certain that the expected order is basically top-left to
>>> bottom-right, yes.  Currently there is no calculation being used that
>>> I know of.
>>>
>>> Flattening:  The issue isn't in the actual flattening itself.  I need
>>> to explain more about the way the PDFs are used.
>>>
>>> The proprietary software is running as a web service where we upload
>>> multiple "forms" in PDF form as a library.  At the simplest level,
>>> the fields on the form are one of two types.
>>>
>>> Field Type 1 is an internal field name that the software matches to
>>> internal data that it uses to set the field's value.  Say, if the
>>> field name is "Policy.PolicyNumber" then it sets the field's contents
>>> to its internal data for the Policy # data that it has... and that is
>>> what it uses when it flattens the PDF.
>>>
>>> Field Type 2 has a user-defined field name and the software (during
>>> the process of generating the output PDF, before flattening the
>>> fields) prompts the user for each user-defined field's contents that
>>> will be used during the flattening.
>>>
>>> There is a configuration page for each form that allows some control
>>> over the prompting of data from the user (validation constraints,
>>> descriptive names for prompts, etc.) and a basic way to reorder the
>>> order that the fields are processed (drag and drop a field up or down
>>> in the order, one field at a time), but if the form is edited in any
>>> way, this order "resets" to one based off of the PDF's contents.
>>>
>>> Some forms have hundreds of fields on them and so we are having
>>> semi-technical people trying to "build out" multiple forms and
>>> getting very frustrated when they need to make a small change in an
>>> edition of a form and suddenly the order is reset to an unexpected
>>> order (I believe the same order that fields/widgets appear in the
>>> PDFBox debugger's "internal structure" tree view under Root/Catalog
>>> -> AcroForm -> Fields) when they re-upload the PDF file.
>>>
>>> Why this order is important (for the Type 2 fields only) is that we
>>> want the user to be prompted for each user-defined field in order
>>> from top-left to bottom-right, row by row.  When the order is off,
>>> this is no longer possible.
>>>
>>> No errors are thrown as the proprietary software will happily prompt
>>> the user for the user-defined fields, but... it is adding hours to
>>> the form updating time and starting to drive our semi-technical
>>> people crazy.
>>>
>>> One other approach is to figure out how to force the order of the
>>> fields in Acrobat (which can be changed by dragging the fields
>>> up/down to position them in the list of field names) to be "honored"
>>> when it writes out the PDF contents to a file.  It doesn't appear to
>>> do so.  And it also sometimes creates Fields with Widgets as "Kids"
>>> and fields with the Widget data combined with the Field data when new
>>> fields are created via copy/paste...  all of this I had hoped to
>>> handle with a "cleanup" utility that would take the user-edited PDFs
>>> as a source and create cleaned up PDFs as separate output files.
>>>
>>> I hope that that makes more sense on the why.  Thanks for
>>> listening!!!
>>>
>>> - Dwayne
>>>
>>> On 1/30/2024 3:33 PM, sahyoun@fileaffairs.de wrote:
>>>>
>>>>
>>>> what is the expected order? Is it by location, top left to bottom
>>>> right? Calculation order ...
>>>>
>>>> Never heard that order matters for flattening. Is the proprietary
>>>> software throwing any errors which would be a hint?
>>>>
>>>>
>>>> BR
>>>> Maruan
>>>>
>>>> On Tuesday, 30.01.2024 at 15:27 -0600, Dwayne Parks wrote:
>>>>> Hello list!
>>>>>
>>>>> I'm dealing with a proprietary software product that accepts PDFs
>>>>> with fields in them to "flatten" into a final output PDF.  The
>>>>> difficulty is that it expects the fields (or their associated
>>>>> widgets) to be in a certain order.  I don't know the exact details
>>>>> of this, but it takes much trial and error for our folks here
>>>>> manually deleting and recreating fields, trying them and seeing if
>>>>> they are accepted.
>>>>>
>>>>> So, to greatly streamline the process of getting the field/widget
>>>>> content in the PDF files in a correct order, I would like to write
>>>>> a utility that takes a configuration file containing a list of
>>>>> Field Names and reorders the content in the PDF to match the order
>>>>> they are in the configuration file.
>>>>>
>>>>> My naive initial idea is to:
>>>>>
>>>>>    - Write a utility that outputs the current list of fields (in the
>>>>>      PDF in the order that they are there) into a config file
>>>>>    - Allow a user to reorder the lines of field names as desired
>>>>>    - Write a utility that takes the config file and the PDF and
>>>>>      rebuilds the field list/tree in the order that the config file
>>>>>      specifies... then writes out the updated PDF contents to a new
>>>>>      PDF file
>>>>>
>>>>> Alternately, I believe that there is an order for forms/widgets
>>>>> that is specified in Adobe Acrobat (tab order?) that I might be
>>>>> able to try to recreate.  I'm not sure if that will work, but it
>>>>> would allow non-technical users to define the needed order without
>>>>> intervention from technical staff.
>>>>>
>>>>> I realize that there might be issues with combined field/widget
>>>>> fields if it comes to needing to order the widgets instead, but I
>>>>> am wanting to start with the above and go from there.
>>>>>
>>>>> So, I have a few questions to start with that someone might be able
>>>>> to help me out with!
>>>>>
>>>>> - Are there any examples of doing this sort of order modification?
>>>>> - Is it possible to reorder field contents at the PDDocument /
>>>>>    PDAcroForm / PDField level?
>>>>> - Is it possible to reorder widget annotations at the PDAnnotation /
>>>>>    PDAnnotationWidget level?
>>>>> - Do I need to drop down to the COS* object level to do this?
>>>>>
>>>>> Thanks in advance for any pointers, info or suggestions!
>>>>>
>>>>> - Dwayne
>>>>>
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
>>>>> For additional commands, e-mail: users-help@pdfbox.apache.org
>>>>>