Posted to users@pdfbox.apache.org by Jacob MacWilliams <ja...@gmail.com> on 2022/09/12 17:16:06 UTC

Extracting field names from pdf

Hi,

The field names in my AcroForm are all fairly unintuitive (Check Box1,
Check Box2, etc.). For the simple case of check boxes whose labels fit
on a single line in the PDF, I am considering iterating through all of
the fields with the FieldIterator, taking the location of each checkbox,
and using PDFTextStripperByArea to extract the text in the region
immediately to its right. I just wanted to double-check that PDFBox does
not already provide functionality I am overlooking that would help with
this task, particularly anything that would let me generalize the simple
case described above.

Jacob MacWilliams
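
For illustration, the region described above could be computed roughly
as follows. This is only a sketch under stated assumptions: LabelRegion
and rightOf are hypothetical names, not part of PDFBox. The widget
rectangle (for example from PDAnnotationWidget.getRectangle()) is in PDF
user space with the origin at the bottom left, while
PDFTextStripperByArea regions are, as far as I know, measured from the
top left, so the y coordinate is flipped here. The sketch assumes an
unrotated page whose box starts at (0, 0).

```java
import java.awt.geom.Rectangle2D;

// Hypothetical helper, not part of PDFBox: builds the text-extraction
// region immediately to the right of a checkbox widget.
class LabelRegion {

    /**
     * @param llx        lower-left x of the widget (PDF user space)
     * @param lly        lower-left y of the widget (PDF user space)
     * @param urx        upper-right x of the widget (PDF user space)
     * @param ury        upper-right y of the widget (PDF user space)
     * @param pageHeight page height in the same units (assumption: no
     *                   rotation, page box origin at 0,0)
     * @param reach      how far to the right of the widget to look
     * @return a rectangle in top-left-origin coordinates, suitable for
     *         passing to PDFTextStripperByArea.addRegion(name, rect)
     */
    static Rectangle2D.Float rightOf(float llx, float lly,
                                     float urx, float ury,
                                     float pageHeight, float reach) {
        float x = urx;               // start at the widget's right edge
        float y = pageHeight - ury;  // flip: distance from the page top
        float height = ury - lly;    // same height as the widget
        return new Rectangle2D.Float(x, y, reach, height);
    }
}
```

With PDFBox, the returned rectangle would be registered through
addRegion(...), followed by extractRegions(page) and
getTextForRegion(...). The region may need some vertical padding if the
label text does not line up exactly with the checkbox.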


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org


Re: Extracting field names from pdf

Posted by Gilad Denneboom <gi...@gmail.com>.
Or above, or under, or none at all...

On Mon, Sep 12, 2022 at 8:38 PM Tilman Hausherr <TH...@t-online.de>
wrote:

> Indeed, there isn't. But there is no guarantee that the "label" is to
> the right. It could be to the left.
>
> Tilman

Re: Extracting field names from pdf

Posted by Tilman Hausherr <TH...@t-online.de>.
Indeed, there isn't. But there is no guarantee that the "label" is to 
the right. It could be to the left.

Tilman
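
Since the label may sit on any side of the checkbox, or be missing
altogether, one possible generalization is to try several candidate
regions in turn. The sketch below is hypothetical: LabelSearch,
candidates and firstLabel are made-up names, the box is assumed to be
already expressed in top-left-origin coordinates, and the extractor
function stands in for whatever text lookup is used (for example
PDFTextStripperByArea). The order of the candidates is an arbitrary
choice, not something PDFBox prescribes.

```java
import java.awt.geom.Rectangle2D;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

// Hypothetical helper, not part of PDFBox.
class LabelSearch {

    // Candidate regions around a checkbox, in top-left-origin
    // coordinates (y grows downwards). Insertion order is the order in
    // which the regions are tried.
    static Map<String, Rectangle2D.Float> candidates(Rectangle2D.Float box,
                                                     float reach) {
        Map<String, Rectangle2D.Float> m = new LinkedHashMap<>();
        m.put("right", new Rectangle2D.Float(
                box.x + box.width, box.y, reach, box.height));
        m.put("left", new Rectangle2D.Float(
                box.x - reach, box.y, reach, box.height));
        m.put("above", new Rectangle2D.Float(
                box.x, box.y - box.height, reach, box.height));
        m.put("below", new Rectangle2D.Float(
                box.x, box.y + box.height, reach, box.height));
        return m;
    }

    // Returns the first non-blank text found, or empty if no candidate
    // region contains any text (the "none at all" case).
    static Optional<String> firstLabel(
            Map<String, Rectangle2D.Float> regions,
            Function<Rectangle2D.Float, String> extractor) {
        for (Map.Entry<String, Rectangle2D.Float> e : regions.entrySet()) {
            String text = extractor.apply(e.getValue());
            if (text != null && !text.trim().isEmpty()) {
                return Optional.of(text.trim());
            }
        }
        return Optional.empty();
    }
}
```

This remains a heuristic: nearby text is not guaranteed to be the label
of that particular checkbox, so the results would still need checking.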
