Posted to users@pdfbox.apache.org by Jacob MacWilliams <ja...@gmail.com> on 2022/09/12 17:16:06 UTC
Extracting field names from pdf
Hi,
The names of the fields in my AcroForm are all fairly unintuitive
(Check Box1, Check Box2, etc.). For the simple case of check boxes whose
labels in the PDF document are confined to one line, I am considering
iterating through all of the fields with the FieldIterator, extracting
the location of each checkbox, and using PDFTextStripperByArea to extract
the region immediately to the right of that location. I just wanted to
double-check that there's no functionality provided by PDFBox that I'm
overlooking that may help with this task (particularly anything that
might allow me to generalize the simple case described above).
Jacob MacWilliams
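
A minimal sketch of the geometry the approach above would need, using only
java.awt.geom so it runs without PDFBox. It assumes the widget rectangle is
given in PDF user space (origin at the bottom-left of the page) and that the
extraction region is wanted with a top-left origin, which is how
PDFTextStripperByArea regions are usually described. The class name
LabelRegionSketch, the method rightOf and the parameter maxLabelWidth are
hypothetical; page rotation and a crop box that does not start at (0, 0) are
ignored. In a real program the four corner values would probably come from
PDAnnotationWidget.getRectangle() and the result would be handed to
PDFTextStripperByArea.addRegion().

```java
import java.awt.geom.Rectangle2D;

/** Hypothetical helper: where to look for a label to the right of a checkbox. */
final class LabelRegionSketch {

    /**
     * llx, lly, urx, ury: widget rectangle in PDF user space (y grows upward).
     * pageHeight: height of the page in the same units.
     * maxLabelWidth: assumed upper bound on how far right the label extends.
     * Returns a region with a top-left origin (y grows downward).
     */
    static Rectangle2D.Double rightOf(double llx, double lly,
                                      double urx, double ury,
                                      double pageHeight,
                                      double maxLabelWidth) {
        double height = ury - lly;
        // Flip the y axis: the widget's upper edge becomes the region's top.
        double top = pageHeight - ury;
        // Start at the widget's right edge and extend to the right.
        return new Rectangle2D.Double(urx, top, maxLabelWidth, height);
    }
}
```

For example, a 12 x 12 checkbox at (100, 700) on a page 792 units high, with
maxLabelWidth set to 200, gives a region starting at x = 112, y = 80.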
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org
Re: Extracting field names from pdf
Posted by Gilad Denneboom <gi...@gmail.com>.
Or above, or under, or none at all...
On Mon, Sep 12, 2022 at 8:38 PM Tilman Hausherr <TH...@t-online.de>
wrote:
> Indeed, there isn't. But there is no guarantee that the "label" is to
> the right. It could be to the left.
>
> Tilman
Re: Extracting field names from pdf
Posted by Tilman Hausherr <TH...@t-online.de>.
Indeed, there isn't. But there is no guarantee that the "label" is to
the right. It could be to the left.
Tilman
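
Since the label may sit on any side of the checkbox, or be missing, one
possible generalization is to try several candidate regions in a fixed order
and keep the first one that yields text. The sketch below is an assumption,
not something PDFBox provides: the class LabelSearchSketch, the search order
and the reach parameter are all hypothetical, and the text lookup is passed in
as a function so the code runs without PDFBox. With PDFBox, that function
could wrap PDFTextStripperByArea (addRegion, extractRegions,
getTextForRegion). Coordinates are assumed to have a top-left origin.

```java
import java.awt.geom.Rectangle2D;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

/** Hypothetical search for a label on any side of a checkbox. */
final class LabelSearchSketch {

    /** Candidate regions around the box, in the order they will be tried. */
    static Map<String, Rectangle2D> candidates(Rectangle2D box, double reach) {
        Map<String, Rectangle2D> m = new LinkedHashMap<>();
        double leftX = Math.max(0, box.getMinX() - reach);
        double aboveY = Math.max(0, box.getMinY() - box.getHeight());
        m.put("right", new Rectangle2D.Double(
                box.getMaxX(), box.getMinY(), reach, box.getHeight()));
        m.put("left", new Rectangle2D.Double(
                leftX, box.getMinY(), box.getMinX() - leftX, box.getHeight()));
        // With a top-left origin, "above" means a smaller y value.
        m.put("above", new Rectangle2D.Double(
                box.getMinX(), aboveY, reach, box.getMinY() - aboveY));
        m.put("below", new Rectangle2D.Double(
                box.getMinX(), box.getMaxY(), reach, box.getHeight()));
        return m;
    }

    /** Returns the first non-blank text found, or empty if there is no label. */
    static Optional<String> firstLabel(Rectangle2D box, double reach,
                                       Function<Rectangle2D, String> textIn) {
        for (Rectangle2D region : candidates(box, reach).values()) {
            String text = textIn.apply(region);
            if (text != null && !text.trim().isEmpty()) {
                return Optional.of(text.trim());
            }
        }
        return Optional.empty();
    }
}
```

The fixed order is a design choice: it favours the right-hand side and would
pick up a neighbouring field's label if the real one is elsewhere, so the
result still needs checking against the actual form layout.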