Posted to users@pdfbox.apache.org by Tilman Hausherr <TH...@t-online.de> on 2016/09/25 10:31:04 UTC
Re: Newbie question about parsing PDFs
On 25.09.2016 at 12:24, David Goodenough wrote:
> I need to take a PDF document and extract each item of text with its
> position on the page. PDFBox looks to be a good tool to use, but the
> examples are mainly to do with building PDFs rather than parsing them
> and the API is very rich (for which read large).
>
> Does anyone have any code they would be prepared to share that does
> this kind of parsing, or some pointers as to which classes I should
> be looking at?
Have a look at PrintTextLocations.java in the source download.
Tilman
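
For readers who do not have the source download to hand, the general approach can be sketched as below. This is not the actual PrintTextLocations.java, only a minimal sketch that assumes PDFBox 2.x (where PDFTextStripper and TextPosition live in org.apache.pdfbox.text and documents are opened with PDDocument.load; 3.x opens documents differently). The class name TextLocationSketch, the Item holder and the extract helper are hypothetical names introduced for illustration.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.pdfbox.pdmodel.PDDocument;
import org.apache.pdfbox.text.PDFTextStripper;
import org.apache.pdfbox.text.TextPosition;

// Hypothetical sketch (assumes PDFBox 2.x): collects each piece of text with its position.
public class TextLocationSketch extends PDFTextStripper {

    // Hypothetical holder for one TextPosition's text, page and coordinates.
    public static final class Item {
        public final int page;
        public final String text;
        public final float x;
        public final float y;
        public final float fontSizePt;

        Item(int page, String text, float x, float y, float fontSizePt) {
            this.page = page;
            this.text = text;
            this.x = x;
            this.y = y;
            this.fontSizePt = fontSizePt;
        }
    }

    private final List<Item> items = new ArrayList<>();

    public TextLocationSketch() throws IOException {
        // Ask the stripper to order text by its position on the page.
        setSortByPosition(true);
    }

    // Called by PDFTextStripper for each run of text; instead of writing
    // the text out, record every TextPosition in the run.
    @Override
    protected void writeString(String text, List<TextPosition> textPositions) throws IOException {
        for (TextPosition tp : textPositions) {
            items.add(new Item(getCurrentPageNo(), tp.getUnicode(),
                    tp.getXDirAdj(), tp.getYDirAdj(), tp.getFontSizeInPt()));
        }
    }

    // Runs the stripper over the whole document and returns what was collected.
    public static List<Item> extract(PDDocument doc) throws IOException {
        TextLocationSketch stripper = new TextLocationSketch();
        stripper.getText(doc); // drives the writeString callbacks; the returned text is ignored
        return stripper.items;
    }

    public static void main(String[] args) throws IOException {
        try (PDDocument doc = PDDocument.load(new File(args[0]))) {
            for (Item it : extract(doc)) {
                System.out.printf("page %d  x=%.2f  y=%.2f  size=%.1f  '%s'%n",
                        it.page, it.x, it.y, it.fontSizePt, it.text);
            }
        }
    }
}
```

Each Item here corresponds to a single TextPosition, which is usually one character. To get words or lines with positions, the String passed to writeString could be kept together with the first TextPosition of the run.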
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org
Re: Newbie question about parsing PDFs
Posted by David Goodenough <da...@btconnect.com>.
On Sunday, 25 September 2016 12:31:04 BST Tilman Hausherr wrote:
> On 25.09.2016 at 12:24, David Goodenough wrote:
> > I need to take a PDF document and extract each item of text with its
> > position on the page. PDFBox looks to be a good tool to use, but the
> > examples are mainly to do with building PDFs rather than parsing them
> > and the API is very rich (for which read large).
> >
> > Does anyone have any code they would be prepared to share that does
> > this kind of parsing, or some pointers as to which classes I should
> > be looking at?
>
> Have a look at PrintTextLocations.java in the source download.
>
> Tilman
Wonderful, looks like exactly what I was looking for.
David