Posted to users@pdfbox.apache.org by Lorena Leishman <lo...@yahoo.com.INVALID> on 2015/06/05 16:55:23 UTC
PDFTextStripper question
Is there a way to use PDFTextStripper so that the extracted text is returned in the same positions it occupied in the PDF? Or is there a way to get the position of each word?
Lorena
Re: PDFTextStripper question
Posted by John Hewson <jo...@jahewson.com>.
If you’re also interested in getting the bounding boxes of individual glyphs then check out:
https://github.com/apache/pdfbox/blob/trunk/examples/src/main/java/org/apache/pdfbox/examples/rendering/CustomPageDrawer.java
— John
Re: PDFTextStripper question
Posted by Lorena Leishman <lo...@yahoo.com.INVALID>.
Will do. Thanks!
Re: PDFTextStripper question
Posted by Tilman Hausherr <TH...@t-online.de>.
Yes, see the PrintTextLocations.java example.
See also
https://stackoverflow.com/questions/11873801/using-pdfbox-to-determine-the-coordinates-of-words-in-a-document
https://stackoverflow.com/questions/16579146/pdfbox-1-8-printtextlocations-wrong-textposition-height-for-a-multi-page-pdf
https://stackoverflow.com/questions/21207943/pdfbox-text-extraction-with-bold-italic-info-does-not-work-on-some-files
for possible problems / solutions.
Tilman
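For readers who want word positions rather than per-character ones: PrintTextLocations reports one entry per character, so words have to be assembled from those entries. The following is only a rough sketch of that grouping step, not code from the PDFBox examples. Glyph, WordBox and groupIntoWords are hypothetical names made up for this illustration; in real code each Glyph would presumably be filled from a PDFBox TextPosition (getUnicode(), getXDirAdj(), getYDirAdj(), getWidthDirAdj(), getHeightDir()). The sketch also assumes that (x, y) is the top-left corner of the glyph and that whitespace characters separate words, which may not hold for every PDF.

```java
import java.util.ArrayList;
import java.util.List;

class WordBoxes {

    /** Hypothetical stand-in for one extracted character and its box (not a PDFBox class). */
    public record Glyph(String text, float x, float y, float width, float height) {}

    /** A word together with the rectangle that encloses all of its glyphs. */
    public record WordBox(String text, float x, float y, float width, float height) {}

    /** Merges consecutive non-whitespace glyphs into words and unions their boxes. */
    public static List<WordBox> groupIntoWords(List<Glyph> glyphs) {
        List<WordBox> words = new ArrayList<>();
        StringBuilder text = new StringBuilder();
        float minX = 0, minY = 0, maxX = 0, maxY = 0;

        for (Glyph g : glyphs) {
            if (g.text().isBlank()) {
                // Assumption: a whitespace glyph ends the current word.
                if (text.length() > 0) {
                    words.add(new WordBox(text.toString(), minX, minY, maxX - minX, maxY - minY));
                    text.setLength(0);
                }
                continue;
            }
            if (text.length() == 0) {
                // First glyph of a new word: start the box from this glyph.
                minX = g.x();
                minY = g.y();
                maxX = g.x() + g.width();
                maxY = g.y() + g.height();
            } else {
                // Later glyphs: grow the box so that it still encloses everything.
                minX = Math.min(minX, g.x());
                minY = Math.min(minY, g.y());
                maxX = Math.max(maxX, g.x() + g.width());
                maxY = Math.max(maxY, g.y() + g.height());
            }
            text.append(g.text());
        }
        // Flush the last word if the input did not end with whitespace.
        if (text.length() > 0) {
            words.add(new WordBox(text.toString(), minX, minY, maxX - minX, maxY - minY));
        }
        return words;
    }

    public static void main(String[] args) {
        // Made-up sample coordinates, purely for illustration.
        List<Glyph> glyphs = List.of(
                new Glyph("H", 10f, 100f, 8f, 12f),
                new Glyph("i", 18f, 100f, 4f, 12f),
                new Glyph(" ", 22f, 100f, 4f, 12f),
                new Glyph("P", 26f, 100f, 8f, 12f),
                new Glyph("D", 34f, 98f, 8f, 14f),
                new Glyph("F", 42f, 100f, 7f, 12f));
        for (WordBox w : groupIntoWords(glyphs)) {
            System.out.println(w);
        }
    }
}
```

The grouping is kept separate from PDFBox on purpose, so it can be tried without a PDF at hand. The Stack Overflow links above discuss cases where the reported heights and positions are not what one would expect, so the resulting boxes should be treated as approximate.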
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org