Posted to users@pdfbox.apache.org by Tim Allison <ta...@apache.org> on 2019/06/04 14:51:03 UTC
Extract actual /Table /TD /TR markup info?
All,
I have some PDFs with actual /Table /TD /TR markup.
How much effort would it be to extend PDFTextStripper to add, e.g.
startTable(), endTable(), startTD(), endTD(), etc...?
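To make the idea concrete, here is a rough sketch of the callback shape described above. It is not PDFBox code: Node, TableHandler, walk() and render() are hypothetical stand-ins invented for illustration, and the sketch assumes the tagged structure has already been read into a simple tree of typed elements. It only shows how start/end events could be fired while walking /Table, /TR and /TD elements depth-first.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: none of these types are part of PDFBox.
class TableEventSketch {

    // Stand-in for a tagged structure element ("Table", "TR", "TD", ...).
    static final class Node {
        final String type;
        final String text; // text of a leaf cell, or null
        final List<Node> kids = new ArrayList<>();

        Node(String type, String text) {
            this.type = type;
            this.text = text;
        }

        Node add(Node kid) {
            kids.add(kid);
            return this;
        }
    }

    // The callbacks proposed in the message, plus a text callback (an assumption).
    interface TableHandler {
        void startTable();
        void endTable();
        void startTR();
        void endTR();
        void startTD();
        void endTD();
        void text(String s);
    }

    // Depth-first walk: start callback, own text, kids, then end callback.
    static void walk(Node node, TableHandler h) {
        switch (node.type) {
            case "Table": h.startTable(); break;
            case "TR": h.startTR(); break;
            case "TD": h.startTD(); break;
            default: break; // other element types produce no events
        }
        if (node.text != null) {
            h.text(node.text);
        }
        for (Node kid : node.kids) {
            walk(kid, h);
        }
        switch (node.type) {
            case "Table": h.endTable(); break;
            case "TR": h.endTR(); break;
            case "TD": h.endTD(); break;
            default: break;
        }
    }

    // Example handler that turns the events into HTML-like tags.
    static String render(Node root) {
        final StringBuilder sb = new StringBuilder();
        walk(root, new TableHandler() {
            public void startTable() { sb.append("<table>"); }
            public void endTable() { sb.append("</table>"); }
            public void startTR() { sb.append("<tr>"); }
            public void endTR() { sb.append("</tr>"); }
            public void startTD() { sb.append("<td>"); }
            public void endTD() { sb.append("</td>"); }
            public void text(String s) { sb.append(s); }
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        // One table with a single row of two cells.
        Node table = new Node("Table", null)
            .add(new Node("TR", null)
                .add(new Node("TD", "a"))
                .add(new Node("TD", "b")));
        System.out.println(render(table));
    }
}
```

In a real extension the events would presumably be driven by the document's structure tree and marked content rather than by a hand-built tree like this one.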
If I do have time to work on this (uncertain at this point), would
there be interest in putting this into PDFBox...or am I missing where
it already exists?
Cheers,
Tim
---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org
Re: Extract actual /Table /TD /TR markup info?
Posted by Tilman Hausherr <TH...@t-online.de>.
There is certainly a need for it. Related questions are regularly asked
on Stack Overflow.
I don't know how much effort is needed... I did work a bit on the
structure tree, but only in an abstract way, so I never really
understood what the elements mean.
Tilman
On 04.06.2019 at 16:51, Tim Allison wrote:
> All,
> I have some PDFs with actual /Table /TD /TR markup.
>
> How much effort would it be to extend PDFTextStripper to add, e.g.
> startTable(), endTable(), startTD(), endTD(), etc...?
>
> If I do have time to work on this (uncertain at this point), would
> there be interest in putting this into PDFBox...or am I missing where
> it already exists?
>
> Cheers,
>
> Tim