Posted to users@pdfbox.apache.org by zhangkun <zh...@huawei.com> on 2011/05/06 09:01:34 UTC
How to Extract a Table from the PDF file
Dear Sir,
I am trying to extract a table from a PDF file. I can get the text from
the PDF file. However, when some cells in the table are blank, the
extracted text no longer shows which columns the remaining values belong to.
For example, there are two tables in the PDF file.
Table A:
Col1 Col2 Col3 Col4
Row1 100 200 300 400
Row2 17 89 985
Row3 98 134
Table B:
Col1 Col2 Col3 Col4
Row1 100 200 300 400
Row2 17 89 985
Row3 98 134
In the text extracted from the PDF file, both Table A and Table B come out as:
Col1 Col2 Col3 Col4
Row1 100 200 300 400
Row2 17 89 985
Row3 98 134
I cannot distinguish between Table A and Table B. Could you please help?
Best Regards
Zhangkun
2011/5/6
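
One possible approach to the problem described above is to keep the x
coordinate of each piece of text and match it against the x coordinates of
the column headers, so that a blank cell stays blank instead of collapsing.
The sketch below is plain Java and only illustrates the matching step. The
class ColumnAssigner, the Fragment type and all coordinates are made up for
illustration; the real positions of the blanks in Table A and Table B are not
recoverable from the text above. In PDFBox the coordinates could presumably be
collected by subclassing PDFTextStripper, overriding processTextPosition and
reading TextPosition.getX(), but that part is not shown here.

```java
import java.util.Arrays;
import java.util.List;

public class ColumnAssigner {

    /** A piece of extracted text plus the x coordinate where it starts (hypothetical type). */
    public static final class Fragment {
        final String text;
        final float x;

        public Fragment(String text, float x) {
            this.text = text;
            this.x = x;
        }
    }

    /**
     * Puts each fragment into the column whose header x coordinate is nearest.
     * Columns that receive no fragment stay as empty strings.
     */
    public static String[] assignToColumns(float[] columnX, List<Fragment> row) {
        String[] cells = new String[columnX.length];
        Arrays.fill(cells, "");
        for (Fragment f : row) {
            int best = 0;
            for (int i = 1; i < columnX.length; i++) {
                if (Math.abs(f.x - columnX[i]) < Math.abs(f.x - columnX[best])) {
                    best = i;
                }
            }
            // Join with a space if two fragments land in the same column.
            cells[best] = cells[best].isEmpty() ? f.text : cells[best] + " " + f.text;
        }
        return cells;
    }

    public static void main(String[] args) {
        // Assumed x positions of the headers Col1..Col4 (made-up numbers).
        float[] columnX = {100f, 200f, 300f, 400f};

        // Two made-up layouts for the row "17 89 985" with the blank in different columns.
        List<Fragment> blankInCol3 = Arrays.asList(
                new Fragment("17", 102f), new Fragment("89", 198f), new Fragment("985", 401f));
        List<Fragment> blankInCol2 = Arrays.asList(
                new Fragment("17", 102f), new Fragment("89", 301f), new Fragment("985", 399f));

        System.out.println(String.join("|", assignToColumns(columnX, blankInCol3))); // 17|89||985
        System.out.println(String.join("|", assignToColumns(columnX, blankInCol2))); // 17||89|985
    }
}
```

Nearest-header matching is only a starting point: right-aligned numbers or
headers that are much wider than their values may need matching on the end
or the center of each fragment instead of its start.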