Posted to users@pdfbox.apache.org by Benjamin Westphal <be...@saxess.ag> on 2014/08/06 11:04:22 UTC

Find drawn lines on pdf-page

Hey there,

My problem is as follows:
I have a document with a table on it. To determine where the table 
starts, where it ends, and where the table header is, I would like to 
find the border of the table and the inner row lines.

My idea was to find all drawn lines and borders on a PDF page and get 
the start and end position of each line (sorry, I am not very familiar 
with the PDF specification).

Is this possible with PDFBox, and if so, how? I thought about iterating 
over the tokens, but, as I said, I am not very familiar with the PDF 
spec, so I didn't really know what to look for.

I found the method PDPageContentStream.drawLine(), but I could not find 
anything there about how to READ/get lines.

Thanks in advance

Re: Find drawn lines on pdf-page

Posted by Tilman Hausherr <TH...@t-online.de>.
The solution will probably be to extend PageDrawer and/or the 
line-related operators, e.g. LineTo, MoveTo, CurveTo, and many more. 
You'll have to look at the source code. To see what's really going on, 
trace PDFStreamEngine.
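
As a rough illustration of what the line-related operators look like at 
the token level, here is a minimal sketch in plain Java that does not use 
PDFBox at all. In a PDF content stream, "m" starts a subpath, "l" appends 
a straight segment, "re" appends a rectangle, "h" closes the subpath, and 
"S" (or "s", "B", "b") strokes the path. The class LineOperatorScan and 
its Segment type are hypothetical names made up for this example. The 
sketch assumes the already decoded content stream is available as a 
String containing only numbers and operators, and it ignores the 
transformation matrix ("cm"), the graphics state stack ("q"/"Q"), curves, 
text strings, and form XObjects, so the coordinates it reports are not 
necessarily page coordinates.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Hypothetical helper, not part of PDFBox: collects stroked straight segments. */
final class LineOperatorScan {

    /** One straight segment in the content stream's own (untransformed) coordinates. */
    static final class Segment {
        final double x1, y1, x2, y2;

        Segment(double x1, double y1, double x2, double y2) {
            this.x1 = x1; this.y1 = y1; this.x2 = x2; this.y2 = y2;
        }

        boolean isHorizontal(double tolerance) { return Math.abs(y1 - y2) <= tolerance; }

        boolean isVertical(double tolerance) { return Math.abs(x1 - x2) <= tolerance; }

        @Override
        public String toString() {
            return "(" + x1 + ", " + y1 + ") -> (" + x2 + ", " + y2 + ")";
        }
    }

    /** Scans whitespace-separated tokens and returns every segment that gets stroked. */
    static List<Segment> scan(String content) {
        List<Segment> stroked = new ArrayList<>();
        List<Segment> path = new ArrayList<>();      // current path, not painted yet
        Deque<Double> operands = new ArrayDeque<>(); // numbers seen since the last operator
        double curX = 0, curY = 0, startX = 0, startY = 0;

        for (String tok : content.trim().split("\\s+")) {
            if (tok.matches("[+-]?(\\d+\\.?\\d*|\\.\\d+)")) {
                operands.addLast(Double.parseDouble(tok));
                continue;
            }
            switch (tok) {
                case "m": // move to: begin a new subpath
                    if (operands.size() >= 2) {
                        curY = startY = operands.pollLast();
                        curX = startX = operands.pollLast();
                    }
                    break;
                case "l": // line to: segment from the current point
                    if (operands.size() >= 2) {
                        double y = operands.pollLast();
                        double x = operands.pollLast();
                        path.add(new Segment(curX, curY, x, y));
                        curX = x;
                        curY = y;
                    }
                    break;
                case "re": // rectangle: x y width height
                    if (operands.size() >= 4) {
                        double h = operands.pollLast();
                        double w = operands.pollLast();
                        double y = operands.pollLast();
                        double x = operands.pollLast();
                        path.add(new Segment(x, y, x + w, y));
                        path.add(new Segment(x + w, y, x + w, y + h));
                        path.add(new Segment(x + w, y + h, x, y + h));
                        path.add(new Segment(x, y + h, x, y));
                        curX = startX = x;
                        curY = startY = y;
                    }
                    break;
                case "h": case "s": case "b": case "b*": // close subpath (all but "h" also stroke)
                    if (curX != startX || curY != startY) {
                        path.add(new Segment(curX, curY, startX, startY));
                    }
                    curX = startX;
                    curY = startY;
                    if (!tok.equals("h")) {
                        stroked.addAll(path);
                        path.clear();
                    }
                    break;
                case "S": case "B": case "B*": // stroke the current path
                    stroked.addAll(path);
                    path.clear();
                    break;
                case "n": case "f": case "F": case "f*": // path ends without a stroke
                    path.clear();
                    break;
                default: // every other operator is ignored in this sketch
                    break;
            }
            operands.clear();
        }
        return stroked;
    }

    public static void main(String[] args) {
        String sample = "100 700 m 300 700 l S 50 50 20 10 re f 0 0 10 20 re S";
        for (Segment s : scan(sample)) {
            if (s.isHorizontal(0.5)) {
                System.out.println("horizontal: " + s);
            }
        }
    }
}
```

Filtering the result with isHorizontal() gives candidates for row lines, 
and isVertical() gives candidates for column borders. Note that some PDF 
producers draw table rules as thin filled rectangles ("re" followed by 
"f") rather than stroked lines; this sketch deliberately drops those, so 
a real implementation would probably have to look at filled paths too.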

Tilman

