Posted to users@pdfbox.apache.org by Michel Cozzolino <mi...@gmail.com> on 2019/01/11 17:17:53 UTC

deleting a square from pdf file

Hello,

I’ve been using PDFBox with full satisfaction for a couple of months.
However, I can’t find a viable solution for the problem I’m facing now, so
I’m asking for some help.

I have to delete a small black square from the PDF file. It appears on the odd
pages along the right border of the page (here is an example:
https://www.dropbox.com/s/tlny4qkiek5efb0/example.pdf?dl=0 ). I know the
square's dimensions, so I can easily work out the x coordinate. The problem is
that I don’t have the y coordinate, as the square can be placed at any height
on the page. I thought of covering it with a white rectangle drawn as a strip
along the right border of the page, but this is not feasible because some
lines of text may extend into that zone.

The rest of the page contains text and pictures, and that square is the only
“shape” on the page. Is there a way to get it or its coordinates?

Many thanks

Michele

Re: deleting a square from pdf file

Posted by Tilman Hausherr <TH...@t-online.de>.
Hi,

You could get the content stream, and then search for something like this:

q
   1 0 0 1 564.094 785.197 cm
   0 0 m
   31.182 0 l
   31.182 -31.181 l
   0 -31.181 l
   f*
Q

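This is the same on all pages except for the 6th number of the cm line
(785.197 here). That number is the vertical translation, so it gives the y
coordinate of the square's top edge; the path is then drawn downwards from
there by 31.181 units.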

Then you rewrite the content stream without those tokens. See the
RemoveAllText example in the PDFBox examples for some inspiration on how to
get and rewrite the tokens.
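
The search-and-remove step could be sketched roughly as follows. This is only an illustration under some assumptions: it works on a plain list of string tokens produced by a naive whitespace split, not on the token objects PDFBox returns, and the class and method names (SquareRemover, tokenize, matchesAt, removeSquares) are made up for this sketch. In real code the tokens would come from PDFStreamParser and be written back with ContentStreamWriter, as the RemoveAllText example does.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of PDFBox: finds and drops the square's drawing sequence.
final class SquareRemover {

    // Shape of the sequence shown above; "#" stands for any numeric operand.
    private static final String[] PATTERN = {
        "q", "#", "#", "#", "#", "#", "#", "cm",
        "#", "#", "m",
        "#", "#", "l",
        "#", "#", "l",
        "#", "#", "l",
        "f*", "Q"
    };

    // Naive tokenizer (assumption: no strings or inline images containing whitespace).
    static List<String> tokenize(String content) {
        return new ArrayList<>(Arrays.asList(content.trim().split("\\s+")));
    }

    static boolean isNumber(String token) {
        return token.matches("[+-]?(\\d+\\.?\\d*|\\.\\d+)");
    }

    // True if the tokens starting at 'start' have the same operator/operand shape as PATTERN.
    static boolean matchesAt(List<String> tokens, int start) {
        if (start + PATTERN.length > tokens.size()) {
            return false;
        }
        for (int i = 0; i < PATTERN.length; i++) {
            String token = tokens.get(start + i);
            boolean ok = PATTERN[i].equals("#") ? isNumber(token) : PATTERN[i].equals(token);
            if (!ok) {
                return false;
            }
        }
        return true;
    }

    // Returns the tokens without any matched square; the 6th cm operand (y) of each match is added to foundY.
    static List<String> removeSquares(List<String> tokens, List<Double> foundY) {
        List<String> kept = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            if (matchesAt(tokens, i)) {
                foundY.add(Double.parseDouble(tokens.get(i + 6)));
                i += PATTERN.length; // skip the whole q ... Q block
            } else {
                kept.add(tokens.get(i));
                i++;
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        String stream = "BT /F1 12 Tf (Hi) Tj ET "
            + "q 1 0 0 1 564.094 785.197 cm 0 0 m 31.182 0 l 31.182 -31.181 l 0 -31.181 l f* Q";
        List<Double> ys = new ArrayList<>();
        List<String> kept = removeSquares(tokenize(stream), ys);
        System.out.println("y values found: " + ys);
        System.out.println("remaining stream: " + String.join(" ", kept));
    }
}
```

Matching only on the shape of the sequence may be too loose for other documents; comparing the x translation and the side length against the known values of the square would make the match stricter.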

I used PDFDebugger to look at the content stream.

Tilman
