Posted to users@pdfbox.apache.org by Vaishali Sharma <va...@gmail.com> on 2023/08/09 13:59:44 UTC

Query: Finding overlapping texts in the pdf

Hi Team

As part of our project, we have around 700 PDFs generated through
automation. As a member of the testing team, I have to make sure that all
content is rendered correctly in each PDF on page 1 and page 2.

The issue is that there is usually overlapping text in some places. Is
there a way to identify whether any of the 700 files have overlaps using
PDFBox classes and methods? We don't need to know which text is overlapping,
just the number of files out of 700 that have any overlapping content.

Regards
Vaishali Sharma

Re: Query: Finding overlapping texts in the pdf

Posted by Tilman Hausherr <TH...@t-online.de>.
Hi,
There is no solution out of the box.
You could try to get the bounding boxes of the glyphs by modifying the
DrawPrintTextLocations.java example (cyan boxes) and check whether any of
them overlap.
However, this would fail if any kerning is done, because then the
bounding boxes would overlap.
You could get the glyph shapes and check whether any of them overlap;
however, this would definitely fail for Type 3 fonts because these aren't
vector fonts.
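
To illustrate the two checks suggested above, here is a minimal sketch that
uses only java.awt.geom and makes no PDFBox calls. It assumes the glyph
boxes or shapes of one page have already been collected, for example by an
adapted DrawPrintTextLocations. The class name GlyphOverlapCheck, its
method names and the 0.5 tolerance are hypothetical and chosen only for
illustration; the tolerance is one possible way to ignore the small
overlaps that kerning produces, and it would need tuning per font size.

```java
import java.awt.Shape;
import java.awt.geom.Area;
import java.awt.geom.Rectangle2D;
import java.util.List;

// Hypothetical helper, not part of PDFBox.
class GlyphOverlapCheck {

    // Bounding-box check: the boxes count as overlapping only if the shared
    // region is wider AND taller than the tolerance (in the boxes' units).
    static boolean boxesOverlap(Rectangle2D a, Rectangle2D b, double tolerance) {
        // For disjoint boxes the intersection has a negative width or height.
        Rectangle2D common = a.createIntersection(b);
        return common.getWidth() > tolerance && common.getHeight() > tolerance;
    }

    // Shape check: intersect the two outlines and see whether anything is left.
    static boolean shapesOverlap(Shape a, Shape b) {
        // Cheap rejection first; Area operations are comparatively slow.
        if (!a.getBounds2D().intersects(b.getBounds2D())) {
            return false;
        }
        Area common = new Area(a);
        common.intersect(new Area(b));
        return !common.isEmpty();
    }

    // Pairwise scan over all boxes of one page (quadratic, fine for a sketch).
    static boolean hasOverlap(List<Rectangle2D> boxes, double tolerance) {
        for (int i = 0; i < boxes.size(); i++) {
            for (int j = i + 1; j < boxes.size(); j++) {
                if (boxesOverlap(boxes.get(i), boxes.get(j), tolerance)) {
                    return true;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<Rectangle2D> boxes = List.of(
                new Rectangle2D.Double(0, 0, 10, 10),
                new Rectangle2D.Double(9.8, 0, 10, 10), // kerning-sized overlap
                new Rectangle2D.Double(30, 0, 10, 10));
        System.out.println("strict:   " + hasOverlap(boxes, 0.0));
        System.out.println("tolerant: " + hasOverlap(boxes, 0.5));
    }
}
```

To count affected files, one would run hasOverlap once per page and
increment a counter for each PDF in which any page reports true. The shape
check can be swapped in where glyph outlines are available, with the Type 3
caveat above still applying.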

Tilman
