Posted to users@pdfbox.apache.org by Joel Hirsh <jo...@gmail.com> on 2019/07/31 20:42:50 UTC

Detecting invisible (white on white?) lines

I asked about this more than a year ago, and the answer was that my
basic method seemed correct. However, I was never able to resolve the issue,
and now it has come up again. I am trying to detect, and thereby ignore,
invisible lines.

However, when I try to read the stroking (or nonstroking) color of the
lines, every color component I get is always 0.0.

You had previously requested a PDF and some code in order to do anything
more.

Attached is:
1) a PDF that has a lot of invisible lines.
2) the core pieces of my Java code to process lines.
3) a listing of the lines found in this PDF.
4) a screenshot of examining a PDGraphicsState object in the Eclipse
debugger on a call to lineTo(), showing that all the colors I can get are
0.0.

Thanks for your help

Re: Detecting invisible (white on white?) lines

Posted by Tilman Hausherr <TH...@t-online.de>.
Hi,

That is a different file. I don't see any troubles with that one, color 
is black and the lines are stroked.

I don't know what's wrong with your code because it doesn't run (syntax 
errors, no main, ...) and I'm lazy. However see this SO answer:
https://stackoverflow.com/questions/38931422/pdfbox-2-0-2-calling-of-pagedrawer-processpage-method-caught-exceptions

I added
System.out.println("strokePath: " + getGraphicsState().getStrokingColor());
in strokePath() and lineTo() and I got valid results:


moveTo
lineTo
lineTo: PDColor{components=[0.0, 0.0, 0.0], patternName=null, colorSpace=DeviceRGB}
java.awt.geom.Rectangle2D$Float[x=35.0,y=84.0,w=0.0,h=510.0]
strokePath: PDColor{components=[0.0, 0.0, 0.0], patternName=null, colorSpace=DeviceRGB}
moveTo
lineTo
lineTo: PDColor{components=[0.0, 0.0, 0.0], patternName=null, colorSpace=DeviceRGB}
java.awt.geom.Rectangle2D$Float[x=478.0,y=84.0,w=0.0,h=510.0]
strokePath: PDColor{components=[0.0, 0.0, 0.0], patternName=null, colorSpace=DeviceRGB}
moveTo
lineTo
lineTo: PDColor{components=[0.0, 0.0, 0.0], patternName=null, colorSpace=DeviceRGB}
java.awt.geom.Rectangle2D$Float[x=553.0,y=84.0,w=0.0,h=510.0]
strokePath: PDColor{components=[0.0, 0.0, 0.0], patternName=null, colorSpace=DeviceRGB}
moveTo
lineTo
lineTo: PDColor{components=[0.6, 0.6, 0.6], patternName=null, colorSpace=DeviceRGB}
java.awt.geom.Rectangle2D$Float[x=61.0,y=84.0,w=0.0,h=532.0]
strokePath: PDColor{components=[0.6, 0.6, 0.6], patternName=null, colorSpace=DeviceRGB}
moveTo
lineTo
lineTo: PDColor{components=[0.6, 0.6, 0.6], patternName=null, colorSpace=DeviceRGB}
java.awt.geom.Rectangle2D$Float[x=428.0,y=84.0,w=0.0,h=532.0]
strokePath: PDColor{components=[0.6, 0.6, 0.6], patternName=null, colorSpace=DeviceRGB}
moveTo
lineTo
lineTo: PDColor{components=[0.6, 0.6, 0.6], patternName=null, colorSpace=DeviceRGB}
java.awt.geom.Rectangle2D$Float[x=501.0,y=84.0,w=0.0,h=532.0]
strokePath: PDColor{components=[0.6, 0.6, 0.6], patternName=null, colorSpace=DeviceRGB}
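
Since the stroking color comes back as a component array plus a color space
name, a "white on white" test can be done on those two values. The following
is only a sketch and not code from this thread: ColorVisibility and isWhite
are hypothetical names, the tolerance is an arbitrary choice, and only the
three device color spaces are handled. With PDFBox the inputs would
presumably come from PDColor.getComponents() and
PDColor.getColorSpace().getName().

```java
/** Hypothetical helper: decides whether a device color is (near) white. */
final class ColorVisibility {
    private ColorVisibility() {
    }

    /**
     * @param colorSpace PDF color space name, e.g. "DeviceRGB"
     * @param c          color components in the range 0..1
     * @param tolerance  how far from pure white still counts as white
     */
    static boolean isWhite(String colorSpace, float[] c, float tolerance) {
        switch (colorSpace) {
            case "DeviceGray":
                // 1.0 is white in DeviceGray
                return c.length == 1 && c[0] >= 1f - tolerance;
            case "DeviceRGB":
                // all three channels at full intensity is white
                return c.length == 3
                        && c[0] >= 1f - tolerance
                        && c[1] >= 1f - tolerance
                        && c[2] >= 1f - tolerance;
            case "DeviceCMYK":
                // CMYK is subtractive: no ink at all is white
                return c.length == 4
                        && c[0] <= tolerance
                        && c[1] <= tolerance
                        && c[2] <= tolerance
                        && c[3] <= tolerance;
            default:
                // unknown color space: assume the line is visible
                return false;
        }
    }
}
```

With this check, the black [0.0, 0.0, 0.0] and gray [0.6, 0.6, 0.6] colors
listed above both count as visible. A white stroke on a non-white background
would of course still be visible, so this only covers the simple case.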

Tilman

On 02.08.2019 at 08:07, Joel Hirsh wrote:
> More to the point what I really need is to recognize the visible lines in
> this file:
> https://drive.google.com/file/d/1rhCbCBV1cztdp3w3MFLEmWcVMVMotJe8/view?usp=sharing
>
> And when I look at the PDGraphicsState for these visible lines I cannot
> find a color either.
>
> Is there a color there that I can use?  Or some other attribute that would
> distinguish the visible from the invisible?
>
> Thank you
>
>
> On Thu, Aug 1, 2019 at 4:05 PM Joel Hirsh <jo...@gmail.com> wrote:
>
>> Ok, is there any way to identify them as being part of a clipping region
>> rather than lines that are part of the content?
>>
>> On Thu, Aug 1, 2019 at 9:22 AM Tilman Hausherr <TH...@t-online.de>
>> wrote:
>>
>>> I found the "white lines":
>>>
>>> Root/Pages/Kids/[0]/Resources/XObject/Fm0/Resources/XObject/Fm1
>>>
>>> they all look like this:
>>>
>>>       0 792 m
>>>       0 0 l
>>>       612 0 l
>>>       612 792 l
>>>       h
>>>       W
>>>       n
>>>
>>> These are lines, but they build a clipping region.


---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@pdfbox.apache.org
For additional commands, e-mail: users-help@pdfbox.apache.org


Re: Detecting invisible (white on white?) lines

Posted by Joel Hirsh <jo...@gmail.com>.
More to the point, what I really need is to recognize the visible lines in
this file:
https://drive.google.com/file/d/1rhCbCBV1cztdp3w3MFLEmWcVMVMotJe8/view?usp=sharing

And when I look at the PDGraphicsState for these visible lines I cannot
find a color either.

Is there a color there that I can use?  Or some other attribute that would
distinguish the visible from the invisible?

Thank you


On Thu, Aug 1, 2019 at 4:05 PM Joel Hirsh <jo...@gmail.com> wrote:

> Ok, is there any way to identify them as being part of a clipping region
> rather than lines that are part of the content?
>
> On Thu, Aug 1, 2019 at 9:22 AM Tilman Hausherr <TH...@t-online.de>
> wrote:
>
>> I found the "white lines":
>>
>> Root/Pages/Kids/[0]/Resources/XObject/Fm0/Resources/XObject/Fm1
>>
>> they all look like this:
>>
>>      0 792 m
>>      0 0 l
>>      612 0 l
>>      612 792 l
>>      h
>>      W
>>      n
>>
>> These are lines, but they build a clipping region.

Re: Detecting invisible (white on white?) lines

Posted by Tilman Hausherr <TH...@t-online.de>.
On 02.08.2019 at 01:05, Joel Hirsh wrote:
> Ok, is there any way to identify them as being part of a clipping region
> rather than lines that are part of the content?


Just ignore them, because these aren't part of a stroke or fill operation.
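
One way to read that advice: buffer the segments while a path is being built,
keep them only when a painting operator follows, and drop them when the path
ends without painting (as in "W n"). The sketch below is not PDFBox code and
not from this thread. PaintedLineCollector and its process() method are
made-up names, and it only understands the operators m, l, h, W, W*, n, S, f,
f*, B and B*. In a PDFGraphicsStreamEngine subclass the same bookkeeping
would go into the moveTo(), lineTo(), closePath(), clip(), endPath(),
strokePath() and fillPath() callbacks.

```java
import java.awt.geom.Line2D;
import java.awt.geom.Point2D;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical collector: reports only segments that get stroked or filled. */
final class PaintedLineCollector {
    private final List<Line2D> pending = new ArrayList<>(); // path under construction
    private final List<Line2D> painted = new ArrayList<>(); // segments that were painted
    private Point2D subpathStart;
    private Point2D current;

    void moveTo(double x, double y) {
        subpathStart = new Point2D.Double(x, y);
        current = subpathStart;
    }

    void lineTo(double x, double y) {
        Point2D next = new Point2D.Double(x, y);
        if (current != null) {
            pending.add(new Line2D.Double(current, next));
        }
        current = next;
    }

    void closePath() {
        // "h" adds a segment back to the start of the subpath
        if (current != null && subpathStart != null && !current.equals(subpathStart)) {
            pending.add(new Line2D.Double(current, subpathStart));
        }
        current = subpathStart;
    }

    void paintPath() {
        // S, f, B, ...: the buffered segments are real content
        painted.addAll(pending);
        discardPath();
    }

    void endPath() {
        // "n": path ends without painting, e.g. after a clip
        discardPath();
    }

    private void discardPath() {
        pending.clear();
        subpathStart = null;
        current = null;
    }

    List<Line2D> paintedLines() {
        return painted;
    }

    /** Feeds a whitespace-separated fragment of path operators. */
    void process(String content) {
        List<Double> operands = new ArrayList<>();
        for (String token : content.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            switch (token) {
                case "m": moveTo(operands.get(0), operands.get(1)); break;
                case "l": lineTo(operands.get(0), operands.get(1)); break;
                case "h": closePath(); break;
                case "W": case "W*": break; // clip only marks the path, nothing is painted
                case "n": endPath(); break;
                case "S": case "f": case "f*": case "B": case "B*": paintPath(); break;
                default: operands.add(Double.parseDouble(token)); continue;
            }
            operands.clear();
        }
    }
}
```

Feeding it "0 792 m 0 0 l 612 0 l 612 792 l h W n" leaves paintedLines()
empty, while a fragment such as "35 84 m 35 594 l S" (an invented example)
adds one segment.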

Tilman




Re: Detecting invisible (white on white?) lines

Posted by Joel Hirsh <jo...@gmail.com>.
Ok, is there any way to identify them as being part of a clipping region
rather than lines that are part of the content?

On Thu, Aug 1, 2019 at 9:22 AM Tilman Hausherr <TH...@t-online.de>
wrote:

> I found the "white lines":
>
> Root/Pages/Kids/[0]/Resources/XObject/Fm0/Resources/XObject/Fm1
>
> they all look like this:
>
>      0 792 m
>      0 0 l
>      612 0 l
>      612 792 l
>      h
>      W
>      n
>
> These are lines, but they build a clipping region.

Re: Detecting invisible (white on white?) lines

Posted by Tilman Hausherr <TH...@t-online.de>.
I found the "white lines":

Root/Pages/Kids/[0]/Resources/XObject/Fm0/Resources/XObject/Fm1

they all look like this:

     0 792 m
     0 0 l
     612 0 l
     612 792 l
     h
     W
     n

These are lines, but they build a clipping region.





Re: Detecting invisible (white on white?) lines

Posted by Joel Hirsh <jo...@gmail.com>.
Would you be referring to a clipping region in PDFBox or in the PDF?
I know I am not defining one in my code, and I would be surprised if there
was one in the PDF.  But who knows.

I put the files on Google Drive as public files:
PDF:
https://drive.google.com/file/d/1W-wfzOxSBhmsDuzQEv5UKBWdlFs3qVlC/view?usp=sharing
Java:
https://drive.google.com/file/d/1goXAJDIQmf_lJK2HhlQfvLeEjoDR8XlW/view?usp=sharing
Debug:
https://drive.google.com/file/d/1PrDuHm86Iv6rzVMxIh7DS3WDQGT7g353/view?usp=sharing

Also, BTW I am currently using pdfbox 2.0.16.

Thanks

On Wed, Jul 31, 2019 at 11:42 PM Tilman Hausherr <TH...@t-online.de>
wrote:

> Only the line list got through.
>
> Maybe the "invisible" lines are outside a clipping region? In that case
> it would be much more difficult, one would have to adjust the list
> according to it.
>
> Tilman
>
> On 31.07.2019 at 22:42, Joel Hirsh wrote:
> > I had asked about this more than a year ago, and the answer was that
> > my basic method seemed correct. However I was never able to resolve
> > the issue, and now it has come up again.   I am trying to detect, and
> > thereby ignore, invisible lines.
> >
> > However, when trying to read either stroking (or nonstroking) colors
> > of the lines, I am getting that all the colors are always 0.0.
> >
> > You had previously requested a PDF and some code in order to do
> > anything more.
> >
> > Attached is:
> > 1) a PDF that has a lot of invisible lines.
> > 2) the core pieces of my Java code to process lines.
> > 3) a listing of the lines found in this PDF.
> > 4) a screenshot of examining a PDGraphicsState object in the Eclipse
> > debugger on a call to lineTo(), showing that all the colors I can get
> > are 0.0.
> >
> > Thanks for your help

Re: Detecting invisible (white on white?) lines

Posted by Tilman Hausherr <TH...@t-online.de>.
Only the line list got through.

Maybe the "invisible" lines are outside a clipping region? In that case 
it would be much more difficult, one would have to adjust the list 
according to it.
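
To give an idea of what "adjusting the list" could look like in the simplest
case, here is a sketch that assumes the clipping region is a single
axis-aligned rectangle and that a line counts as visible if any part of it
falls inside. ClipFilter and insideClip are hypothetical names; a real
clipping path can be any shape and can be intersected several times, which
this does not handle.

```java
import java.awt.geom.Line2D;
import java.awt.geom.Rectangle2D;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical filter: keeps lines that touch a rectangular clip region. */
final class ClipFilter {
    private ClipFilter() {
    }

    static List<Line2D> insideClip(List<Line2D> lines, Rectangle2D clip) {
        List<Line2D> visible = new ArrayList<>();
        for (Line2D line : lines) {
            // intersectsLine is true when the segment crosses or lies inside the rectangle
            if (clip.intersectsLine(line)) {
                visible.add(line);
            }
        }
        return visible;
    }
}
```

Lines that are only partly inside are kept whole here; trimming them to the
clip boundary would be a further step.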

Tilman

On 31.07.2019 at 22:42, Joel Hirsh wrote:
> I had asked about this more than a year ago, and the answer was that 
> my basic method seemed correct. However I was never able to resolve 
> the issue, and now it has come up again.   I am trying to detect, and 
> thereby ignore, invisible lines.
>
> However, when trying to read either stroking (or nonstroking) colors 
> of the lines, I am getting that all the colors are always 0.0.
>
> You had previously requested a PDF and some code in order to do 
> anything more.
>
> Attached is:
> 1) a PDF that has a lot of invisible lines.
> 2) the core pieces of my Java code to process lines.
> 3) a listing of the lines found in this PDF.
> 4) a screenshot of examining a PDGraphicsState object in the Eclipse
> debugger on a call to lineTo(), showing that all the colors I can get
> are 0.0.
>
> Thanks for your help