Posted to users@pdfbox.apache.org by Thomas Sörensen <th...@excosoft.se> on 2022/12/01 09:32:04 UTC

Links in pdf cause exception

Hi

I have a PDF file with a lot of links. They generally look like this in Foxit:
[image001.png: screenshot not preserved]

In notepad the links look something like this:
/Subtype /Link
/A 140 0 R
/Type /Annot
/Rect [561.26 196.299 572.598 234.567]
>>
endobj
140 0 obj
<<
/D [5 0 R /Fit]
/Next 307 0 R
/Type /Action
/S /GoTo
>>
endobj

When attempting to split the pdf using the following code:
final PDDocument p1PD = PDDocument.load(p1.toFile());
final Splitter splitter = new Splitter();
final List<PDDocument> listPDFPages1 = splitter.split(p1PD);

I get the following exception:
java.io.IOException: Error: can't convert to Destination COSArray{[COSName{Fit}]}
    at org.apache.pdfbox.pdmodel.interactive.documentnavigation.destination.PDDestination.create(PDDestination.java:98)
    at org.apache.pdfbox.pdmodel.interactive.annotation.PDAnnotationLink.getDestination(PDAnnotationLink.java:148)


It is thrown from PDDestination.create(COSBase base), line 98.
This happens because base is a COSArray with only one item: [COSName{Fit}]
[image003.png: screenshot not preserved]

If I change the code in Splitter to catch the exception and ignore it (image002.png: screenshot not preserved), it processes all the annotations and annotation links and simply skips the error.
The split pages now contain links that look like this:
91 0 obj
<<
/D [null /Fit]
/Next 178 0 R
/Type /Action
/S /GoTo
>>
endobj


Now instead of /D [5 0 R /Fit] it is /D [null /Fit].
If I now process the annotations of the split page, PDAnnotationLink.getDestination no longer throws an exception, because the argument base is a COSArray that contains two items: [COSNull{},COSName{Fit}]

I don't know how to interpret this. My solution for now is to use my own customized copy of the Splitter class that simply swallows the exception.
I suspect that, for some reason, PDFBox ignores the 5 0 R part of /D [5 0 R /Fit] when building the annotations. I don't know enough about the PDF spec to tell what that part does.

I am using PDFBox 2.0.27


BR
Thomas

Re: Links in pdf cause exception

Posted by Tilman Hausherr <TH...@t-online.de>.
On 16.12.2022 15:52, Tilman Hausherr wrote:
> This is weird, /D [5 0 R /Fit] means the array has two elements?!

What I mean is that this seems correct (if object 5 is a page)


I suspect you have [ /Fit ] somewhere in the PDF. (The spaces are optional)
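
One way to check for such an array is to scan the raw file text, the same text you see in notepad. The following is only a minimal sketch using plain Java, not PDFBox: the class name DestArrayScan and the method findSuspicious are made up for illustration. It flags every /D array whose first element is a name (such as /Fit) or null instead of an indirect page reference like 5 0 R. It assumes the objects are stored uncompressed; anything inside a compressed object stream will not be found this way.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper (not part of PDFBox): finds /D arrays that lack a page reference. */
final class DestArrayScan {

    // Matches "/D [ ... ]" and captures the array body.
    // Whitespace is optional, so "/D[/Fit]" matches as well as "/D [ /Fit ]".
    private static final Pattern DEST = Pattern.compile("/D\\s*\\[([^\\]]*)\\]");

    /** Returns one entry per /D array whose first element is a name or null. */
    static List<String> findSuspicious(String pdfText) {
        List<String> hits = new ArrayList<>();
        Matcher m = DEST.matcher(pdfText);
        while (m.find()) {
            String body = m.group(1).trim();
            // "[5 0 R /Fit]" starts with a digit and is skipped; so are numeric
            // arrays such as a border dash pattern "/D [3 2]".
            if (body.startsWith("/") || body.startsWith("null")) {
                hits.add("offset " + m.start() + ": " + m.group().replaceAll("\\s+", " "));
            }
        }
        return hits;
    }

    public static void main(String[] args) throws IOException {
        byte[] raw = Files.readAllBytes(Paths.get(args[0]));
        // ISO-8859-1 maps each byte to exactly one char, so binary stream data
        // cannot break decoding and the reported offsets are byte offsets.
        for (String hit : findSuspicious(new String(raw, StandardCharsets.ISO_8859_1))) {
            System.out.println(hit);
        }
    }
}
```

Run it with the path of the original PDF as the only argument. Each reported offset can then be looked up in a text editor to see which object holds the incomplete destination.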

Tilman

Re: Links in pdf cause exception

Posted by Tilman Hausherr <TH...@t-online.de>.
Hi,

Sorry for the late answer. Please share the PDF, your screenshots didn't 
get through. This is weird, /D [5 0 R /Fit] means the array has two 
elements?! Maybe the "bad" object is elsewhere?

Tilman
