Posted to users@pdfbox.apache.org by Adam Nichols <Ad...@swmc.com> on 2009/05/21 23:07:17 UTC

Getting page number for bookmarks

I'm trying to parse PDF files, look at all the bookmarks, determine which 
page each one points to, and then split the document based on the bookmarks.  
For now, I'm not worried about supporting name-based bookmarks.

The PrintBookmarks example gave me sample code, so I can now loop through 
the bookmarks with no problem.  My problem is that the PDOutlineItem 
objects all have null for the Destination.  I looked at the PDF in a hex 
editor and read the PDF spec, and it looks like this is because the 
bookmark doesn't have a "Dest" entry; instead it has a "GoTo" action.

Here's my bookmark:
102 0 obj
<</Parent 101 0 R/A 100 0 R/Next 104 0 R/Title(A002\r)>>
endobj

And here's the action that I believe it references:
100 0 obj
<</D[2 0 R/FitH 795]/S/GoTo>>
endobj

If I dig into the COSDictionary, I can pull out the {2, 0} reference as a 
COSObject.  My problem now is determining which page number that object 
corresponds to.  If anyone has already done this, please point me in the 
right direction.  I didn't see much in the mailing list archives about 
parsing bookmarks for page numbers.  If I missed it, let me know.
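
To make the question concrete: as far as I can tell from the spec, a page object doesn't store its own number. The number is just the page's position when the page tree is walked depth-first through each /Kids array. Below is a rough sketch of that lookup in plain Java. It does not use PDFBox at all: PageTreeSketch, Node, and pageNumberOf are names I made up for illustration, and the sketch assumes the page tree has already been read into memory and that every reference has generation 0.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical model of a PDF page tree; none of these names are PDFBox classes. */
final class PageTreeSketch {

    /** Either an intermediate /Pages node (with kids) or a leaf /Page. */
    static final class Node {
        final int objectNumber;      // e.g. 2 for the reference "2 0 R"
        final boolean leafPage;      // true for /Type /Page, false for /Type /Pages
        final List<Node> kids = new ArrayList<>();

        private Node(int objectNumber, boolean leafPage) {
            this.objectNumber = objectNumber;
            this.leafPage = leafPage;
        }

        static Node page(int objectNumber) {
            return new Node(objectNumber, true);
        }

        static Node pages(int objectNumber, Node... children) {
            Node node = new Node(objectNumber, false);
            for (Node child : children) {
                node.kids.add(child);
            }
            return node;
        }
    }

    /** Collects the object numbers of leaf pages in document order. */
    static List<Integer> pagesInOrder(Node root) {
        List<Integer> ordered = new ArrayList<>();
        collect(root, ordered);
        return ordered;
    }

    private static void collect(Node node, List<Integer> out) {
        if (node.leafPage) {
            out.add(node.objectNumber);
            return;
        }
        // Depth-first, left to right, mirroring the order of the /Kids array.
        for (Node kid : node.kids) {
            collect(kid, out);
        }
    }

    /** Returns the 1-based page number for the given object number, or -1 if it is not a page. */
    static int pageNumberOf(Node root, int targetObjectNumber) {
        int index = pagesInOrder(root).indexOf(targetObjectNumber);
        return index < 0 ? -1 : index + 1;
    }
}
```

With an invented tree such as Node.pages(1, Node.page(2), Node.pages(5, Node.page(6), Node.page(9))), pageNumberOf(root, 2) gives 1 and pageNumberOf(root, 9) gives 3. The object numbers there are made up and are not taken from my file. What I'm missing is the PDFBox equivalent of that ordered page list.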

Thanks,
Adam