Posted to java-user@lucene.apache.org by Ching Zheng <zc...@gmail.com> on 2010/03/02 17:11:30 UTC
Help wanted with Indexing PDF Documents
Hi,
I have about 50 PDF documents, each around 10MB in size. I am using
PDFBox for parsing and am wondering how I can index bookmarks together
with their corresponding page information.

I use PDDocumentOutline to get each bookmark's title, but all I have is a
PDNamedDestination, which offers no page number info. Can someone shed some
light on this? Thanks a lot.
Re: Help wanted with Indexing PDF Documents
Posted by Ian Lea <ia...@gmail.com>.
Sounds like a question for the PDFBox mailing list. Once you've got
the relevant info out of the PDF you can index it however you like.
--
Ian.
On Tue, Mar 2, 2010 at 4:11 PM, Ching Zheng <zc...@gmail.com> wrote:
> Hi,
> I have about 50 PDF documents, each around 10MB in size. I am using
> PDFBox for parsing and am wondering how I can index bookmarks together
> with their corresponding page information.
>
> I use PDDocumentOutline to get each bookmark's title, but all I have is a
> PDNamedDestination, which offers no page number info. Can someone shed some
> light on this? Thanks a lot.
>
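
For anyone landing on this thread later, here is a rough, library-free sketch of the data flow being discussed. A named destination is only a name, so it has to be looked up in a name-to-page table before a bookmark can be paired with a page number. Every class and method below (BookmarkPages, Bookmark, BookmarkRecord, flatten) is a hypothetical stand-in, not a PDFBox or Lucene type, and the lookup table is assumed to have been filled from the PDF beforehand. The PDFBox calls that would populate it are left out on purpose, since they differ between versions and are better confirmed on the PDFBox list.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class BookmarkPages {

    /** Hypothetical stand-in for one outline entry (not a PDFBox class). */
    static final class Bookmark {
        final String title;
        final String destinationName; // name of the destination the bookmark points at
        final List<Bookmark> children = new ArrayList<>();

        Bookmark(String title, String destinationName) {
            this.title = title;
            this.destinationName = destinationName;
        }
    }

    /** One flattened (title, page) pair, ready to be turned into an index document. */
    static final class BookmarkRecord {
        final String title;
        final int page; // page number from the lookup table, or -1 if the name is unknown

        BookmarkRecord(String title, int page) {
            this.title = title;
            this.page = page;
        }
    }

    /** Walks the outline depth-first and resolves each destination name to a page. */
    static List<BookmarkRecord> flatten(List<Bookmark> roots, Map<String, Integer> pagesByName) {
        List<BookmarkRecord> out = new ArrayList<>();
        collect(roots, pagesByName, out);
        return out;
    }

    private static void collect(List<Bookmark> bookmarks,
                                Map<String, Integer> pagesByName,
                                List<BookmarkRecord> out) {
        for (Bookmark b : bookmarks) {
            Integer page = pagesByName.get(b.destinationName);
            out.add(new BookmarkRecord(b.title, page == null ? -1 : page));
            collect(b.children, pagesByName, out); // parent first, then its children
        }
    }

    public static void main(String[] args) {
        // Assumed to have been extracted from the PDF's named destinations.
        Map<String, Integer> pagesByName = new HashMap<>();
        pagesByName.put("chap1", 3);
        pagesByName.put("sec1.1", 5);

        Bookmark chapter = new Bookmark("Chapter 1", "chap1");
        chapter.children.add(new Bookmark("Section 1.1", "sec1.1"));
        chapter.children.add(new Bookmark("Dangling", "missing"));

        List<Bookmark> roots = new ArrayList<>();
        roots.add(chapter);

        for (BookmarkRecord r : flatten(roots, pagesByName)) {
            System.out.println(r.title + " -> page " + r.page);
        }
    }
}
```

Each BookmarkRecord could then become one index document, for example with the title as a searchable field and the page number as a stored field, which is the "index it however you like" step.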