Posted to commits@lucene.apache.org by Apache Wiki <wi...@apache.org> on 2013/12/02 16:00:53 UTC

[Lucene-java Wiki] Update of "PDF" by SteveRowe

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-java Wiki" for change notification.

The "PDF" page has been changed by SteveRowe:
https://wiki.apache.org/lucene-java/PDF?action=diff&rev1=2&rev2=3

Comment:
fix pdfbox link

  == Extracting text from a PDF document ==
  
  To index the content of a PDF, a good place to start is PDFBox, a Java library:
- http://www.pdfbox.org/userguide/text_extraction.html
+ http://pdfbox.apache.org/cookbook/textextraction.html