Posted to dev@tika.apache.org by Ken Krugler <kk...@transpac.com> on 2010/01/04 22:27:48 UTC

PDFBox bug in 0.8-incubating

Just in case anybody else is trying to use Tika to parse a wide range  
of PDFs, I've run into several hangs due to this issue:

https://issues.apache.org/jira/browse/PDFBOX-541

It's been fixed in PDFBox trunk, from what I can see, but not in the  
0.8-incubating jar that Tika is currently using.

I don't see snapshot builds of PDFBox in the Apache Maven repo, so for  
now I'm going to build from trunk and override the Tika dependency.
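
For reference, a minimal sketch of such an override in a project's pom.xml,
assuming PDFBox trunk has been built and installed into the local Maven
repository (for example with "mvn install"). The org.apache.pdfbox:pdfbox
coordinates are assumed to match what Tika pulls in, and the version string
is only a placeholder for whatever the trunk build actually produces:

```xml
<dependencies>
  <!-- Declaring PDFBox directly makes this version win over the one
       Tika brings in transitively (Maven's "nearest definition" rule). -->
  <dependency>
    <groupId>org.apache.pdfbox</groupId>
    <artifactId>pdfbox</artifactId>
    <!-- Placeholder: use the version your local trunk build installs. -->
    <version>YOUR-LOCAL-TRUNK-VERSION</version>
  </dependency>
</dependencies>
```

Running "mvn dependency:tree" afterwards should show which PDFBox version
actually ends up on the classpath.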

-- Ken

--------------------------------------------
Ken Krugler
+1 530-210-6378
http://bixolabs.com
e l a s t i c   w e b   m i n i n g