Posted to users@pdfbox.apache.org by alexander sulz <a....@digiconcept.net> on 2011/07/20 15:50:20 UTC

unparseable PDF with 1.6.0

Hello!

While indexing PDFs with Solr I stumbled upon one file which threw an
"Unexpected RuntimeException from
org.apache.tika.parser.pdf.PDFParser@b9b618".
I've used Tika version 0.9 stable (uses PDFBox 1.4). [1]
I also tried the latest stable version of PDFBox, 1.6.0 [2], which
throws many errors.
Should I upload that PDF somewhere? If yes, where?
Btw I can open and read the PDF in Adobe Reader just fine.

regards
  alex

[1] stacktrace.txt  -> tika 0.9 (pdfbox 1.5.0)
[2] stacktrace2.txt -> pdfbox 1.6.0

Re: unparseable PDF with 1.6.0

Posted by Andreas Lehmkuehler <an...@lehmi.de>.
Hi,

On 20.07.2011 15:50, alexander sulz wrote:
> Hello!
>
> While indexing PDFs with Solr I stumbled upon one file which threw an
> "Unexpected RuntimeException from org.apache.tika.parser.pdf.PDFParser@b9b618".
> I've used Tika version 0.9 stable (uses PDFBox 1.4). [1]
> I also tried the latest stable version of PDFBox, 1.6.0 [2], which throws many
> errors.
> Should I upload that PDF somewhere? If yes, where?
Please create an issue in JIRA [1] and attach the PDF in question.

> Btw I can open and read the PDF in Adobe Reader just fine.
>
> regards
> alex
>
> [1] stacktrace.txt -> tika 0.9 (pdfbox 1.5.0)
> [2] stacktrace2.txt -> pdfbox 1.6.0

TIA
Andreas Lehmkühler

[1] https://issues.apache.org/jira/browse/PDFBOX