Posted to user@tika.apache.org by Shalom Ben-Zvi <sh...@gmail.com> on 2012/06/26 00:12:09 UTC

Tika doesn't parse any text from a specific

Hi all,
Does anyone have an idea why Tika doesn't parse any text from the attached
document?
I'm using Tika the same way to parse other English/German documents with
no problems; it fails only on this one and others like it.

Thank you

Re: Tika doesn't parse any text from a specific

Posted by Jukka Zitting <ju...@gmail.com>.

Hi,

On Tue, Jun 26, 2012 at 12:12 AM, Shalom Ben-Zvi <sh...@gmail.com> wrote:
> Does anyone have an idea why Tika doesn't parse any text from the attached
> document?

Looks like the PDF document just contains scanned images of the pages,
so there's no text that Tika could access without advanced OCR
tooling.
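
One way to spot such documents in a batch is to check whether the text
that comes back from extraction is effectively empty. A minimal sketch,
assuming the text has already been obtained from Tika (for example as
the String returned by the Tika facade's parseToString); the class
ExtractedTextCheck and the method hasNoUsableText are hypothetical
names for illustration, not part of Tika:

```java
// Hypothetical helper, not part of Tika: decides whether text returned by an
// extractor is effectively empty, which is what an image-only PDF tends to give.
final class ExtractedTextCheck {
    private ExtractedTextCheck() {}

    /** Returns true when the extracted text is null or has no letters or digits. */
    static boolean hasNoUsableText(String extracted) {
        if (extracted == null) {
            return true;
        }
        // Whitespace, form feeds and stray punctuation do not count as content.
        return extracted.codePoints().noneMatch(Character::isLetterOrDigit);
    }
}
```

If hasNoUsableText returns true for a PDF that visibly has pages of
text, the pages are most likely scanned images and would need OCR
before any text can be extracted.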

BR,

Jukka Zitting