Posted to dev@tika.apache.org by Chris Mattmann <ma...@apache.org> on 2017/05/03 13:58:09 UTC
Re: Apache Tika
Hi Gorka,
See: http://wiki.apache.org/tika/TikaOCR/
Is that what you’re looking for? If so, you can simply enable OCR for the Tika REST server and then
point your Tika Python at that. Does that help?
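If what you need is the image files themselves rather than their OCR'd text, here is a rough sketch of talking to the REST server from Python using only the standard library. It assumes a Tika server is already running on the default port at http://localhost:9998, that its /unpack endpoint returns a zip of the embedded files, and that the X-Tika-PDFextractInlineImages header turns on inline image extraction for PDFs on your server version. The function names and the list of image suffixes are made up for illustration, so treat this as a starting point, not a tested recipe.

```python
import io
import urllib.request
import zipfile

# Assumption: a Tika REST server is already running locally on its default port.
TIKA_SERVER = "http://localhost:9998"

# Illustrative list of file suffixes treated as images; adjust as needed.
IMAGE_SUFFIXES = (".png", ".jpg", ".jpeg", ".tif", ".tiff", ".gif", ".bmp", ".jp2")


def build_unpack_request(pdf_bytes, server=TIKA_SERVER):
    """Build a PUT request that sends the PDF to the server's /unpack endpoint."""
    return urllib.request.Request(
        server.rstrip("/") + "/unpack",
        data=pdf_bytes,
        method="PUT",
        headers={
            "Content-Type": "application/pdf",
            # Assumption: this header enables inline image extraction for PDFs.
            "X-Tika-PDFextractInlineImages": "true",
        },
    )


def images_from_zip(zip_bytes):
    """Return {entry name: raw bytes} for the image entries of a zip archive."""
    if not zip_bytes:
        # An empty body means the server found no embedded files.
        return {}
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return {
            name: archive.read(name)
            for name in archive.namelist()
            if name.lower().endswith(IMAGE_SUFFIXES)
        }


def extract_embedded_images(pdf_path, server=TIKA_SERVER):
    """Send a PDF to the Tika server and return its embedded images."""
    with open(pdf_path, "rb") as handle:
        request = build_unpack_request(handle.read(), server)
    with urllib.request.urlopen(request) as response:
        return images_from_zip(response.read())
```

Calling extract_embedded_images("report.pdf") (a hypothetical file name) should give you a dict you can loop over to write each image to disk.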
Cheers,
Chris
From: gorka gallo <go...@gmail.com>
Date: Wednesday, May 3, 2017 at 2:19 AM
To: "Mattmann, Chris A (3010)" <ch...@jpl.nasa.gov>
Subject: Apache Tika
Hi Chris,
I am Gorka Gallo, a research technician from Bilbao, Spain.
Is there any method to extract embedded images in PDF files with Apache Tika using Python?
Thanks,
Best regards,
Gorka.