Posted to dev@tika.apache.org by "Tim Allison (JIRA)" <ji...@apache.org> on 2018/10/04 12:41:04 UTC
[jira] [Created] (TIKA-2749) OCR on PDFs should "just work" out of the box
Tim Allison created TIKA-2749:
---------------------------------
Summary: OCR on PDFs should "just work" out of the box
Key: TIKA-2749
URL: https://issues.apache.org/jira/browse/TIKA-2749
Project: Tika
Issue Type: Task
Reporter: Tim Allison
There are now two different ways (with various parameters) to trigger OCR on inline images within PDFs. The user has to 1) know that these options exist and then 2) choose to turn one of them on.
I think we should make OCR on PDFs "just work", perhaps with a hybrid strategy that combines the two options. Users should still be able to configure OCR as they wish, of course.
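To make the "hybrid strategy" idea concrete, here is a rough sketch of one possible per-page decision rule. It is not Tika code and not a proposed implementation. It assumes the two existing options amount to (a) running OCR on each extracted inline image and (b) rendering the whole page to an image and running OCR on that. The class HybridOcrSketch, the Action enum, the choose method, and the MIN_TEXT_CHARS threshold are all hypothetical names and values invented for illustration.

```java
// Hypothetical sketch only: none of these names exist in Tika.
final class HybridOcrSketch {

    // Possible per-page outcomes of the hybrid rule (assumed, for illustration).
    enum Action { NO_OCR, OCR_INLINE_IMAGES, OCR_RENDERED_PAGE }

    // Assumed threshold: pages with fewer extracted characters than this
    // are treated as probably scanned. The value is arbitrary.
    static final int MIN_TEXT_CHARS = 10;

    // Pick an action from two cheap per-page signals:
    // how much text normal extraction found, and how many inline images exist.
    static Action choose(int extractedTextChars, int inlineImageCount) {
        if (extractedTextChars < MIN_TEXT_CHARS) {
            // Little or no electronic text: render the page and OCR the rendering.
            return Action.OCR_RENDERED_PAGE;
        }
        if (inlineImageCount > 0) {
            // Page has real text plus images: OCR only the inline images.
            return Action.OCR_INLINE_IMAGES;
        }
        // Plain text page with no images: nothing to OCR.
        return Action.NO_OCR;
    }

    public static void main(String[] args) {
        System.out.println(choose(0, 1));    // scanned page
        System.out.println(choose(2500, 3)); // text page with figures
        System.out.println(choose(2500, 0)); // text-only page
    }
}
```

A rule like this would only choose the default behaviour. Explicit user configuration would still override it.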
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)