Posted to dev@tika.apache.org by "Tim Allison (Jira)" <ji...@apache.org> on 2021/01/06 15:44:00 UTC
[jira] [Created] (TIKA-3264) Improve the per page OCR heuristics for AUTO mode
Tim Allison created TIKA-3264:
---------------------------------
Summary: Improve the per page OCR heuristics for AUTO mode
Key: TIKA-3264
URL: https://issues.apache.org/jira/browse/TIKA-3264
Project: Tika
Issue Type: Improvement
Affects Versions: 2.0.0
Reporter: Tim Allison
We currently use the per-page character count as the sole criterion for deciding whether to run OCR in AUTO mode on PDFs.
Let's use this issue to discuss better options.
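
As a starting point for discussion, here is a minimal sketch of what a character-count-only rule looks like. It is not Tika's code: the function names, the choice to count only non-whitespace characters, and the threshold of 10 are all assumptions made for illustration.

```python
from typing import List


def should_ocr_page(extracted_text: str, min_chars: int = 10) -> bool:
    """Hypothetical rule: OCR a page when too little text was extracted.

    `min_chars` is an assumed threshold, not a value taken from Tika.
    """
    # Assumption: count only non-whitespace characters, so that a page
    # made of spaces and newlines is still treated as empty.
    char_count = sum(1 for ch in extracted_text if not ch.isspace())
    return char_count < min_chars


def pages_needing_ocr(pages: List[str], min_chars: int = 10) -> List[int]:
    """Return the zero-based indices of pages the rule would send to OCR."""
    return [
        index
        for index, text in enumerate(pages)
        if should_ocr_page(text, min_chars)
    ]


if __name__ == "__main__":
    # Hypothetical per-page text, as a text extractor might return it.
    sample_pages = ["", "A full page of extracted text.", "12"]
    print(pages_needing_ocr(sample_pages))
```

A rule like this has only one signal to work with. For example, it cannot tell a scanned page that carries a short text header from a page that simply contains little text, which is the kind of case additional heuristics would need to cover.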
--
This message was sent by Atlassian Jira
(v8.3.4#803005)