Posted to dev@tika.apache.org by "Matthew Caruana Galizia (JIRA)" <ji...@apache.org> on 2017/01/11 11:48:58 UTC
[jira] [Created] (TIKA-2235) Use Tesseract's recommended DPI for PDF images
Matthew Caruana Galizia created TIKA-2235:
---------------------------------------------
Summary: Use Tesseract's recommended DPI for PDF images
Key: TIKA-2235
URL: https://issues.apache.org/jira/browse/TIKA-2235
Project: Tika
Issue Type: Improvement
Components: parser
Affects Versions: 1.14
Reporter: Matthew Caruana Galizia
Priority: Minor
From the [Tesseract wiki|https://github.com/tesseract-ocr/tesseract/wiki/ImproveQuality]:
{quote}
Tesseract works best on images which have a DPI of at least 300 dpi....
{quote}
PDFParserConfig currently initialises ocrDPI to 200, which is below that recommended minimum. The default should be at least 300.
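To give a rough sense of what the change means for the images handed to Tesseract, here is a minimal sketch of the page-size arithmetic. It assumes the PDF default of 72 user-space points per inch and a US Letter page (612 x 792 points); the class and method names are hypothetical and not part of Tika.

```java
// Illustrative sketch only: OcrDpiSketch and pixelsAt are hypothetical names, not Tika API.
final class OcrDpiSketch {

    // Assumption: default PDF user space, where one point is 1/72 inch.
    static final double POINTS_PER_INCH = 72.0;

    // Convert a page dimension in points to pixels at the given rendering DPI.
    static int pixelsAt(double points, int dpi) {
        return (int) Math.round(points / POINTS_PER_INCH * dpi);
    }

    public static void main(String[] args) {
        // Assumption: US Letter page, 8.5 x 11 inches.
        double widthPt = 612.0;
        double heightPt = 792.0;

        // Compare the current default (200) with Tesseract's suggested minimum (300).
        for (int dpi : new int[] {200, 300}) {
            System.out.printf("%d dpi -> %d x %d px%n",
                    dpi, pixelsAt(widthPt, dpi), pixelsAt(heightPt, dpi));
        }
    }
}
```

Under those assumptions a Letter page is rendered at 1700 x 2200 pixels at 200 DPI and 2550 x 3300 pixels at 300 DPI, so each rendered page carries 2.25 times as many pixels. Until the default changes, the value can presumably be overridden per parse through PDFParserConfig; I believe the setter is setOcrDPI(int), but please check it against the Tika version in use.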
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)