Posted to user@tika.apache.org by Julien Massiera <ju...@francelabs.com> on 2019/06/17 13:54:27 UTC
OCR'ing of PDFs
Hi,
I would like to configure a Tika server so that it decides on its own
whether or not to OCR a PDF. I noticed a new AUTO mode in the CHANGES
of the latest Tika release (1.21), but I don't know where to select
this mode when running Tika server. In addition, have you by any chance
had time to write documentation about this mode, the heuristics it
uses, and how to configure them?
Thanks for your help,
Julien