Posted to user@tika.apache.org by Julien Massiera <ju...@francelabs.com> on 2019/06/17 13:54:27 UTC

OCR'ing of PDFs

Hi,

I would like to configure a Tika server so that it decides on its own
whether or not to OCR a PDF. I noticed a new AUTO mode in the CHANGES
file of the latest Tika release (1.21), but I don't know where to select
this mode when running Tika as a server. Also, have you by any chance had
time to write documentation about this mode, the heuristics it uses, and
how to configure them?

Thanks for your help,
Julien