Posted to user@tika.apache.org by Mirko Hering <mi...@asia-europe.uni-heidelberg.de> on 2016/04/27 15:56:56 UTC

Tika OCR: available languages and response format

Hello everyone,

We are using TikaOCR to access Tesseract OCR via Tika Server's web API,
which is working very well.
However, since we process documents in different languages, I was wondering
whether it is possible to get a list of the available languages from the
server. Furthermore, does anybody know how I can tell TikaOCR to return its
response as hOCR XML instead of plain text?

Thanks in advance,
Mirko