Posted to solr-user@lucene.apache.org by Yasufumi Mizoguchi <ya...@gmail.com> on 2018/07/24 02:47:45 UTC
How to use tika-OCR in data import handler?
Hi,
I am trying to use Tika OCR (Tesseract) in the data import handler
and found that it processes English documents quite well.
However, I am struggling to process other languages such as
Japanese and Chinese.
So, I would like to know how to switch Tesseract OCR's processing
language via the data import handler config or the tikaConfig param.
Any pointers would be appreciated.
Thanks,
Yasufumi