Posted to dev@tika.apache.org by "Tim Allison (Jira)" <ji...@apache.org> on 2021/03/30 19:57:00 UTC
[jira] [Created] (TIKA-3343) Remove Tika custom lang detection for 2.x
Tim Allison created TIKA-3343:
---------------------------------
Summary: Remove Tika custom lang detection for 2.x
Key: TIKA-3343
URL: https://issues.apache.org/jira/browse/TIKA-3343
Project: Tika
Issue Type: Task
Reporter: Tim Allison
In the back of my mind, this was an agreed-upon change for 2.x. I can't find documentation of that, though, so I'm opening this issue to discuss.
My memory is that we agreed to outsource language identification to other tools and remove our own language identifier for 2.x. If my memory is wrong, or if there's a good reason to keep our language detection algorithm and data, let's discuss.