Posted to dev@tika.apache.org by "Tim Allison (Jira)" <ji...@apache.org> on 2020/09/01 19:30:00 UTC

[jira] [Commented] (TIKA-3176) Tika 2.0.0 -- Modularize language detectors

    [ https://issues.apache.org/jira/browse/TIKA-3176?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17188774#comment-17188774 ] 

Tim Allison commented on TIKA-3176:
-----------------------------------

[~kkrugler] and fellow devs, if you have a chance, can you take a look at the modularized tika-langdetect module now in {{tika-main}} and let me know what you think?

The goal is to let users pick a single language detector without having to drag in the dependencies of all the others.

We've hardcoded optimaize in a couple of places that should be refactored, but I think that's a separate issue that doesn't block 2.0.0.

> Tika 2.0.0 -- Modularize language detectors
> -------------------------------------------
>
>                 Key: TIKA-3176
>                 URL: https://issues.apache.org/jira/browse/TIKA-3176
>             Project: Tika
>          Issue Type: Improvement
>            Reporter: Tim Allison
>            Priority: Blocker
>              Labels: 2.0.0
>
> For 2.0.0, it'd be nice to do for language detection what we did for parsers.  Create a parent module and then a submodule for each language detector so that users don't have to bring in all the dependencies of all lang-detectors if they only want one.


