Posted to dev@tika.apache.org by "Tim Allison (Jira)" <ji...@apache.org> on 2021/06/28 19:02:00 UTC

[jira] [Resolved] (TIKA-3456) LanguageDetector should try to respect hasEnoughText more intelligently

     [ https://issues.apache.org/jira/browse/TIKA-3456?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Allison resolved TIKA-3456.
-------------------------------
    Resolution: Fixed

> LanguageDetector should try to respect hasEnoughText more intelligently
> -----------------------------------------------------------------------
>
>                 Key: TIKA-3456
>                 URL: https://issues.apache.org/jira/browse/TIKA-3456
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Minor
>
> If a user calls LanguageDetector's detect(String txt) or addText(String txt), the full string is passed on to the subclasses with no check of hasEnoughText(). For large strings, LanguageDetector should instead break the string into smaller parts and check hasEnoughText() after each part.
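
The chunking idea described in the issue could look roughly like the sketch below. It is only an illustration, not the change that was committed for TIKA-3456: the TextSink interface, the CountingSink class, the addTextInChunks helper and the chunk size are all hypothetical names invented here, and TextSink merely stands in for the two detector methods the issue mentions. The sketch also splits on raw char offsets, so it may cut a surrogate pair in half.

```java
// Hypothetical sketch of "feed in chunks, stop when there is enough text".
// None of these types are Tika classes; they only mimic the idea in the issue.
class ChunkedFeedSketch {

    /** Stand-in for the detector methods mentioned in the issue (assumed shape). */
    interface TextSink {
        void addText(char[] cbuf, int off, int len);
        boolean hasEnoughText();
    }

    /** Toy sink that reports "enough" once a fixed number of chars has arrived. */
    static final class CountingSink implements TextSink {
        private final int threshold;
        int charsSeen = 0;
        int calls = 0;

        CountingSink(int threshold) {
            this.threshold = threshold;
        }

        @Override
        public void addText(char[] cbuf, int off, int len) {
            charsSeen += len;
            calls++;
        }

        @Override
        public boolean hasEnoughText() {
            return charsSeen >= threshold;
        }
    }

    /**
     * Feeds text to the sink in pieces of at most chunkSize chars and stops
     * as soon as the sink reports that it has enough text.
     * Returns the number of chars actually passed on.
     */
    static int addTextInChunks(TextSink sink, CharSequence text, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        char[] buf = new char[Math.min(chunkSize, text.length())];
        int start = 0;
        while (start < text.length() && !sink.hasEnoughText()) {
            int end = Math.min(start + chunkSize, text.length());
            // Copy only the current chunk so the sink never sees the whole string at once.
            for (int i = start; i < end; i++) {
                buf[i - start] = text.charAt(i);
            }
            sink.addText(buf, 0, end - start);
            start = end;
        }
        return start;
    }

    public static void main(String[] args) {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 100; i++) {
            sb.append('x');
        }
        CountingSink sink = new CountingSink(10);
        int fed = addTextInChunks(sink, sb, 4);
        System.out.println("fed " + fed + " chars in " + sink.calls + " calls");
    }
}
```

With a 100-character input, a threshold of 10 and a chunk size of 4, the toy sink receives three chunks (12 chars) and the remaining 88 chars are never passed on, which is the saving the issue is after for large strings.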


