Posted to dev@tika.apache.org by "Tim Allison (Jira)" <ji...@apache.org> on 2021/06/28 15:53:00 UTC

[jira] [Created] (TIKA-3456) LanguageDetector should try to respect hasEnoughText more intelligently

Tim Allison created TIKA-3456:
---------------------------------

             Summary: LanguageDetector should try to respect hasEnoughText more intelligently
                 Key: TIKA-3456
                 URL: https://issues.apache.org/jira/browse/TIKA-3456
             Project: Tika
          Issue Type: Task
            Reporter: Tim Allison


If a user calls LanguageDetector's detect(String txt) or addText(String txt), the full string is passed on to the subclasses without any check of hasEnoughText().  For large strings, LanguageDetector should instead break the string into smaller parts and check hasEnoughText() as it goes.
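
A minimal sketch of the chunking idea, for illustration only. The classes below are hypothetical stand-ins and are not Tika's LanguageDetector; the names ChunkingDetectorSketch, CountingDetectorSketch and addChunk, and the chunk size of 1000 chars, are assumptions that do not come from this issue. The sketch also assumes that the text should stop being forwarded once hasEnoughText() returns true.

```java
/** Hypothetical stand-in for a detector base class; NOT Tika's actual LanguageDetector. */
abstract class ChunkingDetectorSketch {
    /** Assumed chunk size in chars; the issue does not specify one. */
    static final int CHUNK_SIZE = 1000;

    /** Subclasses consume one chunk of text (hypothetical method name). */
    abstract void addChunk(char[] buf, int off, int len);

    /** Subclasses report whether they have seen enough text to decide. */
    abstract boolean hasEnoughText();

    /** Feeds txt in chunks and stops as soon as hasEnoughText() is true. */
    void addText(String txt) {
        char[] chars = txt.toCharArray();
        int off = 0;
        while (off < chars.length && !hasEnoughText()) {
            int len = Math.min(CHUNK_SIZE, chars.length - off);
            // Avoid splitting a surrogate pair across two chunks.
            if (off + len < chars.length
                    && Character.isHighSurrogate(chars[off + len - 1])) {
                len++;
            }
            addChunk(chars, off, len);
            off += len;
        }
    }
}

/** Toy subclass that only counts chars, to make the early stop visible. */
class CountingDetectorSketch extends ChunkingDetectorSketch {
    private final int needed;
    int seen = 0;

    CountingDetectorSketch(int needed) {
        this.needed = needed;
    }

    @Override
    void addChunk(char[] buf, int off, int len) {
        seen += len;
    }

    @Override
    boolean hasEnoughText() {
        return seen >= needed;
    }
}
```

With a toy threshold of 2500 chars and a 10000-char input, this sketch forwards three chunks (3000 chars) and skips the rest, whereas passing the full string through would hand all 10000 chars to the subclass.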



--
This message was sent by Atlassian Jira
(v8.3.4#803005)