Posted to issues@lucene.apache.org by "Navodit Bansod (Jira)" <ji...@apache.org> on 2020/09/01 11:10:00 UTC

[jira] [Created] (SOLR-14801) Multiple Language Detection is not reflecting properly with apache Tika/Solr Jar ()

Navodit Bansod created SOLR-14801:
-------------------------------------

             Summary: Multiple Language Detection is not reflecting properly with apache Tika/Solr Jar ()
                 Key: SOLR-14801
                 URL: https://issues.apache.org/jira/browse/SOLR-14801
             Project: Solr
          Issue Type: Bug
      Security Level: Public (Default Security Level. Issues are Public)
            Reporter: Navodit Bansod


Hi Team,

Please find below the issues we are seeing with multiple language detection in Apache Solr:
 # The primary and secondary languages are not detected into separate fields. Both fields, "lang" and "langs" (the primary and secondary language attributes), show the same value: the language that makes up the largest share of the data.
 # The distance (or length) setting in solrconfig.xml, <str name="langid.threshold">0.2</str>, is set in our cluster, but changing its value makes no observable difference.
 # The following versions are in use in our SolrCloud setup:
 # tika-core-1.24.1.jar
 # tika-parsers-1.24.1.jar
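
For context, the kind of update processor chain this report concerns could look roughly like the sketch below. It is illustrative only and is not copied from our cluster: the chain name and the input fields in langid.fl are assumptions, and it assumes the Tika-based language identifier is the one in use. Only the langid.threshold value and the "lang"/"langs" field names come from the description above.

```xml
<!-- Hypothetical chain name; the real name in our solrconfig.xml may differ. -->
<updateRequestProcessorChain name="langid">
  <processor class="org.apache.solr.update.processor.TikaLanguageIdentifierUpdateProcessorFactory">
    <!-- Assumed input fields to run detection on. -->
    <str name="langid.fl">title,content</str>
    <!-- Field expected to hold the primary detected language. -->
    <str name="langid.langField">lang</str>
    <!-- Field expected to hold the list of detected languages. -->
    <str name="langid.langsField">langs</str>
    <!-- Threshold value quoted in item 2 above. -->
    <str name="langid.threshold">0.2</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```

With a chain of this shape, we would expect "lang" and "langs" to differ for mixed-language documents and the result to change as langid.threshold changes, which is not what we observe.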

 



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
