Posted to issues@lucene.apache.org by "Navodit Bansod (Jira)" <ji...@apache.org> on 2020/09/01 11:10:00 UTC
[jira] [Created] (SOLR-14801) Multiple Language Detection is not
reflecting properly with apache Tika/Solr Jar ()
Navodit Bansod created SOLR-14801:
-------------------------------------
Summary: Multiple Language Detection is not reflecting properly with apache Tika/Solr Jar ()
Key: SOLR-14801
URL: https://issues.apache.org/jira/browse/SOLR-14801
Project: Solr
Issue Type: Bug
Security Level: Public (Default Security Level. Issues are Public)
Reporter: Navodit Bansod
Hi Team,
Please find below the issues we are seeing with multiple-language detection in Apache Solr:
# The primary and secondary languages are not detected into separate fields. The result is generalized to whichever language makes up the major chunk of the data, so the same value appears in both fields, "lang" and "langs" (the primary and secondary language attributes).
# The distance (or length) setting in solrconfig.xml, <str name="langid.threshold">0.2</str>, is set correctly in our cluster, but changing its value makes no observable difference.
# The following versions are used in our SolrCloud setup:
# tika-core-1.24.1.jar
# tika-parsers-1.24.1.jar
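
For context, a minimal sketch of the kind of language-identification chain the points above refer to. This is not the cluster's actual configuration, which is not included in this report: the chain name and the langid.fl field list are assumptions made for illustration, while the "lang" and "langs" field names and the 0.2 threshold are taken from the description above.

```xml
<!-- Sketch only: the chain name and the langid.fl field list are hypothetical. -->
<updateRequestProcessorChain name="langid">
  <processor class="org.apache.solr.update.processor.TikaLanguageIdentifierUpdateProcessorFactory">
    <lst name="defaults">
      <!-- Input fields whose text is analysed (assumed field names) -->
      <str name="langid.fl">title,content</str>
      <!-- Field that receives the single detected language code -->
      <str name="langid.langField">lang</str>
      <!-- Field that receives the list of detected language codes -->
      <str name="langid.langsField">langs</str>
      <!-- Threshold value quoted in this report -->
      <str name="langid.threshold">0.2</str>
    </lst>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
```

With a chain along these lines, the expectation described above is that "langs" could hold more than one language code for mixed-language documents while "lang" holds only one.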