Posted to dev@lucene.apache.org by "Shruthi Khatawkar (JIRA)" <ji...@apache.org> on 2013/06/21 08:35:20 UTC

[jira] [Created] (SOLR-4945) Japanese Autocomplete and Highlighter broken

Shruthi Khatawkar created SOLR-4945:
---------------------------------------

             Summary: Japanese Autocomplete and Highlighter broken
                 Key: SOLR-4945
                 URL: https://issues.apache.org/jira/browse/SOLR-4945
             Project: Solr
          Issue Type: Bug
          Components: highlighter
            Reporter: Shruthi Khatawkar


Autocomplete is implemented using the Highlighter. This works for most languages but breaks for Japanese.

The field has multiValued, termVectors, termPositions and termOffsets set to true.

Here is an example:

Query: product classic.

Result:

Original text (without highlighting):

この商品の互換性の機種にproduct 1 やclassic Touch2 が記載が有りません。 USB接続ケーブルをproduct 1 やclassic Touch2に付属の物を使えば利用出来ると思いますが 間違っていますか?

With the Highlighter (using <em> </em> tags):

この商品の互換性の機種<em>にproduct</em> 1 <em>やclassic</em> Touch2 が記載が有りません。 USB接続ケーブルをproduct 1 やclassic Touch2に付属の物を使えば利用出来ると思いますが 間違っていますか?

Although the query terms "product" and "classic" each occur twice in the text, only the first occurrence of each is highlighted, as shown above.

Solr returns the offset of only the first occurrence; the second occurrence is ignored.
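The mismatch can be checked mechanically. The following is a minimal Python sketch, not part of Solr, that compares how often a query term occurs in the original text with how often it appears inside a highlighted fragment. The helper names (count_occurrences, count_highlighted) are hypothetical, and the two strings are copied from the example above.

```python
import re

def count_occurrences(text, term):
    """Count case-insensitive occurrences of term in the plain text."""
    return len(re.findall(re.escape(term), text, flags=re.IGNORECASE))

def count_highlighted(snippet, term, pre="<em>", post="</em>"):
    """Count highlighted fragments (between pre and post) that contain term."""
    pattern = re.escape(pre) + r"(.*?)" + re.escape(post)
    fragments = re.findall(pattern, snippet, flags=re.DOTALL)
    return sum(1 for fragment in fragments if term.lower() in fragment.lower())

# Strings copied from the example in this report.
original = ("この商品の互換性の機種にproduct 1 やclassic Touch2 が記載が有りません。 "
            "USB接続ケーブルをproduct 1 やclassic Touch2に付属の物を使えば"
            "利用出来ると思いますが 間違っていますか?")
highlighted = ("この商品の互換性の機種<em>にproduct</em> 1 <em>やclassic</em> Touch2 "
               "が記載が有りません。 USB接続ケーブルをproduct 1 やclassic Touch2に付属の物を"
               "使えば利用出来ると思いますが 間違っていますか?")

for term in ("product", "classic"):
    expected = count_occurrences(original, term)
    actual = count_highlighted(highlighted, term)
    print(term, "occurrences:", expected, "highlighted:", actual)
```

If highlighting were complete, both counts would match for each term; with the output shown in this report they do not.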

In addition, the highlighter repeats the first letter of a token when the token contains a digit.
For example, for the query "product" against the text "product1", the highlighter returns p<em>product</em>1.
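
One way to detect this kind of corruption is to strip the highlight tags from the returned snippet and compare the result with the stored text; a correct highlighter should only insert tags, never characters. The sketch below assumes the default <em> </em> tags, and the helper name strip_highlight is hypothetical.

```python
def strip_highlight(snippet, pre="<em>", post="</em>"):
    """Remove the highlight tags, leaving only the underlying text."""
    return snippet.replace(pre, "").replace(post, "")

stored = "product1"                # text from the example in this report
returned = "p<em>product</em>1"    # snippet reported above
well_formed = "<em>product</em>1"  # what a tag-only insertion would look like

# Stripping the tags should give back the stored text unchanged.
print(strip_highlight(returned) == stored)     # extra leading "p" breaks equality
print(strip_highlight(well_formed) == stored)
```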


