Posted to dev@lucene.apache.org by "Munkyu Im (JIRA)" <ji...@apache.org> on 2019/04/30 04:10:00 UTC

[jira] [Updated] (LUCENE-8784) Nori(Korean) tokenizer removes the decimal point.

     [ https://issues.apache.org/jira/browse/LUCENE-8784?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Munkyu Im updated LUCENE-8784:
------------------------------
    Description: 
This is the same issue that I raised in [https://github.com/elastic/elasticsearch/issues/41401#event-2293189367].

Unlike the standard analyzer, the nori analyzer removes the decimal point.

The nori tokenizer removes the "." character by default.
 As a result, it is difficult to index keywords that include a decimal point.

It would be nice to have an option that controls whether the decimal point is kept.

Like the Japanese tokenizer, Nori needs an option to preserve the decimal point.
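
The effect described above can be illustrated with a toy sketch in plain Java. It is not Lucene code and performs no Korean morphological analysis; the class name ToyPunctuationTokenizer and the flag name discardPunctuation are hypothetical, chosen only to show how a punctuation-preserving option changes the tokens produced for a number with a decimal point.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: NOT Lucene's implementation and not the nori tokenizer.
// It splits on non-alphanumeric characters and optionally keeps punctuation.
final class ToyPunctuationTokenizer {

    // discardPunctuation is a hypothetical flag used for illustration.
    // true  -> punctuation such as "." is dropped (the behavior reported here)
    // false -> each punctuation character is emitted as its own token
    static List<String> tokenize(String text, boolean discardPunctuation) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            if (Character.isLetterOrDigit(cp)) {
                current.appendCodePoint(cp);
                continue;
            }
            // Any non-alphanumeric character ends the token being built.
            flush(current, tokens);
            // Whitespace is never emitted; punctuation only when preserved.
            if (!discardPunctuation && !Character.isWhitespace(cp)) {
                tokens.add(new String(Character.toChars(cp)));
            }
        }
        flush(current, tokens);
        return tokens;
    }

    // Moves the pending characters, if any, into the token list.
    private static void flush(StringBuilder current, List<String> tokens) {
        if (current.length() > 0) {
            tokens.add(current.toString());
            current.setLength(0);
        }
    }
}
```

Under this sketch, tokenize("version 10.1", true) yields [version, 10, 1], so the decimal point is lost, while tokenize("version 10.1", false) yields [version, 10, ., 1], which keeps the point available for a later filter to recombine the number.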

 

  was:
This is the same issue that I raised in [https://github.com/elastic/elasticsearch/issues/41401#event-2293189367].

Unlike the standard analyzer, the nori analyzer removes the decimal point.

The nori tokenizer removes the "." character by default.
As a result, it is difficult to index keywords that include a decimal point.

It would be nice to have an option that controls whether the decimal point is kept, as the Japanese tokenizer has.

 


>  Nori(Korean) tokenizer removes the decimal point. 
> ---------------------------------------------------
>
>                 Key: LUCENE-8784
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8784
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Munkyu Im
>            Priority: Major
>
> This is the same issue that I raised in [https://github.com/elastic/elasticsearch/issues/41401#event-2293189367].
> Unlike the standard analyzer, the nori analyzer removes the decimal point.
> The nori tokenizer removes the "." character by default.
>  As a result, it is difficult to index keywords that include a decimal point.
> It would be nice to have an option that controls whether the decimal point is kept.
> Like the Japanese tokenizer, Nori needs an option to preserve the decimal point.
>  


