Posted to issues@lucene.apache.org by "pavithra kariyawasam (Jira)" <ji...@apache.org> on 2019/11/13 13:12:00 UTC

[jira] [Updated] (LUCENE-9043) Currently Lucene doesn't have an analyzer for Sinhala. We have built analyzer which consist of language dependent tokenizer, stemming algorithm and list of stop words.

     [ https://issues.apache.org/jira/browse/LUCENE-9043?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

pavithra kariyawasam updated LUCENE-9043:
-----------------------------------------
    Status: Open  (was: Patch Available)

> Currently Lucene doesn't have an analyzer for Sinhala. We have built analyzer which consist of language dependent tokenizer, stemming algorithm and list of stop words.
> -----------------------------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-9043
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9043
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/analysis
>    Affects Versions: 8.3
>            Reporter: pavithra kariyawasam
>            Priority: Major
>             Fix For: 5.5.6
>
>         Attachments: SinhalaAnalyzer.java, SinhalaStemmer.java, SinhalaTokenizer.java, stopwords.txt
>
>
> This component was developed based on three main pieces of research.
>  Sinhala Analyzer, as the name implies, is a software library for analyzing documents written in the Sinhala language. It was implemented by performing Sinhala morphological analysis. Its three main functions are tokenizing the document content, removing stop words, and converting terms to their base/root form.
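
To make the three stages concrete, here is a minimal, self-contained sketch of a tokenize / stop-word / stem pipeline in plain Java. It is not the attached SinhalaAnalyzer.java and does not use the Lucene API: the class name AnalyzerPipelineSketch, the method names, and the naive longest-suffix stemmer are all hypothetical stand-ins, and the stop words and suffixes are supplied by the caller rather than taken from the attached stopwords.txt. The demo input uses Latin placeholder words only.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical illustration of the three stages described above.
final class AnalyzerPipelineSketch {

    // Stage 1: split on anything that is not a letter, a combining mark,
    // or U+200D (zero-width joiner). Sinhala vowel signs are combining
    // marks, so \p{M} keeps them attached to their base letter.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String t : text.split("[^\\p{L}\\p{M}\\u200D]+")) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    // Stage 3 helper: strip the longest matching suffix, never emptying the token.
    // This is an assumed, simplistic stand-in for a real stemming algorithm.
    static String stem(String token, List<String> suffixes) {
        String best = "";
        for (String s : suffixes) {
            if (s.length() > best.length()
                    && token.length() > s.length()
                    && token.endsWith(s)) {
                best = s;
            }
        }
        return token.substring(0, token.length() - best.length());
    }

    // Full pipeline: tokenize, drop stop words (stage 2), then stem.
    static List<String> analyze(String text, Set<String> stopWords, List<String> suffixes) {
        List<String> out = new ArrayList<>();
        for (String token : tokenize(text)) {
            if (!stopWords.contains(token)) {
                out.add(stem(token, suffixes));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Placeholder data, not real Sinhala stop words or suffixes.
        System.out.println(analyze("the cats, the dogs", Set.of("the"), List.of("s")));
        // prints [cat, dog]
    }
}
```

In an actual Lucene analyzer the same order would normally be expressed as a Tokenizer followed by a chain of TokenFilters; the sketch only shows the data flow between the stages.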



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
