Posted to dev@lucene.apache.org by "Robert Muir (JIRA)" <ji...@apache.org> on 2017/10/31 13:09:00 UTC

[jira] [Commented] (LUCENE-8028) Arabic Stemmer improvement for Better Search Accuracy

    [ https://issues.apache.org/jira/browse/LUCENE-8028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16226750#comment-16226750 ] 

Robert Muir commented on LUCENE-8028:
-------------------------------------

Hi, we should add it as an option! It is ok to have multiple stemmers (choices).

I think we should be conservative about changing the default. At least for the second paper (which isn't paywalled, so I could look at it quickly), the approach appears to incorporate a dictionary-based component. Dictionary-based stemmers are domain-dependent and typically perform worse on average than rule-based ones because of out-of-vocabulary (OOV) words, and I don't yet see any standard IR experiments confirming the improvement.
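
To illustrate the OOV concern and the "multiple stemmers as an option" idea, here is a minimal sketch. It is not Lucene's ArabicStemmer and not the method from either paper: the affix lists, the one-entry lexicon, and the names rule_based_stem, dictionary_stem and stem are all hypothetical, chosen only to show how a lexicon lookup leaves unseen words untouched while affix rules still apply.

```python
# Toy affix lists, assumed for illustration only; not Lucene's actual rules.
PREFIXES = ("ال",)   # definite article
SUFFIXES = ("ات",)   # feminine plural ending
MIN_STEM_LEN = 2     # hypothetical guard against over-stripping

# Hypothetical one-entry lexicon standing in for a domain dictionary.
LEXICON = {"الكتاب": "كتاب"}


def rule_based_stem(word):
    """Strip at most one known prefix and one known suffix."""
    for prefix in PREFIXES:
        if word.startswith(prefix) and len(word) - len(prefix) >= MIN_STEM_LEN:
            word = word[len(prefix):]
            break
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= MIN_STEM_LEN:
            word = word[:-len(suffix)]
            break
    return word


def dictionary_stem(word):
    """Look the word up; out-of-vocabulary words come back unchanged."""
    return LEXICON.get(word, word)


# Both stemmers stay available; the caller picks one, and the default is unchanged.
STEMMERS = {"rule": rule_based_stem, "dictionary": dictionary_stem}


def stem(word, stemmer="rule"):
    return STEMMERS[stemmer](word)
```

In this toy setup both stemmers agree on a word the lexicon covers, but for a word outside the lexicon the dictionary variant returns the surface form as-is, so it would not match other inflections of the same word at query time. Whether that effect outweighs the gains reported in the papers is what standard IR experiments would need to show.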

> Arabic Stemmer improvement for Better Search Accuracy
> -----------------------------------------------------
>
>                 Key: LUCENE-8028
>                 URL: https://issues.apache.org/jira/browse/LUCENE-8028
>             Project: Lucene - Core
>          Issue Type: Improvement
>            Reporter: Ayah Shamandi
>              Labels: Arabic, Stemmer, improvement
>
> Hi, this is Ayah, a bidi developer on the Globalization Team at IBM Egypt. We are responsible for supporting Arabic in IBM products and services, and since we use Lucene in many of our services, we found that its Arabic stemmer needs major improvement. We implemented the following two papers, https://dl.acm.org/citation.cfm?id=1921657 and http://waset.org/publications/10005688/arabic-light-stemmer-for-better-search-accuracy, to improve Lucene's Arabic stemmer, and would like to open a pull request so it can be integrated into Lucene.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
