Posted to issues@hivemall.apache.org by "Makoto Yui (JIRA)" <ji...@apache.org> on 2017/09/13 12:51:02 UTC

[jira] [Closed] (HIVEMALL-142) Implement SingularizeUDF for English singular-ization

     [ https://issues.apache.org/jira/browse/HIVEMALL-142?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Makoto Yui closed HIVEMALL-142.
-------------------------------
    Resolution: Fixed

> Implement SingularizeUDF for English singular-ization
> -----------------------------------------------------
>
>                 Key: HIVEMALL-142
>                 URL: https://issues.apache.org/jira/browse/HIVEMALL-142
>             Project: Hivemall
>          Issue Type: New Feature
>            Reporter: Takuya Kitazawa
>            Assignee: Takuya Kitazawa
>
> Something like `singularize('movies')` => `'movie'` could be very useful in combination with `tokenize()` for English NLP on Hivemall.
> The implementation would mostly rely on regular expressions, as in:
> * Java example: https://github.com/sundrio/sundrio/blob/master/codegen/src/main/java/io/sundr/codegen/functions/Singularize.java
> * One of the best-known Python implementations: https://github.com/clips/pattern/blob/master/pattern/text/en/inflect.py#L445-L623
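
To illustrate the regexp-based approach described above, here is a minimal Java sketch. It is not the code committed for this issue: the class name `SingularizeSketch`, the rule list, and the small exception tables are all assumptions chosen for illustration, and a real implementation would need far more rules and exceptions (for example, this sketch gets "bus" wrong).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class for illustration; not the UDF shipped with Hivemall.
public final class SingularizeSketch {

    // A suffix pattern paired with its replacement string.
    private static final class Rule {
        final Pattern pattern;
        final String replacement;

        Rule(String regex, String replacement) {
            this.pattern = Pattern.compile(regex, Pattern.CASE_INSENSITIVE);
            this.replacement = replacement;
        }
    }

    // Assumed sample data: words whose plural equals the singular.
    private static final Set<String> UNCHANGED = new HashSet<>();
    // Assumed sample data: irregular plurals that no suffix rule covers.
    private static final Map<String, String> IRREGULAR = new HashMap<>();
    // Ordered rules: the first pattern that matches wins.
    private static final List<Rule> RULES = new ArrayList<>();

    static {
        UNCHANGED.add("series");
        UNCHANGED.add("species");
        UNCHANGED.add("news");

        IRREGULAR.put("children", "child");
        IRREGULAR.put("people", "person");
        IRREGULAR.put("men", "man");

        RULES.add(new Rule("(m)ovies$", "$1ovie"));           // movies -> movie
        RULES.add(new Rule("([^aeiouy]|qu)ies$", "$1y"));     // cities -> city
        RULES.add(new Rule("(x|ch|ss|sh)es$", "$1"));         // boxes  -> box
        RULES.add(new Rule("([^s])s$", "$1"));                // cats   -> cat
    }

    private SingularizeSketch() {}

    public static String singularize(String word) {
        if (word == null) {
            return null;
        }
        String lower = word.toLowerCase(Locale.ENGLISH);
        if (UNCHANGED.contains(lower)) {
            return word;
        }
        String irregular = IRREGULAR.get(lower);
        if (irregular != null) {
            return irregular;
        }
        for (Rule rule : RULES) {
            Matcher m = rule.pattern.matcher(word);
            if (m.find()) {
                return m.replaceFirst(rule.replacement);
            }
        }
        // No rule applied: assume the word is already singular.
        return word;
    }
}
```

Rule order matters in this sketch: the specific `(m)ovies$` rule has to come before the generic `ies$` rule, otherwise "movies" would become "movy". Words that match no rule, such as "movie" or "glass", are returned unchanged.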



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)