Posted to issues@spark.apache.org by "yuhao yang (JIRA)" <ji...@apache.org> on 2016/11/15 08:44:58 UTC

[jira] [Commented] (SPARK-18441) Add Smote in spark mlib and ml

    [ https://issues.apache.org/jira/browse/SPARK-18441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15666539#comment-15666539 ] 

yuhao yang commented on SPARK-18441:
------------------------------------

Hi [~licl], I have an implementation of SMOTE at https://github.com/hhbyyh/Test/blob/master/Smote/src/SmoteTest.scala . I hope it helps.

Spark 2.1 is already in the QA phase (feature lockdown), so the earliest possible release window would be 2.2. Before that, we need to resolve some issues. For example, if SmoteSampler is a feature transformer, we need a way to disable it in the pipeline during the prediction phase, since oversampling should only affect training data. We also need to collect more opinions from the community to see whether this is a common requirement.
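
To illustrate the concern, here is a minimal sketch in plain Python. It is not Spark code and not the implementation linked above. The `smote` function, the `SmoteSampler` class shown here, and its `training` flag are all hypothetical names chosen for this example; the neighbour search is brute force and assumes dense numeric feature lists.

```python
import random


def smote(minority, n_synthetic, k=5, seed=0):
    """Create n_synthetic points, each interpolated between a minority sample
    and one of its k nearest minority neighbours (squared Euclidean distance)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Brute-force neighbour search: fine for a sketch, too slow for large data.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment from base to neighbour
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbour)])
    return synthetic


class SmoteSampler:
    """Hypothetical pipeline stage: oversamples the minority class while
    training and passes rows through unchanged while predicting."""

    def __init__(self, n_synthetic, k=5, seed=0, minority_label=1):
        self.n_synthetic = n_synthetic
        self.k = k
        self.seed = seed
        self.minority_label = minority_label

    def transform(self, rows, training):
        # rows is a list of (features, label) pairs.
        if not training:
            return list(rows)  # prediction phase: the stage is a no-op
        minority = [f for f, y in rows if y == self.minority_label]
        extra = smote(minority, self.n_synthetic, self.k, self.seed)
        return list(rows) + [(f, self.minority_label) for f in extra]
```

The explicit `training` argument is only a stand-in for this sketch. How a Spark ML pipeline would carry that distinction between fitting and predicting is exactly the open question raised above.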

> Add Smote in spark mlib and ml
> ------------------------------
>
>                 Key: SPARK-18441
>                 URL: https://issues.apache.org/jira/browse/SPARK-18441
>             Project: Spark
>          Issue Type: Wish
>          Components: ML, MLlib
>    Affects Versions: 2.0.1
>            Reporter: lichenglin
>
> Please add SMOTE to Spark MLlib and ML to handle imbalanced training data for classification.


