Posted to issues@spark.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2015/04/12 12:39:12 UTC

[jira] [Resolved] (SPARK-1303) Added discretization capability to MLlib.

     [ https://issues.apache.org/jira/browse/SPARK-1303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved SPARK-1303.
------------------------------
    Resolution: Won't Fix

Sounds like this should start outside MLlib: https://github.com/apache/spark/pull/216

> Added discretization capability to MLlib.
> -----------------------------------------
>
>                 Key: SPARK-1303
>                 URL: https://issues.apache.org/jira/browse/SPARK-1303
>             Project: Spark
>          Issue Type: New Feature
>          Components: MLlib
>            Reporter: LIDIAgroup
>
> Some time ago, we discussed with Ameet Talwalkar the possibility of adding both Feature Selection and Discretization algorithms to MLlib.
> In this patch we've implemented Entropy Minimization Discretization, following the algorithm described in the paper "Multi-interval discretization of continuous-valued attributes for classification learning" by Fayyad and Irani (1993). This is one of the most widely used discretizers and is already included in most libraries, such as Weka. It can be used as a base for FS algorithms and for the NaiveBayes already included in MLlib.
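
For readers unfamiliar with the method named in the description, here is a minimal single-machine sketch of entropy minimization discretization with the MDL stopping rule from Fayyad and Irani (1993). It is not the code from the patch and does not use Spark. The function names (`entropy`, `best_cut`, `mdlp_accepts`, `mdlp_cut_points`) are hypothetical, and the sketch assumes one numeric attribute with categorical class labels held in ordinary Python lists.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_cut(values, labels):
    """Return (index, weighted_entropy) of the lowest-entropy cut, or None.

    `values` must already be sorted; a cut at index i splits the data
    into [:i] and [i:]. Cuts between equal values are skipped.
    """
    n = len(values)
    best = None
    for i in range(1, n):
        if values[i] == values[i - 1]:
            continue
        e = (i * entropy(labels[:i]) + (n - i) * entropy(labels[i:])) / n
        if best is None or e < best[1]:
            best = (i, e)
    return best


def mdlp_accepts(labels, i, split_entropy):
    """MDL criterion: accept the cut only if the information gain
    exceeds (log2(N - 1) + delta) / N."""
    n = len(labels)
    left, right = labels[:i], labels[i:]
    k, k1, k2 = len(set(labels)), len(set(left)), len(set(right))
    ent = entropy(labels)
    gain = ent - split_entropy
    delta = math.log2(3 ** k - 2) - (
        k * ent - k1 * entropy(left) - k2 * entropy(right)
    )
    return gain > (math.log2(n - 1) + delta) / n


def mdlp_cut_points(values, labels):
    """Recursively split one attribute and return the sorted cut points."""
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    xs = [v for v, _ in pairs]
    ys = [c for _, c in pairs]
    cuts = []

    def recurse(lo, hi):
        sub_x, sub_y = xs[lo:hi], ys[lo:hi]
        if len(sub_x) < 2:
            return
        found = best_cut(sub_x, sub_y)
        if found is None:
            return
        i, e = found
        if not mdlp_accepts(sub_y, i, e):
            return
        # Place the cut midway between the two neighbouring values.
        cuts.append((sub_x[i - 1] + sub_x[i]) / 2)
        recurse(lo, lo + i)
        recurse(lo + i, hi)

    recurse(0, len(xs))
    return sorted(cuts)


if __name__ == "__main__":
    values = list(range(1, 21))
    labels = ["a"] * 10 + ["b"] * 10
    print(mdlp_cut_points(values, labels))
```

The resulting cut points define the intervals that replace the continuous attribute, which is what makes the output usable by algorithms that expect categorical features. A distributed version would need a different way of finding candidate boundaries; this sketch makes no attempt at that.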



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
