Posted to issues@spark.apache.org by "Joseph K. Bradley (JIRA)" <ji...@apache.org> on 2015/04/02 05:06:53 UTC

[jira] [Commented] (SPARK-6113) Stabilize DecisionTree and ensembles APIs

    [ https://issues.apache.org/jira/browse/SPARK-6113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14392054#comment-14392054 ] 

Joseph K. Bradley commented on SPARK-6113:
------------------------------------------

I just noted that this is blocked by the two indexer JIRAs.  (Really, it requires at least one of them.)  This is because we decided to add this API directly to the spark.ml package rather than create another tree API within the spark.mllib package.  In the spark.ml package, we will need some way to test categorical features and multiclass classification, and that requires one of the indexer JIRAs (to add category metadata).

> Stabilize DecisionTree and ensembles APIs
> -----------------------------------------
>
>                 Key: SPARK-6113
>                 URL: https://issues.apache.org/jira/browse/SPARK-6113
>             Project: Spark
>          Issue Type: Sub-task
>          Components: MLlib, PySpark
>    Affects Versions: 1.4.0
>            Reporter: Joseph K. Bradley
>            Assignee: Joseph K. Bradley
>            Priority: Critical
>
> *Issue*: The APIs for DecisionTree and ensembles (RandomForests and GradientBoostedTrees) have been experimental for a long time.  The API has become very convoluted because trees and ensembles have many, many variants, some of which we have added incrementally without a long-term design.
> *Proposal*: This JIRA is for discussing changes required to finalize the APIs.  After we discuss, I will make a PR to update the APIs and make them non-Experimental.  This will require making many breaking changes; see the design doc for details.
> [Design doc | https://docs.google.com/document/d/1rJ_DZinyDG3PkYkAKSsQlY0QgCeefn4hUv7GsPkzBP4]: This outlines current issues and the proposed API.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
