Posted to issues@spark.apache.org by "Xiangrui Meng (JIRA)" <ji...@apache.org> on 2015/07/17 07:28:04 UTC

[jira] [Resolved] (SPARK-7131) Move tree,forest implementation from spark.mllib to spark.ml

     [ https://issues.apache.org/jira/browse/SPARK-7131?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiangrui Meng resolved SPARK-7131.
----------------------------------
       Resolution: Fixed
    Fix Version/s: 1.5.0

Issue resolved by pull request 7294
[https://github.com/apache/spark/pull/7294]

> Move tree,forest implementation from spark.mllib to spark.ml
> ------------------------------------------------------------
>
>                 Key: SPARK-7131
>                 URL: https://issues.apache.org/jira/browse/SPARK-7131
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML, MLlib
>    Affects Versions: 1.4.0
>            Reporter: Joseph K. Bradley
>            Assignee: Joseph K. Bradley
>             Fix For: 1.5.0
>
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> We want to change and improve the spark.ml API for trees and ensembles, but we cannot change the old API in spark.mllib.  To support the changes we want to make, we should move the implementation from spark.mllib to spark.ml.  We will generalize and modify it, but will also ensure that we do not change the behavior of the old API.
> This JIRA should be done in several PRs, in this order:
> 1. Copy the implementation over to spark.ml and change the spark.ml classes to use that implementation, rather than calling the spark.mllib implementation.  The current spark.ml tests will ensure that the two implementations learn exactly the same models.  Note: This should include performance testing to make sure the updated code does not have any regressions.
> 2. Remove the spark.mllib implementation, and make the spark.mllib APIs wrappers around the spark.ml implementation.  The spark.ml tests will again ensure that we do not change any behavior.
> 3. Move the unit tests to spark.ml, and change the spark.mllib unit tests to verify model equivalence.
> After these updates, we can more safely generalize and improve the spark.ml implementation.
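
The pattern in steps 2 and 3 above (keep the old API's signature, delegate its body to the new implementation, and test that both entry points yield the same model) can be sketched in a few lines. This is only an illustration in plain Python, not Spark code: StumpModel, new_train_stump, OldStumpTrainer and models_equivalent are hypothetical names, and a one-split regression stump stands in for the real tree and forest learners.

```python
# Illustrative sketch only: every name here is hypothetical, not a Spark class.
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple


@dataclass(frozen=True)
class StumpModel:
    """One-split regression stump: predict `left` if x <= threshold, else `right`."""
    threshold: float
    left: float
    right: float

    def predict(self, x: float) -> float:
        return self.left if x <= self.threshold else self.right


def _mean(values: List[float]) -> float:
    return sum(values) / len(values)


def new_train_stump(data: Sequence[Tuple[float, float]]) -> StumpModel:
    """'New package' implementation: the only place the training logic lives."""
    if not data:
        raise ValueError("training data must not be empty")
    xs = sorted({x for x, _ in data})
    best: Optional[Tuple[float, StumpModel]] = None
    # Try a threshold halfway between each pair of adjacent distinct x values.
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in data if x <= t]
        right = [y for x, y in data if x > t]
        lm, rm = _mean(left), _mean(right)
        err = sum((y - lm) ** 2 for y in left) + sum((y - rm) ** 2 for y in right)
        if best is None or err < best[0]:
            best = (err, StumpModel(t, lm, rm))
    if best is None:
        # Only one distinct x value: no split is possible, predict the mean.
        m = _mean([y for _, y in data])
        return StumpModel(xs[0], m, m)
    return best[1]


class OldStumpTrainer:
    """'Old package' API: the public signature is kept, the body only delegates."""

    def __init__(self, data: Sequence[Tuple[float, float]]):
        self._data = list(data)

    def run(self) -> StumpModel:
        return new_train_stump(self._data)


def models_equivalent(a: StumpModel, b: StumpModel) -> bool:
    """Equivalence check of the kind an old-API unit test could assert."""
    return a == b


if __name__ == "__main__":
    data = [(1.0, 0.0), (2.0, 0.0), (3.0, 1.0), (4.0, 1.0)]
    assert models_equivalent(OldStumpTrainer(data).run(), new_train_stump(data))
```

Because the old entry point holds no training logic of its own, any later generalization of the new implementation needs only the equivalence test to confirm that callers of the old API still get the same models.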


