Posted to issues@spark.apache.org by "Joseph K. Bradley (JIRA)" <ji...@apache.org> on 2015/06/22 03:03:00 UTC

[jira] [Resolved] (SPARK-7443) MLlib 1.4 QA plan

     [ https://issues.apache.org/jira/browse/SPARK-7443?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph K. Bradley resolved SPARK-7443.
--------------------------------------
       Resolution: Fixed
    Fix Version/s: 1.4.0

I'm closing this and marking it Fixed.  I've verified that the still-unresolved JIRAs linked from this umbrella have appropriate target versions.  The remaining ones are mainly perf tests (necessary for the next release) or documentation (which we can add to the Spark user guide when those docs are ready).

> MLlib 1.4 QA plan
> -----------------
>
>                 Key: SPARK-7443
>                 URL: https://issues.apache.org/jira/browse/SPARK-7443
>             Project: Spark
>          Issue Type: Umbrella
>          Components: ML, MLlib
>    Affects Versions: 1.4.0
>            Reporter: Xiangrui Meng
>            Assignee: Joseph K. Bradley
>            Priority: Critical
>             Fix For: 1.4.0
>
>
> h2. API
> * Check API compliance using java-compliance-checker (SPARK-7458)
> * Audit new public APIs (from the generated html doc)
> ** Scala (do not forget to check the object doc) (SPARK-7537)
> ** Java compatibility (SPARK-7529)
> ** Python API coverage (SPARK-7536)
> * Audit Pipeline APIs (SPARK-7535)
> * Graduate spark.ml from alpha (SPARK-7748)
> ** remove AlphaComponent annotations
> ** remove mima excludes for spark.ml
> ** mark concrete classes final wherever reasonable
> h2. Algorithms and performance
> *Performance*
> * _List any other missing performance tests from spark-perf here_
> * LDA online/EM (SPARK-7455)
> * ElasticNet for linear regression and logistic regression (SPARK-7456)
> * Bernoulli naive Bayes (SPARK-7453)
> * PIC (SPARK-7454)
> * ALS.recommendAll (SPARK-7457)
> * perf-tests in Python (SPARK-7539)
> *Correctness*
> * PMML
> ** scoring using PMML evaluator vs. MLlib models (SPARK-7540)
> * model save/load (SPARK-7541)
> h2. Documentation and example code
> * Create JIRAs for the user guide to each new algorithm and assign them to the corresponding author.  Link here as "requires"
> ** Now that we have algorithms in spark.ml which are not in spark.mllib, we should start making subsections for the spark.ml API as needed.  We can follow the structure of the spark.mllib user guide.
> *** The spark.ml user guide can provide: (a) code examples and (b) info on algorithms which do not exist in spark.mllib.
> *** We should not duplicate info in the spark.ml guides.  Since spark.mllib is still the primary API, we should provide links to the corresponding algorithms in the spark.mllib user guide for more info.
> * Create example code for major components.  Link here as "requires"
> ** cross validation in Python (SPARK-7387)
> ** pipeline with complex feature transformations (Scala/Java/Python) (SPARK-7546)
> ** elastic-net (possibly with cross validation) (SPARK-7547)
> ** kernel density (SPARK-7707)
> * Update Programming Guide for 1.4 (towards end of QA) (SPARK-7715)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
