Posted to issues@spark.apache.org by "Joseph K. Bradley (JIRA)" <ji...@apache.org> on 2015/12/09 23:49:11 UTC
[jira] [Updated] (SPARK-12212) Clarify the distinction between spark.mllib and spark.ml
[ https://issues.apache.org/jira/browse/SPARK-12212?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Joseph K. Bradley updated SPARK-12212:
--------------------------------------
Assignee: Timothy Hunter
> Clarify the distinction between spark.mllib and spark.ml
> --------------------------------------------------------
>
> Key: SPARK-12212
> URL: https://issues.apache.org/jira/browse/SPARK-12212
> Project: Spark
> Issue Type: Sub-task
> Components: Documentation
> Affects Versions: 1.5.2
> Reporter: Timothy Hunter
> Assignee: Timothy Hunter
>
> The MLlib documentation is unclear about what exactly MLlib is: is it the package, or is it the whole ML effort on Spark, and how does it differ from spark.ml? Is MLlib going to be deprecated?
> We should do the following:
> - Refer to the mllib code package as spark.mllib across all the documentation. An alternative name is "RDD API of MLlib".
> - Refer to the project that encompasses spark.ml + spark.mllib as MLlib (this should be the default).
> - Replace references to the "Pipeline API" with spark.ml or the "DataFrame API of MLlib". I would de-emphasize that this API is for building pipelines. Some users are led to believe from the documentation that spark.ml can only be used for building pipelines and that a single algorithm can only be used through spark.mllib.
> Most relevant places:
> - {{mllib-guide.md}}
> - {{mllib-linear-methods.md}}
> - {{mllib-dimensionality-reduction.md}}
> - {{mllib-pmml-model-export.md}}
> - {{mllib-statistics.md}}
> In these files, most references to {{MLlib}} are meant to refer to {{spark.mllib}} instead.
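Since only most, not all, of those references should change, one way to approach the cleanup is to list candidate lines first and review each by hand. The following is a minimal sketch under stated assumptions, not part of the issue itself: the `docs` directory and the helper names `find_candidates` and `scan_docs` are hypothetical, and the pattern simply flags standalone "MLlib"/"MLLib" and "Pipeline API" mentions.

```python
import re
from pathlib import Path

# File names are taken from the issue; the "docs" directory is an assumption.
DOC_FILES = [
    "mllib-guide.md",
    "mllib-linear-methods.md",
    "mllib-dimensionality-reduction.md",
    "mllib-pmml-model-export.md",
    "mllib-statistics.md",
]

# Flags standalone "MLlib"/"MLLib" (the lowercase package name "spark.mllib"
# is not matched) and "Pipeline API"/"Pipelines API" mentions.
PATTERN = re.compile(r"(?<![.\w])ML[lL]ib\b|\bPipelines? API\b")


def find_candidates(text):
    """Return (line_number, matched_term, stripped_line) for each candidate."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in PATTERN.finditer(line):
            hits.append((lineno, match.group(0), line.strip()))
    return hits


def scan_docs(docs_dir="docs"):
    """Print candidates for every listed file that exists under docs_dir."""
    for name in DOC_FILES:
        path = Path(docs_dir) / name
        if not path.is_file():
            continue  # skip files that are missing in this checkout
        text = path.read_text(encoding="utf-8")
        for lineno, term, line in find_candidates(text):
            print(f"{path}:{lineno}: [{term}] {line}")


if __name__ == "__main__":
    scan_docs()
```

The script only reports candidates and rewrites nothing, because deciding whether a given "MLlib" means the whole project or the spark.mllib package still needs a human reader.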