Posted to issues@spark.apache.org by "Max Moroz (JIRA)" <ji...@apache.org> on 2016/07/01 07:32:11 UTC

[jira] [Updated] (SPARK-16319) Non-linear (DAG) pipelines need better explanation

     [ https://issues.apache.org/jira/browse/SPARK-16319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Max Moroz updated SPARK-16319:
------------------------------
    Summary: Non-linear (DAG) pipelines need better explanation  (was: Pipeline / DAG)

> Non-linear (DAG) pipelines need better explanation
> --------------------------------------------------
>
>                 Key: SPARK-16319
>                 URL: https://issues.apache.org/jira/browse/SPARK-16319
>             Project: Spark
>          Issue Type: Documentation
>          Components: ML
>    Affects Versions: 2.0.0
>            Reporter: Max Moroz
>            Priority: Minor
>
> There's a [paragraph|http://spark.apache.org/docs/2.0.0-preview/ml-guide.html#details] about non-linear pipelines in the ML docs, but it's not clear how a DAG pipeline differs from a linear pipeline. In fact, a "DAG Pipeline" seems to behave identically to a regular linear pipeline: the stages are simply applied in the order provided when the pipeline is created. In addition, no checks of input and output columns seem to occur when pipeline.fit() or pipeline.transform() is called.
> The docs should either clarify that paragraph or remove it.
> I'd be happy to write it up, but at this point I have no idea what the intention of this concept is.
> [Additional reference on SO|http://stackoverflow.com/questions/37541668/non-linear-dag-ml-pipelines-in-apache-spark]
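
To make the reported gap concrete, here is a minimal sketch of the kind of input/output column check the description says does not seem to happen. It is plain Python, not Spark code: the `Stage` tuple, the `check_stage_order` helper, and the stage and column names are all hypothetical and chosen only for illustration. It assumes each stage declares its input columns and a single output column, walks the stages in the order provided, and reports any input that no earlier stage (or the initial data) supplies.

```python
from typing import List, NamedTuple, Set


class Stage(NamedTuple):
    """Hypothetical stand-in for a pipeline stage; not a Spark class."""
    name: str
    input_cols: List[str]
    output_col: str


def check_stage_order(stages: List[Stage], initial_cols: Set[str]) -> List[str]:
    """Walk stages in the order given and report inputs that do not exist yet."""
    available = set(initial_cols)
    problems = []
    for stage in stages:
        for col in stage.input_cols:
            if col not in available:
                problems.append(f"{stage.name}: missing input column '{col}'")
        # After a stage runs, its output column is visible to later stages.
        available.add(stage.output_col)
    return problems


# A non-linear (DAG) layout: one branch derives "tf" from "text",
# and the last stage joins that branch with the raw "age" column.
tokenizer = Stage("tokenizer", ["text"], "words")
hashing_tf = Stage("hashing_tf", ["words"], "tf")
assembler = Stage("assembler", ["tf", "age"], "features")

# Stages listed so that every input is produced before it is consumed.
good_order = check_stage_order([tokenizer, hashing_tf, assembler], {"text", "age"})

# Same stages, but the joining stage is listed first.
bad_order = check_stage_order([assembler, tokenizer, hashing_tf], {"text", "age"})

print(good_order)
print(bad_order)
```

In this sketch the first ordering produces no complaints, while the second flags that `assembler` asks for `tf` before any stage has produced it. Whether Spark's own Pipeline performs an equivalent check is exactly what this issue asks the docs to spell out.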



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
