You are viewing a plain text version of this content. The canonical link for it is here.

Posted to issues@spark.apache.org by "Joseph K. Bradley (JIRA)" <ji...@apache.org> on 2018/01/04 01:40:00 UTC

[jira] [Updated] (SPARK-21926) Compatibility between ML Transformers and Structured Streaming

     [ https://issues.apache.org/jira/browse/SPARK-21926?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joseph K. Bradley updated SPARK-21926:
--------------------------------------
    Target Version/s:   (was: 2.3.0)

> Compatibility between ML Transformers and Structured Streaming
> --------------------------------------------------------------
>
>                 Key: SPARK-21926
>                 URL: https://issues.apache.org/jira/browse/SPARK-21926
>             Project: Spark
>          Issue Type: Umbrella
>          Components: ML, Structured Streaming
>    Affects Versions: 2.2.0
>            Reporter: Bago Amirbekian
>
> We've run into a few cases where ML components don't play nice with streaming dataframes (for prediction). This ticket is meant to help aggregate these known cases in one place and provide a place to discuss possible fixes.
> Failing cases:
> 1) VectorAssembler where one of the inputs is a VectorUDT column with no metadata.
> Possible fixes:
> More details here SPARK-22346.
> 2) OneHotEncoder where the input is a column with no metadata.
> Possible fixes:
> a) Make OneHotEncoder an estimator (SPARK-13030).
> -b) Allow user to set the cardinality of OneHotEncoder.-



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@spark.apache.org
For additional commands, e-mail: issues-help@spark.apache.org