Posted to issues@kudu.apache.org by "Grant Henke (JIRA)" <ji...@apache.org> on 2018/09/26 15:48:00 UTC

[jira] [Resolved] (KUDU-2539) Supporting Spark Streaming DataFrame in KuduContext

     [ https://issues.apache.org/jira/browse/KUDU-2539?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Grant Henke resolved KUDU-2539.
-------------------------------
       Resolution: Fixed
    Fix Version/s: 1.8.0

This is resolved via [8020cbf|https://github.com/apache/kudu/commit/8020cbf2760483c46ed0766dfdebe3c12d0107f1].

> Supporting Spark Streaming DataFrame in KuduContext
> ---------------------------------------------------
>
>                 Key: KUDU-2539
>                 URL: https://issues.apache.org/jira/browse/KUDU-2539
>             Project: Kudu
>          Issue Type: Improvement
>          Components: spark
>    Affects Versions: 1.8.0
>            Reporter: Attila Zsolt Piros
>            Assignee: Attila Zsolt Piros
>            Priority: Minor
>             Fix For: 1.8.0
>
>
> Currently, KuduContext does not support Spark streaming DataFrames. The problem comes from a foreachPartition call, which, like foreach, is an unsupported operation on a streaming DataFrame:
> [unsupported operations in streaming|https://spark.apache.org/docs/latest/structured-streaming-programming-guide.html#unsupported-operations]
> I have created a small example app with a custom Kudu sink which can be used for testing:
> [kudu custom sink and example app|https://github.com/attilapiros/kudu_custom_sink]
> A patch fixing this issue in kudu-spark is also ready, so a Gerrit review with the solution should follow soon.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)