Posted to commits@beam.apache.org by "Amit Sela (JIRA)" <ji...@apache.org> on 2016/11/04 21:54:58 UTC

[jira] [Updated] (BEAM-913) Create the skeleton for a Dataset API Spark runner

     [ https://issues.apache.org/jira/browse/BEAM-913?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Amit Sela updated BEAM-913:
---------------------------
    Description: 
As discussed on the Beam dev list, we should have a second runner for Spark based on the Dataset API.
As part of this, the Spark runner will have three modules: {{runner-spark-core}}, {{runner-spark-rdd}} (Spark 1.6.x) and {{runner-spark-dataset}} (Spark 2.x).

This work should go in a feature branch (runner-spark2 already exists).

This ticket is about creating a skeleton for the structure mentioned, and everything that can be easily ported from the current runner.

Some of the work is already in the current feature branch, but a lot has changed since it was last updated.   


  was:
As discussed on the Beam dev list, we should have a second runner for Spark based on the Dataset API.
As part of this, the Spark runner will have three modules: {{runner-spark-core}}, {{runner-spark-rdd}} (Spark 1.6.x) and {{runner-spark-dataset}} (Spark 2.x).

This work should go in a feature branch (runner-spark2 already exists).

This ticket is about creating a skeleton for the structure mentioned, and everything that can be easily ported from the current runner.

Some of the work is already in the current feature branch, but a lot as changed since.   



> Create the skeleton for a Dataset API Spark runner 
> ---------------------------------------------------
>
>                 Key: BEAM-913
>                 URL: https://issues.apache.org/jira/browse/BEAM-913
>             Project: Beam
>          Issue Type: Wish
>          Components: runner-spark
>            Reporter: Amit Sela
>            Assignee: Amit Sela
>
> As discussed on the Beam dev list, we should have a second runner for Spark based on the Dataset API.
> As part of this, the Spark runner will have three modules: {{runner-spark-core}}, {{runner-spark-rdd}} (Spark 1.6.x) and {{runner-spark-dataset}} (Spark 2.x).
> This work should go in a feature branch (runner-spark2 already exists).
> This ticket is about creating a skeleton for the structure mentioned, and everything that can be easily ported from the current runner.
> Some of the work is already in the current feature branch, but a lot has changed since it was last updated.   


