Posted to commits@beam.apache.org by "Ryan Brush (JIRA)" <ji...@apache.org> on 2016/03/07 18:27:40 UTC

[jira] [Commented] (BEAM-11) Integrate Spark runner with Beam

    [ https://issues.apache.org/jira/browse/BEAM-11?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183305#comment-15183305 ] 

Ryan Brush commented on BEAM-11:
--------------------------------

Not sure if others are working on this, but the commit linked below is probably the smallest possible change to get spark-dataflow running with the current Beam code. 

Here it is: https://github.com/rbrush/spark-dataflow/commit/0a11d747eeb6bb47bb46e179deca4c85a9d5cf33

We need to do quite a bit more with the runner before it's broadly usable; see the ugly "TODO" around state internals in the commit. So perhaps the best path forward is to create a development branch of Beam that includes the spark-dataflow runner, where we can improve on it? Once it's in a better state, we can squash/rebase (or follow whatever conventions this project uses) to get a clean merge into master.

I'm happy to create the branch if desired (although I lack commit privs), or feel free to just grab the code from the above commit if it makes sense.



> Integrate Spark runner with Beam
> --------------------------------
>
>                 Key: BEAM-11
>                 URL: https://issues.apache.org/jira/browse/BEAM-11
>             Project: Beam
>          Issue Type: Task
>          Components: runner-spark
>            Reporter: Amit Sela
>            Assignee: Amit Sela
>
> Refactor and integrate the Spark runner code against Google's contributed version of Dataflow - Beam.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)