Posted to commits@beam.apache.org by "Ryan Brush (JIRA)" <ji...@apache.org> on 2016/03/07 18:27:40 UTC
[jira] [Commented] (BEAM-11) Integrate Spark runner with Beam
[ https://issues.apache.org/jira/browse/BEAM-11?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15183305#comment-15183305 ]
Ryan Brush commented on BEAM-11:
--------------------------------
Not sure if others are working on this, but the commit linked below is probably the smallest possible change to get spark-dataflow running with the current Beam code.
Here it is: https://github.com/rbrush/spark-dataflow/commit/0a11d747eeb6bb47bb46e179deca4c85a9d5cf33
We need to do quite a bit more with the runner before it's broadly usable; see the ugly "TODO" around state internals in the commit. So perhaps the best path forward is to create a development branch of Beam that includes the spark-dataflow runner, where we can improve on it? Once it's in a better state, we can squash/rebase (or follow whatever conventions this project uses) to get a clean merge into master.
I'm happy to create the branch if desired (although I lack commit privileges), or feel free to grab the code from the above commit if it makes sense.
> Integrate Spark runner with Beam
> --------------------------------
>
> Key: BEAM-11
> URL: https://issues.apache.org/jira/browse/BEAM-11
> Project: Beam
> Issue Type: Task
> Components: runner-spark
> Reporter: Amit Sela
> Assignee: Amit Sela
>
> Refactor and integrate the Spark runner code against Google's contributed version of Dataflow - Beam.