Posted to issues@beam.apache.org by "ASF GitHub Bot (JIRA)" <ji...@apache.org> on 2019/03/22 01:34:00 UTC

[jira] [Work logged] (BEAM-2590) SparkRunner shim for Job API

     [ https://issues.apache.org/jira/browse/BEAM-2590?focusedWorklogId=217088&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-217088 ]

ASF GitHub Bot logged work on BEAM-2590:
----------------------------------------

                Author: ASF GitHub Bot
            Created on: 22/Mar/19 01:33
            Start Date: 22/Mar/19 01:33
    Worklog Time Spent: 10m 
      Work Description: ibzib commented on pull request #8115: [BEAM-2590] Implement basic Spark portable runner
URL: https://github.com/apache/beam/pull/8115
 
 
   This PR implements a prototype for a Spark portable runner.
   - This work follows the Flink runner's example very closely.
   - I chose to build this on top of the more stable and fully-featured legacy Spark runner rather than developing concurrently with the structured streaming branch. Therefore, RDDs are used rather than Datasets.
   - This iteration implements and tests Impulse, Executable Stage, and GBK. Note that the GBK translation is mostly identical to the legacy runner.
   - PAssert is not yet working (because metrics are not supported). However, `SparkPortableExecutionTest` informally verifies, via a print statement, that the executable stage consumes and outputs elements correctly.
   - For now, streaming, side inputs, user state/timers, metrics, etc. are not supported. Side inputs are next on my road map.
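
As a rough mental model of the three primitives listed above (Impulse, an executable stage, and GBK), the following pure-Python sketch mimics their semantics on in-memory lists. It is not Beam or Spark code and does not reflect how this PR translates them to RDDs; the names `impulse`, `run_stage`, and `group_by_key` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Callable, Dict, Hashable, Iterable, List, Tuple


def impulse() -> List[bytes]:
    """Model of Impulse: emit exactly one empty byte string to start a pipeline."""
    return [b""]


def run_stage(elements: Iterable, fn: Callable[[object], Iterable]) -> List:
    """Model of an executable stage: apply one fused function to every
    input element and flatten the results (assumption: a single output)."""
    out: List = []
    for element in elements:
        out.extend(fn(element))
    return out


def group_by_key(pairs: Iterable[Tuple[Hashable, object]]) -> Dict[Hashable, List]:
    """Model of GBK: collect all values that share a key."""
    grouped: Dict[Hashable, List] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


if __name__ == "__main__":
    # Impulse -> stage that fans out key/value pairs -> GBK.
    pairs = run_stage(impulse(), lambda _: [("a", 1), ("b", 2), ("a", 3)])
    print(group_by_key(pairs))
```

In a real runner the order of values within a group is not guaranteed; this sketch preserves input order only because it runs on a single in-memory list.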
   
   R: @angoenka, @robertwb, @iemejia 
   
   
 
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


Issue Time Tracking
-------------------

            Worklog Id:     (was: 217088)
            Time Spent: 10m
    Remaining Estimate: 0h

> SparkRunner shim for Job API
> ----------------------------
>
>                 Key: BEAM-2590
>                 URL: https://issues.apache.org/jira/browse/BEAM-2590
>             Project: Beam
>          Issue Type: Sub-task
>          Components: runner-spark
>            Reporter: Kenneth Knowles
>            Assignee: Kyle Weaver
>            Priority: Major
>              Labels: portability, triaged
>          Time Spent: 10m
>  Remaining Estimate: 0h
>
> Whatever the result of https://s.apache.org/beam-job-api, we will need a way for the JVM-based SparkRunner to receive and run pipelines authored in Python.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)