Posted to issues@beam.apache.org by "Enis Nazif (Jira)" <ji...@apache.org> on 2020/01/25 22:11:00 UTC

[jira] [Commented] (BEAM-8970) Spark portable runner supports Yarn

    [ https://issues.apache.org/jira/browse/BEAM-8970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17023657#comment-17023657 ] 

Enis Nazif commented on BEAM-8970:
----------------------------------

Looking at this issue: to run a pipeline on YARN-backed Spark, a user should be able to specify runner options such as
{code:java}
['--runner=SparkRunner',
'--spark_submit_uber_jar',
'--spark_rest_url=http://spark-rest-api:6066',
'--spark_master_url=yarn']{code}
As it stands, `spark_master_url` isn't being passed into the request created in [https://github.com/apache/beam/blob/master/sdks/python/apache_beam/runners/portability/spark_uber_jar_job_server.py#L145]

Passing it through seems to be necessary to support YARN.
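
To illustrate what passing the master URL through might look like, here is a minimal sketch of a submission request body in the general shape used by Spark's REST submission endpoint. It is not the code in spark_uber_jar_job_server.py: the function name `build_create_request`, its parameters, the default app name, and the Spark version string are all hypothetical. Whether the REST endpoint would actually honor a `yarn` master is not established here.

```python
import json


def build_create_request(jar_path, main_class, master_url,
                         app_name="beam-job", spark_version="2.4.4"):
    """Build a CreateSubmissionRequest-style body (hypothetical helper).

    The point is that the user-supplied master_url ends up in
    sparkProperties instead of being dropped.
    """
    return {
        "action": "CreateSubmissionRequest",
        "appResource": jar_path,
        "mainClass": main_class,
        "appArgs": [],
        "clientSparkVersion": spark_version,  # assumed value
        "environmentVariables": {},
        "sparkProperties": {
            "spark.jars": jar_path,
            "spark.app.name": app_name,
            "spark.submit.deployMode": "cluster",
            # Forward the value of --spark_master_url, e.g. "yarn".
            "spark.master": master_url,
        },
    }


if __name__ == "__main__":
    # Placeholder jar path and main class, for illustration only.
    body = build_create_request("/tmp/beam-job.jar", "org.example.Main", "yarn")
    print(json.dumps(body, indent=2))
```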

Failing this, an alternative way may be to bypass the Spark REST API (which seems like fairly hidden functionality) and instead directly `spark-submit` the portable jars that are created. 
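
A rough sketch of that alternative, assuming the portable jar has already been built and `spark-submit` is on the PATH. The helper name `build_spark_submit_command`, the jar path, and the main class shown are placeholders, not names taken from Beam; only the `--master`, `--deploy-mode`, and `--class` flags are standard `spark-submit` options.

```python
import shlex


def build_spark_submit_command(jar_path, main_class, master_url="yarn",
                               deploy_mode="cluster",
                               spark_submit="spark-submit"):
    """Return the argv list for submitting a jar (hypothetical helper)."""
    return [
        spark_submit,
        "--master", master_url,        # e.g. the value of --spark_master_url
        "--deploy-mode", deploy_mode,
        "--class", main_class,         # entry point inside the jar
        jar_path,
    ]


if __name__ == "__main__":
    # Placeholder jar path and main class, for illustration only.
    cmd = build_spark_submit_command("/tmp/beam-job.jar", "org.example.Main")
    # Print the command rather than running it; subprocess.run(cmd, check=True)
    # would execute it on a machine with Spark installed.
    print(" ".join(shlex.quote(part) for part in cmd))
```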

> Spark portable runner supports Yarn
> -----------------------------------
>
>                 Key: BEAM-8970
>                 URL: https://issues.apache.org/jira/browse/BEAM-8970
>             Project: Beam
>          Issue Type: Wish
>          Components: runner-spark
>            Reporter: Kyle Weaver
>            Assignee: Kyle Weaver
>            Priority: Major
>              Labels: portability-spark
>




--
This message was sent by Atlassian Jira
(v8.3.4#803005)