Posted to commits@beam.apache.org by "Amit Sela (JIRA)" <ji...@apache.org> on 2016/07/19 15:47:20 UTC

[jira] [Commented] (BEAM-470) Spark Runner does not send the job execution information into the Spark History Server

    [ https://issues.apache.org/jira/browse/BEAM-470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15384390#comment-15384390 ] 

Amit Sela commented on BEAM-470:
--------------------------------

This is something to consider. On one hand, you could argue that this is a Spark-runtime detail and that you should use spark-submit to run on a cluster anyway.
Still, I think it would be a good idea to let users pass configuration as key-value strings that are propagated to the runner. In Spark's case we would just add them to the SparkContext that the runner constructs.
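
To make the key-value idea concrete, here is a rough sketch, not an actual Beam API. The class name RunnerConfigSketch and the method parseKeyValues are hypothetical, and the "key=value" string format is an assumption. The sketch only parses the strings into a map; in a real runner each pair would presumably be applied with SparkConf.set(key, value) before the context is created. The keys shown (spark.eventLog.enabled, spark.eventLog.dir) are the standard Spark properties that turn on the event logging the history server reads; the directory value is a placeholder.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class RunnerConfigSketch {

    /**
     * Parses "key=value" strings into an insertion-ordered map.
     * Splits on the first '=' only, so values may themselves contain '='.
     * Hypothetical helper, not part of Beam or Spark.
     */
    static Map<String, String> parseKeyValues(List<String> entries) {
        Map<String, String> conf = new LinkedHashMap<>();
        for (String entry : entries) {
            int idx = entry.indexOf('=');
            if (idx <= 0) {
                throw new IllegalArgumentException("Expected key=value, got: " + entry);
            }
            conf.put(entry.substring(0, idx).trim(), entry.substring(idx + 1).trim());
        }
        return conf;
    }

    public static void main(String[] args) {
        // Assumed input: strings a user might pass through pipeline options.
        Map<String, String> conf = parseKeyValues(Arrays.asList(
                "spark.eventLog.enabled=true",
                "spark.eventLog.dir=file:///tmp/spark-events")); // placeholder path

        // A runner would apply each pair to its SparkConf here,
        // e.g. sparkConf.set(key, value), before building the SparkContext.
        conf.forEach((key, value) -> System.out.println(key + " -> " + value));
    }
}
```

Keeping the values as plain strings means the runner does not need to know about individual Spark settings; it just forwards whatever the user supplies.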

> Spark Runner does not send the job execution information into the Spark History Server
> --------------------------------------------------------------------------------------
>
>                 Key: BEAM-470
>                 URL: https://issues.apache.org/jira/browse/BEAM-470
>             Project: Beam
>          Issue Type: Improvement
>          Components: runner-spark
>    Affects Versions: 0.2.0-incubating
>            Reporter: Ismaël Mejía
>            Assignee: Amit Sela
>            Priority: Minor
>
> If you run a Beam pipeline using the Spark runner from Spark (via spark-submit), the execution is registered in the Spark History Server, provided it is active and configured.
> If you instead run it directly from a main method with --runner=SparkRunner (the Beam way), the runner does not report the execution to the history server. The issue seems to be that the runner does not take into account an existing Spark configuration file, SPARK_HOME/conf/spark-defaults.conf (or that there is no way to tell the runner to take such a configuration into account).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)