Posted to commits@airflow.apache.org by "Rob Keevil (JIRA)" <ji...@apache.org> on 2018/03/07 14:13:00 UTC

[jira] [Commented] (AIRFLOW-2140) Add Kubernetes Scheduler to Spark Submit Operator

    [ https://issues.apache.org/jira/browse/AIRFLOW-2140?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16389598#comment-16389598 ] 

Rob Keevil commented on AIRFLOW-2140:
-------------------------------------

The on_kill code is written, but it is currently never called by Airflow; this will need to be retested once AIRFLOW-1623 has been fixed.
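For context, a minimal sketch of what such an on_kill handler could look like for the Kubernetes scheduler: deleting the Spark driver pod makes Kubernetes tear down the executors as well. The function names, the pod-name argument, and the use of kubectl here are assumptions for illustration, not the actual patch.

```python
import subprocess

def build_kubectl_kill_cmd(pod_name, namespace="default"):
    # Build the kubectl command that deletes the Spark driver pod.
    # pod_name would be the driver pod reported by spark-submit
    # (hypothetical plumbing; the real hook tracks this itself).
    return ["kubectl", "delete", "pod", pod_name, "--namespace", namespace]

def on_kill(pod_name, namespace="default"):
    # Invoked when Airflow cancels the task (once AIRFLOW-1623 lands).
    # Killing the driver pod causes Kubernetes to reap the executor pods.
    subprocess.check_call(build_kubectl_kill_cmd(pod_name, namespace))
```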

I also set the spark-submit log level to info instead of debug, as this is very important information to see in the logs (e.g. whether the submit failed). This may be overly verbose in some environments.
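The idea is roughly the following: forward each line of spark-submit output at INFO so submission failures are visible in the task log, while also scanning for the driver pod name needed for monitoring and kill handling. This is a hedged sketch; the "pod name:" pattern and function name are assumptions, not the patch's actual parsing code.

```python
import logging
import re

log = logging.getLogger(__name__)

# Assumed pattern for the line where spark-submit announces the driver pod.
DRIVER_POD_RE = re.compile(r"pod name: (\S+)")

def process_spark_submit_line(line):
    # Log at INFO (rather than DEBUG) so a failed submit is visible in
    # the Airflow task log; return the driver pod name if this line
    # announces it, else None.
    log.info(line.rstrip())
    match = DRIVER_POD_RE.search(line)
    return match.group(1) if match else None
```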

> Add Kubernetes Scheduler to Spark Submit Operator
> -------------------------------------------------
>
>                 Key: AIRFLOW-2140
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2140
>             Project: Apache Airflow
>          Issue Type: New Feature
>    Affects Versions: 1.9.0
>            Reporter: Rob Keevil
>            Assignee: Rob Keevil
>            Priority: Major
>
> Spark 2.3 adds the Kubernetes resource manager to Spark, alongside the existing Standalone, Yarn and Mesos resource managers. 
> https://github.com/apache/spark/blob/master/docs/running-on-kubernetes.md
> We should extend the spark submit operator to enable the new K8s spark submit options, and to be able to monitor Spark jobs running within Kubernetes.
> I already have working code for this; I still need to test the monitoring/log parsing code and make sure that Airflow is able to terminate Kubernetes pods when jobs are cancelled etc.
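To make the quoted feature request concrete, here is a minimal sketch of a spark-submit invocation against the Kubernetes resource manager introduced in Spark 2.3: the master is the API server URL prefixed with "k8s://" and the container image is passed via spark.kubernetes.container.image. The helper function and its parameter names are illustrative assumptions, not the operator's actual implementation.

```python
def build_spark_submit_cmd(master, image, main_class, app_jar, conf=None):
    # Assemble a spark-submit command line for cluster mode on Kubernetes.
    # master: e.g. "k8s://https://<api-server>:443"
    # image:  container image used for driver and executor pods
    cmd = [
        "spark-submit",
        "--master", master,
        "--deploy-mode", "cluster",
        "--class", main_class,
        "--conf", "spark.kubernetes.container.image=%s" % image,
    ]
    for key, value in (conf or {}).items():
        cmd += ["--conf", "%s=%s" % (key, value)]
    cmd.append(app_jar)
    return cmd
```

For example, submitting the SparkPi example would produce a command starting with `spark-submit --master k8s://... --deploy-mode cluster`.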



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)