Posted to commits@airflow.apache.org by "ASF subversion and git services (JIRA)" <ji...@apache.org> on 2017/04/06 15:30:41 UTC

[jira] [Commented] (AIRFLOW-1028) Databricks Operator for Airflow

    [ https://issues.apache.org/jira/browse/AIRFLOW-1028?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15959101#comment-15959101 ] 

ASF subversion and git services commented on AIRFLOW-1028:
----------------------------------------------------------

Commit 53ca5084561fd5c13996609f2eda6baf717249b5 in incubator-airflow's branch refs/heads/master from [~andrewmchen]
[ https://git-wip-us.apache.org/repos/asf?p=incubator-airflow.git;h=53ca508 ]

[AIRFLOW-1028] Databricks Operator for Airflow

Add DatabricksSubmitRun Operator

In this PR, we contribute a DatabricksSubmitRun operator and a
Databricks hook. This operator enables easy integration of Airflow
with Databricks. In addition to the operator, we have created a
databricks_default connection, an example_dag using this
DatabricksSubmitRunOperator, and matching documentation.

Closes #2202 from andrewmchen/databricks-operator-squashed
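
For context, a minimal sketch of what a DAG using the new operator could look like. This is a sketch only: the DAG id, cluster settings, and notebook path are placeholders, and the import path assumes the contrib layout used at the time of this commit.

    from datetime import datetime

    from airflow import DAG
    from airflow.contrib.operators.databricks_operator import DatabricksSubmitRunOperator

    # Illustrative DAG; the id, schedule, and start date are placeholders.
    dag = DAG(
        dag_id='example_databricks_submit_run',
        start_date=datetime(2017, 4, 1),
        schedule_interval=None,
    )

    # The `json` argument mirrors the body of the /jobs/runs/submit REST call.
    notebook_run = DatabricksSubmitRunOperator(
        task_id='notebook_run',
        dag=dag,
        databricks_conn_id='databricks_default',  # connection added by this PR
        json={
            'new_cluster': {
                'spark_version': '2.1.0-db3-scala2.11',  # placeholder runtime version
                'node_type_id': 'r3.xlarge',
                'num_workers': 2,
            },
            'notebook_task': {
                'notebook_path': '/Users/someone@example.com/PrepareData',
            },
        },
    )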


> Databricks Operator for Airflow
> -------------------------------
>
>                 Key: AIRFLOW-1028
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1028
>             Project: Apache Airflow
>          Issue Type: New Feature
>            Reporter: Andrew Chen
>            Assignee: Andrew Chen
>
> It would be nice to have a Databricks Operator/Hook in Airflow so users of Databricks can more easily integrate with Airflow.
> The operator would submit a Spark job to our new /jobs/runs/submit endpoint. This endpoint is similar to https://docs.databricks.com/api/latest/jobs.html#jobscreatejob but does not include the email_notifications, max_retries, min_retry_interval_millis, retry_on_timeout, schedule, and max_concurrent_runs fields. (The docs for the submit endpoint are not published yet because it is still private.)
> Our proposed design for the operator is therefore to mirror this REST API endpoint. Each named argument of the operator corresponds to one of the fields of the REST API request, and the value of each argument matches the type expected by the REST API. We will also merge extra keyword arguments that should not be passed to the BaseOperator into our API call, so the operator stays flexible as the endpoint evolves (see the sketch after this quoted description).
> If this interface turns out not to be user friendly, we can later add more specialized operators that extend this one.
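
To make the proposed mapping concrete, here is a rough sketch of how named arguments could be folded into a /jobs/runs/submit request body. The helper below is hypothetical and only illustrates the "arguments mirror REST fields" idea from the description above; it is not the code that was merged.

    # Hypothetical helper illustrating the proposed design: each named argument
    # corresponds to a top-level field of the /jobs/runs/submit request body,
    # and extra keyword arguments are merged into the same payload.
    def build_submit_run_payload(spark_jar_task=None, notebook_task=None,
                                 new_cluster=None, existing_cluster_id=None,
                                 libraries=None, run_name=None,
                                 timeout_seconds=None, **extra_api_fields):
        payload = {
            'spark_jar_task': spark_jar_task,
            'notebook_task': notebook_task,
            'new_cluster': new_cluster,
            'existing_cluster_id': existing_cluster_id,
            'libraries': libraries,
            'run_name': run_name,
            'timeout_seconds': timeout_seconds,
        }
        # Unknown keys are kept so the payload stays forward compatible with
        # fields the API adds later; unused (None) fields are dropped.
        payload.update(extra_api_fields)
        return {k: v for k, v in payload.items() if v is not None}

    # Example: a notebook run on an existing cluster (placeholder values).
    print(build_submit_run_payload(
        existing_cluster_id='1234-567890-abc123',
        notebook_task={'notebook_path': '/Users/someone@example.com/PrepareData'},
        run_name='airflow-run',
    ))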



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)