Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/02/21 10:03:49 UTC

[GitHub] [airflow] akifcakir opened a new pull request #21709: Add-raising-ValueError-feature-to-DatabricksSubmitRunOperator

akifcakir opened a new pull request #21709:
URL: https://github.com/apache/airflow/pull/21709


   As is known, when an Airflow step that submits a run for a Databricks notebook fails, the error does not appear in the Airflow logs, and we have to visit the run page via the given link.
   
   To show the notebook error directly in the Airflow step logs, I enriched the AirflowException error message in DatabricksSubmitRunOperator (airflow.providers.databricks.operators.databricks) with the ValueError from the run output. The error is fetched from the Databricks api/2.0/jobs/runs/get-output API through an additional method in DatabricksHook (airflow.providers.databricks.hooks.databricks). The operator and hook have been tested with an example DAG in a local setup, covering both the success and failure cases.
   
   The updated operator keeps all the functionality of the previous version of DatabricksSubmitRunOperator. In addition, it fetches the notebook error from the API endpoint and shows it directly in the step logs in Airflow.
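A minimal sketch of the message enrichment described above, not the code from this pull request. It assumes the get-output response is a dictionary that carries the notebook error under an `error` key, as the diff in this thread suggests. The function name `build_failure_message`, the fallback for a missing key, and the wording after the terminal state are hypothetical.

```python
from typing import Any, Dict


def build_failure_message(task_id: str, run_state: str, run_output: Dict[str, Any]) -> str:
    """Compose a failure message that includes the notebook error when one is present."""
    # Assumption: the run output holds the notebook error under "error".
    # The key may be missing, so fall back to the plain message instead of raising KeyError.
    notebook_error = run_output.get("error")
    base = f"{task_id} failed with terminal state: {run_state}"
    if not notebook_error:
        return base
    # Hypothetical wording for the part that follows the terminal state.
    return f"{base} and with the error {notebook_error}"


if __name__ == "__main__":
    sample_output = {"error": "ValueError: bad input"}
    print(build_failure_message("notebook_task", "FAILED", sample_output))
```

Reading the key with `.get` is a defensive choice made for this sketch; it keeps the task failure message intact even when the run output has no error text.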
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on pull request #21709: Add-showing-runtime-error-feature-to-DatabricksSubmitRunOperator

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #21709:
URL: https://github.com/apache/airflow/pull/21709#issuecomment-1052833326


   Static checks are failing





[GitHub] [airflow] potiuk commented on pull request #21709: Add-showing-runtime-error-feature-to-DatabricksSubmitRunOperator

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #21709:
URL: https://github.com/apache/airflow/pull/21709#issuecomment-1055192152


   Please take a look at the failed check and read both the explanation of the error and the instructions on how to fix it (via pre-commit). It is all there if you follow the details:
   
   ![Screenshot from 2022-03-01 10-10-18](https://user-images.githubusercontent.com/595491/156139686-d6ed0a8c-34c2-4b36-8bdb-ebb0e81a31c5.png)
   
   Looking at it explains it all. If you install pre-commit, as is:
   
   
   * strongly recommended in CONTRIBUTING.rst
   * described in detail in STATIC_CODE_CHECKS.rst
   * explained in detail in the error message you see in the static checks (including the exact command to run)
   
   you won't even have to fix it, because pre-commit will fix it for you.
   





[GitHub] [airflow] potiuk commented on pull request #21709: Add-showing-runtime-error-feature-to-DatabricksSubmitRunOperator

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #21709:
URL: https://github.com/apache/airflow/pull/21709#issuecomment-1053740726


   needs rebase





[GitHub] [airflow] akifcakir commented on a change in pull request #21709: Add-showing-runtime-error-feature-to-DatabricksSubmitRunOperator

Posted by GitBox <gi...@apache.org>.
akifcakir commented on a change in pull request #21709:
URL: https://github.com/apache/airflow/pull/21709#discussion_r812146616



##########
File path: airflow/providers/databricks/operators/databricks.py
##########
@@ -87,7 +87,10 @@ def _handle_databricks_operator_execution(operator, hook, log, context) -> None:
                     log.info('View run status, Spark UI, and logs at %s', run_page_url)
                     return
                 else:
-                    error_message = f'{operator.task_id} failed with terminal state: {run_state}'
+                    run_output = hook.get_run_output(operator.run_id)
+                    notebook_error = run_output['error']
+                    error_message = f'{operator.task_id} failed with terminal state: {run_state} ' \

Review comment:
       @ephraimbuddy I added it here; can you please check? https://github.com/apache/airflow/pull/21709/commits/c5846a8fb3a4002cb6d1fd351838faddbce9bca3







[GitHub] [airflow] potiuk merged pull request #21709: Add-showing-runtime-error-feature-to-DatabricksSubmitRunOperator

Posted by GitBox <gi...@apache.org>.
potiuk merged pull request #21709:
URL: https://github.com/apache/airflow/pull/21709


   





[GitHub] [airflow] akifcakir commented on a change in pull request #21709: Add-showing-runtime-error-feature-to-DatabricksSubmitRunOperator

Posted by GitBox <gi...@apache.org>.
akifcakir commented on a change in pull request #21709:
URL: https://github.com/apache/airflow/pull/21709#discussion_r812146616



##########
File path: airflow/providers/databricks/operators/databricks.py
##########
@@ -87,7 +87,10 @@ def _handle_databricks_operator_execution(operator, hook, log, context) -> None:
                     log.info('View run status, Spark UI, and logs at %s', run_page_url)
                     return
                 else:
-                    error_message = f'{operator.task_id} failed with terminal state: {run_state}'
+                    run_output = hook.get_run_output(operator.run_id)
+                    notebook_error = run_output['error']
+                    error_message = f'{operator.task_id} failed with terminal state: {run_state} ' \

Review comment:
       I added it here; can you please check? https://github.com/apache/airflow/pull/21709/commits/c5846a8fb3a4002cb6d1fd351838faddbce9bca3







[GitHub] [airflow] akifcakir commented on pull request #21709: Add-showing-runtime-error-feature-to-DatabricksSubmitRunOperator

Posted by GitBox <gi...@apache.org>.
akifcakir commented on pull request #21709:
URL: https://github.com/apache/airflow/pull/21709#issuecomment-1055845575


   Hi @potiuk, do you think the PR can be merged? It looks like it has passed all the checks.





[GitHub] [airflow] potiuk commented on pull request #21709: Add-showing-runtime-error-feature-to-DatabricksSubmitRunOperator

Posted by GitBox <gi...@apache.org>.
potiuk commented on pull request #21709:
URL: https://github.com/apache/airflow/pull/21709#issuecomment-1055851377


   Merged !





[GitHub] [airflow] boring-cyborg[bot] commented on pull request #21709: Add-showing-runtime-error-feature-to-DatabricksSubmitRunOperator

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on pull request #21709:
URL: https://github.com/apache/airflow/pull/21709#issuecomment-1055851233


   Awesome work, congrats on your first merged pull request!
   





[GitHub] [airflow] github-actions[bot] commented on pull request #21709: Add-showing-runtime-error-feature-to-DatabricksSubmitRunOperator

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on pull request #21709:
URL: https://github.com/apache/airflow/pull/21709#issuecomment-1052694065


   The PR is likely OK to be merged with just a subset of tests for the default Python and Database versions, without running the full matrix of tests, because it does not modify the core of Airflow. If the committers decide that the full test matrix is needed, they will add the label 'full tests needed'. Then you should rebase to the latest main or amend the last commit of the PR, and push it with --force-with-lease.





[GitHub] [airflow] potiuk edited a comment on pull request #21709: Add-showing-runtime-error-feature-to-DatabricksSubmitRunOperator

Posted by GitBox <gi...@apache.org>.
potiuk edited a comment on pull request #21709:
URL: https://github.com/apache/airflow/pull/21709#issuecomment-1055192152


   Please take a look at the failed check and read both the explanation of the error and the instructions on how to fix it (via pre-commit). It is all there if you follow the details:
   
   ![Screenshot from 2022-03-01 10-10-18](https://user-images.githubusercontent.com/595491/156139686-d6ed0a8c-34c2-4b36-8bdb-ebb0e81a31c5.png)
   
   Looking at it explains it all. If you install pre-commit, as is:
   
   
   * strongly recommended in CONTRIBUTING.rst
   * described in detail in STATIC_CODE_CHECKS.rst
   * explained in detail in the error message you see in the static checks (including the exact command to run)
   
   
   ![image](https://user-images.githubusercontent.com/595491/156140003-8042f12d-28f8-4c16-b283-f04961b6fa0b.png)
   
   
   you won't even have to fix it, because pre-commit will fix it for you.
   





[GitHub] [airflow] akifcakir commented on pull request #21709: Add-showing-runtime-error-feature-to-DatabricksSubmitRunOperator

Posted by GitBox <gi...@apache.org>.
akifcakir commented on pull request #21709:
URL: https://github.com/apache/airflow/pull/21709#issuecomment-1055186529


   Hi @potiuk, I could not see anything wrong that would cause the static checks to fail. Could you please point it out?





[GitHub] [airflow] boring-cyborg[bot] commented on pull request #21709: Add-raising-ValueError-feature-to-DatabricksSubmitRunOperator

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on pull request #21709:
URL: https://github.com/apache/airflow/pull/21709#issuecomment-1046687257


   Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst)
   Here are some useful points:
   - Pay attention to the quality of your code (flake8, mypy and type annotations). Our [pre-commits]( https://github.com/apache/airflow/blob/main/STATIC_CODE_CHECKS.rst#prerequisites-for-pre-commit-hooks) will help you with that.
   - In case of a new feature add useful documentation (in docstrings or in `docs/` directory). Adding a new operator? Check this short [guide](https://github.com/apache/airflow/blob/main/docs/apache-airflow/howto/custom-operator.rst) Consider adding an example DAG that shows how users should use it.
   - Consider using [Breeze environment](https://github.com/apache/airflow/blob/main/BREEZE.rst) for testing locally, it's a heavy docker but it ships with a working Airflow and a lot of integrations.
   - Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
   - Please follow [ASF Code of Conduct](https://www.apache.org/foundation/policies/conduct) for all communication including (but not limited to) comments on Pull Requests, Mailing list and Slack.
   - Be sure to read the [Airflow Coding style]( https://github.com/apache/airflow/blob/main/CONTRIBUTING.rst#coding-style-and-best-practices).
   Apache Airflow is a community-driven project and together we are making it better 🚀.
   In case of doubts contact the developers at:
   Mailing List: dev@airflow.apache.org
   Slack: https://s.apache.org/airflow-slack
   





[GitHub] [airflow] ephraimbuddy commented on a change in pull request #21709: Add-showing-runtime-error-feature-to-DatabricksSubmitRunOperator

Posted by GitBox <gi...@apache.org>.
ephraimbuddy commented on a change in pull request #21709:
URL: https://github.com/apache/airflow/pull/21709#discussion_r811649582



##########
File path: airflow/providers/databricks/operators/databricks.py
##########
@@ -87,7 +87,10 @@ def _handle_databricks_operator_execution(operator, hook, log, context) -> None:
                     log.info('View run status, Spark UI, and logs at %s', run_page_url)
                     return
                 else:
-                    error_message = f'{operator.task_id} failed with terminal state: {run_state}'
+                    run_output = hook.get_run_output(operator.run_id)
+                    notebook_error = run_output['error']
+                    error_message = f'{operator.task_id} failed with terminal state: {run_state} ' \

Review comment:
       It'll make sense to add a test for this
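One way such a test could look, as a rough sketch rather than the test that was eventually added. `raise_for_failed_run` is a hypothetical stand-in for the failure branch in the diff above, `RuntimeError` stands in for AirflowException so the example runs without Airflow installed, and the message wording after the terminal state is assumed because the diff is truncated.

```python
import unittest
from unittest import mock


def raise_for_failed_run(operator, hook, run_state):
    """Hypothetical stand-in for the failure branch shown in the diff."""
    run_output = hook.get_run_output(operator.run_id)
    notebook_error = run_output['error']
    # RuntimeError replaces AirflowException here to keep the sketch self-contained.
    raise RuntimeError(
        f'{operator.task_id} failed with terminal state: {run_state} '
        f'and with the error {notebook_error}'
    )


class TestFailureMessage(unittest.TestCase):
    def test_message_contains_notebook_error(self):
        # The hook is mocked, so no Databricks API call is made.
        hook = mock.MagicMock()
        hook.get_run_output.return_value = {'error': 'ValueError: boom'}
        operator = mock.MagicMock(task_id='my_task', run_id=42)

        with self.assertRaises(RuntimeError) as ctx:
            raise_for_failed_run(operator, hook, 'FAILED')

        # The run output is fetched once, for the operator's own run.
        hook.get_run_output.assert_called_once_with(42)
        self.assertIn('ValueError: boom', str(ctx.exception))
```

The test checks two things: that the run output is requested for the right run, and that the notebook error text ends up in the raised message.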







[GitHub] [airflow] akifcakir commented on pull request #21709: Add-showing-runtime-error-feature-to-DatabricksSubmitRunOperator

Posted by GitBox <gi...@apache.org>.
akifcakir commented on pull request #21709:
URL: https://github.com/apache/airflow/pull/21709#issuecomment-1054343877


   @potiuk I rebased the PR, can you please run the workflows?





[GitHub] [airflow] akifcakir commented on pull request #21709: Add-showing-runtime-error-feature-to-DatabricksSubmitRunOperator

Posted by GitBox <gi...@apache.org>.
akifcakir commented on pull request #21709:
URL: https://github.com/apache/airflow/pull/21709#issuecomment-1055552830


   Hi @potiuk, I ran pre-commit and have already pushed the changes; can you please run the workflows?





[GitHub] [airflow] akifcakir commented on pull request #21709: Add-showing-runtime-error-feature-to-DatabricksSubmitRunOperator

Posted by GitBox <gi...@apache.org>.
akifcakir commented on pull request #21709:
URL: https://github.com/apache/airflow/pull/21709#issuecomment-1055679977


   Hi @potiuk, I ran pre-commit run --all-files again, since the first run did not catch one of the errors; can you please run the workflows?

