Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/01/19 13:13:22 UTC

[GitHub] [airflow] karthyk16 opened a new issue #20949: Spark Submit Operator reporting job failure. Not able to poll the successful completion of Spark Job.

karthyk16 opened a new issue #20949:
URL: https://github.com/apache/airflow/issues/20949


   ### Apache Airflow Provider(s)
   
   apache-spark
   
   ### Versions of Apache Airflow Providers
   
   2.0.3. Same issue in 2.0.1 as well.
   
   ### Apache Airflow version
   
   2.1.2
   
   ### Operating System
   
   Red Hat Enterprise Linux
   
   ### Deployment
   
   Other
   
   ### Deployment details
   
   _No response_
   
   ### What happened
   
   We use the Spark Submit operator in Airflow to submit a Spark job in Standalone cluster mode. The job is submitted successfully, and we set `spark.standalone.submit.waitAppCompletion = true` in the conf so that the client waits for the application to complete. After the application completes successfully and the executors terminate, Airflow fails to poll the driver status and marks the task as failed, even though the job finished successfully. The application status in Spark History shows Finished.
   
   [2022-01-18, 20:49:37 UTC] {spark_submit.py:523} INFO - 22/01/18 20:49:37 INFO ClientEndpoint: State of driver driver-20220118204802-0000 is FINISHED, exiting spark-submit JVM.
   [2022-01-18, 20:49:37 UTC] {spark_submit.py:523} INFO - 22/01/18 20:49:37 INFO ShutdownHookManager: Shutdown hook called
   [2022-01-18, 20:49:37 UTC] {spark_submit.py:523} INFO - 22/01/18 20:49:37 INFO ShutdownHookManager: Deleting directory /tmp/spark-1b7f430e-2cb8-493e-83b3-07b1a9c5728b
   [2022-01-18, 20:49:38 UTC] {spark_submit.py:456} DEBUG - Should track driver: True
   .
   .
   .
   .
   [2022-01-18, 20:50:08 UTC] {spark_submit.py:587} DEBUG - polling status of spark driver with id driver-20220118204802-0000
   [2022-01-18, 20:50:08 UTC] {spark_submit.py:409} DEBUG - Poll driver status cmd: ['spark-submit', '--master', 'spark://ip-10-150-101-107.ap-south-1.compute.internal:7077', '--status', 'driver-20220118204802-0000']
   [2022-01-18, 20:50:10 UTC] {spark_submit.py:541} DEBUG - spark driver status log: 22/01/18 20:50:10 WARN RestSubmissionClient: Unable to connect to server spark://ip-14-1XXXX-127.ap-south-1.compute.internal:7077.
   [2022-01-18, 20:50:10 UTC] {spark_submit.py:541} DEBUG - spark driver status log: Exception in thread "main" org.apache.spark.deploy.rest.SubmitRestConnectionException: Unable to connect to server
   [2022-01-18, 20:50:10 UTC] {spark_submit.py:541} DEBUG - spark driver status log: at org.apache.spark.deploy.rest.RestSubmissionClient.$anonfun$requestSubmissionStatus$3(RestSubmissionClient.scala:163)
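
The trace above already contains the driver's terminal state (`State of driver ... is FINISHED`) before the later `--status` poll fails to connect. As a rough illustration only, the sketch below extracts that state from the spark-submit output. It is not the provider hook's implementation and not the fix mentioned in this thread; the function name `final_driver_state` and the regular expression are hypothetical, derived solely from the log line format shown above.

```python
import re
from typing import Iterable, Optional

# Hypothetical pattern, based only on the ClientEndpoint line in the trace above.
_DRIVER_STATE_RE = re.compile(
    r"State of driver (?P<driver_id>driver-[\w-]+) is (?P<state>[A-Z_]+)"
)


def final_driver_state(log_lines: Iterable[str], driver_id: str) -> Optional[str]:
    """Return the last state reported for driver_id, or None if none was seen."""
    state = None
    for line in log_lines:
        match = _DRIVER_STATE_RE.search(line)
        if match and match.group("driver_id") == driver_id:
            # Keep the most recent state; later lines override earlier ones.
            state = match.group("state")
    return state


lines = [
    "22/01/18 20:49:37 INFO ClientEndpoint: State of driver "
    "driver-20220118204802-0000 is FINISHED, exiting spark-submit JVM.",
    "22/01/18 20:49:37 INFO ShutdownHookManager: Shutdown hook called",
]
print(final_driver_state(lines, "driver-20220118204802-0000"))
```

A caller could use such a value to decide whether a subsequent status poll is needed at all; whether that is appropriate for the hook is a design question for the eventual PR.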
   
   
   ### What you expected to happen
   
   The Airflow hook should poll the driver status correctly and mark the task accordingly: failed on a genuine failure, and successful when the job completes successfully.
   
   ### How to reproduce
   
   Create an Airflow DAG with the Spark Submit operator. Set `spark.standalone.submit.waitAppCompletion = true` in the conf so that the client waits for the application to complete. Enable debug logging in Airflow to see the log trace.
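
The reproduction steps above might look roughly like the sketch below. Everything except the conf key is an assumption for illustration: the DAG id, task id, the connection id `spark_standalone` (assumed to point at the `spark://` master with cluster deploy mode), and the application path are placeholders, not values from the report. The Airflow imports are kept inside the function so the module can be read without Airflow installed.

```python
from datetime import datetime

# The only setting taken from the report; everything else below is a placeholder.
SPARK_CONF = {"spark.standalone.submit.waitAppCompletion": "true"}


def build_dag():
    """Build a minimal DAG with one SparkSubmitOperator task (illustrative only)."""
    from airflow import DAG
    from airflow.providers.apache.spark.operators.spark_submit import (
        SparkSubmitOperator,
    )

    with DAG(
        dag_id="spark_submit_repro",      # hypothetical name
        start_date=datetime(2022, 1, 1),
        schedule_interval=None,           # trigger manually
        catchup=False,
    ) as dag:
        SparkSubmitOperator(
            task_id="submit_job",         # hypothetical name
            conn_id="spark_standalone",   # assumed connection to the standalone master
            application="/path/to/your/application.jar",  # placeholder path
            conf=SPARK_CONF,
        )
    return dag
```

In an actual DAG file, `dag = build_dag()` would be called at module level so the scheduler can discover the DAG.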
   
   ### Anything else
   
   This happens every time, and it's a blocker for using the Spark Submit operator.
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] potiuk commented on issue #20949: Spark Submit Operator reporting job failure. Not able to poll the successful completion of Spark Job.

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #20949:
URL: https://github.com/apache/airflow/issues/20949#issuecomment-1019552955


   > > Assigned you @karthyk16 - feel free to fix it!
   > 
   > Thanks @potiuk : Our team member has fixed it using a custom operator. Our team member will raise the PR for this issue in the next couple of days. Hope that's fine.
   
   Perfect :)





[GitHub] [airflow] boring-cyborg[bot] commented on issue #20949: Spark Submit Operator reporting job failure. Not able to poll the successful completion of Spark Job.

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #20949:
URL: https://github.com/apache/airflow/issues/20949#issuecomment-1016453003


   Thanks for opening your first issue here! Be sure to follow the issue template!
   





[GitHub] [airflow] potiuk commented on issue #20949: Spark Submit Operator reporting job failure. Not able to poll the successful completion of Spark Job.

Posted by GitBox <gi...@apache.org>.
potiuk commented on issue #20949:
URL: https://github.com/apache/airflow/issues/20949#issuecomment-1019531071


   Assigned you @karthyk16  - feel free to fix it!





[GitHub] [airflow] karthyk16 commented on issue #20949: Spark Submit Operator reporting job failure. Not able to poll the successful completion of Spark Job.

Posted by GitBox <gi...@apache.org>.
karthyk16 commented on issue #20949:
URL: https://github.com/apache/airflow/issues/20949#issuecomment-1019552019


   > Assigned you @karthyk16 - feel free to fix it!
   
   Thanks @potiuk  : Our team member has fixed it using a custom operator. Our team member will raise the PR for this issue in the next couple of days. Hope that's fine.

