Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2022/09/07 09:17:02 UTC

[GitHub] [airflow] nsuthar-lumiq opened a new issue, #26196: Amazon Glue job providers not printing log when job get completed or failed.

nsuthar-lumiq opened a new issue, #26196:
URL: https://github.com/apache/airflow/issues/26196

   ### Apache Airflow Provider(s)
   
   amazon
   
   ### Versions of Apache Airflow Providers
   
   5.0.0
   
   ### Apache Airflow version
   
   >= 2.3.2
   
   ### Operating System
   
   Linux
   
   ### Deployment
   
   Docker-Compose
   
   ### Deployment details
   
   docker-compose 
   
   ### What happened
   
   The **job_completion** method of **GlueJobHook** calls **print_job_logs** in a `finally` clause that is never reached when the job completes or fails. When the job completes, the method returns its value with a `return` statement (which will not execute the `finally` block), and when the job fails it raises an exception, which will also not execute the `finally` block.
   As a result, Airflow does not show the Glue job logs from CloudWatch.
   
   ```
   def job_completion(self, job_name: str, run_id: str, verbose: bool = False) -> Dict[str, str]:
       """
       Waits until Glue job with job_name completes or
       fails and return final state if finished.
       Raises AirflowException when the job failed
       :param job_name: unique job name per AWS account
       :param run_id: The job-run ID of the predecessor job run
       :param verbose: If True, more Glue Job Run logs show in the Airflow Task Logs.  (default: False)
       :return: Dict of JobRunState and JobRunId
       """
       failed_states = ['FAILED', 'TIMEOUT']
       finished_states = ['SUCCEEDED', 'STOPPED']
       next_log_token = None
       job_failed = False

       while True:
           try:
               job_run_state = self.get_job_state(job_name, run_id)
               if job_run_state in finished_states:
                   self.log.info('Exiting Job %s Run State: %s', run_id, job_run_state)
                   return {'JobRunState': job_run_state, 'JobRunId': run_id}
               if job_run_state in failed_states:
                   job_failed = True
                   job_error_message = f'Exiting Job {run_id} Run State: {job_run_state}'
                   self.log.info(job_error_message)
                   raise AirflowException(job_error_message)
               else:
                   self.log.info(
                       'Polling for AWS Glue Job %s current run state with status %s',
                       job_name,
                       job_run_state,
                   )
                   time.sleep(self.JOB_POLL_INTERVAL)
           finally:
               if verbose:
                   next_log_token = self.print_job_logs(
                       job_name=job_name,
                       run_id=run_id,
                       job_failed=job_failed,
                       next_token=next_log_token,
                   )
   ```
   
   ### What you think should happen instead
   
   It should print the logs in all cases (failure or success) when `verbose=True`.
   
   ### How to reproduce
   
   Use the latest version of the amazon provider (5.0.0) and create an Airflow task for any Glue job.
   Make sure to pass `verbose=True` to **GlueJobOperator**, as below:
   
   ```
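   from airflow.providers.amazon.aws.operators.glue import GlueJobOperator

   job_task = GlueJobOperator(
       task_id="testJob",
       job_name="<Glue job name>",
       job_desc="Test Job",
       region_name="ap-south-1",
       verbose=True,  # request the Glue job run logs in the Airflow task log
       script_location="s3://..//../",
       num_of_dpus=2,
   )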
   ```
   
   ### Anything else
   
   This is a code bug that causes this issue every time. I will provide a fix, along with an enhancement for continuous logging.
   
   ### Are you willing to submit PR?
   
   - [X] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@airflow.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [airflow] o-nikolas commented on issue #26196: Amazon Glue job providers not printing log when job get completed or failed.

Posted by "o-nikolas (via GitHub)" <gi...@apache.org>.
o-nikolas commented on issue #26196:
URL: https://github.com/apache/airflow/issues/26196#issuecomment-1421532592

   Not seeing any action on this one. If the issue arises again, we can re-open.




[GitHub] [airflow] o-nikolas closed issue #26196: Amazon Glue job providers not printing log when job get completed or failed.

Posted by "o-nikolas (via GitHub)" <gi...@apache.org>.
o-nikolas closed issue #26196: Amazon Glue job providers not printing log when job get completed or failed.
URL: https://github.com/apache/airflow/issues/26196




[GitHub] [airflow] o-nikolas commented on issue #26196: Amazon Glue job providers not printing log when job get completed or failed.

Posted by GitBox <gi...@apache.org>.
o-nikolas commented on issue #26196:
URL: https://github.com/apache/airflow/issues/26196#issuecomment-1281579978

   > when job get completed successfully it return value by using return statement (that will not execute finally block) and similarly in case of failure it will raise exception that will also not execute the finally block.
   
   Is this really true? From Python docs [here](https://docs.python.org/3/tutorial/errors.html#defining-clean-up-actions):
   > If a [finally](https://docs.python.org/3/reference/compound_stmts.html#finally) clause is present, the finally clause will execute as the last task before the [try](https://docs.python.org/3/reference/compound_stmts.html#try) statement completes. The finally clause runs whether or not the try statement produces an exception. The following points discuss more complex cases when an exception occurs:
   > - If an exception occurs during execution of the try clause, the exception may be handled by an [except](https://docs.python.org/3/reference/compound_stmts.html#except) clause. If the exception is not handled by an except clause, the exception is re-raised after the finally clause has been executed.
   > - An exception could occur during execution of an except or else clause. Again, the exception is re-raised after the finally clause has been executed.
   > - If the try statement reaches a [break](https://docs.python.org/3/reference/simple_stmts.html#break), [continue](https://docs.python.org/3/reference/simple_stmts.html#continue) or [return](https://docs.python.org/3/reference/simple_stmts.html#return) statement, the finally clause will execute just prior to the break, continue or return statement’s execution.
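
The behaviour quoted from the Python docs can be checked with a minimal, self-contained sketch. The function names `returns_early` and `raises` are hypothetical and chosen only for illustration; nothing here comes from the Airflow code base. Each function records the order in which its `try` and `finally` blocks run.

```python
def returns_early(events):
    """Leave the try block through a return statement."""
    try:
        events.append("try")
        return "done"
    finally:
        # Runs just before the return above hands control back to the caller.
        events.append("finally")


def raises(events):
    """Leave the try block through an unhandled exception."""
    try:
        events.append("try")
        raise ValueError("boom")
    finally:
        # Runs first; the ValueError is re-raised afterwards.
        events.append("finally")


return_events = []
result = returns_early(return_events)

raise_events = []
try:
    raises(raise_events)
except ValueError as exc:
    raise_events.append(f"caught: {exc}")

print(result, return_events)
print(raise_events)
```

In both cases "finally" is recorded before control leaves the function, which matches the documentation quoted above.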
   




[GitHub] [airflow] boring-cyborg[bot] commented on issue #26196: Amazon Glue job providers not printing log when job get completed or failed.

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #26196:
URL: https://github.com/apache/airflow/issues/26196#issuecomment-1239129213

   Thanks for opening your first issue here! Be sure to follow the issue template!
   




[GitHub] [airflow] ferruzzi commented on issue #26196: Amazon Glue job providers not printing log when job get completed or failed.

Posted by GitBox <gi...@apache.org>.
ferruzzi commented on issue #26196:
URL: https://github.com/apache/airflow/issues/26196#issuecomment-1281615466

   As Niko said, the "finally" should be executed whether there is an exception or not:
   
   ![image](https://user-images.githubusercontent.com/1920178/196300056-c0d042a6-923b-44bb-be97-cd189348f113.png)
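
The same point can be sketched against the shape of the polling loop from the issue. This is a simplified simulation under stated assumptions, not the provider's implementation: `FakeGlueHook` is a hypothetical stand-in that replays a fixed list of states, raises `RuntimeError` in place of `AirflowException`, omits the sleep between polls, and only counts how often the log-printing step runs.

```python
class FakeGlueHook:
    """Hypothetical stand-in for illustration; not the real GlueJobHook."""

    def __init__(self, states):
        self._states = iter(states)  # canned run states, one per poll
        self.log_calls = []          # job_failed flag seen on each log call

    def get_job_state(self):
        return next(self._states)

    def print_job_logs(self, job_failed):
        self.log_calls.append(job_failed)

    def job_completion(self, verbose=False):
        job_failed = False
        while True:
            try:
                state = self.get_job_state()
                if state in ("SUCCEEDED", "STOPPED"):
                    return state
                if state in ("FAILED", "TIMEOUT"):
                    job_failed = True
                    raise RuntimeError(f"Run State: {state}")
                # Any other state: poll again (sleep omitted in this sketch).
            finally:
                # Reached on every iteration, including the returning
                # and the raising one.
                if verbose:
                    self.print_job_logs(job_failed)


ok_hook = FakeGlueHook(["RUNNING", "RUNNING", "SUCCEEDED"])
ok_state = ok_hook.job_completion(verbose=True)

bad_hook = FakeGlueHook(["RUNNING", "FAILED"])
try:
    bad_hook.job_completion(verbose=True)
    failure_message = None
except RuntimeError as exc:
    failure_message = str(exc)

print(ok_state, ok_hook.log_calls)
print(failure_message, bad_hook.log_calls)
```

In this simulation the log step runs once per poll: three times for the successful run and twice for the failed one, with `job_failed` set to `True` on the final call of the failed run.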
   




[GitHub] [airflow] o-nikolas commented on issue #26196: Amazon Glue job providers not printing log when job get completed or failed.

Posted by "o-nikolas (via GitHub)" <gi...@apache.org>.
o-nikolas commented on issue #26196:
URL: https://github.com/apache/airflow/issues/26196#issuecomment-1404011147

   Hey @nikhi-suthar,
   
   Is this still a relevant issue? This issue is quite old. I see a few linked PRs, but they've been closed.

