Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/06/03 21:36:43 UTC

[GitHub] [airflow] ahmad-maruf opened a new issue #9127: AttributeError: 'EMR' object has no attribute 'get_cluster_id_by_name' (Airflow 1.10.10)

ahmad-maruf opened a new issue #9127:
URL: https://github.com/apache/airflow/issues/9127


   A bug in the latest stable version of Airflow (1.10.10) makes `EmrAddStepsOperator` fail with the following `AttributeError` when it is given a `job_flow_name` instead of a `job_flow_id`:
   ```
   [2020-06-03 18:05:06,862] {taskinstance.py:1145} ERROR - 'EMR' object has no attribute 'get_cluster_id_by_name'
   Traceback (most recent call last):
     File "/home/ubuntu/.pyenv/versions/3.7.7/envs/.venv_python377/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 983, in _run_raw_task
       result = task_copy.execute(context=context)
     File "/home/ubuntu/.pyenv/versions/3.7.7/envs/.venv_python377/lib/python3.7/site-packages/airflow/contrib/operators/emr_add_steps_operator.py", line 74, in execute
       job_flow_id = emr.get_cluster_id_by_name(self.job_flow_name, self.cluster_states)
     File "/home/ubuntu/.pyenv/versions/3.7.7/envs/.venv_python377/lib/python3.7/site-packages/botocore/client.py", line 575, in __getattr__
       self.__class__.__name__, item)
   AttributeError: 'EMR' object has no attribute 'get_cluster_id_by_name'
   [2020-06-03 18:05:06,864] {taskinstance.py:1202} INFO - Marking task as FAILED.dag_id=my_spark_job_dag_id, task_id=my_spark_job_emr_add_step_id, execution_date=20200603T180500, start_date=20200603T180506, end_date=20200603T180506
   [2020-06-03 18:05:16,153] {logging_mixin.py:112} INFO - [2020-06-03 18:05:16,153] {local_task_job.py:103} INFO - Task exited with return code 1
   ```
   After digging through the library code, I traced the bug to this line: https://github.com/apache/airflow/blob/b099571b9af739c5a96e7aed41be9f22912a3443/airflow/contrib/operators/emr_add_steps_operator.py#L74
   
   The root cause is that the `botocore.client.EMR` object has no attribute `get_cluster_id_by_name`. That method belongs to the `airflow.contrib.hooks.emr_hook.EmrHook` object instead.
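
The mismatch can be reproduced without AWS or Airflow installed. This is only a minimal sketch: `FakeEmrClient` and `FakeEmrHook` are hypothetical stand-ins for the botocore client and for `EmrHook`, and the returned cluster id is made up. It shows why calling the method on the client fails while calling it on the hook works:

```python
class FakeEmrClient:
    """Stand-in for the botocore EMR client (hypothetical, for illustration)."""

    def __getattr__(self, item):
        # Only reached for attributes that do not exist, like the real client.
        raise AttributeError("'EMR' object has no attribute '%s'" % item)


class FakeEmrHook:
    """Stand-in for EmrHook: the name lookup lives on the hook, not the client."""

    def get_conn(self):
        return FakeEmrClient()

    def get_cluster_id_by_name(self, emr_cluster_name, cluster_states):
        # Made-up fixed answer; a real hook would query EMR here.
        return "j-EXAMPLE"


def resolve_buggy(hook, name, states):
    # Mirrors the failing call: the method is looked up on the client.
    emr = hook.get_conn()
    return emr.get_cluster_id_by_name(name, states)


def resolve_fixed(hook, name, states):
    # Looks the method up on the hook, where it is actually defined.
    return hook.get_cluster_id_by_name(name, states)
```

Calling `resolve_buggy` raises the same kind of `AttributeError` as in the traceback, while `resolve_fixed` returns the cluster id.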
   
   Compare this with the **corrected** code on the `master` branch (Airflow 2.0.0dev):
   https://github.com/apache/airflow/blob/ff5dcccbbd49e7a4632f93fa915565ac31730110/airflow/providers/amazon/aws/operators/emr_add_steps.py#L77
   
   Until this is fixed, the user is forced to pass `job_flow_id` directly when instantiating `EmrAddStepsOperator`, which in my opinion is not best practice.
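
As a stopgap, the cluster id could be resolved by name outside the operator and then passed in as `job_flow_id`. The following is a rough sketch, assuming a boto3 EMR client and its `list_clusters` call; the helper name `find_cluster_id` is hypothetical, and only the first page of results is inspected:

```python
def find_cluster_id(emr_client, cluster_name, cluster_states):
    """Return the id of the single cluster named `cluster_name`, or None.

    `emr_client` is assumed to behave like boto3.client("emr").
    Pagination is ignored here, so only the first page is searched.
    """
    response = emr_client.list_clusters(ClusterStates=cluster_states)
    matches = [
        cluster["Id"]
        for cluster in response.get("Clusters", [])
        if cluster["Name"] == cluster_name
    ]
    if len(matches) > 1:
        # Refuse to guess when the name is ambiguous.
        raise ValueError("More than one cluster named %r" % cluster_name)
    return matches[0] if matches else None
```

The returned value would then be supplied as the `job_flow_id` argument of `EmrAddStepsOperator`, which avoids the failing lookup by name.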
   
   apache-airflow    1.10.10
   boto              2.49.0
   boto3             1.13.18
   botocore          1.16.21
   Python            3.7.7
   
   If this issue has already been fixed in Airflow 1.10.10 somehow, please provide instructions as I'm not aware of it. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [airflow] ashb closed issue #9127: AttributeError: 'EMR' object has no attribute 'get_cluster_id_by_name' (Airflow 1.10.10)

Posted by GitBox <gi...@apache.org>.
ashb closed issue #9127:
URL: https://github.com/apache/airflow/issues/9127


   



[GitHub] [airflow] boring-cyborg[bot] commented on issue #9127: AttributeError: 'EMR' object has no attribute 'get_cluster_id_by_name' (Airflow 1.10.10)

Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #9127:
URL: https://github.com/apache/airflow/issues/9127#issuecomment-638474984


   Thanks for opening your first issue here! Be sure to follow the issue template!
   

