Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2021/03/18 19:50:34 UTC
[GitHub] [airflow] prabcs opened a new issue #14885: Dataproc cluster create operator failing in diagnose_cluster step
prabcs opened a new issue #14885:
URL: https://github.com/apache/airflow/issues/14885
**Apache Airflow version**: 2.0
**Kubernetes version (if you are using kubernetes)** (use `kubectl version`):
**Environment**:
- **Cloud provider or hardware configuration**: GCP
- **OS** (e.g. from /etc/os-release):
- **Kernel** (e.g. `uname -a`):
- **Install tools**:
- **Others**:
**What happened**:
`DataprocCreateClusterOperator` fails in the [`diagnose_cluster` step](https://github.com/apache/airflow/blob/16f43605f3370f20611ba9e08b568ff8a7cd433d/airflow/providers/google/cloud/operators/dataproc.py#L564).
<details><summary>TypeError: Could not convert Any to Empty
</summary>
```
[2021-02-05 05:07:31,668] {taskinstance.py:1396} ERROR - Could not convert Any to Empty
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1086, in _run_raw_task
    self._prepare_and_execute_task_with_callbacks(context, task)
  File "/usr/local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1260, in _prepare_and_execute_task_with_callbacks
    result = self._execute_task(context, task_copy)
  File "/usr/local/lib/python3.8/site-packages/airflow/models/taskinstance.py", line 1300, in _execute_task
    result = task_copy.execute(context=context)
  File "/home/airflow/nfs/dags/catwalk/monitor_dag_try.py", line 79, in execute
    self._handle_error_state(hook, cluster)
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/google/cloud/operators/dataproc.py", line 568, in _handle_error_state
    gcs_uri = hook.diagnose_cluster(
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/google/common/hooks/base_google.py", line 425, in inner_wrapper
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/airflow/providers/google/cloud/hooks/dataproc.py", line 390, in diagnose_cluster
    operation.result()
  File "/usr/local/lib/python3.8/site-packages/google/api_core/future/polling.py", line 129, in result
    self._blocking_poll(timeout=timeout, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/google/api_core/future/polling.py", line 107, in _blocking_poll
    retry_(self._done_or_raise)(**kwargs)
  File "/usr/local/lib/python3.8/site-packages/google/api_core/retry.py", line 281, in retry_wrapped_func
    return retry_target(
  File "/usr/local/lib/python3.8/site-packages/google/api_core/retry.py", line 184, in retry_target
    return target()
  File "/usr/local/lib/python3.8/site-packages/google/api_core/future/polling.py", line 85, in _done_or_raise
    if not self.done(**kwargs):
  File "/usr/local/lib/python3.8/site-packages/google/api_core/operation.py", line 170, in done
    self._refresh_and_update(retry)
  File "/usr/local/lib/python3.8/site-packages/google/api_core/operation.py", line 159, in _refresh_and_update
    self._set_result_from_operation()
  File "/usr/local/lib/python3.8/site-packages/google/api_core/operation.py", line 130, in _set_result_from_operation
    response = protobuf_helpers.from_any_pb(
  File "/usr/local/lib/python3.8/site-packages/google/api_core/protobuf_helpers.py", line 69, in from_any_pb
    raise TypeError(
TypeError: Could not convert Any to Empty
```
</details>
**What you expected to happen**:
Ideally, the `diagnose_cluster` check should complete successfully. If it fails, it should be skipped and the cluster deletion should still be attempted.
**How to reproduce it**:
This was observed while building a workflow that automatically deletes the cluster when its creation fails because of a bad init script. Steps to reproduce:
1. Push a bad Dataproc init script.
2. Set the operator's parameter to delete the cluster on error.
3. Trigger the DAG.
4. Notice that the cluster is in an error state as a result of the bad init script.
5. Expect the cluster to be deleted on the next retry.
At step 5, after confirming that the cluster is in an error state and before deleting it, the Airflow 2.0 operator calls the `diagnose_cluster` hook, which hits the Dataproc API to diagnose the cluster. Judging by the traceback, the operation's response comes back as an `Any` that cannot be converted to the expected `Empty` type, and that's what causes the diagnosis to fail.
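A rough sketch of the fallback I would expect, in case it helps the discussion: the diagnosis is attempted, a `TypeError` from it is logged and skipped, and the deletion still runs. `StubHook` and `handle_error_state` are hypothetical stand-ins written for this illustration. They are not the real Airflow hook or operator code, and the stub only simulates the error from the traceback.

```python
import logging

log = logging.getLogger(__name__)


class StubHook:
    """Hypothetical stand-in for the Dataproc hook (assumption, not the Airflow class)."""

    def __init__(self):
        self.deleted = []

    def diagnose_cluster(self, cluster_name):
        # Simulate the failure reported in the traceback.
        raise TypeError("Could not convert Any to Empty")

    def delete_cluster(self, cluster_name):
        # Record the deletion instead of calling any real API.
        self.deleted.append(cluster_name)


def handle_error_state(hook, cluster_name):
    """Try to diagnose a failed cluster, then delete it whether or not diagnosis worked."""
    gcs_uri = None
    try:
        gcs_uri = hook.diagnose_cluster(cluster_name)
    except TypeError as err:
        # Diagnosis is best-effort: log the problem and carry on to the deletion.
        log.warning("Skipping diagnosis of %s: %s", cluster_name, err)
    hook.delete_cluster(cluster_name)
    return gcs_uri


hook = StubHook()
result = handle_error_state(hook, "bad-init-cluster")
```

With the stub above, `result` stays `None` because the diagnosis was skipped, and `hook.deleted` contains the cluster name, so the deletion was still attempted.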
**Anything else we need to know**:
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [airflow] turbaszek commented on issue #14885: Dataproc cluster create operator failing in diagnose_cluster step
Posted by GitBox <gi...@apache.org>.
turbaszek commented on issue #14885:
URL: https://github.com/apache/airflow/issues/14885#issuecomment-926820159
@eladkal that's also my understanding
[GitHub] [airflow] eladkal closed issue #14885: Dataproc cluster create operator failing in diagnose_cluster step
Posted by GitBox <gi...@apache.org>.
eladkal closed issue #14885:
URL: https://github.com/apache/airflow/issues/14885
[GitHub] [airflow] eladkal commented on issue #14885: Dataproc cluster create operator failing in diagnose_cluster step
Posted by GitBox <gi...@apache.org>.
eladkal commented on issue #14885:
URL: https://github.com/apache/airflow/issues/14885#issuecomment-925757915
I think this has been resolved upstream by @turbaszek's PR https://github.com/googleapis/python-dataproc/issues/51?
[GitHub] [airflow] boring-cyborg[bot] commented on issue #14885: Dataproc cluster create operator failing in diagnose_cluster step
Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #14885:
URL: https://github.com/apache/airflow/issues/14885#issuecomment-802242100
Thanks for opening your first issue here! Be sure to follow the issue template!