Posted to commits@airflow.apache.org by "jherrmannNetfonds (via GitHub)" <gi...@apache.org> on 2023/06/16 08:51:56 UTC
[GitHub] [airflow] jherrmannNetfonds commented on pull request #31798: fix spark-kubernetes-operator compatibility
jherrmannNetfonds commented on PR #31798:
URL: https://github.com/apache/airflow/pull/31798#issuecomment-1594346774
Hi, thanks for the PR.
I am testing this in production right now with Airflow 2.6.1 running on Kubernetes with the KubernetesExecutor. I encountered this error today:
```
[2023-06-16, 09:44:04 CEST] {taskinstance.py:1824} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.10/site-packages/urllib3/response.py", line 761, in _update_chunk_length
    self.chunk_left = int(line, 16)
ValueError: invalid literal for int() with base 16: b''
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.10/site-packages/urllib3/response.py", line 444, in _error_catcher
    yield
  File "/home/airflow/.local/lib/python3.10/site-packages/urllib3/response.py", line 828, in read_chunked
    self._update_chunk_length()
  File "/home/airflow/.local/lib/python3.10/site-packages/urllib3/response.py", line 765, in _update_chunk_length
    raise InvalidChunkLength(self, line)
urllib3.exceptions.InvalidChunkLength: InvalidChunkLength(got length b'', 0 bytes read)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/opt/airflow/dags/repo/packages/local_copy_of_this_pr/spark_kubernetes.py", line 122, in execute
    for line in pod_log_stream:
  File "/home/airflow/.local/lib/python3.10/site-packages/kubernetes/watch/watch.py", line 165, in stream
    for line in iter_resp_lines(resp):
  File "/home/airflow/.local/lib/python3.10/site-packages/kubernetes/watch/watch.py", line 56, in iter_resp_lines
    for seg in resp.stream(amt=None, decode_content=False):
  File "/home/airflow/.local/lib/python3.10/site-packages/urllib3/response.py", line 624, in stream
    for line in self.read_chunked(amt, decode_content=decode_content):
  File "/home/airflow/.local/lib/python3.10/site-packages/urllib3/response.py", line 816, in read_chunked
    with self._error_catcher():
  File "/usr/local/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/home/airflow/.local/lib/python3.10/site-packages/urllib3/response.py", line 461, in _error_catcher
    raise ProtocolError("Connection broken: %r" % e, e)
```
This fails the task, even though the Spark application keeps running and eventually succeeds. Maybe some of these error types should be caught instead of letting the pod fail. What do you think?
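To make the suggestion concrete, here is a rough sketch of what catching such errors could look like. It is not code from this PR: `stream_with_retries`, `open_stream`, `retry_on`, `max_retries` and `delay_seconds` are hypothetical names, and the helper is kept generic so it does not depend on how the operator actually opens the log stream.

```python
import time
from typing import Callable, Iterable, Iterator, Tuple, Type


def stream_with_retries(
    open_stream: Callable[[], Iterable[str]],
    retry_on: Tuple[Type[BaseException], ...],
    max_retries: int = 3,
    delay_seconds: float = 1.0,
) -> Iterator[str]:
    """Yield lines from open_stream(), reopening the stream on retry_on errors.

    Hypothetical helper: open_stream is assumed to return a fresh iterable
    of log lines every time it is called.
    """
    failures = 0
    while True:
        try:
            for line in open_stream():
                yield line
            return  # the stream ended normally
        except retry_on:
            failures += 1
            if failures > max_retries:
                raise  # give up and let the caller decide what to do
            time.sleep(delay_seconds)  # short pause before reconnecting
```

In the operator this might be called with `retry_on=(urllib3.exceptions.ProtocolError,)`, since that is the exception raised at the end of the traceback above. Note that a reopened stream may replay lines that were already logged, so the final task state should probably still come from the Spark application status rather than from the log stream.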