Posted to commits@airflow.apache.org by GitBox <gi...@apache.org> on 2020/04/06 12:29:18 UTC
[GitHub] [airflow] BrunoDamacena opened a new issue #8160:
SimpleHttpOperator aborts connection after 5 minutes
BrunoDamacena opened a new issue #8160: SimpleHttpOperator aborts connection after 5 minutes
URL: https://github.com/apache/airflow/issues/8160
**Apache Airflow version**: v1.10.4
**Kubernetes version (if you are using kubernetes)** (use `kubectl version`):
**Environment**: [puckel/docker-airflow](https://github.com/puckel/docker-airflow)
**What happened**:
The connection for the HTTP request to the API is aborted after 5 minutes.
I'm trying to run a long-running request, but every time this error occurs:
```
[2020-04-06 11:44:11,217] {{logging_mixin.py:95}} INFO - [2020-04-06 11:44:11,217] {{http_hook.py:131}} INFO - Sending 'POST' to url: {api_url_here}
[2020-04-06 11:49:19,249] {{logging_mixin.py:95}} WARNING - /usr/local/lib/python3.7/site-packages/airflow/hooks/http_hook.py:181: DeprecationWarning: The 'warn' method is deprecated, use 'warning' instead
  self.log.warn(str(ex) + ' Tenacity will retry to execute the operation')
[2020-04-06 11:49:19,250] {{logging_mixin.py:95}} INFO - [2020-04-06 11:49:19,250] {{http_hook.py:181}} WARNING - ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response')) Tenacity will retry to execute the operation
[2020-04-06 11:49:19,250] {{taskinstance.py:1047}} ERROR - ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 603, in urlopen
chunked=chunked)
File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 387, in _make_request
six.raise_from(e, None)
File "<string>", line 2, in raise_from
File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 383, in _make_request
httplib_response = conn.getresponse()
File "/usr/local/lib/python3.7/http/client.py", line 1336, in getresponse
response.begin()
File "/usr/local/lib/python3.7/http/client.py", line 306, in begin
version, status, reason = self._read_status()
File "/usr/local/lib/python3.7/http/client.py", line 275, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
http.client.RemoteDisconnected: Remote end closed connection without response
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/requests/adapters.py", line 449, in send
timeout=timeout
File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 641, in urlopen
_stacktrace=sys.exc_info()[2])
File "/usr/local/lib/python3.7/site-packages/urllib3/util/retry.py", line 368, in increment
raise six.reraise(type(error), error, _stacktrace)
File "/usr/local/lib/python3.7/site-packages/urllib3/packages/six.py", line 685, in reraise
raise value.with_traceback(tb)
File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 603, in urlopen
chunked=chunked)
File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 387, in _make_request
six.raise_from(e, None)
File "<string>", line 2, in raise_from
File "/usr/local/lib/python3.7/site-packages/urllib3/connectionpool.py", line 383, in _make_request
httplib_response = conn.getresponse()
File "/usr/local/lib/python3.7/http/client.py", line 1336, in getresponse
response.begin()
File "/usr/local/lib/python3.7/http/client.py", line 306, in begin
version, status, reason = self._read_status()
File "/usr/local/lib/python3.7/http/client.py", line 275, in _read_status
raise RemoteDisconnected("Remote end closed connection without"
urllib3.exceptions.ProtocolError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/usr/local/lib/python3.7/site-packages/airflow/models/taskinstance.py", line 922, in _run_raw_task
result = task_copy.execute(context=context)
File "/usr/local/lib/python3.7/site-packages/airflow/operators/http_operator.py", line 92, in execute
self.extra_options)
File "/usr/local/lib/python3.7/site-packages/airflow/hooks/http_hook.py", line 132, in run
return self.run_and_check(session, prepped_request, extra_options)
File "/usr/local/lib/python3.7/site-packages/airflow/hooks/http_hook.py", line 182, in run_and_check
raise ex
File "/usr/local/lib/python3.7/site-packages/airflow/hooks/http_hook.py", line 174, in run_and_check
allow_redirects=extra_options.get("allow_redirects", True))
File "/usr/local/lib/python3.7/site-packages/requests/sessions.py", line 646, in send
r = adapter.send(request, **kwargs)
File "/usr/local/lib/python3.7/site-packages/requests/adapters.py", line 498, in send
raise ConnectionError(err, request=request)
requests.exceptions.ConnectionError: ('Connection aborted.', RemoteDisconnected('Remote end closed connection without response'))
```
**What you expected to happen**:
The operator should wait for the response without timing out.
**How to reproduce it**:
Run a long-running request (longer than 5 minutes) using SimpleHttpOperator.
**Anything else we need to know**:
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
With regards,
Apache Git Services
[GitHub] [airflow] khyurri edited a comment on issue #8160:
SimpleHttpOperator aborts connection after 5 minutes
Posted by GitBox <gi...@apache.org>.
khyurri edited a comment on issue #8160: SimpleHttpOperator aborts connection after 5 minutes
URL: https://github.com/apache/airflow/issues/8160#issuecomment-610460054
It seems that the server to which SimpleHttpOperator connects breaks the connection after 5 minutes.
I've tried to reproduce this bug using a simple Flask app:
```python
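from flask import Flask
from time import monotonic, sleep

app = Flask(__name__)


@app.route('/')
def hello_world():
    # Hold the request open for just over 5 minutes (302 s) before replying.
    t0 = monotonic()
    sleep(302)
    t1 = monotonic()
    # Report how long the handler actually took.
    return "{}".format(t1 - t0)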
```
Everything works great:
```
[2020-04-07 18:28:08,024] {http_operator.py:87} INFO - Calling HTTP method
[2020-04-07 18:28:08,046] {logging_mixin.py:112} INFO - [2020-04-07 18:28:08,045] {base_hook.py:87} INFO - Using connection to: id: http_default. Host: http://127.0.0.1:5000, Port: None, Schema: None, Login: None, Password: None, extra: None
[2020-04-07 18:28:08,051] {logging_mixin.py:112} INFO - [2020-04-07 18:28:08,050] {http_hook.py:136} INFO - Sending 'GET' to url: http://127.0.0.1:5000/
[2020-04-07 18:33:10,086] {taskinstance.py:1065} INFO - Marking task as SUCCESS.dag_id=test_dag_v2, task_id=run_this_1, execution_date=20200405T000000, start_date=20200407T152807, end_date=20200407T153310
[2020-04-07 18:33:10,396] {logging_mixin.py:112} INFO - [2020-04-07 18:33:10,395] {local_task_job.py:103} INFO - Task exited with return code 0
```
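For reference, the `RemoteDisconnected` error in the original traceback is what Python's `http.client` raises when the peer closes the socket before sending any status line. Below is a minimal standard-library sketch of that failure mode. It assumes a hypothetical local server that hangs up after a short delay (0.2 s stands in for the 5 minutes), and the function names are illustrative, not part of Airflow:

```python
import http.client
import socket
import threading
import time


def serve_once_and_hang_up(server_sock, delay_seconds):
    """Accept one connection, wait, then close it without sending a reply."""
    conn, _ = server_sock.accept()
    with conn:
        time.sleep(delay_seconds)  # stand-in for the 5-minute cutoff
        conn.recv(65536)           # drain the request so the close is a clean EOF
    # Leaving the `with` block closes the socket: no status line is ever sent.


def request_against_dropping_server(delay_seconds=0.2):
    """Return the exception a client sees when its peer hangs up silently."""
    server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server_sock.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    server_sock.listen(1)
    port = server_sock.getsockname()[1]

    worker = threading.Thread(
        target=serve_once_and_hang_up, args=(server_sock, delay_seconds)
    )
    worker.start()

    client = http.client.HTTPConnection("127.0.0.1", port, timeout=10)
    try:
        client.request("POST", "/")
        client.getresponse()  # blocks until the peer closes the connection
    except http.client.RemoteDisconnected as exc:
        return exc
    finally:
        client.close()
        worker.join()
        server_sock.close()
    return None


if __name__ == "__main__":
    print(repr(request_against_dropping_server()))
```

The sketch only shows what the client observes when the other end closes the connection. It does not establish whether, in the reported case, the API itself or something between Airflow and the API closed it.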
[GitHub] [airflow] ashb commented on issue #8160: SimpleHttpOperator aborts
connection after 5 minutes
Posted by GitBox <gi...@apache.org>.
ashb commented on issue #8160: SimpleHttpOperator aborts connection after 5 minutes
URL: https://github.com/apache/airflow/issues/8160#issuecomment-611228926
Are you running with the SequentialExecutor?
[GitHub] [airflow] ashb commented on issue #8160: SimpleHttpOperator aborts
connection after 5 minutes
Posted by GitBox <gi...@apache.org>.
ashb commented on issue #8160: SimpleHttpOperator aborts connection after 5 minutes
URL: https://github.com/apache/airflow/issues/8160#issuecomment-611736250
This one appears to be something specific to your environment or the request you are making, as I can't reproduce this behaviour, nor could khyurri.
Without more detailed reproduction steps that we can try ourselves, we won't be able to help with this one.
[GitHub] [airflow] boring-cyborg[bot] commented on issue #8160:
SimpleHttpOperator aborts connection after 5 minutes
Posted by GitBox <gi...@apache.org>.
boring-cyborg[bot] commented on issue #8160: SimpleHttpOperator aborts connection after 5 minutes
URL: https://github.com/apache/airflow/issues/8160#issuecomment-609764175
Thanks for opening your first issue here! Be sure to follow the issue template!
[GitHub] [airflow] BrunoDamacena commented on issue #8160:
SimpleHttpOperator aborts connection after 5 minutes
Posted by GitBox <gi...@apache.org>.
BrunoDamacena commented on issue #8160: SimpleHttpOperator aborts connection after 5 minutes
URL: https://github.com/apache/airflow/issues/8160#issuecomment-611502203
> Are you running with the SequentialExecutor?
No, I'm using CeleryExecutor
[GitHub] [airflow] BrunoDamacena commented on issue #8160:
SimpleHttpOperator aborts connection after 5 minutes
Posted by GitBox <gi...@apache.org>.
BrunoDamacena commented on issue #8160: SimpleHttpOperator aborts connection after 5 minutes
URL: https://github.com/apache/airflow/issues/8160#issuecomment-610943502
I don't think it is a server issue, because I made the same request in Postman and it worked.
I noticed that when the request starts, the following message appears in the Airflow GUI:
```
The scheduler does not appear to be running. Last heartbeat was received 2 minutes ago.
The DAGs list may not update, and new tasks will not be scheduled.
```