Posted to user@cassandra.apache.org by Alan Hamlett <al...@gmail.com> on 2018/01/01 20:14:46 UTC

Re: [python] [flask] [CQLAlchemy] NoHostAvailable on create

Still getting the cassandra.cluster.NoHostAvailable error periodically from
uWSGI hosts. Setting up the connection with postfork:
https://github.com/alanhamlett/flask-cqlalchemy/blob/653ed3298af7dd617a972e9f87437f6e53f741b9/flask_cqlalchemy/__init__.py#L56

lazy_connect is False and retry_connect is True. Could this be a bug in
cassandra-driver's connection pooling?
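
For readers following along, here is a minimal sketch of the per-worker setup described above. It is not the code at the linked commit. It assumes uwsgidecorators.postfork (only importable inside uWSGI, hence the fallback) and cassandra.cqlengine.connection.setup; CASSANDRA_HOSTS, CASSANDRA_KEYSPACE and connect_cassandra are hypothetical names, and the keyspace value is a placeholder.

```python
# Hypothetical settings for illustration; 10.1.2.3 is the node seen in the logs.
CASSANDRA_HOSTS = ["10.1.2.3"]
CASSANDRA_KEYSPACE = "my_keyspace"  # placeholder, not from the thread

try:
    # uwsgidecorators is only importable inside a uWSGI process.
    from uwsgidecorators import postfork
except ImportError:
    def postfork(func):
        """Fallback outside uWSGI: leave the function unchanged."""
        return func


@postfork
def connect_cassandra():
    """Open a fresh cqlengine connection inside each forked worker."""
    # Imported lazily so nothing driver-related is created before the fork.
    from cassandra.cqlengine import connection

    connection.setup(
        CASSANDRA_HOSTS,
        CASSANDRA_KEYSPACE,
        lazy_connect=False,
        retry_connect=True,
    )
```

Outside uWSGI the fallback decorator does not call the function, so connect_cassandra() would have to be called once at startup in that case.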

P.S. Blocking a web app when a connection isn't available (the default,
non-lazy connect) is really bad. With a web app you want requests that
don't depend on Cassandra to complete, but cassandra-driver blocks all
requests when there's no Cassandra connection, even if the current request
doesn't need one. This design decision gives me very low confidence in the
Python cassandra-driver.

On Sun, Dec 31, 2017 at 2:34 PM, Alan Hamlett <al...@gmail.com>
wrote:

> Thanks for the reply, I think it's related. However, after using a fork of
> Flask-CQLAlchemy with postfork I'm still getting the NoHostAvailable error
> once per 4k requests. One strange thing is the error rate doesn't increase
> with the number of requests, since some uWSGI clients with ~20k requests
> over the same time period have an error rate of once per 20k requests. Both
> uWSGI hosts have the same number of worker processes.
>
> *Flask-CQLAlchemy Fork with Patch:*
>
> https://github.com/alanhamlett/flask-cqlalchemy/tree/a7e5c7c7cf0c51a19be98791dd4c47b72b97d9be
>
> *Error Traceback seen after patch applied:*
>
> Failed to create connection pool for new host 10.1.2.3:
> Traceback (most recent call last):
>   File "cassandra/cluster.py", line 2452, in cassandra.cluster.Session.add_or_renew_pool.run_add_or_renew_pool
>   File "cassandra/pool.py", line 332, in cassandra.pool.HostConnection.__init__
>   File "cassandra/cluster.py", line 1195, in cassandra.cluster.Cluster.connection_factory
>   File "cassandra/connection.py", line 341, in cassandra.connection.Connection.factory
> cassandra.OperationTimedOut: errors=Timed out creating connection (5 seconds), last_host=None
> Traceback (most recent call last):
>   File "./venv/lib/python3.4/site-packages/flask/app.py", line 1982, in wsgi_app
>     response = self.full_dispatch_request()
>   File "./venv/lib/python3.4/site-packages/flask/app.py", line 1614, in full_dispatch_request
>     rv = self.handle_user_exception(e)
>   File "./venv/lib/python3.4/site-packages/flask/app.py", line 1517, in handle_user_exception
>     reraise(exc_type, exc_value, tb)
>   File "./venv/lib/python3.4/site-packages/flask/_compat.py", line 33, in reraise
>     raise value
>   File "./venv/lib/python3.4/site-packages/flask/app.py", line 1612, in full_dispatch_request
>     rv = self.dispatch_request()
>   File "./venv/lib/python3.4/site-packages/flask/app.py", line 1598, in dispatch_request
>     return self.view_functions[rule.endpoint](**req.view_args)
>   File "./app/api_utils.py", line 876, in get_durations
>     use_cassandra=use_cassandra,
>   File "./venv/lib/python3.4/site-packages/datadog/dogstatsd/context.py", line 53, in wrapped
>     return func(*args, **kwargs)
>   File "./app/api_utils.py", line 1339, in heartbeats_to_durations
>     for heartbeat in heartbeats:
>   File "./venv/lib/python3.4/site-packages/cassandra/cqlengine/query.py", line 512, in __iter__
>     self._execute_query()
>   File "./venv/lib/python3.4/site-packages/cassandra/cqlengine/query.py", line 469, in _execute_query
>     self._result_generator = (i for i in self._execute(self._select_query()))
>   File "./venv/lib/python3.4/site-packages/cassandra/cqlengine/query.py", line 401, in _execute
>     result = _execute_statement(self.model, statement, self._consistency, self._timeout, connection=connection)
>   File "./venv/lib/python3.4/site-packages/cassandra/cqlengine/query.py", line 1505, in _execute_statement
>     return conn.execute(s, params, timeout=timeout, connection=connection)
>   File "./venv/lib/python3.4/site-packages/cassandra/cqlengine/connection.py", line 341, in execute
>     result = conn.session.execute(query, params, timeout=timeout)
>   File "cassandra/cluster.py", line 2122, in cassandra.cluster.Session.execute
>   File "cassandra/cluster.py", line 3982, in cassandra.cluster.ResponseFuture.result
> cassandra.cluster.NoHostAvailable: ('Unable to complete the operation against any hosts', {})
>
> On Sun, Dec 31, 2017 at 9:04 AM, Jeff Jirsa <jj...@gmail.com> wrote:
>
>> uWSGI forks and the driver / cqlalchemy may need to reconnect or
>> otherwise fix the state after each fork - you could try to prove this is
>> the cause by checking uWSGI logs or ps for indication that a worker process
>> has exited/been recycled. If you think it may be related to this, check out
>> the uWSGI @postfork decorator.
>>
>>
>> --
>> Jeff Jirsa
>>
>>
>> On Dec 31, 2017, at 8:52 AM, Alan Hamlett <al...@gmail.com> wrote:
>>
>> More info: The NoHostAvailable error is happening at random times on each
>> client host, so it's probably a client error. If the Cassandra cluster were
>> really offline, all client hosts would report the error at the same time
>> instead of at different random times. The error occurs about once every
>> 30 minutes, so most requests call Model.create() without hitting it.
>>
>> On Sun, Dec 31, 2017 at 1:07 AM, Alan Hamlett <al...@gmail.com>
>> wrote:
>>
>>> I'm seeing tracebacks in my Python Flask app when creating rows:
>>>
>>> Traceback (most recent call last):
>>>   File "/opt/app/current/app/api.py", line 1174, in consume_heartbeat
>>>     Heartbeat.create(**form_data)
>>>   File "/opt/app/current/venv/lib/python3.4/site-packages/cassandra/cqlengine/models.py", line 672, in create
>>>     return cls.objects.create(**kwargs)
>>>   File "/opt/app/current/venv/lib/python3.4/site-packages/cassandra/cqlengine/query.py", line 977, in create
>>>     .using(connection=self._connection) \
>>>   File "/opt/app/current/venv/lib/python3.4/site-packages/cassandra/cqlengine/models.py", line 738, in save
>>>     if_exists=self._if_exists).save()
>>>   File "/opt/app/current/venv/lib/python3.4/site-packages/cassandra/cqlengine/query.py", line 1476, in save
>>>     self._execute(insert)
>>>   File "/opt/app/current/venv/lib/python3.4/site-packages/cassandra/cqlengine/query.py", line 1351, in _execute
>>>     results = _execute_statement(self.model, statement, self._consistency, self._timeout, connection=connection)
>>>   File "/opt/app/current/venv/lib/python3.4/site-packages/cassandra/cqlengine/query.py", line 1505, in _execute_statement
>>>     return conn.execute(s, params, timeout=timeout, connection=connection)
>>>   File "/opt/app/current/venv/lib/python3.4/site-packages/cassandra/cqlengine/connection.py", line 341, in execute
>>>     result = conn.session.execute(query, params, timeout=timeout)
>>>   File "cassandra/cluster.py", line 2122, in cassandra.cluster.Session.execute
>>>   File "cassandra/cluster.py", line 3982, in cassandra.cluster.ResponseFuture.result
>>> cassandra.cluster.NoHostAvailable: ('Unable to complete the operation against any hosts', {})
>>>
>>>
>>> I'm using the cassandra-driver client library 3.12.0 via
>>> Flask-CQLAlchemy 1.2.0 (https://github.com/thegeorgeous/flask-cqlalchemy)
>>> with uWSGI (https://github.com/unbit/uwsgi).
>>>
>>> cassandra.cqlengine.connection.setup is being passed lazy_connect=True
>>> and retry_connect=True because lazy_connect=False causes requests to
>>> the Flask app to time out for some reason.
>>>
>>> Also seeing these errors in my uWSGI log file:
>>>
>>> [control connection] Error connecting to 10.1.2.3:
>>> Traceback (most recent call last):
>>>   File "cassandra/cluster.py", line 2781, in cassandra.cluster.ControlConnection._reconnect_internal
>>>   File "cassandra/cluster.py", line 2803, in cassandra.cluster.ControlConnection._try_connect
>>>   File "cassandra/cluster.py", line 1195, in cassandra.cluster.Cluster.connection_factory
>>>   File "cassandra/connection.py", line 341, in cassandra.connection.Connection.factory
>>> cassandra.OperationTimedOut: errors=Timed out creating connection (5 seconds), last_host=None
>>>
>>>
>>> What's causing these connection and timeout errors? Something related to
>>> Flask-CQLAlchemy?
>>>
>>
>>
>>
>


-- 
Alan Hamlett
ahamlett.com

Re: [python] [flask] [CQLAlchemy] NoHostAvailable on create

Posted by Jeff Jirsa <jj...@gmail.com>.

The WARN is a hint that you’ve got tombstones: maybe not a big deal, but a hint about your data model. It’s not causing this error.
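
As a rough illustration of why that WARN line fired, the sketch below parses the line quoted in this thread and compares the tombstone count with a threshold. The value 1000 is an assumption (the stock cassandra.yaml default for tombstone_warn_threshold), and tombstone_stats is a hypothetical helper name, not part of any Cassandra tooling.

```python
import re

# WARN line copied from the system.log excerpt in this thread.
WARN_LINE = (
    "WARN  [ReadStage-1] 2018-01-06 01:39:24,350 ReadCommand.java:533 - "
    "Read 344 live rows and 2074 tombstone cells for query SELECT * FROM "
    "keyspace.heartbeat WHERE user_id = 66b6796d-eb84-4bb9-b9d2-8dc882f4c6ac "
    "AND time >= 1515225599 AND time <= 1515139200 ORDER BY (time ASC) "
    "LIMIT 5000 (see tombstone_warn_threshold)"
)

# Assumption: the stock cassandra.yaml default for tombstone_warn_threshold.
TOMBSTONE_WARN_THRESHOLD = 1000

PATTERN = re.compile(r"Read (\d+) live rows and (\d+) tombstone cells")


def tombstone_stats(line):
    """Return (live_rows, tombstone_cells) parsed from a ReadCommand WARN line."""
    match = PATTERN.search(line)
    if match is None:
        raise ValueError("not a tombstone warning line")
    return int(match.group(1)), int(match.group(2))


live, tombstones = tombstone_stats(WARN_LINE)
over_threshold = tombstones > TOMBSTONE_WARN_THRESHOLD
print(live, tombstones, over_threshold)
```

The read scanned roughly six tombstone cells per live row, which is consistent with the point above that this says something about the data model rather than about the connection errors.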

The INFO log entry is Cassandra's connection to your app getting severed. Cassandra is saying the reset came from the other side (the app side, or maybe a firewall or something in the middle).

-- 
Jeff Jirsa


> On Jan 5, 2018, at 5:50 PM, Alan Hamlett <al...@gmail.com> wrote:
> 
> Update: Still getting the NoHostAvailable periodically in client logs.
> 
> Also seeing these INFO and WARN messages in
> /var/log/cassandra/system.log
> INFO  [epollEventLoopGroup-2-5] 2018-01-06 01:39:02,412 Message.java:623 - Unexpected exception during request; channel = [id: 0xae99b597, L:/10.1.2.3:9042 - R:/10.1.2.12:54720]
> io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: Connection reset by peer
>         at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown Source) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
> WARN  [ReadStage-1] 2018-01-06 01:39:24,350 ReadCommand.java:533 - Read 344 live rows and 2074 tombstone cells for query SELECT * FROM keyspace.heartbeat WHERE user_id = 66b6796d-eb84-4bb9-b9d2-8dc882f4c6ac AND time >= 1515225599 AND time <= 1515139200 ORDER BY (time ASC) LIMIT 5000 (see tombstone_warn_threshold)
> 
>> On Tue, Jan 2, 2018 at 8:13 AM, Alan Hamlett <al...@gmail.com> wrote:
>> Still getting the NoHostAvailable with more hosts, just occurring less frequently. Created a JIRA issue on the Python cassandra-driver tracker:
>> https://datastax-oss.atlassian.net/browse/PYTHON-891
>> 
>>> On Mon, Jan 1, 2018 at 8:43 PM, Alan Hamlett <al...@gmail.com> wrote:
>>> Adding more nodes to the cluster fixed the error. Looks like a bug in python-driver connection pool:
>>> 
>>> 1. The connection pool only has one host
>>> 2. A query times out, causing that connection to be removed from the pool
>>> 3. Another query executes, but there are no hosts in the pool
>>> 
>>>> On Mon, Jan 1, 2018 at 12:21 PM, Jeff Jirsa <jj...@gmail.com> wrote:
>>>> Well the python driver you reference is a third party driver, because the project doesn’t ship official drivers. You may have better luck looking for a datastax driver support forum, or wait until after the holiday for more people to be checking email.
>>>> 
>>>> 
>>>> -- 
>>>> Jeff Jirsa
>>>> 
>>>> 
>>> 
>>> 
>>> 
>>> -- 
>>> Alan Hamlett
>>> ahamlett.com
>> 
>> 
>> 
>> -- 
>> Alan Hamlett
>> ahamlett.com
> 
> 
> 
> -- 
> Alan Hamlett
> ahamlett.com

Re: [python] [flask] [CQLAlchemy] NoHostAvailable on create

Posted by Alan Hamlett <al...@gmail.com>.

Update: Still getting the NoHostAvailable periodically in client logs.

Also seeing these INFO and WARN messages in /var/log/cassandra/system.log:

INFO  [epollEventLoopGroup-2-5] 2018-01-06 01:39:02,412 Message.java:623 - Unexpected exception during request; channel = [id: 0xae99b597, L:/10.1.2.3:9042 - R:/10.1.2.12:54720]
io.netty.channel.unix.Errors$NativeIoException: syscall:read(...)() failed: Connection reset by peer
        at io.netty.channel.unix.FileDescriptor.readAddress(...)(Unknown Source) ~[netty-all-4.0.44.Final.jar:4.0.44.Final]
WARN  [ReadStage-1] 2018-01-06 01:39:24,350 ReadCommand.java:533 - Read 344 live rows and 2074 tombstone cells for query SELECT * FROM keyspace.heartbeat WHERE user_id = 66b6796d-eb84-4bb9-b9d2-8dc882f4c6ac AND time >= 1515225599 AND time <= 1515139200 ORDER BY (time ASC) LIMIT 5000 (see tombstone_warn_threshold)
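
A small sketch for decoding the two bounds in that WARN line, assuming the time column holds Unix epoch seconds (the thread does not say so). As printed, the lower bound is larger than the upper bound; the sketch only sorts and decodes the values and does not try to explain the ordering. to_utc is a hypothetical helper name.

```python
from datetime import datetime, timezone

# Bounds exactly as printed in the WARN line.
PRINTED_LOWER = 1515225599  # "time >= ..."
PRINTED_UPPER = 1515139200  # "time <= ..."


def to_utc(epoch_seconds):
    """Convert Unix epoch seconds to an ISO 8601 UTC string."""
    return datetime.fromtimestamp(epoch_seconds, tz=timezone.utc).isoformat()


# Sort so the window reads earliest-to-latest regardless of how it was logged.
start, end = sorted((PRINTED_LOWER, PRINTED_UPPER))
window_seconds = end - start + 1  # inclusive on both ends

print(to_utc(start))
print(to_utc(end))
print(window_seconds)
```

Under that assumption the query covers exactly one 24-hour window, from 2018-01-05 08:00:00 UTC to 2018-01-06 07:59:59 UTC.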


On Tue, Jan 2, 2018 at 8:13 AM, Alan Hamlett <al...@gmail.com> wrote:

> Still getting the NoHostAvailable with more hosts, just occurring less
> frequently. Created a JIRA issue on the Python cassandra-driver tracker:
> https://datastax-oss.atlassian.net/browse/PYTHON-891
>
> On Mon, Jan 1, 2018 at 8:43 PM, Alan Hamlett <al...@gmail.com>
> wrote:
>
>> Adding more nodes to the cluster fixed the error. Looks like a bug in
>> python-driver connection pool:
>>
>> 1. The connection pool only has one host
>> 2. A query times out, causing that connection to be removed from the pool
>> 3. Another query executes, but there are no hosts in the pool
>>
>> On Mon, Jan 1, 2018 at 12:21 PM, Jeff Jirsa <jj...@gmail.com> wrote:
>>
>>> Well the python driver you reference is a third party driver, because
>>> the project doesn’t ship official drivers. You may have better luck looking
>>> for a datastax driver support forum, or wait until after the holiday for
>>> more people to be checking email.
>>>
>>>
>>> --
>>> Jeff Jirsa
>>>
>>>
>>
>>
>> --
>> Alan Hamlett
>> ahamlett.com
>>
>
>
>
> --
> Alan Hamlett
> ahamlett.com
>



-- 
Alan Hamlett
ahamlett.com

Re: [python] [flask] [CQLAlchemy] NoHostAvailable on create

Posted by Alan Hamlett <al...@gmail.com>.

Still getting the NoHostAvailable with more hosts, just occurring less
frequently. Created a JIRA issue on the Python cassandra-driver tracker:
https://datastax-oss.atlassian.net/browse/PYTHON-891
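
To make the three-step hypothesis quoted below concrete, here is a toy model of it. This is not cassandra-driver's implementation (the real driver also has reconnection logic, which this ignores); ToyPool and run_scenario are hypothetical names, and 10.1.2.4 is an invented second node. It only shows why a pool that evicts a host on timeout fails at once with one host and less readily with more.

```python
class ToyPool:
    """Toy stand-in for a connection pool; NOT cassandra-driver's implementation."""

    def __init__(self, hosts):
        self.live_hosts = list(hosts)

    def execute(self, query, times_out=False):
        if not self.live_hosts:
            # Mirrors the symptom in the tracebacks: nothing left to try.
            raise RuntimeError("NoHostAvailable (toy): no hosts in the pool")
        host = self.live_hosts[0]
        if times_out:
            # Step 2 of the hypothesis: a timeout evicts the host.
            self.live_hosts.remove(host)
            raise TimeoutError("query timed out on " + host)
        return "ok from " + host


def run_scenario(hosts):
    """Time out one query, then run another; return the second query's outcome."""
    pool = ToyPool(hosts)
    try:
        pool.execute("INSERT ...", times_out=True)
    except TimeoutError:
        pass
    try:
        return pool.execute("INSERT ...")
    except RuntimeError as exc:
        return str(exc)


print(run_scenario(["10.1.2.3"]))
print(run_scenario(["10.1.2.3", "10.1.2.4"]))
```

In this model a second host survives a single timeout, but enough timeouts would still empty the pool, which would fit the error becoming rarer rather than disappearing after nodes were added.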

On Mon, Jan 1, 2018 at 8:43 PM, Alan Hamlett <al...@gmail.com> wrote:

> Adding more nodes to the cluster fixed the error. Looks like a bug in
> python-driver connection pool:
>
> 1. The connection pool only has one host
> 2. A query times out, causing that connection to be removed from the pool
> 3. Another query executes, but there are no hosts in the pool

-- 
Alan Hamlett
ahamlett.com

Re: [python] [flask] [CQLAlchemy] NoHostAvailable on create

Posted by Alan Hamlett <al...@gmail.com>.

Adding more nodes to the cluster fixed the error. It looks like a bug in the
python-driver connection pool:

1. The connection pool only has one host
2. A query times out, causing that connection to be removed from the pool
3. Another query executes, but there are no hosts in the pool
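
The three steps above can be illustrated with a toy model. This is only a sketch of the suspected sequence, not the driver's actual pool code: ToyPool, QueryTimeout, errors_after_one_timeout and the second host address 10.1.2.4 are hypothetical names made up for illustration, and NoHostAvailable here is a local stand-in for cassandra.cluster.NoHostAvailable.

```python
class NoHostAvailable(Exception):
    """Local stand-in for cassandra.cluster.NoHostAvailable (toy model only)."""


class QueryTimeout(Exception):
    """Hypothetical marker for a query that timed out against a host."""


class ToyPool:
    """Hypothetical, simplified pool that only tracks which hosts are usable."""

    def __init__(self, hosts):
        self.hosts = list(hosts)

    def execute(self, query, times_out=False):
        # Step 3: the pool is empty, so no host can serve the query.
        if not self.hosts:
            raise NoHostAvailable(
                "Unable to complete the operation against any hosts", {}
            )
        host = self.hosts[0]
        if times_out:
            # Step 2: the host whose query timed out is dropped from the pool.
            self.hosts.remove(host)
            raise QueryTimeout(host)
        return "{} ok via {}".format(query, host)


def errors_after_one_timeout(hosts):
    """Return True if the query following a single timeout finds no hosts."""
    pool = ToyPool(hosts)  # Step 1: the pool starts with the given hosts.
    try:
        pool.execute("INSERT ...", times_out=True)
    except QueryTimeout:
        pass
    try:
        pool.execute("INSERT ...")
    except NoHostAvailable:
        return True
    return False
```

In this model, errors_after_one_timeout(["10.1.2.3"]) reproduces the failure, while the same sequence with two hosts leaves one host in the pool and the follow-up query succeeds.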

On Mon, Jan 1, 2018 at 12:21 PM, Jeff Jirsa <jj...@gmail.com> wrote:

> Well, the Python driver you reference is a third-party driver, because the
> project doesn’t ship official drivers. You may have better luck looking for
> a DataStax driver support forum, or waiting until after the holiday, when
> more people will be checking email.

-- 
Alan Hamlett
ahamlett.com

Re: [python] [flask] [CQLAlchemy] NoHostAvailable on create

Posted by Jeff Jirsa <jj...@gmail.com>.

Well, the Python driver you reference is a third-party driver, because the project doesn’t ship official drivers. You may have better luck looking for a DataStax driver support forum, or waiting until after the holiday, when more people will be checking email.


-- 
Jeff Jirsa


> On Jan 1, 2018, at 12:14 PM, Alan Hamlett <al...@gmail.com> wrote:
> 
> Still getting the cassandra.cluster.NoHostAvailable error periodically from uWSGI hosts. Setting up the connection with postfork:
> https://github.com/alanhamlett/flask-cqlalchemy/blob/653ed3298af7dd617a972e9f87437f6e53f741b9/flask_cqlalchemy/__init__.py#L56
> 
> Lazy connection is False, Retry connection is True. Could this be a bug in cassandra-driver's connection pooling?
> 
> P.S. Blocking a web app when connection isn't available (default non-lazy connect) is really bad. With a web app you want requests that don't depend on Cassandra to complete, but cassandra-driver blocks all requests when there's no Cassandra connection even if it's not needed for the current web app's request. This design decision gives me very low confidence in the Python cassandra-driver.
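
For readers following along, the postfork setup described in the quoted message can be sketched roughly as below. This is a minimal sketch under stated assumptions, not the code from the linked fork: the keyspace name, the constant names and the optional setup parameter (added only so the function can be exercised without a cluster) are hypothetical. The lazy_connect=False and retry_connect=True values are the ones reported in the thread, and 10.1.2.3 is the host address from the tracebacks.

```python
try:
    # uwsgidecorators can only be imported inside a uWSGI worker.
    from uwsgidecorators import postfork
except ImportError:
    def postfork(func):
        """Fallback so this module also imports outside uWSGI."""
        return func

# Hypothetical values for illustration only.
CASSANDRA_HOSTS = ["10.1.2.3"]
CASSANDRA_KEYSPACE = "example_keyspace"


@postfork
def connect_cassandra(setup=None):
    """Open the cqlengine connection in each worker, after the fork."""
    if setup is None:
        # Imported here so the sketch loads without cassandra-driver installed.
        from cassandra.cqlengine import connection
        setup = connection.setup
    setup(
        CASSANDRA_HOSTS,
        CASSANDRA_KEYSPACE,
        lazy_connect=False,   # connect immediately in the worker
        retry_connect=True,   # retry if the first attempt fails
    )
```

The point of running this after the fork is that every worker process builds its own connections instead of inheriting sockets opened in the uWSGI master.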