Posted to user@cassandra.apache.org by Jan Algermissen <ja...@nordsc.com> on 2013/09/16 18:09:08 UTC

All subsequent CAS requests time out after heavy use of new CAS feature

Hi,

I am experimenting with C* 2.0 (and today's java-driver 2.0 snapshot) for implementing distributed locks.

Basically, I have a table of 'states' I want to serialize access to:

  create table state ( id text, lock uuid, data text, primary key (id) );   (3 nodes, replication factor 3)

  insert into state (id) values ('foo');

I try to acquire the lock for state 'foo' like this:

  update state set lock = myUUID where id = 'foo' if lock = null;

and check whether I got it by comparing the lock against my supplied UUID:

   select lock from state where id = 'foo'; 

... do work on 'foo' state ....

release lock:

 update state set lock = null where id = 'foo' if lock = myUUID;
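
A minimal sketch of this acquire/check/release cycle, written against the DataStax java-driver 2.0 API (contact point, keyspace and class name are placeholders, and the 'foo' row is assumed to exist), might look like this:

  import java.util.UUID;

  import com.datastax.driver.core.Cluster;
  import com.datastax.driver.core.Session;

  public class StateLockExample {

      public static void main(String[] args) {
          Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
          Session session = cluster.connect("demo");

          UUID myUuid = UUID.randomUUID();

          // Try to acquire the lock for state 'foo'. The result of this
          // conditional update also carries an [applied] column that already
          // tells whether the CAS succeeded.
          session.execute("UPDATE state SET lock = " + myUuid
                  + " WHERE id = 'foo' IF lock = null");

          // Check whether we got the lock by reading it back.
          UUID owner = session.execute("SELECT lock FROM state WHERE id = 'foo'")
                              .one().getUUID("lock");

          if (myUuid.equals(owner)) {
              // ... do work on the 'foo' state ...

              // Release the lock again.
              session.execute("UPDATE state SET lock = null WHERE id = 'foo' IF lock = " + myUuid);
          }

          cluster.close();
      }
  }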


This works pretty well, and if I increase the number of clients competing for the lock I start seeing timeouts on the client side. That is natural so far, and the lock also remains in a consistent state (it is possible to work around the failing clients and the uncertainty about whether or not they got the lock).

However, after pausing the clients for a while, the timeouts do not disappear: even when I send a single request after everything has calmed down, I still get a timeout:

   Caused by: com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write query at consistency SERIAL (-1 replica were required but only -1 acknowledged the write)

I do not see any reaction in the C* logs for these follow-up requests that still time out.

Any idea how to approach this problem?

Jan





Re: All subsequent CAS requests time out after heavy use of new CAS feature

Posted by horschi <ho...@gmail.com>.
Oh yes it is, like Counters :-)


On Sat, Dec 24, 2016 at 4:02 AM, Edward Capriolo <ed...@gmail.com>
wrote:


Re: All subsequent CAS requests time out after heavy use of new CAS feature

Posted by Edward Capriolo <ed...@gmail.com>.
Anecdotally, CAS works differently than the typical Cassandra workload. If you
run a stress instance against 3 nodes on one host, you find that you typically
run into CPU issues, but if you are doing a CAS workload you see things timing
out before you hit 100% CPU. It is a strange beast.

On Fri, Dec 23, 2016 at 7:28 AM, horschi <ho...@gmail.com> wrote:


Re: All subsequent CAS requests time out after heavy use of new CAS feature

Posted by horschi <ho...@gmail.com>.
Update: I replaced all quorum reads on that table with serial reads, and now
these errors have become less frequent. Somehow quorum reads on CAS values
cause most of these WTEs.
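
A sketch of what that change looks like on the client side with the
java-driver (table and key values follow the "Lock" schema shown later in
this thread and are only illustrative); the point is that the read statement
is issued at ConsistencyLevel.SERIAL, which goes through the Paxos read path
instead of a plain quorum read:

  import com.datastax.driver.core.Cluster;
  import com.datastax.driver.core.ConsistencyLevel;
  import com.datastax.driver.core.ResultSet;
  import com.datastax.driver.core.Session;
  import com.datastax.driver.core.SimpleStatement;

  public class SerialReadExample {

      public static void main(String[] args) {
          Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
          Session session = cluster.connect("demo");

          SimpleStatement read = new SimpleStatement(
                  "SELECT value FROM \"Lock\" WHERE lockname = 'locktest_' AND id = '1'");
          // SERIAL instead of QUORUM for reads on the CAS-managed table.
          read.setConsistencyLevel(ConsistencyLevel.SERIAL);

          ResultSet rs = session.execute(read);
          System.out.println(rs.one());

          cluster.close();
      }
  }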

Also I found two tickets on that topic:
https://issues.apache.org/jira/browse/CASSANDRA-9328
https://issues.apache.org/jira/browse/CASSANDRA-8672

On Thu, Dec 15, 2016 at 3:14 PM, horschi <ho...@gmail.com> wrote:


Re: All subsequent CAS requests time out after heavy use of new CAS feature

Posted by horschi <ho...@gmail.com>.
Hi,

I would like to warm up this old thread. I did some debugging and found out
that the timeouts are coming from StorageProxy.proposePaxos()
- callback.isFullyRefused() returns false and therefore triggers a
WriteTimeout.

Looking at my ccm cluster logs, I can see that two replica nodes return
different results in their ProposeVerbHandler. In my opinion the
coordinator should not throw an exception in such a case, but instead retry
the operation.
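
As a client-side workaround until something like that exists, a propose-phase
timeout can be treated as an "outcome unknown" result and resolved with a
follow-up SERIAL read. A rough sketch against the java-driver, reusing the
"Lock" schema from this thread (the INSERT-based protocol and the names are
only an illustration, not our production code):

  import java.nio.ByteBuffer;

  import com.datastax.driver.core.ConsistencyLevel;
  import com.datastax.driver.core.Row;
  import com.datastax.driver.core.Session;
  import com.datastax.driver.core.SimpleStatement;
  import com.datastax.driver.core.WriteType;
  import com.datastax.driver.core.exceptions.WriteTimeoutException;

  public class CasRetryExample {

      // Tries to grab the lock row with a CAS insert. If the proposal times
      // out with write type CAS, the outcome is unknown, so we fall back to a
      // SERIAL read: it completes any in-progress Paxos round and shows which
      // value actually ended up in the row.
      static boolean acquire(Session session, String lockname, String id, ByteBuffer token) {
          try {
              Row r = session.execute(new SimpleStatement(
                      "INSERT INTO \"Lock\" (lockname, id, value) VALUES (?, ?, ?) IF NOT EXISTS",
                      lockname, id, token)).one();
              return r.getBool("[applied]");
          } catch (WriteTimeoutException e) {
              if (e.getWriteType() != WriteType.CAS) {
                  throw e; // some other write phase timed out; rethrow
              }
              SimpleStatement check = new SimpleStatement(
                      "SELECT value FROM \"Lock\" WHERE lockname = ? AND id = ?", lockname, id);
              check.setConsistencyLevel(ConsistencyLevel.SERIAL);
              Row row = session.execute(check).one();
              return row != null && token.equals(row.getBytes("value"));
          }
      }
  }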

What do the CAS/Paxos experts on this list say to this? Feel free to
instruct me to do further tests/code changes. I'd be glad to help.

Log:

node1/logs/system.log:WARN  [SharedPool-Worker-5] 2016-12-15 14:48:36,896
PaxosState.java:124 - Rejecting proposal for
Commit(2d803540-c2cd-11e6-2e48-53a129c60cfc, [MDS.Lock] key=locktest_ 1
columns=[[] | [value]]
node1/logs/system.log-    Row: id=@ | value=<tombstone>) because inProgress
is now Commit(2d8146b0-c2cd-11e6-f996-e5c8d88a1da4, [MDS.Lock]
key=locktest_ 1 columns=[[] | [value]]
--
node1/logs/system.log:ERROR [SharedPool-Worker-12] 2016-12-15 14:48:36,980
StorageProxy.java:506 - proposePaxos:
Commit(2d803540-c2cd-11e6-2e48-53a129c60cfc, [MDS.Lock] key=locktest_ 1
columns=[[] | [value]]
node1/logs/system.log-    Row: id=@ | value=<tombstone>)//1//0
--
node2/logs/system.log:WARN  [SharedPool-Worker-7] 2016-12-15 14:48:36,969
PaxosState.java:117 - Accepting proposal:
Commit(2d803540-c2cd-11e6-2e48-53a129c60cfc, [MDS.Lock] key=locktest_ 1
columns=[[] | [value]]
node2/logs/system.log-    Row: id=@ | value=<tombstone>)
--
node3/logs/system.log:WARN  [SharedPool-Worker-2] 2016-12-15 14:48:36,897
PaxosState.java:124 - Rejecting proposal for
Commit(2d803540-c2cd-11e6-2e48-53a129c60cfc, [MDS.Lock] key=locktest_ 1
columns=[[] | [value]]
node3/logs/system.log-    Row: id=@ | value=<tombstone>) because inProgress
is now Commit(2d8146b0-c2cd-11e6-f996-e5c8d88a1da4, [MDS.Lock]
key=locktest_ 1 columns=[[] | [value]]


kind regards,
Christian


On Fri, Apr 15, 2016 at 8:27 PM, Denise Rogers <da...@aol.com> wrote:


Re: All subsequent CAS requests time out after heavy use of new CAS feature

Posted by Denise Rogers <da...@aol.com>.
My thinking was that, due to the size of the data, there may be I/O issues. But it sounds more like you're competing for locks and hit a deadlock issue.

Regards,
Denise
Cell - (860)989-3431

Sent from mi iPhone


Re: All subsequent CAS requests time out after heavy use of new CAS feature

Posted by horschi <ho...@gmail.com>.
Hi Denise,

in my case it's a small blob I am writing (should be around 100 bytes):

     CREATE TABLE "Lock" (
         lockname varchar,
         id varchar,
         value blob,
         PRIMARY KEY (lockname, id)
     ) WITH COMPACT STORAGE
         AND COMPRESSION = { 'sstable_compression' : 'SnappyCompressor',
'chunk_length_kb' : '8' };

You ask because large values are known to cause issues? Anything special
you have in mind?

kind regards,
Christian




On Fri, Apr 15, 2016 at 2:42 PM, Denise Rogers <da...@aol.com> wrote:


Re: All subsequent CAS requests time out after heavy use of new CAS feature

Posted by Denise Rogers <da...@aol.com>.
Also, what type of data were you reading/writing?

Regards,
Denise

Sent from mi iPad


Re: All subsequent CAS requests time out after heavy use of new CAS feature

Posted by horschi <ho...@gmail.com>.
Hi Jan,

were you able to resolve your problem?

We are trying the same and also see a lot of WriteTimeouts:
WriteTimeoutException: Cassandra timeout during write query at consistency
SERIAL (2 replica were required but only 1 acknowledged the write)

How many clients were competing for a lock in your case? In our case it's
only two :-(

cheers,
Christian


On Tue, Sep 24, 2013 at 12:18 AM, Robert Coli <rc...@eventbrite.com> wrote:


Re: All subsequent CAS requests time out after heavy use of new CAS feature

Posted by Robert Coli <rc...@eventbrite.com>.
On Mon, Sep 16, 2013 at 9:09 AM, Jan Algermissen <jan.algermissen@nordsc.com> wrote:

> I am experimenting with C* 2.0 ( and today's java-driver 2.0 snapshot) for
> implementing distributed locks.
>

[ and I'm experiencing the problem described in the subject ... ]


> Any idea how to approach this problem?
>

1) Upgrade to 2.0.1 release.
2) Try to reproduce symptoms.
3) If able to, file a JIRA at
https://issues.apache.org/jira/secure/Dashboard.jspa including repro steps
4) Reply to this thread with the JIRA ticket URL

=Rob