Posted to user@cassandra.apache.org by Jan Algermissen <ja...@nordsc.com> on 2013/09/16 18:09:08 UTC
All subsequent CAS requests time out after heavy use of new CAS feature
Hi,
I am experimenting with C* 2.0 (and today's java-driver 2.0 snapshot) for implementing distributed locks.
Basically, I have a table of 'states' I want to serialize access to:
create table state ( id text, lock uuid, data text, primary key (id) );
(3 nodes, replication factor 3)
insert into state (id) values ('foo');
I try to acquire the lock for state 'foo' like this:
update state set lock = myUUID where id = 'foo' if lock = null;
and check whether I got it by comparing the lock against my supplied UUID:
select lock from state where id = 'foo';
... do work on 'foo' state ....
release lock:
update state set lock = null where id = 'foo' if lock = myUUID;
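(As an aside: the conditional update itself returns a result row with an
[applied] boolean column, so the extra select is mostly for illustration.
A minimal sketch of the whole cycle with the java-driver follows; the
contact point and keyspace name are made up:)
import java.util.UUID;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;
public class LockCycle {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("mykeyspace"); // keyspace is made up
        UUID myUUID = UUID.randomUUID();
        // Acquire: only succeeds if nobody currently holds the lock.
        ResultSet rs = session.execute(
            "update state set lock = " + myUUID + " where id = 'foo' if lock = null");
        boolean acquired = rs.one().getBool("[applied]");
        if (acquired) {
            try {
                // ... do work on the 'foo' state ...
            } finally {
                // Release: only if we still hold the lock.
                session.execute(
                    "update state set lock = null where id = 'foo' if lock = " + myUUID);
            }
        }
        cluster.close();
    }
}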
This works pretty well, and when I increase the number of clients competing
for the lock I start seeing timeouts on the client side. That is expected so
far, and the lock also remains in a consistent state (I can work around the
failing clients and the uncertainty over whether they got the lock or not).
However, even after pausing the clients for a while, the timeouts do not
disappear: when I send a single request after everything has calmed down, I
still get a timeout:
Caused by: com.datastax.driver.core.exceptions.WriteTimeoutException: Cassandra timeout during write query at consistency SERIAL (-1 replica were required but only -1 acknowledged the write)
I do not see any reaction in the C* logs for these follow-up requests that still time out.
Any idea how to approach this problem?
Jan
Re: All subsequent CAS requests time out after heavy use of new CAS feature
Posted by horschi <ho...@gmail.com>.
Oh yes it is, like Counters :-)
On Sat, Dec 24, 2016 at 4:02 AM, Edward Capriolo <ed...@gmail.com> wrote:
> Anecdotally, CAS works differently than the typical Cassandra workload. If
> you run a stress instance against 3 nodes on one host, you typically run
> into CPU issues, but with a CAS workload you see things timing out before
> you hit 100% CPU. It is a strange beast.
Re: All subsequent CAS requests time out after heavy use of new CAS feature
Posted by Edward Capriolo <ed...@gmail.com>.
Anecdotally, CAS works differently than the typical Cassandra workload. If you
run a stress instance against 3 nodes on one host, you typically run into CPU
issues, but with a CAS workload you see things timing out before you hit 100%
CPU. It is a strange beast.
Re: All subsequent CAS requests time out after heavy use of new CAS feature
Posted by horschi <ho...@gmail.com>.
Update: I replaced all quorum reads on that table with serial reads, and now
these errors occur less often. Somehow quorum reads on CAS values cause most
of these WriteTimeoutExceptions (WTEs).
Also I found two tickets on that topic:
https://issues.apache.org/jira/browse/CASSANDRA-9328
https://issues.apache.org/jira/browse/CASSANDRA-8672
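For reference, a serial read with the java-driver is just a regular statement
executed with its consistency level set to SERIAL; a minimal sketch (the
keyspace, table and key values here are illustrative):
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;
public class SerialReadExample {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("mds"); // keyspace name illustrative
        // CL SERIAL routes the read through a Paxos round, so it observes
        // any in-progress CAS write instead of racing it at QUORUM.
        Statement read = new SimpleStatement(
                "SELECT value FROM \"Lock\" WHERE lockname = 'locktest_' AND id = '1'")
                .setConsistencyLevel(ConsistencyLevel.SERIAL);
        ResultSet rs = session.execute(read);
        System.out.println(rs.one());
        cluster.close();
    }
}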
Re: All subsequent CAS requests time out after heavy use of new CAS feature
Posted by horschi <ho...@gmail.com>.
Hi,
I would like to warm up this old thread. I did some debugging and found out
that the timeouts are coming from StorageProxy.proposePaxos()
- callback.isFullyRefused() returns false and therefore triggers a
WriteTimeout.
Looking at my ccm cluster logs, I can see that two replica nodes return
different results in their ProposeVerbHandler. In my opinion the
coordinator should not throw an exception in such a case, but instead retry
the operation.
What do the CAS/Paxos experts on this list say to this? Feel free to
instruct me to do further tests/code changes. I'd be glad to help.
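To make the case I mean concrete, here is a rough paraphrase of the
classification as I understand it (my own simplification, not the actual
Cassandra code); node2 accepting while node1 and node3 reject lands in the
mixed case:
class ProposalOutcomeSketch {
    enum Outcome { ACCEPTED, FULLY_REFUSED, MIXED }
    // Simplified paraphrase, not Cassandra source: a quorum of accepts
    // succeeds, unanimous rejection is plain contention (retryable), and
    // the mixed case is what currently surfaces as a WriteTimeoutException.
    static Outcome classify(int accepts, int rejects, int quorum) {
        if (accepts >= quorum) return Outcome.ACCEPTED;
        if (accepts == 0 && rejects > 0) return Outcome.FULLY_REFUSED;
        return Outcome.MIXED;
    }
}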
Log:
node1/logs/system.log:WARN [SharedPool-Worker-5] 2016-12-15 14:48:36,896
PaxosState.java:124 - Rejecting proposal for
Commit(2d803540-c2cd-11e6-2e48-53a129c60cfc, [MDS.Lock] key=locktest_ 1
columns=[[] | [value]]
node1/logs/system.log- Row: id=@ | value=<tombstone>) because inProgress
is now Commit(2d8146b0-c2cd-11e6-f996-e5c8d88a1da4, [MDS.Lock]
key=locktest_ 1 columns=[[] | [value]]
--
node1/logs/system.log:ERROR [SharedPool-Worker-12] 2016-12-15 14:48:36,980
StorageProxy.java:506 - proposePaxos:
Commit(2d803540-c2cd-11e6-2e48-53a129c60cfc, [MDS.Lock] key=locktest_ 1
columns=[[] | [value]]
node1/logs/system.log- Row: id=@ | value=<tombstone>)//1//0
--
node2/logs/system.log:WARN [SharedPool-Worker-7] 2016-12-15 14:48:36,969
PaxosState.java:117 - Accepting proposal:
Commit(2d803540-c2cd-11e6-2e48-53a129c60cfc, [MDS.Lock] key=locktest_ 1
columns=[[] | [value]]
node2/logs/system.log- Row: id=@ | value=<tombstone>)
--
node3/logs/system.log:WARN [SharedPool-Worker-2] 2016-12-15 14:48:36,897
PaxosState.java:124 - Rejecting proposal for
Commit(2d803540-c2cd-11e6-2e48-53a129c60cfc, [MDS.Lock] key=locktest_ 1
columns=[[] | [value]]
node3/logs/system.log- Row: id=@ | value=<tombstone>) because inProgress
is now Commit(2d8146b0-c2cd-11e6-f996-e5c8d88a1da4, [MDS.Lock]
key=locktest_ 1 columns=[[] | [value]]
kind regards,
Christian
Re: All subsequent CAS requests time out after heavy use of new CAS feature
Posted by Denise Rogers <da...@aol.com>.
My thinking was that, due to the size of the data, there might be I/O issues. But it sounds more like you're competing for locks and hit a deadlock issue.
Regards,
Denise
Cell - (860)989-3431
Sent from mi iPhone
Re: All subsequent CAS requests time out after heavy use of new CAS feature
Posted by horschi <ho...@gmail.com>.
Hi Denise,
in my case it's a small blob I am writing (should be around 100 bytes):
CREATE TABLE "Lock" (
lockname varchar,
id varchar,
value blob,
PRIMARY KEY (lockname, id)
) WITH COMPACT STORAGE
AND COMPRESSION = { 'sstable_compression' : 'SnappyCompressor',
'chunk_length_kb' : '8' };
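The writes against it are conditional, roughly like this (a sketch only; the
keyspace name, key values and payload are illustrative, not our actual code):
import java.nio.ByteBuffer;
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;
public class LockInsert {
    public static void main(String[] args) {
        Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
        Session session = cluster.connect("mds"); // keyspace name illustrative
        ByteBuffer payload = ByteBuffer.wrap(new byte[100]); // ~100-byte blob
        // Conditional insert: applied only if no row exists for this key.
        ResultSet rs = session.execute(
                "INSERT INTO \"Lock\" (lockname, id, value) VALUES (?, ?, ?) IF NOT EXISTS",
                "locktest_", "1", payload);
        System.out.println(rs.one().getBool("[applied]"));
        cluster.close();
    }
}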
Are you asking because large values are known to cause issues? Anything
special you have in mind?
kind regards,
Christian
Re: All subsequent CAS requests time out after heavy use of new CAS feature
Posted by Denise Rogers <da...@aol.com>.
Also, what type of data were you reading/writing?
Regards,
Denise
Sent from mi iPad
Re: All subsequent CAS requests time out after heavy use of new CAS feature
Posted by horschi <ho...@gmail.com>.
Hi Jan,
were you able to resolve your problem?
We are trying the same and also see a lot of WriteTimeouts:
WriteTimeoutException: Cassandra timeout during write query at consistency
SERIAL (2 replica were required but only 1 acknowledged the write)
How many clients were competing for a lock in your case? In our case it's
only two :-(
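A pattern that might help on the client side (a sketch; the helper and names
are illustrative, not a library API): distinguish "condition failed" from a
CAS write timeout, because after the latter the write may or may not have
taken effect and the lock row must be re-read before retrying.
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.WriteType;
import com.datastax.driver.core.exceptions.WriteTimeoutException;
class CasWriteHelper {
    enum CasResult { APPLIED, NOT_APPLIED, UNKNOWN }
    static CasResult executeCas(Session session, String cql) {
        try {
            ResultSet rs = session.execute(cql);
            return rs.one().getBool("[applied]") ? CasResult.APPLIED
                                                 : CasResult.NOT_APPLIED;
        } catch (WriteTimeoutException e) {
            if (e.getWriteType() == WriteType.CAS) {
                return CasResult.UNKNOWN; // outcome uncertain, re-read first
            }
            throw e;
        }
    }
}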
cheers,
Christian
Re: All subsequent CAS requests time out after heavy use of new CAS feature
Posted by Robert Coli <rc...@eventbrite.com>.
On Mon, Sep 16, 2013 at 9:09 AM, Jan Algermissen <jan.algermissen@nordsc.com> wrote:
> I am experimenting with C* 2.0 ( and today's java-driver 2.0 snapshot) for
> implementing distributed locks.
>
[ and I'm experiencing the problem described in the subject ... ]
> Any idea how to approach this problem?
>
1) Upgrade to 2.0.1 release.
2) Try to reproduce symptoms.
3) If able to, file a JIRA at
https://issues.apache.org/jira/secure/Dashboard.jspa including repro steps
4) Reply to this thread with the JIRA ticket URL
=Rob