Posted to user@cassandra.apache.org by Alain RODRIGUEZ <ar...@gmail.com> on 2011/11/07 16:15:57 UTC

Counters and replication factor

Hi,

I am trying to switch from RF = 1 to RF = 3, but I get wrong values from
counters when doing so...

I have a CF that contains many counters for some events. When I'm at RF = 1
and simulate 10 events, they are counted correctly.
However, when I switch to RF = 3, my counters show a wrong value that
sometimes changes when requested twice (it can return 7, then 5, instead of
10 every time).

I first thought that it was a CL problem, because I seem to remember
reading once that I had to use CL.One for reads and writes with counters. So
I tried with CL.One, without success...

What am I doing wrong? Is there some precaution to take when replicating
counters?

Alain

Re: Counters and replication factor

Posted by Radim Kolar <hs...@filez.com>.

On 25.5.2012 2:41, Edward Capriolo wrote:
>
> Also it does not sound like you have run anti entropy repair. You
> should do that when upping rf.
I run anti-entropy repairs and they still do not fix the counters. I have some
reports from users with the same problem, but nobody has found a repeatable
scenario. I am currently migrating to the Infinispan data grid; it
does not seem to have problems with distributed counters.

Re: Counters and replication factor

Posted by Edward Capriolo <ed...@gmail.com>.

Also, it does not sound like you have run anti-entropy repair. You should do
that when upping RF.
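
A toy model may help show why raising RF without a repair can give low
reads. This is only a sketch in Python, not Cassandra's code: the shard
layout (one (clock, count) pair per owning node), the node names a/b/c and
the repair() function are simplifying assumptions made for illustration, and
it is not a diagnosis of the behaviour reported in this thread.

```python
# Toy model only (assumption): NOT Cassandra's real counter implementation.
from typing import Dict, List, Tuple

Shard = Tuple[int, int]        # (clock, count) owned by one node
Replica = Dict[str, Shard]     # shard owner -> latest shard this replica has seen


def value(replica: Replica) -> int:
    """Counter value as seen by one replica: the sum of all shard counts."""
    return sum(count for _clock, count in replica.values())


def increment(owner: str, replica: Replica, delta: int = 1) -> None:
    """The owning node bumps its own shard (both clock and count)."""
    clock, count = replica.get(owner, (0, 0))
    replica[owner] = (clock + 1, count + delta)


def merge(into: Replica, source: Replica) -> None:
    """Keep, per shard owner, the shard with the highest clock."""
    for owner, shard in source.items():
        if owner not in into or shard[0] > into[owner][0]:
            into[owner] = shard


def repair(replicas: Dict[str, Replica]) -> None:
    """Hypothetical stand-in for anti-entropy repair: share every shard."""
    merged: Replica = {}
    for replica in replicas.values():
        merge(merged, replica)
    for replica in replicas.values():
        merge(replica, merged)


def demo() -> Tuple[List[int], List[int]]:
    # RF = 1: only node "a" holds the counter; 10 events are counted there.
    replicas: Dict[str, Replica] = {"a": {}, "b": {}, "c": {}}
    for _ in range(10):
        increment("a", replicas["a"])
    # RF raised to 3: "b" and "c" are now replicas but hold no data yet.
    before = [value(replicas[name]) for name in ("a", "b", "c")]
    repair(replicas)
    after = [value(replicas[name]) for name in ("a", "b", "c")]
    return before, after
```

In this model a CL.ONE read that lands on b or c returns 0 until repair()
has run, while a read served by a returns 10.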
On Monday, May 21, 2012, Radim Kolar <hs...@filez.com> wrote:
> On 26.3.2012 19:17, aaron morton wrote:
>>
>> Can you describe the situations where counter updates are lost or go
>> backwards ?
>>
>> Do you ever get TimedOutExceptions when performing counter updates ?
>
> We get a few timeouts per day, but not many, fewer than 10. I do not think
> that timeouts are the root cause. I haven't figured out the exact steps to
> reproduce it (I haven't even tried). We are reading at CL.ONE, but the
> cluster is well synchronized and we read a long time after writing, so the
> new value should already be present on all nodes.
>

Re: Counters and replication factor

Posted by Radim Kolar <hs...@filez.com>.

On 26.3.2012 19:17, aaron morton wrote:
> Can you describe the situations where counter updates are lost or go
> backwards ?
>
> Do you ever get TimedOutExceptions when performing counter updates ?
We get a few timeouts per day, but not many, fewer than 10. I do not think
that timeouts are the root cause. I haven't figured out the exact steps to
reproduce it (I haven't even tried). We are reading at CL.ONE, but the cluster
is well synchronized and we read a long time after writing, so the new
value should already be present on all nodes.

Re: Counters and replication factor

Posted by aaron morton <aa...@thelastpickle.com>.

Can you describe the situations where counter updates are lost or go backwards ?

Do you ever get TimedOutExceptions when performing counter updates ? 

Cheers
 
-----------------
Aaron Morton
Freelance Developer
@aaronmorton
http://www.thelastpickle.com

On 24/03/2012, at 6:34 PM, Radim Kolar wrote:

> 
>> I still have wrong results (I simulated an event 5 times and it was counted 3 times by some counters, 4 or 5 times by others).
> I also get wrong results with counters in 1.0.8: many times updates to a counter column are just lost, and sometimes counters go backwards even though our app uses only increments. Don't rely on counters for anything important; they are still beta quality.
> 
> We are now using ZooKeeper for "important counters" and Cassandra for junk like statistics.

Re: Counters and replication factor

Posted by Radim Kolar <hs...@filez.com>.

> I still have wrong results (I simulated an event 5 times and it was
> counted 3 times by some counters, 4 or 5 times by others).
I also get wrong results with counters in 1.0.8: many times updates to a
counter column are just lost, and sometimes counters go backwards
even though our app uses only increments. Don't rely on counters for
anything important; they are still beta quality.

We are now using ZooKeeper for "important counters" and Cassandra for
junk like statistics.

Re: Counters and replication factor

Posted by Sylvain Lebresne <sy...@datastax.com>.

This sounds like a bug, a priori. Do you mind opening a ticket at
https://issues.apache.org/jira/browse/CASSANDRA?
It will help if you can specify which version you are using and the
exact procedure that leads to this.
If you know how to reproduce it, that would be even better.

--
Sylvain


Re: Counters and replication factor

Posted by Riyad Kalla <rk...@gmail.com>.

Most welcome, hopefully the bug is easy to find and kill :)


Re: Counters and replication factor

Posted by Alain RODRIGUEZ <ar...@gmail.com>.

Sylvain, here is my ticket, but I guess you already know about it since you
are the assignee :) --> https://issues.apache.org/jira/browse/CASSANDRA-3465
Riyad, thanks for your help.

Alain


Re: Counters and replication factor

Posted by Riyad Kalla <rk...@gmail.com>.

Alain, thank you for all the clarification. I understand exactly what you
meant now... and as a result I am just as confused as you are :)

What version of Cassandra are you using? Can you share the important parts
of your config? (Have you double-checked that the replication factor is set
to "3" on all 3 nodes?)

Also, out of curiosity, if you keep querying for up to 5 mins (say every 10
seconds), do counter1, 2 and 3 still show the same wrong values for getValue,
or do the values eventually converge on the correct amounts?

(I assume 5 mins is a long enough window to test; maybe I'm wrong and
another Cassandra dev can correct me here.)
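
The polling test described above could be sketched roughly as follows in
Python. This is only an illustration: read_counter is a hypothetical
callable standing in for whatever client call reads the counter, and
poll_counter, converged and the "last 3 samples agree" rule are names and
choices made up for this sketch, not part of any Cassandra client.

```python
import time
from typing import Callable, List


def poll_counter(read_counter: Callable[[], int],
                 interval_s: float = 10.0,
                 duration_s: float = 300.0,
                 sleep: Callable[[float], None] = time.sleep) -> List[int]:
    """Read the counter every interval_s seconds for duration_s seconds.

    read_counter is supplied by the caller (hypothetical client call).
    Returns every sample in the order it was read.
    """
    attempts = int(duration_s // interval_s) + 1
    samples: List[int] = []
    for attempt in range(attempts):
        samples.append(read_counter())
        if attempt < attempts - 1:   # no need to wait after the last read
            sleep(interval_s)
    return samples


def converged(samples: List[int], expected: int, tail: int = 3) -> bool:
    """True if the last `tail` samples all equal the expected count."""
    recent = samples[-tail:]
    return len(recent) == tail and all(v == expected for v in recent)
```

Checking several trailing samples rather than a single one matters here,
because one correct read may simply have been served by an up-to-date node
while other nodes still return a stale value.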

-R


Re: Counters and replication factor

Posted by Alain RODRIGUEZ <ar...@gmail.com>.

I retried it after restarting all the servers.

I still have wrong results (I simulated an event 5 times and it was counted
3 times by some counters, 4 or 5 times by others).

What I meant by "but now every request returns me always the same count
value..." will be easier to explain with an example:

event 1:

counter1.increment
counter2.increment
counter3.increment

.
.
.

event 5:

counter1.increment
counter2.increment
counter3.increment

Show results :

counter1.getValue = returns 4
counter2.getValue = returns 3
counter3.getValue = returns 5

counter1.getValue = returns 5
counter2.getValue = returns 3
counter3.getValue = returns 5

counter1.getValue = returns 4
counter2.getValue = returns 4
counter3.getValue = returns 5

...

So I get wrong values, and not always the same ones. In my previous
email, by saying "but now every request returns me always the same count
value...", I meant that I got the same wrong values every time,
let us say:

counter1.getValue = returns 4
counter2.getValue = returns 3
counter3.getValue = returns 5

counter1.getValue = returns 4
counter2.getValue = returns 3
counter3.getValue = returns 5

counter1.getValue = returns 4
counter2.getValue = returns 3
counter3.getValue = returns 5

But that is not true: I still get some "random" wrong values. Maybe I did
not query the counter values often enough to see it last time.

Sorry for not being clearer; this is not easy to explain, nor easy for me
to understand.

Thanks for the help.

Alain


2011/11/7 Riyad Kalla <rk...@gmail.com>

> Alain,
>
> When you tried CL.All was that only after you had made the change of
> ReplicationFactor=3 and restarted all the servers?
>
> If you hadn't restarted the servers with the new RF, I am not sure that
> CL.All would have the intended effect.
>
> Also, I wasn't sure what you meant by "but know every request returns me
> always the same count value..." -- didn't you want the requests to always
> return the same values?
>
> Or maybe you are saying that it always returns the same *wrong* value?
> Like you do:
>
> counter.increment (v=1)
> counter.increment (v=2)
> counter.increment (v=3)
>
> counter.getValue = returns 7
> counter.getValue = returns 7
> counter.getValue = returns 7
>
> or something inconsistent like that?
>
> On Mon, Nov 7, 2011 at 9:09 AM, Alain RODRIGUEZ <ar...@gmail.com>wrote:
>
>> I've tried with CL.All, but it doesn't work any better. I still have strange
>> values (between 4 and 10 events counted instead of 10) but know every
>> request returns me always the same count value...
>>
>> It's very strange.
>>
>> Any other idea ?
>>
>> Alain
>>
>>
>> 2011/11/7 Riyad Kalla <rk...@gmail.com>
>>
>>> Alain,
>>>
>>> Try using a CL of 3 or "ALL" and see if the problem goes away.
>>>
>>> Your replication factor (as I just learned) dictates how many nodes each
>>> piece of data is replicated to; by using a RF of 3 you are saying
>>> "replicate all my data to all my nodes" (in this case counters).
>>>
>>> This doesn't happen immediately, but you can *force* it to happen on
>>> write by specifying a CL of "ALL". If you specify "1" then your counter
>>> value is written to one member of the ring, then your command returns.
>>>
>>> If you keep querying you will bounce around your ring, reading the
>>> values from the different nodes until a future date at *which point* all
>>> the values will likely agree.
>>>
>>> If you keep all your code you have now exactly the same, just change the
>>> code at the end where you read the counter value back, to keep reading the
>>> counter value back every second for 60 seconds and see if all the values
>>> eventually match up -- they should (as the counter value is replicated to
>>> all the nodes and their old values discarded).
>>>
>>> -R
>>>
>>>
>>> On Mon, Nov 7, 2011 at 8:15 AM, Alain RODRIGUEZ <ar...@gmail.com>wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm trying to switch from RF = 1 to RF = 3, but I get wrong values
>>>> from counters when doing so...
>>>>
>>>> I have a CF that contains many event counters. When I'm at RF = 1 and
>>>> simulate 10 events, they are counted correctly.
>>>> However, when I switch to RF = 3, my counter shows a wrong value that
>>>> sometimes changes when requested twice (it can return 7, then 5,
>>>> instead of 10 every time).
>>>>
>>>> I first thought that it was a CL problem, because I seem to remember
>>>> reading that I had to use CL.One for reads and writes with counters.
>>>> So I tried with CL.One, without success...
>>>>
>>>> What am I doing wrong? Is there some precaution to take when
>>>> replicating counters?
>>>>
>>>> Alain
>>>>
>>>
>>>
>>
>

Re: Counters and replication factor

Posted by Riyad Kalla <rk...@gmail.com>.

Alain,

When you tried CL.All was that only after you had made the change of
ReplicationFactor=3 and restarted all the servers?

If you hadn't restarted the servers with the new RF, I am not sure that
CL.All would have the intended effect.

Also, I wasn't sure what you meant by "but now every request always returns
the same count value..." -- didn't you want the requests to always return
the same value?

Or maybe you are saying that it always returns the same *wrong* value? Like
you do:

counter.increment (v=1)
counter.increment (v=2)
counter.increment (v=3)

counter.getValue = returns 7
counter.getValue = returns 7
counter.getValue = returns 7

or something inconsistent like that?

On Mon, Nov 7, 2011 at 9:09 AM, Alain RODRIGUEZ <ar...@gmail.com> wrote:

> I've tried with CL.All, but it doesn't work any better. I still get strange
> values (between 4 and 10 events counted instead of 10), but now every
> request always returns the same count value...
>
> It's very strange.
>
> Any other idea ?
>
> Alain
>
>
> 2011/11/7 Riyad Kalla <rk...@gmail.com>
>
>> Alain,
>>
>> Try using a CL of 3 or "ALL" and see if the problem goes away.
>>
>> Your replication factor (as I just learned) dictates how many nodes each
>> piece of data is replicated to; by using a RF of 3 you are saying
>> "replicate all my data to all my nodes" (in this case counters).
>>
>> This doesn't happen immediately, but you can *force* it to happen on
>> write by specifying a CL of "ALL". If you specify "1" then your counter
>> value is written to one member of the ring, then your command returns.
>>
>> If you keep querying you will bounce around your ring, reading the values
>> from the different nodes until a future date at *which point* all the
>> values will likely agree.
>>
>> If you keep all your code you have now exactly the same, just change the
>> code at the end where you read the counter value back, to keep reading the
>> counter value back every second for 60 seconds and see if all the values
>> eventually match up -- they should (as the counter value is replicated to
>> all the nodes and their old values discarded).
>>
>> -R
>>
>>
>> On Mon, Nov 7, 2011 at 8:15 AM, Alain RODRIGUEZ <ar...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> I'm trying to switch from RF = 1 to RF = 3, but I get wrong values
>>> from counters when doing so...
>>>
>>> I have a CF that contains many event counters. When I'm at RF = 1 and
>>> simulate 10 events, they are counted correctly.
>>> However, when I switch to RF = 3, my counter shows a wrong value that
>>> sometimes changes when requested twice (it can return 7, then 5,
>>> instead of 10 every time).
>>>
>>> I first thought that it was a CL problem, because I seem to remember
>>> reading that I had to use CL.One for reads and writes with counters.
>>> So I tried with CL.One, without success...
>>>
>>> What am I doing wrong? Is there some precaution to take when replicating
>>> counters?
>>>
>>> Alain
>>>
>>
>>
>

Re: Counters and replication factor

Posted by Alain RODRIGUEZ <ar...@gmail.com>.

I've tried with CL.All, but it doesn't work any better. I still get strange
values (between 4 and 10 events counted instead of 10), but now every
request always returns the same count value...

It's very strange.

Any other idea ?

Alain

2011/11/7 Riyad Kalla <rk...@gmail.com>

> Alain,
>
> Try using a CL of 3 or "ALL" and see if the problem goes away.
>
> Your replication factor (as I just learned) dictates how many nodes each
> piece of data is replicated to; by using a RF of 3 you are saying
> "replicate all my data to all my nodes" (in this case counters).
>
> This doesn't happen immediately, but you can *force* it to happen on write
> by specifying a CL of "ALL". If you specify "1" then your counter value is
> written to one member of the ring, then your command returns.
>
> If you keep querying you will bounce around your ring, reading the values
> from the different nodes until a future date at *which point* all the
> values will likely agree.
>
> If you keep all your code you have now exactly the same, just change the
> code at the end where you read the counter value back, to keep reading the
> counter value back every second for 60 seconds and see if all the values
> eventually match up -- they should (as the counter value is replicated to
> all the nodes and their old values discarded).
>
> -R
>
>
> On Mon, Nov 7, 2011 at 8:15 AM, Alain RODRIGUEZ <ar...@gmail.com> wrote:
>
>> Hi,
>>
>> I'm trying to switch from RF = 1 to RF = 3, but I get wrong values from
>> counters when doing so...
>>
>> I have a CF that contains many event counters. When I'm at RF = 1 and
>> simulate 10 events, they are counted correctly.
>> However, when I switch to RF = 3, my counter shows a wrong value that
>> sometimes changes when requested twice (it can return 7, then 5, instead
>> of 10 every time).
>>
>> I first thought that it was a CL problem, because I seem to remember
>> reading that I had to use CL.One for reads and writes with counters. So
>> I tried with CL.One, without success...
>>
>> What am I doing wrong? Is there some precaution to take when replicating
>> counters?
>>
>> Alain
>>
>
>

Re: Counters and replication factor

Posted by Riyad Kalla <rk...@gmail.com>.

Alain,

Try using a CL of 3 or "ALL" and see if the problem goes away.

Your replication factor (as I just learned) dictates how many nodes each
piece of data is replicated to; by using a RF of 3 you are saying
"replicate all my data to all my nodes" (in this case counters).

This doesn't happen immediately, but you can *force* it to happen on write
by specifying a CL of "ALL". If you specify "1" then your counter value is
written to one member of the ring, then your command returns.

If you keep querying you will bounce around your ring, reading the values
from the different nodes until a future date at *which point* all the
values will likely agree.
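
As a rough illustration of why the consistency level matters here, the
sketch below checks the usual overlap rule: a read is guaranteed to touch at
least one replica that acknowledged the latest write when the number of
replicas read plus the number of replicas written exceeds the replication
factor. The function names are made up for this example, and it ignores any
counter-specific behaviour.

```python
def replicas_for(level: str, rf: int) -> int:
    """Number of replicas that must respond for a given consistency level."""
    levels = {"ONE": 1, "QUORUM": rf // 2 + 1, "ALL": rf}
    return levels[level]


def read_sees_latest_write(write_cl: str, read_cl: str, rf: int) -> bool:
    """True when the read and write replica sets are guaranteed to overlap."""
    return replicas_for(read_cl, rf) + replicas_for(write_cl, rf) > rf
```

Under this rule, ONE/ONE at RF = 1 always overlaps, while ONE/ONE at RF = 3
gives no such guarantee; writing at ALL, or using QUORUM for both reads and
writes, restores the overlap.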

If you keep all your code you have now exactly the same, just change the
code at the end where you read the counter value back, to keep reading the
counter value back every second for 60 seconds and see if all the values
eventually match up -- they should (as the counter value is replicated to
all the nodes and their old values discarded).
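
Something along these lines might do for the polling test. It is only a
sketch: read_counter is a hypothetical stand-in for whatever call your
client uses to read the counter, and the "last 10 reads agree" check is an
arbitrary choice of mine.

```python
import time
from typing import Callable, List


def poll_counter(read_counter: Callable[[], int],
                 attempts: int = 60,
                 interval_s: float = 1.0) -> List[int]:
    """Read the counter `attempts` times, sleeping `interval_s` between reads."""
    values = []
    for i in range(attempts):
        values.append(read_counter())
        if i < attempts - 1:
            time.sleep(interval_s)
    return values


def converged(values: List[int], tail: int = 10) -> bool:
    """True if the last `tail` reads all returned the same value."""
    recent = values[-tail:]
    return len(recent) == tail and len(set(recent)) == 1
```

If converged() stays false after the full minute, or the reads settle on a
value other than the number of increments you sent, that would point at
something other than replication lag.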

-R

On Mon, Nov 7, 2011 at 8:15 AM, Alain RODRIGUEZ <ar...@gmail.com> wrote:

> Hi,
>
> I'm trying to switch from RF = 1 to RF = 3, but I get wrong values from
> counters when doing so...
>
> I have a CF that contains many event counters. When I'm at RF = 1 and
> simulate 10 events, they are counted correctly.
> However, when I switch to RF = 3, my counter shows a wrong value that
> sometimes changes when requested twice (it can return 7, then 5, instead
> of 10 every time).
>
> I first thought that it was a CL problem, because I seem to remember
> reading that I had to use CL.One for reads and writes with counters. So
> I tried with CL.One, without success...
>
> What am I doing wrong? Is there some precaution to take when replicating
> counters?
>
> Alain
>