Posted to user@cassandra.apache.org by Jiang Chen <ji...@gmail.com> on 2011/08/29 23:03:14 UTC

Updates lost

Hi,

I've just started developing with Cassandra (0.8.4). I noticed that when
the same row and column are updated repeatedly, say in a test case,
updates may get lost. I first saw this with a Java client, but the
following Python script exhibits the same problem.

******************************************************************
import pycassa
import time

pool = pycassa.ConnectionPool('TestKeySpace', server_list=['localhost'])

cf = pycassa.ColumnFamily(pool, 'blah')

for i in range(1, 1000):
	cf.insert('foo', {'body': 'aaa'})
	assert 'aaa'==cf.get('foo')['body']	
	cf.insert('foo', {'body': 'bbb'})
	assert 'bbb'==cf.get('foo')['body']
	#time.sleep(0.015)

	print i
******************************************************************

The script fails the assertion at the point where the value is supposed
to have been updated back to 'aaa' from 'bbb':

******************************************************************
$ python test.py
1
Traceback (most recent call last):
  File "test.py", line 10, in <module>
    assert 'aaa'==cf.get('foo')['body']
AssertionError
******************************************************************

If the client sleeps for a few ms in each loop iteration, the success
rate increases; at 15 ms the script has always succeeded so far.
Interestingly, the problem seems to be sensitive to alphabetical order:
updating the value from 'aaa' to 'bbb' never has a problem, no pause
needed.

Is this a real problem, or am I missing something about how Cassandra
works? Only one Cassandra instance is used in the test.
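One way to see how both symptoms could arise: if two updates carry the same timestamp, the server must break the tie deterministically, and assuming it does so by letting the lexically greater column value win (an assumption about Cassandra's conflict resolution, not something stated in this thread), a toy Python model reproduces the behavior:

```python
# Toy model of last-write-wins reconciliation with millisecond timestamps.
# Assumed tie-break rule: when two writes carry the same timestamp, the
# lexically greater value wins.

def reconcile(stored, incoming):
    """Return the winning (timestamp, value) pair."""
    ts_old, val_old = stored
    ts_new, val_new = incoming
    if ts_new > ts_old:
        return incoming
    if ts_new == ts_old:
        # Tie: the greater value wins, so 'bbb' beats 'aaa' but not vice versa.
        return incoming if val_new > val_old else stored
    return stored

# Two updates in the same millisecond: 'aaa' -> 'bbb' survives the tie...
assert reconcile((1000, 'aaa'), (1000, 'bbb')) == (1000, 'bbb')
# ...but 'bbb' -> 'aaa' is lost, matching the failing assertion in the script.
assert reconcile((1000, 'bbb'), (1000, 'aaa')) == (1000, 'bbb')
```

Under that rule, only the 'bbb' -> 'aaa' direction can lose an update, which would explain the alphabetical sensitivity.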

Thanks.

Jiang

Re: Updates lost

Posted by Jiang Chen <ji...@gmail.com>.
Cheers. That would be another solution.

On Wed, Aug 31, 2011 at 10:42 AM, Jim Ancona <ji...@anconafamily.com> wrote:
> You could also look at Hector's approach in:
> https://github.com/rantav/hector/blob/master/core/src/main/java/me/prettyprint/cassandra/service/clock/MicrosecondsSyncClockResolution.java
>
> It works well and I believe there was some performance testing done on
> it as well.
>
> Jim
>
> On Tue, Aug 30, 2011 at 3:43 PM, Jeremy Hanna
> <je...@gmail.com> wrote:
>> Sorry - misread your earlier email.  I would login to IRC and ask in #cassandra.  I would think given the nature of nanotime you'll run into harder to track down problems, but it may be fine.
>>
>> On Aug 30, 2011, at 2:06 PM, Jiang Chen wrote:
>>
>>> Do you see any problem with my approach to derive the current time in
>>> nano seconds though?
>>>
>>> On Tue, Aug 30, 2011 at 2:39 PM, Jeremy Hanna
>>> <je...@gmail.com> wrote:
>>>> Yes - the reason why internally Cassandra uses milliseconds * 1000 is because System.nanoTime javadoc says "This method can only be used to measure elapsed time and is not related to any other notion of system or wall-clock time."
>>>>
>>>> http://download.oracle.com/javase/6/docs/api/java/lang/System.html#nanoTime%28%29
>>>>
>>>> On Aug 30, 2011, at 1:31 PM, Jiang Chen wrote:
>>>>
>>>>> Indeed it's microseconds. We are talking about how to achieve the
>>>>> precision of microseconds. One way is System.currentTimeInMillis() *
>>>>> 1000. It's only precise to milliseconds. If there are more than one
>>>>> update in the same millisecond, the second one may be lost. That's my
>>>>> original problem.
>>>>>
>>>>> The other way is to derive from System.nanoTime(). This function
>>>>> doesn't directly return the time since epoch. I used the following:
>>>>>
>>>>>       private static long nanotimeOffset = System.nanoTime()
>>>>>                       - System.currentTimeMillis() * 1000000;
>>>>>
>>>>>       private static long currentTimeNanos() {
>>>>>               return System.nanoTime() - nanotimeOffset;
>>>>>       }
>>>>>
>>>>> The timestamp to use is then currentTimeNanos() / 1000.
>>>>>
>>>>> Anyone sees problem with this approach?
>>>>>
>>>>> On Tue, Aug 30, 2011 at 2:20 PM, Edward Capriolo <ed...@gmail.com> wrote:
>>>>>>
>>>>>>
>>>>>> On Tue, Aug 30, 2011 at 1:41 PM, Jeremy Hanna <je...@gmail.com>
>>>>>> wrote:
>>>>>>>
>>>>>>> I would not use nano time with cassandra.  Internally and throughout the
>>>>>>> clients, milliseconds is pretty much a standard.  You can get into trouble
>>>>>>> because when comparing nanoseconds with milliseconds as long numbers,
>>>>>>> nanoseconds will always win.  That bit us a while back when we deleted
>>>>>>> something and it couldn't come back because we deleted it with nanoseconds
>>>>>>> as the timestamp value.
>>>>>>>
>>>>>>> See the caveats for System.nanoTime() for why milliseconds is a standard:
>>>>>>>
>>>>>>> http://download.oracle.com/javase/6/docs/api/java/lang/System.html#nanoTime%28%29
>>>>>>>
>>>>>>> On Aug 30, 2011, at 12:31 PM, Jiang Chen wrote:
>>>>>>>
>>>>>>>> Looks like the theory is correct for the java case at least.
>>>>>>>>
>>>>>>>> The default timestamp precision of Pelops is millisecond. Hence the
>>>>>>>> problem as explained by Peter. Once I supplied timestamps precise to
>>>>>>>> microsecond (using System.nanoTime()), the problem went away.
>>>>>>>>
>>>>>>>> I previously stated that sleeping for a few milliseconds didn't help.
>>>>>>>> It was actually because of the precision of Java Thread.sleep().
>>>>>>>> Sleeping for less than 15ms often doesn't sleep at all.
>>>>>>>>
>>>>>>>> Haven't checked the Python side to see if it's similar situation.
>>>>>>>>
>>>>>>>> Cheers.
>>>>>>>>
>>>>>>>> Jiang
>>>>>>>>
>>>>>>>> On Tue, Aug 30, 2011 at 9:57 AM, Jiang Chen <ji...@gmail.com> wrote:
>>>>>>>>> It's a single node. Thanks for the theory. I suspect part of it may
>>>>>>>>> still be right. Will dig more.
>>>>>>>>>
>>>>>>>>> On Tue, Aug 30, 2011 at 9:50 AM, Peter Schuller
>>>>>>>>> <pe...@infidyne.com> wrote:
>>>>>>>>>>> The problem still happens with very high probability even when it
>>>>>>>>>>> pauses for 5 milliseconds at every loop. If Pycassa uses microseconds
>>>>>>>>>>> it can't be the cause. Also I have the same problem with a Java
>>>>>>>>>>> client
>>>>>>>>>>> using Pelops.
>>>>>>>>>>
>>>>>>>>>> You connect to localhost, but is that a single node or part of a
>>>>>>>>>> cluster with RF > 1? If the latter, you need to use QUORUM consistency
>>>>>>>>>> level to ensure that a read sees your write.
>>>>>>>>>>
>>>>>>>>>> If it's a single node and not a pycassa / client issue, I don't know
>>>>>>>>>> off hand.
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> / Peter Schuller (@scode on twitter)
>>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>>
>>>>>> Isn't the standard microseconds ? (System.currentTimeMillis()*1000L)
>>>>>> http://wiki.apache.org/cassandra/DataModel
>>>>>> The CLI uses microseconds. If your code and the CLI are doing different
>>>>>> things with time BadThingsWillHappen TM
>>>>>>
>>>>>>
>>>>
>>>>
>>
>>
>

Re: Updates lost

Posted by Jim Ancona <ji...@anconafamily.com>.
You could also look at Hector's approach in:
https://github.com/rantav/hector/blob/master/core/src/main/java/me/prettyprint/cassandra/service/clock/MicrosecondsSyncClockResolution.java

It works well and I believe there was some performance testing done on
it as well.

Jim
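Roughly, Hector's MicrosecondsSyncClockResolution returns currentTimeMillis() * 1000 under a lock and bumps the result by one microsecond whenever a call would repeat the previous value. A Python sketch of that idea (illustrative only, not Hector's actual Java code):

```python
import threading
import time

class MicrosecondSyncClock:
    """Strictly increasing wall-clock-based microsecond timestamps.

    Sketch of the idea behind Hector's MicrosecondsSyncClockResolution:
    take the millisecond wall clock scaled to microseconds and, under a
    lock, bump by 1 us whenever two calls land in the same millisecond.
    """

    def __init__(self):
        self._lock = threading.Lock()
        self._last = 0

    def timestamp(self):
        with self._lock:
            now = int(time.time() * 1000) * 1000  # millisecond clock, in us
            if now <= self._last:
                now = self._last + 1  # same millisecond: force uniqueness
            self._last = now
            return now

clock = MicrosecondSyncClock()
stamps = [clock.timestamp() for _ in range(1000)]
# Every timestamp is strictly greater than the previous one, so no two
# updates can tie even within one millisecond.
assert all(b > a for a, b in zip(stamps, stamps[1:]))
```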

On Tue, Aug 30, 2011 at 3:43 PM, Jeremy Hanna
<je...@gmail.com> wrote:
> Sorry - misread your earlier email.  I would login to IRC and ask in #cassandra.  I would think given the nature of nanotime you'll run into harder to track down problems, but it may be fine.

Re: Updates lost

Posted by Jeremy Hanna <je...@gmail.com>.
Sorry - misread your earlier email.  I would log in to IRC and ask in #cassandra.  I would think, given the nature of nanotime, you'll run into harder-to-track-down problems, but it may be fine.

On Aug 30, 2011, at 2:06 PM, Jiang Chen wrote:

> Do you see any problem with my approach to derive the current time in
> nano seconds though?


Re: Updates lost

Posted by Jiang Chen <ji...@gmail.com>.
Do you see any problem with my approach to deriving the current time in
nanoseconds, though?

On Tue, Aug 30, 2011 at 2:39 PM, Jeremy Hanna
<je...@gmail.com> wrote:
> Yes - the reason why internally Cassandra uses milliseconds * 1000 is because System.nanoTime javadoc says "This method can only be used to measure elapsed time and is not related to any other notion of system or wall-clock time."
>
> http://download.oracle.com/javase/6/docs/api/java/lang/System.html#nanoTime%28%29

Re: Updates lost

Posted by Jeremy Hanna <je...@gmail.com>.
Yes - the reason Cassandra internally uses milliseconds * 1000 is that the System.nanoTime javadoc says "This method can only be used to measure elapsed time and is not related to any other notion of system or wall-clock time."

http://download.oracle.com/javase/6/docs/api/java/lang/System.html#nanoTime%28%29

On Aug 30, 2011, at 1:31 PM, Jiang Chen wrote:

> Indeed it's microseconds. We are talking about how to achieve the
> precision of microseconds. One way is System.currentTimeInMillis() *
> 1000. It's only precise to milliseconds. If there are more than one
> update in the same millisecond, the second one may be lost. That's my
> original problem.
> 
> The other way is to derive from System.nanoTime(). This function
> doesn't directly return the time since epoch. I used the following:
> 
> 	private static long nanotimeOffset = System.nanoTime()
> 			- System.currentTimeMillis() * 1000000;
> 
> 	private static long currentTimeNanos() {
> 		return System.nanoTime() - nanotimeOffset;
> 	}
> 
> The timestamp to use is then currentTimeNanos() / 1000.
> 
> Anyone sees problem with this approach?


Re: Updates lost

Posted by Jiang Chen <ji...@gmail.com>.
Indeed it's microseconds. We are talking about how to achieve microsecond
precision. One way is System.currentTimeMillis() * 1000, but that's only
precise to milliseconds: if there is more than one update in the same
millisecond, the second one may be lost. That's my original problem.

The other way is to derive it from System.nanoTime(). That function
doesn't directly return the time since the epoch, so I used the following:

	private static long nanotimeOffset = System.nanoTime()
			- System.currentTimeMillis() * 1000000;

	private static long currentTimeNanos() {
		return System.nanoTime() - nanotimeOffset;
	}

The timestamp to use is then currentTimeNanos() / 1000.

Does anyone see a problem with this approach?
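For comparison, the same offset trick in modern Python (an illustration, not code from the thread): anchor a monotonic clock to the wall clock once, then derive epoch-based microsecond timestamps from the monotonic reading.

```python
import time

# Python analogue of the nanoTime-offset trick above: capture the gap
# between the monotonic clock and the wall clock once, then convert every
# subsequent monotonic reading into an epoch-based timestamp.
_offset_ns = time.perf_counter_ns() - int(time.time() * 1_000_000_000)

def current_time_micros():
    """Epoch microseconds derived from the monotonic clock."""
    return (time.perf_counter_ns() - _offset_ns) // 1_000

a = current_time_micros()
b = current_time_micros()
# The monotonic source never goes backwards, unlike a wall clock that can
# be stepped by NTP -- which is one caveat raised later in the thread.
assert b >= a
# Sanity check: the derived value tracks the wall clock to within a second.
assert abs(current_time_micros() - int(time.time() * 1_000_000)) < 1_000_000
```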

On Tue, Aug 30, 2011 at 2:20 PM, Edward Capriolo <ed...@gmail.com> wrote:
> Isn't the standard microseconds ? (System.currentTimeMillis()*1000L)
> http://wiki.apache.org/cassandra/DataModel
> The CLI uses microseconds. If your code and the CLI are doing different
> things with time BadThingsWillHappen TM
>
>

Re: Updates lost

Posted by Jeremy Hanna <je...@gmail.com>.
Ed, you're right: milliseconds * 1000, i.e. microseconds.  The other caveats about nano time still stand, but you're right about the unit.  Sorry about that.

On Aug 30, 2011, at 1:20 PM, Edward Capriolo wrote:

> 
> 
> Isn't the standard microseconds ? (System.currentTimeMillis()*1000L)
> 
> http://wiki.apache.org/cassandra/DataModel
> 
> The CLI uses microseconds. If your code and the CLI are doing different things with time BadThingsWillHappen TM
> 
> 


Re: Updates lost

Posted by Edward Capriolo <ed...@gmail.com>.
On Tue, Aug 30, 2011 at 1:41 PM, Jeremy Hanna <je...@gmail.com> wrote:

> I would not use nano time with cassandra.  Internally and throughout the
> clients, milliseconds is pretty much a standard.  You can get into trouble
> because when comparing nanoseconds with milliseconds as long numbers,
> nanoseconds will always win.  That bit us a while back when we deleted
> something and it couldn't come back because we deleted it with nanoseconds
> as the timestamp value.
>
> See the caveats for System.nanoTime() for why milliseconds is a standard:
>
> http://download.oracle.com/javase/6/docs/api/java/lang/System.html#nanoTime%28%29
Isn't the standard microseconds? (System.currentTimeMillis()*1000L)

http://wiki.apache.org/cassandra/DataModel

The CLI uses microseconds. If your code and the CLI are doing different
things with time, BadThingsWillHappen TM
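The millisecond-to-microsecond convention Edward points at (Java's System.currentTimeMillis()*1000L) maps to Python roughly as follows; micros_now is a hypothetical helper name for illustration, not any client's actual API:

```python
import time

def micros_now():
    """Wall-clock time in microseconds -- the convention the CLI uses,
    equivalent to Java's System.currentTimeMillis() * 1000L.
    (Hypothetical helper, not any client's actual API.)"""
    return int(time.time() * 1e6)

ts = micros_now()
# A present-day microsecond timestamp has 16 digits versus 13 for
# milliseconds, so mixing the two units makes every microsecond-stamped
# write look "newer" than every millisecond-stamped one.
assert len(str(ts)) >= 16
```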

Re: Updates lost

Posted by Jeremy Hanna <je...@gmail.com>.
I would not use nano time with cassandra.  Internally and throughout the clients, milliseconds is pretty much a standard.  You can get into trouble because when comparing nanoseconds with milliseconds as long numbers, nanoseconds will always win.  That bit us a while back when we deleted something and it couldn't come back because we deleted it with nanoseconds as the timestamp value.

See the caveats for System.nanoTime() for why milliseconds is a standard:
http://download.oracle.com/javase/6/docs/api/java/lang/System.html#nanoTime%28%29
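The magnitude mismatch described above is easy to see with plain numbers; a minimal sketch (the exact values are only illustrative of the scale difference):

```python
import time

millis = int(time.time() * 1000)   # ~13-digit millisecond timestamp
nano_stamp = millis * 1_000_000    # the scale of a nanosecond-based stamp

# Compared as plain longs, the nanosecond-scale value always wins, so a
# delete stamped in nanoseconds shadows every later millisecond-stamped
# write -- the "deleted and couldn't come back" behavior above.
assert nano_stamp > millis
assert nano_stamp > millis + 10_000   # even a write 10 seconds later still loses
```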

On Aug 30, 2011, at 12:31 PM, Jiang Chen wrote:

> Looks like the theory is correct for the java case at least.
> 
> The default timestamp precision of Pelops is millisecond. Hence the
> problem as explained by Peter. Once I supplied timestamps precise to
> microsecond (using System.nanoTime()), the problem went away.
> 
> I previously stated that sleeping for a few milliseconds didn't help.
> It was actually because of the precision of Java Thread.sleep().
> Sleeping for less than 15ms often doesn't sleep at all.
> 
> Haven't checked the Python side to see if it's a similar situation.
> 
> Cheers.
> 
> Jiang
> 
> On Tue, Aug 30, 2011 at 9:57 AM, Jiang Chen <ji...@gmail.com> wrote:
>> It's a single node. Thanks for the theory. I suspect part of it may
>> still be right. Will dig more.
>> 
>> On Tue, Aug 30, 2011 at 9:50 AM, Peter Schuller
>> <pe...@infidyne.com> wrote:
>>>> The problem still happens with very high probability even when it
>>>> pauses for 5 milliseconds at every loop. If Pycassa uses microseconds
>>>> it can't be the cause. Also I have the same problem with a Java client
>>>> using Pelops.
>>> 
>>> You connect to localhost, but is that a single node or part of a
>>> cluster with RF > 1? If the latter, you need to use QUORUM consistency
>>> level to ensure that a read sees your write.
>>> 
>>> If it's a single node and not a pycassa / client issue, I don't know off hand.
>>> 
>>> --
>>> / Peter Schuller (@scode on twitter)
>>> 
>> 


Re: Updates lost

Posted by Jiang Chen <ji...@gmail.com>.
Looks like the theory is correct for the java case at least.

The default timestamp precision of Pelops is millisecond. Hence the
problem as explained by Peter. Once I supplied timestamps precise to
microsecond (using System.nanoTime()), the problem went away.

I previously stated that sleeping for a few milliseconds didn't help.
It was actually because of the precision of Java Thread.sleep().
Sleeping for less than 15ms often doesn't sleep at all.

Haven't checked the Python side to see if it's a similar situation.

Cheers.

Jiang

On Tue, Aug 30, 2011 at 9:57 AM, Jiang Chen <ji...@gmail.com> wrote:
> It's a single node. Thanks for the theory. I suspect part of it may
> still be right. Will dig more.
>
> On Tue, Aug 30, 2011 at 9:50 AM, Peter Schuller
> <pe...@infidyne.com> wrote:
>>> The problem still happens with very high probability even when it
>>> pauses for 5 milliseconds at every loop. If Pycassa uses microseconds
>>> it can't be the cause. Also I have the same problem with a Java client
>>> using Pelops.
>>
>> You connect to localhost, but is that a single node or part of a
>> cluster with RF > 1? If the latter, you need to use QUORUM consistency
>> level to ensure that a read sees your write.
>>
>> If it's a single node and not a pycassa / client issue, I don't know off hand.
>>
>> --
>> / Peter Schuller (@scode on twitter)
>>
>

Re: Updates lost

Posted by Jiang Chen <ji...@gmail.com>.
It's a single node. Thanks for the theory. I suspect part of it may
still be right. Will dig more.

On Tue, Aug 30, 2011 at 9:50 AM, Peter Schuller
<pe...@infidyne.com> wrote:
>> The problem still happens with very high probability even when it
>> pauses for 5 milliseconds at every loop. If Pycassa uses microseconds
>> it can't be the cause. Also I have the same problem with a Java client
>> using Pelops.
>
> You connect to localhost, but is that a single node or part of a
> cluster with RF > 1? If the latter, you need to use QUORUM consistency
> level to ensure that a read sees your write.
>
> If it's a single node and not a pycassa / client issue, I don't know off hand.
>
> --
> / Peter Schuller (@scode on twitter)
>

Re: Updates lost

Posted by Peter Schuller <pe...@infidyne.com>.
> The problem still happens with very high probability even when it
> pauses for 5 milliseconds at every loop. If Pycassa uses microseconds
> it can't be the cause. Also I have the same problem with a Java client
> using Pelops.

You connect to localhost, but is that a single node or part of a
cluster with RF > 1? If the latter, you need to use QUORUM consistency
level to ensure that a read sees your write.
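The reason QUORUM guarantees a read sees the write is the standard overlap condition R + W > N; a toy sketch of the arithmetic, not tied to any client library:

```python
# The overlap condition behind "QUORUM reads see QUORUM writes":
# with replication factor N, a read of R replicas is guaranteed to
# include at least one replica from a write to W replicas iff R + W > N.
def read_sees_write(n, r, w):
    return r + w > n

N = 3
quorum = N // 2 + 1   # 2 of 3
assert read_sees_write(N, quorum, quorum)   # QUORUM + QUORUM: always consistent
assert not read_sees_write(N, 1, 1)         # ONE + ONE: the read can miss the write
```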

If it's a single node and not a pycassa / client issue, I don't know off hand.

-- 
/ Peter Schuller (@scode on twitter)

Re: Updates lost

Posted by Jiang Chen <ji...@gmail.com>.
The problem still happens with very high probability even when it
pauses for 5 milliseconds at every loop. If Pycassa uses microseconds
it can't be the cause. Also I have the same problem with a Java client
using Pelops.

On Tue, Aug 30, 2011 at 12:14 AM, Tyler Hobbs <ty...@datastax.com> wrote:
>
> On Mon, Aug 29, 2011 at 4:56 PM, Peter Schuller
> <pe...@infidyne.com> wrote:
>>
>> > If the client sleeps for a few ms at each loop, the success rate
>> > increases. At 15 ms, the script always succeeds so far. Interestingly,
>> > the problem seems to be sensitive to alphabetical order. Updating the
>> > value from 'aaa' to 'bbb' never has problem. No pause needed.
>>
>> Is it possible the version of pycassa you're using does not guarantee
>> that successive queries use non-identical and monotonically increasing
>> timestamps?
>
> By default, pycassa uses microsecond-precision timestamps.
> ColumnFamily.insert() returns the timestamp used for the insert, so you
> could always check that it was different.  However, I doubt that you're
> getting more than one insert per microsecond, unless you have VM issues with
> the system clock.
>
> --
> Tyler Hobbs
> Software Engineer, DataStax
> Maintainer of the pycassa Cassandra Python client library
>
>

Re: Updates lost

Posted by Tyler Hobbs <ty...@datastax.com>.
On Mon, Aug 29, 2011 at 4:56 PM, Peter Schuller <peter.schuller@infidyne.com> wrote:

> > If the client sleeps for a few ms at each loop, the success rate
> > increases. At 15 ms, the script always succeeds so far. Interestingly,
> > the problem seems to be sensitive to alphabetical order. Updating the
> > value from 'aaa' to 'bbb' never has problem. No pause needed.
>
> Is it possible the version of pycassa you're using does not guarantee
> that successive queries use non-identical and monotonically increasing
> timestamps?
>

By default, pycassa uses microsecond-precision timestamps.
ColumnFamily.insert() returns the timestamp used for the insert, so you
could always check that it was different.  However, I doubt that you're
getting more than one insert per microsecond, unless you have VM issues with
the system clock.
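Whether a loop really produces more than one timestamp per microsecond is easy to check locally with the same int(time.time() * 1e6) recipe, no cluster needed; a small sketch:

```python
import time

# Generate 1000 timestamps the way a microsecond-precision client might,
# and count collisions.  On a clock that only ticks every 1-15 ms
# (e.g. older Windows), most of these collide; with true microsecond
# resolution, collisions are rare.
stamps = [int(time.time() * 1e6) for _ in range(1000)]
collisions = len(stamps) - len(set(stamps))
print("collisions out of 1000:", collisions)
```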

-- 
Tyler Hobbs
Software Engineer, DataStax <http://datastax.com/>
Maintainer of the pycassa <http://github.com/pycassa/pycassa> Cassandra
Python client library

Re: Updates lost

Posted by Paul Loy <ke...@gmail.com>.
Well, on Windows Vista and below (haven't checked on 7),
System.currentTimeMillis only has around 10ms granularity. That is, for any
10ms period, you get the same value. I develop on Windows and I'd get
sporadic integration test failures due to this.

On Thu, Sep 1, 2011 at 8:31 PM, Jeremiah Jordan <
jeremiah.jordan@morningstar.com> wrote:

> Are you running on windows?  If the default timestamp is just using
> time.time()*1e6 you will get the same timestamp twice if the code is close
> together.  time.time() on windows is only millisecond resolution.  I don't
> use pycassa, but in the Thrift api wrapper I created for our python code I
> implemented the following function for getting timestamps:
>
> def GetTimeInMicroSec():
>    """
>    Returns the current time in microseconds, returned value always
> increases with each call.
>
>    :return: Current time in microseconds
>    """
>    newTime = long(time.time()*1e6)
>    try:
>        if GetTimeInMicroSec.lastTime >= newTime:
>            newTime = GetTimeInMicroSec.lastTime + 1
>    except AttributeError:
>        pass
>    GetTimeInMicroSec.lastTime = newTime
>    return newTime
>
>
> On 08/29/2011 04:56 PM, Peter Schuller wrote:
>
>> If the client sleeps for a few ms at each loop, the success rate
>>> increases. At 15 ms, the script always succeeds so far. Interestingly,
>>> the problem seems to be sensitive to alphabetical order. Updating the
>>> value from 'aaa' to 'bbb' never has problem. No pause needed.
>>>
>> Is it possible the version of pycassa you're using does not guarantee
>> that successive queries use non-identical and monotonically increasing
>> timestamps? I'm just speculating, but if that is the case and two
>> requests are sent with the same timestamp (due to resolution being
>> lower than the time it takes between calls), the tie breaking would be
>> the column value which jives with the fact that you're saying it seems
>> to depend on the value.
>>
>> (I haven't checked current nor past versions of pycassa to determine
>> if this is plausible. Just speculating.)
>>
>>


-- 
---------------------------------------------
Paul Loy
paul@keteracel.com
http://uk.linkedin.com/in/paulloy

Re: Updates lost

Posted by Jeremiah Jordan <je...@morningstar.com>.
Are you running on Windows?  If the default timestamp is just using
time.time()*1e6 you will get the same timestamp twice when the calls are
close together.  time.time() on Windows has only millisecond resolution.
I don't use pycassa, but in the Thrift API wrapper I created for our
python code I implemented the following function for getting timestamps:

import time

def GetTimeInMicroSec():
    """
    Returns the current time in microseconds; the returned value always
    increases with each call.

    :return: Current time in microseconds
    """
    newTime = long(time.time()*1e6)
    try:
        if GetTimeInMicroSec.lastTime >= newTime:
            newTime = GetTimeInMicroSec.lastTime + 1
    except AttributeError:
        pass
    GetTimeInMicroSec.lastTime = newTime
    return newTime


On 08/29/2011 04:56 PM, Peter Schuller wrote:
>> If the client sleeps for a few ms at each loop, the success rate
>> increases. At 15 ms, the script always succeeds so far. Interestingly,
>> the problem seems to be sensitive to alphabetical order. Updating the
>> value from 'aaa' to 'bbb' never has problem. No pause needed.
> Is it possible the version of pycassa you're using does not guarantee
> that successive queries use non-identical and monotonically increasing
> timestamps? I'm just speculating, but if that is the case and two
> requests are sent with the same timestamp (due to resolution being
> lower than the time it takes between calls), the tie breaking would be
> the column value which jives with the fact that you're saying it seems
> to depend on the value.
>
> (I haven't checked current nor past versions of pycassa to determine
> if this is plausible. Just speculating.)
>

Re: Updates lost

Posted by Peter Schuller <pe...@infidyne.com>.
> If the client sleeps for a few ms at each loop, the success rate
> increases. At 15 ms, the script always succeeds so far. Interestingly,
> the problem seems to be sensitive to alphabetical order. Updating the
> value from 'aaa' to 'bbb' never has problem. No pause needed.

Is it possible the version of pycassa you're using does not guarantee
that successive queries use non-identical and monotonically increasing
timestamps? I'm just speculating, but if that is the case and two
requests are sent with the same timestamp (due to resolution being
lower than the time it takes between calls), the tie breaking would be
the column value which jives with the fact that you're saying it seems
to depend on the value.

(I haven't checked current nor past versions of pycassa to determine
if this is plausible. Just speculating.)
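If this theory holds, the alphabetical asymmetry falls out of the tie-breaking rule: on identical timestamps the lexically greater value survives. A toy model of that behavior (reconcile is a hypothetical stand-in, not Cassandra's actual code):

```python
def reconcile(a, b):
    """Toy model of column reconciliation: the newer timestamp wins;
    on a timestamp tie, the lexically greater value wins.
    a and b are (timestamp, value) pairs."""
    if a[0] != b[0]:
        return a if a[0] > b[0] else b
    return a if a[1] >= b[1] else b

# Two writes land within the same millisecond, so they share a timestamp:
assert reconcile((1000, b'aaa'), (1000, b'bbb')) == (1000, b'bbb')  # update to 'bbb' sticks
assert reconcile((1000, b'bbb'), (1000, b'aaa')) == (1000, b'bbb')  # update back to 'aaa' is lost
# With distinct timestamps the later write always wins; the tie-break never fires:
assert reconcile((2000, b'aaa'), (1000, b'bbb')) == (2000, b'aaa')
```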

-- 
/ Peter Schuller (@scode on twitter)