Posted to user@cassandra.apache.org by John Alessi <jo...@socketlabs.com> on 2010/03/19 00:12:52 UTC

Issue with TimeUUID

I am having an issue where Cassandra doesn't seem to distinguish between two different UUIDs when both are based on the exact same time and the column family is sorted by TimeUUIDType.

*****************************
Some of my config:
*****************************
<Keyspaces>
  <Keyspace Name="OD">
    <KeysCachedFraction>0.01</KeysCachedFraction>
    <ColumnFamily CompareWith="TimeUUIDType" Name="Events"/>
  </Keyspace>
</Keyspaces>
...
<Partitioner>org.apache.cassandra.dht.OrderPreservingPartitioner</Partitioner>


*****************************
Ruby sample:
*****************************
require 'rubygems'
require 'cassandra'

# Three log lines: the first two share a timestamp, the third is one second later.
logs = [{:DateTime=>'1257195334', :Text=>"something happened"},
        {:DateTime=>'1257195334', :Text=>"something happened again at the same time"},
        {:DateTime=>'1257195335', :Text=>"something happened one second later"}]

client = Cassandra.new("OD")
logs.each do |log_line|
  puts log_line[:DateTime]
  # Build a time-based UUID from the log line's Unix timestamp.
  event_id = UUID.new(Time.at(log_line[:DateTime].to_i))
  puts event_id.to_guid
  # A hit here means Cassandra already holds a column it treats as equal to this UUID.
  unless client.get(:Events, "processed", event_id).nil?
    puts "duplicate: " + event_id.to_guid + " " + log_line[:DateTime]
  end
  client.insert(:Events, "processed", event_id => log_line[:Text])
end


*****************************
Outputs:
*****************************
1257195334
1077e700-c7f2-11de-86d5-f5bcc793a028
1257195334
1077e700-c7f2-11de-982e-6fad363d5f29
duplicate: 1077e700-c7f2-11de-982e-6fad363d5f29 1257195334
1257195335
11107d80-c7f2-11de-9b6f-4c8aee849eef


The second UUID differs from the first in everything but its timestamp, yet the lookup before the second insert already finds a column, so Cassandra reports it as a duplicate.

What am I missing?

--
John Alessi
SocketLabs, Inc.
484-418-1282



Re: Issue with TimeUUID

Posted by John Alessi <jo...@socketlabs.com>.
Yes, I tried that, but then the columns no longer sort correctly by date.
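
One possible reason for the sorting problem, sketched below in Ruby: a version 1 UUID stores the low 32 bits of its timestamp first, so a comparison that walks the UUID in stored order does not follow time. This assumes the lexical comparison starts with that first field; it is not taken from Cassandra's source. The helper v1_timestamp and the second UUID are hypothetical, made up for illustration.

```ruby
# Extract the 60-bit timestamp (100 ns ticks) from a version 1 UUID string.
# Hypothetical helper, not part of the cassandra gem.
def v1_timestamp(guid)
  time_low, time_mid, time_hi_and_version =
    guid.split('-').first(3).map { |field| field.to_i(16) }
  ((time_hi_and_version & 0x0FFF) << 48) | (time_mid << 32) | time_low
end

# First UUID from the thread.
EARLIER = "1077e700-c7f2-11de-86d5-f5bcc793a028"
# Hypothetical UUID a few minutes later: time_mid has advanced from c7f2
# to c7f3 and time_low has wrapped around to zero.
LATER = "00000000-c7f3-11de-86d5-f5bcc793a028"

# Comparing in stored order puts the later UUID first.
LEXICAL = [EARLIER, LATER].sort
# Comparing by embedded timestamp gives chronological order.
CHRONOLOGICAL = [EARLIER, LATER].sort_by { |guid| v1_timestamp(guid) }

puts LEXICAL.inspect
puts CHRONOLOGICAL.inspect
```
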
--
John Alessi
SocketLabs, Inc.
484-418-1282


On Mar 19, 2010, at 12:12 PM, Jesse McConnell wrote:

> alternately try using LexicalUUIDType, that seems to work
> 
> jesse
> 
> --
> jesse mcconnell
> jesse.mcconnell@gmail.com
> 
> 
> 
> On Fri, Mar 19, 2010 at 11:00, Sylvain Lebresne <sy...@yakaz.com> wrote:
>> As said, I agree with that.
>> I've thus created a jira issue
>> (https://issues.apache.org/jira/browse/CASSANDRA-907).
>> The discussion could continue there.
>> 
>> On Fri, Mar 19, 2010 at 4:30 PM, Jesse McConnell
>> <je...@gmail.com> wrote:
>>> imo it is a terrible bug..
>>> 
>>> the usage of a TimeUUIDType implies that you actually care about
>>> the unique bits outside of the timestamp...
>>> 
>>> currently it's nothing more than a LongType ColumnFamily backed by
>>> System.currentTimeMillis() as a source for column names.
>>> 
>>> jesse
>>> 
>>> --
>>> jesse mcconnell
>>> jesse.mcconnell@gmail.com
>>> 
>>> 
>>> 
>>> On Fri, Mar 19, 2010 at 04:53, Sylvain Lebresne <sy...@yakaz.com> wrote:
>>>> Just looked at the code and it indeed just compares the
>>>> timestamps. I also find it weird and I would be in favor of changing it,
>>>> but maybe there was a good reason to do it the way it is (even
>>>> if I don't see one right now). I'll let people give their opinion on
>>>> that.
>>>> 
>>>> In the meantime, if you need a quick fix for testing, I've attached
>>>> a two-line patch that should fix it.
>>>> 
>>>> --
>>>> Sylvain
>>>> 
>>>> On Fri, Mar 19, 2010 at 1:25 AM, John Alessi <jo...@socketlabs.com> wrote:
>>>>> But they are different names.  In my example they are:
>>>>> 1077e700-c7f2-11de-86d5-f5bcc793a028
>>>>> 1077e700-c7f2-11de-982e-6fad363d5f29
>>>>> But Cassandra sees them as the same.
>>>>> --
>>>>> John
>>>>> 
>>>>> 
>>>>> 
>>>>> On Mar 18, 2010, at 7:17 PM, Brandon Williams wrote:
>>>>> 
>>>>> On Thu, Mar 18, 2010 at 6:12 PM, John Alessi <jo...@socketlabs.com> wrote:
>>>>>> 
>>>>>> I am having an issue where Cassandra doesn't seem to be able to
>>>>>> distinguish between 2 different UUIDs if based on the same exact time, and
>>>>>> sorting by TimeUUID.
>>>>> 
>>>>> *snip*
>>>>>> 
>>>>>> Cassandra doesn't seem to be able to distinguish between 2 different UUIDs
>>>>>> if based on the same exact time, and sorting by TimeUUID.
>>>>>> 
>>>>>> What am I missing???
>>>>> 
>>>>> Column names must be distinct.  If you insert two columns with the same
>>>>> name, one overwrites the other.
>>>>> -Brandon
>>>>> 
>>>>> 
>>>> 
>>> 
>> 


Re: Issue with TimeUUID

Posted by Jesse McConnell <je...@gmail.com>.
alternately try using LexicalUUIDType, that seems to work

jesse

--
jesse mcconnell
jesse.mcconnell@gmail.com




Re: Issue with TimeUUID

Posted by Sylvain Lebresne <sy...@yakaz.com>.
As said, I agree with that.
I've thus created a JIRA issue
(https://issues.apache.org/jira/browse/CASSANDRA-907).
The discussion can continue there.


Re: Issue with TimeUUID

Posted by Jesse McConnell <je...@gmail.com>.
imo it is a terrible bug..

the usage of a TimeUUIDType implies that you actually care about
the unique bits outside of the timestamp...

currently it's nothing more than a LongType ColumnFamily backed by
System.currentTimeMillis() as a source for column names.
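
To make the difference concrete, here is a rough Ruby sketch of two comparators: one that looks only at the timestamp, and one that falls back to the remaining bits on a tie. Both are illustrations written for this thread, not Cassandra's code and not the patch mentioned in the thread. The names v1_timestamp, TIME_ONLY and TIME_THEN_BYTES are made up.

```ruby
# Extract the 60-bit timestamp (100 ns ticks) from a version 1 UUID string.
# Hypothetical helper, not part of the cassandra gem.
def v1_timestamp(guid)
  time_low, time_mid, time_hi_and_version =
    guid.split('-').first(3).map { |field| field.to_i(16) }
  ((time_hi_and_version & 0x0FFF) << 48) | (time_mid << 32) | time_low
end

# Timestamp-only comparison: two UUIDs from the same instant compare as equal.
TIME_ONLY = lambda do |left, right|
  v1_timestamp(left) <=> v1_timestamp(right)
end

# Timestamp first, then the full hex digits as a tie-breaker, so distinct
# UUIDs never compare as equal while still sorting by time.
TIME_THEN_BYTES = lambda do |left, right|
  by_time = v1_timestamp(left) <=> v1_timestamp(right)
  by_time.zero? ? (left.delete('-') <=> right.delete('-')) : by_time
end

# The two UUIDs from the original post.
FIRST  = "1077e700-c7f2-11de-86d5-f5bcc793a028"
SECOND = "1077e700-c7f2-11de-982e-6fad363d5f29"

puts TIME_ONLY.call(FIRST, SECOND)        # prints 0
puts TIME_THEN_BYTES.call(FIRST, SECOND)  # prints -1
```
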

jesse

--
jesse mcconnell
jesse.mcconnell@gmail.com




Re: Issue with TimeUUID

Posted by Sylvain Lebresne <sy...@yakaz.com>.
Just looked at the code and it indeed just compares the
timestamps. I also find it weird and I would be in favor of changing it,
but maybe there was a good reason to do it the way it is (even
if I don't see one right now). I'll let people give their opinion on
that.

In the meantime, if you need a quick fix for testing, I've attached
a two-line patch that should fix it.
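
For anyone who wants to check the timestamps by hand, the Ruby sketch below pulls the 60-bit timestamp out of the two UUIDs from the original post, following the version 1 UUID field layout. The helper names v1_timestamp and unix_seconds are made up for this example; nothing here comes from the cassandra gem or from Cassandra itself.

```ruby
# Offset between the UUID epoch (1582-10-15) and the Unix epoch, in 100 ns ticks.
UUID_EPOCH_OFFSET = 0x01B21DD213814000

# Extract the 60-bit timestamp (100 ns ticks) from a version 1 UUID string.
def v1_timestamp(guid)
  time_low, time_mid, time_hi_and_version =
    guid.split('-').first(3).map { |field| field.to_i(16) }
  # Mask off the 4 version bits before assembling the timestamp.
  ((time_hi_and_version & 0x0FFF) << 48) | (time_mid << 32) | time_low
end

# Convert the embedded timestamp to whole seconds since the Unix epoch.
def unix_seconds(guid)
  (v1_timestamp(guid) - UUID_EPOCH_OFFSET) / 10_000_000
end

# The two UUIDs from the original post.
FIRST  = "1077e700-c7f2-11de-86d5-f5bcc793a028"
SECOND = "1077e700-c7f2-11de-982e-6fad363d5f29"

puts unix_seconds(FIRST)                           # prints 1257195334
puts v1_timestamp(FIRST) == v1_timestamp(SECOND)   # prints true
puts FIRST == SECOND                               # prints false
```

The two UUIDs carry the same timestamp and differ only in the trailing clock-sequence and node fields, which a timestamp-only comparison never looks at.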

--
Sylvain


Re: Issue with TimeUUID

Posted by John Alessi <jo...@socketlabs.com>.
But they are different names.  In my example they are:

1077e700-c7f2-11de-86d5-f5bcc793a028
1077e700-c7f2-11de-982e-6fad363d5f29

But Cassandra sees them as the same.
--
John






Re: Issue with TimeUUID

Posted by Brandon Williams <dr...@gmail.com>.
On Thu, Mar 18, 2010 at 6:12 PM, John Alessi <jo...@socketlabs.com> wrote:

> I am having an issue where Cassandra doesn't seem to be able to distinguish
> between 2 different UUIDs if based on the same exact time, and sorting by
> TimeUUID.
>
*snip*

> Cassandra doesn't seem to be able to distinguish between 2 different UUIDs
> if based on the same exact time, and sorting by TimeUUID.
>
> What am I missing???


Column names must be distinct.  If you insert two columns with the same
name, one overwrites the other.
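
A small Ruby simulation of that overwrite rule, assuming (as reported elsewhere in this thread) that column names are compared on their timestamp alone. ColumnStore and v1_timestamp are toy names invented for this sketch; they do not model Cassandra's storage.

```ruby
# Extract the 60-bit timestamp (100 ns ticks) from a version 1 UUID string.
# Hypothetical helper, not part of the cassandra gem.
def v1_timestamp(guid)
  time_low, time_mid, time_hi_and_version =
    guid.split('-').first(3).map { |field| field.to_i(16) }
  ((time_hi_and_version & 0x0FFF) << 48) | (time_mid << 32) | time_low
end

# Toy sorted column container: a name that compares equal to an existing
# name replaces that column instead of adding a new one.
class ColumnStore
  def initialize(&comparator)
    @comparator = comparator
    @columns = []
  end

  def insert(name, value)
    index = @columns.index { |existing, _| @comparator.call(existing, name).zero? }
    if index
      @columns[index] = [name, value]   # same name: overwrite
    else
      @columns << [name, value]
      @columns.sort! { |left, right| @comparator.call(left[0], right[0]) }
    end
  end

  def values
    @columns.map { |_, value| value }
  end
end

# Build a store that compares names by timestamp only and replay the
# three inserts from the original post.
def replay_inserts
  store = ColumnStore.new { |left, right| v1_timestamp(left) <=> v1_timestamp(right) }
  store.insert("1077e700-c7f2-11de-86d5-f5bcc793a028", "something happened")
  store.insert("1077e700-c7f2-11de-982e-6fad363d5f29", "something happened again at the same time")
  store.insert("11107d80-c7f2-11de-9b6f-4c8aee849eef", "something happened one second later")
  store.values
end

puts replay_inserts.inspect
```

Only two values survive: the second insert replaces the first because the comparator sees the two names as equal.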

-Brandon