Posted to user@cassandra.apache.org by Drew Kutcharian <dr...@venarc.com> on 2011/04/14 00:07:58 UTC

Possible design flaw in "Cassandra By Example" blog

Hi Everyone,

I was going through the "Cassandra By Example" blog post http://www.rackspace.com/cloud/blog/2010/05/12/cassandra-by-example/ and I have a question about the user sign-up section:

username = 'jericevans'
password = '**********'
useruuid = str(uuid())
columns = {'id': useruuid, 'username': username, 'password': password}
USER.insert(useruuid, columns)
USERNAME.insert(username, {'id': useruuid})

How can I guarantee that USERNAME.insert(username, {'id': useruuid}) won't overwrite someone else's account? In other words, how can I guarantee that the username doesn't already exist in Cassandra? I know I can check first, but in a highly concurrent environment there's a possibility that between USER.insert(useruuid, columns) and USERNAME.insert(username, {'id': useruuid}) someone else does the same USERNAME.insert(username, {'id': useruuid}) and hijacks the user's account.
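
To make the interleaving concrete, here is a minimal sketch that replays it step by step. It assumes plain in-memory dicts standing in for the USER and USERNAME column families; the helper names (username_taken, write_user, write_username) are hypothetical and are not part of the blog post or of any Cassandra client.

```python
from uuid import uuid4

# In-memory stand-ins for the two column families (an assumption for
# illustration; a real client would talk to Cassandra instead).
USER = {}
USERNAME = {}

def username_taken(username):
    # The "check first" step.
    return username in USERNAME

def write_user(username, password):
    # Mirrors USER.insert(useruuid, columns).
    useruuid = str(uuid4())
    USER[useruuid] = {'id': useruuid, 'username': username, 'password': password}
    return useruuid

def write_username(username, useruuid):
    # Mirrors USERNAME.insert(username, {'id': useruuid}).
    # Like a Cassandra insert, this silently overwrites an existing row.
    USERNAME[username] = {'id': useruuid}

# Clients A and B both pass the check before either writes the index row.
a_ok = not username_taken('jericevans')
b_ok = not username_taken('jericevans')

a_id = write_user('jericevans', 'secret-a')
b_id = write_user('jericevans', 'secret-b')

write_username('jericevans', a_id)
write_username('jericevans', b_id)  # overwrites A's index entry

# The username now resolves to B's account, although A signed up first.
hijacked = USERNAME['jericevans']['id'] == b_id
```

Both checks succeed, both USER rows are written, and the index ends up pointing at whichever client wrote last.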

It seems that USERNAME is something the author added, since it's missing from the original Twissandra source code.

Thanks,

Drew


Re: Possible design flaw in "Cassandra By Example" blog

Posted by Drew Kutcharian <dr...@venarc.com>.
Thanks for your response. In general, it seems you always need some kind of external coordination if you maintain inverted indexes. How do others tackle this issue?

Would secondary indexes be a good idea in this case, considering that the cardinality of the keys will be pretty high?

cheers,

Drew


On Apr 13, 2011, at 4:51 PM, Eric Evans wrote:

> On Wed, 2011-04-13 at 15:07 -0700, Drew Kutcharian wrote:
>> username = 'jericevans'
>> password = '**********'
>> useruuid = str(uuid())
>> columns = {'id': useruuid, 'username': username, 'password': password}
>> USER.insert(useruuid, columns)
>> USERNAME.insert(username, {'id': useruuid})
>> 
>> How can I guarantee that USERNAME.insert(username, {'id': useruuid})
>> won't overwrite someone else's account. What I mean is how can I
>> guarantee that a user's username doesn't already exist in Cassandra? I
>> know I can check first, but in a highly concurrent environment,
>> there's a possibility that between USER.insert(useruuid, columns) and
>> USERNAME.insert(username, {'id': useruuid}) someone else does the same
>> USERNAME.insert(username, {'id': useruuid}) and hijack the user's
>> account.
> 
> Yes, this is a flaw.  You'd need some sort of external coordination to
> be sure you could prevent this.
> 
> There are probably many such flaws, Twissandra wasn't meant to be a Real
> app, it's an aid in teaching the query and data models, and a lot was
> glossed over to keep it concise.
> 
>> Seems like that USERNAME is something that the author has added since
>> it's missing in original Twissandra source code.
> 
> Right, since that article was written, the Username column family was
> removed, and the User column family is now keyed on username (which
> solves the problem of concurrent updates, by making it "last write
> wins").
> 
> -- 
> Eric Evans
> eevans@rackspace.com
> 


Re: Possible design flaw in "Cassandra By Example" blog

Posted by Eric Evans <ee...@rackspace.com>.
On Wed, 2011-04-13 at 15:07 -0700, Drew Kutcharian wrote:
> username = 'jericevans'
> password = '**********'
> useruuid = str(uuid())
> columns = {'id': useruuid, 'username': username, 'password': password}
> USER.insert(useruuid, columns)
> USERNAME.insert(username, {'id': useruuid})
> 
> How can I guarantee that USERNAME.insert(username, {'id': useruuid})
> won't overwrite someone else's account. What I mean is how can I
> guarantee that a user's username doesn't already exist in Cassandra? I
> know I can check first, but in a highly concurrent environment,
> there's a possibility that between USER.insert(useruuid, columns) and
> USERNAME.insert(username, {'id': useruuid}) someone else does the same
> USERNAME.insert(username, {'id': useruuid}) and hijack the user's
> account.

Yes, this is a flaw.  You'd need some sort of external coordination to
be sure you could prevent this.
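
As a rough sketch of what such coordination could look like, the example below serializes the check and both writes behind a single lock. Everything here is an assumption for illustration: the dicts stand in for the column families, sign_up is a hypothetical helper, and threading.Lock only works within one process, so a real deployment would need a lock service shared by all clients.

```python
import threading
from uuid import uuid4

# In-memory stand-ins for the USER and USERNAME column families (assumption).
USER = {}
USERNAME = {}

# Stand-in for an external coordinator; a process-local lock is only
# enough for this single-process demonstration.
_signup_lock = threading.Lock()

def sign_up(username, password):
    """Return the new user's id, or None if the username is already taken."""
    with _signup_lock:
        # Check and both writes happen atomically with respect to other
        # callers that go through the same lock.
        if username in USERNAME:
            return None
        useruuid = str(uuid4())
        USER[useruuid] = {'id': useruuid, 'username': username, 'password': password}
        USERNAME[username] = {'id': useruuid}
        return useruuid

results = []

def worker(password):
    results.append(sign_up('jericevans', password))

# Eight clients race for the same username.
threads = [threading.Thread(target=worker, args=('pw%d' % i,)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

winners = [r for r in results if r is not None]
```

Exactly one of the racing sign-ups gets an id back; the rest see None and can ask the user for a different name.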

There are probably many such flaws. Twissandra wasn't meant to be a Real
app; it's an aid in teaching the query and data models, and a lot was
glossed over to keep it concise.

> Seems like that USERNAME is something that the author has added since
> it's missing in original Twissandra source code.

Right, since that article was written, the Username column family was
removed, and the User column family is now keyed on username (which
solves the problem of concurrent updates, by making it "last write
wins").
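
A small sketch of what "last write wins" means for a row keyed on username, assuming an in-memory dict in place of the User column family and explicit integer timestamps. The insert helper is hypothetical, and keeping the existing value on a timestamp tie is a simplification made for this sketch.

```python
# username -> {column name: (value, timestamp)}; a stand-in for the User
# column family keyed on username (assumption for illustration).
USER = {}

def insert(row_key, columns, timestamp):
    """Apply a write, keeping for each column the value with the newest timestamp."""
    row = USER.setdefault(row_key, {})
    for name, value in columns.items():
        current = row.get(name)
        # Only replace the column if this write is strictly newer.
        if current is None or timestamp > current[1]:
            row[name] = (value, timestamp)

# Two sign-ups for the same username, plus a delayed write that arrives
# after a newer one has already been applied.
insert('jericevans', {'password': 'first'}, timestamp=100)
insert('jericevans', {'password': 'second'}, timestamp=200)
insert('jericevans', {'password': 'late-arrival'}, timestamp=150)  # older, ignored

final_password = USER['jericevans']['password']
```

There is only one row per username, so there is no separate index that can end up pointing at the wrong account. The later sign-up still replaces the earlier one's columns, so this does not by itself tell a client that the name was already taken.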

-- 
Eric Evans
eevans@rackspace.com