You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@cassandra.apache.org by Kasper Middelboe Petersen <ka...@sybogames.com> on 2014/01/22 18:44:20 UTC

Datamodel for a highscore list

Hi!

I'm a little worried about the data model I have come up with for handling
highscores.

I have a lot of users. Each user has a number of friends. I need a
highscore list pr friend list.

I would like to have it optimized for reading the highscores as opposed to
setting a new highscore as the use case would suggest I would need to read
the list a lot more than I would need write new highscores.

Currently I have the following tables:
CREATE TABLE user (userId uuid, name varchar, highscore int, bestcombo int,
PRIMARY KEY(userId))
CREATE TABLE highscore (userId uuid, score int, name varchar, PRIMARY
KEY(userId, score, name)) WITH CLUSTERING ORDER BY (score DESC);
... and a tables for friends - for the purpose of this mail assume everyone
is friends with everyone else

Reading the highscore list for a given user is easy. SELECT * FROM
highscores WHERE userId = <id>.

Problem is setting a new highscore.
1. I need to read-before-write to get the old score
2. I'm screwed if something goes wrong and the old score gets overwritten
before all the friends highscore lists gets updated - and it is an highly
visible error due to the same user is on the highscore multiple times.

I would very much appreciate some feedback and/or alternatives to how to
solve this with Cassandra.


Thanks,
Kasper

Re: Datamodel for a highscore list

Posted by Colin <co...@gmail.com>.
One way might be to use userid as a rowid, and then put all of the friends with their scores on the same row.  You could even update the column entry like this

Score:username or Id

This way the columns would come back sorted when reading the high scores for the group.

To update set that uses score in that users row after reading it for update.

So each row would look like this

Rowkey - userid
Columns would be userid:score followed by friendid:score

This way, you could also get global high score list

Each user would have their own row

If multiple games, create userid+gameid as rowkey

Might this work?


--
Colin 
+1 320 221 9531

 

> On Jan 22, 2014, at 11:13 AM, Kasper Middelboe Petersen <ka...@sybogames.com> wrote:
> 
> Many million users. Just the one game- I might have some different scores I need to keep track of, but I very much hope to be able to use the same approach for those as for the high score mentioned here.
> 
> 
>> On Wed, Jan 22, 2014 at 7:08 PM, Colin Clark <co...@clark.ws> wrote:
>> How many users and how many games?
>> 
>> 
>> --
>> Colin 
>> +1 320 221 9531
>> 
>>  
>> 
>>> On Jan 22, 2014, at 10:59 AM, Kasper Middelboe Petersen <ka...@sybogames.com> wrote:
>>> 
>>> I can think of two cases where something bad would happen in this case:
>>> 1. Something bad happens after the increment but before some or all of the update friend list is finished
>>> 2. Someone spams two scores at the same time creating a race condition where one of them could have a score that is not yet updated (or the old score, depending on if the increment of the highscore is done before or after the friend updates)
>>> 
>>> Both are unlikely things to have happen often, but I'm going to have quite a few users using the system and it would be bound to happen and I would really like to avoid having data corruption (especially of the kind that is also obvious to the users) if it can at all be avoided.
>>> 
>>> Also should it happen there is no way to neither detect nor clean it up.
>>> 
>>> 
>>>> On Wed, Jan 22, 2014 at 6:48 PM, Colin <co...@gmail.com> wrote:
>>>> Read users score, increment, update friends list, update user with new high score
>>>> 
>>>> Would that work?
>>>> 
>>>> --
>>>> Colin
>>>> +1 320 221 9531
>>>> 
>>>> 
>>>> 
>>>> > On Jan 22, 2014, at 11:44 AM, Kasper Middelboe Petersen <ka...@sybogames.com> wrote:
>>>> >
>>>> > Hi!
>>>> >
>>>> > I'm a little worried about the data model I have come up with for handling highscores.
>>>> >
>>>> > I have a lot of users. Each user has a number of friends. I need a highscore list pr friend list.
>>>> >
>>>> > I would like to have it optimized for reading the highscores as opposed to setting a new highscore as the use case would suggest I would need to read the list a lot more than I would need write new highscores.
>>>> >
>>>> > Currently I have the following tables:
>>>> > CREATE TABLE user (userId uuid, name varchar, highscore int, bestcombo int, PRIMARY KEY(userId))
>>>> > CREATE TABLE highscore (userId uuid, score int, name varchar, PRIMARY KEY(userId, score, name)) WITH CLUSTERING ORDER BY (score DESC);
>>>> > ... and a tables for friends - for the purpose of this mail assume everyone is friends with everyone else
>>>> >
>>>> > Reading the highscore list for a given user is easy. SELECT * FROM highscores WHERE userId = <id>.
>>>> >
>>>> > Problem is setting a new highscore.
>>>> > 1. I need to read-before-write to get the old score
>>>> > 2. I'm screwed if something goes wrong and the old score gets overwritten before all the friends highscore lists gets updated - and it is an highly visible error due to the same user is on the highscore multiple times.
>>>> >
>>>> > I would very much appreciate some feedback and/or alternatives to how to solve this with Cassandra.
>>>> >
>>>> >
>>>> > Thanks,
>>>> > Kasper
>>> 
> 

Re: Datamodel for a highscore list

Posted by Kasper Middelboe Petersen <ka...@sybogames.com>.
Many million users. Just the one game- I might have some different scores I
need to keep track of, but I very much hope to be able to use the same
approach for those as for the high score mentioned here.


On Wed, Jan 22, 2014 at 7:08 PM, Colin Clark <co...@clark.ws> wrote:

> How many users and how many games?
>
>
> --
> Colin
> +1 320 221 9531
>
>
>
> On Jan 22, 2014, at 10:59 AM, Kasper Middelboe Petersen <
> kasper@sybogames.com> wrote:
>
> I can think of two cases where something bad would happen in this case:
> 1. Something bad happens after the increment but before some or all of the
> update friend list is finished
> 2. Someone spams two scores at the same time creating a race condition
> where one of them could have a score that is not yet updated (or the old
> score, depending on if the increment of the highscore is done before or
> after the friend updates)
>
> Both are unlikely things to have happen often, but I'm going to have quite
> a few users using the system and it would be bound to happen and I would
> really like to avoid having data corruption (especially of the kind that is
> also obvious to the users) if it can at all be avoided.
>
> Also should it happen there is no way to neither detect nor clean it up.
>
>
> On Wed, Jan 22, 2014 at 6:48 PM, Colin <co...@gmail.com> wrote:
>
>> Read users score, increment, update friends list, update user with new
>> high score
>>
>> Would that work?
>>
>> --
>> Colin
>> +1 320 221 9531
>>
>>
>>
>> > On Jan 22, 2014, at 11:44 AM, Kasper Middelboe Petersen <
>> kasper@sybogames.com> wrote:
>> >
>> > Hi!
>> >
>> > I'm a little worried about the data model I have come up with for
>> handling highscores.
>> >
>> > I have a lot of users. Each user has a number of friends. I need a
>> highscore list pr friend list.
>> >
>> > I would like to have it optimized for reading the highscores as opposed
>> to setting a new highscore as the use case would suggest I would need to
>> read the list a lot more than I would need write new highscores.
>> >
>> > Currently I have the following tables:
>> > CREATE TABLE user (userId uuid, name varchar, highscore int, bestcombo
>> int, PRIMARY KEY(userId))
>> > CREATE TABLE highscore (userId uuid, score int, name varchar, PRIMARY
>> KEY(userId, score, name)) WITH CLUSTERING ORDER BY (score DESC);
>> > ... and a tables for friends - for the purpose of this mail assume
>> everyone is friends with everyone else
>> >
>> > Reading the highscore list for a given user is easy. SELECT * FROM
>> highscores WHERE userId = <id>.
>> >
>> > Problem is setting a new highscore.
>> > 1. I need to read-before-write to get the old score
>> > 2. I'm screwed if something goes wrong and the old score gets
>> overwritten before all the friends highscore lists gets updated - and it is
>> an highly visible error due to the same user is on the highscore multiple
>> times.
>> >
>> > I would very much appreciate some feedback and/or alternatives to how
>> to solve this with Cassandra.
>> >
>> >
>> > Thanks,
>> > Kasper
>>
>
>

Re: Datamodel for a highscore list

Posted by Colin Clark <co...@clark.ws>.
How many users and how many games?

--
Colin
+1 320 221 9531



On Jan 22, 2014, at 10:59 AM, Kasper Middelboe Petersen <
kasper@sybogames.com> wrote:

I can think of two cases where something bad would happen in this case:
1. Something bad happens after the increment but before some or all of the
update friend list is finished
2. Someone spams two scores at the same time creating a race condition
where one of them could have a score that is not yet updated (or the old
score, depending on if the increment of the highscore is done before or
after the friend updates)

Both are unlikely things to have happen often, but I'm going to have quite
a few users using the system and it would be bound to happen and I would
really like to avoid having data corruption (especially of the kind that is
also obvious to the users) if it can at all be avoided.

Also should it happen there is no way to neither detect nor clean it up.


On Wed, Jan 22, 2014 at 6:48 PM, Colin <co...@gmail.com> wrote:

> Read users score, increment, update friends list, update user with new
> high score
>
> Would that work?
>
> --
> Colin
> +1 320 221 9531
>
>
>
> > On Jan 22, 2014, at 11:44 AM, Kasper Middelboe Petersen <
> kasper@sybogames.com> wrote:
> >
> > Hi!
> >
> > I'm a little worried about the data model I have come up with for
> handling highscores.
> >
> > I have a lot of users. Each user has a number of friends. I need a
> highscore list pr friend list.
> >
> > I would like to have it optimized for reading the highscores as opposed
> to setting a new highscore as the use case would suggest I would need to
> read the list a lot more than I would need write new highscores.
> >
> > Currently I have the following tables:
> > CREATE TABLE user (userId uuid, name varchar, highscore int, bestcombo
> int, PRIMARY KEY(userId))
> > CREATE TABLE highscore (userId uuid, score int, name varchar, PRIMARY
> KEY(userId, score, name)) WITH CLUSTERING ORDER BY (score DESC);
> > ... and a tables for friends - for the purpose of this mail assume
> everyone is friends with everyone else
> >
> > Reading the highscore list for a given user is easy. SELECT * FROM
> highscores WHERE userId = <id>.
> >
> > Problem is setting a new highscore.
> > 1. I need to read-before-write to get the old score
> > 2. I'm screwed if something goes wrong and the old score gets
> overwritten before all the friends highscore lists gets updated - and it is
> an highly visible error due to the same user is on the highscore multiple
> times.
> >
> > I would very much appreciate some feedback and/or alternatives to how to
> solve this with Cassandra.
> >
> >
> > Thanks,
> > Kasper
>

Re: Datamodel for a highscore list

Posted by Kasper Middelboe Petersen <ka...@sybogames.com>.
I can think of two cases where something bad would happen in this case:
1. Something bad happens after the increment but before some or all of the
update friend list is finished
2. Someone spams two scores at the same time creating a race condition
where one of them could have a score that is not yet updated (or the old
score, depending on if the increment of the highscore is done before or
after the friend updates)

Both are unlikely things to have happen often, but I'm going to have quite
a few users using the system and it would be bound to happen and I would
really like to avoid having data corruption (especially of the kind that is
also obvious to the users) if it can at all be avoided.

Also should it happen there is no way to neither detect nor clean it up.


On Wed, Jan 22, 2014 at 6:48 PM, Colin <co...@gmail.com> wrote:

> Read users score, increment, update friends list, update user with new
> high score
>
> Would that work?
>
> --
> Colin
> +1 320 221 9531
>
>
>
> > On Jan 22, 2014, at 11:44 AM, Kasper Middelboe Petersen <
> kasper@sybogames.com> wrote:
> >
> > Hi!
> >
> > I'm a little worried about the data model I have come up with for
> handling highscores.
> >
> > I have a lot of users. Each user has a number of friends. I need a
> highscore list pr friend list.
> >
> > I would like to have it optimized for reading the highscores as opposed
> to setting a new highscore as the use case would suggest I would need to
> read the list a lot more than I would need write new highscores.
> >
> > Currently I have the following tables:
> > CREATE TABLE user (userId uuid, name varchar, highscore int, bestcombo
> int, PRIMARY KEY(userId))
> > CREATE TABLE highscore (userId uuid, score int, name varchar, PRIMARY
> KEY(userId, score, name)) WITH CLUSTERING ORDER BY (score DESC);
> > ... and a tables for friends - for the purpose of this mail assume
> everyone is friends with everyone else
> >
> > Reading the highscore list for a given user is easy. SELECT * FROM
> highscores WHERE userId = <id>.
> >
> > Problem is setting a new highscore.
> > 1. I need to read-before-write to get the old score
> > 2. I'm screwed if something goes wrong and the old score gets
> overwritten before all the friends highscore lists gets updated - and it is
> an highly visible error due to the same user is on the highscore multiple
> times.
> >
> > I would very much appreciate some feedback and/or alternatives to how to
> solve this with Cassandra.
> >
> >
> > Thanks,
> > Kasper
>

Re: Datamodel for a highscore list

Posted by Edward Capriolo <ed...@gmail.com>.
It is a tricky type of problem because some ways of doing it involve
iterative scans.
This presentation discusses a solution for top-k:

http://www.slideshare.net/planetcassandra/jonathan-halliday


On Wed, Jan 22, 2014 at 12:48 PM, Colin <co...@gmail.com> wrote:

> Read users score, increment, update friends list, update user with new
> high score
>
> Would that work?
>
> --
> Colin
> +1 320 221 9531
>
>
>
> > On Jan 22, 2014, at 11:44 AM, Kasper Middelboe Petersen <
> kasper@sybogames.com> wrote:
> >
> > Hi!
> >
> > I'm a little worried about the data model I have come up with for
> handling highscores.
> >
> > I have a lot of users. Each user has a number of friends. I need a
> highscore list pr friend list.
> >
> > I would like to have it optimized for reading the highscores as opposed
> to setting a new highscore as the use case would suggest I would need to
> read the list a lot more than I would need write new highscores.
> >
> > Currently I have the following tables:
> > CREATE TABLE user (userId uuid, name varchar, highscore int, bestcombo
> int, PRIMARY KEY(userId))
> > CREATE TABLE highscore (userId uuid, score int, name varchar, PRIMARY
> KEY(userId, score, name)) WITH CLUSTERING ORDER BY (score DESC);
> > ... and a tables for friends - for the purpose of this mail assume
> everyone is friends with everyone else
> >
> > Reading the highscore list for a given user is easy. SELECT * FROM
> highscores WHERE userId = <id>.
> >
> > Problem is setting a new highscore.
> > 1. I need to read-before-write to get the old score
> > 2. I'm screwed if something goes wrong and the old score gets
> overwritten before all the friends highscore lists gets updated - and it is
> an highly visible error due to the same user is on the highscore multiple
> times.
> >
> > I would very much appreciate some feedback and/or alternatives to how to
> solve this with Cassandra.
> >
> >
> > Thanks,
> > Kasper
>

Re: Datamodel for a highscore list

Posted by Colin <co...@gmail.com>.
Read users score, increment, update friends list, update user with new high score

Would that work?

--
Colin 
+1 320 221 9531

 

> On Jan 22, 2014, at 11:44 AM, Kasper Middelboe Petersen <ka...@sybogames.com> wrote:
> 
> Hi!
> 
> I'm a little worried about the data model I have come up with for handling highscores.
> 
> I have a lot of users. Each user has a number of friends. I need a highscore list pr friend list.
> 
> I would like to have it optimized for reading the highscores as opposed to setting a new highscore as the use case would suggest I would need to read the list a lot more than I would need write new highscores.
> 
> Currently I have the following tables:
> CREATE TABLE user (userId uuid, name varchar, highscore int, bestcombo int, PRIMARY KEY(userId))
> CREATE TABLE highscore (userId uuid, score int, name varchar, PRIMARY KEY(userId, score, name)) WITH CLUSTERING ORDER BY (score DESC);
> ... and a tables for friends - for the purpose of this mail assume everyone is friends with everyone else
> 
> Reading the highscore list for a given user is easy. SELECT * FROM highscores WHERE userId = <id>.
> 
> Problem is setting a new highscore.
> 1. I need to read-before-write to get the old score
> 2. I'm screwed if something goes wrong and the old score gets overwritten before all the friends highscore lists gets updated - and it is an highly visible error due to the same user is on the highscore multiple times.
> 
> I would very much appreciate some feedback and/or alternatives to how to solve this with Cassandra.
> 
> 
> Thanks,
> Kasper

Re: Datamodel for a highscore list

Posted by Colin Clark <co...@clark.ws>.
Most of the work I've done like this has used sparse table definitions and
the empty column trick.  I didn't explain that very well in my last
response.

I think by using the userid as the rowid, and using the friend id as the
column name with the score, that I would put an entire user's friend list
on one row.  The row would look like this:

ROWID
USERID

Colin
+1 320 221 9531



On Thu, Jan 23, 2014 at 2:34 AM, Kasper Middelboe Petersen <
kasper@sybogames.com> wrote:

> What would the consequence be of having this updated highscore table
> (using friendId as part of the clustering index to avoid name collisions):
>
> CREATE TABLE highscore (
>   userId uuid,
>   score int,
>   friendId uuid,
>   name varchar,
>   PRIMARY KEY(userId, score, friendId)
> ) WITH CLUSTERING ORDER BY (score DESC);
>
> And then create an index:
>
> CREATE INDEX friendId_idx ON highscore ( friendId );
>
> The table will have many million (I should expect 100+ million) entries.
> Each friendId would appear as many times as the user has friends. It sounds
> like a scenario where I should take care of using a custom index.
>
> I haven't worked with custom indexes in Cassandra before, but I assume
> this would allow me to query the table based on (userId, friendId) for
> updating highscores.
>
> But what would happen in this case? What queries would be affected and
> roughly to what degree?
>
> Would this be a viable option?
>
>
>
> On Wed, Jan 22, 2014 at 6:44 PM, Kasper Middelboe Petersen <
> kasper@sybogames.com> wrote:
>
>> Hi!
>>
>> I'm a little worried about the data model I have come up with for
>> handling highscores.
>>
>> I have a lot of users. Each user has a number of friends. I need a
>> highscore list pr friend list.
>>
>> I would like to have it optimized for reading the highscores as opposed
>> to setting a new highscore as the use case would suggest I would need to
>> read the list a lot more than I would need write new highscores.
>>
>> Currently I have the following tables:
>> CREATE TABLE user (userId uuid, name varchar, highscore int, bestcombo
>> int, PRIMARY KEY(userId))
>> CREATE TABLE highscore (userId uuid, score int, name varchar, PRIMARY
>> KEY(userId, score, name)) WITH CLUSTERING ORDER BY (score DESC);
>> ... and a tables for friends - for the purpose of this mail assume
>> everyone is friends with everyone else
>>
>> Reading the highscore list for a given user is easy. SELECT * FROM
>> highscores WHERE userId = <id>.
>>
>> Problem is setting a new highscore.
>> 1. I need to read-before-write to get the old score
>> 2. I'm screwed if something goes wrong and the old score gets overwritten
>> before all the friends highscore lists gets updated - and it is an highly
>> visible error due to the same user is on the highscore multiple times.
>>
>> I would very much appreciate some feedback and/or alternatives to how to
>> solve this with Cassandra.
>>
>>
>> Thanks,
>> Kasper
>>
>
>

Re: Datamodel for a highscore list

Posted by Kasper Middelboe Petersen <ka...@sybogames.com>.
Is this a non-CQL trick? If not, then would it be possible for you to give
me the CQL3 CREATE TABLE definition instead?

I'm not entirely sure what you mean by "UserId, UserId, UserScore:Score"
and then FriendId, Score, etc.


On Thu, Jan 23, 2014 at 5:06 PM, Colin Clark <co...@clark.ws> wrote:

> One of tricks I've used a lot with cassandra is a sparse df definition and
> inserted columns programmatically that weren't in the definition.
>
> I'd be tempted to look at putting a users friend list on one row, the row
> would look like this:
>
> ROWID        COLUMNS
>
> UserID         UserId, UserID, UserScore:Score     FriendID, score
>  FriendID,   score     ....
>
> The "UserID" and "UserScore" columns are literal, the FriendID's are
> either literal or keys into the user cf.
>
> When a user gets a new score, you update that user's row and a general
> update query updating all rows with that userid with the new score
>
> That way, all friends are on the same row, which makes query easy.  And
> you can still issue query to find the top score across the entire userbase
> by querying userid, and userscore.
>
> Is this a better explanation of my previous and lame explanation?
>
> Colin
> +1 320 221 9531
>
>
>
> On Thu, Jan 23, 2014 at 2:34 AM, Kasper Middelboe Petersen <
> kasper@sybogames.com> wrote:
>
>> What would the consequence be of having this updated highscore table
>> (using friendId as part of the clustering index to avoid name collisions):
>>
>> CREATE TABLE highscore (
>>   userId uuid,
>>   score int,
>>   friendId uuid,
>>   name varchar,
>>   PRIMARY KEY(userId, score, friendId)
>> ) WITH CLUSTERING ORDER BY (score DESC);
>>
>> And then create an index:
>>
>> CREATE INDEX friendId_idx ON highscore ( friendId );
>>
>> The table will have many million (I should expect 100+ million) entries.
>> Each friendId would appear as many times as the user has friends. It sounds
>> like a scenario where I should take care of using a custom index.
>>
>> I haven't worked with custom indexes in Cassandra before, but I assume
>> this would allow me to query the table based on (userId, friendId) for
>> updating highscores.
>>
>> But what would happen in this case? What queries would be affected and
>> roughly to what degree?
>>
>> Would this be a viable option?
>>
>>
>>
>> On Wed, Jan 22, 2014 at 6:44 PM, Kasper Middelboe Petersen <
>> kasper@sybogames.com> wrote:
>>
>>> Hi!
>>>
>>> I'm a little worried about the data model I have come up with for
>>> handling highscores.
>>>
>>> I have a lot of users. Each user has a number of friends. I need a
>>> highscore list pr friend list.
>>>
>>> I would like to have it optimized for reading the highscores as opposed
>>> to setting a new highscore as the use case would suggest I would need to
>>> read the list a lot more than I would need write new highscores.
>>>
>>> Currently I have the following tables:
>>> CREATE TABLE user (userId uuid, name varchar, highscore int, bestcombo
>>> int, PRIMARY KEY(userId))
>>> CREATE TABLE highscore (userId uuid, score int, name varchar, PRIMARY
>>> KEY(userId, score, name)) WITH CLUSTERING ORDER BY (score DESC);
>>> ... and a tables for friends - for the purpose of this mail assume
>>> everyone is friends with everyone else
>>>
>>> Reading the highscore list for a given user is easy. SELECT * FROM
>>> highscores WHERE userId = <id>.
>>>
>>> Problem is setting a new highscore.
>>> 1. I need to read-before-write to get the old score
>>> 2. I'm screwed if something goes wrong and the old score gets
>>> overwritten before all the friends highscore lists gets updated - and it is
>>> an highly visible error due to the same user is on the highscore multiple
>>> times.
>>>
>>> I would very much appreciate some feedback and/or alternatives to how to
>>> solve this with Cassandra.
>>>
>>>
>>> Thanks,
>>> Kasper
>>>
>>
>>
>

Re: Datamodel for a highscore list

Posted by Colin Clark <co...@clark.ws>.
One of tricks I've used a lot with cassandra is a sparse df definition and
inserted columns programmatically that weren't in the definition.

I'd be tempted to look at putting a users friend list on one row, the row
would look like this:

ROWID        COLUMNS

UserID         UserId, UserID, UserScore:Score     FriendID, score
 FriendID,   score     ....

The "UserID" and "UserScore" columns are literal, the FriendID's are either
literal or keys into the user cf.

When a user gets a new score, you update that user's row and a general
update query updating all rows with that userid with the new score

That way, all friends are on the same row, which makes query easy.  And you
can still issue query to find the top score across the entire userbase by
querying userid, and userscore.

Is this a better explanation of my previous and lame explanation?

Colin
+1 320 221 9531



On Thu, Jan 23, 2014 at 2:34 AM, Kasper Middelboe Petersen <
kasper@sybogames.com> wrote:

> What would the consequence be of having this updated highscore table
> (using friendId as part of the clustering index to avoid name collisions):
>
> CREATE TABLE highscore (
>   userId uuid,
>   score int,
>   friendId uuid,
>   name varchar,
>   PRIMARY KEY(userId, score, friendId)
> ) WITH CLUSTERING ORDER BY (score DESC);
>
> And then create an index:
>
> CREATE INDEX friendId_idx ON highscore ( friendId );
>
> The table will have many million (I should expect 100+ million) entries.
> Each friendId would appear as many times as the user has friends. It sounds
> like a scenario where I should take care of using a custom index.
>
> I haven't worked with custom indexes in Cassandra before, but I assume
> this would allow me to query the table based on (userId, friendId) for
> updating highscores.
>
> But what would happen in this case? What queries would be affected and
> roughly to what degree?
>
> Would this be a viable option?
>
>
>
> On Wed, Jan 22, 2014 at 6:44 PM, Kasper Middelboe Petersen <
> kasper@sybogames.com> wrote:
>
>> Hi!
>>
>> I'm a little worried about the data model I have come up with for
>> handling highscores.
>>
>> I have a lot of users. Each user has a number of friends. I need a
>> highscore list pr friend list.
>>
>> I would like to have it optimized for reading the highscores as opposed
>> to setting a new highscore as the use case would suggest I would need to
>> read the list a lot more than I would need write new highscores.
>>
>> Currently I have the following tables:
>> CREATE TABLE user (userId uuid, name varchar, highscore int, bestcombo
>> int, PRIMARY KEY(userId))
>> CREATE TABLE highscore (userId uuid, score int, name varchar, PRIMARY
>> KEY(userId, score, name)) WITH CLUSTERING ORDER BY (score DESC);
>> ... and a tables for friends - for the purpose of this mail assume
>> everyone is friends with everyone else
>>
>> Reading the highscore list for a given user is easy. SELECT * FROM
>> highscores WHERE userId = <id>.
>>
>> Problem is setting a new highscore.
>> 1. I need to read-before-write to get the old score
>> 2. I'm screwed if something goes wrong and the old score gets overwritten
>> before all the friends highscore lists gets updated - and it is an highly
>> visible error due to the same user is on the highscore multiple times.
>>
>> I would very much appreciate some feedback and/or alternatives to how to
>> solve this with Cassandra.
>>
>>
>> Thanks,
>> Kasper
>>
>
>

Re: Datamodel for a highscore list

Posted by Kasper Middelboe Petersen <ka...@sybogames.com>.
What would the consequence be of having this updated highscore table (using
friendId as part of the clustering index to avoid name collisions):

CREATE TABLE highscore (
  userId uuid,
  score int,
  friendId uuid,
  name varchar,
  PRIMARY KEY(userId, score, friendId)
) WITH CLUSTERING ORDER BY (score DESC);

And then create an index:

CREATE INDEX friendId_idx ON highscore ( friendId );

The table will have many million (I should expect 100+ million) entries.
Each friendId would appear as many times as the user has friends. It sounds
like a scenario where I should take care of using a custom index.

I haven't worked with custom indexes in Cassandra before, but I assume this
would allow me to query the table based on (userId, friendId) for updating
highscores.

But what would happen in this case? What queries would be affected and
roughly to what degree?

Would this be a viable option?



On Wed, Jan 22, 2014 at 6:44 PM, Kasper Middelboe Petersen <
kasper@sybogames.com> wrote:

> Hi!
>
> I'm a little worried about the data model I have come up with for handling
> highscores.
>
> I have a lot of users. Each user has a number of friends. I need a
> highscore list pr friend list.
>
> I would like to have it optimized for reading the highscores as opposed to
> setting a new highscore as the use case would suggest I would need to read
> the list a lot more than I would need write new highscores.
>
> Currently I have the following tables:
> CREATE TABLE user (userId uuid, name varchar, highscore int, bestcombo
> int, PRIMARY KEY(userId))
> CREATE TABLE highscore (userId uuid, score int, name varchar, PRIMARY
> KEY(userId, score, name)) WITH CLUSTERING ORDER BY (score DESC);
> ... and a tables for friends - for the purpose of this mail assume
> everyone is friends with everyone else
>
> Reading the highscore list for a given user is easy. SELECT * FROM
> highscores WHERE userId = <id>.
>
> Problem is setting a new highscore.
> 1. I need to read-before-write to get the old score
> 2. I'm screwed if something goes wrong and the old score gets overwritten
> before all the friends highscore lists gets updated - and it is an highly
> visible error due to the same user is on the highscore multiple times.
>
> I would very much appreciate some feedback and/or alternatives to how to
> solve this with Cassandra.
>
>
> Thanks,
> Kasper
>

Re: Datamodel for a highscore list

Posted by Kasper Middelboe Petersen <ka...@sybogames.com>.
Yes friendship is symmetrical.

This could work for my problem right now, but I'm afraid it would just be
postponing the problem slightly until something like big tournaments (which
are coming) raises the same problem again.


On Wed, Jan 22, 2014 at 6:58 PM, Jon Ribbens <
jon-cassandra@unequivocal.co.uk> wrote:

> On Wed, Jan 22, 2014 at 06:44:20PM +0100, Kasper Middelboe Petersen wrote:
> >    I'm a little worried about the data model I have come up with for
> handling
> >    highscores.
> >    I have a lot of users. Each user has a number of friends. I need a
> >    highscore list pr friend list.
> >    I would like to have it optimized for reading the highscores as
> opposed to
> >    setting a new highscore as the use case would suggest I would need to
> read
> >    the list a lot more than I would need write new highscores.
> >    Currently I have the following tables:
> >    CREATE TABLE user (userId uuid, name varchar, highscore int, bestcombo
> >    int, PRIMARY KEY(userId))
> >    CREATE TABLE highscore (userId uuid, score int, name varchar, PRIMARY
> >    KEY(userId, score, name)) WITH CLUSTERING ORDER BY (score DESC);
> >    ... and a tables for friends - for the purpose of this mail assume
> >    everyone is friends with everyone else
> >    Reading the highscore list for a given user is easy. SELECT * FROM
> >    highscores WHERE userId = <id>.
> >    Problem is setting a new highscore.
> >    1. I need to read-before-write to get the old score
> >    2. I'm screwed if something goes wrong and the old score gets
> overwritten
> >    before all the friends highscore lists gets updated - and it is an
> highly
> >    visible error due to the same user is on the highscore multiple times.
> >    I would very much appreciate some feedback and/or alternatives to how
> to
> >    solve this with Cassandra.
>
> Is friendship symmetrical? Why not just store the scores in the friend
> list like so:
>
> CREATE TABLE friends (
>   userID      uuid,
>   friendID    uuid,
>   name        varchar,
>   score       int,
>   PRIMARY KEY (userID, friendID)
> );
>
> and then simply sort the friends by score in your application code?
>
> When you update a user's score, you just do something like:
>
>   UPDATE friends SET score=x WHERE userID IN (all,my,friends) AND
> friendID=myID;
>
> It should be quite efficient unless you have people with truly
> ludicrous numbers of 'friends' ;-)
>

Re: Datamodel for a highscore list

Posted by Jon Ribbens <jo...@unequivocal.co.uk>.
On Wed, Jan 22, 2014 at 06:44:20PM +0100, Kasper Middelboe Petersen wrote:
>    I'm a little worried about the data model I have come up with for handling
>    highscores.
>    I have a lot of users. Each user has a number of friends. I need a
>    highscore list pr friend list.
>    I would like to have it optimized for reading the highscores as opposed to
>    setting a new highscore as the use case would suggest I would need to read
>    the list a lot more than I would need write new highscores.
>    Currently I have the following tables:
>    CREATE TABLE user (userId uuid, name varchar, highscore int, bestcombo
>    int, PRIMARY KEY(userId))
>    CREATE TABLE highscore (userId uuid, score int, name varchar, PRIMARY
>    KEY(userId, score, name)) WITH CLUSTERING ORDER BY (score DESC);
>    ... and a tables for friends - for the purpose of this mail assume
>    everyone is friends with everyone else
>    Reading the highscore list for a given user is easy. SELECT * FROM
>    highscores WHERE userId = <id>.
>    Problem is setting a new highscore.
>    1. I need to read-before-write to get the old score
>    2. I'm screwed if something goes wrong and the old score gets overwritten
>    before all the friends highscore lists gets updated - and it is an highly
>    visible error due to the same user is on the highscore multiple times.
>    I would very much appreciate some feedback and/or alternatives to how to
>    solve this with Cassandra.

Is friendship symmetrical? Why not just store the scores in the friend
list like so:

CREATE TABLE friends (
  userID      uuid,
  friendID    uuid,
  name        varchar,
  score       int,
  PRIMARY KEY (userID, friendID)
);

and then simply sort the friends by score in your application code?

When you update a user's score, you just do something like:

  UPDATE friends SET score=x WHERE userID IN (all,my,friends) AND friendID=myID;

It should be quite efficient unless you have people with truly
ludicrous numbers of 'friends' ;-)