You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@hbase.apache.org by Billy Pearson <sa...@pearsonwholesale.com> on 2009/04/01 03:31:56 UTC

Re: Curious about using HBase

I agree with Jonathan Gray about the two tables to have a random access to 
both what achievement's a player has got and ALL players that have x 
achievement's you will need two tables

Things you should look in to changing about your setup to be more efficient 
would be to only use a player table and cache the other stuff and use MR 
jobs if you do not need random access to it

Example Do you reality need access to this query?
What players have a given achievement?
This data can be mined form the player table with a MR job for updating 
stats or emailing or what every you use it for internally.
If not provided to players but you need it for something internal and can 
wait a few hours for it then a MR job can provide this data.
This is the query you will need to make the achievement table and will 
require you to update both the players and achievement table on changes to 
keep everything in sync
If you reality have to have this query I would still thank about going with 
a cached version of it that gets updated every X hours/days depending on 
resources.

Who are the first 25 people to have a given achievement?
This could be cached somehow on local disk sense it will never change once 
25 have made the achievement.

What are all the possible achievements?
This should be easy to cache outside hbase also sense you said the new 
achievement will not be common.
Just have to update it if you add new achievement's

What achievements does a given player NOT have?
Using a cache from above you could pull a players achievement's from the 
players table and remove the ones he has and use what's left as Jonathan 
Gray pointed to

With the above suggestions they should help to scale with less resources.

Billy Pearson

"Matthew Runo" <ma...@realmranker.com> wrote in 
message news:c523f7010903310920h5c7cc365hc2eeb1b4a731d609@mail.gmail.com...
> Hello folks!
>
> I'm running a site (http://www.realmranker.com) that works on medium sized
> data sets from the World of Warcraft armory. I'm currently doing 
> everything
> in MySQL on a "Large" EC2 instance, but hitting some budgetary limits in 
> how
> much more data I can put there.
>
> First, here is my current MySQL architecture:
>
> player - table that holds the player data, currently about 500,000 rows
> (expecting 2,000,000+ rows long term)
> achievement - table that holds the root achievement data, currently a few
> 1000 rows, not expecting all that much growth
> player_achievement - table that maps a player to all the achievement's
> they've completed. Currentl about 60,000,000 rows, expecting 250,000,000+
> long term
>
> Here is the layout of player_achievment right now:
> CREATE TABLE `player_achievement` (
>  `playerAchievementId` int(11) unsigned NOT NULL AUTO_INCREMENT,
>  `playerId` int(11) unsigned NOT NULL,
>  `achievementId` int(11) unsigned NOT NULL,
>  `guildId` int(11) unsigned DEFAULT NULL,
>  `dateCompleted` varchar(100) NOT NULL,
>  PRIMARY KEY (`playerAchievementId`),
>  UNIQUE KEY `playerId` (`playerId`,`achievementId`),
>  KEY `playerId_2` (`playerId`),
>  KEY `achievementId` (`achievementId`),
>  KEY `guildId` (`guildId`),
>  KEY `dateCompleted` (`dateCompleted`)
> )
>
> I'd like to move this table into HBase, since it's simply getting too 
> large
> for the amount of server resources I can afford. I assume that HBase would
> preform faster than MySQL on my (small by HBase standards) dataset.
>
> My question is simply what is the best way to do this? I'm assuming
> something like this:
>
> player_achievement table
> achievementId row
> players family
> player1, player2, player3.....playerN columns
> date completed would be the data under each column
>
> I need to do the following types of queries:
> - What players have a given achievement?
> - What achievements does a give player have?
> - What achievements does a given player NOT have?
> - What are all the possible achievements?
> - Who are the first 25 people to have a given achievement?
>
> I'm sorry for the really high level of this email - I'm still getting up 
> to
> speed on HBase, and not quite sure it's exactly the answer to my prayers
> that I think it might be..
>
> --Matthew Runo
>