Posted to user@cassandra.apache.org by Larry Root <la...@armorgames.com> on 2010/04/23 22:33:14 UTC

Trying To Understand get_range_slices Results When Using RandomPartitioner

I am trying to better understand how using the RandomPartitioner will affect my
ability to select ranges of keys. Consider my simple example where we have
many online games across different game genres (GameType). These games need
to store data for each one of their users. With that in mind consider the
following data model:

enum GameType {'RPG', 'FPS', 'ARCADE'}

{
    "GameData": {                           // Super Column Family

        GameType+"1234": {                  // Row (concat gametype with a game id, for example)
            "user-data:5678": {             // Super column (user data)
                "user_prop_name": "value",  // Subcolumn (arbitrary user properties and values)
                "another_prop_name": "value",
                ...
            },
            "user-data:9012": {
                "user_prop_name": "value",
                ...
            }
        },

        GameType+"3456": {...},
        GameType+"7890": {...},
        ...
    }
}

Assume we have a multi node cluster running Cassandra 0.6.1. In that
scenario could someone help me understand what the result would be in the
following cases:

   1. We use a range slice to grab keys for all 'RPG' games (range slice at
   the ROW level). Would we be able to get all games back in a single query or
   would that not be guaranteed?

   2. For a given game we use a range slice to grab all user-data keys in
   which the ID starts with '5' (range slice at the COLUMN level). Again, would
   we be able to get all keys in one call (assuming number of keys in the
   result was not an issue)?

   3. Finally for a given game and a given user we do a range slice to grab
   all user properties that start with 'a' (range slice at the SUBCOLUMN level
   of a SUPERCOLUMN). Is that possible in one call?

I'm trying to understand at what level the RandomPartitioner affects my
example data model. Is it at a fixed level like just ROWS (the sub data is
fixed to the same node) or is all data at every level *randomized* across
all nodes?

Are there any tricks to doing these sorts of range slices using RP? For
example if I set my consistency level to 'ALL' when doing a range slice
would that effectively compile a complete result set for me?

Thanks for the help!

larry

Re: Trying To Understand get_range_slices Results When Using RandomPartitioner

Posted by Schubert Zhang <zs...@gmail.com>.
RandomPartitioner applies only to row keys; the columns within a row stay
together on the same nodes.

#1 no
#2 yes
#3 yes
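
To illustrate why the answers differ, here is a minimal Python sketch. It is
not Cassandra code: md5_token only approximates how RandomPartitioner turns a
row key into a token (an MD5 hash read as a non-negative integer), and
slice_names is a hypothetical helper that stands in for a column slice over
names that are kept sorted inside one row. The game ids and property names
beyond those in the original post are made up for the example.

```python
import hashlib
from bisect import bisect_left, bisect_right


def md5_token(row_key: str) -> int:
    """Approximate a RandomPartitioner token: MD5 of the key as a non-negative integer."""
    digest = hashlib.md5(row_key.encode("utf-8")).digest()
    return abs(int.from_bytes(digest, "big", signed=True))


def slice_names(sorted_names, start, finish):
    """Return the names between start and finish (both inclusive in this sketch)."""
    lo = bisect_left(sorted_names, start)
    hi = bisect_right(sorted_names, finish)
    return sorted_names[lo:hi]


# Case 1: row keys. Hypothetical keys following the GameType + game id scheme.
row_keys = ["RPG1234", "RPG3456", "FPS7890", "ARCADE1111", "RPG9999", "FPS2222"]

# Rows are placed and iterated by token, not by key, so keys sharing the
# "RPG" prefix are generally not adjacent in this ordering.
by_token = sorted(row_keys, key=md5_token)

# Case 2: super column names inside ONE row are sorted by name, so a prefix
# can be expressed as a contiguous name range.
super_columns = sorted(
    ["user-data:5678", "user-data:9012", "user-data:5001", "user-data:4999"]
)
starts_with_5 = slice_names(super_columns, "user-data:5", "user-data:6")

# Case 3: subcolumns inside ONE super column are sorted the same way.
subcolumns = sorted(["another_prop_name", "avatar", "user_prop_name", "best_score"])
starts_with_a = slice_names(subcolumns, "a", "b")

if __name__ == "__main__":
    print("row keys in token order:", by_token)
    print("super columns starting with 5:", starts_with_5)
    print("subcolumns starting with a:", starts_with_a)
```

The point of the sketch is that hashing is applied once, to the row key. A key
range such as 'RPG' to 'RPH' therefore does not map to a contiguous token
range, while everything below the row key remains ordered by name and can be
sliced in a single call. Raising the consistency level changes how many
replicas are consulted, not how rows are ordered, so it should not be expected
to turn case 1 into a prefix query.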
