Posted to user@cassandra.apache.org by Qi Li <ke...@gmail.com> on 2016/01/28 23:51:14 UTC

Wide row in Cassandra

Hi all,

I've found some information on the Internet, but I'd still like to draw on
your expertise.

I'm designing a table. The object model looks like this:
class Data{
     String uuid;        //partition key
     String value1;
     String value2;
     ...
     String valueN;
     Map<String, Double> mapValues;
}


For one Data object, I would like it to be saved as one wide row in C*,
meaning the mapValues entries will be expanded into dynamic columns. AFAIK,
I can make the mapValues key a clustering column, and C* will then use a
wide row to store the data. So I would use 'uuid' as the partition key and
mapKey as the clustering key. My question is about the other columns, value1
to valueN: should I put them into the clustering key too, like below?

create table Data (
   uuid text,
   value1 text,
   value2 text,
   ...
   valueN text,
   mapKey text,
   mapValue double,
   primary key (uuid, mapKey, value1, value2, ..., valueN)
);

The reason I put them into the clustering key is that I don't want value1 to
valueN to be stored again each time a mapKey is created. For example,
mapValues can have 100 entries; I don't want value1 to valueN saved 100
times, only once, together with the partition key. Is this correct?

Thanks for your help.

-- 
Ken Li

Re: Wide row in Cassandra

Posted by Qi Li <ke...@gmail.com>.

Static columns are exactly what I want!

Thank you, DuyHai!

On Fri, 29 Jan 2016 07:22 DuyHai Doan <do...@gmail.com> wrote:

> This data model should do the job
>
> Create table Data (
>    uuid text,
>    value1 text static,
>    value2 text static,
>    ...
>    valueN text static,
>    mapKey text,
>    mapValue double,
>    primary key (uuid, mapKey)
> );
>
> Warning: because value1 ... valueN are static, there is a 1:1 relationship
> between them and the partition key uuid.
>
> 1. Query for a value, which can be any of value1 to valueN. The query
> criterion is 'uuid'.
>
> SELECT value1, ..., valueN FROM data
> WHERE uuid = ?
>
> 2. Query for the Double in mapValue. The query criteria are 'uuid' plus the
> 'key' in mapValues.
>
> SELECT mapValue FROM data WHERE uuid = ? AND mapKey = ?
>

Re: Wide row in Cassandra

Posted by DuyHai Doan <do...@gmail.com>.

This data model should do the job

Create table Data (
   uuid text,
   value1 text static,
   value2 text static,
   ...
   valueN text static,
   mapKey text,
   mapValue double,
   primary key (uuid, mapKey)
);

Warning: because value1 ... valueN are static, there is a 1:1 relationship
between them and the partition key uuid.

1. Query for a value, which can be any of value1 to valueN. The query
criterion is 'uuid'.

SELECT value1, ..., valueN FROM data
WHERE uuid = ?

2. Query for the Double in mapValue. The query criteria are 'uuid' plus the
'key' in mapValues.

SELECT mapValue FROM data WHERE uuid = ? AND mapKey = ?
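
As a rough illustration only, the Java sketch below shows how one Data object
from the question could be turned into writes against a table with static
columns: a single write for the static columns, then one write per mapValues
entry. The names WideRowSketch and Statement, the reduction of value1..valueN
to two fields, and the '?' bind markers are assumptions made for the example.
No driver is involved; the code only builds statement text and bind values.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of any Cassandra driver. Records need Java 16+.
public class WideRowSketch {

    // Mirrors the Data class from the question, with value1..valueN cut down to two fields.
    record Data(String uuid, String value1, String value2, Map<String, Double> mapValues) {}

    // CQL text with '?' bind markers plus the values to bind, in order.
    record Statement(String cql, List<Object> values) {}

    // Builds one write for the static columns and one write per map entry.
    // Assumes value1/value2 are declared static and mapKey is the clustering column.
    static List<Statement> toStatements(Data d) {
        List<Statement> out = new ArrayList<>();
        // Static columns can be written with the partition key alone.
        out.add(new Statement(
            "INSERT INTO data (uuid, value1, value2) VALUES (?, ?, ?)",
            List.of(d.uuid(), d.value1(), d.value2())));
        // Each map entry becomes its own clustering row in the same partition.
        for (Map.Entry<String, Double> e : d.mapValues().entrySet()) {
            out.add(new Statement(
                "INSERT INTO data (uuid, mapKey, mapValue) VALUES (?, ?, ?)",
                List.of(d.uuid(), e.getKey(), e.getValue())));
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Double> m = new LinkedHashMap<>();
        m.put("k1", 1.5);
        m.put("k2", 2.5);
        for (Statement s : toStatements(new Data("id-1", "a", "b", m))) {
            System.out.println(s.cql() + " " + s.values());
        }
    }
}
```

For a Data object with two map entries this yields three statements, so
value1 and value2 are written once rather than once per entry.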

On 29 Jan 2016 at 07:51, "Qi Li" <ke...@gmail.com> wrote:
>
> Thanks, Jack.
>
> The columns used in queries will be 'uuid' and the 'key' in mapValues.
> value1 to valueN and the Double in mapValues will merely be returned.
>
> There are two query scenarios:
> 1. Query for a value, which can be any of value1 to valueN. The query
> criterion is 'uuid'.
> 2. Query for the Double in mapValue. The query criteria are 'uuid' plus the
> 'key' in mapValues.
>
> Thanks for your help.
>
> Ken

Re: Wide row in Cassandra

Posted by Qi Li <ke...@gmail.com>.

Thanks, Jack.

The columns used in queries will be 'uuid' and the 'key' in mapValues.
value1 to valueN and the Double in mapValues will merely be returned.

There are two query scenarios:
1. Query for a value, which can be any of value1 to valueN. The query
criterion is 'uuid'.
2. Query for the Double in mapValue. The query criteria are 'uuid' plus the
'key' in mapValues.

Thanks for your help.

Ken

On Thu, Jan 28, 2016 at 11:22 PM, Jack Krupansky <ja...@gmail.com>
wrote:

> As usual, the first step should be to examine your queries and use them as
> the guide to data modeling. So... how do you need to access the data? What
> columns do you need to be able to query on vs. merely return? What data
> needs to be accessed at the same time? What data does not need to be
> accessed at the same time?
>
> -- Jack Krupansky

-- 
Ken Li

Re: Wide row in Cassandra

Posted by Jack Krupansky <ja...@gmail.com>.

As usual, the first step should be to examine your queries and use them as
the guide to data modeling. So... how do you need to access the data? What
columns do you need to be able to query on vs. merely return? What data
needs to be accessed at the same time? What data does not need to be
accessed at the same time?

-- Jack Krupansky
