Posted to user@cassandra.apache.org by Srinivasa T N <se...@gmail.com> on 2015/01/06 18:47:37 UTC

Queries required before data modeling?

Hi All,
   I was just googling around and reading various articles on data
modeling in Cassandra.  All of them talk about working backwards, i.e.,
first know what type of queries you are going to make, then select a
data model that can support those queries efficiently.  But one thing I
cannot understand: you can expect me to know some of the queries that I
will be making, but how can I know beforehand all the queries that will be
made?  Do I have to remodel the whole thing when I get a query that I had
not thought of?

Regards,
Seenu.

Re: Queries required before data modeling?

Posted by Srinivasa T N <se...@gmail.com>.

Thanks for the info, guys.

Regards,
Seenu.

Re: Queries required before data modeling?

Posted by Ryan Svihla <rs...@foundev.pro>.

Yes, however in most cases this means just one new table, so you make a new
table and copy the data over.  In many ways this is not unlike a schema
change, or having to change the primary key on an existing table in a
traditional SQL database.
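
As a rough sketch of the "one new table, copy the data over" step, here is
a toy model in Python where each table is just a dict keyed by its partition
key. The names users_by_id, users_by_city, and backfill are made up for
illustration and are not Cassandra or driver APIs.

```python
# Toy model: each "table" is a dict keyed by its partition key.
# users_by_id, users_by_city and backfill are illustrative names only.

def backfill(source_table, key_column):
    """Build a new table from an existing one, keyed by a different column."""
    new_table = {}
    for row in source_table.values():
        # Group rows under the new partition key; several rows may share it.
        new_table.setdefault(row[key_column], []).append(row)
    return new_table

users_by_id = {
    1: {"id": 1, "email": "a@example.com", "city": "Austin"},
    2: {"id": 2, "email": "b@example.com", "city": "Boston"},
    3: {"id": 3, "email": "c@example.com", "city": "Austin"},
}

# A new query shows up ("find users by city"), so add one table for it.
users_by_city = backfill(users_by_id, "city")
```

The existing table is left alone; the new query gets its own table, and
from then on writes would need to go to both.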

This need to design around a partition key applies to all databases once you
go distributed, and even when you start trying to scale out SQL databases you
have to think about problem sets like this. Whether you're sharding your data
with Cassandra or doing it by hand in MySQL, some key determines which data
is on which server.
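
To make "some key determines which data is on which server" concrete, a
minimal sketch follows. It hashes the key and takes a modulo over a made-up
list of nodes. This is a deliberate simplification and an assumption for
illustration: Cassandra itself uses a token ring with replicas rather than a
plain modulo, and the node names and node_for function are hypothetical.

```python
import hashlib

# Hypothetical three-node cluster; node names are made up for illustration.
NODES = ["node-a", "node-b", "node-c"]

def node_for(partition_key: str) -> str:
    """Pick the owning node by hashing the key (simplified: no ring, no replicas)."""
    digest = hashlib.sha256(partition_key.encode("utf-8")).digest()
    token = int.from_bytes(digest[:8], "big")
    return NODES[token % len(NODES)]

# The same key always maps to the same node, so a keyed read knows where to go.
owner = node_for("user:42")
```

A query that supplies the partition key can be sent straight to the owner;
a query that does not supply it has no way to narrow down the node.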

If you really want to support dynamic queries, you can use something like
Spark SQL as a front end to your data, or index all the table ids with
something like Solr.  However, both of these approaches have performance
implications (they fan out and scan lots of data), and if you need
Cassandra's speed and scalability then you're going to need to model in a
scalable way.
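
The fan-out cost can be illustrated with a toy cluster in Python. This is
only a sketch under stated assumptions: the four-node list, the modulo
placement, and the put, get_by_id, and find_by_city helpers are all
hypothetical and do not reflect how Spark SQL, Solr, or a Cassandra driver
is actually called.

```python
# Toy cluster: rows are placed on a node by their partition key (user id).
# All names here are illustrative; this is not how a real driver is called.
NUM_NODES = 4
cluster = [dict() for _ in range(NUM_NODES)]

def put(user_id, row):
    cluster[user_id % NUM_NODES][user_id] = row

def get_by_id(user_id):
    """Keyed read: exactly one node is consulted."""
    node = cluster[user_id % NUM_NODES]
    return node.get(user_id), 1  # (row, nodes contacted)

def find_by_city(city):
    """Ad hoc read on a non-key column: every node must scan its rows."""
    matches = []
    for node in cluster:
        matches.extend(r for r in node.values() if r["city"] == city)
    return matches, len(cluster)  # (rows, nodes contacted)

for i in range(100):
    put(i, {"id": i, "city": "Austin" if i % 10 == 0 else "Boston"})

row, contacted_keyed = get_by_id(42)
rows, contacted_scan = find_by_city("Austin")
```

The keyed read touches one node no matter how large the cluster gets, while
the ad hoc read touches every node and examines every row.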

-- 

Thanks,
Ryan Svihla

Re: Queries required before data modeling?

Posted by James Rothering <jr...@codojo.me>.

Yes, "remodeling" the schema will be required to get good performance for
new queries whose results had not been cached ahead of time.

In C*, you do all of that pre-computation and caching ahead of time, in
order to maximize performance.

This is in contrast to the relational approach where you get whatever you
want, but your performance may degrade when you do the costly joins.
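
A small Python sketch of that contrast, using plain lists and dicts in place
of real tables. The users, orders, and orders_by_user structures and both
functions are invented for illustration; the point is only where the work
happens, at read time or at write time.

```python
# Normalized data, as a relational schema might hold it.
users = {1: {"name": "Ann"}, 2: {"name": "Bob"}}
orders = [
    {"order_id": 10, "user_id": 1, "item": "book"},
    {"order_id": 11, "user_id": 2, "item": "pen"},
    {"order_id": 12, "user_id": 1, "item": "lamp"},
]

def orders_with_names_join(user_id):
    """Read-time join: scan orders and look up the user on every query."""
    return [
        {"name": users[o["user_id"]]["name"], "item": o["item"]}
        for o in orders
        if o["user_id"] == user_id
    ]

# Denormalized table, filled in at write time and keyed by the query's input.
orders_by_user = {}

def record_order(order):
    """Write-time work: store the row already shaped for the read."""
    row = {"name": users[order["user_id"]]["name"], "item": order["item"]}
    orders_by_user.setdefault(order["user_id"], []).append(row)

for o in orders:
    record_order(o)
```

Both paths return the same rows for a given user, but the second one answers
with a single keyed lookup because the join was paid for when the order was
written.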

Regards,

James
