Posted to user@cassandra.apache.org by sonia gehlot <so...@gmail.com> on 2010/08/06 20:00:52 UTC

How to migrate any relational database to Cassandra

Hi All,


A little background about myself: I am an ETL engineer and have worked only
with relational databases.

I have been reading about and trying out Cassandra for the past 3-4 weeks. I
more or less understand the Cassandra data model, its structure, nodes, etc. I
also installed Cassandra and played around with it, like this:

  cassandra> set Keyspace1.Standard2['jsmith']['first'] = 'John'
  Value inserted.
  cassandra> get Keyspace1.Standard2['jsmith']
    (column=first, value=John; timestamp=1249930053103)
  Returned 1 rows.

But I don't know what to do next. For example, if someone hands me a MySQL
database and asks me to migrate it to Cassandra, I don't know what my next
step should be.

Can you please help me move forward? How should I do all the setup
for this?

Any help is appreciated.

Thanks,
Sonia

Re: How to migrate any relational database to Cassandra

Posted by Peng Guo <gp...@gmail.com>.
Maybe you could integrate with Hadoop.

On Mon, Aug 9, 2010 at 1:15 PM, sonia gehlot <so...@gmail.com> wrote:

> Hi Guys,
>
> Thanks for sharing your experiences and valuable links. These are really
> helpful.
>
> But I want to do ETL and then load the data into Cassandra. I have links to
> 10-15 different source systems; at present, daily ETL jobs run and load data
> into our database, which is Netezza. How can I do this with Cassandra as the
> target, if my sources stay the same (MySQL, Oracle, Netezza, etc.)?
>
> -Sonia
>


-- 
Regards
    Peng Guo

Re: How to migrate any relational database to Cassandra

Posted by sonia gehlot <so...@gmail.com>.
Hi Guys,

Thanks for sharing your experiences and valuable links. These are really
helpful.

But I want to do ETL and then load the data into Cassandra. I have links to
10-15 different source systems; at present, daily ETL jobs run and load data
into our database, which is Netezza. How can I do this with Cassandra as the
target, if my sources stay the same (MySQL, Oracle, Netezza, etc.)?

-Sonia

On Sat, Aug 7, 2010 at 7:46 PM, Zhong Li <zl...@voxeo.com> wrote:

> Yes, I use OrderPreservingPartitioner. The token considers
> datacenter+ip+function+timestamp+recordId+...
>
>

Re: How to migrate any relational database to Cassandra

Posted by Zhong Li <zl...@voxeo.com>.
Yes, I use OrderPreservingPartitioner. The token considers
datacenter+ip+function+timestamp+recordId+...
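
As a rough illustration of such a composite key, the sketch below joins the
fields named above into one string whose lexicographic order follows the field
order. It assumes the row key doubles as the token, as it does under an
order-preserving partitioner. The separator, the field widths, the sample
values, and the name make_row_key are assumptions for illustration, not
details from this thread.

```python
def make_row_key(datacenter, ip, function, timestamp_ms, record_id):
    """Build a sortable composite key (layout is a hypothetical example)."""
    # Zero-pad each IP octet and each number so that string order
    # matches numeric order.
    ip_part = ".".join(f"{int(octet):03d}" for octet in ip.split("."))
    return ":".join([
        datacenter,
        ip_part,
        function,
        f"{timestamp_ms:013d}",
        f"{record_id:010d}",
    ])


keys = [
    make_row_key("dc2", "10.0.0.5", "billing", 1281200000000, 7),
    make_row_key("dc1", "10.0.0.12", "billing", 1281200000500, 3),
    make_row_key("dc1", "10.0.0.12", "billing", 1281200000000, 42),
    make_row_key("dc1", "10.0.0.5", "billing", 1281200000000, 1),
]

# Sorting the strings groups records by datacenter, then host, then time.
sorted_keys = sorted(keys)

# A prefix match stands in for a key-range scan over one host's records.
host_prefix = "dc1:010.000.000.012:"
host_keys = [k for k in sorted_keys if k.startswith(host_prefix)]
```

Because every field is fixed-width, the two records for host 10.0.0.12 in dc1
end up adjacent and in timestamp order, which is what makes range scans over
one host practical.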

On Aug 7, 2010, at 10:36 PM, Jonathan Ellis wrote:

> are you using OrderPreservingPartitioner then?


Re: How to migrate any relational database to Cassandra

Posted by Jonathan Ellis <jb...@gmail.com>.
Are you using OrderPreservingPartitioner then?

On Sat, Aug 7, 2010 at 10:32 PM, Zhong Li <zl...@voxeo.com> wrote:
> These are just my personal experiences.
>
> I recently used Cassandra to implement a system across 5 datacenters. Because
> it is impossible to do that with a SQL database at low cost, Cassandra helps.
>
> Cassandra is all about indexing. There are no relationships natively; you
> have to use indexes to maintain all relationships. This is fine, because you
> can add a new index whenever you want.
>
> The big pain is the token. You can choose only one token per node, so the
> whole system has to adopt the same rule to create indexes. It is a huge pain.
>
> If Cassandra implemented tokens at the CF level, it would be much more natural
> and easier for us to implement a system.
>
> Best,
>
> Zhong
>



-- 
Jonathan Ellis
Project Chair, Apache Cassandra
co-founder of Riptano, the source for professional Cassandra support
http://riptano.com

Re: How to migrate any relational database to Cassandra

Posted by Zhong Li <zl...@voxeo.com>.
These are just my personal experiences.

I recently used Cassandra to implement a system across 5 datacenters.
Because it is impossible to do that with a SQL database at low cost,
Cassandra helps.

Cassandra is all about indexing. There are no relationships natively;
you have to use indexes to maintain all relationships. This is fine,
because you can add a new index whenever you want.
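
A minimal sketch of what "using indexes to maintain relationships" can look
like: the application writes to a second structure, keyed by the value it
wants to look up, every time it writes the main row. Plain Python dicts stand
in for column families here; the names users, users_by_city, put_user, and
find_by_city are hypothetical, and a real client library would replace the
dict operations.

```python
from collections import defaultdict

users = {}                         # stands in for a "Users" column family
users_by_city = defaultdict(dict)  # stands in for an index column family


def put_user(user_key, columns):
    """Write a user row and keep the city index in step with it."""
    old_city = users.get(user_key, {}).get("city")
    new_city = columns.get("city")
    # Drop the stale index entry if the indexed value changed.
    if old_city is not None and old_city != new_city:
        users_by_city[old_city].pop(user_key, None)
    users[user_key] = dict(columns)
    if new_city is not None:
        # The index row holds one column per matching user key.
        users_by_city[new_city][user_key] = ""


def find_by_city(city):
    """Return the keys of all users indexed under the given city."""
    return sorted(users_by_city.get(city, {}))


put_user("jsmith", {"first": "John", "city": "Austin"})
put_user("adoe", {"first": "Ann", "city": "Austin"})
put_user("jsmith", {"first": "John", "city": "Boston"})  # John moves
```

Adding another index later means adding another structure like users_by_city
and backfilling it from the existing rows.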

The big pain is the token. You can choose only one token per node, so
the whole system has to adopt the same rule to create indexes. It is a
huge pain.

If Cassandra implemented tokens at the CF level, it would be much more
natural and easier for us to implement a system.

Best,

Zhong


On Aug 6, 2010, at 9:23 PM, Peter Harrison wrote:

> On Sat, Aug 7, 2010 at 6:00 AM, sonia gehlot <so...@gmail.com> wrote:
>
>> Can you please help me move forward? How should I do all the setup
>> for this?
>
> My view is that Cassandra is fundamentally different from SQL databases.
> There may be artefacts which are superficially similar between the two
> systems, but I guess I'm thinking of a move to Cassandra like my move from
> dBase to Delphi; in other words, there are concepts which change how you
> write applications.
>
> Now, you can do something similar to a SQL database, but I don't think you
> would be leveraging the features of Cassandra. That said, I think there will
> be a new generation of abstraction tools that will make modeling easier.
>
> A perhaps more practical answer: there is no one-to-one mapping between SQL
> and Cassandra.


Re: How to migrate any relational database to Cassandra

Posted by Peter Harrison <ch...@gmail.com>.
On Sat, Aug 7, 2010 at 6:00 AM, sonia gehlot <so...@gmail.com> wrote:

> Can you please help me move forward? How should I do all the setup
> for this?

My view is that Cassandra is fundamentally different from SQL databases. There
may be artefacts which are superficially similar between the two systems, but
I guess I'm thinking of a move to Cassandra like my move from dBase to Delphi;
in other words, there are concepts which change how you write applications.

Now, you can do something similar to a SQL database, but I don't think you would
be leveraging the features of Cassandra. That said, I think there will be a new
generation of abstraction tools that will make modeling easier.

A perhaps more practical answer: there is no one-to-one mapping between SQL
and Cassandra.

Re: How to migrate any relational database to Cassandra

Posted by Benjamin Black <b...@b3k.us>.
http://maxgrinev.com/2010/07/12/do-you-really-need-sql-to-do-it-all-in-cassandra/
http://www.slideshare.net/benjaminblack/cassandra-basics-indexing

On Fri, Aug 6, 2010 at 11:42 AM, sonia gehlot <so...@gmail.com> wrote:
> Thanks for the reply,
>
> I am sorry, it seems my question came out wrong.
>
> * My question is: what considerations should I keep in mind when migrating
> to Cassandra?
>
> * In ETL, to extract data from a source we write a query and then load the
> result into our database after applying the desired transformations. How can
> we do this if we want to extract data from MySQL and load it into Cassandra?
>
> * I think I can write scripts for this kind of thing, but does anyone have an
> example script?
>
> * What kind of setup do I need to do this?
>
> -Sonia

Re: How to migrate any relational database to Cassandra

Posted by sonia gehlot <so...@gmail.com>.
Thanks for the reply,

I am sorry, it seems my question came out wrong.

* My question is: what considerations should I keep in mind when migrating
to Cassandra?

* In ETL, to extract data from a source we write a query and then load the
result into our database after applying the desired transformations. How can
we do this if we want to extract data from MySQL and load it into Cassandra?

* I think I can write scripts for this kind of thing, but does anyone have an
example script?

* What kind of setup do I need to do this?

-Sonia
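
The extract, transform, load flow asked about above can be sketched in a few
lines. This is only an outline under stated assumptions: an in-memory SQLite
database stands in for the MySQL source, a plain dict of
row key -> {column name: (value, timestamp)} stands in for a Cassandra column
family, and the table, column, and function names are all hypothetical. A real
job would swap in a MySQL driver and a Cassandra client.

```python
import sqlite3
import time

# Source: in-memory SQLite standing in for MySQL (assumption).
source = sqlite3.connect(":memory:")
source.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, first TEXT, last TEXT, email TEXT)"
)
source.executemany(
    "INSERT INTO customers VALUES (?, ?, ?, ?)",
    [(1, "John", "Smith", "JSmith@Example.com"), (2, "Ann", "Doe", None)],
)

# Target: a dict standing in for one column family (assumption).
target = {}


def extract(conn):
    """Yield each source row as a dict of column name -> value."""
    cur = conn.execute("SELECT id, first, last, email FROM customers ORDER BY id")
    names = [d[0] for d in cur.description]
    for row in cur:
        yield dict(zip(names, row))


def transform(record):
    """Pick a row key, drop NULLs (no column is written), normalize email."""
    row_key = f"customer:{record['id']}"
    columns = {k: str(v) for k, v in record.items() if k != "id" and v is not None}
    if "email" in columns:
        columns["email"] = columns["email"].lower()
    return row_key, columns


def load(store, row_key, columns):
    """Write each column with a microsecond timestamp, like a column insert."""
    ts = int(time.time() * 1_000_000)
    row = store.setdefault(row_key, {})
    for name, value in columns.items():
        row[name] = (value, ts)


for record in extract(source):
    load(target, *transform(record))
```

The transform step is where the real design work happens: choosing the row
key and deciding which columns each row carries depends on the queries the
application needs to run.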


On Fri, Aug 6, 2010 at 11:24 AM, Michael Dürgner <mi...@duergner.de> wrote:

> In my opinion, asking how to migrate from MySQL to Cassandra from a
> database-level view is the wrong approach. The lack of joins in NoSQL should
> lead you to think first about what you want to get out of your persistent
> storage, and only afterwards about how to migrate and, most of the time, how
> to denormalize the data you have in order to insert it into a NoSQL store
> like Cassandra.
>
> Simply migrating the data and moving the joins up to the application level
> might work in the beginning, but most of the time it doesn't scale in the
> end.

Re: How to migrate any relational database to Cassandra

Posted by Michael Dürgner <mi...@duergner.de>.
In my opinion, asking how to migrate from MySQL to Cassandra from a database-level view is the wrong approach. The lack of joins in NoSQL should lead you to think first about what you want to get out of your persistent storage, and only afterwards about how to migrate and, most of the time, how to denormalize the data you have in order to insert it into a NoSQL store like Cassandra.

Simply migrating the data and moving the joins up to the application level might work in the beginning, but most of the time it doesn't scale in the end.
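
To make the denormalization point concrete, here is a small sketch that does the join once, at load time, and stores one wide row per user with a column per order, so that reading a user's orders needs no join. SQLite stands in for the relational source and a dict stands in for the column family; the schema, the sample data, and the name orders_by_user are assumptions for illustration.

```python
import sqlite3

# Relational source with two normalized tables (hypothetical schema).
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER,
                     placed_at TEXT, total REAL);
INSERT INTO users VALUES (1, 'John'), (2, 'Ann');
INSERT INTO orders VALUES (10, 1, '2010-08-01', 19.99),
                          (11, 1, '2010-08-05', 5.00),
                          (12, 2, '2010-08-06', 42.50);
""")

# Denormalized target: row key -> {column name: value}.
orders_by_user = {}

query = """
SELECT u.id, u.name, o.id, o.placed_at, o.total
FROM users u JOIN orders o ON o.user_id = u.id
ORDER BY u.id, o.placed_at
"""
for user_id, name, order_id, placed_at, total in db.execute(query):
    # One row per user; the user's name is copied into that row.
    row = orders_by_user.setdefault(f"user:{user_id}", {"name": name})
    # One column per order; the date in the column name keeps orders sorted.
    row[f"order:{placed_at}:{order_id}"] = f"{total:.2f}"
```

The row layout follows from the read the application needs ("show this user's orders"), which is the order of thinking suggested above: decide what you want out of storage first, then shape the data to match.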
