Posted to common-user@hadoop.apache.org by Min Zhou <co...@gmail.com> on 2009/08/04 05:15:29 UTC

how to dump data from a mysql cluster to hdfs?

Hi all,

We need to dump data from a MySQL cluster of about 50 nodes into an HDFS
file. For security reasons we can't use tools like Sqoop, which require
every datanode to hold a connection to MySQL. Any suggestions?


Thanks,
Min
-- 
My research interests are distributed systems, parallel computing and
bytecode based virtual machine.

My profile:
http://www.linkedin.com/in/coderplay
My blog:
http://coderplay.javaeye.com

Re: how to dump data from a mysql cluster to hdfs?

Posted by tim robertson <ti...@gmail.com>.
Sounds like you don't have many options other than finding a machine
that can reach the MySQL gateway, running a MySQL export there, copying
the dump to a machine that can talk to the Hadoop gateway (if necessary),
and then copying it into HDFS.

I'm not sure what else you could do.

Cheers

Tim
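
The staged copy described in this reply can be sketched as three shell
commands, one per hop. This is a minimal sketch, not a tested procedure:
the function name relay_plan, the host names, the database name and the
paths are all hypothetical, credentials are left out, and it assumes
mysqldump, scp and the hadoop CLI are installed on the relevant machines.

```python
import shlex


def relay_plan(db, mysql_gateway, hadoop_gateway, hdfs_dir, workdir="/tmp"):
    """Return the shell commands for the staged copy, in the order to run them.

    All arguments are illustrative placeholders; nothing is executed here.
    """
    dump = f"{workdir}/{db}.sql"
    return [
        # 1. On a machine that can reach the MySQL gateway: export to a local file.
        f"mysqldump --host={shlex.quote(mysql_gateway)} "
        f"{shlex.quote(db)} > {shlex.quote(dump)}",
        # 2. Copy the dump to a machine that can talk to the Hadoop gateway.
        f"scp {shlex.quote(dump)} "
        f"{shlex.quote(hadoop_gateway)}:{shlex.quote(dump)}",
        # 3. On that second machine: load the file into HDFS.
        f"hadoop fs -put {shlex.quote(dump)} {shlex.quote(hdfs_dir)}/",
    ]
```

Printing the list returned by relay_plan("sales", "mysql-gw.example.com",
"hadoop-gw.example.com", "/data/dumps") gives one command per hop, which
makes it easy to see which machine needs which network access.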



On Thu, Aug 6, 2009 at 12:31 PM, Min Zhou <co...@gmail.com> wrote:
> I guess I did not express myself clearly. Neither the datanodes nor the
> namenode is allowed to connect directly to the MySQL gateway.
> Besides, the namenode is often under heavy load, and dumping data on it
> would burden it further.

Re: how to dump data from a mysql cluster to hdfs?

Posted by Min Zhou <co...@gmail.com>.
I guess I did not express myself clearly. Neither the datanodes nor the
namenode is allowed to connect directly to the MySQL gateway.
Besides, the namenode is often under heavy load, and dumping data on it
would burden it further.

On Thu, Aug 6, 2009 at 1:52 PM, Yang Zhou <ya...@gmail.com> wrote:

> Write a Java program that dumps the data from the MySQL cluster and
> writes it into HDFS at the same time.
> Run it on the namenode. I assume the namenode is able to connect to the
> MySQL gateway.
> Will that work?




Re: how to dump data from a mysql cluster to hdfs?

Posted by Yang Zhou <ya...@gmail.com>.
Write a Java program that dumps the data from the MySQL cluster and
writes it into HDFS at the same time.
Run it on the namenode. I assume the namenode is able to connect to the
MySQL gateway.
Will that work?
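
The "dump and write at the same time" idea can be illustrated without
local staging by piping the export straight into HDFS. The sketch below
is in Python rather than Java and uses the command-line tools instead of
JDBC and the HDFS API, so it is a swapped-in technique, not the program
proposed above. It relies on "hadoop fs -put - <dst>" reading from
standard input. The function names, host name and paths are
hypothetical, credentials are omitted, and it only helps on a host that
is allowed to reach both the MySQL gateway and HDFS.

```python
import subprocess


def dump_commands(db, mysql_gateway, hdfs_path):
    """Build the producer and consumer argument lists (placeholders only)."""
    producer = ["mysqldump", f"--host={mysql_gateway}", db]
    # "-" as the source makes "hadoop fs -put" read the data from stdin.
    consumer = ["hadoop", "fs", "-put", "-", hdfs_path]
    return producer, consumer


def stream(producer, consumer):
    """Pipe the producer's stdout into the consumer's stdin.

    Returns (producer_exit_code, consumer_exit_code); both should be 0.
    """
    src = subprocess.Popen(producer, stdout=subprocess.PIPE)
    dst = subprocess.Popen(consumer, stdin=src.stdout)
    # Close our copy of the pipe so the producer is signalled if the
    # consumer exits early.
    src.stdout.close()
    dst_rc = dst.wait()
    src_rc = src.wait()
    return src_rc, dst_rc
```

A call such as stream(*dump_commands("sales", "mysql-gw.example.com",
"/data/dumps/sales.sql")) would run the transfer. Nothing is written to
the local disk, so the host running it needs bandwidth rather than
storage.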

On Thu, Aug 6, 2009 at 12:02 PM, Min Zhou <co...@gmail.com> wrote:

> Hi Aaron,
>
> We cannot run mysqldump on the nodes that mysqld runs on. The only way in
> is a connection to a gateway of the MySQL cluster. Our Hadoop cluster is
> also served through gateways, and Hadoop datanodes are not allowed to
> connect directly to the MySQL gateway.
>
> Min

Re: how to dump data from a mysql cluster to hdfs?

Posted by Min Zhou <co...@gmail.com>.
Hi Aaron,

We cannot run mysqldump on the nodes that mysqld runs on. The only way in
is a connection to a gateway of the MySQL cluster. Our Hadoop cluster is
also served through gateways, and Hadoop datanodes are not allowed to
connect directly to the MySQL gateway.

Min

On Thu, Aug 6, 2009 at 1:27 AM, Aaron Kimball <aa...@cloudera.com> wrote:

> mysqldump to local files on all 50 nodes, scp them to datanodes, and then
> bin/hadoop fs -put?
> - Aaron




Re: how to dump data from a mysql cluster to hdfs?

Posted by Aaron Kimball <aa...@cloudera.com>.
mysqldump to local files on all 50 nodes, scp them to datanodes, and then
bin/hadoop fs -put?
- Aaron

On Mon, Aug 3, 2009 at 8:15 PM, Min Zhou <co...@gmail.com> wrote:

> Hi all,
>
> We need to dump data from a MySQL cluster of about 50 nodes into an HDFS
> file. For security reasons we can't use tools like Sqoop, which require
> every datanode to hold a connection to MySQL. Any suggestions?
>
>
> Thanks,
> Min