You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@sqoop.apache.org by Robber Phex <ro...@gmail.com> on 2013/05/20 15:42:14 UTC

mysql to hbase, how to custom row-family?

I have a table in mysql like this:
+-------+-------------+------+-----+---------+-------+
| Field | Type        | Null | Key | Default | Extra |
+-------+-------------+------+-----+---------+-------+
| id    | int(11)     | NO   | PRI | NULL    |       |
| name  | varchar(45) | YES  |     | NULL    |       |
| age   | int(11)     | YES  |     | NULL    |       |
| email | varchar(20) | YES  |     | NULL    |       |
| title | varchar(10) | YES  |     | NULL    |       |
+-------+-------------+------+-----+---------+-------+
I want to export it to hbase by sqoop.
I execute this:
sqoop import --connect jdbc:mysql://h2/test --username testu --password
testp \
  --table testtable --hbase-create-table --hbase-table hbasetable \
  --column-family name --column-family age --column-family email
--column-family title \
  --hbase-row-key id

I get full data in hbase, but I want custom row-family. Exactly I need
transformation like this:

HBase                       MySQL Columns

row key    <-----------     id
baseinfo   <-----------     name
                 `-<---     age
workinfo   <-----------     email
                 `-<---     title

Anyone have any idea?

-- 
Regards,
RobberPhex

About me: http://about.me/RobberPhex

Re: mysql to hbase, how to custom row-family?

Posted by Jarek Jarcec Cecho <ja...@apache.org>.
Hi Robber,
I'm glad to hear that you were able to find workaround for your use case! It seems that the HBase import wasn't working as expected for you when using --columns argument. If my understanding is correct, would you mind filling a JIRA [1], so that it can be fixed in future releases?

Jarcec

Links:
1: https://issues.apache.org/jira/browse/SQOOP

On Tue, May 21, 2013 at 01:34:01AM +0800, Robber Phex wrote:
> This is data in mysql table:
> +----+------+------+-------------+-------+
> | id | name | age  | email       | title |
> +----+------+------+-------------+-------+
> |  1 | aaaa |   20 | a@gmail.com | A     |
> |  2 | bbbb |   21 | a@gmail.com | B     |
> |  3 | cccc |   21 | c@gmail.com | A     |
> +----+------+------+-------------+-------+
> 
> and, I want data in hbase:
> ROW
> COLUMN+CELL
>  1                    column=baseinfo:age, timestamp=1369041765578,
> value=20
>  1                    column=baseinfo:name, timestamp=1369041765578,
> value=aaaa
>  1                    column=workinfo:email, timestamp=1369041967157,value=
> a@gmail.com
>  1                    column=workinfo:title, timestamp=1369041967157,
> value=A
>  2                    column=baseinfo:age, timestamp=1369041778521,
> value=21
>  2                    column=baseinfo:name, timestamp=1369041778521,
> value=bbbb
>  2                    column=workinfo:email, timestamp=1369041970526, value=
> a@gmail.com
>  2                    column=workinfo:title, timestamp=1369041970526,
> value=B
>  3                    column=baseinfo:age, timestamp=1369041778521,
> value=21
>  3                    column=baseinfo:name, timestamp=1369041778521,
> value=cccc
>  3                    column=workinfo:email, timestamp=1369041970526, value=
> c@gmail.com
>  3                    column=workinfo:title, timestamp=1369041970526,
> value=A
> 3 row(s) in 1.1040 seconds
> 
> I use two commonds:
> #name,age to baseinfo
> sqoop import --connect jdbc:mysql://h2/test --username hiveuser --password
> password \
>   --query 'SELECT id,name,age FROM testtable WHERE $CONDITIONS' \
>   --hbase-create-table --hbase-table hbasetable \
>   --column-family baseinfo --hbase-row-key id --split-by id
> 
> #email,title to workinfo
> sqoop import --connect jdbc:mysql://h2/test --username hiveuser --password
> password \
>   --query 'SELECT id,email,title FROM testtable WHERE $CONDITIONS' \
>   --hbase-create-table --hbase-table hbasetable \
>   --column-family workinfo --hbase-row-key id --split-by id
> 
> Anyone have more simple, easy-understand way?
> 
> On Tue, May 21, 2013 at 12:55 AM, Jarek Jarcec Cecho <ja...@apache.org>wrote:
> 
> > Hi Robber,
> > Sqoop won't import rows where all columns that should be inserted into
> > cell contains NULL. Would you mind taking a look if in your case the name
> > and age are NULL? Another possibility to set the property
> > sqoop.hbase.add.row.key. You might find more information about this
> > property in SQOOP-870 [1] or in our user guide [1].
> >
> > Jarcec
> >
> > Links:
> > 1: https://issues.apache.org/jira/browse/SQOOP-870
> > 2:
> > http://sqoop.apache.org/docs/1.4.3/SqoopUserGuide.html#_additional_import_configuration_properties
> >
> > On Mon, May 20, 2013 at 11:02:56PM +0800, Robber Phex wrote:
> > > I think I can do it:
> > >
> > > HBase                       MySQL Columns
> > >
> > > row key    <-----------     id
> > > baseinfo   <-----------     name
> > >                  `-<---     age
> > >
> > > But, I use following command import zero record.
> > >
> > > sqoop import --connect jdbc:mysql://h2/test --username hiveuser
> > --password
> > > password --table testtable --hbase-table hbasetable --column-family
> > > baseinfo --columns "name,age" --hbase-row-key id
> > >
> > > Or,you means:
> > >
> > > sqoop import --connect jdbc:mysql://h2/test --username hiveuser
> > --password
> > > password --table testtable --hbase-table hbasetable --column-family name
> > > --hbase-row-key id
> > >
> > > sqoop import --connect jdbc:mysql://h2/test --username hiveuser
> > --password
> > > password --table testtable --hbase-table hbasetable --column-family age
> > > --hbase-row-key id
> > >
> > > Thanks.
> > >
> > > On Mon, May 20, 2013 at 10:10 PM, Jarek Jarcec Cecho <jarcec@apache.org
> > >wrote:
> > >
> > > > Hi Robber,
> > > > Sqoop currently supports importing only into single column family. You
> > can
> > > > import data twice using two different column families in each case and
> > > > parameter --columns to filter only relevant columns for given column
> > family.
> > > >
> > > > Jarcec
> > > >
> > > > On Mon, May 20, 2013 at 09:42:14PM +0800, Robber Phex wrote:
> > > > > I have a table in mysql like this:
> > > > > +-------+-------------+------+-----+---------+-------+
> > > > > | Field | Type        | Null | Key | Default | Extra |
> > > > > +-------+-------------+------+-----+---------+-------+
> > > > > | id    | int(11)     | NO   | PRI | NULL    |       |
> > > > > | name  | varchar(45) | YES  |     | NULL    |       |
> > > > > | age   | int(11)     | YES  |     | NULL    |       |
> > > > > | email | varchar(20) | YES  |     | NULL    |       |
> > > > > | title | varchar(10) | YES  |     | NULL    |       |
> > > > > +-------+-------------+------+-----+---------+-------+
> > > > > I want to export it to hbase by sqoop.
> > > > > I execute this:
> > > > > sqoop import --connect jdbc:mysql://h2/test --username testu
> > --password
> > > > > testp \
> > > > >   --table testtable --hbase-create-table --hbase-table hbasetable \
> > > > >   --column-family name --column-family age --column-family email
> > > > > --column-family title \
> > > > >   --hbase-row-key id
> > > > >
> > > > > I get full data in hbase, but I want custom row-family. Exactly I
> > need
> > > > > transformation like this:
> > > > >
> > > > > HBase                       MySQL Columns
> > > > >
> > > > > row key    <-----------     id
> > > > > baseinfo   <-----------     name
> > > > >                  `-<---     age
> > > > > workinfo   <-----------     email
> > > > >                  `-<---     title
> > > > >
> > > > > Anyone have any idea?
> > > > >
> > > > > --
> > > > > Regards,
> > > > > RobberPhex
> > > > >
> > > > > About me: http://about.me/RobberPhex
> > > >
> > >
> > >
> > >
> > > --
> > > Regards,
> > > RobberPhex
> > >
> > > About me: http://about.me/RobberPhex
> >
> 
> 
> 
> -- 
> Regards,
> RobberPhex
> 
> About me: http://about.me/RobberPhex

Re: mysql to hbase, how to custom row-family?

Posted by Robber Phex <ro...@gmail.com>.
This is data in mysql table:
+----+------+------+-------------+-------+
| id | name | age  | email       | title |
+----+------+------+-------------+-------+
|  1 | aaaa |   20 | a@gmail.com | A     |
|  2 | bbbb |   21 | a@gmail.com | B     |
|  3 | cccc |   21 | c@gmail.com | A     |
+----+------+------+-------------+-------+

and, I want data in hbase:
ROW
COLUMN+CELL
 1                    column=baseinfo:age, timestamp=1369041765578,
value=20
 1                    column=baseinfo:name, timestamp=1369041765578,
value=aaaa
 1                    column=workinfo:email, timestamp=1369041967157,value=
a@gmail.com
 1                    column=workinfo:title, timestamp=1369041967157,
value=A
 2                    column=baseinfo:age, timestamp=1369041778521,
value=21
 2                    column=baseinfo:name, timestamp=1369041778521,
value=bbbb
 2                    column=workinfo:email, timestamp=1369041970526, value=
a@gmail.com
 2                    column=workinfo:title, timestamp=1369041970526,
value=B
 3                    column=baseinfo:age, timestamp=1369041778521,
value=21
 3                    column=baseinfo:name, timestamp=1369041778521,
value=cccc
 3                    column=workinfo:email, timestamp=1369041970526, value=
c@gmail.com
 3                    column=workinfo:title, timestamp=1369041970526,
value=A
3 row(s) in 1.1040 seconds

I use two commonds:
#name,age to baseinfo
sqoop import --connect jdbc:mysql://h2/test --username hiveuser --password
password \
  --query 'SELECT id,name,age FROM testtable WHERE $CONDITIONS' \
  --hbase-create-table --hbase-table hbasetable \
  --column-family baseinfo --hbase-row-key id --split-by id

#email,title to workinfo
sqoop import --connect jdbc:mysql://h2/test --username hiveuser --password
password \
  --query 'SELECT id,email,title FROM testtable WHERE $CONDITIONS' \
  --hbase-create-table --hbase-table hbasetable \
  --column-family workinfo --hbase-row-key id --split-by id

Anyone have more simple, easy-understand way?

On Tue, May 21, 2013 at 12:55 AM, Jarek Jarcec Cecho <ja...@apache.org>wrote:

> Hi Robber,
> Sqoop won't import rows where all columns that should be inserted into
> cell contains NULL. Would you mind taking a look if in your case the name
> and age are NULL? Another possibility to set the property
> sqoop.hbase.add.row.key. You might find more information about this
> property in SQOOP-870 [1] or in our user guide [1].
>
> Jarcec
>
> Links:
> 1: https://issues.apache.org/jira/browse/SQOOP-870
> 2:
> http://sqoop.apache.org/docs/1.4.3/SqoopUserGuide.html#_additional_import_configuration_properties
>
> On Mon, May 20, 2013 at 11:02:56PM +0800, Robber Phex wrote:
> > I think I can do it:
> >
> > HBase                       MySQL Columns
> >
> > row key    <-----------     id
> > baseinfo   <-----------     name
> >                  `-<---     age
> >
> > But, I use following command import zero record.
> >
> > sqoop import --connect jdbc:mysql://h2/test --username hiveuser
> --password
> > password --table testtable --hbase-table hbasetable --column-family
> > baseinfo --columns "name,age" --hbase-row-key id
> >
> > Or,you means:
> >
> > sqoop import --connect jdbc:mysql://h2/test --username hiveuser
> --password
> > password --table testtable --hbase-table hbasetable --column-family name
> > --hbase-row-key id
> >
> > sqoop import --connect jdbc:mysql://h2/test --username hiveuser
> --password
> > password --table testtable --hbase-table hbasetable --column-family age
> > --hbase-row-key id
> >
> > Thanks.
> >
> > On Mon, May 20, 2013 at 10:10 PM, Jarek Jarcec Cecho <jarcec@apache.org
> >wrote:
> >
> > > Hi Robber,
> > > Sqoop currently supports importing only into single column family. You
> can
> > > import data twice using two different column families in each case and
> > > parameter --columns to filter only relevant columns for given column
> family.
> > >
> > > Jarcec
> > >
> > > On Mon, May 20, 2013 at 09:42:14PM +0800, Robber Phex wrote:
> > > > I have a table in mysql like this:
> > > > +-------+-------------+------+-----+---------+-------+
> > > > | Field | Type        | Null | Key | Default | Extra |
> > > > +-------+-------------+------+-----+---------+-------+
> > > > | id    | int(11)     | NO   | PRI | NULL    |       |
> > > > | name  | varchar(45) | YES  |     | NULL    |       |
> > > > | age   | int(11)     | YES  |     | NULL    |       |
> > > > | email | varchar(20) | YES  |     | NULL    |       |
> > > > | title | varchar(10) | YES  |     | NULL    |       |
> > > > +-------+-------------+------+-----+---------+-------+
> > > > I want to export it to hbase by sqoop.
> > > > I execute this:
> > > > sqoop import --connect jdbc:mysql://h2/test --username testu
> --password
> > > > testp \
> > > >   --table testtable --hbase-create-table --hbase-table hbasetable \
> > > >   --column-family name --column-family age --column-family email
> > > > --column-family title \
> > > >   --hbase-row-key id
> > > >
> > > > I get full data in hbase, but I want custom row-family. Exactly I
> need
> > > > transformation like this:
> > > >
> > > > HBase                       MySQL Columns
> > > >
> > > > row key    <-----------     id
> > > > baseinfo   <-----------     name
> > > >                  `-<---     age
> > > > workinfo   <-----------     email
> > > >                  `-<---     title
> > > >
> > > > Anyone have any idea?
> > > >
> > > > --
> > > > Regards,
> > > > RobberPhex
> > > >
> > > > About me: http://about.me/RobberPhex
> > >
> >
> >
> >
> > --
> > Regards,
> > RobberPhex
> >
> > About me: http://about.me/RobberPhex
>



-- 
Regards,
RobberPhex

About me: http://about.me/RobberPhex

Re: mysql to hbase, how to custom row-family?

Posted by Jarek Jarcec Cecho <ja...@apache.org>.
Hi Robber,
Sqoop won't import rows where all columns that should be inserted into cell contains NULL. Would you mind taking a look if in your case the name and age are NULL? Another possibility to set the property sqoop.hbase.add.row.key. You might find more information about this property in SQOOP-870 [1] or in our user guide [1].

Jarcec

Links:
1: https://issues.apache.org/jira/browse/SQOOP-870
2: http://sqoop.apache.org/docs/1.4.3/SqoopUserGuide.html#_additional_import_configuration_properties

On Mon, May 20, 2013 at 11:02:56PM +0800, Robber Phex wrote:
> I think I can do it:
> 
> HBase                       MySQL Columns
> 
> row key    <-----------     id
> baseinfo   <-----------     name
>                  `-<---     age
> 
> But, I use following command import zero record.
> 
> sqoop import --connect jdbc:mysql://h2/test --username hiveuser --password
> password --table testtable --hbase-table hbasetable --column-family
> baseinfo --columns "name,age" --hbase-row-key id
> 
> Or,you means:
> 
> sqoop import --connect jdbc:mysql://h2/test --username hiveuser --password
> password --table testtable --hbase-table hbasetable --column-family name
> --hbase-row-key id
> 
> sqoop import --connect jdbc:mysql://h2/test --username hiveuser --password
> password --table testtable --hbase-table hbasetable --column-family age
> --hbase-row-key id
> 
> Thanks.
> 
> On Mon, May 20, 2013 at 10:10 PM, Jarek Jarcec Cecho <ja...@apache.org>wrote:
> 
> > Hi Robber,
> > Sqoop currently supports importing only into single column family. You can
> > import data twice using two different column families in each case and
> > parameter --columns to filter only relevant columns for given column family.
> >
> > Jarcec
> >
> > On Mon, May 20, 2013 at 09:42:14PM +0800, Robber Phex wrote:
> > > I have a table in mysql like this:
> > > +-------+-------------+------+-----+---------+-------+
> > > | Field | Type        | Null | Key | Default | Extra |
> > > +-------+-------------+------+-----+---------+-------+
> > > | id    | int(11)     | NO   | PRI | NULL    |       |
> > > | name  | varchar(45) | YES  |     | NULL    |       |
> > > | age   | int(11)     | YES  |     | NULL    |       |
> > > | email | varchar(20) | YES  |     | NULL    |       |
> > > | title | varchar(10) | YES  |     | NULL    |       |
> > > +-------+-------------+------+-----+---------+-------+
> > > I want to export it to hbase by sqoop.
> > > I execute this:
> > > sqoop import --connect jdbc:mysql://h2/test --username testu --password
> > > testp \
> > >   --table testtable --hbase-create-table --hbase-table hbasetable \
> > >   --column-family name --column-family age --column-family email
> > > --column-family title \
> > >   --hbase-row-key id
> > >
> > > I get full data in hbase, but I want custom row-family. Exactly I need
> > > transformation like this:
> > >
> > > HBase                       MySQL Columns
> > >
> > > row key    <-----------     id
> > > baseinfo   <-----------     name
> > >                  `-<---     age
> > > workinfo   <-----------     email
> > >                  `-<---     title
> > >
> > > Anyone have any idea?
> > >
> > > --
> > > Regards,
> > > RobberPhex
> > >
> > > About me: http://about.me/RobberPhex
> >
> 
> 
> 
> -- 
> Regards,
> RobberPhex
> 
> About me: http://about.me/RobberPhex

Re: mysql to hbase, how to custom row-family?

Posted by Robber Phex <ro...@gmail.com>.
I think I can do it:

HBase                       MySQL Columns

row key    <-----------     id
baseinfo   <-----------     name
                 `-<---     age

But, I use following command import zero record.

sqoop import --connect jdbc:mysql://h2/test --username hiveuser --password
password --table testtable --hbase-table hbasetable --column-family
baseinfo --columns "name,age" --hbase-row-key id

Or,you means:

sqoop import --connect jdbc:mysql://h2/test --username hiveuser --password
password --table testtable --hbase-table hbasetable --column-family name
--hbase-row-key id

sqoop import --connect jdbc:mysql://h2/test --username hiveuser --password
password --table testtable --hbase-table hbasetable --column-family age
--hbase-row-key id

Thanks.

On Mon, May 20, 2013 at 10:10 PM, Jarek Jarcec Cecho <ja...@apache.org>wrote:

> Hi Robber,
> Sqoop currently supports importing only into single column family. You can
> import data twice using two different column families in each case and
> parameter --columns to filter only relevant columns for given column family.
>
> Jarcec
>
> On Mon, May 20, 2013 at 09:42:14PM +0800, Robber Phex wrote:
> > I have a table in mysql like this:
> > +-------+-------------+------+-----+---------+-------+
> > | Field | Type        | Null | Key | Default | Extra |
> > +-------+-------------+------+-----+---------+-------+
> > | id    | int(11)     | NO   | PRI | NULL    |       |
> > | name  | varchar(45) | YES  |     | NULL    |       |
> > | age   | int(11)     | YES  |     | NULL    |       |
> > | email | varchar(20) | YES  |     | NULL    |       |
> > | title | varchar(10) | YES  |     | NULL    |       |
> > +-------+-------------+------+-----+---------+-------+
> > I want to export it to hbase by sqoop.
> > I execute this:
> > sqoop import --connect jdbc:mysql://h2/test --username testu --password
> > testp \
> >   --table testtable --hbase-create-table --hbase-table hbasetable \
> >   --column-family name --column-family age --column-family email
> > --column-family title \
> >   --hbase-row-key id
> >
> > I get full data in hbase, but I want custom row-family. Exactly I need
> > transformation like this:
> >
> > HBase                       MySQL Columns
> >
> > row key    <-----------     id
> > baseinfo   <-----------     name
> >                  `-<---     age
> > workinfo   <-----------     email
> >                  `-<---     title
> >
> > Anyone have any idea?
> >
> > --
> > Regards,
> > RobberPhex
> >
> > About me: http://about.me/RobberPhex
>



-- 
Regards,
RobberPhex

About me: http://about.me/RobberPhex

Re: mysql to hbase, how to custom row-family?

Posted by Jarek Jarcec Cecho <ja...@apache.org>.
Hi Robber,
Sqoop currently supports importing only into single column family. You can import data twice using two different column families in each case and parameter --columns to filter only relevant columns for given column family.

Jarcec

On Mon, May 20, 2013 at 09:42:14PM +0800, Robber Phex wrote:
> I have a table in mysql like this:
> +-------+-------------+------+-----+---------+-------+
> | Field | Type        | Null | Key | Default | Extra |
> +-------+-------------+------+-----+---------+-------+
> | id    | int(11)     | NO   | PRI | NULL    |       |
> | name  | varchar(45) | YES  |     | NULL    |       |
> | age   | int(11)     | YES  |     | NULL    |       |
> | email | varchar(20) | YES  |     | NULL    |       |
> | title | varchar(10) | YES  |     | NULL    |       |
> +-------+-------------+------+-----+---------+-------+
> I want to export it to hbase by sqoop.
> I execute this:
> sqoop import --connect jdbc:mysql://h2/test --username testu --password
> testp \
>   --table testtable --hbase-create-table --hbase-table hbasetable \
>   --column-family name --column-family age --column-family email
> --column-family title \
>   --hbase-row-key id
> 
> I get full data in hbase, but I want custom row-family. Exactly I need
> transformation like this:
> 
> HBase                       MySQL Columns
> 
> row key    <-----------     id
> baseinfo   <-----------     name
>                  `-<---     age
> workinfo   <-----------     email
>                  `-<---     title
> 
> Anyone have any idea?
> 
> -- 
> Regards,
> RobberPhex
> 
> About me: http://about.me/RobberPhex