Posted to user@sqoop.apache.org by A Saravanan <as...@alphaworkz.com> on 2014/07/16 12:18:13 UTC

FW: issue when importing data from mysql to hbase using sqoop API...

 

 

From: A Saravanan [mailto:asaravanan@alphaworkz.com] 
Sent: 16 July 2014 PM 02:58
To: 'dev@sqoop.apache.org'
Subject: issue when importing data from mysql to hbase using sqoop API...

 

Hi,

I am facing an issue when importing data from MySQL to HBase using the
Sqoop API. My scenario runs the Sqoop import for the same table twice, once
for each of two column families (cf1, cf2). The Sqoop process completes
successfully for both column families, but the values inside the HBase
table are different from what I expect.

 

CDH version   : 4.7
HBase version : 0.94.15
Sqoop version : 1.4.3

 

My expected output is:

hbase(main):061:0> scan 'tab2'
ROW                   COLUMN+CELL
1                    column=cf1:t2_col1, timestamp=1405496119028, value=23
1                    column=cf1:t2_col2, timestamp=1405496119028, value=45
1                    column=cf1:t2_col4, timestamp=1405496119028, value=78
1                    column=cf1:t2_col5, timestamp=1405496119028, value=89
1                    column=cf2:t2_col3, timestamp=1405496054925, value=67
1                    column=cf2:t2_col6, timestamp=1405496054925, value=55
2                    column=cf1:t2_col1, timestamp=1405496119028, value=24
2                    column=cf1:t2_col2, timestamp=1405496119028, value=46
2                    column=cf1:t2_col4, timestamp=1405496119028, value=80
2                    column=cf1:t2_col5, timestamp=1405496119028, value=92
2                    column=cf2:t2_col3, timestamp=1405496054925, value=68
2                    column=cf2:t2_col6, timestamp=1405496054925, value=60
3                    column=cf1:t2_col1, timestamp=1405496119028, value=25
3                    column=cf1:t2_col2, timestamp=1405496119028, value=47
3                    column=cf1:t2_col4, timestamp=1405496119028, value=82
3                    column=cf1:t2_col5, timestamp=1405496119028, value=95
3                    column=cf2:t2_col3, timestamp=1405496054925, value=69
3                    column=cf2:t2_col6, timestamp=1405496054925, value=65
4                    column=cf1:t2_col1, timestamp=1405496119028, value=26
4                    column=cf1:t2_col2, timestamp=1405496119028, value=48
4                    column=cf1:t2_col4, timestamp=1405496119028, value=84
4                    column=cf1:t2_col5, timestamp=1405496119028, value=98
4                    column=cf2:t2_col3, timestamp=1405496054925, value=70
4                    column=cf2:t2_col6, timestamp=1405496054925, value=70
5                    column=cf1:t2_col1, timestamp=1405496119028, value=27
5                    column=cf1:t2_col2, timestamp=1405496119028, value=49
5                    column=cf1:t2_col4, timestamp=1405496119028, value=86
5                    column=cf1:t2_col5, timestamp=1405496119028, value=101
5                    column=cf2:t2_col3, timestamp=1405496054925, value=71
5                    column=cf2:t2_col6, timestamp=1405496054925, value=75
5 row(s) in 0.0410 seconds

 

But the actual output is:

ROW                   COLUMN+CELL
1                    column=cf2:t2_col3, timestamp=1405495388483, value=67
1                    column=cf2:t2_col6, timestamp=1405495388483, value=55
2                    column=cf2:t2_col3, timestamp=1405495388483, value=68
2                    column=cf2:t2_col6, timestamp=1405495388483, value=60
3                    column=cf2:t2_col3, timestamp=1405495388483, value=69
3                    column=cf2:t2_col6, timestamp=1405495388483, value=65
4                    column=cf2:t2_col3, timestamp=1405495388483, value=70
4                    column=cf2:t2_col6, timestamp=1405495388483, value=70
5                    column=cf2:t2_col3, timestamp=1405495388483, value=71
5                    column=cf2:t2_col6, timestamp=1405495388483, value=75
78                   column=cf1:t2_col3, timestamp=1405495391894, value=23
78                   column=cf1:t2_col6, timestamp=1405495391894, value=45
80                   column=cf1:t2_col3, timestamp=1405495391894, value=24
80                   column=cf1:t2_col6, timestamp=1405495391894, value=46
82                   column=cf1:t2_col3, timestamp=1405495391894, value=25
82                   column=cf1:t2_col6, timestamp=1405495391894, value=47
84                   column=cf1:t2_col3, timestamp=1405495391894, value=26
84                   column=cf1:t2_col6, timestamp=1405495391894, value=48
86                   column=cf1:t2_col3, timestamp=1405495391894, value=27
86                   column=cf1:t2_col6, timestamp=1405495391894, value=49
10 row(s) in 0.0250 seconds

 

My MySQL table is:

rowkey  t2_col1  t2_col2  t2_col3  t2_col4  t2_col5  t2_col6
1       23       45       67       78       89       55
2       24       46       68       80       92       60
3       25       47       69       82       95       65
4       26       48       70       84       98       70
5       27       49       71       86       101      75

 

I have also attached my Java code below. Please check it and give some
suggestions. Thanks in advance.

 


Re: FW: issue when importing data from mysql to hbase using sqoop API...

Posted by Abraham Elmahrek <ab...@cloudera.com>.
Hey there,

It seems like you're calling the ImportTool directly. Instead, have you
tried using a command executor in Java like Runtime.getRuntime().exec?
Also, you should be able to specify the "verbose" option and it will tell
you the boundaries on your queries, etc.
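
A minimal sketch of that suggestion, assuming the sqoop launcher is on the
PATH. The JDBC URL, user name, and password are hypothetical placeholders;
the table name, row key, and the split of columns between cf1 and cf2 are
taken from the expected output in the original message. It uses
ProcessBuilder rather than Runtime.getRuntime().exec only because
inheritIO() forwards the --verbose log straight to the console; either one
starts a separate process for each import:

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SqoopCliImport {

    /** Builds the argument list for one "sqoop import" run into a single column family. */
    static List<String> buildCommand(String columnFamily, String columns) {
        return new ArrayList<>(Arrays.asList(
            "sqoop", "import",
            "--connect", "jdbc:mysql://localhost/testdb", // hypothetical JDBC URL
            "--username", "user",                         // hypothetical credentials
            "--password", "secret",
            "--table", "tab2",
            "--columns", columns,
            "--hbase-table", "tab2",
            "--column-family", columnFamily,
            "--hbase-row-key", "rowkey",
            "-m", "1",
            "--verbose"));
    }

    /** Runs the command as a child process and returns its exit code. */
    static int run(List<String> command) throws IOException, InterruptedException {
        // inheritIO() sends the child's stdout and stderr to this JVM's console.
        Process process = new ProcessBuilder(command).inheritIO().start();
        return process.waitFor();
    }

    public static void main(String[] args) throws Exception {
        // One independent process per column family.
        int cf1 = run(buildCommand("cf1", "rowkey,t2_col1,t2_col2,t2_col4,t2_col5"));
        int cf2 = run(buildCommand("cf2", "rowkey,t2_col3,t2_col6"));
        System.out.println("cf1 exit code: " + cf1 + ", cf2 exit code: " + cf2);
    }
}
```

Running each import in its own process means the second run starts from a
fresh JVM, which may help rule out state carried over from the first run.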

-Abe


