Posted to user@sqoop.apache.org by Michal Taborsky <mi...@gmail.com> on 2013/06/12 23:11:49 UTC

CLOB data not imported into HBase from Oracle

Hello,

I am running Sqoop 1.3.0-cdh3u4, as part of the Cloudera CDH.

I am trying to get data from Oracle 11gR2 to HBase. The import works, but
CLOB columns are not making it into HBase.

My simplest testcase:

In Oracle:
CREATE TABLE TABLE1 ( NUMCOL NUMBER, STRCOL VARCHAR2(20 BYTE), CLOBCOL CLOB );
INSERT INTO TABLE1 (NUMCOL, STRCOL, CLOBCOL) VALUES (1, 'strval', 'clobval');

The sqoop command I run is the following (the connect parameter is shortened,
but works):

sqoop import --connect="jdbc:oracle:thin:..." --table TABLE1 \
  --hbase-table table1 --hbase-create-table --hbase-row-key NUMCOL \
  --column-family d -m 1

The job runs OK; the only surprising thing is the second-to-last line:
13/06/12 23:00:06 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 7.3188 seconds (0 bytes/sec)
13/06/12 23:00:06 INFO mapreduce.ImportJobBase: Retrieved 1 records.

Anyway, this is what I see when I look at the table in HBase:

# hbase shell
Version 0.90.6-cdh3u4, r, Mon May  7 13:14:00 PDT 2012

hbase(main):001:0> scan 'table1'
ROW                            COLUMN+CELL
 1                             column=d:STRCOL, timestamp=1371070804479, value=strval
1 row(s) in 0.6070 seconds

The CLOBCOL is not there. CLOB handling in Sqoop must work in general,
because when I import the same table into Hive or just a text file, the CLOB
data is there. The problem exists only when importing into HBase. I tried
searching the Sqoop JIRA and the internet at large, but could not find any
mention of CLOBs not getting into HBase.

Thank you for your help,
Michal Taborsky

Re: CLOB data not imported into HBase from Oracle

Posted by Jarek Jarcec Cecho <ja...@apache.org>.
Thank you!

Jarcec

On Thu, Jun 13, 2013 at 08:46:25PM +0200, Michal Taborsky wrote:
> Created https://issues.apache.org/jira/browse/SQOOP-1079
> 
> Michal

Re: CLOB data not imported into HBase from Oracle

Posted by Michal Taborsky <mi...@gmail.com>.
Created https://issues.apache.org/jira/browse/SQOOP-1079

Michal

2013/6/13 Jarek Jarcec Cecho <ja...@apache.org>

> Hi Michal,
> I'm glad to see it working for you! I assume that this type mapping is
> just masking the real issue. Would you mind filing a JIRA for it? [1]
>
> Jarcec
>
> Links:
> 1: https://issues.apache.org/jira/browse/SQOOP

Re: CLOB data not imported into HBase from Oracle

Posted by Jarek Jarcec Cecho <ja...@apache.org>.
Hi Michal,
I'm glad to see it working for you! I assume that this type mapping is just masking the real issue. Would you mind filing a JIRA for it? [1]

Jarcec

Links:
1: https://issues.apache.org/jira/browse/SQOOP

On Thu, Jun 13, 2013 at 05:12:49PM +0200, Michal Taborsky wrote:
> Thanks Jarcec.
> 
> This fixed the issue and the data is coming into HBase. Sqoop could do this
> automatically, I suppose, because for the Hive import it works.
> 
> But at least now I have a workaround. It works even in the 1.3.0 version,
> by the way.
> 
> Thanks again,
> Michal Taborsky

Re: CLOB data not imported into HBase from Oracle

Posted by Michal Taborsky <mi...@gmail.com>.
Thanks Jarcec.

This fixed the issue and the data is coming into HBase. Sqoop could do this
automatically, I suppose, because for the Hive import it works.

But at least now I have a workaround. It works even in the 1.3.0 version,
by the way.

Thanks again,
Michal Taborsky
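
One way to confirm that the workaround took effect is to re-run the scan from the original post and look for a d:CLOBCOL cell. The sketch below assumes that "hbase shell" accepts commands on standard input; the check_clobcol function and the HBASE_SHELL override are hypothetical names used only for illustration.

```shell
#!/bin/sh
# Sketch only: check_clobcol and HBASE_SHELL are illustrative names,
# not HBase features. Assumes "hbase shell" reads commands from stdin.

check_clobcol() {
  # Re-run the scan from the original post and keep only CLOBCOL cells.
  # grep exits non-zero when no d:CLOBCOL cell is present.
  echo "scan 'table1'" | ${HBASE_SHELL:-hbase shell} | grep 'column=d:CLOBCOL'
}

# Usage (needs a running HBase):
#   check_clobcol && echo "CLOBCOL was imported"
```

The function's exit status is that of grep, so it can be used directly in a condition after each import attempt.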


2013/6/13 Jarek Jarcec Cecho <ja...@apache.org>

> Thank you for upgrading your Sqoop installation Michal! Would you mind
> trying to map the column from CLOB to string to see if that helps?
> Something like:
>
>   sqoop import ... --map-column-java CLOBCOL=String
>
> More information about type mapping can be found in our user guide:
>
>
> http://sqoop.apache.org/docs/1.4.3/SqoopUserGuide.html#_controlling_type_mapping
>
> Jarcec

Re: CLOB data not imported into HBase from Oracle

Posted by Jarek Jarcec Cecho <ja...@apache.org>.
Thank you for upgrading your Sqoop installation, Michal! Would you mind trying to map the column from CLOB to String to see if that helps? Something like:

  sqoop import ... --map-column-java CLOBCOL=String

More information about type mapping can be found in our user guide:

http://sqoop.apache.org/docs/1.4.3/SqoopUserGuide.html#_controlling_type_mapping

Jarcec
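
Combined with the command from the original post, the full import would look roughly like the sketch below. It is only a sketch: JDBC_URL stands in for the connect string that the thread elides, and the DRY_RUN switch and the import_table1 function name are illustrative helpers, not part of Sqoop. The sqoop flags themselves are the ones already quoted in this thread.

```shell
#!/bin/sh
# Sketch only: JDBC_URL is a placeholder for the elided connect string;
# DRY_RUN and import_table1 are illustrative helpers, not part of Sqoop.
JDBC_URL="${JDBC_URL:-jdbc:oracle:thin:...}"

import_table1() {
  # With DRY_RUN set to a non-empty value, the command is printed
  # instead of executed.
  ${DRY_RUN:+echo} sqoop import \
    --connect "$JDBC_URL" \
    --table TABLE1 \
    --hbase-table table1 \
    --hbase-create-table \
    --hbase-row-key NUMCOL \
    --column-family d \
    --map-column-java CLOBCOL=String \
    -m 1
}

# Print the command for review; unset DRY_RUN to run the real import.
DRY_RUN=1
import_table1
```

The only difference from the original command is the added --map-column-java CLOBCOL=String option, which asks Sqoop to treat the CLOB column as a Java String.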

On Thu, Jun 13, 2013 at 09:13:16AM +0200, Michal Taborsky wrote:
> Thanks, Jarcec, for the suggestion. Well, I tried with 1.4.3 and the result
> is the same.
> 
> Michal Taborsky

Re: CLOB data not imported into HBase from Oracle

Posted by Michal Taborsky <mi...@gmail.com>.
Thanks, Jarcec, for the suggestion. Well, I tried with 1.4.3 and the result
is the same.

Michal Taborsky


2013/6/13 Jarek Jarcec Cecho <ja...@apache.org>

> Hi Michal,
> version 1.3.0 is quite an old release (and CDH3 is not supported anymore),
> so I would strongly suggest that you upgrade to the latest release,
> which can be downloaded from [1].
>
> Jarcec
>
> Links:
> 1: http://www.apache.org/dyn/closer.cgi/sqoop/1.4.3

Re: CLOB data not imported into HBase from Oracle

Posted by Jarek Jarcec Cecho <ja...@apache.org>.
Hi Michal,
version 1.3.0 is quite an old release (and CDH3 is not supported anymore), so I would strongly suggest that you upgrade to the latest release, which can be downloaded from [1].

Jarcec

Links:
1: http://www.apache.org/dyn/closer.cgi/sqoop/1.4.3
