Posted to dev@sqoop.apache.org by "Michal Taborsky (JIRA)" <ji...@apache.org> on 2013/06/13 20:43:21 UTC

[jira] [Created] (SQOOP-1079) CLOB data not imported into HBase from Oracle

Michal Taborsky created SQOOP-1079:
--------------------------------------

             Summary: CLOB data not imported into HBase from Oracle
                 Key: SQOOP-1079
                 URL: https://issues.apache.org/jira/browse/SQOOP-1079
             Project: Sqoop
          Issue Type: Bug
          Components: connectors/oracle, hbase-integration
    Affects Versions: 1.4.3, 1.3.0
         Environment: Oracle 11gR2, Linux, Oracle Thin Client
            Reporter: Michal Taborsky
            Priority: Minor


I am trying to import data from Oracle 11gR2 into HBase. The import runs, but CLOB columns do not make it into HBase.

Simplest test case:

In Oracle:
CREATE TABLE TABLE1 (NUMCOL NUMBER, STRCOL VARCHAR2(20 BYTE), CLOBCOL CLOB);
INSERT INTO TABLE1 (NUMCOL, STRCOL, CLOBCOL) VALUES (1, 'strval','clobval');

The sqoop command I run is the following (the connect parameter is shortened here, but the real one works):

sqoop import --connect="jdbc:oracle:thin:..." --table TABLE1 --hbase-table table1 --hbase-create-table --hbase-row-key NUMCOL --column-family d -m 1

The job runs OK; the only surprising thing is the second-to-last line of the output, which reports 0 bytes transferred:
13/06/12 23:00:06 INFO mapreduce.ImportJobBase: Transferred 0 bytes in 7.3188 seconds (0 bytes/sec)
13/06/12 23:00:06 INFO mapreduce.ImportJobBase: Retrieved 1 records.

Looking at the table in HBase afterwards:

# hbase shell
Version 0.90.6-cdh3u4, r, Mon May  7 13:14:00 PDT 2012

hbase(main):001:0> scan 'table1'
ROW                            COLUMN+CELL
 1                             column=d:STRCOL, timestamp=1371070804479, value=strval
1 row(s) in 0.6070 seconds

The CLOBCOL column is not there.

The problem can be worked around by appending a column-mapping parameter to the import command:
--map-column-java CLOBCOL=String

With this parameter, the data gets into HBase.

hbase(main):001:0> scan 'table1'
ROW                   COLUMN+CELL
 1                    column=d:CLOBCOL, timestamp=1371135224197, value=clobval
 1                    column=d:STRCOL, timestamp=1371135224197, value=strval
1 row(s) in 0.5260 seconds
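
For reference, the complete import command with the workaround applied might look like the sketch below. It is the command from this report with --map-column-java CLOBCOL=String added. The run_import wrapper function and the CONNECT variable are illustrative names, not part of Sqoop, and the JDBC URL stays shortened exactly as above.

```shell
# Assumption: the JDBC URL is shortened in this report and is left as-is.
CONNECT='jdbc:oracle:thin:...'

# Hypothetical helper: any arguments are placed in front of the sqoop
# command, so `run_import echo` prints it and `run_import` executes it.
run_import() {
  "$@" sqoop import \
    --connect="$CONNECT" \
    --table TABLE1 \
    --hbase-table table1 \
    --hbase-create-table \
    --hbase-row-key NUMCOL \
    --column-family d \
    --map-column-java CLOBCOL=String \
    -m 1
}

# Dry run: print the full command for review without contacting Oracle or HBase.
run_import echo
```

Calling run_import with no arguments runs the actual import once the real connect string is filled in.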

