You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@sqoop.apache.org by YouPeng Yang <yy...@gmail.com> on 2013/05/22 12:20:06 UTC

Re: Export in free form And Mixed update/insert

Hi Jarek Jarcec Cecho

  I also have found the tip. Thank you.

  Here comes another question.

  I find that  sqoop-1.4  support one  to update rows if they exist in the
database already or insert rows if they do not exist yet  by using
--update-key <col-name> --update-mode <mode>.

  However, I got the error when I tried it:
  ERROR tool.ExportTool: Error during export: Mixed update/insert is not
supported against the target database yet

 Note: the database is oracle.

 1.Does it only supports the MySQL,but I found nothing that hint this in
docs.

 2.Is there any solutions that fullfil my issue to update rows if they
exist in the database already or insert rows if they do not exist yet



Thanks you.


Regards.


2013/5/22 Jarek Jarcec Cecho <ja...@apache.org>

> Hi YouPeng,
> Sqoop 1 do not supports custom insert query when exporting data from HDFS.
> I think that in your use case you can use parameter --columns to specify
> which columns and in what order are present on HDFS, for example:
>
>   sqoop ... --columns ID,TIMEID,COLA,COLB
>
> Jarcec
>
> On Wed, May 22, 2013 at 02:49:04PM +0800, YouPeng Yang wrote:
> > Hi
> >   I want to  export data on the HDFS to the oracle database with
> > sqoop-1.4(sqoop-1.4.1-cdh4.1.2). However the columns betwean HDFS and
> > Oracle table  are  not exactly same to each other.
> >
> >  For example,Data on HDFS:
> > -------------------------------------------------------------
> > | ID |      TIMEID          |   COLA  | COLB  |
> > -------------------------------------------------------------
> > | 6  |  201305221335    |    0        |  20      |
> > -------------------------------------------------------------
> >
> > the Oracle table:
> > ------------------------------------------------------------------------
> > | ID |      TIMEID          |   COLC  | COLB   | COLA |
> > ------------------------------------------------------------------------
> > | 7  |  201305221335    |     kk      |  20      |  1       |
> > ------------------------------------------------------------------------
> > Note:Additional COLC and unsame order.
> >
> >
> > I notice the sqoop export command:
> > --export-dir   HDFS source path for the export
> > --table  Table to populate.
> > It seams not to export the data to oracle in free from just as Free-form
> > Query Imports
> > using the --query argument.
> >
> > Could I acheive that goal ?
> >
> > Thanks very much
> >
> > Regards
>

Re: Export in free form And Mixed update/insert

Posted by Felix GV <fe...@mate1inc.com>.
MySQL has built-in INSERT ... ON DUPLICATE KEY UPDATE and INSERT
IGNOREstatements, but this is not standard SQL and, as far as I know,
Oracle
doesn't support it.

There are work arounds<http://stackoverflow.com/questions/237327/oracle-how-to-upsert-update-or-insert-into-a-table>in
Oracle, but they seem messier to implement, which would probably
explain
why the sqoop team didn't get to it.

In any case, you should know that the MySQL calls I mentioned above are
very costly, and should probably be avoided in most cases, if you have any
concerns for scaling and performance...

They're convenient, but not as essential as one may think. You can often
design your schemas and data flows differently and side step the need for
these costly shortcuts altogether.

--
Felix


On Wed, May 22, 2013 at 6:20 AM, YouPeng Yang <yy...@gmail.com>wrote:

> Hi Jarek Jarcec Cecho
>
>   I also have found the tip. Thank you.
>
>   Here comes another question.
>
>   I find that  sqoop-1.4  support one  to update rows if they exist in
> the database already or insert rows if they do not exist yet  by using
> --update-key <col-name> --update-mode <mode>.
>
>   However, I got the error when I tried it:
>   ERROR tool.ExportTool: Error during export: Mixed update/insert is not
> supported against the target database yet
>
>  Note: the database is oracle.
>
>  1.Does it only supports the MySQL,but I found nothing that hint this in
> docs.
>
>  2.Is there any solutions that fullfil my issue to update rows if they
> exist in the database already or insert rows if they do not exist yet
>
>
>
> Thanks you.
>
>
> Regards.
>
>
> 2013/5/22 Jarek Jarcec Cecho <ja...@apache.org>
>
>> Hi YouPeng,
>> Sqoop 1 do not supports custom insert query when exporting data from
>> HDFS. I think that in your use case you can use parameter --columns to
>> specify which columns and in what order are present on HDFS, for example:
>>
>>   sqoop ... --columns ID,TIMEID,COLA,COLB
>>
>> Jarcec
>>
>> On Wed, May 22, 2013 at 02:49:04PM +0800, YouPeng Yang wrote:
>> > Hi
>> >   I want to  export data on the HDFS to the oracle database with
>> > sqoop-1.4(sqoop-1.4.1-cdh4.1.2). However the columns betwean HDFS and
>> > Oracle table  are  not exactly same to each other.
>> >
>> >  For example,Data on HDFS:
>> > -------------------------------------------------------------
>> > | ID |      TIMEID          |   COLA  | COLB  |
>> > -------------------------------------------------------------
>> > | 6  |  201305221335    |    0        |  20      |
>> > -------------------------------------------------------------
>> >
>> > the Oracle table:
>> > ------------------------------------------------------------------------
>> > | ID |      TIMEID          |   COLC  | COLB   | COLA |
>> > ------------------------------------------------------------------------
>> > | 7  |  201305221335    |     kk      |  20      |  1       |
>> > ------------------------------------------------------------------------
>> > Note:Additional COLC and unsame order.
>> >
>> >
>> > I notice the sqoop export command:
>> > --export-dir   HDFS source path for the export
>> > --table  Table to populate.
>> > It seams not to export the data to oracle in free from just as Free-form
>> > Query Imports
>> > using the --query argument.
>> >
>> > Could I acheive that goal ?
>> >
>> > Thanks very much
>> >
>> > Regards
>>
>
>

Re: Export in free form And Mixed update/insertj

Posted by Jarek Jarcec Cecho <ja...@apache.org>.
Hi YouPeng,
I'm glad to hear that the upsert mode has started working for you!

Jarcec

On Fri, May 24, 2013 at 01:06:41PM +0800, YouPeng Yang wrote:
> Hi Jarek Jarcec Cecho
> 
>    Thanks for your response.
> 
>   I am sorry that the failed job because of the oraoop plugin which I add
> before. Anyways I post the command and the log[1].
> 
>   The mixed update/insert job succeeds when I adopt the pure sqoop . Thanks
> a  lot.
> 
> 
> 
> 
> [1]---------------------------------------------------------------------------------------------------
> [hadoop@Hadoop01 ~]$  /home/sqoop-1.4.1-cdh4.1.2/bin/sqoop export --connect
> jdbc:oracle:thin:@10.167.14.225:1521:wxoss --username XUJINGYU --password
> 123456  --export-dir sqoop/CMTSIFTABLE/201305131241  --table
> XUJINGYU.CMTSIFTABLE   --fields-terminated-by "|"    --columns
> IFID,DEVICEID,IFDESC,IFINDEX,IFTYPE,IFSPEED,IFALIAS,IFNAME --update-key
> IFID,DEVICEID   --update-key IFID,CMTSID --update-mode  allowinsert
> Warning: /usr/lib/hbase does not exist! HBase imports will fail.
> Please set $HBASE_HOME to the root of your HBase installation.
> 13/05/24 12:58:55 WARN tool.BaseSqoopTool: Setting your password on the
> command-line is insecure. Consider using -P instead.
> 13/05/24 12:58:56 INFO manager.SqlManager: Using default fetchSize of 1000
> 13/05/24 12:58:57 INFO oraoop.OraOopOracleQueries: Current schema is:
> XUJINGYU
> 13/05/24 12:58:57 INFO oraoop.OraOopManagerFactory:
> ***********************************************************************
> *** Using Quest庐 Data Connector for Oracle and Hadoop 1.6.0-cdh4-20 ***
> *** Copyright 2012 Quest Software, Inc.                             ***
> *** ALL RIGHTS RESERVED.                                            ***
> ***********************************************************************
> 13/05/24 12:58:57 WARN oraoop.OraOopManagerFactory:
> *****************************************************************************************************************************
> The table "XUJINGYU"."CMTSIFTABLE" does not have a valid index on the
> column(s) IFID,DEVICEID.
> As a consequence, this export may take a long time to complete.
> If performance is unacceptable, consider reattempting this job after
> creating an index on this table via the SQL...
> create index <index_name> on "XUJINGYU"."CMTSIFTABLE"(IFID,DEVICEID);
> *****************************************************************************************************************************
> 13/05/24 12:58:57 INFO oraoop.OraOopManagerFactory: Oracle Database
> version: Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bi
> 13/05/24 12:58:57 INFO oraoop.OraOopManagerFactory: This Oracle database is
> not a RAC.
> 13/05/24 12:58:57 WARN conf.Configuration: mapred.map.max.attempts is
> deprecated. Instead, use mapreduce.map.maxattempts
> 13/05/24 12:58:57 INFO manager.SqlManager: Executing SQL statement: SELECT
> IFID,DEVICEID,IFDESC,IFINDEX,IFTYPE,IFSPEED,IFNAME,IFALIAS FROM
> XUJINGYU.CMTSIFTABLE WHERE 1=0
> 13/05/24 12:58:57 INFO tool.CodeGenTool: Beginning code generation
> 13/05/24 12:58:57 INFO manager.SqlManager: Executing SQL statement: SELECT
> "IFID","DEVICEID","IFDESC","IFINDEX","IFTYPE","IFSPEED","IFNAME","IFALIAS"
> FROM XUJINGYU.CMTSIFTABLE WHERE 0=1
> 13/05/24 12:58:57 INFO orm.CompilationManager: HADOOP_HOME is
> /usr/local/hadoop
> Note:
> /tmp/sqoop-hadoop/compile/7f64b56b53b34f3fe3d6eab03a5255ae/XUJINGYU_CMTSIFTABLE.java
> uses or overrides a deprecated API.
> Note: Recompile with -Xlint:deprecation for details.
> 13/05/24 12:58:59 INFO orm.CompilationManager: Writing jar file:
> /tmp/sqoop-hadoop/compile/7f64b56b53b34f3fe3d6eab03a5255ae/XUJINGYU.CMTSIFTABLE.jar
> 13/05/24 12:58:59 ERROR tool.ExportTool: Error during export: Mixed
> update/insert is not supported against the target database yet
> 
> 
> 
> 
> 
> 
> 
> 2013/5/23 Jarek Jarcec Cecho <ja...@apache.org>
> 
> > Hi YouPeng,
> > Sqoop's Oracle connector do supports upsert mode [1]. Would you mind
> > sharing with us entire Sqoop command line and log generated with parameter
> > --verbose?
> >
> > Jarcec
> >
> > Links:
> > 1:
> > https://github.com/apache/sqoop/blob/trunk/src/java/org/apache/sqoop/manager/OracleManager.java#L401
> >
> > On Wed, May 22, 2013 at 06:20:06PM +0800, YouPeng Yang wrote:
> > > Hi Jarek Jarcec Cecho
> > >
> > >   I also have found the tip. Thank you.
> > >
> > >   Here comes another question.
> > >
> > >   I find that  sqoop-1.4  support one  to update rows if they exist in
> > the
> > > database already or insert rows if they do not exist yet  by using
> > > --update-key <col-name> --update-mode <mode>.
> > >
> > >   However, I got the error when I tried it:
> > >   ERROR tool.ExportTool: Error during export: Mixed update/insert is not
> > > supported against the target database yet
> > >
> > >  Note: the database is oracle.
> > >
> > >  1.Does it only supports the MySQL,but I found nothing that hint this in
> > > docs.
> > >
> > >  2.Is there any solutions that fullfil my issue to update rows if they
> > > exist in the database already or insert rows if they do not exist yet
> > >
> > >
> > >
> > > Thanks you.
> > >
> > >
> > > Regards.
> > >
> > >
> > > 2013/5/22 Jarek Jarcec Cecho <ja...@apache.org>
> > >
> > > > Hi YouPeng,
> > > > Sqoop 1 do not supports custom insert query when exporting data from
> > HDFS.
> > > > I think that in your use case you can use parameter --columns to
> > specify
> > > > which columns and in what order are present on HDFS, for example:
> > > >
> > > >   sqoop ... --columns ID,TIMEID,COLA,COLB
> > > >
> > > > Jarcec
> > > >
> > > > On Wed, May 22, 2013 at 02:49:04PM +0800, YouPeng Yang wrote:
> > > > > Hi
> > > > >   I want to  export data on the HDFS to the oracle database with
> > > > > sqoop-1.4(sqoop-1.4.1-cdh4.1.2). However the columns betwean HDFS and
> > > > > Oracle table  are  not exactly same to each other.
> > > > >
> > > > >  For example,Data on HDFS:
> > > > > -------------------------------------------------------------
> > > > > | ID |      TIMEID          |   COLA  | COLB  |
> > > > > -------------------------------------------------------------
> > > > > | 6  |  201305221335    |    0        |  20      |
> > > > > -------------------------------------------------------------
> > > > >
> > > > > the Oracle table:
> > > > >
> > ------------------------------------------------------------------------
> > > > > | ID |      TIMEID          |   COLC  | COLB   | COLA |
> > > > >
> > ------------------------------------------------------------------------
> > > > > | 7  |  201305221335    |     kk      |  20      |  1       |
> > > > >
> > ------------------------------------------------------------------------
> > > > > Note:Additional COLC and unsame order.
> > > > >
> > > > >
> > > > > I notice the sqoop export command:
> > > > > --export-dir   HDFS source path for the export
> > > > > --table  Table to populate.
> > > > > It seams not to export the data to oracle in free from just as
> > Free-form
> > > > > Query Imports
> > > > > using the --query argument.
> > > > >
> > > > > Could I acheive that goal ?
> > > > >
> > > > > Thanks very much
> > > > >
> > > > > Regards
> > > >
> >

Re: Export in free form And Mixed update/insert

Posted by YouPeng Yang <yy...@gmail.com>.
Hi Jarek Jarcec Cecho

   Thanks for your response.

  I am sorry that the failed job because of the oraoop plugin which I add
before. Anyways I post the command and the log[1].

  The mixed update/insert job succeeds when I adopt the pure sqoop . Thanks
a  lot.




[1]---------------------------------------------------------------------------------------------------
[hadoop@Hadoop01 ~]$  /home/sqoop-1.4.1-cdh4.1.2/bin/sqoop export --connect
jdbc:oracle:thin:@10.167.14.225:1521:wxoss --username XUJINGYU --password
123456  --export-dir sqoop/CMTSIFTABLE/201305131241  --table
XUJINGYU.CMTSIFTABLE   --fields-terminated-by "|"    --columns
IFID,DEVICEID,IFDESC,IFINDEX,IFTYPE,IFSPEED,IFALIAS,IFNAME --update-key
IFID,DEVICEID   --update-key IFID,CMTSID --update-mode  allowinsert
Warning: /usr/lib/hbase does not exist! HBase imports will fail.
Please set $HBASE_HOME to the root of your HBase installation.
13/05/24 12:58:55 WARN tool.BaseSqoopTool: Setting your password on the
command-line is insecure. Consider using -P instead.
13/05/24 12:58:56 INFO manager.SqlManager: Using default fetchSize of 1000
13/05/24 12:58:57 INFO oraoop.OraOopOracleQueries: Current schema is:
XUJINGYU
13/05/24 12:58:57 INFO oraoop.OraOopManagerFactory:
***********************************************************************
*** Using Quest庐 Data Connector for Oracle and Hadoop 1.6.0-cdh4-20 ***
*** Copyright 2012 Quest Software, Inc.                             ***
*** ALL RIGHTS RESERVED.                                            ***
***********************************************************************
13/05/24 12:58:57 WARN oraoop.OraOopManagerFactory:
*****************************************************************************************************************************
The table "XUJINGYU"."CMTSIFTABLE" does not have a valid index on the
column(s) IFID,DEVICEID.
As a consequence, this export may take a long time to complete.
If performance is unacceptable, consider reattempting this job after
creating an index on this table via the SQL...
create index <index_name> on "XUJINGYU"."CMTSIFTABLE"(IFID,DEVICEID);
*****************************************************************************************************************************
13/05/24 12:58:57 INFO oraoop.OraOopManagerFactory: Oracle Database
version: Oracle Database 10g Enterprise Edition Release 10.2.0.1.0 - 64bi
13/05/24 12:58:57 INFO oraoop.OraOopManagerFactory: This Oracle database is
not a RAC.
13/05/24 12:58:57 WARN conf.Configuration: mapred.map.max.attempts is
deprecated. Instead, use mapreduce.map.maxattempts
13/05/24 12:58:57 INFO manager.SqlManager: Executing SQL statement: SELECT
IFID,DEVICEID,IFDESC,IFINDEX,IFTYPE,IFSPEED,IFNAME,IFALIAS FROM
XUJINGYU.CMTSIFTABLE WHERE 1=0
13/05/24 12:58:57 INFO tool.CodeGenTool: Beginning code generation
13/05/24 12:58:57 INFO manager.SqlManager: Executing SQL statement: SELECT
"IFID","DEVICEID","IFDESC","IFINDEX","IFTYPE","IFSPEED","IFNAME","IFALIAS"
FROM XUJINGYU.CMTSIFTABLE WHERE 0=1
13/05/24 12:58:57 INFO orm.CompilationManager: HADOOP_HOME is
/usr/local/hadoop
Note:
/tmp/sqoop-hadoop/compile/7f64b56b53b34f3fe3d6eab03a5255ae/XUJINGYU_CMTSIFTABLE.java
uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.
13/05/24 12:58:59 INFO orm.CompilationManager: Writing jar file:
/tmp/sqoop-hadoop/compile/7f64b56b53b34f3fe3d6eab03a5255ae/XUJINGYU.CMTSIFTABLE.jar
13/05/24 12:58:59 ERROR tool.ExportTool: Error during export: Mixed
update/insert is not supported against the target database yet







2013/5/23 Jarek Jarcec Cecho <ja...@apache.org>

> Hi YouPeng,
> Sqoop's Oracle connector do supports upsert mode [1]. Would you mind
> sharing with us entire Sqoop command line and log generated with parameter
> --verbose?
>
> Jarcec
>
> Links:
> 1:
> https://github.com/apache/sqoop/blob/trunk/src/java/org/apache/sqoop/manager/OracleManager.java#L401
>
> On Wed, May 22, 2013 at 06:20:06PM +0800, YouPeng Yang wrote:
> > Hi Jarek Jarcec Cecho
> >
> >   I also have found the tip. Thank you.
> >
> >   Here comes another question.
> >
> >   I find that  sqoop-1.4  support one  to update rows if they exist in
> the
> > database already or insert rows if they do not exist yet  by using
> > --update-key <col-name> --update-mode <mode>.
> >
> >   However, I got the error when I tried it:
> >   ERROR tool.ExportTool: Error during export: Mixed update/insert is not
> > supported against the target database yet
> >
> >  Note: the database is oracle.
> >
> >  1.Does it only supports the MySQL,but I found nothing that hint this in
> > docs.
> >
> >  2.Is there any solutions that fullfil my issue to update rows if they
> > exist in the database already or insert rows if they do not exist yet
> >
> >
> >
> > Thanks you.
> >
> >
> > Regards.
> >
> >
> > 2013/5/22 Jarek Jarcec Cecho <ja...@apache.org>
> >
> > > Hi YouPeng,
> > > Sqoop 1 do not supports custom insert query when exporting data from
> HDFS.
> > > I think that in your use case you can use parameter --columns to
> specify
> > > which columns and in what order are present on HDFS, for example:
> > >
> > >   sqoop ... --columns ID,TIMEID,COLA,COLB
> > >
> > > Jarcec
> > >
> > > On Wed, May 22, 2013 at 02:49:04PM +0800, YouPeng Yang wrote:
> > > > Hi
> > > >   I want to  export data on the HDFS to the oracle database with
> > > > sqoop-1.4(sqoop-1.4.1-cdh4.1.2). However the columns betwean HDFS and
> > > > Oracle table  are  not exactly same to each other.
> > > >
> > > >  For example,Data on HDFS:
> > > > -------------------------------------------------------------
> > > > | ID |      TIMEID          |   COLA  | COLB  |
> > > > -------------------------------------------------------------
> > > > | 6  |  201305221335    |    0        |  20      |
> > > > -------------------------------------------------------------
> > > >
> > > > the Oracle table:
> > > >
> ------------------------------------------------------------------------
> > > > | ID |      TIMEID          |   COLC  | COLB   | COLA |
> > > >
> ------------------------------------------------------------------------
> > > > | 7  |  201305221335    |     kk      |  20      |  1       |
> > > >
> ------------------------------------------------------------------------
> > > > Note:Additional COLC and unsame order.
> > > >
> > > >
> > > > I notice the sqoop export command:
> > > > --export-dir   HDFS source path for the export
> > > > --table  Table to populate.
> > > > It seams not to export the data to oracle in free from just as
> Free-form
> > > > Query Imports
> > > > using the --query argument.
> > > >
> > > > Could I acheive that goal ?
> > > >
> > > > Thanks very much
> > > >
> > > > Regards
> > >
>

Re: Export in free form And Mixed update/insert

Posted by Jarek Jarcec Cecho <ja...@apache.org>.
Hi YouPeng,
Sqoop's Oracle connector do supports upsert mode [1]. Would you mind sharing with us entire Sqoop command line and log generated with parameter --verbose?

Jarcec

Links:
1: https://github.com/apache/sqoop/blob/trunk/src/java/org/apache/sqoop/manager/OracleManager.java#L401

On Wed, May 22, 2013 at 06:20:06PM +0800, YouPeng Yang wrote:
> Hi Jarek Jarcec Cecho
> 
>   I also have found the tip. Thank you.
> 
>   Here comes another question.
> 
>   I find that  sqoop-1.4  support one  to update rows if they exist in the
> database already or insert rows if they do not exist yet  by using
> --update-key <col-name> --update-mode <mode>.
> 
>   However, I got the error when I tried it:
>   ERROR tool.ExportTool: Error during export: Mixed update/insert is not
> supported against the target database yet
> 
>  Note: the database is oracle.
> 
>  1.Does it only supports the MySQL,but I found nothing that hint this in
> docs.
> 
>  2.Is there any solutions that fullfil my issue to update rows if they
> exist in the database already or insert rows if they do not exist yet
> 
> 
> 
> Thanks you.
> 
> 
> Regards.
> 
> 
> 2013/5/22 Jarek Jarcec Cecho <ja...@apache.org>
> 
> > Hi YouPeng,
> > Sqoop 1 do not supports custom insert query when exporting data from HDFS.
> > I think that in your use case you can use parameter --columns to specify
> > which columns and in what order are present on HDFS, for example:
> >
> >   sqoop ... --columns ID,TIMEID,COLA,COLB
> >
> > Jarcec
> >
> > On Wed, May 22, 2013 at 02:49:04PM +0800, YouPeng Yang wrote:
> > > Hi
> > >   I want to  export data on the HDFS to the oracle database with
> > > sqoop-1.4(sqoop-1.4.1-cdh4.1.2). However the columns betwean HDFS and
> > > Oracle table  are  not exactly same to each other.
> > >
> > >  For example,Data on HDFS:
> > > -------------------------------------------------------------
> > > | ID |      TIMEID          |   COLA  | COLB  |
> > > -------------------------------------------------------------
> > > | 6  |  201305221335    |    0        |  20      |
> > > -------------------------------------------------------------
> > >
> > > the Oracle table:
> > > ------------------------------------------------------------------------
> > > | ID |      TIMEID          |   COLC  | COLB   | COLA |
> > > ------------------------------------------------------------------------
> > > | 7  |  201305221335    |     kk      |  20      |  1       |
> > > ------------------------------------------------------------------------
> > > Note:Additional COLC and unsame order.
> > >
> > >
> > > I notice the sqoop export command:
> > > --export-dir   HDFS source path for the export
> > > --table  Table to populate.
> > > It seams not to export the data to oracle in free from just as Free-form
> > > Query Imports
> > > using the --query argument.
> > >
> > > Could I acheive that goal ?
> > >
> > > Thanks very much
> > >
> > > Regards
> >