Posted to user@sqoop.apache.org by Douglas Spadotto <do...@gmail.com> on 2016/10/27 18:12:43 UTC

--update-mode allow-insert generating bad code?

Hello everyone,

Today's not a good day to use Sqoop for me... :(

First I discovered that export with the update operation won't work with Parquet (or any format other than SequenceFile or plain text):
https://issues.apache.org/jira/browse/SQOOP-2858
https://issues.apache.org/jira/browse/SQOOP-1538
https://issues.apache.org/jira/browse/SQOOP-2948
and the saviour
https://issues.apache.org/jira/browse/SQOOP-2846
confirmed that.

So I exported my Parquet table into a text file and went happily on my export-update mission.
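
For reference, the Parquet-to-text conversion was roughly the Hive statements below (the source table name teste_parquet is made up for illustration; only the text table is real):

  -- Text-format copy of the Parquet table, comma-delimited to match
  -- the file shown below:
  CREATE TABLE teste_texto (id INT, nome STRING)
  PARTITIONED BY (dat_ref_carga STRING)
  ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
  STORED AS TEXTFILE;

  INSERT OVERWRITE TABLE teste_texto PARTITION (dat_ref_carga='2016-10-26')
  SELECT id, nome FROM teste_parquet
  WHERE dat_ref_carga='2016-10-26';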

And I got stuck again.

Here's a simplified test case that illustrates what I'm experiencing:

Here's my file on HDFS:

[cloudera@quickstart hive_creates]$ hadoop fs -cat /user/hive/warehouse/teste_texto/dat_ref_carga=2016-10-26/000000_0
1,Abel
2,Zaha

Here's the DDL for the table receiving the export on Oracle:
  CREATE TABLE TESTE_ORACLE (
    ID   NUMBER(2,0),
    NOME VARCHAR2(15)
  );

Here's my export command, with output edited for readability:

[cloudera@quickstart hive_creates]$ sqoop export --connect jdbc:oracle:thin:cloudera@//localhost:1521/xe --table TESTE_ORACLE --update-mode allowinsert --update-key ID --export-dir /user/hive/warehouse/teste_texto/dat_ref_carga=2016-10-26 --username x --password y
...
16/10/27 10:05:17 INFO orm.CompilationManager: HADOOP_MAPRED_HOME is /usr/lib/hadoop-mapreduce
Note: /tmp/sqoop-cloudera/compile/4753116d2357bb8f7dccb4def31a90f9/TESTE_ORACLE.java uses or overrides a deprecated API.
Note: Recompile with -Xlint:deprecation for details.   <-- should I worry about this?

...
16/10/27 10:05:37 INFO mapreduce.Job: Job job_1477585661213_0013 failed with state FAILED due to: Task failed task_1477585661213_0013_m_000001
Job failed as tasks failed. failedMaps:1 failedReduces:0
...
16/10/27 10:05:37 INFO mapreduce.ExportJobBase: Transferred 0 bytes in 17.6109 seconds (0 bytes/sec)
16/10/27 10:05:37 INFO mapreduce.ExportJobBase: Exported 0 records.
16/10/27 10:05:37 ERROR tool.ExportTool: Error during export: Export job failed!

I dug into the MR job log, and found this:

2016-10-27 10:05:34,637 FATAL [IPC Server handler 3 on 33735] org.apache.hadoop.mapred.TaskAttemptListenerImpl: Task: attempt_1477585661213_0013_m_000001_0 - exited : java.io.IOException: java.sql.SQLException: Missing IN or OUT parameter at index:: 5
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:197)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:670)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:793)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1693)
at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.sql.SQLException: Missing IN or OUT parameter at index:: 5
at oracle.jdbc.driver.OraclePreparedStatement.processCompletedBindRow(OraclePreparedStatement.java:1821)
at oracle.jdbc.driver.OraclePreparedStatement.addBatch(OraclePreparedStatement.java:10010)
at oracle.jdbc.driver.OraclePreparedStatementWrapper.addBatch(OraclePreparedStatementWrapper.java:1358)
at org.apache.sqoop.mapreduce.UpdateOutputFormat$UpdateRecordWriter.getPreparedStatement(UpdateOutputFormat.java:174)
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.execUpdate(AsyncSqlRecordWriter.java:153)
at org.apache.sqoop.mapreduce.AsyncSqlRecordWriter.close(AsyncSqlRecordWriter.java:194)
... 8 more

This error is characteristic of a mismatch between the ? placeholders in the prepared statement and the values actually bound to it: here the statement apparently expects a parameter at index 5 that is never set, as if more fields were being written than the table has columns.
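
For anyone who hasn't seen this Oracle error before: the driver validates the completed bind row when addBatch() is called, as the stack trace above shows, and throws if the statement has a ? placeholder that was never bound. A minimal generic JDBC illustration (not Sqoop code; the connection details are placeholders):

  import java.sql.Connection;
  import java.sql.DriverManager;
  import java.sql.PreparedStatement;

  public class MissingParamDemo {
    public static void main(String[] args) throws Exception {
      // Placeholder URL/credentials; any Oracle connection will do.
      Connection conn = DriverManager.getConnection(
          "jdbc:oracle:thin:@//localhost:1521/xe", "x", "y");

      // A statement with 3 placeholders...
      PreparedStatement ps = conn.prepareStatement(
          "UPDATE TESTE_ORACLE SET NOME = ?, ID = ? WHERE ID = ?");

      // ...but only 2 of them bound:
      ps.setString(1, "Abel");
      ps.setInt(2, 1);
      // index 3 is never set

      // The Oracle driver validates the bind row here and throws
      // java.sql.SQLException: Missing IN or OUT parameter at index:: 3
      ps.addBatch();
    }
  }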

I went into the generated code for this table export. When I used --update-mode allowinsert, the write method was like this:

  public int write(PreparedStatement __dbStmt, int __off) throws SQLException {
    JdbcWritableBridge.writeBigDecimal(ID, 1 + __off, 2, __dbStmt);
    JdbcWritableBridge.writeString(NOME, 2 + __off, 12, __dbStmt);
    JdbcWritableBridge.writeBigDecimal(ID, 3 + __off, 2, __dbStmt);
    JdbcWritableBridge.writeString(NOME, 4 + __off, 12, __dbStmt);
    return 4;
  }

And when I just used --update-key (without --update-mode allowinsert), it was like this:

  public int write(PreparedStatement __dbStmt, int __off) throws SQLException {
    JdbcWritableBridge.writeString(NOME, 1 + __off, 12, __dbStmt);
    JdbcWritableBridge.writeBigDecimal(ID, 2 + __off, 2, __dbStmt);
    return 2;
  }

Is this the source of my problems, or is the above the correct code to
generate?

How can I make this very simple test case pass?


Thanks in advance,

Douglas

-----
Frodo: "I wish none of this had happened."
Gandalf: "So do all who live to see such times, but that is not for them to
decide. All we have to decide is what to do with the time that is given to
us."
-- Lord of the Rings: The Fellowship of the Ring (2001)

Re: --update-mode allow-insert generating bad code?

Posted by Venkat Ranganathan <vr...@hortonworks.com>.
There was an issue with quotes not being properly handled. This should be fixed in the trunk version.

Let me double-check and commit it if it is not already there.

Venkat
