You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@sqoop.apache.org by "Warren Chisholm (JIRA)" <ji...@apache.org> on 2015/12/17 06:43:46 UTC

[jira] [Created] (SQOOP-2748) Remove parentheses from WHERE clause

Warren Chisholm created SQOOP-2748:
--------------------------------------

             Summary: Remove parentheses from WHERE clause
                 Key: SQOOP-2748
                 URL: https://issues.apache.org/jira/browse/SQOOP-2748
             Project: Sqoop
          Issue Type: Wish
          Components: codegen
    Affects Versions: 1.4.6
         Environment: Redhat 6
Cloudera 5.5.0
            Reporter: Warren Chisholm
            Priority: Minor


When performing a sqoop import with option "-m 1" the following where clause is added by sqoop:
WHERE ( 1=1 ) AND ( 1=1 )

This where clause is redundant and creates an incompatibility with the IBM IMS Universal JDBC driver.
The IMS JDBC Driver does not allow the use of parentheses in the where clause. The below quote was taken from 2nd dot point here: https://www-01.ibm.com/support/knowledgecenter/SSEPH2_13.1.0/com.ibm.ims13.doc.apg/ims_odbjdbcusagewhere.htm.
"Do not use parentheses. Qualification statements are evaluated from left to right. The order of evaluation for operators is the IMS evaluation order for segment search arguments."

I have modified the below commands and output to anonymize the details.

Sample sqoop command:
sqoop import --connect 'jdbc:ims://myserver:9999/MYDATABASE:currentSchema=MYSCHEMA;' -username ???????? -password ???????? --null-string '\\N' --null-non-string '\\N' --hive-import --create-hive-table --table MYIMSTABLE --hive-table MYHIVETABLE --hive-drop-import-delims --driver com.ibm.ims.jdbc.IMSDriver --delete-target-dir -m 1

Extract of log produced:
...
15/12/16 17:20:05 INFO mapreduce.Job:  map 0% reduce 0%
15/12/16 17:20:16 INFO mapreduce.Job: Task Id : attempt_1449625205240_0003_m_000000_0, Status : FAILED
Error: java.io.IOException: SQLException in nextKeyValue
        at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:277)
        at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.nextKeyValue(MapTask.java:556)
        at org.apache.hadoop.mapreduce.task.MapContextImpl.nextKeyValue(MapContextImpl.java:80)
        at org.apache.hadoop.mapreduce.lib.map.WrappedMapper$Context.nextKeyValue(WrappedMapper.java:91)
        at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:144)
        at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64)
        at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:787)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
        at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1671)
        at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
Caused by: java.sql.SQLException: Error parsing the following SQL statement: SELECT FIELD1 FROM MYIMSTABLE AS MYIMSTABLE WHERE ( 1=1 ) AND ( 1=1 ). Encountered " <ONEEQUALONE> "1=1 "" at line 1, column 1167.
Was expecting:
    <ID> ...

        at com.ibm.ims.jdbc.SQLForDLIParser.parse(SQLForDLIParser.java:651)
        at com.ibm.ims.jdbc.PreparedStatementImpl.parseSQLQuery(PreparedStatementImpl.java:133)
        at com.ibm.ims.jdbc.PreparedStatementImpl.<init>(PreparedStatementImpl.java:87)
        at com.ibm.ims.jdbc.ConnectionImpl.prepareStatement(ConnectionImpl.java:978)
        at org.apache.sqoop.mapreduce.db.DBRecordReader.executeQuery(DBRecordReader.java:101)
        at org.apache.sqoop.mapreduce.db.DBRecordReader.nextKeyValue(DBRecordReader.java:235)
        ... 12 more


In summary, for sqoop to be compatible with the IBM IMS JDBC driver, I am asking for either the where clause "( 1=1 ) AND ( 1=1 )" to be removed or for the parentheses to be removed.


Workaround
The only workaround I have been able to come up with is to switch to using the --query option instead of --table and hack the $CONDITIONS variable into the select part of the query - to get rid of the parentheses from the where clause.

For example:
sqoop import --connect 'jdbc:ims://myserver:9999/MYDATABASE:currentSchema=MYSCHEMA;' -username ???????? -password ???????? --null-string '\\N' --null-non-string '\\N' --hive-import --create-hive-table --query "select FIELD1,'\$CONDITIONS' as CONDITIONS from MYIMSTABLE" --hive-table MYHIVETABLE --hive-drop-import-delims --driver com.ibm.ims.jdbc.IMSDriver --delete-target-dir -m 1 --target-dir test

Obviously this is not desirable as the commands to retrieve the metadata (WHERE 1 = 0) will retrieve the data with it. The sql will look like this:
select FIELD1,' (1 = 0) ' as CONDITIONS from MYIMSTABLE
however it does import the data successfully (albeit inefficiently).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)