
Posted to dev@sqoop.apache.org by "Cheolsoo Park (JIRA)" <ji...@apache.org> on 2012/05/02 01:42:52 UTC

[jira] [Created] (SQOOP-481) Sqoop import with --hive-import using wrong column names in --columns throws a NPE

Cheolsoo Park created SQOOP-481:
-----------------------------------

             Summary: Sqoop import with --hive-import using wrong column names in --columns throws a NPE
                 Key: SQOOP-481
                 URL: https://issues.apache.org/jira/browse/SQOOP-481
             Project: Sqoop
          Issue Type: Bug
            Reporter: Cheolsoo Park
            Assignee: Cheolsoo Park


To reproduce the error, 

1) Create a table "foo" with a column name "I" on Oracle DB
2) Run sqoop import --connect jdbc:oracle:thin:@//localhost/xe --username **** --password **** --verbose --table foo --split-by i --columns i --hive-import

This generates the following call stack:

12/05/01 16:12:00 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.NullPointerException
java.lang.NullPointerException
	at com.cloudera.sqoop.hive.TableDefWriter.getCreateTableStmt(TableDefWriter.java:162)
	at com.cloudera.sqoop.hive.HiveImport.importTable(HiveImport.java:195)
	at com.cloudera.sqoop.tool.ImportTool.importTable(ImportTool.java:394)
	at com.cloudera.sqoop.tool.ImportTool.run(ImportTool.java:455)
	at com.cloudera.sqoop.Sqoop.run(Sqoop.java:146)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at com.cloudera.sqoop.Sqoop.runSqoop(Sqoop.java:182)
	at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:221)
	at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:230)
	at com.cloudera.sqoop.Sqoop.main(Sqoop.java:239)

The reason is simple. In the following lines of code:

{code}
Integer colType = columnTypes.get(col);
...
String hiveColType = connManager.toHiveType(colType);
{code}

colType is null because the table "foo" has a column "I" but no column "i", so the look-up finds nothing. toHiveType(int colType) then tries to auto-unbox the null into a primitive int, which results in an NPE.

It would be better to provide a more informative message rather than an unexplained NPE.
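
The failure mode described above can be reproduced outside Sqoop. The following is a minimal sketch, not Sqoop code: the class UnboxingNpeDemo and the method describeType(int) are hypothetical stand-ins for the real TableDefWriter and toHiveType(int), and the map is filled by hand rather than from database metadata.

```java
import java.sql.Types;
import java.util.HashMap;
import java.util.Map;

class UnboxingNpeDemo {
  // Hypothetical stand-in for a method with a primitive int parameter,
  // such as toHiveType(int) in the report.
  static String describeType(int sqlType) {
    return "type-" + sqlType;
  }

  // Returns true if looking up col and passing the result to an int
  // parameter throws a NullPointerException.
  static boolean lookupThrowsNpe(Map<String, Integer> columnTypes, String col) {
    Integer colType = columnTypes.get(col); // null when the key is absent
    try {
      describeType(colType); // auto-unboxing calls colType.intValue() here
      return false;
    } catch (NullPointerException e) {
      return true;
    }
  }

  public static void main(String[] args) {
    Map<String, Integer> columnTypes = new HashMap<>();
    columnTypes.put("I", Types.INTEGER); // key stored in upper case
    System.out.println(lookupThrowsNpe(columnTypes, "I")); // false
    System.out.println(lookupThrowsNpe(columnTypes, "i")); // true
  }
}
```

The map keys are case-sensitive strings, so "i" and "I" are different keys, and the NPE is raised at the call site rather than inside the called method.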

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Updated] (SQOOP-481) Sqoop import with --hive-import using wrong column names in --columns throws a NPE

Posted by "Cheolsoo Park (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/SQOOP-481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheolsoo Park updated SQOOP-481:
--------------------------------

    Attachment: SQOOP_481-2.patch

In my patch, I introduced a new class SqlTypeMap that subclasses Java's HashMap and encapsulates the validation logic.

I took this approach mainly because it is much cleaner: it eliminates the need to add a null check at every place where auto-unboxing happens.

I'd like to know what others think.

Review board:
https://reviews.apache.org/r/4974/
                

[jira] [Issue Comment Edited] (SQOOP-481) Sqoop import with --hive-import using wrong column names in --columns throws a NPE

Posted by "Cheolsoo Park (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/SQOOP-481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13266776#comment-13266776 ] 

Cheolsoo Park edited comment on SQOOP-481 at 5/2/12 6:04 PM:
-------------------------------------------------------------

Hi Arvind and Jarcec, thank you for your suggestions!

Indeed, failing fast is a good thing to do, and we should always prevent maps from having null values. In fact, fail-fast logic is already in place. For example, we call valueOf() when putting a new value into a map, and valueOf() does not return null.

{code:title=protected Map<String, Integer> getColumnTypesForRawQuery(String stmt)}
colTypes.put(colName, Integer.valueOf(typeId));
{code}

But the problem remains because a map may still lack a specific key, which cannot be detected until a look-up happens. I guess that the real problem that I am raising in this jira is *not auto-boxing but auto-unboxing*. For auto-unboxing, the get() call is the earliest point at which we can fail. So I believe that adding a null check after get() is the best we can do.

{code}
Integer sqlType = columnTypes.get(col);
if (sqlType == null) {
   throw new IOException("Column " + col + " does not exist in table " + tableName);
}
String javaType = toJavaType(col, sqlType);
{code}
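
For readers who want to run the check above in isolation, here is a rough self-contained sketch. The class ColumnTypeLookup and the helper requireSqlType are hypothetical names introduced only for illustration; the exception type and message follow the snippet above, and the map is filled by hand.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

class ColumnTypeLookup {
  // Hypothetical helper: look up a column's SQL type, or fail with a
  // descriptive message instead of a bare NullPointerException.
  static int requireSqlType(Map<String, Integer> columnTypes, String col,
      String tableName) throws IOException {
    Integer sqlType = columnTypes.get(col);
    if (sqlType == null) {
      throw new IOException("Column " + col + " does not exist in table " + tableName);
    }
    return sqlType; // non-null here, so unboxing is safe
  }

  public static void main(String[] args) {
    Map<String, Integer> columnTypes = new HashMap<>();
    columnTypes.put("I", Integer.valueOf(4)); // valueOf() never returns null
    try {
      System.out.println(requireSqlType(columnTypes, "I", "foo"));
      requireSqlType(columnTypes, "i", "foo"); // key is absent, so this throws
    } catch (IOException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

Even though every stored value is non-null, the second look-up still fails, which is why the check has to sit at the get() call.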

Please correct me if I have misunderstood your suggestions.
                

[jira] [Commented] (SQOOP-481) Sqoop import with --hive-import using wrong column names in --columns throws a NPE

Posted by "Hudson (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/SQOOP-481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287519#comment-13287519 ] 

Hudson commented on SQOOP-481:
------------------------------

Integrated in Sqoop-ant-jdk-1.6 #117 (See [https://builds.apache.org/job/Sqoop-ant-jdk-1.6/117/])
    SQOOP-481. Sqoop import with --hive-import using wrong column names in --columns throws a NPE.

(Cheolsoo Park via Jarek Jarcec Cecho) (Revision 1345225)

     Result = SUCCESS
jarcec : 
Files : 
* /sqoop/trunk/src/java/org/apache/sqoop/manager/SqlManager.java
* /sqoop/trunk/src/java/org/apache/sqoop/util/SqlTypeMap.java
* /sqoop/trunk/src/test/com/cloudera/sqoop/hive/TestTableDefWriter.java

                
> Sqoop import with --hive-import using wrong column names in --columns throws a NPE
> ----------------------------------------------------------------------------------
>
>                 Key: SQOOP-481
>                 URL: https://issues.apache.org/jira/browse/SQOOP-481
>             Project: Sqoop
>          Issue Type: Bug
>            Reporter: Cheolsoo Park
>            Assignee: Cheolsoo Park
>             Fix For: 1.4.2-incubating
>
>         Attachments: SQOOP_481-2.patch, SQOOP_481.patch
>
>

[jira] [Issue Comment Edited] (SQOOP-481) Sqoop import with --hive-import using wrong column names in --columns throws a NPE

Posted by "Cheolsoo Park (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/SQOOP-481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13266776#comment-13266776 ] 

Cheolsoo Park edited comment on SQOOP-481 at 5/2/12 6:08 PM:
-------------------------------------------------------------

Hi Arvind and Jarcec, thank you for your suggestions!

Indeed, failing fast is a good thing to do, and we should always prevent maps from having null values. In fact, fail-fast logic is already in place. For example, we call valueOf() when putting a new value into a map, and valueOf() does not return null.

{code:title=protected Map<String, Integer> getColumnTypesForRawQuery(String stmt)}
colTypes.put(colName, Integer.valueOf(typeId));
{code}

But the problem remains because a map may still lack a specific key, which cannot be detected until a look-up happens. For nonexistent keys, the get() call is the earliest point at which we can fail. So I believe that adding a null check after get() is the best we can do.

{code}
Integer sqlType = columnTypes.get(col);
if (sqlType == null) {
   throw new IOException("Column " + col + " does not exist in table " + tableName);
}
String javaType = toJavaType(col, sqlType);
{code}

Please correct me if I have misunderstood your suggestions.
                

[jira] [Commented] (SQOOP-481) Sqoop import with --hive-import using wrong column names in --columns throws a NPE

Posted by "Arvind Prabhakar (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/SQOOP-481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13266265#comment-13266265 ] 

Arvind Prabhakar commented on SQOOP-481:
----------------------------------------

Thanks for finding this, Cheolsoo, and for the analysis. An alternative way to handle this issue is to implement fail-fast logic: the moment a column map is constructed with empty or null column names in it, we should immediately register a failure.
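
One way to picture the fail-fast idea is the sketch below. It is only an illustration under assumptions: ColumnMapBuilder and build are hypothetical names, not part of Sqoop, and the choice of IllegalArgumentException is arbitrary.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class ColumnMapBuilder {
  // Hypothetical builder: rejects null or empty column names and null
  // types while the map is being constructed, before anyone reads from it.
  static Map<String, Integer> build(String[] names, Integer[] types) {
    if (names.length != types.length) {
      throw new IllegalArgumentException("names and types differ in length");
    }
    Map<String, Integer> map = new LinkedHashMap<>();
    for (int i = 0; i < names.length; i++) {
      if (names[i] == null || names[i].isEmpty()) {
        throw new IllegalArgumentException("Column name at index " + i + " is null or empty");
      }
      if (types[i] == null) {
        throw new IllegalArgumentException("No SQL type for column " + names[i]);
      }
      map.put(names[i], types[i]);
    }
    return map;
  }
}
```

A check like this catches bad entries at construction time. It cannot, by itself, detect a later look-up of a name that was never inserted.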
                

[jira] [Updated] (SQOOP-481) Sqoop import with --hive-import using wrong column names in --columns throws a NPE

Posted by "Cheolsoo Park (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/SQOOP-481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheolsoo Park updated SQOOP-481:
--------------------------------

    Attachment: SQOOP_481.patch
    

[jira] [Updated] (SQOOP-481) Sqoop import with --hive-import using wrong column names in --columns throws a NPE

Posted by "Cheolsoo Park (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/SQOOP-481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheolsoo Park updated SQOOP-481:
--------------------------------

    Description: 
To reproduce the error, 

1) Create a table "foo" with a column name "I" on Oracle DB
2) Run sqoop import --connect jdbc:oracle:thin:@//localhost/xe --username **** --password **** --verbose --table foo --split-by i --columns i --hive-import

This generates the following call stack:

{code}
12/05/01 16:12:00 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.NullPointerException
java.lang.NullPointerException
	at com.cloudera.sqoop.hive.TableDefWriter.getCreateTableStmt(TableDefWriter.java:162)
	at com.cloudera.sqoop.hive.HiveImport.importTable(HiveImport.java:195)
	at com.cloudera.sqoop.tool.ImportTool.importTable(ImportTool.java:394)
	at com.cloudera.sqoop.tool.ImportTool.run(ImportTool.java:455)
	at com.cloudera.sqoop.Sqoop.run(Sqoop.java:146)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
	at com.cloudera.sqoop.Sqoop.runSqoop(Sqoop.java:182)
	at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:221)
	at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:230)
	at com.cloudera.sqoop.Sqoop.main(Sqoop.java:239)
{code}

The reason is simple. In the following lines of code:

{code}
Integer colType = columnTypes.get(col);
...
String hiveColType = connManager.toHiveType(colType);
{code}

colType is null because the table "foo" has a column "I" but no column "i", so the look-up finds nothing. toHiveType(int colType) then tries to auto-unbox the null into a primitive int, which results in an NPE.

It would be better to provide a more informative message rather than an unexplained NPE.

    

[jira] [Commented] (SQOOP-481) Sqoop import with --hive-import using wrong column names in --columns throws a NPE

Posted by "jiraposter@reviews.apache.org (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/SQOOP-481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13267058#comment-13267058 ] 

jiraposter@reviews.apache.org commented on SQOOP-481:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/4974/
-----------------------------------------------------------

Review request for Sqoop, Arvind Prabhakar and Jarek Cecho.


Summary
-------

The changes include:

1) Introduce a new class SqlTypeMap (a subclass of HashMap) that validates values inside the get() and put() methods. This guarantees that the values in the map are always valid (i.e. not null), so that an NPE during auto-unboxing can be prevented.

2) Replace HashMap<String, Integer> with SqlTypeMap<String, Integer> in code.
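
To make the idea in the list above concrete, here is a rough sketch of a HashMap subclass that validates in get() and put(). It is not the code from the attached patch: the class name StrictTypeMap, the exception type, and the messages are all assumptions made for illustration.

```java
import java.util.HashMap;

class StrictTypeMap<K, V> extends HashMap<K, V> {
  private static final long serialVersionUID = 1L;

  // Fails with a descriptive message when the key has no value,
  // instead of returning null to the caller.
  @Override
  public V get(Object key) {
    V value = super.get(key);
    if (value == null) {
      throw new IllegalArgumentException("Column not found: " + key);
    }
    return value;
  }

  // Refuses to store a null value, so the map never contains one
  // when entries are added through put().
  @Override
  public V put(K key, V value) {
    if (value == null) {
      throw new IllegalArgumentException("Null type for column: " + key);
    }
    return super.put(key, value);
  }
}
```

With a map like this, a call such as toHiveType(map.get("i")) fails inside get() with a message naming the column, so callers need no null check of their own. One caveat of this sketch: in current JDKs, other HashMap methods such as putAll() and getOrDefault() do not go through these overrides.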


This addresses bug SQOOP-481.
    https://issues.apache.org/jira/browse/SQOOP-481


Diffs
-----

  /src/test/com/cloudera/sqoop/hive/TestTableDefWriter.java 1333183 
  /src/java/org/apache/sqoop/manager/SqlManager.java 1333183 
  /src/java/org/apache/sqoop/util/SqlTypeMap.java PRE-CREATION 

Diff: https://reviews.apache.org/r/4974/diff


Testing
-------

Ran ant test, ant test -Dthirdparty=true, and ant checkstyle.


Thanks,

Cheolsoo


                
> Sqoop import with --hive-import using wrong column names in --columns throws a NPE
> ----------------------------------------------------------------------------------
>
>                 Key: SQOOP-481
>                 URL: https://issues.apache.org/jira/browse/SQOOP-481
>             Project: Sqoop
>          Issue Type: Bug
>            Reporter: Cheolsoo Park
>            Assignee: Cheolsoo Park
>
> To reproduce the error, 
> 1) Create a table "foo" with a column name "I" on Oracle DB
> 2) Run sqoop import --connect jdbc:oracle:thin:@//localhost/xe --username **** --password **** --verbose --table foo --split-by i --columns i --hive-import
> This generates the following call stack:
> {code}
> 12/05/01 16:12:00 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.NullPointerException
> java.lang.NullPointerException
> 	at com.cloudera.sqoop.hive.TableDefWriter.getCreateTableStmt(TableDefWriter.java:162)
> 	at com.cloudera.sqoop.hive.HiveImport.importTable(HiveImport.java:195)
> 	at com.cloudera.sqoop.tool.ImportTool.importTable(ImportTool.java:394)
> 	at com.cloudera.sqoop.tool.ImportTool.run(ImportTool.java:455)
> 	at com.cloudera.sqoop.Sqoop.run(Sqoop.java:146)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> 	at com.cloudera.sqoop.Sqoop.runSqoop(Sqoop.java:182)
> 	at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:221)
> 	at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:230)
> 	at com.cloudera.sqoop.Sqoop.main(Sqoop.java:239)
> {code}
> The reason is simple. In the following lines of code:
> {code}
> Integer colType = columnTypes.get(col);
> ...
> String hiveColType = connManager.toHiveType(colType);
> {code}
> colType is null because column "i" does not exist in the table "foo", although "I" does. toHiveType(int colType) then tries to auto-unbox the null into a primitive int, resulting in an NPE.
> It would be better to provide a more informative message rather than a random NPE.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Issue Comment Edited] (SQOOP-481) Sqoop import with --hive-import using wrong column names in --columns throws a NPE

Posted by "Cheolsoo Park (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/SQOOP-481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13266776#comment-13266776 ] 

Cheolsoo Park edited comment on SQOOP-481 at 5/2/12 6:07 PM:
-------------------------------------------------------------

Hi Arvind and Jarcec, thank you for your suggestions!

Indeed, failing fast is a good thing to do, and we should always prevent maps from having null values. In fact, fail-fast logic is already in place. For example, we call valueOf() when putting a new value into a map, and valueOf() never returns null.

{code:title=protected Map<String, Integer> getColumnTypesForRawQuery(String stmt)}
colTypes.put(colName, Integer.valueOf(typeId));
{code}

But the problem remains, because a map may still lack a specific key, and this cannot be detected until a look-up happens. For nonexistent keys, the get() call is the earliest point at which we can fail. So I believe that adding a null check after get() is the best we can do.

{code}
Integer sqlType = columnTypes.get(col);
if (sqlType == null) {
   throw new IOException("Column " + col + " does not exist in table " + tableName);
}
String javaType = toJavaType(col, sqlType);
{code}

Please correct me if I misunderstand your suggestions.
                
      was (Author: cheolsoo):
    Hi Arvind and Jarcec, thank you for your suggestions!

Indeed, failing fast is a good thing to do, and we should always prevent maps from having null values. In fact, fail-fast logic is already in place. For example, we call valueOf() when putting a new value into a map, and valueOf() never returns null.

{code:title=protected Map<String, Integer> getColumnTypesForRawQuery(String stmt)}
colTypes.put(colName, Integer.valueOf(typeId));
{code}

But the problem remains, because a map may still lack a specific key, and this cannot be detected until a look-up happens. I guess that the real problem that I am raising in this jira is *not auto-boxing but auto-unboxing*. For auto-unboxing, the get() call is the earliest point at which we can fail. So I believe that adding a null check after get() is the best we can do.

{code}
Integer sqlType = columnTypes.get(col);
if (sqlType == null) {
   throw new IOException("Column " + col + " does not exist in table " + tableName);
}
String javaType = toJavaType(col, sqlType);
{code}

Please correct me if I misunderstand your suggestions.
                  

[jira] [Commented] (SQOOP-481) Sqoop import with --hive-import using wrong column names in --columns throws a NPE

Posted by "Jarek Jarcec Cecho (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/SQOOP-481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13284318#comment-13284318 ] 

Jarek Jarcec Cecho commented on SQOOP-481:
------------------------------------------

I personally like this approach, as it solves the issue with a proper error message, it is quite clean, and I do not believe that it will introduce any backward incompatibility.

Jarcec
                
> Sqoop import with --hive-import using wrong column names in --columns throws a NPE
> ----------------------------------------------------------------------------------
>
>                 Key: SQOOP-481
>                 URL: https://issues.apache.org/jira/browse/SQOOP-481
>             Project: Sqoop
>          Issue Type: Bug
>            Reporter: Cheolsoo Park
>            Assignee: Cheolsoo Park
>         Attachments: SQOOP_481-2.patch, SQOOP_481.patch

[jira] [Commented] (SQOOP-481) Sqoop import with --have-import using wrong column names in --columns throws a NPE

Posted by "Cheolsoo Park (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/SQOOP-481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13266263#comment-13266263 ] 

Cheolsoo Park commented on SQOOP-481:
-------------------------------------

In fact, there are multiple places where the same issue can occur due to the use of autoboxing. For example, the following lines of code in ClassWriter.generateLoadLargeObjects() can also throw an NPE if "col" does not exist in the table, because columnTypes.get() will return a null that is in turn auto-unboxed to a primitive int:

{code}
int sqlType = columnTypes.get(col);
String javaType = toJavaType(col, sqlType);
{code}

Even worse, in some DBs including Oracle, it is very easy to run into this problem because column names are case-sensitive. For example, a table may have a column "i" while the user specifies "I" in an option.
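
The failure mode described above can be reproduced outside Sqoop with a plain map. The following is a small self-contained sketch; the class and method names are hypothetical and only stand in for the real call sites.

```java
import java.sql.Types;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-alone reproduction of the auto-unboxing NPE. */
class UnboxingNpeDemo {

  // Takes a primitive int, like the toHiveType(int colType) mentioned in the issue.
  static String describe(int sqlType) {
    return "sqlType=" + sqlType;
  }

  static String lookup(Map<String, Integer> columnTypes, String col) {
    Integer sqlType = columnTypes.get(col); // null when the key is absent
    // Passing an Integer where an int is expected compiles to
    // sqlType.intValue(), which throws a NullPointerException on null.
    return describe(sqlType);
  }

  public static void main(String[] args) {
    Map<String, Integer> columnTypes = new HashMap<>();
    columnTypes.put("I", Types.INTEGER); // the column as stored in the map

    System.out.println(lookup(columnTypes, "I")); // key matches exactly

    try {
      lookup(columnTypes, "i"); // same name, different case
    } catch (NullPointerException e) {
      System.out.println("NPE while resolving column i");
    }
  }
}
```

The exception carries no hint about which column was missing, which is why a more descriptive error at the look-up site is preferable.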

What I think we should do is:
1) Identify all places where autoboxing takes place in Sqoop
2) Surround them with a try-catch block (or add a null check) and print an informative error message such as: column 'x' does not exist in table 'y'.

Please let me know if anyone has a better suggestion.

                

[jira] [Commented] (SQOOP-481) Sqoop import with --hive-import using wrong column names in --columns throws a NPE

Posted by "Jarek Jarcec Cecho (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/SQOOP-481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13266366#comment-13266366 ] 

Jarek Jarcec Cecho commented on SQOOP-481:
------------------------------------------

I would also prefer failing as soon as possible, with a proper description of what has happened.

I.e. +1 to Arvind :-)

Jarcec
                

[jira] [Commented] (SQOOP-481) Sqoop import with --hive-import using wrong column names in --columns throws a NPE

Posted by "Cheolsoo Park (JIRA)" <ji...@apache.org>.

    [ https://issues.apache.org/jira/browse/SQOOP-481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13266776#comment-13266776 ] 

Cheolsoo Park commented on SQOOP-481:
-------------------------------------

Hi Arvind and Jarcec, thank you for your suggestions!

Indeed, failing fast is a good thing to do, and we should always prevent maps from having null values. In fact, fail-fast logic is already in place. For example, we call valueOf() when putting a new value into a map, and valueOf() never returns null.

{code:title=protected Map<String, Integer> getColumnTypesForRawQuery(String stmt)}
colTypes.put(colName, Integer.valueOf(typeId));
{code}

But the problem remains, because a map may still lack a specific key, and this cannot be detected until a look-up happens. I guess that the real problem that I am raising in this jira is *not auto-boxing but auto-unboxing*. For auto-unboxing, the get() call is the earliest point at which we can fail. So I believe that adding a null check after get() is the best we can do.

{code}
Integer sqlType = columnTypes.get(col);
if (sqlType == null) {
   throw new IOException("Column " + col + " does not exist in table " + tableName);
}
{code}

Please correct me if I misunderstand your suggestions.
                

[jira] [Updated] (SQOOP-481) Sqoop import with --hive-import using wrong column names in --columns throws a NPE

Posted by "Cheolsoo Park (JIRA)" <ji...@apache.org>.

     [ https://issues.apache.org/jira/browse/SQOOP-481?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Cheolsoo Park updated SQOOP-481:
--------------------------------

    Summary: Sqoop import with --hive-import using wrong column names in --columns throws a NPE  (was: Sqoop import with --have-import using wrong column names in --columns throws a NPE)
    