Posted to dev@sqoop.apache.org by "Cheolsoo Park (JIRA)" <ji...@apache.org> on 2012/05/02 20:04:50 UTC

[jira] [Issue Comment Edited] (SQOOP-481) Sqoop import with --hive-import using wrong column names in --columns throws a NPE

    [ https://issues.apache.org/jira/browse/SQOOP-481?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13266776#comment-13266776 ] 

Cheolsoo Park edited comment on SQOOP-481 at 5/2/12 6:04 PM:
-------------------------------------------------------------

Hi Arvind and Jarcec, thank you for your suggestions!

Indeed, failing fast is a good thing to do, and we should always prevent maps from holding null values. In fact, fail-fast logic is already in place. For example, we call valueOf() when putting a new value into a map, and valueOf() never returns null.

{code:title=protected Map<String, Integer> getColumnTypesForRawQuery(String stmt)}
colTypes.put(colName, Integer.valueOf(typeId));
{code}

But the problem remains: a map may still lack a specific key, and that cannot be detected until a look-up happens. I guess that the real problem I am raising in this jira is *not auto-boxing but auto-unboxing*. For auto-unboxing, the get() call is the earliest point at which we can fail. So I believe that adding a null check right after get() is the best we can do.

{code}
Integer sqlType = columnTypes.get(col);
if (sqlType == null) {
   throw new IOException("Column " + col + " does not exist in table " + tableName);
}
String javaType = toJavaType(col, sqlType);
{code}
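
To make the difference concrete, here is a minimal stand-alone sketch (not Sqoop code). The class ColumnTypeLookup and the methods describe() and lookup() are hypothetical names used only for illustration; describe(int) merely stands in for a method that takes a primitive int, such as toHiveType(int). It assumes a plain HashMap keyed by case-sensitive column names.

```java
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical demo class; it only mimics the shape of the look-up discussed above.
class ColumnTypeLookup {

  // Takes a primitive int, so passing an Integer forces auto-unboxing at the call site.
  static String describe(int sqlType) {
    return "sqlType=" + sqlType;
  }

  // Null-checked look-up: fails right after get() with a descriptive message.
  static int lookup(Map<String, Integer> columnTypes, String col, String tableName)
      throws IOException {
    Integer sqlType = columnTypes.get(col);
    if (sqlType == null) {
      throw new IOException("Column " + col + " does not exist in table " + tableName);
    }
    return sqlType;
  }

  public static void main(String[] args) {
    Map<String, Integer> columnTypes = new HashMap<>();
    // valueOf() never returns null, so the stored value itself is safe.
    columnTypes.put("I", Integer.valueOf(4)); // 4 is java.sql.Types.INTEGER

    try {
      Integer missing = columnTypes.get("i"); // key "i" is absent, so get() returns null
      describe(missing);                      // auto-unboxing the null throws here
    } catch (NullPointerException e) {
      System.out.println("Unchecked look-up: NullPointerException");
    }

    try {
      describe(lookup(columnTypes, "i", "foo"));
    } catch (IOException e) {
      System.out.println("Checked look-up: " + e.getMessage());
    }
  }
}
```

The unchecked path fails only when the null reaches a primitive parameter, while the checked path fails at the look-up itself and names the offending column.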

Please correct me if I misunderstand your suggestions.
                
> Sqoop import with --hive-import using wrong column names in --columns throws a NPE
> ----------------------------------------------------------------------------------
>
>                 Key: SQOOP-481
>                 URL: https://issues.apache.org/jira/browse/SQOOP-481
>             Project: Sqoop
>          Issue Type: Bug
>            Reporter: Cheolsoo Park
>            Assignee: Cheolsoo Park
>
> To reproduce the error, 
> 1) Create a table "foo" with a column named "I" on an Oracle DB
> 2) Run sqoop import --connect jdbc:oracle:thin:@//localhost/xe --username **** --password **** --verbose --table foo --split-by i --columns i --hive-import
> This generates the following call stack:
> {code}
> 12/05/01 16:12:00 ERROR sqoop.Sqoop: Got exception running Sqoop: java.lang.NullPointerException
> java.lang.NullPointerException
> 	at com.cloudera.sqoop.hive.TableDefWriter.getCreateTableStmt(TableDefWriter.java:162)
> 	at com.cloudera.sqoop.hive.HiveImport.importTable(HiveImport.java:195)
> 	at com.cloudera.sqoop.tool.ImportTool.importTable(ImportTool.java:394)
> 	at com.cloudera.sqoop.tool.ImportTool.run(ImportTool.java:455)
> 	at com.cloudera.sqoop.Sqoop.run(Sqoop.java:146)
> 	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
> 	at com.cloudera.sqoop.Sqoop.runSqoop(Sqoop.java:182)
> 	at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:221)
> 	at com.cloudera.sqoop.Sqoop.runTool(Sqoop.java:230)
> 	at com.cloudera.sqoop.Sqoop.main(Sqoop.java:239)
> {code}
> The reason is simple. In the following lines of code:
> {code}
> Integer colType = columnTypes.get(col);
> ...
> String hiveColType = connManager.toHiveType(colType);
> {code}
> colType is null because the column "i" does not exist in the table "foo"; only "I" exists. toHiveType(int colType) then tries to auto-unbox the null into a primitive int, resulting in an NPE.
> It would be better if a more informative message were provided rather than a bare NPE.
