Posted to dev@sqoop.apache.org by "Venkat Ramachandran (JIRA)" <ji...@apache.org> on 2015/07/23 05:12:05 UTC
[jira] [Updated] (SQOOP-2387) NPE thrown when sqoop tries to import
table with column name containing some special character
[ https://issues.apache.org/jira/browse/SQOOP-2387?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Venkat Ramachandran updated SQOOP-2387:
---------------------------------------
Attachment: SQOOP-2387.1.patch
The previous patch did not apply correctly on 1.4.6 or trunk, so I'm attaching a new patch. The implementation idea is unchanged: the getFieldMap() method of the generated ORM class uses the real column names as keys.
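To illustrate the idea, here is a minimal sketch of what such a generated class could look like for the two problematic columns. The class name Sqoop1Record, its fields, and its constructor are hypothetical and are not taken from the patch or from real Sqoop output; only the getFieldMap() method name comes from the description above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for the ORM class generated for table sqoop_1.
// The layout is illustrative only; it is not code produced by Sqoop.
class Sqoop1Record {
    // Java field names must be valid identifiers, so '+' and '-' cannot appear.
    private final Integer a_b; // holds the value of column "a+b"
    private final Integer a_c; // holds the value of column "a-c"

    Sqoop1Record(Integer aPlusB, Integer aMinusC) {
        this.a_b = aPlusB;
        this.a_c = aMinusC;
    }

    // Keys are the real column names, so a caller can match them against
    // the target schema without having to reverse the renaming.
    Map<String, Object> getFieldMap() {
        Map<String, Object> fields = new LinkedHashMap<>();
        fields.put("a+b", a_b);
        fields.put("a-c", a_c);
        return fields;
    }

    public static void main(String[] args) {
        Sqoop1Record row = new Sqoop1Record(1, 1);
        System.out.println(row.getFieldMap().keySet()); // prints [a+b, a-c]
    }
}
```

With keys like these, a lookup by the database column name succeeds even though the Java field behind it had to be renamed.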
I have tested this on a MySQL instance with multiple columns whose names contain special characters, and it works as expected.
***** MySQL *****
mysql> describe sqoop_1;
+-------+---------+------+-----+---------+-------+
| Field | Type    | Null | Key | Default | Extra |
+-------+---------+------+-----+---------+-------+
| a+b   | int(11) | YES  |     | NULL    |       |
| a-c   | int(11) | YES  |     | NULL    |       |
| a$d   | int(11) | YES  |     | NULL    |       |
+-------+---------+------+-----+---------+-------+
3 rows in set (0.00 sec)
mysql>
mysql> select * from sqoop_1;
+------+------+------+
| a+b  | a-c  | a$d  |
+------+------+------+
|    1 |    1 |    1 |
|    2 |    2 |    2 |
|    3 |    3 |    3 |
|    4 |    4 |    4 |
|    5 |    5 |    5 |
+------+------+------+
5 rows in set (0.00 sec)
***** HIVE *****
hive> describe sqoop_1;
OK
a+b int
a-c int
a$d int
Time taken: 0.051 seconds, Fetched: 3 row(s)
hive> select * from sqoop_1;
OK
1 1 1
2 2 2
3 3 3
4 4 4
5 5 5
Time taken: 0.069 seconds, Fetched: 5 row(s)
> NPE thrown when sqoop tries to import table with column name containing some special character
> ----------------------------------------------------------------------------------------------
>
> Key: SQOOP-2387
> URL: https://issues.apache.org/jira/browse/SQOOP-2387
> Project: Sqoop
> Issue Type: Bug
> Components: hive-integration
> Affects Versions: 1.4.5, 1.4.6
> Environment: HDP 2.2.0.0-2041
> Reporter: Pavel Benes
> Priority: Critical
> Attachments: SQOOP-2387.1.patch, SQOOP-2387.patch, joblog.txt, sqoop.log
>
>
> This sqoop import:
> sqoop import --connect jdbc:mysql://some.merck.com:1234/dbname --username XXX --password YYY --table some_table --hcatalog-database some_database --hcatalog-table some_table --hive-partition-key mg_version --hive-partition-value 2015-05-28-13-18 -m 1 --verbose --fetch-size -2147483648
> fails with this error:
> 2015-06-01 13:20:39,209 WARN [main] org.apache.hadoop.mapred.YarnChild: Exception running child : java.lang.NullPointerException
> at org.apache.hive.hcatalog.data.schema.HCatSchema.get(HCatSchema.java:105)
> at org.apache.sqoop.mapreduce.hcat.SqoopHCatImportHelper.convertToHCatRecord(SqoopHCatImportHelper.java:194)
> at org.apache.sqoop.mapreduce.hcat.SqoopHCatImportMapper.map(SqoopHCatImportMapper.java:52)
> at org.apache.sqoop.mapreduce.hcat.SqoopHCatImportMapper.map(SqoopHCatImportMapper.java:34)
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:145)
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:784)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:341)
> at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:415)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> It seems that the error is caused by a column name containing a hyphen ('-'). Column names are converted to Java identifiers, but the converted name is then not found in the HCatalog schema.
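The mismatch described in the report can be reproduced in isolation with a small sketch. The sanitizer below is a simplified, hypothetical routine (it is not Sqoop's actual conversion code and does not handle a leading digit), and a plain Map stands in for the HCatalog schema. Whether HCatSchema.get fails in exactly this way is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

class IdentifierMismatchSketch {
    // Simplified, hypothetical sanitizer: replaces every character that
    // cannot appear in a Java identifier with '_'.
    static String toJavaIdentifier(String columnName) {
        StringBuilder sb = new StringBuilder();
        for (char c : columnName.toCharArray()) {
            sb.append(Character.isJavaIdentifierPart(c) ? c : '_');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Stand-in for the target schema: real column name -> position.
        Map<String, Integer> schemaPositions = new HashMap<>();
        schemaPositions.put("a-c", 0);

        String converted = toJavaIdentifier("a-c");        // "a_c"
        Integer position = schemaPositions.get(converted); // null, no such key
        System.out.println(converted + " -> " + position); // prints a_c -> null

        // Unboxing the null result (for example `int p = position;`)
        // would throw a NullPointerException.
    }
}
```

Looking up the original name "a-c" instead of the converted one returns position 0, which is why keying the field map by real column names avoids the failure.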
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)