Posted to dev@sqoop.apache.org by "Hari Sekhon (JIRA)" <ji...@apache.org> on 2014/02/07 11:40:20 UTC
[jira] [Commented] (SQOOP-1283) Export doesn't detect Avro files without .avro extension (ie created by Hive)

    [ https://issues.apache.org/jira/browse/SQOOP-1283?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13894385#comment-13894385 ] 

Hari Sekhon commented on SQOOP-1283:
------------------------------------

Thanks Harsh! I'd prefer it if Sqoop detected the file type regardless of the file extension; it's one less thing for users to worry about. If you already have the backing files without the .avro extension, having to transform a large table just to export it is annoying.
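
To illustrate what extension-independent detection could look like, here is a minimal sketch. It relies on the fact that Avro object container files start with the four bytes 'O', 'b', 'j', 0x01, which is also what shows up as "Obj" at the start of the unparsed input in the stack trace below. The class and method names (AvroMagicCheck, looksLikeAvro) are hypothetical and this is not Sqoop's actual code; it reads from a plain InputStream, whereas a real fix would presumably open the file through the Hadoop FileSystem API.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of Sqoop: sniffs the Avro container magic
// from the first bytes of a stream instead of trusting the file name.
public class AvroMagicCheck {
    // Avro object container files begin with 'O', 'b', 'j', followed by byte 1.
    private static final byte[] AVRO_MAGIC = {'O', 'b', 'j', 1};

    /** Returns true if the stream starts with the Avro container magic bytes. */
    public static boolean looksLikeAvro(InputStream in) throws IOException {
        byte[] header = new byte[AVRO_MAGIC.length];
        int read = 0;
        // read() may return fewer bytes than requested, so loop until full or EOF.
        while (read < header.length) {
            int n = in.read(header, read, header.length - read);
            if (n < 0) {
                return false; // stream is shorter than the magic
            }
            read += n;
        }
        return Arrays.equals(header, AVRO_MAGIC);
    }

    public static void main(String[] args) throws IOException {
        byte[] avroLike = {'O', 'b', 'j', 1, 4, 20};
        byte[] textLike = "1,foo,bar\n".getBytes(StandardCharsets.UTF_8);
        System.out.println(looksLikeAvro(new ByteArrayInputStream(avroLike)));
        System.out.println(looksLikeAvro(new ByteArrayInputStream(textLike)));
    }
}
```

With a check like this, a file named 000000_0 would be classified by its contents rather than its name, so the export would not fall through to the text mapper.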

> Export doesn't detect Avro files without .avro extension (ie created by Hive)
> -----------------------------------------------------------------------------
>
>                 Key: SQOOP-1283
>                 URL: https://issues.apache.org/jira/browse/SQOOP-1283
>             Project: Sqoop
>          Issue Type: Bug
>          Components: connectors/postgresql, hive-integration
>    Affects Versions: 1.4.3
>         Environment: CDH 4.5
>            Reporter: Hari Sekhon
>
> When exporting to PostgreSQL, Sqoop doesn't detect Avro files properly if they lack the .avro extension (i.e. they are named 000000_0 in HDFS because they were created by Hive). The code falls back to an unknown file type and then attempts to use the Text export mapper, which fails with a parse exception:
> java.io.IOException: Can't export data, please check failed map task logs 
> at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112) 
> at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) 
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140) 
> at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) 
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672) 
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330) 
> at org.apache.hadoop.mapred.Child$4.run(Child.java:268) 
> at java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:396) 
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408) 
> at org.apache.hadoop.mapred.Child.main(Child.java:262) 
> Caused by: java.lang.RuntimeException: Can't parse input data: 'Objavro.codecdeflateavro.schema�{"type":"record","name":"<scrubbed>","namespace":"<scrubbed>.avro","fields":[{"name":"pane 
> 14/02/03 17:13:52 INFO mapred.JobClient: Task Id : attempt_201312101527_93532_m_000000_0, Status : FAILED 
> java.io.IOException: Can't export data, please check failed map task logs 
> at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:112) 
> at org.apache.sqoop.mapreduce.TextExportMapper.map(TextExportMapper.java:39) 
> at org.apache.hadoop.mapreduce.Mapper.run(Mapper.java:140) 
> at org.apache.sqoop.mapreduce.AutoProgressMapper.run(AutoProgressMapper.java:64) 
> at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:672) 
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:330) 
> at org.apache.hadoop.mapred.Child$4.run(Child.java:268) 
> at java.security.AccessController.doPrivileged(Native Method) 
> at javax.security.auth.Subject.doAs(Subject.java:396) 
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408) 
> at org.apache.hadoop.mapred.Child.main(Child.java:262) 
> Thanks
> Hari Sekhon
> http://www.linkedin.com/in/harisekhon


