Posted to issues@hive.apache.org by "Brock Noland (JIRA)" <ji...@apache.org> on 2015/11/24 22:32:11 UTC

[jira] [Commented] (HIVE-11977) Hive should handle an external avro table with zero length files present

    [ https://issues.apache.org/jira/browse/HIVE-11977?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15025457#comment-15025457 ] 

Brock Noland commented on HIVE-11977:
-------------------------------------

[~dossett] Sorry, I just saw this ping! I moved my mail account and had not yet configured my rules appropriately. This patch looks good! Nice work.


[~sershe] - agreed, it'd be great to see this in 1.x.

> Hive should handle an external avro table with zero length files present
> ------------------------------------------------------------------------
>
>                 Key: HIVE-11977
>                 URL: https://issues.apache.org/jira/browse/HIVE-11977
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 0.14.0, 1.0.0, 1.2.0, 1.1.0, 1.2.1
>            Reporter: Aaron Dossett
>            Assignee: Aaron Dossett
>             Fix For: 2.0.0
>
>         Attachments: HIVE-11977.2.patch, HIVE-11977.patch
>
>
> If a zero-length file is in the top-level directory housing an external Avro table, all Hive queries on the table fail.
> The issue is that org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader creates a new org.apache.avro.file.DataFileReader, and DataFileReader throws an exception when trying to read an empty file (because the empty file lacks the magic number marking it as Avro).
> AvroGenericRecordReader should detect an empty file and then behave reasonably.
> Caused by: java.io.IOException: Not a data file.
> at org.apache.avro.file.DataFileStream.initialize(DataFileStream.java:102)
> at org.apache.avro.file.DataFileReader.<init>(DataFileReader.java:97)
> at org.apache.hadoop.hive.ql.io.avro.AvroGenericRecordReader.<init>(AvroGenericRecordReader.java:81)
> at org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat.getRecordReader(AvroContainerInputFormat.java:51)
> at org.apache.hadoop.hive.ql.io.HiveInputFormat.getRecordReader(HiveInputFormat.java:246)
> ... 25 more
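
For illustration only, the kind of check the description asks for might be sketched as below. This is not the attached patch: the class and method names (EmptyAvroFileGuard, isZeroLength, hasAvroMagic) are hypothetical, and the sketch uses plain java.nio on a local file rather than the Hadoop FileSystem and Avro APIs that AvroGenericRecordReader actually works with. The one fact it relies on is that an Avro object container file begins with the four bytes 'O', 'b', 'j', 1.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

// Hypothetical helper, not part of Hive or Avro.
final class EmptyAvroFileGuard {
    // Magic number at the start of every Avro object container file.
    private static final byte[] AVRO_MAGIC = {'O', 'b', 'j', 1};

    /** Returns true when the file contains no bytes at all. */
    static boolean isZeroLength(Path file) throws IOException {
        return Files.size(file) == 0L;
    }

    /** Returns true when the file starts with the Avro magic number. */
    static boolean hasAvroMagic(Path file) throws IOException {
        byte[] header = new byte[AVRO_MAGIC.length];
        try (InputStream in = Files.newInputStream(file)) {
            int filled = 0;
            while (filled < header.length) {
                int n = in.read(header, filled, header.length - filled);
                if (n < 0) {
                    return false; // file ended before a full header was read
                }
                filled += n;
            }
        }
        return Arrays.equals(header, AVRO_MAGIC);
    }

    public static void main(String[] args) throws IOException {
        Path empty = Files.createTempFile("empty", ".avro");
        try {
            // An empty file is zero length and cannot carry the magic number,
            // so a reader could skip it instead of failing on open.
            System.out.println("zero length: " + isZeroLength(empty));
            System.out.println("has magic:   " + hasAvroMagic(empty));
        } finally {
            Files.delete(empty);
        }
    }
}
```

A record reader could run a check like isZeroLength before opening the file and, for an empty file, report no records rather than letting the "Not a data file" IOException propagate.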


