You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hive.apache.org by "Venkata Puneet Ravuri (JIRA)" <ji...@apache.org> on 2014/08/27 00:07:58 UTC

[jira] [Created] (HIVE-7886) Aggregation queries fail with RCFile based Hive tables with S3 storage

Venkata Puneet Ravuri created HIVE-7886:
-------------------------------------------

             Summary: Aggregation queries fail with RCFile based Hive tables with S3 storage
                 Key: HIVE-7886
                 URL: https://issues.apache.org/jira/browse/HIVE-7886
             Project: Hive
          Issue Type: Bug
          Components: File Formats
    Affects Versions: 0.13.1
            Reporter: Venkata Puneet Ravuri


Aggregation queries on Hive tables which use RCFile format and S3 storage are failing.

My setup is Hadoop 2.5.0 and Hive 0.13.1.

I create a table with following schema:-
CREATE EXTERNAL TABLE `testtable`(
  `col1` string, 
  `col2` tinyint, 
  `col3` int, 
  `col4` float, 
  `col5` boolean, 
  `col6` smallint)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.columnar.ColumnarSerDe' 
WITH SERDEPROPERTIES (
  'serialization.format'='\t',
  'line.delim'='\n',
  'field.delim'='\t'
)
STORED AS INPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.RCFileInputFormat' 
OUTPUTFORMAT 
  'org.apache.hadoop.hive.ql.io.RCFileOutputFormat'
LOCATION
  's3n://<testbucket>/testtable';

When I run 'select count(*) from testtable', it gives the following exception stack:-

Error: java.io.IOException: java.io.IOException: java.io.EOFException: Attempted to seek or read past the end of the file
	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
	at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:256)
	at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:171)
	at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:198)
	at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:184)
	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:52)
	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:342)
	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:168)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:415)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:163)
Caused by: java.io.IOException: java.io.EOFException: Attempted to seek or read past the end of the file
	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
	at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:344)
	at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101)
	at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:122)
	at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:254)
	... 11 more
Caused by: java.io.EOFException: Attempted to seek or read past the end of the file
	at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.processException(Jets3tNativeFileSystemStore.java:462)
	at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.handleException(Jets3tNativeFileSystemStore.java:411)
	at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.retrieve(Jets3tNativeFileSystemStore.java:234)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:601)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
	at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
	at org.apache.hadoop.fs.s3native.$Proxy17.retrieve(Unknown Source)
	at org.apache.hadoop.fs.s3native.NativeS3FileSystem$NativeS3FsInputStream.seek(NativeS3FileSystem.java:205)
	at org.apache.hadoop.fs.BufferedFSInputStream.seek(BufferedFSInputStream.java:96)
	at org.apache.hadoop.fs.BufferedFSInputStream.skip(BufferedFSInputStream.java:67)
	at java.io.DataInputStream.skipBytes(DataInputStream.java:220)
	at org.apache.hadoop.hive.ql.io.RCFile$ValueBuffer.readFields(RCFile.java:739)
	at org.apache.hadoop.hive.ql.io.RCFile$Reader.currentValueBuffer(RCFile.java:1720)
	at org.apache.hadoop.hive.ql.io.RCFile$Reader.getCurrentRow(RCFile.java:1898)
	at org.apache.hadoop.hive.ql.io.RCFileRecordReader.next(RCFileRecordReader.java:149)
	at org.apache.hadoop.hive.ql.io.RCFileRecordReader.next(RCFileRecordReader.java:44)
	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:339)
	... 15 more




--
This message was sent by Atlassian JIRA
(v6.2#6252)