Posted to issues@tez.apache.org by "Hitesh Shah (JIRA)" <ji...@apache.org> on 2016/08/07 18:01:20 UTC
[jira] [Commented] (TEZ-3403) Empty partition issue with Hive on TEZ

    [ https://issues.apache.org/jira/browse/TEZ-3403?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15411020#comment-15411020 ] 

Hitesh Shah commented on TEZ-3403:
----------------------------------

There are 2 issues mentioned in the description. For the first one, please file a Hive jira. For the second one, we can re-use this jira. Please attach yarn application logs for the second issue.

For both issues, please provide the following info:
   - tez and hive versions
   - which InputFormat you are using with the query in question 
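
To help collect the YARN application logs requested above, here is a minimal Python sketch. It assumes that the Tez vertex ID in the diagnostics (for example vertex_1470240409111_2339_1_06) begins with the cluster timestamp and application sequence number of the owning YARN application, so the application ID can be read off it. The helper name yarn_logs_command is hypothetical and only illustrative; `yarn logs -applicationId` is the standard YARN CLI call for fetching aggregated logs.

```python
import re

# Assumed layout of a Tez vertex ID:
#   vertex_<clusterTimestamp>_<appSequence>_<dagSequence>_<vertexSequence>
VERTEX_ID = re.compile(r"vertex_(\d+)_(\d+)_\d+_\d+")


def yarn_logs_command(diagnostics: str) -> str:
    """Build the `yarn logs` command for the first vertex ID found in the text.

    Hypothetical helper: it only formats a command string and does not
    contact the cluster.
    """
    match = VERTEX_ID.search(diagnostics)
    if match is None:
        raise ValueError("no Tez vertex ID found in diagnostics")
    cluster_ts, app_seq = match.groups()
    # Keep the digits exactly as they appear so any zero padding is preserved.
    return f"yarn logs -applicationId application_{cluster_ts}_{app_seq}"


if __name__ == "__main__":
    line = "Vertex failed, vertexName=Map 1, vertexId=vertex_1470240409111_2339_1_06"
    print(yarn_logs_command(line))
```

Run the printed command on a cluster node and attach its output to this jira. For the InputFormat, the output of DESCRIBE FORMATTED on the table in question should list it.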

> Empty partition issue with Hive on TEZ
> --------------------------------------
>
>                 Key: TEZ-3403
>                 URL: https://issues.apache.org/jira/browse/TEZ-3403
>             Project: Apache Tez
>          Issue Type: Bug
>            Reporter: Ashish Kumar
>
> Hi,
> I'm experiencing a few failures with TEZ regarding Hive partitions. Even though no partition column is used in the query, it still fails with a partition file path not found error.
> I'm trying to run the query below with Hive on TEZ and am hitting a partition issue. The same query works fine with the MR engine. The table is an external one, partitioned on the year and month columns. I've seen few times 
> *Query:*
> *select count(crn) as bookings, month(to_date(from_utc_timestamp(pickup_date,'IST'))) as month from bookings_table and year=2016 group by month(to_date(from_utc_timestamp(pickup_date,'IST')));*
> *Error:*
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: java.io.IOException: While processing file s3n://<bucket>/warehouse/bookings_table/year=2016/month=1. null 
> at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:78) 
> at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.run(MapRecordProcessor.java:292) 
> at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:148) 
> ... 14 more 
> Caused by: java.io.IOException: java.io.IOException: While processing file s3n://<bucket>/warehouse/bookings_table/year=2016/month=1. null 
> at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121) 
> at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77) 
> at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:372) 
> at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:79) 
> at org.apache.hadoop.hive.ql.io.HiveRecordReader.doNext(HiveRecordReader.java:33) 
> at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:118) 
> at org.apache.hadoop.mapred.split.TezGroupedSplitsInputFormat$TezGroupedSplitsRecordReader.next(TezGroupedSplitsInputFormat.java:137) 
> at org.apache.tez.mapreduce.lib.MRReaderMapred.next(MRReaderMapred.java:113) 
> at org.apache.hadoop.hive.ql.exec.tez.MapRecordSource.pushRecord(MapRecordSource.java:68) 
> ... 16 more 
> *Another error for other query:*
> DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:4 
> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1470240409111_2339_1_06, diagnostics=[Vertex vertex_1470240409111_2339_1_06 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: app_sessions initializer failed, vertex=vertex_1470240409111_2339_1_06 [Map 1], java.io.FileNotFoundException: No such file or directory: s3n://<bucket>/warehouse/<table>/year=2015/month=02/day=14/hour=03 
> at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:1078) 
> at org.apache.hadoop.fs.s3a.S3AFileSystem.listStatus(S3AFileSystem.java:783) 
> at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1500) 
> at org.apache.hadoop.fs.FileSystem.listStatus(FileSystem.java:1540) 
> at org.apache.hadoop.fs.FileSystem$4.<init>(FileSystem.java:1704) 
> at org.apache.hadoop.fs.FileSystem.listLocatedStatus(FileSystem.java:1703) 
> at org.apache.hadoop.mapred.InputPathProcessor.perPathComputation(InputPathProcessor.java:235) 
> at org.apache.hadoop.mapred.InputPathProcessor.access$000(InputPathProcessor.java:28) 
> at org.apache.hadoop.mapred.InputPathProcessor$2.run(InputPathProcessor.java:338) 
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) 
> at java.util.concurrent.FutureTask.run(FutureTask.java:266) 
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) 
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) 
> at java.lang.Thread.run(Thread.java:745) 


