Posted to issues@hive.apache.org by "Wei Zheng (JIRA)" <ji...@apache.org> on 2015/11/02 22:01:28 UTC

[jira] [Commented] (HIVE-9291) Hive error when GROUPING by TIMESTAMP column when storage orc TBLPROPERTIES ('transactional'='true')

    [ https://issues.apache.org/jira/browse/HIVE-9291?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14986028#comment-14986028 ] 

Wei Zheng commented on HIVE-9291:
---------------------------------

This looks like a real issue. Working on the repro.
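
For anyone setting up the repro: transactional tables on Hive 0.14 need the standard ACID session settings in place first. These are listed here as an assumption about the environment (the HDP sandbox may already ship with them preconfigured):

-- standard ACID session settings for Hive 0.14 (assumed, not taken from the report)
set hive.support.concurrency=true;
set hive.txn.manager=org.apache.hadoop.hive.ql.lockmgr.DbTxnManager;
set hive.enforce.bucketing=true;
set hive.exec.dynamic.partition.mode=nonstrict;

With those in place, the DDL and query quoted below should reproduce the failure.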

> Hive error when GROUPING by TIMESTAMP column when storage orc TBLPROPERTIES ('transactional'='true')
> ----------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-9291
>                 URL: https://issues.apache.org/jira/browse/HIVE-9291
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.14.0
>            Environment: Hortonworks Sandbox HDP 2.2
>            Reporter: Geoffrey Cleaves
>            Assignee: Wei Zheng
>              Labels: hadoop, newbie
>
> I am unable to run a SQL query that groups by a timestamp column when the underlying table is created as ORC with TBLPROPERTIES ('transactional'='true'). If I remove ('transactional'='true') when creating the table, the GROUP BY query runs correctly.
> (Additionally, Pig does not read tables created with TBLPROPERTIES ('transactional'='true').)
> h3. Error output
> hive> select to_date(createdat), count( * ) from entrance_t
>     > group by to_date(createdat);
> Query ID = root_20150107131414_f6739293-a87f-4c05-8100-b86ae060be3a
> Total jobs = 1
> Launching Job 1 out of 1
> Number of reduce tasks not specified. Estimated from input data size: 1
> In order to change the average load for a reducer (in bytes):
>   set hive.exec.reducers.bytes.per.reducer=<number>
> In order to limit the maximum number of reducers:
>   set hive.exec.reducers.max=<number>
> In order to set a constant number of reducers:
>   set mapreduce.job.reduces=<number>
> Starting Job = job_1420194485920_0106, Tracking URL = http://sandbox.hortonworks.com:8088/proxy/application_1420194485920_0106/
> Kill Command = /usr/hdp/2.2.0.0-2041/hadoop/bin/hadoop job  -kill job_1420194485920_0106
> Hadoop job information for Stage-1: number of mappers: 3; number of reducers: 1
> 2015-01-07 13:14:50,082 Stage-1 map = 0%,  reduce = 0%
> 2015-01-07 13:15:30,154 Stage-1 map = 100%,  reduce = 100%
> Ended Job = job_1420194485920_0106 with errors
> Error during job, obtaining debugging information...
> Examining task ID: task_1420194485920_0106_m_000000 (and more) from job job_1420194485920_0106
> Task with the most failures(4): 
> -----
> Task ID:
>   task_1420194485920_0106_m_000001
> URL:
>   http://sandbox.hortonworks.com:8088/taskdetails.jsp?jobid=job_1420194485920_0106&tipid=task_1420194485920_0106_m_000001
> -----
> Diagnostic Messages for this Task:
> Error: java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row 
> 	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:185)
> 	at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:54)
> 	at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:450)
> 	at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
> 	at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:163)
> 	at java.security.AccessController.doPrivileged(Native Method)
> 	at javax.security.auth.Subject.doAs(Subject.java:415)
> 	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1628)
> 	at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row 
> 	at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:52)
> 	at org.apache.hadoop.hive.ql.exec.mr.ExecMapper.map(ExecMapper.java:176)
> 	... 8 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 2
> 	at org.apache.hadoop.hive.ql.exec.vector.expressions.LongToStringUnaryUDF.evaluate(LongToStringUnaryUDF.java:57)
> 	at org.apache.hadoop.hive.ql.exec.vector.VectorHashKeyWrapperBatch.evaluateBatch(VectorHashKeyWrapperBatch.java:91)
> 	at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator$ProcessingModeHashAggregate.processBatch(VectorGroupByOperator.java:315)
> 	at org.apache.hadoop.hive.ql.exec.vector.VectorGroupByOperator.processOp(VectorGroupByOperator.java:859)
> 	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
> 	at org.apache.hadoop.hive.ql.exec.vector.VectorSelectOperator.processOp(VectorSelectOperator.java:138)
> 	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:815)
> 	at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:95)
> 	at org.apache.hadoop.hive.ql.exec.MapOperator$MapOpCtx.forward(MapOperator.java:157)
> 	at org.apache.hadoop.hive.ql.exec.vector.VectorMapOperator.process(VectorMapOperator.java:45)
> 	... 9 more
> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.mr.MapRedTask
> MapReduce Jobs Launched: 
> Stage-Stage-1: Map: 3  Reduce: 1   HDFS Read: 0 HDFS Write: 0 FAIL
> Total MapReduce CPU Time Spent: 0 msec
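> The exception above is raised entirely inside the vectorized operator tree (VectorMapOperator -> VectorGroupByOperator -> LongToStringUnaryUDF), so one hypothesis, unverified here, is that vectorized reads of ACID ORC data hand the expression a batch with fewer columns than it expects. A quick session-level experiment that follows from that assumption:
>
> -- sketch: turn off vectorization for the session and re-run the query;
> -- if it then succeeds, the failure is isolated to the vectorized path
> set hive.vectorized.execution.enabled=false;
> select to_date(createdat), count( * ) from entrance_t
> group by to_date(createdat);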
> h3. Problem table creation
> create table entrance_t 
> (createdAt timestamp, ip string)
> clustered by (createdAt) into 3 buckets STORED AS orc *TBLPROPERTIES ('transactional'='true')*;
> insert into table entrance_t select createdat, ip from ad_server LIMIT 10;
> h3. No problem table creation
> create table entrance_t 
> (createdAt timestamp, ip string)
> clustered by (createdAt) into 3 buckets STORED AS orc;
> insert into table entrance_t select createdat, ip from ad_server LIMIT 10;
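> With this non-transactional variant, the same aggregation completes successfully:
>
> select to_date(createdat), count( * ) from entrance_t
> group by to_date(createdat);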



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)