Posted to dev@hive.apache.org by "jiraposter@reviews.apache.org (Commented) (JIRA)" <ji...@apache.org> on 2011/12/07 08:02:43 UTC
[jira] [Commented] (HIVE-2332) If all of the parameters of distinct functions are exists in group by columns, query fails in runtime

    [ https://issues.apache.org/jira/browse/HIVE-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13164193#comment-13164193 ] 

jiraposter@reviews.apache.org commented on HIVE-2332:
-----------------------------------------------------


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/1314/
-----------------------------------------------------------

(Updated 2011-12-07 07:01:30.547976)


Review request for hive, John Sichi and Carl Steinbach.


Changes
-------

Adding null keys caused confusion, especially for the optimizers. This patch modifies only the key order, which minimizes side effects.


Summary
-------

If all of the distinct parameters are among the group-by keys, the union column reserved for distinct parameters is not added, which causes problems when initializing the RS (ReduceSink) operator.

This patch is just a simple workaround that adds a dummy expression for the union column. Someone may know a better way to resolve the problem.
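
The mismatch described above can be illustrated with a minimal, Hive-independent sketch. It is not the patch itself: the class `KeyColumnSketch`, the methods `pairUp` and `padWithDummy`, and the column names `groupKey` and `distinctUnion` are all hypothetical names chosen for illustration. The sketch only assumes that a struct initializer walks the list of column names and looks up an entry at the same index in a shorter list of expressions, which is the kind of access that raises an IndexOutOfBoundsException.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class KeyColumnSketch {

    // Pairs each column name with the expression at the same index, similar in
    // spirit to how a struct initializer pairs field names with field values.
    // Throws IndexOutOfBoundsException when there are fewer expressions than names.
    static Map<String, String> pairUp(List<String> columnNames, List<String> expressions) {
        Map<String, String> fields = new LinkedHashMap<>();
        for (int i = 0; i < columnNames.size(); i++) {
            fields.put(columnNames.get(i), expressions.get(i));
        }
        return fields;
    }

    // Hypothetical bypass: append placeholder expressions until every column
    // name has a matching expression. "dummy" is an illustrative placeholder.
    static List<String> padWithDummy(List<String> columnNames, List<String> expressions) {
        List<String> padded = new ArrayList<>(expressions);
        while (padded.size() < columnNames.size()) {
            padded.add("dummy");
        }
        return padded;
    }

    public static void main(String[] args) {
        // Two key columns (group-by key plus the reserved union column),
        // but only one key expression because the union column was skipped.
        List<String> names = Arrays.asList("groupKey", "distinctUnion");
        List<String> exprs = Arrays.asList("key_int1");

        try {
            pairUp(names, exprs);
        } catch (IndexOutOfBoundsException e) {
            System.out.println("mismatch: " + e.getMessage());
        }

        // With the padded list, the sizes agree and pairing succeeds.
        System.out.println(pairUp(names, padWithDummy(names, exprs)));
    }
}
```

Padding keeps the two lists the same length without changing the existing key expression, which is the idea behind adding a dummy expression for the union column.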


This addresses bug HIVE-2332.
    https://issues.apache.org/jira/browse/HIVE-2332


Diffs (updated)
-----

  ql/src/java/org/apache/hadoop/hive/ql/exec/ReduceSinkOperator.java 732a5aa 
  ql/src/java/org/apache/hadoop/hive/ql/plan/PlanUtils.java c6ae55d 
  ql/src/test/queries/clientpositive/groupby_distinct_samekey.q PRE-CREATION 
  ql/src/test/results/clientpositive/groupby_distinct_samekey.q.out PRE-CREATION 

Diff: https://reviews.apache.org/r/1314/diff


Testing
-------

Added clientpositive/groupby_distinct_samekey.q


Thanks,

Navis


                
> If all of the parameters of distinct functions are exists in group by columns, query fails in runtime
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-2332
>                 URL: https://issues.apache.org/jira/browse/HIVE-2332
>             Project: Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Critical
>             Fix For: 0.9.0
>
>         Attachments: HIVE-2332.1.patch.txt
>
>
> select sum(key_int1), sum(distinct key_int1) from t1 group by key_int1;
> fails with the message:
> {code}
> FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.MapRedTask
> {code}
> Hadoop reports:
> {code}
> Caused by: java.lang.IndexOutOfBoundsException: Index: 1, Size: 1
> 	at java.util.ArrayList.RangeCheck(ArrayList.java:547)
> 	at java.util.ArrayList.get(ArrayList.java:322)
> 	at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.init(StandardStructObjectInspector.java:95)
> 	at org.apache.hadoop.hive.serde2.objectinspector.StandardStructObjectInspector.<init>(StandardStructObjectInspector.java:86)
> 	at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorFactory.getStandardStructObjectInspector(ObjectInspectorFactory.java:252)
> 	at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.initEvaluatorsAndReturnStruct(ReduceSinkOperator.java:188)
> 	at org.apache.hadoop.hive.ql.exec.ReduceSinkOperator.processOp(ReduceSinkOperator.java:197)
> 	at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
> 	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:744)
> 	at org.apache.hadoop.hive.ql.exec.TableScanOperator.processOp(TableScanOperator.java:85)
> 	at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:471)
> 	at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:744)
> 	at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:532)
> {code}
> I think the problem is that there are fewer key expressions than key columns; the number of key expressions should be equal to or greater than the number of key columns.
> Would adding a key expression solve it? I'll try.
