Posted to issues@hive.apache.org by "Aihua Xu (JIRA)" <ji...@apache.org> on 2015/06/01 15:31:18 UTC

[jira] [Commented] (HIVE-10826) Support min()/max() functions over x preceding and y preceding windowing

    [ https://issues.apache.org/jira/browse/HIVE-10826?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14567288#comment-14567288 ] 

Aihua Xu commented on HIVE-10826:
---------------------------------

[~leftylev] Yes, the supported window specification should be updated. Instead of
{noformat}
ROWS ((CURRENT ROW) | (UNBOUNDED | [num]) PRECEDING) AND (UNBOUNDED | [num]) FOLLOWING
{noformat}

The window spec should be:
{noformat}
ROWS BETWEEN ((CURRENT ROW) | (UNBOUNDED | [num]) PRECEDING | [num] FOLLOWING) AND ((CURRENT ROW) | [num] PRECEDING | (UNBOUNDED | [num]) FOLLOWING)
{noformat}
with the restriction that the first boundary of the window spec must be <= the second boundary, i.e. the window start cannot come after the window end. So in {{rows between x preceding and y preceding}}, x should be the same as or greater than y, and {{rows between 2 following and current row}} is invalid; see the examples below.
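
For example, against the {{small}} table from the issue description (illustrative queries only, not taken from the patch; the alias {{prev_min}} is mine):
{noformat}
-- Valid: the frame start (2 PRECEDING) is at or before the frame end (1 PRECEDING).
SELECT key, value,
       MIN(value) OVER (PARTITION BY key ORDER BY value
                        ROWS BETWEEN 2 PRECEDING AND 1 PRECEDING) AS prev_min
FROM small;

-- Invalid: the frame start (2 FOLLOWING) is after the frame end (CURRENT ROW).
SELECT key, value,
       MIN(value) OVER (PARTITION BY key ORDER BY value
                        ROWS BETWEEN 2 FOLLOWING AND CURRENT ROW)
FROM small;
{noformat}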

This is similar to the Oracle documentation here:
http://docs.oracle.com/cd/E11882_01/server.112/e41084/functions004.htm#SQLRF06174

I copied the explanation here:

{noformat}
BETWEEN ... AND 
Use the BETWEEN ... AND clause to specify a start point and end point for the window. The first expression (before AND) defines the start point and the second expression (after AND) defines the end point.

If you omit BETWEEN and specify only one end point, then Oracle considers it the start point, and the end point defaults to the current row.

UNBOUNDED PRECEDING 
Specify UNBOUNDED PRECEDING to indicate that the window starts at the first row of the partition. This is the start point specification and cannot be used as an end point specification.

UNBOUNDED FOLLOWING 
Specify UNBOUNDED FOLLOWING to indicate that the window ends at the last row of the partition. This is the end point specification and cannot be used as a start point specification.

CURRENT ROW 
As a start point, CURRENT ROW specifies that the window begins at the current row or value (depending on whether you have specified ROW or RANGE, respectively). In this case the end point cannot be value_expr PRECEDING.

As an end point, CURRENT ROW specifies that the window ends at the current row or value (depending on whether you have specified ROW or RANGE, respectively). In this case the start point cannot be value_expr FOLLOWING.

value_expr PRECEDING or value_expr FOLLOWING 
For RANGE or ROW:

If value_expr FOLLOWING is the start point, then the end point must be value_expr FOLLOWING.

If value_expr PRECEDING is the end point, then the start point must be value_expr PRECEDING.
{noformat}
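
To make the boundary rules above concrete in Hive syntax, here is an illustrative sketch against the {{small(key, value)}} table from the description (the column aliases are mine):
{noformat}
SELECT key, value,
       -- frame: the whole partition up to and including the current row
       MIN(value) OVER (PARTITION BY key ORDER BY value
                        ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) AS min_so_far,
       -- frame: only the single row immediately before the current row
       -- (the "x preceding and y preceding" case this JIRA adds)
       MIN(value) OVER (PARTITION BY key ORDER BY value
                        ROWS BETWEEN 1 PRECEDING AND 1 PRECEDING) AS prev_value,
       -- frame: the current row plus the rest of the partition
       MAX(value) OVER (PARTITION BY key ORDER BY value
                        ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING) AS max_from_here
FROM small;
{noformat}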


> Support min()/max() functions over x preceding and y preceding windowing 
> -------------------------------------------------------------------------
>
>                 Key: HIVE-10826
>                 URL: https://issues.apache.org/jira/browse/HIVE-10826
>             Project: Hive
>          Issue Type: Sub-task
>          Components: PTF-Windowing
>            Reporter: Aihua Xu
>            Assignee: Aihua Xu
>             Fix For: 1.3.0
>
>         Attachments: HIVE-10826.patch
>
>
> Currently the query 
> {noformat}
> select key, value, min(value) over (partition by key order by value rows between 1 preceding and 1 preceding) from small;
> {noformat}
> doesn't work. It fails with 
> {noformat}
> java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":"2"},"value":{"_col0":"500"}}
>         at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:256)
>         at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:506)
>         at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
>         at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:449)
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Hive Runtime Error while processing row (tag=0) {"key":{"reducesinkkey0":"2"},"value":{"_col0":"500"}}
>         at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:244)
>         ... 3 more
> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Internal Error: cannot generate all output rows for a Partition
>         at org.apache.hadoop.hive.ql.udf.ptf.WindowingTableFunction.finishPartition(WindowingTableFunction.java:520)
>         at org.apache.hadoop.hive.ql.exec.PTFOperator$PTFInvocation.finishPartition(PTFOperator.java:337)
>         at org.apache.hadoop.hive.ql.exec.PTFOperator.process(PTFOperator.java:114)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:837)
>         at org.apache.hadoop.hive.ql.exec.SelectOperator.process(SelectOperator.java:88)
>         at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:235)
> {noformat}


