Posted to dev@hive.apache.org by "Harish Butani (JIRA)" <ji...@apache.org> on 2012/12/23 19:48:12 UTC
[jira] [Commented] (HIVE-896) Add LEAD/LAG/FIRST/LAST analytical
windowing functions to Hive.
[ https://issues.apache.org/jira/browse/HIVE-896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13539072#comment-13539072 ]
Harish Butani commented on HIVE-896:
------------------------------------
Hi,
We are posting a preliminary patch for a Partitioned Table Function (PTF) mechanism and
Windowing clause support built on top of it. The solution lets you invoke a
Partitioned Table Function anywhere a Table/SubQuery can appear in HQL.
The Windowing clause support matches standard SQL as much as possible:
the ability to define windows at the Query level or on an individual Function, and the
ability to specify a range or value based window with any UDAF. But since Windowing is
handled as a PTF invocation, all Window specifications must have the same Partition
and Order specification.
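To illustrate the constraint above, here is a minimal Python sketch of the idea. It is
not code from the patch and does not use Hive's API; every name in it
(partitioned_table_function, with_lag_and_window_sum, the sample rows) is hypothetical.
It assumes a PTF can be modelled as: partition the rows, order each partition, then hand
each ordered partition to one function. Because the LAG value and the windowed sum are
computed in the same pass, they necessarily share one Partition and Order specification.

```python
from collections import defaultdict

def partitioned_table_function(rows, partition_key, order_key, fn):
    """Group rows by partition_key, sort each group by order_key,
    and apply fn to every ordered partition (hypothetical model of a PTF)."""
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[partition_key]].append(row)
    out = []
    for key in sorted(partitions):
        ordered = sorted(partitions[key], key=lambda r: r[order_key])
        out.extend(fn(ordered))
    return out

def with_lag_and_window_sum(column, preceding):
    """Build a function that adds LAG(column, 1) and a sum over a window of
    `preceding` rows before the current row through the current row."""
    def apply(partition):
        result = []
        for i, row in enumerate(partition):
            start = max(0, i - preceding)
            window = partition[start:i + 1]
            new_row = dict(row)
            new_row["lag"] = partition[i - 1][column] if i > 0 else None
            new_row["window_sum"] = sum(r[column] for r in window)
            result.append(new_row)
        return result
    return apply

# Made-up click-stream rows, purely for illustration.
clicks = [
    {"user": "a", "ts": 1, "bytes": 10},
    {"user": "b", "ts": 1, "bytes": 5},
    {"user": "a", "ts": 3, "bytes": 30},
    {"user": "a", "ts": 2, "bytes": 20},
]

result = partitioned_table_function(
    clicks, "user", "ts", with_lag_and_window_sum("bytes", preceding=1)
)
for row in result:
    print(row)
```

Each function evaluated inside the single invocation sees the same ordered partition,
which is why differing Partition or Order clauses cannot be mixed in this model.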
You can read about the details in a (work-in-progress) document
here: http://tinyurl.com/ck4nopn. We have added a lot of tests to showcase the
functionality. A good starting point is ptf_general_queries.q, which has 49 queries.
But let us emphasize that this is a preliminary patch. We wanted to get this out early
to get your feedback sooner rather than later. We still need to do a lot of cleanup,
refactoring and documentation. The starting point was our SQLWindowing on top of Hive
project, which used Hive's metadata and runtime components but had its own Query form,
so some components still reflect the assumptions from that project. We started by
taking all the code from that project and placing it in the ql.ptf package.
Gradually we have dispersed the code under this package, but we still have some
way to go. For background it may help to look at our Hadoop Summit
presentation (http://tinyurl.com/bm4qb7z).
Finally, and most importantly, we are not completely finished. We are missing support for
Queries with multiple Inserts. We also have to address the case of Queries with aggregations
with no group by and with constants as columns in the Select List. On the entire ql
test suite there are still around 15 failures because of these 2 issues.
Harish Butani, Prajakta Kalmegh
> Add LEAD/LAG/FIRST/LAST analytical windowing functions to Hive.
> ---------------------------------------------------------------
>
> Key: HIVE-896
> URL: https://issues.apache.org/jira/browse/HIVE-896
> Project: Hive
> Issue Type: New Feature
> Components: OLAP, UDF
> Reporter: Amr Awadallah
> Priority: Minor
>
> Windowing functions are very useful for click stream processing and similar time-series/sliding-window analytics.
> More details at:
> http://download-west.oracle.com/docs/cd/B13789_01/server.101/b10736/analysis.htm#i1006709
> http://download-west.oracle.com/docs/cd/B13789_01/server.101/b10736/analysis.htm#i1007059
> http://download-west.oracle.com/docs/cd/B13789_01/server.101/b10736/analysis.htm#i1007032
> -- amr