Posted to dev@hive.apache.org by "Alan Gates (JIRA)" <ji...@apache.org> on 2013/01/08 23:26:13 UTC

[jira] [Commented] (HIVE-896) Add LEAD/LAG/FIRST/LAST analytical windowing functions to Hive.

    [ https://issues.apache.org/jira/browse/HIVE-896?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13547347#comment-13547347 ] 

Alan Gates commented on HIVE-896:
---------------------------------

Harish,

Could you point out the interfaces (in the API sense, not the Java sense) that are most important in this patch?  In particular I'm interested in the interfaces between UDFs and Hive.  Based on my review so far, the classes that stand out as important in this regard are TableFunctionEvaluator, TableFunctionResolver, and PTFPartition.  Are there others I should be looking at?

Questions I have so far:
* If I read this right, you are using CLUSTER BY and SORT BY instead of PARTITION BY and ORDER BY as the syntax in OVER.  Why?
* Does it ever make sense for a windowing function to return a partition?  Should there be an interface/abstract class specific to windowing functions that returns only a single entry?
* Can I put one of the existing aggregate functions in an OVER clause using this?
* Could you explain how the partition is handled in memory?  It looks to me as if the entire partition is read into memory.  Is that correct?  If so, is it read aggressively or as the iterator moves through the records?  It also appears there is no effort to drop earlier parts of the partition that are now out of range of the window.  Is that also correct?
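
To make the last question concrete, here is a minimal sketch of what I mean by dropping rows that have fallen out of range.  It assumes a window of the form ROWS BETWEEN n PRECEDING AND CURRENT ROW and a partition that arrives as an iterator in sort order.  The class and method names are hypothetical and do not come from the patch; this only illustrates the eviction idea, not how PTFPartition works.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration only; none of these names come from the HIVE-896 patch.
final class SlidingWindowSum {

    // For each row, returns the sum of that row and up to `preceding` rows before it,
    // i.e. SUM(x) OVER (ROWS BETWEEN `preceding` PRECEDING AND CURRENT ROW).
    static List<Long> windowedSums(Iterator<Long> partition, int preceding) {
        if (preceding < 0) {
            throw new IllegalArgumentException("preceding must be >= 0");
        }
        Deque<Long> window = new ArrayDeque<>(); // holds only rows still in range
        List<Long> out = new ArrayList<>();
        long sum = 0;

        // Rows are pulled lazily, one at a time, as the iterator advances.
        while (partition.hasNext()) {
            long value = partition.next();
            window.addLast(value);
            sum += value;

            // Evict the oldest row once it is more than `preceding` rows behind.
            if (window.size() - 1 > preceding) {
                sum -= window.removeFirst();
            }
            out.add(sum);
        }
        return out;
    }
}
```

With this approach the buffer never holds more than n + 1 rows, regardless of partition size.  It only works when the window has a bounded start; a frame that begins at UNBOUNDED PRECEDING would need either the whole partition or a running aggregate instead.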

                
> Add LEAD/LAG/FIRST/LAST analytical windowing functions to Hive.
> ---------------------------------------------------------------
>
>                 Key: HIVE-896
>                 URL: https://issues.apache.org/jira/browse/HIVE-896
>             Project: Hive
>          Issue Type: New Feature
>          Components: OLAP, UDF
>            Reporter: Amr Awadallah
>            Priority: Minor
>         Attachments: HIVE-896.1.patch.txt
>
>
> Windowing functions are very useful for click stream processing and similar time-series/sliding-window analytics.
> More details at:
> http://download-west.oracle.com/docs/cd/B13789_01/server.101/b10736/analysis.htm#i1006709
> http://download-west.oracle.com/docs/cd/B13789_01/server.101/b10736/analysis.htm#i1007059
> http://download-west.oracle.com/docs/cd/B13789_01/server.101/b10736/analysis.htm#i1007032
> -- amr
