Posted to user@hive.apache.org by Dave Houston <ro...@crankyadmin.net> on 2012/01/11 15:54:08 UTC
Lag & Lead
Hi guys,
I'm trying to calculate the dwell time of pages in a weblog. In Oracle we would use the LEAD analytic function to find the next row for a particular cookie. What is the best approach for Hive?
Thanks
Dave
Dave Houston
root@crankyadmin.net
Re: Lag & Lead
Posted by Mark Grover <mg...@oanda.com>.
Dave,
I had a similar need for the "first" function but since the Hive ticket Ed mentioned is still unresolved, I ended up writing a reducer (pluggable into Hive via the "transform" functionality) that returned the first row. In your example, you would "distribute by" the cookie before you send the data to the reducer.
You could look into doing something similar as well. Perhaps a nicer way would be to write a UDAF, but the reducer works fine for me.
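For the dwell-time case, a reducer along these lines might look roughly like the sketch below. This is not the reducer I actually use, just an illustration. It assumes the rows arrive as tab-separated (cookie, ts, page), that ts is an integer number of seconds, and that the rows are already grouped by cookie and sorted by ts within each cookie. The function name dwell_times and the column layout are made up for the example. It writes \N for the last page of each cookie, on the assumption that Hive reads that back as NULL.

```python
import sys


def dwell_times(lines):
    """Emulate LEAD over rows that are grouped by cookie and sorted by ts.

    Each input line is assumed to be "cookie<TAB>ts<TAB>page" with an
    integer ts. Each output line repeats the input row and appends the
    dwell time: the next ts for the same cookie minus this row's ts,
    or \\N when the cookie has no following row.
    """
    prev = None  # (cookie, ts, page) of the row waiting for its successor
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue
        cookie, ts, page = line.split("\t")
        ts = int(ts)
        if prev is not None:
            p_cookie, p_ts, p_page = prev
            # Only subtract when the next row belongs to the same cookie.
            dwell = str(ts - p_ts) if p_cookie == cookie else r"\N"
            yield "\t".join((p_cookie, str(p_ts), p_page, dwell))
        prev = (cookie, ts, page)
    if prev is not None:
        # The final row never has a successor.
        yield "\t".join((prev[0], str(prev[1]), prev[2], r"\N"))


if __name__ == "__main__":
    for out in dwell_times(sys.stdin):
        print(out)
```

Saved as, say, dwell.py and registered with ADD FILE, it could be called with something like SELECT TRANSFORM(cookie, ts, page) USING 'python dwell.py' AS (cookie, ts, page, dwell) FROM (SELECT cookie, ts, page FROM weblog DISTRIBUTE BY cookie SORT BY cookie, ts) t. The table and column names there are hypothetical. The DISTRIBUTE BY sends all rows for a cookie to the same reducer, and the SORT BY is what makes "the next row" meaningful.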
Mark
Mark Grover, Business Intelligence Analyst
OANDA Corporation
www: oanda.com www: fxtrade.com
e: mgrover@oanda.com
"Best Trading Platform" - World Finance's Forex Awards 2009.
"The One to Watch" - Treasury Today's Adam Smith Awards 2009.
----- Original Message -----
From: "Edward Capriolo" <ed...@gmail.com>
To: user@hive.apache.org
Sent: Wednesday, January 11, 2012 12:02:08 PM
Subject: Re: Lag & Lead
See this for discussion.
https://issues.apache.org/jira/browse/HIVE-896
Re: Lag & Lead
Posted by Edward Capriolo <ed...@gmail.com>.
See this for discussion.
https://issues.apache.org/jira/browse/HIVE-896