Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2011/04/20 02:05:47 UTC

[Hadoop Wiki] Update of "Hive/LanguageManual/UDF" by PhiloVivero

The "Hive/LanguageManual/UDF" page has been changed by PhiloVivero.
The comment on this change is: How to reformulate aggregation functions using subqueries.
http://wiki.apache.org/hadoop/Hive/LanguageManual/UDF?action=diff&rev1=60&rev2=61

--------------------------------------------------

  from log a lateral view json_tuple(a.appevent, 'eventid', 'eventname') b as f1, f2;
  }}}
  
+ === GROUPing and SORTing on f(column) ===
+ 
+ If you would like to GROUP BY or SORT BY the result of a function applied to a column, referring to that result by its alias, like this:
+ 
+ {{{
+ select f(col) as fc, count(*) from table_name group by fc
+ }}}
+ 
+ You will get an error:
+ 
+ {{{
+ FAILED: Error in semantic analysis: line 1:86 Invalid Table Alias or Column Reference fc
+ }}}
+ 
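+ This happens because the alias fc, defined in the SELECT list, is not visible to the GROUP BY or SORT BY clause of the same query. However, you can reformulate the query with a subquery, so that the alias becomes an ordinary column of the subquery's result: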
+ 
+ {{{
+ select sq.fc, count(*) from (select f(col) as fc from table_name) sq group by sq.fc
+ }}}
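+ 
+ As a more concrete sketch of the same pattern, assume a hypothetical table page_views with a string column url (neither name comes from this page) and use the built-in parse_url function to count rows per host:
+ 
+ {{{
+ -- page_views and url are hypothetical names, used only for illustration
+ select sq.host, count(*) as views
+ from (
+   -- the alias "host" is defined here, inside the subquery
+   select parse_url(url, 'HOST') as host
+   from page_views
+ ) sq
+ group by sq.host
+ }}}
+ 
+ Because host is defined in the inner query, the outer query sees it as a plain column of sq and can use it in GROUP BY.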
+