Posted to dev@hive.apache.org by Alexander Pivovarov <ap...@gmail.com> on 2015/04/20 22:26:00 UTC

Herfindahl index UDAF

Hi everyone,

Do you think it is possible to create a UDAF to calculate the Herfindahl Index
(HHI)?
http://en.wikipedia.org/wiki/Herfindahl_index

Calculation:

v    ratio  ratio^2
100   0.1    0.01
100   0.1    0.01
300   0.3    0.09
100   0.1    0.01
200   0.2    0.04
200   0.2    0.04
------------------------
SUM
1000  1      0.2

HHI = 0.2
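
Written out as a formula, the calculation above looks as follows. The notation v_i for the i-th value of v is introduced here only for illustration. The second form follows because the total is the same constant in every term:

```latex
\mathrm{HHI} = \sum_{i} \left( \frac{v_i}{\sum_{j} v_j} \right)^{2}
             = \frac{\sum_{i} v_i^{2}}{\left( \sum_{j} v_j \right)^{2}}
```

For the six values in the table, the sum of squares is 200000 and the total is 1000, so 200000 / 1000^2 = 0.2.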

Currently I use the following SQL:
select sum(pow(t.v / t2.sum_v, 2))
from t
join (select sum(v) sum_v from t) t2


Is it possible to create a UDAF for it?
e.g.
select hhi(v) from t;

Let's assume the set of v values fits in reducer memory.
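
A rough sketch of how the aggregation state could look, under some assumptions. Since sum((v / sum_v)^2) equals sum(v^2) / sum_v^2, it may be enough to keep two running sums instead of the whole set of v. The class below is plain Java with no Hive classes, and the name HhiBuffer is hypothetical. Its methods only mirror the iterate/merge/terminate steps of a Hive UDAF evaluator, so it is not a working UDAF by itself. Skipping nulls and returning null for a zero total are also assumptions.

```java
// Hypothetical sketch: plain Java, not wired into Hive.
// Keeps only two running sums, relying on
//   HHI = sum(v^2) / (sum(v))^2
final class HhiBuffer {
    private double sum;        // running sum of v
    private double sumSquares; // running sum of v^2

    // Fold one input value into the buffer; null inputs are skipped (assumption).
    void iterate(Double v) {
        if (v == null) {
            return;
        }
        sum += v;
        sumSquares += v * v;
    }

    // Combine a partial aggregation produced elsewhere (e.g. by another task).
    void merge(HhiBuffer other) {
        if (other == null) {
            return;
        }
        sum += other.sum;
        sumSquares += other.sumSquares;
    }

    // Final result; null when the total is zero, since the ratio is undefined.
    Double terminate() {
        if (sum == 0.0) {
            return null;
        }
        return sumSquares / (sum * sum);
    }
}
```

With this shape, two buffers fed {100, 100, 300} and {100, 200, 200} and then merged would hold a total of 1000 and a sum of squares of 200000, which is the 0.2 from the table above.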