Posted to issues@impala.apache.org by "Gabor Kaszab (Jira)" <ji...@apache.org> on 2020/04/09 13:44:00 UTC
[jira] [Created] (IMPALA-9633) Implement ds_hll_union() builtin function
Gabor Kaszab created IMPALA-9633:
------------------------------------
Summary: Implement ds_hll_union() builtin function
Key: IMPALA-9633
URL: https://issues.apache.org/jira/browse/IMPALA-9633
Project: IMPALA
Issue Type: New Feature
Components: Backend, Frontend
Reporter: Gabor Kaszab
ds_hll_union() is an aggregate function that accepts sketches and produces a single sketch that is the union of the input sketches.
Example from Hive (the sketch_input table is defined in the test data further down):
{code:sql}
create temporary table sketch_intermediate (category char(1), sketch binary);
insert into sketch_intermediate select category, ds_hll_sketch(id) from sketch_input group by category;
select ds_hll_estimate(ds_hll_union(sketch)) from sketch_intermediate;
{code}
Some test data for the example:
{code:sql}
create temporary table sketch_input (id int, category char(1));
insert into table sketch_input values
(1, 'a'), (2, 'a'), (3, 'a'), (4, 'a'), (5, 'a'), (6, 'a'), (7, 'a'), (8, 'a'), (9, 'a'), (10, 'a'),
(6, 'b'), (7, 'b'), (8, 'b'), (9, 'b'), (10, 'b'), (11, 'b'), (12, 'b'), (13, 'b'), (14, 'b'), (15, 'b');
{code}
Approximate result:
{code:java}
15.000000521540663
{code}
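To illustrate why the union of the two per-category sketches estimates about 15 rather than 20, here is a minimal toy HyperLogLog in Python. It is only a sketch under stated assumptions: it is not the Apache DataSketches implementation that Hive uses, and the names hll_sketch, hll_union and hll_estimate, the precision P and the SHA-256 based hash are all hypothetical choices made for this illustration. The point it shows is that a union can be computed by taking the register-wise maximum of the input sketches, which yields the same registers as sketching all the rows at once.

```python
import hashlib
import math

P = 12          # assumed precision for this toy example: 2**P registers
M = 1 << P


def _hash64(value):
    # Hypothetical hash choice: the first 8 bytes of SHA-256 as a 64-bit integer.
    digest = hashlib.sha256(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def hll_sketch(values):
    """Build a toy HLL register array from an iterable of values."""
    registers = [0] * M
    for v in values:
        h = _hash64(v)
        idx = h >> (64 - P)                      # top P bits select the register
        rest = h & ((1 << (64 - P)) - 1)         # remaining 64 - P bits
        rank = (64 - P) - rest.bit_length() + 1  # position of the leftmost 1-bit
        registers[idx] = max(registers[idx], rank)
    return registers


def hll_union(sketches):
    """Merge sketches by taking the register-wise maximum."""
    merged = [0] * M
    for regs in sketches:
        merged = [max(a, b) for a, b in zip(merged, regs)]
    return merged


def hll_estimate(registers):
    """Standard HLL estimate with the small-range (linear counting) correction."""
    alpha = 0.7213 / (1 + 1.079 / M)
    raw = alpha * M * M / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    if raw <= 2.5 * M and zeros:
        return M * math.log(M / zeros)
    return raw


# Same data as the test table: category 'a' has ids 1..10, 'b' has ids 6..15.
sketch_a = hll_sketch(range(1, 11))
sketch_b = hll_sketch(range(6, 16))
merged = hll_union([sketch_a, sketch_b])
print(hll_estimate(merged))
```

Ids 6 to 10 appear in both categories, so the true number of distinct ids is 15. Because the merge is a register-wise maximum, the overlapping ids do not inflate the estimate.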