Posted to issues-all@impala.apache.org by "Gabor Kaszab (Jira)" <ji...@apache.org> on 2020/06/02 20:59:00 UTC

[jira] [Work started] (IMPALA-9633) Implement ds_hll_union() builtin function

     [ https://issues.apache.org/jira/browse/IMPALA-9633?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on IMPALA-9633 started by Gabor Kaszab.
--------------------------------------------
> Implement ds_hll_union() builtin function
> -----------------------------------------
>
>                 Key: IMPALA-9633
>                 URL: https://issues.apache.org/jira/browse/IMPALA-9633
>             Project: IMPALA
>          Issue Type: New Feature
>          Components: Backend, Frontend
>            Reporter: Gabor Kaszab
>            Assignee: Gabor Kaszab
>            Priority: Major
>
> ds_hll_union() is an aggregate function that accepts sketches and produces a single sketch that is the combination of the received sketches.
> Example from Hive:
> {code:sql}
> create temporary table sketch_intermediate (category char(1), sketch binary);
> insert into sketch_intermediate select category, ds_hll_sketch(id) from sketch_input group by category;
> select ds_hll_estimate(ds_hll_union(sketch)) from sketch_intermediate;
> {code}
> Some test data for the example:
> {code:sql}
> create temporary table sketch_input (id int, category char(1));
> insert into table sketch_input values
>   (1, 'a'), (2, 'a'), (3, 'a'), (4, 'a'), (5, 'a'), (6, 'a'), (7, 'a'), (8, 'a'), (9, 'a'), (10, 'a'),
>   (6, 'b'), (7, 'b'), (8, 'b'), (9, 'b'), (10, 'b'), (11, 'b'), (12, 'b'), (13, 'b'), (14, 'b'), (15, 'b');
> {code}
> Approximate result:
> {code:java}
> 15.000000521540663
> {code}
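> As a sanity check only (not part of the proposed change), the exact distinct count can be computed directly from the sketch_input test data, without sketches. Category 'a' holds ids 1 to 10 and category 'b' holds ids 6 to 15, so the two sets overlap on ids 6 to 10 and the exact answer is 15, which is the value the union estimate approximates:
> {code:sql}
> -- Exact number of distinct ids across both categories in the test data.
> -- Ids 6..10 appear in both 'a' and 'b', so this returns 15, not 20.
> select count(distinct id) from sketch_input;
> {code}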
> Hive change that introduced the same function: https://issues.apache.org/jira/browse/HIVE-22940


