Posted to issues@impala.apache.org by "Fucun Chu (Jira)" <ji...@apache.org> on 2021/02/19 15:10:00 UTC

[jira] [Created] (IMPALA-10520) Implement ds_theta_intersect() builtin function

Fucun Chu created IMPALA-10520:
----------------------------------

             Summary: Implement ds_theta_intersect() builtin function
                 Key: IMPALA-10520
                 URL: https://issues.apache.org/jira/browse/IMPALA-10520
             Project: IMPALA
          Issue Type: New Feature
          Components: Backend, Frontend
            Reporter: Fucun Chu
             Fix For: Impala 4.0


ds_theta_intersect() is an aggregate function that accepts Theta sketches, one per input row, and produces a single sketch that is the intersection of all the sketches it received.

Example from Hive (it reads from the sketch_input table defined in the test data further down):
{code:java}
create temporary table sketch_intermediate (category char(1), sketch binary);
insert into sketch_intermediate select category, ds_theta_sketch(id) from sketch_input group by category;
select ds_theta_estimate(ds_theta_intersect(sketch)) from sketch_intermediate;{code}
Some test data for the example:
{code:java}
create temporary table sketch_input (id int, category char(1));
insert into table sketch_input values
 (1, 'a'), (2, 'a'), (3, 'a'), (4, 'a'), (5, 'a'), (6, 'a'), (7, 'a'), (8, 'a'), (9, 'a'), (10, 'a'),
 (6, 'b'), (7, 'b'), (8, 'b'), (9, 'b'), (10, 'b'), (11, 'b'), (12, 'b'), (13, 'b'), (14, 'b'), (15, 'b');{code}
Approximate result (ids 6 through 10 appear in both categories):
{code:java}
5.0{code}
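
To illustrate why the estimate comes out at 5.0 for this input, here is a minimal Python model of a theta-style sketch. It is only a sketch of the general idea, not the DataSketches or Impala implementation: the names hash01, make_sketch, intersect and estimate, the SHA-256 based hash, and the value of K are all assumptions made for this example. Each sketch keeps the hashes that fall below a threshold theta, the intersection keeps the hashes common to every input sketch under the smallest theta, and the estimate is the retained count divided by theta.

```python
import hashlib

K = 4096  # assumed nominal number of retained entries; not taken from the issue


def hash01(value):
    """Map a value to a roughly uniform float in the unit interval."""
    digest = hashlib.sha256(str(value).encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64


def make_sketch(values, k=K):
    """Return (theta, retained_hashes) for the distinct input values."""
    hashes = sorted({hash01(v) for v in values})
    if len(hashes) <= k:
        # Few distinct values: nothing is discarded, so theta stays at 1.0.
        return 1.0, set(hashes)
    # Otherwise keep the k smallest hashes; theta is the next larger hash.
    return hashes[k], set(hashes[:k])


def intersect(sketches):
    """Intersect sketches: smallest theta, hashes present in every sketch."""
    theta = min(t for t, _ in sketches)
    common = set.intersection(*(h for _, h in sketches))
    return theta, {h for h in common if h < theta}


def estimate(sketch):
    """Estimated distinct count: retained hashes scaled by 1 / theta."""
    theta, hashes = sketch
    return len(hashes) / theta


# Same data as the test table: category 'a' holds 1..10, 'b' holds 6..15.
sketch_a = make_sketch(range(1, 11))
sketch_b = make_sketch(range(6, 16))
result = intersect([sketch_a, sketch_b])
print(estimate(result))
```

With only ten distinct ids per category, both model sketches retain every hash and theta stays at 1.0, so the intersection is effectively exact. On larger inputs theta drops below 1.0 and the result becomes a true approximation.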


