You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@impala.apache.org by "Fucun Chu (Code Review)" <ge...@cloudera.org> on 2021/03/18 15:50:52 UTC

[Impala-ASF-CR] IMPALA-10581: Implement ds theta intersect f() function

Fucun Chu has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17186


Change subject: IMPALA-10581: Implement ds_theta_intersect_f() function
......................................................................

IMPALA-10581: Implement ds_theta_intersect_f() function

This function receives two strings that are serialized Apache
DataSketches Theta sketches. Computes the intersection of two sketches
of same or different column and returns the resulting sketch of
intersection.

Example:
select ds_theta_estimate(ds_theta_intersect_f(sketch1, sketch2))
from sketch_tbl;
+-----------------------------------------------------------+
| ds_theta_estimate(ds_theta_intersect_f(sketch1, sketch2)) |
+-----------------------------------------------------------+
| 5                                                         |
+-----------------------------------------------------------+

Change-Id: I335eada00730036d5433775cfe673e0e4babaa01
---
M be/src/exprs/datasketches-functions-ir.cc
M be/src/exprs/datasketches-functions.h
M common/function-registry/impala_functions.py
M testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test
4 files changed, 119 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/17186/2
-- 
To view, visit http://gerrit.cloudera.org:8080/17186
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I335eada00730036d5433775cfe673e0e4babaa01
Gerrit-Change-Number: 17186
Gerrit-PatchSet: 2
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-10581: Implement ds theta intersect f() function

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17186 )

Change subject: IMPALA-10581: Implement ds_theta_intersect_f() function
......................................................................


Patch Set 2:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/8393/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17186
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I335eada00730036d5433775cfe673e0e4babaa01
Gerrit-Change-Number: 17186
Gerrit-PatchSet: 2
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Thu, 18 Mar 2021 16:12:38 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10581: Implement ds theta intersect f() function

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17186 )

Change subject: IMPALA-10581: Implement ds_theta_intersect_f() function
......................................................................


Patch Set 6: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/17186
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I335eada00730036d5433775cfe673e0e4babaa01
Gerrit-Change-Number: 17186
Gerrit-PatchSet: 6
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Mon, 29 Mar 2021 10:13:43 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10581: Implement ds theta intersect f() function

Posted by "Fucun Chu (Code Review)" <ge...@cloudera.org>.
Fucun Chu has posted comments on this change. ( http://gerrit.cloudera.org:8080/17186 )

Change subject: IMPALA-10581: Implement ds_theta_intersect_f() function
......................................................................


Patch Set 5:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/17186/4/be/src/exprs/datasketches-functions-ir.cc
File be/src/exprs/datasketches-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/17186/4/be/src/exprs/datasketches-functions-ir.cc@194
PS4, Line 194:   datasketches::compact_theta_sketch sketch = intersection_sketch.get_result();
> Please add more comment about the use cases when this could return false. a
Done


http://gerrit.cloudera.org:8080/#/c/17186/4/be/src/exprs/datasketches-functions-ir.cc@195
PS4, Line 195: riali
> typo: theta
Done


http://gerrit.cloudera.org:8080/#/c/17186/4/be/src/exprs/datasketches-functions-ir.cc@223
PS4, Line 223:   if (serialized_sketch.is_null || serialized_sketch.len == 0) return BigIntVal::null();
> This comment is not needed
Done


http://gerrit.cloudera.org:8080/#/c/17186/4/testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test
File testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test:

http://gerrit.cloudera.org:8080/#/c/17186/4/testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test@560
PS4, Line 560: 0
> I miss 2 tests here:
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/17186
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I335eada00730036d5433775cfe673e0e4babaa01
Gerrit-Change-Number: 17186
Gerrit-PatchSet: 5
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Thu, 25 Mar 2021 15:19:17 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-10581: Implement ds theta intersect f() function

Posted by "Gabor Kaszab (Code Review)" <ge...@cloudera.org>.
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/17186 )

Change subject: IMPALA-10581: Implement ds_theta_intersect_f() function
......................................................................


Patch Set 5: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/17186
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I335eada00730036d5433775cfe673e0e4babaa01
Gerrit-Change-Number: 17186
Gerrit-PatchSet: 5
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Mon, 29 Mar 2021 10:13:07 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10581: Implement ds theta intersect f() function

Posted by "Fucun Chu (Code Review)" <ge...@cloudera.org>.
Fucun Chu has uploaded a new patch set (#4). ( http://gerrit.cloudera.org:8080/17186 )

Change subject: IMPALA-10581: Implement ds_theta_intersect_f() function
......................................................................

IMPALA-10581: Implement ds_theta_intersect_f() function

This function receives two strings that are serialized Apache
DataSketches Theta sketches. Computes the intersection of two sketches
of same or different column and returns the resulting sketch of
intersection.

Example:
select ds_theta_estimate(ds_theta_intersect_f(sketch1, sketch2))
from sketch_tbl;
+-----------------------------------------------------------+
| ds_theta_estimate(ds_theta_intersect_f(sketch1, sketch2)) |
+-----------------------------------------------------------+
| 5                                                         |
+-----------------------------------------------------------+

Change-Id: I335eada00730036d5433775cfe673e0e4babaa01
---
M be/src/exprs/datasketches-functions-ir.cc
M be/src/exprs/datasketches-functions.h
M common/function-registry/impala_functions.py
M testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test
4 files changed, 123 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/17186/4
-- 
To view, visit http://gerrit.cloudera.org:8080/17186
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I335eada00730036d5433775cfe673e0e4babaa01
Gerrit-Change-Number: 17186
Gerrit-PatchSet: 4
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-10581: Implement ds theta intersect f() function

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17186 )

Change subject: IMPALA-10581: Implement ds_theta_intersect_f() function
......................................................................


Patch Set 4:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/8431/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17186
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I335eada00730036d5433775cfe673e0e4babaa01
Gerrit-Change-Number: 17186
Gerrit-PatchSet: 4
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Wed, 24 Mar 2021 12:00:27 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10581: Implement ds theta intersect f() function

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17186 )

Change subject: IMPALA-10581: Implement ds_theta_intersect_f() function
......................................................................

IMPALA-10581: Implement ds_theta_intersect_f() function

This function receives two strings that are serialized Apache
DataSketches Theta sketches. Computes the intersection of two sketches
of same or different column and returns the resulting sketch of
intersection.

Example:
select ds_theta_estimate(ds_theta_intersect_f(sketch1, sketch2))
from sketch_tbl;
+-----------------------------------------------------------+
| ds_theta_estimate(ds_theta_intersect_f(sketch1, sketch2)) |
+-----------------------------------------------------------+
| 5                                                         |
+-----------------------------------------------------------+

Change-Id: I335eada00730036d5433775cfe673e0e4babaa01
Reviewed-on: http://gerrit.cloudera.org:8080/17186
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M be/src/exprs/datasketches-common.cc
M be/src/exprs/datasketches-common.h
M be/src/exprs/datasketches-functions-ir.cc
M be/src/exprs/datasketches-functions.h
M common/function-registry/impala_functions.py
M testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test
6 files changed, 157 insertions(+), 13 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/17186
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I335eada00730036d5433775cfe673e0e4babaa01
Gerrit-Change-Number: 17186
Gerrit-PatchSet: 7
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-10581: Implement ds theta intersect f() function

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17186 )

Change subject: IMPALA-10581: Implement ds_theta_intersect_f() function
......................................................................


Patch Set 6: Verified+1


-- 
To view, visit http://gerrit.cloudera.org:8080/17186
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I335eada00730036d5433775cfe673e0e4babaa01
Gerrit-Change-Number: 17186
Gerrit-PatchSet: 6
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Mon, 29 Mar 2021 15:59:48 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10581: Implement ds theta intersect f() function

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17186 )

Change subject: IMPALA-10581: Implement ds_theta_intersect_f() function
......................................................................


Patch Set 6:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/7017/ DRY_RUN=false


-- 
To view, visit http://gerrit.cloudera.org:8080/17186
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I335eada00730036d5433775cfe673e0e4babaa01
Gerrit-Change-Number: 17186
Gerrit-PatchSet: 6
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Mon, 29 Mar 2021 10:13:44 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10581: Implement ds theta intersect f() function

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/17186 )

Change subject: IMPALA-10581: Implement ds_theta_intersect_f() function
......................................................................


Patch Set 5:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/8445/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.


-- 
To view, visit http://gerrit.cloudera.org:8080/17186
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I335eada00730036d5433775cfe673e0e4babaa01
Gerrit-Change-Number: 17186
Gerrit-PatchSet: 5
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Thu, 25 Mar 2021 15:38:57 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-10581: Implement ds theta intersect f() function

Posted by "Fucun Chu (Code Review)" <ge...@cloudera.org>.
Fucun Chu has uploaded a new patch set (#5). ( http://gerrit.cloudera.org:8080/17186 )

Change subject: IMPALA-10581: Implement ds_theta_intersect_f() function
......................................................................

IMPALA-10581: Implement ds_theta_intersect_f() function

This function receives two strings that are serialized Apache
DataSketches Theta sketches. Computes the intersection of two sketches
of same or different column and returns the resulting sketch of
intersection.

Example:
select ds_theta_estimate(ds_theta_intersect_f(sketch1, sketch2))
from sketch_tbl;
+-----------------------------------------------------------+
| ds_theta_estimate(ds_theta_intersect_f(sketch1, sketch2)) |
+-----------------------------------------------------------+
| 5                                                         |
+-----------------------------------------------------------+

Change-Id: I335eada00730036d5433775cfe673e0e4babaa01
---
M be/src/exprs/datasketches-common.cc
M be/src/exprs/datasketches-common.h
M be/src/exprs/datasketches-functions-ir.cc
M be/src/exprs/datasketches-functions.h
M common/function-registry/impala_functions.py
M testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test
6 files changed, 157 insertions(+), 13 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/86/17186/5
-- 
To view, visit http://gerrit.cloudera.org:8080/17186
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I335eada00730036d5433775cfe673e0e4babaa01
Gerrit-Change-Number: 17186
Gerrit-PatchSet: 5
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>

[Impala-ASF-CR] IMPALA-10581: Implement ds theta intersect f() function

Posted by "Gabor Kaszab (Code Review)" <ge...@cloudera.org>.
Gabor Kaszab has posted comments on this change. ( http://gerrit.cloudera.org:8080/17186 )

Change subject: IMPALA-10581: Implement ds_theta_intersect_f() function
......................................................................


Patch Set 4:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/17186/4/be/src/exprs/datasketches-functions-ir.cc
File be/src/exprs/datasketches-functions-ir.cc:

http://gerrit.cloudera.org:8080/#/c/17186/4/be/src/exprs/datasketches-functions-ir.cc@194
PS4, Line 194: /// Returns true when intersection_sketch is updated
Please add more comment about the use cases when this could return false. and also please mention that this might write error log as well.

Additionally this could also be added to the header file as well in my opinion.


http://gerrit.cloudera.org:8080/#/c/17186/4/be/src/exprs/datasketches-functions-ir.cc@195
PS4, Line 195: thera
typo: theta


http://gerrit.cloudera.org:8080/#/c/17186/4/be/src/exprs/datasketches-functions-ir.cc@223
PS4, Line 223:   //  Result
This comment is not needed


http://gerrit.cloudera.org:8080/#/c/17186/4/testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test
File testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test:

http://gerrit.cloudera.org:8080/#/c/17186/4/testdata/workloads/functional-query/queries/QueryTest/datasketches-theta.test@560
PS4, Line 560: ====
I miss 2 tests here:
1: Where the non-empty input sketches are distinct and the result is empyt.
2: When the inputs aren't the same but has some (but not all) items common.

Could you please add coverage for these as well?



-- 
To view, visit http://gerrit.cloudera.org:8080/17186
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I335eada00730036d5433775cfe673e0e4babaa01
Gerrit-Change-Number: 17186
Gerrit-PatchSet: 4
Gerrit-Owner: Fucun Chu <ch...@hotmail.com>
Gerrit-Reviewer: Gabor Kaszab <ga...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Comment-Date: Thu, 25 Mar 2021 09:16:34 +0000
Gerrit-HasComments: Yes