You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Bankim Bhavsar (Code Review)" <ge...@cloudera.org> on 2020/03/05 21:19:26 UTC

[kudu-CR] [util] Import "Or" function to BlockBloomFilter from Impala

Bankim Bhavsar has uploaded this change for review. ( http://gerrit.cloudera.org:8080/15373


Change subject: [util] Import "Or" function to BlockBloomFilter from Impala
......................................................................

[util] Import "Or" function to BlockBloomFilter from Impala

Impala will be switching to using the Block Bloom filter from kudu-util.
"Or" function was missing and this change adds it.

Note that original implementation for OrEqualArrayAvx() in Impala is
targeted for AVX and not AVX2, however AVX2 is super-set of AVX instructions
and there is already provision in the Block Bloom filter to separate out
AVX2 v/s non-AVX2 (SSE) code. Hence don't see need to add separate AVX
specific file/implementation.

Change-Id: Ibe5f9311f73dcff883dd2cce18fd558e7d57d14f
---
M src/kudu/util/block_bloom_filter-test.cc
M src/kudu/util/block_bloom_filter.cc
M src/kudu/util/block_bloom_filter.h
M src/kudu/util/block_bloom_filter_avx2.cc
4 files changed, 101 insertions(+), 1 deletion(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/73/15373/1
-- 
To view, visit http://gerrit.cloudera.org:8080/15373
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ibe5f9311f73dcff883dd2cce18fd558e7d57d14f
Gerrit-Change-Number: 15373
Gerrit-PatchSet: 1
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>

[kudu-CR] [util] Import "Or" function to BlockBloomFilter from Impala

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/15373 )

Change subject: [util] Import "Or" function to BlockBloomFilter from Impala
......................................................................


Patch Set 3: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.h
File src/kudu/util/block_bloom_filter.h:

http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.h@153
PS1, Line 153: te
> I implemented operator |= after your feedback. However after incorporating 
Ah, I see.  That makes sense to me.


http://gerrit.cloudera.org:8080/#/c/15373/3/src/kudu/util/block_bloom_filter.cc
File src/kudu/util/block_bloom_filter.cc:

http://gerrit.cloudera.org:8080/#/c/15373/3/src/kudu/util/block_bloom_filter.cc@296
PS3, Line 296: CHECK_NE
BTW, why not to use the same approach as with difference in directory size here, i.e. return Status::InvalidArgument() in this case well?



-- 
To view, visit http://gerrit.cloudera.org:8080/15373
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibe5f9311f73dcff883dd2cce18fd558e7d57d14f
Gerrit-Change-Number: 15373
Gerrit-PatchSet: 3
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Tue, 10 Mar 2020 04:03:53 +0000
Gerrit-HasComments: Yes

[kudu-CR] [util] Import "Or" function to BlockBloomFilter from Impala

Posted by "Bankim Bhavsar (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, Adar Dembo, Tim Armstrong, Wenzhe Zhou, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/15373

to look at the new patch set (#2).

Change subject: [util] Import "Or" function to BlockBloomFilter from Impala
......................................................................

[util] Import "Or" function to BlockBloomFilter from Impala

Impala will be switching to using the Block Bloom filter from kudu-util.
"Or" function was missing and this change adds it.

Note that original implementation for OrEqualArrayAvx() in Impala is
targeted for AVX and not AVX2, however AVX2 is super-set of AVX instructions
and there is already provision in the Block Bloom filter to separate out
AVX2 v/s non-AVX2 (SSE) code. Hence don't see need to add separate AVX
specific file/implementation.

Change-Id: Ibe5f9311f73dcff883dd2cce18fd558e7d57d14f
---
M src/kudu/util/block_bloom_filter-test.cc
M src/kudu/util/block_bloom_filter.cc
M src/kudu/util/block_bloom_filter.h
M src/kudu/util/block_bloom_filter_avx2.cc
4 files changed, 132 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/73/15373/2
-- 
To view, visit http://gerrit.cloudera.org:8080/15373
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibe5f9311f73dcff883dd2cce18fd558e7d57d14f
Gerrit-Change-Number: 15373
Gerrit-PatchSet: 2
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>

[kudu-CR] [util] Import "Or" function to BlockBloomFilter from Impala

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/15373 )

Change subject: [util] Import "Or" function to BlockBloomFilter from Impala
......................................................................

[util] Import "Or" function to BlockBloomFilter from Impala

Impala will be switching to using the Block Bloom filter from kudu-util.
"Or" function was missing and this change adds it.

Note that original implementation for OrEqualArrayAvx() in Impala is
targeted for AVX and not AVX2, however AVX2 is super-set of AVX instructions
and there is already provision in the Block Bloom filter to separate out
AVX2 v/s non-AVX2 (SSE) code. Hence don't see need to add separate AVX
specific file/implementation.

Change-Id: Ibe5f9311f73dcff883dd2cce18fd558e7d57d14f
Reviewed-on: http://gerrit.cloudera.org:8080/15373
Tested-by: Kudu Jenkins
Reviewed-by: Adar Dembo <ad...@cloudera.com>
Reviewed-by: Alexey Serbin <as...@cloudera.com>
---
M src/kudu/util/block_bloom_filter-test.cc
M src/kudu/util/block_bloom_filter.cc
M src/kudu/util/block_bloom_filter.h
M src/kudu/util/block_bloom_filter_avx2.cc
4 files changed, 134 insertions(+), 3 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Adar Dembo: Looks good to me, approved
  Alexey Serbin: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/15373
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ibe5f9311f73dcff883dd2cce18fd558e7d57d14f
Gerrit-Change-Number: 15373
Gerrit-PatchSet: 4
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>

[kudu-CR] [util] Import "Or" function to BlockBloomFilter from Impala

Posted by "Bankim Bhavsar (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, Adar Dembo, Tim Armstrong, Wenzhe Zhou, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/15373

to look at the new patch set (#3).

Change subject: [util] Import "Or" function to BlockBloomFilter from Impala
......................................................................

[util] Import "Or" function to BlockBloomFilter from Impala

Impala will be switching to using the Block Bloom filter from kudu-util.
"Or" function was missing and this change adds it.

Note that original implementation for OrEqualArrayAvx() in Impala is
targeted for AVX and not AVX2, however AVX2 is super-set of AVX instructions
and there is already provision in the Block Bloom filter to separate out
AVX2 v/s non-AVX2 (SSE) code. Hence don't see need to add separate AVX
specific file/implementation.

Change-Id: Ibe5f9311f73dcff883dd2cce18fd558e7d57d14f
---
M src/kudu/util/block_bloom_filter-test.cc
M src/kudu/util/block_bloom_filter.cc
M src/kudu/util/block_bloom_filter.h
M src/kudu/util/block_bloom_filter_avx2.cc
4 files changed, 134 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/73/15373/3
-- 
To view, visit http://gerrit.cloudera.org:8080/15373
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibe5f9311f73dcff883dd2cce18fd558e7d57d14f
Gerrit-Change-Number: 15373
Gerrit-PatchSet: 3
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>

[kudu-CR] [util] Import "Or" function to BlockBloomFilter from Impala

Posted by "Bankim Bhavsar (Code Review)" <ge...@cloudera.org>.
Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15373 )

Change subject: [util] Import "Or" function to BlockBloomFilter from Impala
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15373/3/src/kudu/util/block_bloom_filter.cc
File src/kudu/util/block_bloom_filter.cc:

http://gerrit.cloudera.org:8080/#/c/15373/3/src/kudu/util/block_bloom_filter.cc@296
PS3, Line 296: CHECK_NE
> BTW, why not to use the same approach as with difference in directory size 
For the reasons mentioned in the comment above.
  // Moreover for a reference "other" to be an AlwaysTrueFilter the reference needs
  // to be created from a nullptr and so we get into undefined behavior territory.
  // Comparing AlwaysTrueFilter with "&other" results in a compiler warning for
  // comparing a non-null argument "other" with NULL [-Wnonnull-compare].



-- 
To view, visit http://gerrit.cloudera.org:8080/15373
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibe5f9311f73dcff883dd2cce18fd558e7d57d14f
Gerrit-Change-Number: 15373
Gerrit-PatchSet: 3
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Tue, 10 Mar 2020 17:51:35 +0000
Gerrit-HasComments: Yes

[kudu-CR] [util] Import "Or" function to BlockBloomFilter from Impala

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/15373 )

Change subject: [util] Import "Or" function to BlockBloomFilter from Impala
......................................................................


Patch Set 1:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter-test.cc
File src/kudu/util/block_bloom_filter-test.cc:

http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter-test.cc@304
PS1, Line 304:   for (int i = 11; i < 50; ++i) ASSERT_TRUE(bf1->Find(i));
if this fails, how to know what 'i' it was?


http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.h
File src/kudu/util/block_bloom_filter.h:

http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.h@153
PS1, Line 153: Or
For more elegant syntax alternative, maybe introduce
  
  BlockBloomFilter& operator |=(const BlockBloomFilter& other)

as well?


http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.h@229
PS1, Line 229: AVX
AVX2 ?


http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.cc
File src/kudu/util/block_bloom_filter.cc:

http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.cc@287
PS1, Line 287: DCHECK_NE(&other, kAlwaysTrueFilter);
Why is this disallowed?  Could you add a short explanation?


http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.cc@291
PS1, Line 291: DCHECK_EQ
BTW, is this Or() method protected from inputs from the wild that might not come under radar in DEBUG case?  In other words, why DCHECK_EQ() is enough and CHECK_EQ() is not needed here?


http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter_avx2.cc
File src/kudu/util/block_bloom_filter_avx2.cc:

http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter_avx2.cc@86
PS1, Line 86: DCHECK_EQ
Is it guaranteed that debug builds provide enough coverage to make sure n % kAVXRegisterBytes == 0 in release builds?



-- 
To view, visit http://gerrit.cloudera.org:8080/15373
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibe5f9311f73dcff883dd2cce18fd558e7d57d14f
Gerrit-Change-Number: 15373
Gerrit-PatchSet: 1
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Fri, 06 Mar 2020 07:28:08 +0000
Gerrit-HasComments: Yes

[kudu-CR] [util] Import "Or" function to BlockBloomFilter from Impala

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/15373 )

Change subject: [util] Import "Or" function to BlockBloomFilter from Impala
......................................................................


Patch Set 3: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/15373
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibe5f9311f73dcff883dd2cce18fd558e7d57d14f
Gerrit-Change-Number: 15373
Gerrit-PatchSet: 3
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Tue, 10 Mar 2020 06:17:09 +0000
Gerrit-HasComments: No

[kudu-CR] [util] Import "Or" function to BlockBloomFilter from Impala

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/15373 )

Change subject: [util] Import "Or" function to BlockBloomFilter from Impala
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.cc
File src/kudu/util/block_bloom_filter.cc:

http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.cc@50
PS1, Line 50: constexpr BlockBloomFilter* const BlockBloomFilter::kAlwaysTrueFilter;
> Since the variable is constexpr it needs to be initialized in the class dec
Could you add a comment around the declaration (and use of null) to explain that the value doesn't matter?



-- 
To view, visit http://gerrit.cloudera.org:8080/15373
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibe5f9311f73dcff883dd2cce18fd558e7d57d14f
Gerrit-Change-Number: 15373
Gerrit-PatchSet: 2
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Mon, 09 Mar 2020 22:51:25 +0000
Gerrit-HasComments: Yes

[kudu-CR] [util] Import "Or" function to BlockBloomFilter from Impala

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/15373 )

Change subject: [util] Import "Or" function to BlockBloomFilter from Impala
......................................................................


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.h
File src/kudu/util/block_bloom_filter.h:

http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.h@161
PS1, Line 161:   static constexpr BlockBloomFilter* const kAlwaysTrueFilter = nullptr;
Does this need to be public? Why?


http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.cc
File src/kudu/util/block_bloom_filter.cc:

http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.cc@50
PS1, Line 50: constexpr BlockBloomFilter* const BlockBloomFilter::kAlwaysTrueFilter;
Does this need to be initialized to something meaningful? In the header it's null; is that intentional?


http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.cc@292
PS1, Line 292: reinterpret_cast<uint8_t*>(other.directory_)
Nit: push this to the next line so the symmetry with the second reinterpret_cast is more obvious?



-- 
To view, visit http://gerrit.cloudera.org:8080/15373
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibe5f9311f73dcff883dd2cce18fd558e7d57d14f
Gerrit-Change-Number: 15373
Gerrit-PatchSet: 1
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Thu, 05 Mar 2020 23:31:56 +0000
Gerrit-HasComments: Yes

[kudu-CR] [util] Import "Or" function to BlockBloomFilter from Impala

Posted by "Bankim Bhavsar (Code Review)" <ge...@cloudera.org>.
Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15373 )

Change subject: [util] Import "Or" function to BlockBloomFilter from Impala
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.cc
File src/kudu/util/block_bloom_filter.cc:

http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.cc@50
PS1, Line 50: constexpr BlockBloomFilter* const BlockBloomFilter::kAlwaysTrueFilter;
> Could you add a comment around the declaration (and use of null) to explain
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/15373
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibe5f9311f73dcff883dd2cce18fd558e7d57d14f
Gerrit-Change-Number: 15373
Gerrit-PatchSet: 2
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Mon, 09 Mar 2020 23:25:54 +0000
Gerrit-HasComments: Yes

[kudu-CR] [util] Import "Or" function to BlockBloomFilter from Impala

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/15373 )

Change subject: [util] Import "Or" function to BlockBloomFilter from Impala
......................................................................


Patch Set 3: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/15373/3/src/kudu/util/block_bloom_filter.cc
File src/kudu/util/block_bloom_filter.cc:

http://gerrit.cloudera.org:8080/#/c/15373/3/src/kudu/util/block_bloom_filter.cc@296
PS3, Line 296: CHECK_NE
> For the reasons mentioned in the comment above.
Ah, I thought that part was for the code below when it gets to if (this == &other), etc.

Basically, that's about passing a reference to a null object, which is not possible unless it's artificially crafted.

Thank you for the clarification!



-- 
To view, visit http://gerrit.cloudera.org:8080/15373
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibe5f9311f73dcff883dd2cce18fd558e7d57d14f
Gerrit-Change-Number: 15373
Gerrit-PatchSet: 3
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Tue, 10 Mar 2020 19:52:04 +0000
Gerrit-HasComments: Yes

[kudu-CR] [util] Import "Or" function to BlockBloomFilter from Impala

Posted by "Bankim Bhavsar (Code Review)" <ge...@cloudera.org>.
Bankim Bhavsar has posted comments on this change. ( http://gerrit.cloudera.org:8080/15373 )

Change subject: [util] Import "Or" function to BlockBloomFilter from Impala
......................................................................


Patch Set 1:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter-test.cc
File src/kudu/util/block_bloom_filter-test.cc:

http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter-test.cc@304
PS1, Line 304:   for (int i = 11; i < 50; ++i) ASSERT_TRUE(bf1->Find(i));
> if this fails, how to know what 'i' it was?
Good catch. This one was missed.


http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.h
File src/kudu/util/block_bloom_filter.h:

http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.h@153
PS1, Line 153: Or
> For more elegant syntax alternative, maybe introduce
I implemented operator |= after your feedback. However after incorporating other feedback related to invalid input arguments, changed return datatype of Or() function from void to Status. So implementing operator |= won't be possible unless we throw exception for invalid input case. Hence not implementing operator |=.


http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.h@161
PS1, Line 161:   static constexpr BlockBloomFilter* const kAlwaysTrueFilter = nullptr;
> Does this need to be public? Why?
Yes.
Impala uses special case of an always true Bloom filter.
Some examples:
https://github.com/apache/impala/blob/master/be/src/exec/partitioned-hash-join-builder.cc#L805
https://github.com/apache/impala/blob/master/be/src/runtime/runtime-filter-bank.cc#L340


http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.h@229
PS1, Line 229: AVX
> AVX2 ?
Done


http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.cc
File src/kudu/util/block_bloom_filter.cc:

http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.cc@50
PS1, Line 50: constexpr BlockBloomFilter* const BlockBloomFilter::kAlwaysTrueFilter;
> Does this need to be initialized to something meaningful? In the header it'
Since the variable is constexpr it needs to be initialized in the class declaration in the header and since it's a static it needs to be defined here in the .cc file.
This puzzled me as well but not initializing to nullptr in the class declaration led to a warning and initializing to nullptr in .cc file also led to a warning.
Eventually I just followed the code in Impala and no more warnings :).

https://github.com/apache/impala/blob/master/be/src/util/bloom-filter.h#L102
https://github.com/apache/impala/blob/master/be/src/util/bloom-filter.cc#L41


http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.cc@287
PS1, Line 287: DCHECK_NE(&other, kAlwaysTrueFilter);
> Why is this disallowed?  Could you add a short explanation?
Done


http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter.cc@291
PS1, Line 291: DCHECK_EQ
> BTW, is this Or() method protected from inputs from the wild that might not
My bad, I copied the Impala implementation without giving more thought. I've changed the return type Status to return back an error in such cases.


http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter_avx2.cc
File src/kudu/util/block_bloom_filter_avx2.cc:

http://gerrit.cloudera.org:8080/#/c/15373/1/src/kudu/util/block_bloom_filter_avx2.cc@86
PS1, Line 86: DCHECK_EQ
> Is it guaranteed that debug builds provide enough coverage to make sure n %
Yes, checking in debug provides enough coverage. This is checking that the size of the directory structure is multiple of 32 bytes.
Directory consists of multiple buckets (at least 1) where each bucket comprises of 8 BucketWords and each BucketWord is 4 bytes.
https://github.com/apache/kudu/blob/master/src/kudu/util/block_bloom_filter.h#L157



-- 
To view, visit http://gerrit.cloudera.org:8080/15373
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibe5f9311f73dcff883dd2cce18fd558e7d57d14f
Gerrit-Change-Number: 15373
Gerrit-PatchSet: 1
Gerrit-Owner: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Bankim Bhavsar <ba...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Wenzhe Zhou <wz...@cloudera.com>
Gerrit-Comment-Date: Mon, 09 Mar 2020 22:47:42 +0000
Gerrit-HasComments: Yes