You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Mahesh Reddy (Code Review)" <ge...@cloudera.org> on 2021/02/20 03:05:13 UTC

[kudu-CR] KUDU-2671: Adds compatibility for per range hash schems with unbounded ranges.

Mahesh Reddy has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17090


Change subject: KUDU-2671: Adds compatibility for per range hash schems with unbounded ranges.
......................................................................

KUDU-2671: Adds compatibility for per range hash schems with unbounded ranges.

This patch updates the logic at the end of PartitionSchema::CreatePartiitons()
to allow per range hash schemas to be compatible with unbounded ranges. Some
additional context about the block of code is given below.

For the start partition key, it iterates in reverse order through the partition's
hash buckets. It breaks out of the loop at the first instance of a bucket not
equal to 0; If the bucket is equal to 0 it erases that part of the partition key.
Essentially, if all hash buckets are equal to 0, then it erases the entire key.

For the end partition key, it also iterates in reverse order through the
partition's hash buckets. It first erases the index portition of the
partition key. It then checks if the current hash bucket is the max
bucket of the current hash schema. If it is not the max, it encodes the
current hash bucket + 1 at the index portion of the key and breaks the loop.
If it is the max, it continues within the loop. Essentially, if all the
hash buckets are the max then it erases the entire key.

Prior to this change, this block of code assumed the same hash bucket
schema for each partition. With per range hash schemas, that may not
necessarily be the case. The vector 'partition_mapping' maps each
partition to the index of 'bounds_with_hash_schemas' to ensure the
correct hash bucket schema is used. '-1' is used to signify the use
of the table wide hash schema.

Change-Id: I5f6c709e211359b04f7597af5f670c787bda7481
---
M src/kudu/common/partition-test.cc
M src/kudu/common/partition.cc
2 files changed, 159 insertions(+), 4 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/90/17090/1
-- 
To view, visit http://gerrit.cloudera.org:8080/17090
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I5f6c709e211359b04f7597af5f670c787bda7481
Gerrit-Change-Number: 17090
Gerrit-PatchSet: 1
Gerrit-Owner: Mahesh Reddy <mr...@cloudera.com>

[kudu-CR] KUDU-2671: Adds compatibility for per range hash schemas with unbounded ranges.

Posted by "Mahesh Reddy (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Kudu Jenkins, Andrew Wong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17090

to look at the new patch set (#3).

Change subject: KUDU-2671: Adds compatibility for per range hash schemas with unbounded ranges.
......................................................................

KUDU-2671: Adds compatibility for per range hash schemas with unbounded ranges.

This patch updates the logic at the end of PartitionSchema::CreatePartiitons()
to allow per range hash schemas to be compatible with unbounded ranges. Some
additional context about the block of code is given below.

For the start partition key, it iterates in reverse order through the partition's
hash buckets. It breaks out of the loop at the first instance of a bucket not
equal to 0; If the bucket is equal to 0 it erases that part of the partition key.
Essentially, if all hash buckets are equal to 0, then it erases the entire key.

For the end partition key, it also iterates in reverse order through the
partition's hash buckets. It first erases the index portition of the
partition key. It then checks if the current hash bucket is the max
bucket of the current hash schema. If it is not the max, it encodes the
current hash bucket + 1 at the index portion of the key and breaks the loop.
If it is the max, it continues within the loop. Essentially, if all the
hash buckets are the max then it erases the entire key.

Prior to this change, this block of code assumed the same hash bucket
schema for each partition. With per range hash schemas, that may not
necessarily be the case. The vector 'partition_mapping' maps each
partition to the index of 'bounds_with_hash_schemas' to ensure the
correct hash bucket schema is used. '-1' is used to signify the use
of the table wide hash schema.

Change-Id: I5f6c709e211359b04f7597af5f670c787bda7481
---
M src/kudu/common/partition-test.cc
M src/kudu/common/partition.cc
2 files changed, 171 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/90/17090/3
-- 
To view, visit http://gerrit.cloudera.org:8080/17090
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5f6c709e211359b04f7597af5f670c787bda7481
Gerrit-Change-Number: 17090
Gerrit-PatchSet: 3
Gerrit-Owner: Mahesh Reddy <mr...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] KUDU-2671: Adds compatibility for per range hash schemas with unbounded ranges.

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17090 )

Change subject: KUDU-2671: Adds compatibility for per range hash schemas with unbounded ranges.
......................................................................


Patch Set 4: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/17090
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5f6c709e211359b04f7597af5f670c787bda7481
Gerrit-Change-Number: 17090
Gerrit-PatchSet: 4
Gerrit-Owner: Mahesh Reddy <mr...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy <mr...@cloudera.com>
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Tue, 23 Feb 2021 19:52:54 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-2671: Adds compatibility for per range hash schemas with unbounded ranges.

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17090 )

Change subject: KUDU-2671: Adds compatibility for per range hash schemas with unbounded ranges.
......................................................................


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/17090/2/src/kudu/common/partition.cc
File src/kudu/common/partition.cc:

http://gerrit.cloudera.org:8080/#/c/17090/2/src/kudu/common/partition.cc@521
PS2, Line 521: if (partition.range_key_start().empty()) {
             :       for (int i = static_cast<int>(partition.hash_buckets().size()) - 1; i >= 0; i--) {
             :         if (partition.hash_buckets()[i] != 0) {
             :           break;
             :         }
             :         partition.partition_key_start_.erase(kEncodedBucketSize * i);
             :       }
             :     }
nit: maybe comment

"Find the first zero-valued bucket from the end and truncate the partition key starting from that bucket onwards."


http://gerrit.cloudera.org:8080/#/c/17090/2/src/kudu/common/partition.cc@531
PS2, Line 531:         partition.partition_key_end_.erase(kEncodedBucketSize * i);
             :         int32_t hash_bucket = partition.hash_buckets()[i] + 1;
nit: maybe comment

"""
Starting from the last hash bucket, truncate the partition key until we hit the first non-max-valued bucket, at which point, we replace the encoding with the next-incremented bucket value. For example, the following range end partition keys should be transformed, where the key is (hash_comp1, hash_comp2, range_comp):

 [ (0, 0, "a2b2"), (0, 1, _) ) -> [ (0, 0, "a2b2"), (0, 1, _) )
 [ (0, 1, "a2b2"), (0, 1, _) ) -> [ (0, 1, "a2b2"), (1, _, _) )
 [ (1, 0, "a2b2"), (1, 0, _) ) -> [ (1, 0, "a2b2"), (1, 1, _) )
 [ (1, 1, "a2b2"), (1, 1, _) ) -> [ (1, 1, "a2b2"), (_, _, _) )
"""


http://gerrit.cloudera.org:8080/#/c/17090/2/src/kudu/common/partition.cc@533
PS2, Line 533: HashBucketSchema
Use a pointer, otherwise we're making a copies here.


http://gerrit.cloudera.org:8080/#/c/17090/2/src/kudu/common/partition.cc@537
PS2, Line 537: partition_mapping[j]
nit: how about calling this 'schemas_idx_by_partition_idx' or something? Or defining some

const auto& bounds_idx_for_partition = partition_mapping[j];



-- 
To view, visit http://gerrit.cloudera.org:8080/17090
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5f6c709e211359b04f7597af5f670c787bda7481
Gerrit-Change-Number: 17090
Gerrit-PatchSet: 2
Gerrit-Owner: Mahesh Reddy <mr...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Mon, 22 Feb 2021 08:22:56 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2671: Adds compatibility for per range hash schemas with unbounded ranges.

Posted by "Mahesh Reddy (Code Review)" <ge...@cloudera.org>.
Mahesh Reddy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17090 )

Change subject: KUDU-2671: Adds compatibility for per range hash schemas with unbounded ranges.
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17090/3/src/kudu/common/partition.cc
File src/kudu/common/partition.cc:

http://gerrit.cloudera.org:8080/#/c/17090/3/src/kudu/common/partition.cc@523
PS3, Line 523: Finds
> nit: here and elsewhere, when describing code with in-line comments, we typ
So 'Find' rather than 'Finds? and 'truncate' rather than 'truncates'?


http://gerrit.cloudera.org:8080/#/c/17090/3/src/kudu/common/partition.cc@537
PS3, Line 537: 
             :     //
> nit: similar to other comments, separate with an empty comment line rather 
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/17090
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5f6c709e211359b04f7597af5f670c787bda7481
Gerrit-Change-Number: 17090
Gerrit-PatchSet: 3
Gerrit-Owner: Mahesh Reddy <mr...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy <mr...@cloudera.com>
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Tue, 23 Feb 2021 18:58:38 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2671: Adds compatibility for per range hash schemas with unbounded ranges.

Posted by "Mahesh Reddy (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17090

to look at the new patch set (#2).

Change subject: KUDU-2671: Adds compatibility for per range hash schemas with unbounded ranges.
......................................................................

KUDU-2671: Adds compatibility for per range hash schemas with unbounded ranges.

This patch updates the logic at the end of PartitionSchema::CreatePartiitons()
to allow per range hash schemas to be compatible with unbounded ranges. Some
additional context about the block of code is given below.

For the start partition key, it iterates in reverse order through the partition's
hash buckets. It breaks out of the loop at the first instance of a bucket not
equal to 0; If the bucket is equal to 0 it erases that part of the partition key.
Essentially, if all hash buckets are equal to 0, then it erases the entire key.

For the end partition key, it also iterates in reverse order through the
partition's hash buckets. It first erases the index portition of the
partition key. It then checks if the current hash bucket is the max
bucket of the current hash schema. If it is not the max, it encodes the
current hash bucket + 1 at the index portion of the key and breaks the loop.
If it is the max, it continues within the loop. Essentially, if all the
hash buckets are the max then it erases the entire key.

Prior to this change, this block of code assumed the same hash bucket
schema for each partition. With per range hash schemas, that may not
necessarily be the case. The vector 'partition_mapping' maps each
partition to the index of 'bounds_with_hash_schemas' to ensure the
correct hash bucket schema is used. '-1' is used to signify the use
of the table wide hash schema.

Change-Id: I5f6c709e211359b04f7597af5f670c787bda7481
---
M src/kudu/common/partition-test.cc
M src/kudu/common/partition.cc
2 files changed, 158 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/90/17090/2
-- 
To view, visit http://gerrit.cloudera.org:8080/17090
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5f6c709e211359b04f7597af5f670c787bda7481
Gerrit-Change-Number: 17090
Gerrit-PatchSet: 2
Gerrit-Owner: Mahesh Reddy <mr...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] KUDU-2671: Adds compatibility for per range hash schemas with unbounded ranges.

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17090 )

Change subject: KUDU-2671: Adds compatibility for per range hash schemas with unbounded ranges.
......................................................................


Patch Set 4: Verified+1

The python failure doesn't seem to be relatd.


-- 
To view, visit http://gerrit.cloudera.org:8080/17090
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5f6c709e211359b04f7597af5f670c787bda7481
Gerrit-Change-Number: 17090
Gerrit-PatchSet: 4
Gerrit-Owner: Mahesh Reddy <mr...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy <mr...@cloudera.com>
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Wed, 24 Feb 2021 00:05:32 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-2671: Adds compatibility for per range hash schemas with unbounded ranges.

Posted by "Mahesh Reddy (Code Review)" <ge...@cloudera.org>.
Mahesh Reddy has posted comments on this change. ( http://gerrit.cloudera.org:8080/17090 )

Change subject: KUDU-2671: Adds compatibility for per range hash schemas with unbounded ranges.
......................................................................


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/17090/2/src/kudu/common/partition.cc
File src/kudu/common/partition.cc:

http://gerrit.cloudera.org:8080/#/c/17090/2/src/kudu/common/partition.cc@521
PS2, Line 521: r (int j = 0; j < partitions->size(); j++) {
             :     Partition& partition = (*partitions)[j];
             :     // Finds the first zero-valued bucket from the end and truncates the partition key
             :     // starting from that bucket onwards for zero-valued buckets.
             :     if (partition.range_key_start().empty()) {
             :       for (int i = static_cast<int>(partition.hash_buckets().size()) - 1; i >= 0; i--) {
             :         if (partition.hash_buckets()[i] != 0) {
             :      
> nit: maybe comment
Done


http://gerrit.cloudera.org:8080/#/c/17090/2/src/kudu/common/partition.cc@531
PS2, Line 531:       }
             :     }
> nit: maybe comment
Done


http://gerrit.cloudera.org:8080/#/c/17090/2/src/kudu/common/partition.cc@533
PS2, Line 533: tarting from the
> Use a pointer, otherwise we're making a copies here.
Done


http://gerrit.cloudera.org:8080/#/c/17090/2/src/kudu/common/partition.cc@537
PS2, Line 537: 
> nit: how about calling this 'schemas_idx_by_partition_idx' or something? Or
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/17090
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5f6c709e211359b04f7597af5f670c787bda7481
Gerrit-Change-Number: 17090
Gerrit-PatchSet: 3
Gerrit-Owner: Mahesh Reddy <mr...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy <mr...@cloudera.com>
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Tue, 23 Feb 2021 01:44:13 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2671: Adds compatibility for per range hash schemas with unbounded ranges.

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17090 )

Change subject: KUDU-2671: Adds compatibility for per range hash schemas with unbounded ranges.
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17090/3/src/kudu/common/partition.cc
File src/kudu/common/partition.cc:

http://gerrit.cloudera.org:8080/#/c/17090/3/src/kudu/common/partition.cc@523
PS3, Line 523: Find 
> So 'Find' rather than 'Finds? and 'truncate' rather than 'truncates'?
Right.



-- 
To view, visit http://gerrit.cloudera.org:8080/17090
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5f6c709e211359b04f7597af5f670c787bda7481
Gerrit-Change-Number: 17090
Gerrit-PatchSet: 4
Gerrit-Owner: Mahesh Reddy <mr...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy <mr...@cloudera.com>
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Tue, 23 Feb 2021 19:53:10 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2671: Adds compatibility for per range hash schemas with unbounded ranges.

Posted by "Mahesh Reddy (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Kudu Jenkins, Andrew Wong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17090

to look at the new patch set (#4).

Change subject: KUDU-2671: Adds compatibility for per range hash schemas with unbounded ranges.
......................................................................

KUDU-2671: Adds compatibility for per range hash schemas with unbounded ranges.

This patch updates the logic at the end of PartitionSchema::CreatePartiitons()
to allow per range hash schemas to be compatible with unbounded ranges. Some
additional context about the block of code is given below.

For the start partition key, it iterates in reverse order through the partition's
hash buckets. It breaks out of the loop at the first instance of a bucket not
equal to 0; If the bucket is equal to 0 it erases that part of the partition key.
Essentially, if all hash buckets are equal to 0, then it erases the entire key.

For the end partition key, it also iterates in reverse order through the
partition's hash buckets. It first erases the index portition of the
partition key. It then checks if the current hash bucket is the max
bucket of the current hash schema. If it is not the max, it encodes the
current hash bucket + 1 at the index portion of the key and breaks the loop.
If it is the max, it continues within the loop. Essentially, if all the
hash buckets are the max then it erases the entire key.

Prior to this change, this block of code assumed the same hash bucket
schema for each partition. With per range hash schemas, that may not
necessarily be the case. The vector 'partition_idx_to_hash_schemas_idx'
maps each partition to the index of 'bounds_with_hash_schemas' to ensure
the correct hash bucket schema is used. '-1' is used to signify the use
of the table wide hash schema.

Change-Id: I5f6c709e211359b04f7597af5f670c787bda7481
---
M src/kudu/common/partition-test.cc
M src/kudu/common/partition.cc
2 files changed, 172 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/90/17090/4
-- 
To view, visit http://gerrit.cloudera.org:8080/17090
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5f6c709e211359b04f7597af5f670c787bda7481
Gerrit-Change-Number: 17090
Gerrit-PatchSet: 4
Gerrit-Owner: Mahesh Reddy <mr...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy <mr...@cloudera.com>
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] KUDU-2671: Adds compatibility for per range hash schemas with unbounded ranges.

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17090 )

Change subject: KUDU-2671: Adds compatibility for per range hash schemas with unbounded ranges.
......................................................................


Patch Set 3: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17090/3/src/kudu/common/partition.cc
File src/kudu/common/partition.cc:

http://gerrit.cloudera.org:8080/#/c/17090/3/src/kudu/common/partition.cc@523
PS3, Line 523: Finds
nit: here and elsewhere, when describing code with in-line comments, we typically use imperative tense rather than present tense, and reserve present tense for comments that describe functions and members in the declarations/headers.


http://gerrit.cloudera.org:8080/#/c/17090/3/src/kudu/common/partition.cc@537
PS3, Line 537: 
             :     //
nit: similar to other comments, separate with an empty comment line rather than an entirely empty line.



-- 
To view, visit http://gerrit.cloudera.org:8080/17090
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I5f6c709e211359b04f7597af5f670c787bda7481
Gerrit-Change-Number: 17090
Gerrit-PatchSet: 3
Gerrit-Owner: Mahesh Reddy <mr...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy <mr...@cloudera.com>
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Tue, 23 Feb 2021 07:24:09 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2671: Adds compatibility for per range hash schemas with unbounded ranges.

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17090 )

Change subject: KUDU-2671: Adds compatibility for per range hash schemas with unbounded ranges.
......................................................................

KUDU-2671: Adds compatibility for per range hash schemas with unbounded ranges.

This patch updates the logic at the end of PartitionSchema::CreatePartiitons()
to allow per range hash schemas to be compatible with unbounded ranges. Some
additional context about the block of code is given below.

For the start partition key, it iterates in reverse order through the partition's
hash buckets. It breaks out of the loop at the first instance of a bucket not
equal to 0; If the bucket is equal to 0 it erases that part of the partition key.
Essentially, if all hash buckets are equal to 0, then it erases the entire key.

For the end partition key, it also iterates in reverse order through the
partition's hash buckets. It first erases the index portition of the
partition key. It then checks if the current hash bucket is the max
bucket of the current hash schema. If it is not the max, it encodes the
current hash bucket + 1 at the index portion of the key and breaks the loop.
If it is the max, it continues within the loop. Essentially, if all the
hash buckets are the max then it erases the entire key.

Prior to this change, this block of code assumed the same hash bucket
schema for each partition. With per range hash schemas, that may not
necessarily be the case. The vector 'partition_idx_to_hash_schemas_idx'
maps each partition to the index of 'bounds_with_hash_schemas' to ensure
the correct hash bucket schema is used. '-1' is used to signify the use
of the table wide hash schema.

Change-Id: I5f6c709e211359b04f7597af5f670c787bda7481
Reviewed-on: http://gerrit.cloudera.org:8080/17090
Reviewed-by: Andrew Wong <aw...@cloudera.com>
Tested-by: Andrew Wong <aw...@cloudera.com>
---
M src/kudu/common/partition-test.cc
M src/kudu/common/partition.cc
2 files changed, 172 insertions(+), 5 deletions(-)

Approvals:
  Andrew Wong: Looks good to me, approved; Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/17090
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I5f6c709e211359b04f7597af5f670c787bda7481
Gerrit-Change-Number: 17090
Gerrit-PatchSet: 5
Gerrit-Owner: Mahesh Reddy <mr...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy <mr...@cloudera.com>
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] KUDU-2671: Adds compatibility for per range hash schemas with unbounded ranges.

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has removed a vote on this change.

Change subject: KUDU-2671: Adds compatibility for per range hash schemas with unbounded ranges.
......................................................................


Removed Verified-1 by Kudu Jenkins (120)
-- 
To view, visit http://gerrit.cloudera.org:8080/17090
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I5f6c709e211359b04f7597af5f670c787bda7481
Gerrit-Change-Number: 17090
Gerrit-PatchSet: 4
Gerrit-Owner: Mahesh Reddy <mr...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mahesh Reddy <mr...@cloudera.com>
Gerrit-Reviewer: Tidy Bot (241)