You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Dan Burkert (Code Review)" <ge...@cloudera.org> on 2016/08/30 22:27:36 UTC

[kudu-CR] [java-client] Add ScanToken.stringifySerializedToken

Hello Matthew Jacobs, Adar Dembo, Todd Lipcon,

I'd like you to do a code review.  Please visit

    http://gerrit.cloudera.org:8080/4173

to review the following change.

Change subject: [java-client] Add ScanToken.stringifySerializedToken
......................................................................

[java-client] Add ScanToken.stringifySerializedToken

stringifySerializedToken takes a serialized scan token, and returns a
String suitable for debug printing. The string contains information
sufficient to determine which range of tablet(s) the scan token will
cover, and which rows within the tablets. Namely, the range partition
bounds and primary key bounds. Example output, wrapped for readability
(a, b, and c are column names with type STRING):

ScanToken{table=org.apache.kudu.client.TestKuduClient-1472595465767,
          lower-bound-primary-key=(string a=1, string b=3, string c=5),
          upper-bound-primary-key=(string a=2, string b=4, string c=),
          hash-partition-buckets: [2],
          range-partition: [(string a=0, string b=0, string c=0),
                            (string a=3, string b=5, string c=6))}

The Java client did not have any method of deserializing encoded primary
or partition keys, so most of the work in this commit is introducing
utility methods for that purpose. I haven't added tests of the specific
format of the strings, but I have added the printing to many of the
existing ScanToken tests in order to make sure that the formatting code
itself doesn't fail. I've also verified that the output looks good. The
format doesn't include information like predicates or consistency, but
that could be added in the future if so desired.

Change-Id: I42014da270e66c370cc6d6ed68fb38331130ff6d
---
M java/kudu-client/src/main/java/org/apache/kudu/client/DeadlineTracker.java
M java/kudu-client/src/main/java/org/apache/kudu/client/KeyEncoder.java
M java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java
M java/kudu-client/src/main/java/org/apache/kudu/client/PartialRow.java
M java/kudu-client/src/main/java/org/apache/kudu/client/RowResult.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestFlexiblePartitioning.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestKeyEncoding.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestOperation.java
M java/kudu-mapreduce/src/main/java/org/apache/kudu/mapreduce/KuduTableInputFormat.java
10 files changed, 452 insertions(+), 81 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/73/4173/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4173
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I42014da270e66c370cc6d6ed68fb38331130ff6d
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <da...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] [java-client] Add ScanToken.stringifySerializedToken

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: [java-client] Add ScanToken.stringifySerializedToken
......................................................................


Patch Set 2: Code-Review+1

-- 
To view, visit http://gerrit.cloudera.org:8080/4173
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I42014da270e66c370cc6d6ed68fb38331130ff6d
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <da...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] [java-client] Add ScanToken.stringifySerializedToken

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has submitted this change and it was merged.

Change subject: [java-client] Add ScanToken.stringifySerializedToken
......................................................................


[java-client] Add ScanToken.stringifySerializedToken

stringifySerializedToken takes a serialized scan token, and returns a
String suitable for debug printing. The string contains information
sufficient to determine which range of tablet(s) the scan token will
cover, and which rows within the tablets. Namely, the range partition
bounds and primary key bounds. Example output, wrapped for readability
(a, b, and c are column names with type STRING):

ScanToken{table=org.apache.kudu.client.TestKuduClient-1472595465767,
          lower-bound-primary-key=(string a=1, string b=3, string c=5),
          upper-bound-primary-key=(string a=2, string b=4, string c=),
          hash-partition-buckets: [2],
          range-partition: [(string a=0, string b=0, string c=0),
                            (string a=3, string b=5, string c=6))}

The Java client did not have any method of deserializing encoded primary
or partition keys, so most of the work in this commit is introducing
utility methods for that purpose. I haven't added tests of the specific
format of the strings, but I have added the printing to many of the
existing ScanToken tests in order to make sure that the formatting code
itself doesn't fail. I've also verified that the output looks good. The
format doesn't include information like predicates or consistency, but
that could be added in the future if so desired.

Change-Id: I42014da270e66c370cc6d6ed68fb38331130ff6d
Reviewed-on: http://gerrit.cloudera.org:8080/4173
Tested-by: Kudu Jenkins
Reviewed-by: Dan Burkert <da...@cloudera.com>
---
M java/kudu-client/src/main/java/org/apache/kudu/client/DeadlineTracker.java
M java/kudu-client/src/main/java/org/apache/kudu/client/KeyEncoder.java
M java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java
M java/kudu-client/src/main/java/org/apache/kudu/client/PartialRow.java
M java/kudu-client/src/main/java/org/apache/kudu/client/RowResult.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestFlexiblePartitioning.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestKeyEncoding.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestOperation.java
M java/kudu-mapreduce/src/main/java/org/apache/kudu/mapreduce/KuduTableInputFormat.java
10 files changed, 454 insertions(+), 81 deletions(-)

Approvals:
  Dan Burkert: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4173
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I42014da270e66c370cc6d6ed68fb38331130ff6d
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <da...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] [java-client] Add ScanToken.stringifySerializedToken

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: [java-client] Add ScanToken.stringifySerializedToken
......................................................................


Patch Set 2: Code-Review+2

Leaving open in case MJ has more comments.

-- 
To view, visit http://gerrit.cloudera.org:8080/4173
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I42014da270e66c370cc6d6ed68fb38331130ff6d
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <da...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] [java-client] Add ScanToken.stringifySerializedToken

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: [java-client] Add ScanToken.stringifySerializedToken
......................................................................


Patch Set 1:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/4173/1//COMMIT_MSG
Commit Message:

PS1, Line 25: I haven't added tests of the specific
            : format of the strings
> If we're considering this to be a stable-ish interface, or at least enough 
I think this is just for debug printing, at least according to what Dan writes above. I'd be nervous about making it stable. What would be the use case?


http://gerrit.cloudera.org:8080/#/c/4173/1/java/kudu-client/src/main/java/org/apache/kudu/client/KeyEncoder.java
File java/kudu-client/src/main/java/org/apache/kudu/client/KeyEncoder.java:

Line 313:   private static byte[] decodeNonTerminalBinaryColumn(ByteBuffer key) {
Would be nice to see a comment explaining why the below code is the way it is, possibly including a description of the binary layout in 'key'.


PS1, Line 364:     Pair<List<Integer>, PartialRow> lower = decodePartitionKey(schema, partitionSchema, lowerBound);
             :     Pair<List<Integer>, PartialRow> upper = decodePartitionKey(schema, partitionSchema, upperBound);
Maybe some sort of assertion that lower and upper agree on hash buckets?

Alternatively, don't bother decoding the hash buckets of the upper bound.


http://gerrit.cloudera.org:8080/#/c/4173/1/java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java
File java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java:

Line 102:   public static KuduScanner deserializeIntoScanner(byte[] buf, KuduClient client) throws IOException {
Does this need to be in the release notes?


http://gerrit.cloudera.org:8080/#/c/4173/1/java/kudu-client/src/main/java/org/apache/kudu/client/RowResult.java
File java/kudu-client/src/main/java/org/apache/kudu/client/RowResult.java:

Line 19: import com.google.common.base.Preconditions;
Where is this used?


http://gerrit.cloudera.org:8080/#/c/4173/1/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java
File java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java:

Line 413:       LOG.info(KuduScanToken.stringifySerializedToken(token.serialize(), syncClient));
Not debug() here?


-- 
To view, visit http://gerrit.cloudera.org:8080/4173
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I42014da270e66c370cc6d6ed68fb38331130ff6d
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <da...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] [java-client] Add ScanToken.stringifySerializedToken

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has posted comments on this change.

Change subject: [java-client] Add ScanToken.stringifySerializedToken
......................................................................


Patch Set 2:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/4173/1//COMMIT_MSG
Commit Message:

PS1, Line 16: ScanToken
> I'm having a hard time understanding this example- what would the partition
The partition that the token covers is determined by the hash buckets and range partition bounds.


PS1, Line 17: lower-bound-primary-key=(string a=1, string b=3, string c=5),
            :           upper-bound-primary-key=(string a=2, string b=4, string c=)
> If I'm not mistaken, these are implied by the partitioning, right? I guess 
The primary key bounds aren't very useful to Impala currently, but they will be once token splitting is implemented and used.


PS1, Line 25: I haven't added tests of the specific
            : format of the strings
> The motivation for this was that Impala has a suite of planner tests that v
I'm not crazy about committing to a specific format, but I think it's fine to say that this will be a string that is useful for debugging scan token issues.  I was under the impression that this was not something that Impala will be parsing?


http://gerrit.cloudera.org:8080/#/c/4173/1/java/kudu-client/src/main/java/org/apache/kudu/client/KeyEncoder.java
File java/kudu-client/src/main/java/org/apache/kudu/client/KeyEncoder.java:

Line 313:     // bytes in the input are escaped as 0x0001.
> Would be nice to see a comment explaining why the below code is the way it 
Done


PS1, Line 364:     // Even though we parse hash buckets for the upper and lower bound partition
             :     // keys, we only use the lower bound set. Upper bound partition keys are
> Maybe some sort of assertion that lower and upper agree on hash buckets?
The upper bound partition key of a tablet is exclusive, and therefore may contain an incremented hash bucket, for a concrete example see https://github.com/apache/kudu/blob/master/src/kudu/common/partition-test.cc#L474.

I agree throwing away the upper bound hash buckets is kind of weird, but the alternative of just parsing out the range portion is not trivial since the key must be checked for sufficient length.  I think it's easier to just leave it and reuse that logic in decodePartitionKey.


PS1, Line 401: (
> ]
We use this notation to indicate that its a lower-inclusive/upper-exclusive range.


http://gerrit.cloudera.org:8080/#/c/4173/1/java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java
File java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java:

Line 102:   public static KuduScanner deserializeIntoScanner(byte[] buf, KuduClient client) throws IOException {
> Does this need to be in the release notes?
eh maybe.


http://gerrit.cloudera.org:8080/#/c/4173/1/java/kudu-client/src/main/java/org/apache/kudu/client/RowResult.java
File java/kudu-client/src/main/java/org/apache/kudu/client/RowResult.java:

Line 19: import org.apache.kudu.ColumnSchema;
> Where is this used?
Done


http://gerrit.cloudera.org:8080/#/c/4173/1/java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java
File java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java:

Line 413:       LOG.debug(KuduScanToken.stringifySerializedToken(token.serialize(), syncClient));
> Not debug() here?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/4173
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I42014da270e66c370cc6d6ed68fb38331130ff6d
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <da...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] [java-client] Add ScanToken.stringifySerializedToken

Posted by "Kudu Jenkins (Code Review)" <ge...@cloudera.org>.
Kudu Jenkins has posted comments on this change.

Change subject: [java-client] Add ScanToken.stringifySerializedToken
......................................................................


Patch Set 3:

Build Started http://104.196.14.100/job/kudu-gerrit/3186/

-- 
To view, visit http://gerrit.cloudera.org:8080/4173
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I42014da270e66c370cc6d6ed68fb38331130ff6d
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <da...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] [java-client] Add ScanToken.stringifySerializedToken

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has posted comments on this change.

Change subject: [java-client] Add ScanToken.stringifySerializedToken
......................................................................


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4173
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I42014da270e66c370cc6d6ed68fb38331130ff6d
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <da...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] [java-client] Add ScanToken.stringifySerializedToken

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: [java-client] Add ScanToken.stringifySerializedToken
......................................................................


Patch Set 1:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/4173/1//COMMIT_MSG
Commit Message:

PS1, Line 16: ScanToken
I'm having a hard time understanding this example- what would the partitioning be here?


PS1, Line 17: lower-bound-primary-key=(string a=1, string b=3, string c=5),
            :           upper-bound-primary-key=(string a=2, string b=4, string c=)
If I'm not mistaken, these are implied by the partitioning, right? I guess these are less actionable, and maybe less useful... I wonder if it'd be more useful to put these after the hash buckets and range partitions.


PS1, Line 25: I haven't added tests of the specific
            : format of the strings
If we're considering this to be a stable-ish interface, or at least enough that Impala can depend on it, perhaps a few tests for the format of the strings would be worthwhile.


http://gerrit.cloudera.org:8080/#/c/4173/1/java/kudu-client/src/main/java/org/apache/kudu/client/KeyEncoder.java
File java/kudu-client/src/main/java/org/apache/kudu/client/KeyEncoder.java:

PS1, Line 401: )
]

this should close the [ on l385.


-- 
To view, visit http://gerrit.cloudera.org:8080/4173
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I42014da270e66c370cc6d6ed68fb38331130ff6d
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <da...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] [java-client] Add ScanToken.stringifySerializedToken

Posted by "Kudu Jenkins (Code Review)" <ge...@cloudera.org>.
Kudu Jenkins has posted comments on this change.

Change subject: [java-client] Add ScanToken.stringifySerializedToken
......................................................................


Patch Set 2:

Build Started http://104.196.14.100/job/kudu-gerrit/3171/

-- 
To view, visit http://gerrit.cloudera.org:8080/4173
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I42014da270e66c370cc6d6ed68fb38331130ff6d
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <da...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] [java-client] Add ScanToken.stringifySerializedToken

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: [java-client] Add ScanToken.stringifySerializedToken
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4173/1//COMMIT_MSG
Commit Message:

PS1, Line 25: I haven't added tests of the specific
            : format of the strings
> I'm not crazy about committing to a specific format, but I think it's fine 
Not parsing but if it changes it could break our tests. That's not the end of the world, of course.


-- 
To view, visit http://gerrit.cloudera.org:8080/4173
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I42014da270e66c370cc6d6ed68fb38331130ff6d
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <da...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] [java-client] Add ScanToken.stringifySerializedToken

Posted by "Kudu Jenkins (Code Review)" <ge...@cloudera.org>.
Kudu Jenkins has posted comments on this change.

Change subject: [java-client] Add ScanToken.stringifySerializedToken
......................................................................


Patch Set 1:

Build Started http://104.196.14.100/job/kudu-gerrit/3155/

-- 
To view, visit http://gerrit.cloudera.org:8080/4173
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I42014da270e66c370cc6d6ed68fb38331130ff6d
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <da...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] [java-client] Add ScanToken.stringifySerializedToken

Posted by "Matthew Jacobs (Code Review)" <ge...@cloudera.org>.
Matthew Jacobs has posted comments on this change.

Change subject: [java-client] Add ScanToken.stringifySerializedToken
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/4173/1//COMMIT_MSG
Commit Message:

PS1, Line 25: I haven't added tests of the specific
            : format of the strings
> I think this is just for debug printing, at least according to what Dan wri
The motivation for this was that Impala has a suite of planner tests that validate generated plans. One component of this is checking that the Impala scans are being placed as expected, so the relevant files (post-partition pruning) are validated. We want to also be able to check that Kudu-based table scans are going to end up in the right place.

In short, Impala would like to depend on this. See https://gerrit.cloudera.org/#/c/4120/

However, if you don't like the idea of making this method stable, breaking it up a bit would also be very helpful to us- e.g. to expose some of the more specific to-string methods as a public interface so Impala can call them (and essentially write our own version of stringifySerializedToken()).


-- 
To view, visit http://gerrit.cloudera.org:8080/4173
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I42014da270e66c370cc6d6ed68fb38331130ff6d
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <da...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] [java-client] Add ScanToken.stringifySerializedToken

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Hello Matthew Jacobs, Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/4173

to look at the new patch set (#3).

Change subject: [java-client] Add ScanToken.stringifySerializedToken
......................................................................

[java-client] Add ScanToken.stringifySerializedToken

stringifySerializedToken takes a serialized scan token, and returns a
String suitable for debug printing. The string contains information
sufficient to determine which range of tablet(s) the scan token will
cover, and which rows within the tablets. Namely, the range partition
bounds and primary key bounds. Example output, wrapped for readability
(a, b, and c are column names with type STRING):

ScanToken{table=org.apache.kudu.client.TestKuduClient-1472595465767,
          lower-bound-primary-key=(string a=1, string b=3, string c=5),
          upper-bound-primary-key=(string a=2, string b=4, string c=),
          hash-partition-buckets: [2],
          range-partition: [(string a=0, string b=0, string c=0),
                            (string a=3, string b=5, string c=6))}

The Java client did not have any method of deserializing encoded primary
or partition keys, so most of the work in this commit is introducing
utility methods for that purpose. I haven't added tests of the specific
format of the strings, but I have added the printing to many of the
existing ScanToken tests in order to make sure that the formatting code
itself doesn't fail. I've also verified that the output looks good. The
format doesn't include information like predicates or consistency, but
that could be added in the future if so desired.

Change-Id: I42014da270e66c370cc6d6ed68fb38331130ff6d
---
M java/kudu-client/src/main/java/org/apache/kudu/client/DeadlineTracker.java
M java/kudu-client/src/main/java/org/apache/kudu/client/KeyEncoder.java
M java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java
M java/kudu-client/src/main/java/org/apache/kudu/client/PartialRow.java
M java/kudu-client/src/main/java/org/apache/kudu/client/RowResult.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestFlexiblePartitioning.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestKeyEncoding.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestOperation.java
M java/kudu-mapreduce/src/main/java/org/apache/kudu/mapreduce/KuduTableInputFormat.java
10 files changed, 454 insertions(+), 81 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/73/4173/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4173
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I42014da270e66c370cc6d6ed68fb38331130ff6d
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <da...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] [java-client] Add ScanToken.stringifySerializedToken

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has posted comments on this change.

Change subject: [java-client] Add ScanToken.stringifySerializedToken
......................................................................


Patch Set 3:

Turns out adding/dropping a checked exception does *not* break binary compat in Java, and since this was changing Exception -> IOException, it doesn't break source as well.

-- 
To view, visit http://gerrit.cloudera.org:8080/4173
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I42014da270e66c370cc6d6ed68fb38331130ff6d
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <da...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] [java-client] Add ScanToken.stringifySerializedToken

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/4173

to look at the new patch set (#2).

Change subject: [java-client] Add ScanToken.stringifySerializedToken
......................................................................

[java-client] Add ScanToken.stringifySerializedToken

stringifySerializedToken takes a serialized scan token, and returns a
String suitable for debug printing. The string contains information
sufficient to determine which range of tablet(s) the scan token will
cover, and which rows within the tablets. Namely, the range partition
bounds and primary key bounds. Example output, wrapped for readability
(a, b, and c are column names with type STRING):

ScanToken{table=org.apache.kudu.client.TestKuduClient-1472595465767,
          lower-bound-primary-key=(string a=1, string b=3, string c=5),
          upper-bound-primary-key=(string a=2, string b=4, string c=),
          hash-partition-buckets: [2],
          range-partition: [(string a=0, string b=0, string c=0),
                            (string a=3, string b=5, string c=6))}

The Java client did not have any method of deserializing encoded primary
or partition keys, so most of the work in this commit is introducing
utility methods for that purpose. I haven't added tests of the specific
format of the strings, but I have added the printing to many of the
existing ScanToken tests in order to make sure that the formatting code
itself doesn't fail. I've also verified that the output looks good. The
format doesn't include information like predicates or consistency, but
that could be added in the future if so desired.

Change-Id: I42014da270e66c370cc6d6ed68fb38331130ff6d
---
M docs/release_notes.adoc
M java/kudu-client/src/main/java/org/apache/kudu/client/DeadlineTracker.java
M java/kudu-client/src/main/java/org/apache/kudu/client/KeyEncoder.java
M java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java
M java/kudu-client/src/main/java/org/apache/kudu/client/PartialRow.java
M java/kudu-client/src/main/java/org/apache/kudu/client/RowResult.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestFlexiblePartitioning.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestKeyEncoding.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestKuduClient.java
M java/kudu-client/src/test/java/org/apache/kudu/client/TestOperation.java
M java/kudu-mapreduce/src/main/java/org/apache/kudu/mapreduce/KuduTableInputFormat.java
11 files changed, 457 insertions(+), 81 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/73/4173/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4173
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I42014da270e66c370cc6d6ed68fb38331130ff6d
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Dan Burkert <da...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Matthew Jacobs <mj...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>