You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "eric-maynard (Code Review)" <ge...@cloudera.org> on 2016/11/30 23:24:12 UTC

[kudu-CR] KUDU-1422 [java client] modifiable error collector capacity

eric-maynard has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/5291

Change subject: KUDU-1422 [java client] modifiable error collector capacity
......................................................................

KUDU-1422 [java client] modifiable error collector capacity

Change-Id: Iae382050eb5eaf9c169dd3ef7ed4dbf5b5f1c334
---
M java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduSession.java
M java/kudu-client/src/main/java/org/apache/kudu/client/ErrorCollector.java
2 files changed, 16 insertions(+), 1 deletion(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/91/5291/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Iae382050eb5eaf9c169dd3ef7ed4dbf5b5f1c334
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: eric-maynard <em...@cloudera.com>

[kudu-CR] KUDU-1422 [java client] modifiable error collector capacity

Posted by "eric-maynard (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/5291

to look at the new patch set (#2).

Change subject: KUDU-1422 [java client] modifiable error collector capacity
......................................................................

KUDU-1422 [java client] modifiable error collector capacity

Change-Id: Iae382050eb5eaf9c169dd3ef7ed4dbf5b5f1c334
---
M java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduSession.java
M java/kudu-client/src/main/java/org/apache/kudu/client/ErrorCollector.java
2 files changed, 23 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/91/5291/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iae382050eb5eaf9c169dd3ef7ed4dbf5b5f1c334
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: eric-maynard <em...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1422 [java client] modifiable error collector capacity

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: KUDU-1422 [java client] modifiable error collector capacity
......................................................................


Patch Set 2:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/5291/2/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduSession.java
File java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduSession.java:

Line 213:    * Sets the maximum capacity of this session's ErrorCollector.
I'm a little nervous about this public API, come to think of it, because typically people want to bound memory usage, not bound the count of items (when they don't know how large they could be). That change is in-flight on the C++ side to go to memory-based error limiting, and I'm worried about introducing this API here if we know it's not the one we want to maintain long-term (APIs are forever).

Would you be interested in making the capacity size-based instead of count-based?


http://gerrit.cloudera.org:8080/#/c/5291/2/java/kudu-client/src/main/java/org/apache/kudu/client/ErrorCollector.java
File java/kudu-client/src/main/java/org/apache/kudu/client/ErrorCollector.java:

Line 43:     Preconditions.checkArgument(maxCapacity > 0, "Error collector needs to be able to store at least one row error");
why not refactor this to use setMaxCapacity? Also, looking at this, I wonder why we don't support setting it to 0


Line 79:     this.maxCapacity = maxCapacity;
if you're reducing capacity, would make sense to trim the current queue down to the specified capacity, no?


-- 
To view, visit http://gerrit.cloudera.org:8080/5291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iae382050eb5eaf9c169dd3ef7ed4dbf5b5f1c334
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: eric-maynard <em...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1422 [java client] modifiable error collector capacity

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: KUDU-1422 [java client] modifiable error collector capacity
......................................................................


Patch Set 1:

(5 comments)

should also add a test case for this patch

http://gerrit.cloudera.org:8080/#/c/5291/1/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduSession.java
File java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduSession.java:

Line 207: 	this.errorCollector.setMaxCapacity(size);
still think it should be a separate setter.


http://gerrit.cloudera.org:8080/#/c/5291/1/java/kudu-client/src/main/java/org/apache/kudu/client/ErrorCollector.java
File java/kudu-client/src/main/java/org/apache/kudu/client/ErrorCollector.java:

Line 69:   
nit: remove trailing whitespace


PS1, Line 73: RuntimeException
should throw illegal argument exception


Line 76:     if (maxCapacity > 0) {
typically we check for error cases first and "early exit", like:

if (maxCapacity <= 0) {
  throw ...
}
this.maxCapacity = maxCapacity

this style makes it so that the "normal path" is the one that's least indented.


PS1, Line 79: 	else {
            :       throw new RuntimeException("Need to be able to store at least one row error");
            : 	}
nit: style issues here (no tabs, fix indentation)


-- 
To view, visit http://gerrit.cloudera.org:8080/5291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iae382050eb5eaf9c169dd3ef7ed4dbf5b5f1c334
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: eric-maynard <em...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1422 [java client] modifiable error collector capacity

Posted by "eric-maynard (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/5291

to look at the new patch set (#4).

Change subject: KUDU-1422 [java client] modifiable error collector capacity
......................................................................

KUDU-1422 [java client] modifiable error collector capacity

Change-Id: Iae382050eb5eaf9c169dd3ef7ed4dbf5b5f1c334
---
M java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduSession.java
M java/kudu-client/src/main/java/org/apache/kudu/client/ErrorCollector.java
2 files changed, 25 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/91/5291/4
-- 
To view, visit http://gerrit.cloudera.org:8080/5291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iae382050eb5eaf9c169dd3ef7ed4dbf5b5f1c334
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: eric-maynard <em...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: eric-maynard <em...@cloudera.com>

[kudu-CR] KUDU-1422 [java client] modifiable error collector capacity

Posted by "eric-maynard (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/5291

to look at the new patch set (#3).

Change subject: KUDU-1422 [java client] modifiable error collector capacity
......................................................................

KUDU-1422 [java client] modifiable error collector capacity

Change-Id: Iae382050eb5eaf9c169dd3ef7ed4dbf5b5f1c334
---
M java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduSession.java
M java/kudu-client/src/main/java/org/apache/kudu/client/ErrorCollector.java
2 files changed, 23 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/91/5291/3
-- 
To view, visit http://gerrit.cloudera.org:8080/5291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iae382050eb5eaf9c169dd3ef7ed4dbf5b5f1c334
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: eric-maynard <em...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1422 [java client] modifiable error collector capacity

Posted by "Jean-Daniel Cryans (Code Review)" <ge...@cloudera.org>.
Jean-Daniel Cryans has posted comments on this change.

Change subject: KUDU-1422 [java client] modifiable error collector capacity
......................................................................


Patch Set 4:

I'm a fan of offering both options. I think direct users will normally want to use the row count limit, so the memory size limit should be high enough to not surprise them if they receive few row errors than expected.

-- 
To view, visit http://gerrit.cloudera.org:8080/5291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iae382050eb5eaf9c169dd3ef7ed4dbf5b5f1c334
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: eric-maynard <em...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: eric-maynard <em...@cloudera.com>
Gerrit-HasComments: No

[kudu-CR] KUDU-1422 [java client] modifiable error collector capacity

Posted by "eric-maynard (Code Review)" <ge...@cloudera.org>.
eric-maynard has posted comments on this change.

Change subject: KUDU-1422 [java client] modifiable error collector capacity
......................................................................


Patch Set 5:

The build error seems unrelated (?):

09:29:53   Error: chose Boost outside
09:29:53   /home/jenkins-slave/workspace/kudu-master/0/thirdparty/installed/common:
09:29:53   /home/jenkins-slave/workspace/kudu-master/0/thirdparty/installed/uninstrumented/include
09:29:53 
09:29:53 
09:29:53 -- Configuring incomplete, errors occurred!

-- 
To view, visit http://gerrit.cloudera.org:8080/5291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iae382050eb5eaf9c169dd3ef7ed4dbf5b5f1c334
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: eric-maynard <em...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: eric-maynard <em...@cloudera.com>
Gerrit-HasComments: No

[kudu-CR] KUDU-1422 [java client] modifiable error collector capacity

Posted by "eric-maynard (Code Review)" <ge...@cloudera.org>.
eric-maynard has posted comments on this change.

Change subject: KUDU-1422 [java client] modifiable error collector capacity
......................................................................


Patch Set 5:

Added trimming when error queue capacity is decreased

-- 
To view, visit http://gerrit.cloudera.org:8080/5291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iae382050eb5eaf9c169dd3ef7ed4dbf5b5f1c334
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: eric-maynard <em...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: eric-maynard <em...@cloudera.com>
Gerrit-HasComments: No

[kudu-CR] KUDU-1422 [java client] modifiable error collector capacity

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: KUDU-1422 [java client] modifiable error collector capacity
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5291/2/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduSession.java
File java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduSession.java:

Line 213:    * Sets the maximum capacity of this session's ErrorCollector.
> My thinking is that you'd actually want to specify the number of errors you
I can see the argument for both size-based and count-based capacity.

We used sized-based capacity in the C++ client at the request of Impala, who wanted to upper bound the memory consumption of a particular query against Kudu. Writ large, I think size-based capacity makes sense when the user of the client is an application.

On the other hand, as Eric points out, count-based capacity can be useful when the user of the client is a user who expects to interact directly with errors rather than adapting them into something else.

Another option is to implement both approaches and consider either bound when collecting a new error. What do you guys think?


-- 
To view, visit http://gerrit.cloudera.org:8080/5291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iae382050eb5eaf9c169dd3ef7ed4dbf5b5f1c334
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: eric-maynard <em...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: eric-maynard <em...@cloudera.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1422 [java client] modifiable error collector capacity

Posted by "eric-maynard (Code Review)" <ge...@cloudera.org>.
eric-maynard has posted comments on this change.

Change subject: KUDU-1422 [java client] modifiable error collector capacity
......................................................................


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/5291/1/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduSession.java
File java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduSession.java:

Line 207:     this.mutationBufferSpace = size;
> still think it should be a separate setter.
Done


http://gerrit.cloudera.org:8080/#/c/5291/2/java/kudu-client/src/main/java/org/apache/kudu/client/ErrorCollector.java
File java/kudu-client/src/main/java/org/apache/kudu/client/ErrorCollector.java:

Line 43:     Preconditions.checkArgument(maxCapacity > 0, "Error collector needs to be able to store at least one row error");
> why not refactor this to use setMaxCapacity? Also, looking at this, I wonde
Good idea about using {{setMaxCapacity}}. I'm not sure why you would want a size of 0, but it wouldn't break anything like using a negative number would so we can make that change if you think it's best.


Line 79:     this.maxCapacity = maxCapacity;
> if you're reducing capacity, would make sense to trim the current queue dow
As it is now, you'd be prevented from adding errors after lowering the capacity, but you wouldn't throw out any existing errors that might be in the queue. I'm not sure if you'd want to dispose of errors you'd already captured, but if you'd prefer it that way let me know.


-- 
To view, visit http://gerrit.cloudera.org:8080/5291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iae382050eb5eaf9c169dd3ef7ed4dbf5b5f1c334
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: eric-maynard <em...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: eric-maynard <em...@cloudera.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1422 [java client] modifiable error collector capacity

Posted by "eric-maynard (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/5291

to look at the new patch set (#5).

Change subject: KUDU-1422 [java client] modifiable error collector capacity
......................................................................

KUDU-1422 [java client] modifiable error collector capacity

Change-Id: Iae382050eb5eaf9c169dd3ef7ed4dbf5b5f1c334
---
M java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduSession.java
M java/kudu-client/src/main/java/org/apache/kudu/client/ErrorCollector.java
2 files changed, 28 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/91/5291/5
-- 
To view, visit http://gerrit.cloudera.org:8080/5291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Iae382050eb5eaf9c169dd3ef7ed4dbf5b5f1c334
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: eric-maynard <em...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: eric-maynard <em...@cloudera.com>

[kudu-CR] KUDU-1422 [java client] modifiable error collector capacity

Posted by "eric-maynard (Code Review)" <ge...@cloudera.org>.
eric-maynard has posted comments on this change.

Change subject: KUDU-1422 [java client] modifiable error collector capacity
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5291/2/java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduSession.java
File java/kudu-client/src/main/java/org/apache/kudu/client/AsyncKuduSession.java:

Line 213:    * Sets the maximum capacity of this session's ErrorCollector.
> I'm a little nervous about this public API, come to think of it, because ty
My thinking is that you'd actually want to specify the number of errors you're capturing, not necessarily the size of the buffer that stores the RowError objects (since the string message of the RowError can be any arbitrary length, and therefore a given RowError can be any arbitrary size). I know that if I wanted to capture the first 10 errors, I would have no idea what to set the memory usage to. 

I can see how you also might want to limit the memory usage of the buffer though. I'm not sure which approach I think is better.


-- 
To view, visit http://gerrit.cloudera.org:8080/5291
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iae382050eb5eaf9c169dd3ef7ed4dbf5b5f1c334
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: eric-maynard <em...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: eric-maynard <em...@cloudera.com>
Gerrit-HasComments: Yes