Posted to reviews@kudu.apache.org by "Will Berkeley (Code Review)" <ge...@cloudera.org> on 2017/08/10 17:01:57 UTC

[kudu-CR] KUDU-2078: Sink failure if batch size > session's flush buffer size

Will Berkeley has uploaded a new change for review.

  http://gerrit.cloudera.org:8080/7641

Change subject: KUDU-2078: Sink failure if batch size > session's flush buffer size
......................................................................

KUDU-2078: Sink failure if batch size > session's flush buffer size

The Flume sink uses manual flush mode, so if users set the
sink's batch size parameter above the default manual-flush
buffer size, the sink could fail the same batch repeatedly.
This patch sets the session's buffer size (which is measured
in number of ops) to match the batch size, so this problem
can no longer occur.
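
To illustrate the failure mode described above, here is a minimal,
self-contained sketch. It does not use the Kudu client: the
ManualFlushSession class, its method names, and the default of 1000
ops are all hypothetical stand-ins chosen for the example. In the
Kudu Java client the session buffer is sized with
KuduSession.setMutationBufferSpace; whether the patch uses exactly
that call is not shown in this thread.

```java
import java.util.ArrayList;
import java.util.List;

public class BufferSketch {

  /** Toy stand-in for a manual-flush client session; NOT the Kudu API. */
  public static final class ManualFlushSession {
    private final List<String> buffer = new ArrayList<>();
    private int bufferOps;       // capacity counted in operations, not bytes
    private int flushedOps = 0;

    public ManualFlushSession(int defaultBufferOps) {
      this.bufferOps = defaultBufferOps;
    }

    /** Resizes the buffer (hypothetical analogue of sizing the session buffer). */
    public void setBufferOps(int ops) {
      this.bufferOps = ops;
    }

    /** Buffers one operation; in manual mode a full buffer is an error. */
    public void apply(String op) {
      if (buffer.size() >= bufferOps) {
        throw new IllegalStateException("buffer full at " + bufferOps + " ops");
      }
      buffer.add(op);
    }

    /** Sends everything buffered so far and empties the buffer. */
    public void flush() {
      flushedOps += buffer.size();
      buffer.clear();
    }

    /** Drops buffered operations without sending them. */
    public void discard() {
      buffer.clear();
    }

    public int flushedOps() {
      return flushedOps;
    }
  }

  /** Applies batchSize ops, then flushes once, the way a batching sink would. */
  public static boolean writeBatch(ManualFlushSession session, int batchSize) {
    try {
      for (int i = 0; i < batchSize; i++) {
        session.apply("op-" + i);
      }
      session.flush();
      return true;
    } catch (IllegalStateException e) {
      // The batch fails; retrying with the same settings fails the same way.
      session.discard();
      return false;
    }
  }

  public static void main(String[] args) {
    int defaultBufferOps = 1000;  // hypothetical default, for illustration only
    int batchSize = 1500;

    ManualFlushSession before = new ManualFlushSession(defaultBufferOps);
    System.out.println("default buffer: " + writeBatch(before, batchSize));

    ManualFlushSession after = new ManualFlushSession(defaultBufferOps);
    after.setBufferOps(batchSize);  // the idea of the fix: buffer size = batch size
    System.out.println("sized buffer:   " + writeBatch(after, batchSize));
  }
}
```

With the buffer left at its default, the batch overflows on every
attempt; once the buffer is sized to the batch, the whole batch fits
and a single flush at the end of the batch is enough.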

I considered using AUTO_FLUSH_BACKGROUND for the flushing as
well, but it can result in out-of-order writes, which might be
unexpected semantics for Flume (as opposed to, say, Spark).
Using AUTO_FLUSH_BACKGROUND with a high batch size would likely
be more performant, but we can add that as an additional
configuration later if the need arises.

Change-Id: Id1c54bcecc3e13ae64dd90efe6cf53021517dcdf
---
M java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/KuduSink.java
1 file changed, 5 insertions(+), 4 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/7641/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7641
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Id1c54bcecc3e13ae64dd90efe6cf53021517dcdf
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>

[kudu-CR] KUDU-2078: Sink failure if batch size > session's flush buffer size

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: KUDU-2078: Sink failure if batch size > session's flush buffer size
......................................................................


Patch Set 2: Code-Review+2

Gerrit-MessageType: comment
Gerrit-Change-Id: Id1c54bcecc3e13ae64dd90efe6cf53021517dcdf
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: No

[kudu-CR] KUDU-2078: Sink failure if batch size > session's flush buffer size

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Will Berkeley has posted comments on this change.

Change subject: KUDU-2078: Sink failure if batch size > session's flush buffer size
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7641/1/java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/KuduSink.java
File java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/KuduSink.java:

PS1, Line 70: 100
> doc needs updating
Done


Gerrit-MessageType: comment
Gerrit-Change-Id: Id1c54bcecc3e13ae64dd90efe6cf53021517dcdf
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2078: Sink failure if batch size > session's flush buffer size

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: KUDU-2078: Sink failure if batch size > session's flush buffer size
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7641/1/java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/KuduSink.java
File java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/KuduSink.java:

PS1, Line 70: 100
doc needs updating


Gerrit-MessageType: comment
Gerrit-Change-Id: Id1c54bcecc3e13ae64dd90efe6cf53021517dcdf
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2078: Sink failure if batch size > session's flush buffer size

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has submitted this change and it was merged.

Change subject: KUDU-2078: Sink failure if batch size > session's flush buffer size
......................................................................


KUDU-2078: Sink failure if batch size > session's flush buffer size

Change-Id: Id1c54bcecc3e13ae64dd90efe6cf53021517dcdf
Reviewed-on: http://gerrit.cloudera.org:8080/7641
Tested-by: Kudu Jenkins
Reviewed-by: Mike Percy <mp...@apache.org>
---
M java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/KuduSink.java
1 file changed, 6 insertions(+), 5 deletions(-)

Approvals:
  Mike Percy: Looks good to me, approved
  Kudu Jenkins: Verified



Gerrit-MessageType: merged
Gerrit-Change-Id: Id1c54bcecc3e13ae64dd90efe6cf53021517dcdf
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>

[kudu-CR] KUDU-2078: Sink failure if batch size > session's flush buffer size

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: KUDU-2078: Sink failure if batch size > session's flush buffer size
......................................................................


Patch Set 1: Code-Review+1

Gerrit-MessageType: comment
Gerrit-Change-Id: Id1c54bcecc3e13ae64dd90efe6cf53021517dcdf
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-2078: Sink failure if batch size > session's flush buffer size

Posted by "Will Berkeley (Code Review)" <ge...@cloudera.org>.
Hello Mike Percy, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7641

to look at the new patch set (#2).

Change subject: KUDU-2078: Sink failure if batch size > session's flush buffer size
......................................................................

KUDU-2078: Sink failure if batch size > session's flush buffer size

Change-Id: Id1c54bcecc3e13ae64dd90efe6cf53021517dcdf
---
M java/kudu-flume-sink/src/main/java/org/apache/kudu/flume/sink/KuduSink.java
1 file changed, 6 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/7641/2

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Id1c54bcecc3e13ae64dd90efe6cf53021517dcdf
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Will Berkeley <wd...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Will Berkeley <wd...@gmail.com>