You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Todd Lipcon (Code Review)" <ge...@cloudera.org> on 2017/05/04 20:45:32 UTC

[kudu-CR] WIP: KUDU-1949. Maintenance Manager should trigger flushes earlier

Hello David Ribeiro Alves, Adar Dembo,

I'd like you to do a code review.  Please visit

    http://gerrit.cloudera.org:8080/6802

to review the following change.

Change subject: WIP: KUDU-1949. Maintenance Manager should trigger flushes earlier
......................................................................

WIP: KUDU-1949. Maintenance Manager should trigger flushes earlier

This changes the maintenance manager to start triggering flushes at the
60% heap usage threshold, and adjusts the memory limiting to only start
rejecting writes at the 80% threshold.

This is based on analysis in [1] ("After tweaking memory backpressure
rejection") in which we found that earlier flushing could keep the
system from making too many rejections.

WIP: want to run before/after test again with this latest iter

[1] https://docs.google.com/document/d/1U1IXS1XD2erZyq8_qG81A1gZaCeHcq2i0unea_eEf5c/edit#heading=h.gw87heq4fq6l

Change-Id: I8b7e703c82ac14fbce3a699bdf6a2f0fb4ed93a1
---
M src/kudu/util/maintenance_manager.cc
M src/kudu/util/process_memory.cc
M src/kudu/util/process_memory.h
3 files changed, 21 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/02/6802/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6802
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I8b7e703c82ac14fbce3a699bdf6a2f0fb4ed93a1
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>

[kudu-CR] KUDU-1949. Maintenance Manager should trigger flushes earlier

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: KUDU-1949. Maintenance Manager should trigger flushes earlier
......................................................................


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6802
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8b7e703c82ac14fbce3a699bdf6a2f0fb4ed93a1
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] WIP: KUDU-1949. Maintenance Manager should trigger flushes earlier

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: WIP: KUDU-1949. Maintenance Manager should trigger flushes earlier
......................................................................


Patch Set 1:

> If you're already going to run some benchmarks, could you roll into this change the fix to flush_threshold_mb calculation?

hm, would rather not confuse the two changes, but I will do that next (already had kicked off the benchmark before seeing this)

-- 
To view, visit http://gerrit.cloudera.org:8080/6802
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8b7e703c82ac14fbce3a699bdf6a2f0fb4ed93a1
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-1949. Maintenance Manager should trigger flushes earlier

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-1949. Maintenance Manager should trigger flushes earlier
......................................................................


Patch Set 2: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6802
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8b7e703c82ac14fbce3a699bdf6a2f0fb4ed93a1
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] WIP: KUDU-1949. Maintenance Manager should trigger flushes earlier

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: WIP: KUDU-1949. Maintenance Manager should trigger flushes earlier
......................................................................


Patch Set 1:

looks good, waiting for those results

-- 
To view, visit http://gerrit.cloudera.org:8080/6802
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8b7e703c82ac14fbce3a699bdf6a2f0fb4ed93a1
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: No

[kudu-CR] WIP: KUDU-1949. Maintenance Manager should trigger flushes earlier

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: WIP: KUDU-1949. Maintenance Manager should trigger flushes earlier
......................................................................


Patch Set 1:

If you're already going to run some benchmarks, could you roll into this change the fix to flush_threshold_mb calculation?

-- 
To view, visit http://gerrit.cloudera.org:8080/6802
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I8b7e703c82ac14fbce3a699bdf6a2f0fb4ed93a1
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: No

[kudu-CR] KUDU-1949. Maintenance Manager should trigger flushes earlier

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Hello David Ribeiro Alves, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6802

to look at the new patch set (#2).

Change subject: KUDU-1949. Maintenance Manager should trigger flushes earlier
......................................................................

KUDU-1949. Maintenance Manager should trigger flushes earlier

This changes the maintenance manager to start triggering flushes at the
60% heap usage threshold, and adjusts the memory limiting to only start
rejecting writes at the 80% threshold.

This is based on analysis in [1] ("After tweaking memory backpressure
rejection") in which we found that earlier flushing could keep the
system from making too many rejections.

I ran tpch_real_world with SF=300 as described in the doc below before
and after the patch. Some key metrics:

                Before       After
Mem rejections  1,162,527    62,790    (18.5x fewer)
Wall time       3156s        2490s     (1.26x faster)
Server CPU      37,348s      72,873s   (1.95x less efficient)

This matches the results from the original experiment. Because the
writes come in faster, we don't get enough time to run compactions, and
so each write ends up taking more CPU. Some future optimizations are
still pending which will reduce the CPU time per bloom lookup.
Regardless, this new behavior is configurable such that, if overall CPU
efficiency is more important than wall time, we could bump the "memory
pressure threshold" up higher than the "memory rejection threshold".

[1] https://docs.google.com/document/d/1U1IXS1XD2erZyq8_qG81A1gZaCeHcq2i0unea_eEf5c/edit#heading=h.gw87heq4fq6l

Change-Id: I8b7e703c82ac14fbce3a699bdf6a2f0fb4ed93a1
---
M src/kudu/util/maintenance_manager.cc
M src/kudu/util/process_memory.cc
M src/kudu/util/process_memory.h
3 files changed, 31 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/02/6802/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6802
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8b7e703c82ac14fbce3a699bdf6a2f0fb4ed93a1
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1949. Maintenance Manager should trigger flushes earlier

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has submitted this change and it was merged.

Change subject: KUDU-1949. Maintenance Manager should trigger flushes earlier
......................................................................


KUDU-1949. Maintenance Manager should trigger flushes earlier

This changes the maintenance manager to start triggering flushes at the
60% heap usage threshold, and adjusts the memory limiting to only start
rejecting writes at the 80% threshold.

This is based on analysis in [1] ("After tweaking memory backpressure
rejection") in which we found that earlier flushing could keep the
system from making too many rejections.

I ran tpch_real_world with SF=300 as described in the doc below before
and after the patch. Some key metrics:

                Before       After
Mem rejections  1,162,527    62,790    (18.5x fewer)
Wall time       3156s        2490s     (1.26x faster)
Server CPU      37,348s      72,873s   (1.95x less efficient)

This matches the results from the original experiment. Because the
writes come in faster, we don't get enough time to run compactions, and
so each write ends up taking more CPU. Some future optimizations are
still pending which will reduce the CPU time per bloom lookup.
Regardless, this new behavior is configurable such that, if overall CPU
efficiency is more important than wall time, we could bump the "memory
pressure threshold" up higher than the "memory rejection threshold".

[1] https://docs.google.com/document/d/1U1IXS1XD2erZyq8_qG81A1gZaCeHcq2i0unea_eEf5c/edit#heading=h.gw87heq4fq6l

Change-Id: I8b7e703c82ac14fbce3a699bdf6a2f0fb4ed93a1
Reviewed-on: http://gerrit.cloudera.org:8080/6802
Tested-by: Kudu Jenkins
Reviewed-by: Adar Dembo <ad...@cloudera.com>
Reviewed-by: David Ribeiro Alves <da...@gmail.com>
---
M src/kudu/util/maintenance_manager.cc
M src/kudu/util/process_memory.cc
M src/kudu/util/process_memory.h
3 files changed, 31 insertions(+), 2 deletions(-)

Approvals:
  David Ribeiro Alves: Looks good to me, approved
  Adar Dembo: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/6802
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: I8b7e703c82ac14fbce3a699bdf6a2f0fb4ed93a1
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>