You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Yifan Zhang (Code Review)" <ge...@cloudera.org> on 2020/08/10 16:50:06 UTC

[kudu-CR] KUDU-3180: prioritize larger mem-stores in time-based flusing

Yifan Zhang has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16319


Change subject: KUDU-3180: prioritize larger mem-stores in time-based flusing
......................................................................

KUDU-3180: prioritize larger mem-stores in time-based flusing

Current time-based flush policy will always pick a mem-store
that haven't been flushed in a long time instead of a mem-store
anchoring more memory, this may lead to:
- more memory used by mem-stores.
- more small rowsets on disk so we need to do more compaction.

This patch improve current flush policy by considering both
mem-stores' size and time since last flush. When a mem-store
become large or old enough, it will be more likely to flush,
then we can avoid anchoring large (but below the threshold)
mem-stores or WALs for too long.

Change-Id: I0a826643709a4990e40b0a49f89f4ea34f14163b
---
M src/kudu/tablet/tablet_replica-test.cc
M src/kudu/tablet/tablet_replica_mm_ops.cc
2 files changed, 36 insertions(+), 27 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/19/16319/1
-- 
To view, visit http://gerrit.cloudera.org:8080/16319
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I0a826643709a4990e40b0a49f89f4ea34f14163b
Gerrit-Change-Number: 16319
Gerrit-PatchSet: 1
Gerrit-Owner: Yifan Zhang <ch...@163.com>

[kudu-CR] KUDU-3180: prioritize larger mem-stores in time-based flusing

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16319 )

Change subject: KUDU-3180: prioritize larger mem-stores in time-based flusing
......................................................................


Patch Set 1: Code-Review+1

(2 comments)

http://gerrit.cloudera.org:8080/#/c/16319/1/src/kudu/tablet/tablet_replica_mm_ops.cc
File src/kudu/tablet/tablet_replica_mm_ops.cc:

http://gerrit.cloudera.org:8080/#/c/16319/1/src/kudu/tablet/tablet_replica_mm_ops.cc@78
PS1, Line 78: "Number of milliseconds after which a non-empty MRS/DMS will reach "
            :              "'full perf improvement' in time-based flushing.");
nit: "full perf improvement" may lead users to believe that a higher score cannot be given to these ops. How about something like:

"Number of milliseconds after which the time-based performance improvement score of a non-empty MRS/DMS flush op will reach its maximum value. The score may further increase as the MRS/DMS grows in size."


http://gerrit.cloudera.org:8080/#/c/16319/1/src/kudu/tablet/tablet_replica_mm_ops.cc@120
PS1, Line 120: std::max(1.0, extra_mb)
> We want to distinguish between mem-stores that above the threshold and that
Got it. Thanks for the clarification!

At first I had concerns about whether this would starve flushing for older mem-stores, but even if it did, this change doesn't make it much worse, given we can restore something close to the original behavior by increasing the flush threshold by 1.



-- 
To view, visit http://gerrit.cloudera.org:8080/16319
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0a826643709a4990e40b0a49f89f4ea34f14163b
Gerrit-Change-Number: 16319
Gerrit-PatchSet: 1
Gerrit-Owner: Yifan Zhang <ch...@163.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yifan Zhang <ch...@163.com>
Gerrit-Comment-Date: Wed, 12 Aug 2020 22:00:56 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-3180: prioritize larger mem-stores in time-based flusing

Posted by "Yifan Zhang (Code Review)" <ge...@cloudera.org>.
Yifan Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16319 )

Change subject: KUDU-3180: prioritize larger mem-stores in time-based flusing
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16319/1/src/kudu/tablet/tablet_replica_mm_ops.cc
File src/kudu/tablet/tablet_replica_mm_ops.cc:

http://gerrit.cloudera.org:8080/#/c/16319/1/src/kudu/tablet/tablet_replica_mm_ops.cc@120
PS1, Line 120: std::max(1.0, extra_mb)
> Clever!
We want to distinguish between mem-stores that above the threshold and that doesn't reach the threshold, now if there is a mem-store over threshold less than 1MB, it will have the same perf_improvement with a small mem-store that has been there for a while. This change ensures over threshold mem-stores always have a higher perf_improvement than those below threshold.

With this change I think we could reduce memory usage and avoid generating too much small diskrowsets. However I didn't test this on the real production cluster and it's difficult to reproduce actual workloads in a test cluster. Of course enable tablet throttler could also help our cluster I think.



-- 
To view, visit http://gerrit.cloudera.org:8080/16319
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0a826643709a4990e40b0a49f89f4ea34f14163b
Gerrit-Change-Number: 16319
Gerrit-PatchSet: 1
Gerrit-Owner: Yifan Zhang <ch...@163.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yifan Zhang <ch...@163.com>
Gerrit-Comment-Date: Tue, 11 Aug 2020 11:37:27 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-3180: prioritize larger mem-stores in time-based flusing

Posted by "Yifan Zhang (Code Review)" <ge...@cloudera.org>.
Yifan Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16319 )

Change subject: KUDU-3180: prioritize larger mem-stores in time-based flusing
......................................................................


Patch Set 2:

> Patch Set 2: Code-Review+2
> 
> It'd be nice if you could try deploying this on a real (maybe small) cluster.

I deployed kudu on a real cluster and upgrade some tservers to a new version with this change, all the tservers have same configurations, then I insert 10,000,000 rows to each tserver, new version tservers could flush more bytes in the first period of time.


-- 
To view, visit http://gerrit.cloudera.org:8080/16319
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0a826643709a4990e40b0a49f89f4ea34f14163b
Gerrit-Change-Number: 16319
Gerrit-PatchSet: 2
Gerrit-Owner: Yifan Zhang <ch...@163.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yifan Zhang <ch...@163.com>
Gerrit-Comment-Date: Fri, 14 Aug 2020 05:20:48 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-3180: prioritize larger mem-stores in time-based flusing

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16319 )

Change subject: KUDU-3180: prioritize larger mem-stores in time-based flusing
......................................................................


Patch Set 2: Code-Review+2

It'd be nice if you could try deploying this on a real (maybe small) cluster.


-- 
To view, visit http://gerrit.cloudera.org:8080/16319
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0a826643709a4990e40b0a49f89f4ea34f14163b
Gerrit-Change-Number: 16319
Gerrit-PatchSet: 2
Gerrit-Owner: Yifan Zhang <ch...@163.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yifan Zhang <ch...@163.com>
Gerrit-Comment-Date: Thu, 13 Aug 2020 05:10:11 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-3180: prioritize larger mem-stores in time-based flusing

Posted by "Yifan Zhang (Code Review)" <ge...@cloudera.org>.
Yifan Zhang has posted comments on this change. ( http://gerrit.cloudera.org:8080/16319 )

Change subject: KUDU-3180: prioritize larger mem-stores in time-based flusing
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16319/1/src/kudu/tablet/tablet_replica_mm_ops.cc
File src/kudu/tablet/tablet_replica_mm_ops.cc:

http://gerrit.cloudera.org:8080/#/c/16319/1/src/kudu/tablet/tablet_replica_mm_ops.cc@78
PS1, Line 78: "Number of milliseconds after which the time-based performance improvement "
            :              "score of a non-empty MRS/DMS flush op will reach i
> nit: "full perf improvement" may lead users to believe that a higher score 
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/16319
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0a826643709a4990e40b0a49f89f4ea34f14163b
Gerrit-Change-Number: 16319
Gerrit-PatchSet: 2
Gerrit-Owner: Yifan Zhang <ch...@163.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yifan Zhang <ch...@163.com>
Gerrit-Comment-Date: Thu, 13 Aug 2020 02:56:23 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-3180: prioritize larger mem-stores in time-based flusing

Posted by "Yifan Zhang (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Andrew Wong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16319

to look at the new patch set (#2).

Change subject: KUDU-3180: prioritize larger mem-stores in time-based flusing
......................................................................

KUDU-3180: prioritize larger mem-stores in time-based flusing

Current time-based flush policy will always pick a mem-store
that haven't been flushed in a long time instead of a mem-store
anchoring more memory, this may lead to:
- more memory used by mem-stores.
- more small rowsets on disk so we need to do more compaction.

This patch improve current flush policy by considering both
mem-stores' size and time since last flush. When a mem-store
become large or old enough, it will be more likely to flush,
then we can avoid anchoring large (but below the threshold)
mem-stores or WALs for too long.

Change-Id: I0a826643709a4990e40b0a49f89f4ea34f14163b
---
M src/kudu/tablet/tablet_replica-test.cc
M src/kudu/tablet/tablet_replica_mm_ops.cc
2 files changed, 37 insertions(+), 27 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/19/16319/2
-- 
To view, visit http://gerrit.cloudera.org:8080/16319
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0a826643709a4990e40b0a49f89f4ea34f14163b
Gerrit-Change-Number: 16319
Gerrit-PatchSet: 2
Gerrit-Owner: Yifan Zhang <ch...@163.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yifan Zhang <ch...@163.com>

[kudu-CR] KUDU-3180: prioritize larger mem-stores in time-based flusing

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16319 )

Change subject: KUDU-3180: prioritize larger mem-stores in time-based flusing
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/16319/1/src/kudu/tablet/tablet_replica_mm_ops.cc
File src/kudu/tablet/tablet_replica_mm_ops.cc:

http://gerrit.cloudera.org:8080/#/c/16319/1/src/kudu/tablet/tablet_replica_mm_ops.cc@120
PS1, Line 120: std::max(1.0, extra_mb)
Clever!

That said, I'm curious why we'd prefer this change over, for instance, lowering --flush_threshold_mb by 1. You mentioned on the ticket that lowering the threshold isn't viable because it conflicts with the block cache capacity. That said, does this change help your cluster much?



-- 
To view, visit http://gerrit.cloudera.org:8080/16319
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0a826643709a4990e40b0a49f89f4ea34f14163b
Gerrit-Change-Number: 16319
Gerrit-PatchSet: 1
Gerrit-Owner: Yifan Zhang <ch...@163.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Mon, 10 Aug 2020 22:29:32 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-3180: prioritize larger mem-stores in time-based flusing

Posted by "Yifan Zhang (Code Review)" <ge...@cloudera.org>.
Yifan Zhang has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16319 )

Change subject: KUDU-3180: prioritize larger mem-stores in time-based flusing
......................................................................

KUDU-3180: prioritize larger mem-stores in time-based flusing

Current time-based flush policy will always pick a mem-store
that haven't been flushed in a long time instead of a mem-store
anchoring more memory, this may lead to:
- more memory used by mem-stores.
- more small rowsets on disk so we need to do more compaction.

This patch improve current flush policy by considering both
mem-stores' size and time since last flush. When a mem-store
become large or old enough, it will be more likely to flush,
then we can avoid anchoring large (but below the threshold)
mem-stores or WALs for too long.

Change-Id: I0a826643709a4990e40b0a49f89f4ea34f14163b
Reviewed-on: http://gerrit.cloudera.org:8080/16319
Tested-by: Kudu Jenkins
Reviewed-by: Andrew Wong <aw...@cloudera.com>
---
M src/kudu/tablet/tablet_replica-test.cc
M src/kudu/tablet/tablet_replica_mm_ops.cc
2 files changed, 37 insertions(+), 27 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Andrew Wong: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/16319
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I0a826643709a4990e40b0a49f89f4ea34f14163b
Gerrit-Change-Number: 16319
Gerrit-PatchSet: 3
Gerrit-Owner: Yifan Zhang <ch...@163.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Yifan Zhang <ch...@163.com>