You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Alexey Serbin (Code Review)" <ge...@cloudera.org> on 2019/03/04 19:02:57 UTC

[kudu-CR] [master] introduce TTL-based cache

Alexey Serbin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/12652


Change subject: [master] introduce TTL-based cache
......................................................................

[master] introduce TTL-based cache

Change-Id: If214fe19810728fa590d1e0a3e5b2da86b020ac2
---
M src/kudu/master/CMakeLists.txt
A src/kudu/master/ttl_cache-test.cc
A src/kudu/master/ttl_cache.h
3 files changed, 233 insertions(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/12652/1
-- 
To view, visit http://gerrit.cloudera.org:8080/12652
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: If214fe19810728fa590d1e0a3e5b2da86b020ac2
Gerrit-Change-Number: 12652
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>

[kudu-CR] [util] introduce TTL-based cache

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/12652 )

Change subject: [util] introduce TTL-based cache
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12652/2//COMMIT_MSG
Commit Message:

PS2: 
I think I'll try an alternative approach: adapting already existing LRU cache in util/cache.{h,cc} to be used for the TTL cache.  One of the anticipated benefits of the alternative approach is to make sure the cache eviction policy depends not only on the capacity of the cache expressed in number of items, but also a limit expressed in amount of maximum memory used for the cache.

If I found it's easy to add the memory tracking into this implementation, I'll get back to this separate (simple and small) implementation.



-- 
To view, visit http://gerrit.cloudera.org:8080/12652
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If214fe19810728fa590d1e0a3e5b2da86b020ac2
Gerrit-Change-Number: 12652
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Greg Solovyev <gs...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 12 Mar 2019 17:02:16 +0000
Gerrit-HasComments: Yes

[kudu-CR] [master] introduce TTL-based cache

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12652 )

Change subject: [master] introduce TTL-based cache
......................................................................


Patch Set 1:

> Yep, I took a look at that cache prior starting working on this one.  I think the cache in util/cache.{cc,h} looks very solid but a bit complicated for this simple case.  Ideally I'd like to have the simple TTL-based cache with all the nice metrics and memtracker integration.  Probably, I'll need to think about some common base to use some common code there and here.

Yeah I think both of those features are table stakes for the Sentry privilege cache. Metrics especially; you could make the argument that memtracker integration isn't as important if the footprint of the cache is low.

You can also look at util/file_cache.{cc,h}. The overall model doesn't make sense for this use case, but it's an example of having a different interface on top of the more generic util/cache.{cc,h}. The same is true for cfile/block_cache.{cc.h}, though to a lesser extent.


-- 
To view, visit http://gerrit.cloudera.org:8080/12652
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If214fe19810728fa590d1e0a3e5b2da86b020ac2
Gerrit-Change-Number: 12652
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Greg Solovyev <gs...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 06 Mar 2019 20:06:29 +0000
Gerrit-HasComments: No

[kudu-CR] [util] introduce TTL-based cache

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Andrew Wong, Adar Dembo, Greg Solovyev, Hao Hao, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/12652

to look at the new patch set (#2).

Change subject: [util] introduce TTL-based cache
......................................................................

[util] introduce TTL-based cache

Added unit tests as well.

Change-Id: If214fe19810728fa590d1e0a3e5b2da86b020ac2
---
M src/kudu/util/CMakeLists.txt
M src/kudu/util/metrics.cc
M src/kudu/util/metrics.h
A src/kudu/util/ttl_cache-test.cc
A src/kudu/util/ttl_cache.h
A src/kudu/util/ttl_cache_metrics.cc
A src/kudu/util/ttl_cache_metrics.h
7 files changed, 512 insertions(+), 0 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/52/12652/2
-- 
To view, visit http://gerrit.cloudera.org:8080/12652
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If214fe19810728fa590d1e0a3e5b2da86b020ac2
Gerrit-Change-Number: 12652
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Greg Solovyev <gs...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] [master] introduce TTL-based cache

Posted by "Greg Solovyev (Code Review)" <ge...@cloudera.org>.
Greg Solovyev has posted comments on this change. ( http://gerrit.cloudera.org:8080/12652 )

Change subject: [master] introduce TTL-based cache
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/12652/1/src/kudu/master/ttl_cache.h
File src/kudu/master/ttl_cache.h:

http://gerrit.cloudera.org:8080/#/c/12652/1/src/kudu/master/ttl_cache.h@45
PS1, Line 45:     CHECK(capacity_ > 0) << "cache capacity should be greater than 0";
I wonder if there is a legitimate case for 0-capacity cache, such as disabling the cache altogether. Will there be another way to disable the cache?



-- 
To view, visit http://gerrit.cloudera.org:8080/12652
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If214fe19810728fa590d1e0a3e5b2da86b020ac2
Gerrit-Change-Number: 12652
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Greg Solovyev <gs...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Mon, 04 Mar 2019 19:48:33 +0000
Gerrit-HasComments: Yes

[kudu-CR] [util] introduce TTL-based cache

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/12652 )

Change subject: [util] introduce TTL-based cache
......................................................................


Patch Set 2:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/12652/1/src/kudu/master/ttl_cache-test.cc
File src/kudu/master/ttl_cache-test.cc:

http://gerrit.cloudera.org:8080/#/c/12652/1/src/kudu/master/ttl_cache-test.cc@79
PS1, Line 79: 
> Could you add a comment explaining why you expect cache size to be 5 here?
Sure -- that's because cache's size is always capped by its capacity parameter regardless number of Put() calls.

Done.


http://gerrit.cloudera.org:8080/#/c/12652/1/src/kudu/master/ttl_cache-test.cc@85
PS1, Line 85: 
> nit, typo: "should"?
Done


http://gerrit.cloudera.org:8080/#/c/12652/1/src/kudu/master/ttl_cache.h
File src/kudu/master/ttl_cache.h:

PS1: 
> This probably belongs in /util, given there's nothing master-specific about
Done.

The code should be in a .h file since it's a class template.


http://gerrit.cloudera.org:8080/#/c/12652/1/src/kudu/master/ttl_cache.h@45
PS1, Line 45: 
> I wonder if there is a legitimate case for 0-capacity cache, such as disabl
0-capacity cache is not legitimate.  Yes, there is a way to disable the cache: see https://gerrit.cloudera.org/#/c/12653/1/src/kudu/master/sentry_authz_provider.cc@152

I prefer to use explicit semantics.


http://gerrit.cloudera.org:8080/#/c/12652/1/src/kudu/master/ttl_cache.h@54
PS1, Line 54: 
> nit: FindOrNull? Below too.
Done


http://gerrit.cloudera.org:8080/#/c/12652/1/src/kudu/master/ttl_cache.h@81
PS1, Line 81: 
> Do you expect to use this outside of test? If so, could you add a function-
Done


http://gerrit.cloudera.org:8080/#/c/12652/1/src/kudu/master/ttl_cache.h@88
PS1, Line 88: 
> I found this part of the API is kind of surprising. I would have expected t
Yep, that approach differs from cache to cache.  I'll change it to reflect the most common expectations.


http://gerrit.cloudera.org:8080/#/c/12652/1/src/kudu/master/ttl_cache.h@100
PS1, Line 100: 
> nit: number of elements in
Done


http://gerrit.cloudera.org:8080/#/c/12652/1/src/kudu/master/ttl_cache.h@105
PS1, Line 105: 
> nit: writing this without the addition (very slightly) reduces the cycles i
Whoops, I'm surprised I wrote that.


http://gerrit.cloudera.org:8080/#/c/12652/1/src/kudu/master/ttl_cache.h@108
PS1, Line 108: 
             : 
             : 
             : 
             : 
> If you did a D/CHECK that (lookup_map_.size() == capacity_), you could get 
That's a good idea -- my first approach was to write some generic method, but I agree there is no need for that in this implementation.


http://gerrit.cloudera.org:8080/#/c/12652/1/src/kudu/master/ttl_cache.h@124
PS1, Line 124: 
> Add a comment indicating that the cache maintains that this list is sorted 
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/12652
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If214fe19810728fa590d1e0a3e5b2da86b020ac2
Gerrit-Change-Number: 12652
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Greg Solovyev <gs...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Mon, 11 Mar 2019 23:32:07 +0000
Gerrit-HasComments: Yes

[kudu-CR] [master] introduce TTL-based cache

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/12652 )

Change subject: [master] introduce TTL-based cache
......................................................................


Patch Set 1:

Did you look into augmenting the existing generic cache code in util/cache.{cc,h}? It's got rougher edges (e.g. you have to serialize/copy keys and values rather than using C++ objects), but it's also received a fair amount of performance tuning already, and supports some neat features like metrics and memtracker integration.


-- 
To view, visit http://gerrit.cloudera.org:8080/12652
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If214fe19810728fa590d1e0a3e5b2da86b020ac2
Gerrit-Change-Number: 12652
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Greg Solovyev <gs...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 06 Mar 2019 08:30:59 +0000
Gerrit-HasComments: No

[kudu-CR] [master] introduce TTL-based cache

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/12652 )

Change subject: [master] introduce TTL-based cache
......................................................................


Patch Set 1:

> Did you look into augmenting the existing generic cache code in
 > util/cache.{cc,h}? It's got rougher edges (e.g. you have to
 > serialize/copy keys and values rather than using C++ objects), but
 > it's also received a fair amount of performance tuning already, and
 > supports some neat features like metrics and memtracker
 > integration.

Thank you for the comment.

Yep, I took a look at that cache prior starting working on this one.  I think the cache in util/cache.{cc,h} looks very solid but a bit complicated for this simple case.  Ideally I'd like to have the simple TTL-based cache with all the nice metrics and memtracker integration.  Probably, I'll need to think about some common base to use some common code there and here.


-- 
To view, visit http://gerrit.cloudera.org:8080/12652
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If214fe19810728fa590d1e0a3e5b2da86b020ac2
Gerrit-Change-Number: 12652
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Greg Solovyev <gs...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 06 Mar 2019 19:43:10 +0000
Gerrit-HasComments: No

[kudu-CR] [master] introduce TTL-based cache

Posted by "Greg Solovyev (Code Review)" <ge...@cloudera.org>.
Greg Solovyev has posted comments on this change. ( http://gerrit.cloudera.org:8080/12652 )

Change subject: [master] introduce TTL-based cache
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/12652/1/src/kudu/master/ttl_cache-test.cc
File src/kudu/master/ttl_cache-test.cc:

http://gerrit.cloudera.org:8080/#/c/12652/1/src/kudu/master/ttl_cache-test.cc@79
PS1, Line 79: ASSERT_EQ(5, cache.size());
Could you add a comment explaining why you expect cache size to be 5 here?


http://gerrit.cloudera.org:8080/#/c/12652/1/src/kudu/master/ttl_cache-test.cc@85
PS1, Line 85: shoule
nit, typo: "should"?



-- 
To view, visit http://gerrit.cloudera.org:8080/12652
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If214fe19810728fa590d1e0a3e5b2da86b020ac2
Gerrit-Change-Number: 12652
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Greg Solovyev <gs...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Mon, 04 Mar 2019 19:46:45 +0000
Gerrit-HasComments: Yes

[kudu-CR] [master] introduce TTL-based cache

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/12652 )

Change subject: [master] introduce TTL-based cache
......................................................................


Patch Set 1:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/12652/1/src/kudu/master/ttl_cache.h
File src/kudu/master/ttl_cache.h:

PS1: 
This probably belongs in /util, given there's nothing master-specific about it. Also maybe pull some of this out into a cc file?

Also, please add function-level comments on how a user should expect the public API to work wrt TTLs, eviction, etc.


http://gerrit.cloudera.org:8080/#/c/12652/1/src/kudu/master/ttl_cache.h@54
PS1, Line 54:     const auto lookup_it = lookup_map_.find(key);
nit: FindOrNull? Below too.


http://gerrit.cloudera.org:8080/#/c/12652/1/src/kudu/master/ttl_cache.h@81
PS1, Line 81: if (value) {
Do you expect to use this outside of test? If so, could you add a function-level comment indicating this is allowed by the Get API?


http://gerrit.cloudera.org:8080/#/c/12652/1/src/kudu/master/ttl_cache.h@88
PS1, Line 88:  lookup_map_.erase(lookup_it);
I found this part of the API is kind of surprising. I would have expected the TTL would prevent us from returning an expired entry. What's the rationale behind returning an expired entry?

Also probably worth adding the behavior to the function-level comment.


http://gerrit.cloudera.org:8080/#/c/12652/1/src/kudu/master/ttl_cache.h@100
PS1, Line 100: contents of
nit: number of elements in


http://gerrit.cloudera.org:8080/#/c/12652/1/src/kudu/master/ttl_cache.h@105
PS1, Line 105: lookup_map_.size() + 1 <= capacity_
nit: writing this without the addition (very slightly) reduces the cycles it takes to understand this.


http://gerrit.cloudera.org:8080/#/c/12652/1/src/kudu/master/ttl_cache.h@108
PS1, Line 108: while (lookup_map_.size() >= capacity_ && !lookup_map_.empty()) {
             :       const auto& oldest_entry = entry_list_.back();
             :       CHECK_EQ(1, lookup_map_.erase(oldest_entry.first));
             :       entry_list_.pop_back();
             :     }
If you did a D/CHECK that (lookup_map_.size() == capacity_), you could get rid of the loop.


http://gerrit.cloudera.org:8080/#/c/12652/1/src/kudu/master/ttl_cache.h@124
PS1, Line 124:   std::list<std::pair<K, V>> entry_list_;
Add a comment indicating that the cache maintains that this list is sorted by expiration time of its expiration time.



-- 
To view, visit http://gerrit.cloudera.org:8080/12652
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: If214fe19810728fa590d1e0a3e5b2da86b020ac2
Gerrit-Change-Number: 12652
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Mon, 04 Mar 2019 19:42:26 +0000
Gerrit-HasComments: Yes

[kudu-CR] [util] introduce TTL-based cache

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has abandoned this change. ( http://gerrit.cloudera.org:8080/12652 )

Change subject: [util] introduce TTL-based cache
......................................................................


Abandoned

A patch with alternative approach is coming: this patch will obsoleted.
-- 
To view, visit http://gerrit.cloudera.org:8080/12652
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: abandon
Gerrit-Change-Id: If214fe19810728fa590d1e0a3e5b2da86b020ac2
Gerrit-Change-Number: 12652
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Greg Solovyev <gs...@cloudera.com>
Gerrit-Reviewer: Hao Hao <ha...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)