You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Adar Dembo (Code Review)" <ge...@cloudera.org> on 2017/05/12 04:04:31 UTC

[kudu-CR] WIP: threadpool: token-based sequencing

Hello Todd Lipcon,

I'd like you to do a code review.  Please visit

    http://gerrit.cloudera.org:8080/6874

to review the following change.

Change subject: WIP: threadpool: token-based sequencing
......................................................................

WIP: threadpool: token-based sequencing

Here's a reaaaaally rough implementation of token-based sequencing for
util/threadpool. The idea is to allow multiple contexts to share a single
threadpool with a high number of threads while also ensuring that the pool
executes tasks belonging to each context in order. Previously this was only
achievable via one-singleton-threadpool-per-context, which grossly inflates
the total number of threads.

WIP because it's incomplete, is poorly documented, needs a lot more tests,
is suboptimal from an allocation perspective (too much map access?) and is
probably outright broken.

Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
---
M src/kudu/util/threadpool-test.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
3 files changed, 158 insertions(+), 19 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/6874/1
-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] threadpool: token-based task sequencing

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 8:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6874/8/src/kudu/util/threadpool.cc
File src/kudu/util/threadpool.cc:

Line 278:     if (!s->not_running_cond.TimedWait(delta)) {
on spurious wakeup, won't this wait longer than expected? I think implementing WaitUntil in terms of a deadline cond var wait would be better, and convert this one to use the deadline?


http://gerrit.cloudera.org:8080/#/c/6874/8/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS8, Line 188: Upon completion, 't' has been released
not following this bit. I'm reading this to mean "Release()" below, but that doesn't make sense. I'd expect 'this' to be released prior to the move.


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] threadpool: token-based task sequencing

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 12: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] threadpool: token-based task sequencing

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 12:

leaving it open in case todd wants to take another look

-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] threadpool: token-based task sequencing

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 12:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6874/12/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

Line 441:   std::deque<ThreadPool::Task> entries;
> warning: invalid case style for private member 'entries' [readability-ident
Done


Line 445:   ConditionVariable not_running_cond;
> warning: invalid case style for private member 'not_running_cond' [readabil
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] threadpool: token-based task sequencing

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 3:

(2 comments)

oops had two lingering comments, not sure if relevant, but might as well push

http://gerrit.cloudera.org:8080/#/c/6874/3/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS3, Line 203: FIFO
nit: missing "queue"


PS3, Line 220: The execution order will be A1, T1, T2, A2, A3.
this is a _possible_ execution order, right?


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] threadpool: token-based task sequencing

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 7: Code-Review+1

LGTM.  Maybe other reviewers have some other things to point at.

-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] threadpool: token-based task sequencing

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6874

to look at the new patch set (#9).

Change subject: threadpool: token-based task sequencing
......................................................................

threadpool: token-based task sequencing

This patch adds task sequencing to util/threadpool. Task sequencing allows
M contexts to share a single pool with N threads while also ensuring that
the pool executes tasks belonging to each context in order. Previously this
was only achievable via "one singleton pool per context", which grossly
inflated the total number of threads and overall state.

The new logic is implemented in ThreadPoolToken. Tasks submitted via tokens
are logically grouped together. They may also be sequenced, depending on the
token's execution mode. Tokens expose Wait() variants which will block on
the task group. Tokens may also be shut down before being destroyed, which
has semantics equivalent to shutting down the pool itself (i.e. tasks may no
longer be submitted with these tokens).

Some notes:
- I evaluated two other implementations. In one, tokens had an implicit
  lifecycle that was automatically managed by the threadpool. While simpler
  for clients, the threadpool was more inefficient with more allocations and
  deallocations in each submission. In the other, token submission was built
  using regular task submission. This afforded a certain separation between
  the "core" of the threadpool and the token logic, but it complicated
  locking and tracking of queued tasks.
- I tried to keep submission (whether token-based or tokenless) fast. Just
  the change from std::list to std::deque for the main queue ought to
  improve performance of tokenless submissions. The next bottleneck is
  likely to be lock contention on the global threadpool lock.

Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
---
M src/kudu/util/debug-util.h
M src/kudu/util/threadpool-test.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
4 files changed, 981 insertions(+), 91 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/6874/9
-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] WIP: threadpool: token-based sequencing

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: WIP: threadpool: token-based sequencing
......................................................................


Patch Set 1:

> (1 comment)
 > - does this affect the performance of a normal ThreadPool that
 > doesn't use the token feature? From looking at the code it seems
 > like it would just be a few branches and no extra allocation, etc,
 > but would be good to be sure of that
 
Yes, that was my intent. However, we can also go in one of two other directions:
1. Recognize that there's no real use case for token and non-token submission in the same pool and further split them up. Some ideas here:
a. A common ThreadPool interface with two implementations, sharing code where possible.
b. Expose the difference in behavior via templatization (i.e. traits).
Neither of these approaches are particularly clean due to the difference in Submit() and Wait() (though I'll admit I didn't think about it very hard) which is why I haven't pursued them yet.
2. Reduce code and complexity by implementing non-token submission as token-based submission. Dan argued for this when he and were discussing the idea yesterday; it's not a bad one if we further optimize token-based submission and avoid allocations.

 > - not sure about using a string for a token vs having a more opaque
 > token, such as ThreadPool::ObtainSequencer() returning a
 > 'Sequencer' struct (which could well be just an int or somesuch)
 
Yes, Dan suggested something similar (though he went a step further and suggested that Submit()/Wait() should happen through this opaque "handle"). The advantage of using an int is that we could convert the map of token queues into a array or vector, and explicit token lifecycle means we can probably avoid shared ownership on token queues.

The disadvantage is that now ThreadPool users have to manage the lifecycle of the token, though you could argue they had to already in that they probably must call WaitFor(token) before destructing anyway.

 > - docs wise, would be good to document fairness and
 > starvation-freedom properties. Particularly, the fact that this
 > exhibits neither :-D Probably fine since I think we would typically
 > use this feature in threadpools with no max thread count specified.
 > (is it worth actually prohibiting use of tokens in a pool with a
 > max thread count? or is it just "YMMV" situation?)

Agreed. The current implementation prioritizes non-token over token submission. Submission order is preserved for all tasks belonging to a token. The use of a FIFO for token rotation means one busy token can't starve the rest.

Note that "no max thread count" means "max of <num_cpus> threads", so it's very likely that num_tokens > num_cpus.

-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] threadpool: token-based task sequencing

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 12: Code-Review+1

(3 comments)

Looks pretty good to me

http://gerrit.cloudera.org:8080/#/c/6874/12/src/kudu/util/threadpool.cc
File src/kudu/util/threadpool.cc:

PS12, Line 134: m
Is this different than pool_->lock_ ?


Line 196:       while (state() != State::QUIESCED) {
Could use a comment indicating who is responsible for transitioning the token to QUIESCED.


http://gerrit.cloudera.org:8080/#/c/6874/12/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS12, Line 410: m
Would be nice to mention what 'm' is for.


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] threadpool: token-based task sequencing

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has submitted this change and it was merged.

Change subject: threadpool: token-based task sequencing
......................................................................


threadpool: token-based task sequencing

This patch adds task sequencing to util/threadpool. Task sequencing allows
M contexts to share a single pool with N threads while also ensuring that
the pool executes tasks belonging to each context in order. Previously this
was only achievable via "one singleton pool per context", which grossly
inflated the total number of threads and overall state.

The new logic is implemented in ThreadPoolToken. Tasks submitted via tokens
are logically grouped together. They may also be sequenced, depending on the
token's execution mode. Tokens expose Wait() variants which will block on
the task group. Tokens may also be shut down before being destroyed, which
has semantics equivalent to shutting down the pool itself (i.e. tasks may no
longer be submitted with these tokens).

Some notes:
- I evaluated two other implementations. In one, tokens had an implicit
  lifecycle that was automatically managed by the threadpool. While simpler
  for clients, the threadpool was more inefficient with more allocations and
  deallocations in each submission. In the other, token submission was built
  using regular task submission. This afforded a certain separation between
  the "core" of the threadpool and the token logic, but it complicated
  locking and tracking of queued tasks.
- I tried to keep submission (whether token-based or tokenless) fast. Just
  the change from std::list to std::deque for the main queue ought to
  improve performance of tokenless submissions. The next bottleneck is
  likely to be lock contention on the global threadpool lock.

Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Reviewed-on: http://gerrit.cloudera.org:8080/6874
Tested-by: Kudu Jenkins
Reviewed-by: Todd Lipcon <to...@apache.org>
---
M src/kudu/util/debug-util.h
M src/kudu/util/threadpool-test.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
4 files changed, 991 insertions(+), 91 deletions(-)

Approvals:
  Todd Lipcon: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 15
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] threadpool: token-based task sequencing

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 12:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6874/11/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

Line 138: // 2. CONCURRENT: submitted tasks may be run in parallel. This isn't unlike
> Suppose you have n contexts sharing a single threadpool. If tokens aren't u
got it.


PS11, Line 190: 
              :   // Waits for the pool to reach the idle state, or until 'delta' time elapses.
              :   // Returns true if the pool reached the idle state, false otherwise
> Er, why? It is a "new paragraph" within the NewToken() documentation, so I 
I meant that the whole comment block belongs with the function rather than the definition, but it's ok


http://gerrit.cloudera.org:8080/#/c/6874/12/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS12, Line 410: m
> Moot now that pool->lock is referenced directly.
hm, did you forget to post the newest rev with this change?


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] threadpool: token-based task sequencing

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6874

to look at the new patch set (#6).

Change subject: threadpool: token-based task sequencing
......................................................................

threadpool: token-based task sequencing

This patch adds task sequencing to util/threadpool. Task sequencing allows
M contexts to share a single pool with N threads while also ensuring that
the pool executes tasks belonging to each context in order. Previously this
was only achievable via "one singleton pool per context", which grossly
inflated the total number of threads and overall state.

A group of tasks are sequenced by submission to a dedicated "slot". A client
obtains exclusive access to a slot via AllocateTokenSlot(), which returns a
"token" that can be used for task submission/waiting. When the token is
destroyed (or Release() is called), the slot is returned to the pool. For
implementation simplicity, clients must wait for a token's outstanding tasks
to complete before destroying their token.

Some notes:
- Slots and tokens are mapped 1-1 so they could theoretically be combined,
  but I prefer this separation of concerns.
- The current implementation requires tokens to have no outstanding tasks
  when they are released. This was a conscious choice made so that token
  destruction (especially when done as part of a larger object graph
  destruct) won't take an unusually long amount of time.
- I evaluated two other implementations. In one, tokens had an implicit
  lifecycle that was automatically managed by the threadpool. While simpler
  for clients, the threadpool was more inefficient with more allocations and
  deallocations in each submission. In the other, token submission was built
  using regular task submission. This afforded a certain separation between
  the "core" of the threadpool and the token logic, but it complicated
  locking and tracking of queued tasks.
- I tried to keep submission (whether token-based or tokenless) fast. Just
  the change from std::list to std::deque for the main queue ought to
  improve performance of tokenless submissions. The next bottleneck is
  likely to be lock contention on the global threadpool lock.

Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
---
M src/kudu/util/debug-util.h
M src/kudu/util/threadpool-test.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
4 files changed, 947 insertions(+), 74 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/6874/6
-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] threadpool: token-based task sequencing

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 3:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/6874/3/src/kudu/util/threadpool-test.cc
File src/kudu/util/threadpool-test.cc:

Line 419: TEST_F(ThreadPoolTest, TestTokenSubmitsProcessedSerially) {
> I think this may fail on a single CPU system.
I'm not seeing why; can you elaborate?


http://gerrit.cloudera.org:8080/#/c/6874/3/src/kudu/util/threadpool.cc
File src/kudu/util/threadpool.cc:

PS3, Line 146: std::move
> I think you want std::forward here.
I can't say I understand the difference between move() and forward() (even after reading an article or two about it), but I've changed both of these moves to be forwards.


http://gerrit.cloudera.org:8080/#/c/6874/3/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

Line 198:   DISALLOW_COPY_AND_ASSIGN(SequenceToken);
> I was curious about this, so I looked it up in Effective Modern C++ (page 1
I think I will since it adds information for people who aren't familiar with that aspect of move operations.


PS3, Line 203: FIFO
> nit: missing "queue"
Done


PS3, Line 220: The execution order will be A1, T1, T2, A2, A3.
> this is a _possible_ execution order, right?
Dan: A2 isn't allowed to run until A1 has finished. You're right though that if A1 finishes before A2 is submitted, A2 will run second. I'll clarify that.

I should have clarified that this execution order presupposes that the tasks are long-running (i.e. that A3 will be submitted before any task finishes).


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] threadpool: token-based task sequencing

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 12:

got it. I'm working on figuring out a better way to manage my gerrit queue so I don't get confused by stuff like this

-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] threadpool: token-based task sequencing

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 14: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 14
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] threadpool: token-based task sequencing

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 8: -Code-Review

(14 comments)

http://gerrit.cloudera.org:8080/#/c/6874/8/src/kudu/util/debug-util.h
File src/kudu/util/debug-util.h:

PS8, Line 105: bool HasCollected() const {
> docs
Done


http://gerrit.cloudera.org:8080/#/c/6874/8/src/kudu/util/threadpool-test.cc
File src/kudu/util/threadpool-test.cc:

Line 40: #include "kudu/util/threadpool.h"
> warning: #includes are not sorted properly [llvm-include-order]
Done


PS8, Line 418: t3 = std::move(t1);
> agree with tidy bot that these uses after move are weird
But how else am I to test that the move constructor and move assignment operator properly "zero out" the moved instance?

To squelch the clang-tidy warnings I added NOLINT to these lines.


http://gerrit.cloudera.org:8080/#/c/6874/8/src/kudu/util/threadpool.cc
File src/kudu/util/threadpool.cc:

Line 130: const SequenceToken::SlotIndex SequenceToken::kInvalidSlotIndex;
> warning: redundant 'kInvalidSlotIndex' declaration [readability-redundant-d
Not redundant; if I remove this, threadpool-test fails to link.


PS8, Line 173: (std::shared_ptr<Runnable>
> nit: make_shared?
Done


PS8, Line 188: pool_->CheckNotPoolThreadUnlocked();
> what is the point of this check? can you leave a comment?
It's called in every user-facing function that may block, to ensure that the caller isn't a ThreadPool worker thread.

I don't think it makes sense for there to be a comment explaining that in every call-site; best to look at the (short) implementation to learn what's being checked.


PS8, Line 258: s->state() == ThreadPool::SequenceSlot::State::RUNNABLE ||
             :          s->state() == ThreadPool::SequenceSlot::State::RUNNING ||
             :          s->state() == ThreadPool::SequenceSlot::State::QUIESCING
> isn't this equivalent to IsActive()
Yeah, will replace.


PS8, Line 275: s->state() == ThreadPool::SequenceSlot::State::RUNNABLE ||
             :          s->state() == ThreadPool::SequenceSlot::State::RUNNING ||
             :          s->state() == ThreadPool::SequenceSlot::State::QUIESCING
> same.
Done


http://gerrit.cloudera.org:8080/#/c/6874/8/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS8, Line 140: Wait variants
> shutdown is also thread safe, right?
Yes. I went back and forth on whether to call it out; I reasoned folks would interpret "lifecycle functions" as "ctor/dtor and assignment operators".

But apparently not, so I'll make it more explicit.


PS8, Line 156: Waits until all the tasks submitted via this token are completed and also
             :   // marks the token as unusable for future submissions.
> mind reverting this sentence? I reads like we wait first and then close the
You mean reversing? Done.


PS8, Line 212: Tasks submitted via one token are fairly prioritized with respect to tasks
             : // submitted via another token.
> not sure I understand this... how are tasks on a token prioritized over ano
Tokens are processed in a round-robin style. When a worker thread processes a token, it runs just one of its queued tasks. In this way, we can ensure that one token cannot starve other tokens.

I'll add more detail.


PS8, Line 360: QUIESCED  -> AVAILABLE:
> why do we need to transition the state back to available?
Only AVAILABLE tokens may be returned to clients via AllocateTokenSlot(). If a token remained in QUIESCED forever, it couldn't be reused by another client.


PS8, Line 424: StackTrace running_submit_stack;
> what's the perf impact of this? I realize this is a debug only thing but we
I ran "perf report" on the new fuzz unit test I added and Collect() consumed 1.42% CPU, which I think is reasonable.


Line 516: std::ostream& operator<<(std::ostream& o, ThreadPool::SequenceSlot::State s);
> warning: redundant 'operator<<' declaration [readability-redundant-declarat
This isn't redundant; if I remove it I get compilation errors in threadpool.cc.


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] threadpool: token-based task sequencing

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 11:

> @adar: did you end up with another solution to the problem you were
 > seeing with tokens still being alive when the tp was destroyed?

Yes, I posted a WIP here: https://gerrit.cloudera.org/7183. Still waiting for Todd or someone else to take a look.

-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] threadpool: token-based task sequencing

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6874

to look at the new patch set (#7).

Change subject: threadpool: token-based task sequencing
......................................................................

threadpool: token-based task sequencing

This patch adds task sequencing to util/threadpool. Task sequencing allows
M contexts to share a single pool with N threads while also ensuring that
the pool executes tasks belonging to each context in order. Previously this
was only achievable via "one singleton pool per context", which grossly
inflated the total number of threads and overall state.

A group of tasks are sequenced by submission to a dedicated "slot". A client
obtains exclusive access to a slot via AllocateTokenSlot(), which returns a
"token" that can be used for task submission/waiting. When the token is
destroyed (or Release() is called), the slot is returned to the pool. For
implementation simplicity, clients must wait for a token's outstanding tasks
to complete before destroying their token.

Some notes:
- Slots and tokens are mapped 1-1 so they could theoretically be combined,
  but I prefer this separation of concerns.
- The current implementation requires tokens to have no outstanding tasks
  when they are released. This was a conscious choice made so that token
  destruction (especially when done as part of a larger object graph
  destruct) won't take an unusually long amount of time.
- I evaluated two other implementations. In one, tokens had an implicit
  lifecycle that was automatically managed by the threadpool. While simpler
  for clients, the threadpool was more inefficient with more allocations and
  deallocations in each submission. In the other, token submission was built
  using regular task submission. This afforded a certain separation between
  the "core" of the threadpool and the token logic, but it complicated
  locking and tracking of queued tasks.
- I tried to keep submission (whether token-based or tokenless) fast. Just
  the change from std::list to std::deque for the main queue ought to
  improve performance of tokenless submissions. The next bottleneck is
  likely to be lock contention on the global threadpool lock.

Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
---
M src/kudu/util/debug-util.h
M src/kudu/util/threadpool-test.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
4 files changed, 955 insertions(+), 75 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/6874/7
-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] threadpool: token-based task sequencing

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 8:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/6874/8/src/kudu/util/debug-util.h
File src/kudu/util/debug-util.h:

PS8, Line 105: bool HasCollected() const {
docs


http://gerrit.cloudera.org:8080/#/c/6874/8/src/kudu/util/threadpool-test.cc
File src/kudu/util/threadpool-test.cc:

PS8, Line 418: t3 = std::move(t1);
agree with tidy bot that these uses after move are weird


http://gerrit.cloudera.org:8080/#/c/6874/8/src/kudu/util/threadpool.cc
File src/kudu/util/threadpool.cc:

PS8, Line 173: (std::shared_ptr<Runnable>
nit: make_shared?


PS8, Line 188: pool_->CheckNotPoolThreadUnlocked();
what is the point of this check? can you leave a comment?


PS8, Line 258: s->state() == ThreadPool::SequenceSlot::State::RUNNABLE ||
             :          s->state() == ThreadPool::SequenceSlot::State::RUNNING ||
             :          s->state() == ThreadPool::SequenceSlot::State::QUIESCING
isn't this equivalent to IsActive()


PS8, Line 275: s->state() == ThreadPool::SequenceSlot::State::RUNNABLE ||
             :          s->state() == ThreadPool::SequenceSlot::State::RUNNING ||
             :          s->state() == ThreadPool::SequenceSlot::State::QUIESCING
same.


http://gerrit.cloudera.org:8080/#/c/6874/8/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS8, Line 140: Wait variants
shutdown is also thread safe, right?


PS8, Line 156: Waits until all the tasks submitted via this token are completed and also
             :   // marks the token as unusable for future submissions.
mind reverting this sentence? I reads like we wait first and then close the close the token


PS8, Line 212: Tasks submitted via one token are fairly prioritized with respect to tasks
             : // submitted via another token.
not sure I understand this... how are tasks on a token prioritized over another token? how do you chose which token get priority?


PS8, Line 360: QUIESCED  -> AVAILABLE:
why do we need to transition the state back to available?


PS8, Line 424: StackTrace running_submit_stack;
what's the perf impact of this? I realize this is a debug only thing but we don't want debug modes to be abnormally slow. Consider gating this behind a flag if the perf impact is considerable.


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] threadpool: token-based task sequencing

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 4:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/6874/4//COMMIT_MSG
Commit Message:

Line 7: threadpool: token-based task sequencing
nit: consider keeping length of lines under 72 chars:

https://git-scm.com/book/en/v2/Distributed-Git-Contributing-to-a-Project#_commit_guidelines


PS4, Line 10: m
nit: would it be better for readability if this letter were capital?


PS4, Line 10: n
ditto


http://gerrit.cloudera.org:8080/#/c/6874/4/src/kudu/util/threadpool-test.cc
File src/kudu/util/threadpool-test.cc:

PS4, Line 563:   for (auto& t : tokens) {
             :     t.Wait();
             :   }
What happens if an assertion fires above?  Is it necessary to wait for tokens or destroying them without wait would be fine?


http://gerrit.cloudera.org:8080/#/c/6874/4/src/kudu/util/threadpool.cc
File src/kudu/util/threadpool.cc:

PS4, Line 328: SequenceToken ThreadPool::AllocateTokenSlot() {
Is there a restriction on maximum number of allocated token slots?


PS4, Line 345:     unique_ptr<SequenceSlot> new_slot(new SequenceSlot(&lock_));
             :     DCHECK_EQ(SequenceSlot::DEALLOCATED, new_slot->state());
             :     slots_.emplace_back(std::move(new_slot));
             :     idx = slots_.size() - 1;
             :   }
             : 
             :   slots_[idx]->Transition(SequenceSlot::INACTIVE);
Regarding this transition from DEALLOCATED to INACTIVE for non-existing slot: why is it necessary?  I see the transition signals other actors calling Wait() for WaitFor(), but why would those guys be interested in receiving a notification that a new token slot has been allocated while waiting on some other slots?

I might be missing something here, but would it be feasible setting the state of a newly allocated slot to INACTIVE, not DEALLOCATED?


http://gerrit.cloudera.org:8080/#/c/6874/4/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS4, Line 21: #include <list>
nit: not sure this is needed


PS4, Line 24: #include <unordered_map>
nit: not sure this is needed


PS4, Line 157: binded
nit: bound?


PS4, Line 182: int
nit: why not to use the newly introduced typedef here as well (and re-order the typedef and this declaration)?


PS4, Line 182: kInvalidSlotIndex
nit: since it's an integral type and we are using C++11, consider defining the value of this static variable right here in the header.


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] threadpool: token-based task sequencing

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 6:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/6874/6/src/kudu/util/debug-util.h
File src/kudu/util/debug-util.h:

PS6, Line 105: {
> add 'const' specifier?
Done


http://gerrit.cloudera.org:8080/#/c/6874/6/src/kudu/util/threadpool.cc
File src/kudu/util/threadpool.cc:

PS6, Line 333: DEALLOCATED
> AVAILABLE
Done


PS6, Line 344: DEALLOCATED
> AVAILABLE
Done


Line 435:   SequenceSlot* slot = slot_index ? slots_[*slot_index].get() : nullptr;
> paranoid nit: does it make sense to add something like
Done


Line 473:     DCHECK(state == SequenceSlot::ASSIGNED ||
> Why not (state != SequenceSlot::AVAILABLE)?
I find explicitly enumerating the allowed states to be more clear. It's also easier to notice mistakes if/when a new state is introduced.

Yes, the guarantee is that if DoSubmit() was called with a slot index, it must have originated with a token, which means the slot cannot be AVAILABLE.

I'm not a fan of EMPTY because I can see myself confusing it with "the slot's queue is empty of tasks" (i.e. ASSIGNED).


PS6, Line 708: }
> Does it make sense to add CHECK(!task) for the case new_state == ASSIGNED?
Yes, I'll add that.


http://gerrit.cloudera.org:8080/#/c/6874/6/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS6, Line 177: ;
> add 'const' specifier?
Done


Line 304:                   boost::optional<SequenceToken::SlotIndex> slot_index);
> Could you add a comment explaining the expected usage with slot_index (slot
Sure.


Line 398:     void CheckStateTransition(State new_state, const boost::optional<Task>& task);
> add 'const' specifier?
Done


PS6, Line 453: Queues
> nit: Queue
No, this is a container of slots, and each slot is logically a queue.


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] threadpool: token-based task sequencing

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 7: Code-Review-2

See the note I left in https://gerrit.cloudera.org/#/c/6946/6 about implementing SequenceToken::Shutdown().

Until I do that, please don't merge this.

-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 7
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] threadpool: token-based task sequencing

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello David Ribeiro Alves, Todd Lipcon, Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6874

to look at the new patch set (#12).

Change subject: threadpool: token-based task sequencing
......................................................................

threadpool: token-based task sequencing

This patch adds task sequencing to util/threadpool. Task sequencing allows
M contexts to share a single pool with N threads while also ensuring that
the pool executes tasks belonging to each context in order. Previously this
was only achievable via "one singleton pool per context", which grossly
inflated the total number of threads and overall state.

The new logic is implemented in ThreadPoolToken. Tasks submitted via tokens
are logically grouped together. They may also be sequenced, depending on the
token's execution mode. Tokens expose Wait() variants which will block on
the task group. Tokens may also be shut down before being destroyed, which
has semantics equivalent to shutting down the pool itself (i.e. tasks may no
longer be submitted with these tokens).

Some notes:
- I evaluated two other implementations. In one, tokens had an implicit
  lifecycle that was automatically managed by the threadpool. While simpler
  for clients, the threadpool was more inefficient with more allocations and
  deallocations in each submission. In the other, token submission was built
  using regular task submission. This afforded a certain separation between
  the "core" of the threadpool and the token logic, but it complicated
  locking and tracking of queued tasks.
- I tried to keep submission (whether token-based or tokenless) fast. Just
  the change from std::list to std::deque for the main queue ought to
  improve performance of tokenless submissions. The next bottleneck is
  likely to be lock contention on the global threadpool lock.

Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
---
M src/kudu/util/debug-util.h
M src/kudu/util/threadpool-test.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
4 files changed, 990 insertions(+), 91 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/6874/12
-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] threadpool: token-based task sequencing

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 11:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/6874/11/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS11, Line 135: // Furthermore, tasks submitted to tokens constructed via ExecutionMode::SERIAL
              : // are guaranteed not to run in parallel.
the first time I read through this (before reading the code) I thought the SERIAL vs CONCURRENT was a property of a task rather than a token, and it confused me a bit. Maybe rephrase a little so it says something like:

A ThreadPoolToken operates in one of two ExecutionModes:
- SERIAL: tasks submitted via this token run one at a time
- CONCURRENT: tasks may be run concurrently blah blah

... so it's just a little more obvious that the modes are set per token and not per task


Line 138: // Tasks submitted without a token or via ExecutionMode::CONCURRENT tokens are
can you remind me what the advantage of CONCURRENT tokens is, vs just sharing one big threadpool across multiple usage contexts? Is it to take advantage of per-token metrics? Might be worth noting something about this in the doc.


Line 190:   // Allocates a new token for use in token-based task submission.
can you document the requirements around token lifetime relative to pool lifetime? namely I think you have to destroy all tokens before destroying the pool?


PS11, Line 190: // Allocates a new token for use in token-based task submission.
              :   //
              :   // There is no limit on the number of tokens that may be allocated.
nit: I think this comment belongs below the enum rather than above it


PS11, Line 298: whom
nit: which


Line 330:   ~ThreadPoolToken();
I think worth noting whether the token-owner is responsible for doing a Shutdown() and/or Wait() before allowing it to be destructed.


PS11, Line 332:   // Submits a function using the kudu Closure system.
              :   Status SubmitClosure(Closure c) WARN_UNUSED_RESULT;
              : 
              :   // Submits a function bound using boost::bind(&FuncName, args...).
              :   Status SubmitFunc(boost::function<void()> f) WARN_UNUSED_RESULT;
              : 
              :   // Submits a Runnable class.
              :   Status Submit(std::shared_ptr<Runnable> r) WARN_UNUSED_RESULT;
do we still need all of these variants? I seem to recall that we have fewer in use than actually defined, so if we aren't using all three, maybe we can use this new API as an opportunity to consolidate?


PS11, Line 342: If a task is in flight
nit: should be plural, since you can make a CONCURRENT token, right?


Line 361:  private:
What's the thread safety on the members here? Are they protected by the threadpool's lock? think it's worth a block comment explaining.


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] threadpool: token-based task sequencing

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 11:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/6874/11/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS11, Line 135: // Furthermore, tasks submitted to tokens constructed via ExecutionMode::SERIAL
              : // are guaranteed not to run in parallel.
> the first time I read through this (before reading the code) I thought the 
Will update.


Line 138: // Tasks submitted without a token or via ExecutionMode::CONCURRENT tokens are
> can you remind me what the advantage of CONCURRENT tokens is, vs just shari
Suppose you have n contexts sharing a single threadpool. If tokens aren't used for submission, there's no way for one of the contexts to atomically:
1. Abort any of its queued tasks.
2. If any are running, wait for them to finish.
3. Prohibit future submission.

ThreadPool::Shutdown() is inappropriate because it applies to the entire pool, not just to this one context. I don't know exactly what to call this property, so the description is a little vague ("...can be used to block on logical groups of tasks").

Per-context metrics is useful, but it's not necessary for correctness the way this is.


Line 190:   // Allocates a new token for use in token-based task submission.
> can you document the requirements around token lifetime relative to pool li
Done


PS11, Line 190: // Allocates a new token for use in token-based task submission.
              :   //
              :   // There is no limit on the number of tokens that may be allocated.
> nit: I think this comment belongs below the enum rather than above it
Er, why? It is a "new paragraph" within the NewToken() documentation, so I think it should be part of the "Allocates a new token..." block of text.


PS11, Line 298: whom
> nit: which
Done


Line 330:   ~ThreadPoolToken();
> I think worth noting whether the token-owner is responsible for doing a Shu
OK. The destructor will call Shutdown(), so I'll note that doing so explicitly is not a requirement.


PS11, Line 332:   // Submits a function using the kudu Closure system.
              :   Status SubmitClosure(Closure c) WARN_UNUSED_RESULT;
              : 
              :   // Submits a function bound using boost::bind(&FuncName, args...).
              :   Status SubmitFunc(boost::function<void()> f) WARN_UNUSED_RESULT;
              : 
              :   // Submits a Runnable class.
              :   Status Submit(std::shared_ptr<Runnable> r) WARN_UNUSED_RESULT;
> do we still need all of these variants? I seem to recall that we have fewer
All three variants are in use in unit tests. The functor variant is also used by peers, and the closure variant is also used by the message queue and raft consensus.


PS11, Line 342: If a task is in flight
> nit: should be plural, since you can make a CONCURRENT token, right?
Done


Line 361:  private:
> What's the thread safety on the members here? Are they protected by the thr
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] threadpool: token-based task sequencing

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6874

to look at the new patch set (#10).

Change subject: threadpool: token-based task sequencing
......................................................................

threadpool: token-based task sequencing

This patch adds task sequencing to util/threadpool. Task sequencing allows
M contexts to share a single pool with N threads while also ensuring that
the pool executes tasks belonging to each context in order. Previously this
was only achievable via "one singleton pool per context", which grossly
inflated the total number of threads and overall state.

The new logic is implemented in ThreadPoolToken. Tasks submitted via tokens
are logically grouped together. They may also be sequenced, depending on the
token's execution mode. Tokens expose Wait() variants which will block on
the task group. Tokens may also be shut down before being destroyed, which
has semantics equivalent to shutting down the pool itself (i.e. tasks may no
longer be submitted with these tokens).

Some notes:
- I evaluated two other implementations. In one, tokens had an implicit
  lifecycle that was automatically managed by the threadpool. While simpler
  for clients, the threadpool was more inefficient with more allocations and
  deallocations in each submission. In the other, token submission was built
  using regular task submission. This afforded a certain separation between
  the "core" of the threadpool and the token logic, but it complicated
  locking and tracking of queued tasks.
- I tried to keep submission (whether token-based or tokenless) fast. Just
  the change from std::list to std::deque for the main queue ought to
  improve performance of tokenless submissions. The next bottleneck is
  likely to be lock contention on the global threadpool lock.

Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
---
M src/kudu/util/debug-util.h
M src/kudu/util/threadpool-test.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
4 files changed, 979 insertions(+), 91 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/6874/10
-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] threadpool: token-based task sequencing

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6874/3/src/kudu/util/threadpool.cc
File src/kudu/util/threadpool.cc:

PS3, Line 146: std::move
I think you want std::forward here.


http://gerrit.cloudera.org:8080/#/c/6874/3/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

Line 198:   DISALLOW_COPY_AND_ASSIGN(SequenceToken);
I was curious about this, so I looked it up in Effective Modern C++ (page 111):

> Declaring a move operation (construction or assignment) in a class causes compilers to disable the copy operations. (The copy operations are disabled by deleting them—see Item 11).

So this should be safe to leave off.  That being said, It's perfectly fine with me if you leave it.


Line 220: // The execution order will be A1, T1, T2, A2, A3.
What happens if A1 begins running before A2 is submitted?  Won't A2 run second in that case?


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] threadpool: token-based task sequencing

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 12:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6874/12/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS12, Line 410: m
> hm, did you forget to post the newest rev with this change?
I'm still working on feedback in the other patches in this series. My MO has been to respond to comments as I make the changes locally to the affected patch, then repush the whole thing when it's ready.


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] threadpool: token-based task sequencing

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Dan Burkert, David Ribeiro Alves, Todd Lipcon,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6874

to look at the new patch set (#4).

Change subject: threadpool: token-based task sequencing
......................................................................

threadpool: token-based task sequencing

This patch adds task sequencing to util/threadpool. Task sequencing allows
m contexts to share a single pool with n threads while also ensuring that
the pool executes tasks belonging to each context in order. Previously this
was only achievable via "one singleton pool per context", which grossly
inflated the total number of threads and overall state.

A group of tasks are sequenced by submission to a dedicated "slot". A client
obtains exclusive access to a slot via AllocateTokenSlot(), which returns a
"token" that can be used for task submission/waiting. When the token is
destroyed (or Release() is called), the slot is returned to the pool. For
implementation simplicity, clients must wait for a token's outstanding tasks
to complete before destroying their token.

Some notes:
- Slots and tokens are mapped 1-1 so they could theoretically be combined,
  but I prefer this separation of concerns.
- The current implementation requires tokens to have no outstanding tasks
  when they are released. This was a conscious choice made so that token
  destruction (especially when done as part of a larger object graph
  destruct) won't take an unusually long amount of time.
- I evaluated two other implementations. In one, tokens had an implicit
  lifecycle that was automatically managed by the threadpool. While simpler
  for clients, the threadpool was more inefficient with more allocations and
  deallocations in each submission. In the other, token submission was built
  using regular task submission. This afforded a certain separation between
  the "core" of the threadpool and the token logic, but it complicated
  locking and tracking of queued tasks.
- I tried to keep submission (whether token-based or tokenless) fast. Just
  the change from std::list to std::deque for the main queue ought to
  improve performance of tokenless submissions. The next bottleneck is
  likely to be lock contention on the global threadpool lock.

Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
---
M src/kudu/util/debug-util.h
M src/kudu/util/threadpool-test.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
4 files changed, 943 insertions(+), 73 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/6874/4
-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] threadpool: token-based task sequencing

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 9:

The implementation has changed significantly. Here's a summary:
- Tokens can now run tasks either serially (the old SequenceToken implementation) or concurrently. Execution mode notwithstanding, tokens serve as a way to logically group tasks, so that Wait()/Shutdown() block on them and not on all tasks in the pool.
- Tokenless submission now goes through a single concurrent token.
- Slots and tokens have been unified. This has reduced the amount of code as well as the number of states in the (now token) state machine.

-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 9
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] threadpool: token-based task sequencing

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 11:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6874/11/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS11, Line 194: processed
> nit: maybe "executed" is a better term?
Done


PS11, Line 197: will
> s/will/may
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] threadpool: token-based task sequencing

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 2:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/6874/4//COMMIT_MSG
Commit Message:

Line 7: threadpool: token-based task sequencing
> nit: consider keeping length of lines under 72 chars:
I wrap at 76, so that, after the 4 char indentation automatically added by git, the lines are under 80 and thus fit into a standard 80x24 terminal window.


PS4, Line 10: n
> ditto
Done


PS4, Line 10: m
> nit: would it be better for readability if this letter were capital?
Done


http://gerrit.cloudera.org:8080/#/c/6874/4/src/kudu/util/threadpool-test.cc
File src/kudu/util/threadpool-test.cc:

PS4, Line 563: 
             : TEST_F(ThreadPoolTest, TestTokenSubmissionsAdhereToMaxQueueSize) {
             :   A
> What happens if an assertion fires above?  Is it necessary to wait for toke
You're right; I should wait on them. I'll move this into a scoped cleanup.

I adjusted the cleanup for some other tests too.


http://gerrit.cloudera.org:8080/#/c/6874/4/src/kudu/util/threadpool.cc
File src/kudu/util/threadpool.cc:

Line 139: SequenceToken::SequenceToken(SequenceToken&& t)
> warning: move constructors should be marked noexcept [misc-noexcept-move-co
Done


Line 145: SequenceToken& SequenceToken::operator=(SequenceToken&& t) {
> warning: move assignment operators should be marked noexcept [misc-noexcept
Done


PS4, Line 328:   SequenceToken::SlotIndex idx = -1;
> Is there a restriction on maximum number of allocated token slots?
No. Nor do I see the need to add one.


PS4, Line 345: 
             :   return SequenceToken(this, idx);
             : }
             : 
             : void ThreadPool::ReleaseTokenSlot(SequenceToken* t) {
             :   {
             :     // Release the slot with the lock held.
> Regarding this transition from DEALLOCATED to INACTIVE for non-existing slo
So why do we actually need both DEALLOCATED and INACTIVE? Because we need to distinguish between "slot exists in slots_ but has no token" and "slot exists in slots_ and there exists a token for it". If we don't distinguish between the two, we won't be able to reuse slots for future tokens.

It's true that new_slot could start in INACTIVE instead of DEALLOCATED, but I wanted the state transitions for "slot just allocated on the heap" and "slot reused" to be identical, which is why they both go from DEALLOCATED to INACTIVE.

As for avoiding the condvar broadcast in the DEALLOCATED->INACTIVE transition: we could avoid it, but it should be a no-op anyway, because the condvar is specific to this slot, and since there's no token for it, there should be no waiters either. So I didn't think it mattered and opted to keep the state transition code simple.

BTW, I'm going to rename the states to ones that are intuitive.


http://gerrit.cloudera.org:8080/#/c/6874/4/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS4, Line 21: #include <list>
> nit: not sure this is needed
Done


PS4, Line 24: #include <unordered_map>
> nit: not sure this is needed
Done


Line 144:   SequenceToken(SequenceToken&& t);
> warning: move constructors should be marked noexcept [misc-noexcept-move-co
Done


Line 145:   SequenceToken& operator=(SequenceToken&& t);
> warning: move assignment operators should be marked noexcept [misc-noexcept
Done


PS4, Line 157: binded
> nit: bound?
Copied from ThreadPool::SubmitFunc(), I'll correct both.


PS4, Line 182: int
> nit: why not to use the newly introduced typedef here as well (and re-order
Done


PS4, Line 182: kInvalidSlotIndex
> nit: since it's an integral type and we are using C++11, consider defining 
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] threadpool: token-based task sequencing

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 10:

@adar i guess that the first explicit call to Shutdown() is racing with the the rest of the stuff. i.e. the token’s shutdown completes and the shutdown thread thinks it’s released but there are actually other tasks hanging on to tokens even though they haven’t yet submitted. Of course if you did let them submit they might cause SIGSEGV’s cause the pool was destroyed or somesuch.

My suggestion would be:
- Make ThreadPool ref counted, make tokens hold a ref. (to make sure the tp survives and to avoid adding locks inside of ThreadPoolToken).
- On Token shutdown actually release the token from the pool after you’ve quiesced all the tasks(but the tokens ref for the pool will only be reduced on the dctor).
- Make the pool accept but return a bad status for unknown/quiesced tokens (which I already think it does, just spelling it out).

This way the shutdown thread will clear all pending tasks and make sure that the running tasks complete but if there are places still hanging on to tokens that should still be ok.

-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 10
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] threadpool: token-based task sequencing

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6874/3/src/kudu/util/threadpool-test.cc
File src/kudu/util/threadpool-test.cc:

Line 419: TEST_F(ThreadPoolTest, TestTokenSubmitsProcessedSerially) {
I think this may fail on a single CPU system.


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] threadpool: token-based task sequencing

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6874

to look at the new patch set (#8).

Change subject: threadpool: token-based task sequencing
......................................................................

threadpool: token-based task sequencing

This patch adds task sequencing to util/threadpool. Task sequencing allows
M contexts to share a single pool with N threads while also ensuring that
the pool executes tasks belonging to each context in order. Previously this
was only achievable via "one singleton pool per context", which grossly
inflated the total number of threads and overall state.

A group of tasks are sequenced by submission to a dedicated "slot". A client
obtains exclusive access to a slot via AllocateTokenSlot(), which returns a
"token" that can be used for task submission/waiting. When the token goes
out of scope, the slot is returned to the pool. The token may also be shut
down before being destroyed, which has semantics equivalent to shutting down
the pool itself (i.e. tasks may no longer be submitted with the token).

Some notes:
- Slots and tokens are mapped 1-1 so they could theoretically be combined,
  but I prefer this separation of concerns.
- I evaluated two other implementations. In one, tokens had an implicit
  lifecycle that was automatically managed by the threadpool. While simpler
  for clients, the threadpool was more inefficient with more allocations and
  deallocations in each submission. In the other, token submission was built
  using regular task submission. This afforded a certain separation between
  the "core" of the threadpool and the token logic, but it complicated
  locking and tracking of queued tasks.
- I tried to keep submission (whether token-based or tokenless) fast. Just
  the change from std::list to std::deque for the main queue ought to
  improve performance of tokenless submissions. The next bottleneck is
  likely to be lock contention on the global threadpool lock.

Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
---
M src/kudu/util/debug-util.h
M src/kudu/util/threadpool-test.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
4 files changed, 1,168 insertions(+), 75 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/6874/8
-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] threadpool: token-based task sequencing

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6874/4/src/kudu/util/threadpool.cc
File src/kudu/util/threadpool.cc:

PS4, Line 345:     unique_ptr<SequenceSlot> new_slot(new SequenceSlot(&lock_));
             :     DCHECK_EQ(SequenceSlot::DEALLOCATED, new_slot->state());
             :     slots_.emplace_back(std::move(new_slot));
             :     idx = slots_.size() - 1;
             :   }
             : 
             :   slots_[idx]->Transition(SequenceSlot::INACTIVE);
> So why do we actually need both DEALLOCATED and INACTIVE? Because we need t
ok, I see: the broadcasting on conditional variable is no-op in that case.  Thank you for the clarification.  Yes, the new names for the states in PS6 look better than the former ones.


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] threadpool: token-based task sequencing

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 11:

I took a look at the other patch. Todd is a better arbitrer as to correctness but in general the solution seems reasonable.
I honestly would still prefer not to have to mandate that all tokens are returned to the pool (i.e. enforce that all tasks are finished and no future tasks can run, but don't force all threads hanging on to tokens to return them), from a library design perspective. This is specially problematic if we ever have more than one thread pool in which the running tasks have cyclic dependencies (i.e. tasks from a tp A submit to a tp B and vice versa). But I guess that the end goal is to have a single tp, making the point less relevant in this case.

-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] threadpool: token-based task sequencing

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 6:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6874/6/src/kudu/util/threadpool.cc
File src/kudu/util/threadpool.cc:

Line 473:     DCHECK(state == SequenceSlot::ASSIGNED ||
Why not (state != SequenceSlot::AVAILABLE)?

Just checking that I understand this correctly, this implies that the caller knows the SequenceSlot at slot_index is not in the AVAILABLE state, right? And this is ensured by the fact that only a SequenceToken will DoSubmit with a slot_index.

I know it's gone through name changes already, but how do you feel about EMPTY instead of AVAILABLE? My thinking, at least for this bit of logic, is that it's clearer that a task shouldn't be assigned to an empty slot


http://gerrit.cloudera.org:8080/#/c/6874/6/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

Line 304:                   boost::optional<SequenceToken::SlotIndex> slot_index);
Could you add a comment explaining the expected usage with slot_index (slot_index must refer to a SequenceSlot that is occupied by a token)?


PS6, Line 453: Queues
nit: Queue


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] threadpool: token-based task sequencing

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello David Ribeiro Alves, Mike Percy, Alexey Serbin, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6874

to look at the new patch set (#13).

Change subject: threadpool: token-based task sequencing
......................................................................

threadpool: token-based task sequencing

This patch adds task sequencing to util/threadpool. Task sequencing allows
M contexts to share a single pool with N threads while also ensuring that
the pool executes tasks belonging to each context in order. Previously this
was only achievable via "one singleton pool per context", which grossly
inflated the total number of threads and overall state.

The new logic is implemented in ThreadPoolToken. Tasks submitted via tokens
are logically grouped together. They may also be sequenced, depending on the
token's execution mode. Tokens expose Wait() variants which will block on
the task group. Tokens may also be shut down before being destroyed, which
has semantics equivalent to shutting down the pool itself (i.e. tasks may no
longer be submitted with these tokens).

Some notes:
- I evaluated two other implementations. In one, tokens had an implicit
  lifecycle that was automatically managed by the threadpool. While simpler
  for clients, the threadpool was more inefficient with more allocations and
  deallocations in each submission. In the other, token submission was built
  using regular task submission. This afforded a certain separation between
  the "core" of the threadpool and the token logic, but it complicated
  locking and tracking of queued tasks.
- I tried to keep submission (whether token-based or tokenless) fast. Just
  the change from std::list to std::deque for the main queue ought to
  improve performance of tokenless submissions. The next bottleneck is
  likely to be lock contention on the global threadpool lock.

Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
---
M src/kudu/util/debug-util.h
M src/kudu/util/threadpool-test.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
4 files changed, 991 insertions(+), 91 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/6874/13
-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 13
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] threadpool: token-based task sequencing

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 11:

(2 comments)

@adar: did you end up with another solution to the problem you were seeing with tokens still being alive when the tp was destroyed?

http://gerrit.cloudera.org:8080/#/c/6874/11/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS11, Line 194: processed
nit: maybe "executed" is a better term?


PS11, Line 197: will
s/will/may


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 11
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] threadpool: token-based task sequencing

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 6:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/6874/6/src/kudu/util/debug-util.h
File src/kudu/util/debug-util.h:

PS6, Line 105: {
add 'const' specifier?


http://gerrit.cloudera.org:8080/#/c/6874/6/src/kudu/util/threadpool.cc
File src/kudu/util/threadpool.cc:

PS6, Line 333: DEALLOCATED
AVAILABLE


PS6, Line 344: DEALLOCATED
AVAILABLE


Line 435:   SequenceSlot* slot = slot_index ? slots_[*slot_index].get() : nullptr;
paranoid nit: does it make sense to add something like

DASSERT(!slot_index || *slot_index < slots_.size());

?


PS6, Line 708: }
Does it make sense to add CHECK(!task) for the case new_state == ASSIGNED?


http://gerrit.cloudera.org:8080/#/c/6874/6/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS6, Line 177: ;
add 'const' specifier?


Line 398:     void CheckStateTransition(State new_state, const boost::optional<Task>& task);
add 'const' specifier?


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] threadpool: token-based task sequencing

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/6874/2/src/kudu/util/threadpool-test.cc
File src/kudu/util/threadpool-test.cc:

Line 45: using strings::Substitute;
> warning: using decl 'Substitute' is unused [misc-unused-using-decls]
Done


http://gerrit.cloudera.org:8080/#/c/6874/2/src/kudu/util/threadpool.cc
File src/kudu/util/threadpool.cc:

Line 168:   return Submit(std::shared_ptr<Runnable>(new ClosureRunnable(std::move(c))));
> warning: std::move of the const variable has no effect; remove std::move() 
Done


Line 380:   return Submit(std::shared_ptr<Runnable>(new ClosureRunnable(std::move(c))));
> warning: std::move of the const variable has no effect; remove std::move() 
Done


http://gerrit.cloudera.org:8080/#/c/6874/2/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

Line 285:   Status DoSubmit(std::shared_ptr<Runnable> task,
> warning: function 'kudu::ThreadPool::DoSubmit' has a definition with differ
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] threadpool: token-based task sequencing

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 3: Verified+1

Failure was in kinit in a Java test.

-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] threadpool: token-based task sequencing

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 12:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/6874/12/src/kudu/util/threadpool.cc
File src/kudu/util/threadpool.cc:

PS12, Line 134: m
> Is this different than pool_->lock_ ?
It's not. I should just reference that directly to avoid confusion.


Line 196:       while (state() != State::QUIESCED) {
> Could use a comment indicating who is responsible for transitioning the tok
Done


http://gerrit.cloudera.org:8080/#/c/6874/12/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS12, Line 410: m
> Would be nice to mention what 'm' is for.
Moot now that pool->lock is referenced directly.


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 12
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] threadpool: token-based task sequencing

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 8:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/6874/8/src/kudu/util/threadpool.cc
File src/kudu/util/threadpool.cc:

Line 278:     if (!s->not_running_cond.TimedWait(delta)) {
> on spurious wakeup, won't this wait longer than expected? I think implement
Good point, but this is how ThreadPool::WaitFor() works today and I don't think SequenceToken::WaitFor() is any worse. Can I make the change in a subsequent patch? We don't yet have a deadline-based cv wait either.


http://gerrit.cloudera.org:8080/#/c/6874/8/src/kudu/util/threadpool.h
File src/kudu/util/threadpool.h:

PS8, Line 188: Upon completion, 't' has been released
> not following this bit. I'm reading this to mean "Release()" below, but tha
Yeah, this is confusing, sorry. When I wrote "'t' has been released" what I meant is that 't' is no longer usable as a token, not that its slot had been released. I'll reword.


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 8
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] threadpool: token-based task sequencing

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6874

to look at the new patch set (#5).

Change subject: threadpool: token-based task sequencing
......................................................................

threadpool: token-based task sequencing

This patch adds task sequencing to util/threadpool. Task sequencing allows
M contexts to share a single pool with N threads while also ensuring that
the pool executes tasks belonging to each context in order. Previously this
was only achievable via "one singleton pool per context", which grossly
inflated the total number of threads and overall state.

A group of tasks are sequenced by submission to a dedicated "slot". A client
obtains exclusive access to a slot via AllocateTokenSlot(), which returns a
"token" that can be used for task submission/waiting. When the token is
destroyed (or Release() is called), the slot is returned to the pool. For
implementation simplicity, clients must wait for a token's outstanding tasks
to complete before destroying their token.

Some notes:
- Slots and tokens are mapped 1-1 so they could theoretically be combined,
  but I prefer this separation of concerns.
- The current implementation requires tokens to have no outstanding tasks
  when they are released. This was a conscious choice made so that token
  destruction (especially when done as part of a larger object graph
  destruct) won't take an unusually long amount of time.
- I evaluated two other implementations. In one, tokens had an implicit
  lifecycle that was automatically managed by the threadpool. While simpler
  for clients, the threadpool was more inefficient with more allocations and
  deallocations in each submission. In the other, token submission was built
  using regular task submission. This afforded a certain separation between
  the "core" of the threadpool and the token logic, but it complicated
  locking and tracking of queued tasks.
- I tried to keep submission (whether token-based or tokenless) fast. Just
  the change from std::list to std::deque for the main queue ought to
  improve performance of tokenless submissions. The next bottleneck is
  likely to be lock contention on the global threadpool lock.

Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
---
M src/kudu/util/debug-util.h
M src/kudu/util/threadpool-test.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
4 files changed, 947 insertions(+), 74 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/6874/5
-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] threadpool: token-based task sequencing

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Dan Burkert, David Ribeiro Alves, Todd Lipcon, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6874

to look at the new patch set (#2).

Change subject: threadpool: token-based task sequencing
......................................................................

threadpool: token-based task sequencing

This patch adds task sequencing to util/threadpool. Task sequencing allows
m contexts to share a single pool with n threads while also ensuring that
the pool executes tasks belonging to each context in order. Previously this
was only achievable via "one singleton pool per context", which grossly
inflated the total number of threads and overall state.

A group of tasks are sequenced by submission to a dedicated "slot". A client
obtains exclusive access to a slot via AllocateTokenSlot(), which returns a
"token" that can be used for task submission/waiting. When the token is
destroyed (or Release() is called), the slot is returned to the pool. For
implementation simplicity, clients must wait for a token's outstanding tasks
to complete before destroying their token.

Some notes:
- Slots and tokens are mapped 1-1 so they could theoretically be combined,
  but I prefer this separation of concerns.
- The current implementation requires tokens to have no outstanding tasks
  when they are released. This was a conscious choice made so that token
  destruction (especially when done as part of a larger object graph
  destruct) won't take an unusually long amount of time.
- I evaluated two other implementations. In one, tokens had an implicit
  lifecycle that was automatically managed by the threadpool. While simpler
  for clients, the threadpool was more inefficient with more allocations and
  deallocations in each submission. In the other, token submission was built
  using regular task submission. This afforded a certain separation between
  the "core" of the threadpool and the token logic, but it complicated
  locking and tracking of queued tasks.
- I tried to keep submission (whether token-based or tokenless) fast. Just
  the change from std::list to std::deque for the main queue ought to
  improve performance of tokenless submissions. The next bottleneck is
  likely to be lock contention on the global threadpool lock.

Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
---
M src/kudu/util/debug-util.h
M src/kudu/util/threadpool-test.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
4 files changed, 708 insertions(+), 70 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/6874/2
-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] threadpool: token-based task sequencing

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/6874

to look at the new patch set (#3).

Change subject: threadpool: token-based task sequencing
......................................................................

threadpool: token-based task sequencing

This patch adds task sequencing to util/threadpool. Task sequencing allows
m contexts to share a single pool with n threads while also ensuring that
the pool executes tasks belonging to each context in order. Previously this
was only achievable via "one singleton pool per context", which grossly
inflated the total number of threads and overall state.

A group of tasks are sequenced by submission to a dedicated "slot". A client
obtains exclusive access to a slot via AllocateTokenSlot(), which returns a
"token" that can be used for task submission/waiting. When the token is
destroyed (or Release() is called), the slot is returned to the pool. For
implementation simplicity, clients must wait for a token's outstanding tasks
to complete before destroying their token.

Some notes:
- Slots and tokens are mapped 1-1 so they could theoretically be combined,
  but I prefer this separation of concerns.
- The current implementation requires tokens to have no outstanding tasks
  when they are released. This was a conscious choice made so that token
  destruction (especially when done as part of a larger object graph
  destruct) won't take an unusually long amount of time.
- I evaluated two other implementations. In one, tokens had an implicit
  lifecycle that was automatically managed by the threadpool. While simpler
  for clients, the threadpool was more inefficient with more allocations and
  deallocations in each submission. In the other, token submission was built
  using regular task submission. This afforded a certain separation between
  the "core" of the threadpool and the token logic, but it complicated
  locking and tracking of queued tasks.
- I tried to keep submission (whether token-based or tokenless) fast. Just
  the change from std::list to std::deque for the main queue ought to
  improve performance of tokenless submissions. The next bottleneck is
  likely to be lock contention on the global threadpool lock.

Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
---
M src/kudu/util/debug-util.h
M src/kudu/util/threadpool-test.cc
M src/kudu/util/threadpool.cc
M src/kudu/util/threadpool.h
4 files changed, 731 insertions(+), 75 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/74/6874/3
-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] threadpool: token-based task sequencing

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has posted comments on this change.

Change subject: threadpool: token-based task sequencing
......................................................................


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/6874/3/src/kudu/util/threadpool-test.cc
File src/kudu/util/threadpool-test.cc:

Line 419:   ASSERT_EQ(SequenceToken::kInvalidSlotIndex, t2.slot_index_);
> I'm not seeing why; can you elaborate?
woops, I misunderstood the test


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] WIP: threadpool: token-based sequencing

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: WIP: threadpool: token-based sequencing
......................................................................


Patch Set 1:

(1 comment)

this seems like a reasonable approach. Potential concerns for discussion:

- does this affect the performance of a normal ThreadPool that doesn't use the token feature? From looking at the code it seems like it would just be a few branches and no extra allocation, etc, but would be good to be sure of that

- not sure about using a string for a token vs having a more opaque token, such as ThreadPool::ObtainSequencer() returning a 'Sequencer' struct (which could well be just an int or somesuch)

- docs wise, would be good to document fairness and starvation-freedom properties. Particularly, the fact that this exhibits neither :-D Probably fine since I think we would typically use this feature in threadpools with no max thread count specified. (is it worth actually prohibiting use of tokens in a pool with a max thread count? or is it just "YMMV" situation?)

http://gerrit.cloudera.org:8080/#/c/6874/1/src/kudu/util/threadpool-test.cc
File src/kudu/util/threadpool-test.cc:

Line 392:       result += c;
I think you'd probably want to add a random sleep at the beginning of this lambda, so that if they did accidentally execute in parallel, it would be more likely to actually misorder


-- 
To view, visit http://gerrit.cloudera.org:8080/6874
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: If46dc34212027b6ea5dbc2ead7c7af8be57f2c70
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes