You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Andrew Wong (Code Review)" <ge...@cloudera.org> on 2019/10/17 03:27:26 UTC

[kudu-CR] log: separate out allocation logic

Andrew Wong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/14479


Change subject: log: separate out allocation logic
......................................................................

log: separate out allocation logic

The Log class has a few responsibilities:
1. adding batches to an in-memory queue asynchronously
2. picking off batches from the queue and writing them to disk
3. pre-allocating files in the background when a log segment grows too
   large
4. "rolling" onto a newly pre-allocated file once it is fully allocated
5. garbage collecting old log segments
6. a myriad of other bookkeeping related to indexing and size

1 and 2 have been neatly encapsulated in the Log::AppendThread subclass,
leaving the rest as the monolith that is the Log class.

This patch separates out 3 and 4 into a separate SegmentAllocator class.

Change-Id: I08be4dbdd8e98b02278de76273e931c314b08161
---
M src/kudu/consensus/log-test-base.h
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
3 files changed, 313 insertions(+), 295 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/79/14479/1
-- 
To view, visit http://gerrit.cloudera.org:8080/14479
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I08be4dbdd8e98b02278de76273e931c314b08161
Gerrit-Change-Number: 14479
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>

[kudu-CR] log: separate out allocation logic

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14479 )

Change subject: log: separate out allocation logic
......................................................................


Patch Set 11: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/14479
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I08be4dbdd8e98b02278de76273e931c314b08161
Gerrit-Change-Number: 14479
Gerrit-PatchSet: 11
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 09 Nov 2019 03:59:59 +0000
Gerrit-HasComments: No

[kudu-CR] log: separate out allocation logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14479 )

Change subject: log: separate out allocation logic
......................................................................


Patch Set 9:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14479/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14479/8//COMMIT_MSG@42
PS8, Line 42: Change-Id: I08be4dbdd8e98b02278de76273e931c314b08161
> That seems unlikely given the nature of the refactoring.
Ack



-- 
To view, visit http://gerrit.cloudera.org:8080/14479
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I08be4dbdd8e98b02278de76273e931c314b08161
Gerrit-Change-Number: 14479
Gerrit-PatchSet: 9
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 09 Nov 2019 03:29:49 +0000
Gerrit-HasComments: Yes

[kudu-CR] log: separate out allocation logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14479 )

Change subject: log: separate out allocation logic
......................................................................


Patch Set 8:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/14479/8/src/kudu/consensus/log.h
File src/kudu/consensus/log.h:

http://gerrit.cloudera.org:8080/#/c/14479/8/src/kudu/consensus/log.h@103
PS8, Line 103:   SegmentAllocator(const LogOptions* opts,
> Doc this, especially the functors.
Done


http://gerrit.cloudera.org:8080/#/c/14479/8/src/kudu/consensus/log.cc
File src/kudu/consensus/log.cc:

http://gerrit.cloudera.org:8080/#/c/14479/8/src/kudu/consensus/log.cc@27
PS8, Line 27: #include <boost/bind.hpp>
            : #include <boost/function.hpp>
> Can we use std bind/function for this?
Done


http://gerrit.cloudera.org:8080/#/c/14479/8/src/kudu/consensus/log.cc@569
PS8, Line 569: Status SegmentAllocator::AsyncAllocateSegmentUnlocked() {
> Assert that allocation_lock_ is held?
Done


http://gerrit.cloudera.org:8080/#/c/14479/8/src/kudu/consensus/log.cc@999
PS8, Line 999:       if (ctx_.hooks) {
> Could we push the hooks into the SegmentAllocator? Would it be OK to call P
Not as clean since WriteEntryBatch isn't a function of the SegmentAllocator, but it's mostly there. Might be able to push WriteEntryBatch into SegmentAllocator, but I think it begins blurring the interface a bit.



-- 
To view, visit http://gerrit.cloudera.org:8080/14479
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I08be4dbdd8e98b02278de76273e931c314b08161
Gerrit-Change-Number: 14479
Gerrit-PatchSet: 8
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 09 Nov 2019 03:29:38 +0000
Gerrit-HasComments: Yes

[kudu-CR] log: separate out allocation logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Adar Dembo, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14479

to look at the new patch set (#10).

Change subject: log: separate out allocation logic
......................................................................

log: separate out allocation logic

The Log class has a few responsibilities:
1. adding batches to an in-memory queue asynchronously
2. picking off batches from the queue and writing them to disk
3. pre-allocating files in the background when a log segment grows too
   large
4. "rolling" onto a newly pre-allocated file once it is fully allocated
5. garbage collecting old log segments
6. a myriad of other bookkeeping related to indexing and size

1 and 2 have been neatly encapsulated in the Log::AppendThread subclass,
leaving the rest as the monolith that is the Log class.

This patch separates out 3 and 4 into a separate SegmentAllocator class.

This meant a few things:
- I've pushed the schema into the SegmentAllocator. This makes sense
  because the schema is only really needed by the WAL so it can write
  out the current schema of each segment. As such, only the entity that
  manages segment allocation ought to know about it.
- Many of the required Log members are put into a new LogContext,
  containing tablet ID, metrics, etc. We could also use this in the
  AppendThread if we wanted to, but I opted to keep this patch focused
  on the SegmentAllocator.
- The SegmentAllocator needs to call back into the Log in order to
  update the LogReader when it rolls over. Since the LogReader is
  protected by a lock in the Log, rather than providing a back-pointer
  to the Log class or the LogReader class, this passes thread-safe Log
  function calls into the SegmentAllocator.

I tried to keep the SegmentAllocator methods in the same locations as
their Log predecessors to keep the diff small. I'll put them closer
together in a follow-up patch.

Change-Id: I08be4dbdd8e98b02278de76273e931c314b08161
---
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tserver/tablet_server-test.cc
5 files changed, 494 insertions(+), 388 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/79/14479/10
-- 
To view, visit http://gerrit.cloudera.org:8080/14479
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I08be4dbdd8e98b02278de76273e931c314b08161
Gerrit-Change-Number: 14479
Gerrit-PatchSet: 10
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] log: separate out allocation logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14479 )

Change subject: log: separate out allocation logic
......................................................................


Patch Set 10:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14479/9/src/kudu/consensus/log.h
File src/kudu/consensus/log.h:

http://gerrit.cloudera.org:8080/#/c/14479/9/src/kudu/consensus/log.h@191
PS9, Line 191:   std::shared_ptr<LogFaultHooks> hooks_;
> warning: invalid case style for private member 'hooks' [readability-identif
Done


http://gerrit.cloudera.org:8080/#/c/14479/9/src/kudu/consensus/log.cc
File src/kudu/consensus/log.cc:

http://gerrit.cloudera.org:8080/#/c/14479/9/src/kudu/consensus/log.cc@178
PS9, Line 178: using consensus::ReplicateRefPtr;
> warning: using decl 'OpId' is unused [misc-unused-using-decls]
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/14479
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I08be4dbdd8e98b02278de76273e931c314b08161
Gerrit-Change-Number: 14479
Gerrit-PatchSet: 10
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 09 Nov 2019 03:35:04 +0000
Gerrit-HasComments: Yes

[kudu-CR] log: separate out allocation logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Kudu Jenkins, Adar Dembo, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14479

to look at the new patch set (#8).

Change subject: log: separate out allocation logic
......................................................................

log: separate out allocation logic

The Log class has a few responsibilities:
1. adding batches to an in-memory queue asynchronously
2. picking off batches from the queue and writing them to disk
3. pre-allocating files in the background when a log segment grows too
   large
4. "rolling" onto a newly pre-allocated file once it is fully allocated
5. garbage collecting old log segments
6. a myriad of other bookkeeping related to indexing and size

1 and 2 have been neatly encapsulated in the Log::AppendThread subclass,
leaving the rest as the monolith that is the Log class.

This patch separates out 3 and 4 into a separate SegmentAllocator class.

This meant a few things:
- I've pushed the schema into the SegmentAllocator. This makes sense
  because the schema is only really needed by the WAL so it can write
  out the current schema of each segment. As such, only the entity that
  manages segment allocation ought to know about it.
- Many of the required Log members are put into a new LogContext,
  containing tablet ID, metrics, etc. We could also use this in the
  AppendThread if we wanted to, but I opted to keep this patch focused
  on the SegmentAllocator.
- The SegmentAllocator needs to call back into the Log in order to
  update the LogReader when it rolls over. Since the LogReader is
  protected by a lock in the Log, rather than providing a back-pointer
  to the Log class or the LogReader class, this passes thread-safe Log
  function calls into the SegmentAllocator.

I tried to keep the SegmentAllocator methods in the same locations as
their Log predecessors to keep the diff small. I'll put them closer
together in a follow-up patch.

There are no functional changes in this patch.

Change-Id: I08be4dbdd8e98b02278de76273e931c314b08161
---
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tserver/tablet_server-test.cc
5 files changed, 486 insertions(+), 384 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/79/14479/8
-- 
To view, visit http://gerrit.cloudera.org:8080/14479
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I08be4dbdd8e98b02278de76273e931c314b08161
Gerrit-Change-Number: 14479
Gerrit-PatchSet: 8
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] log: separate out allocation logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Kudu Jenkins, Adar Dembo, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14479

to look at the new patch set (#7).

Change subject: log: separate out allocation logic
......................................................................

log: separate out allocation logic

The Log class has a few responsibilities:
1. adding batches to an in-memory queue asynchronously
2. picking off batches from the queue and writing them to disk
3. pre-allocating files in the background when a log segment grows too
   large
4. "rolling" onto a newly pre-allocated file once it is fully allocated
5. garbage collecting old log segments
6. a myriad of other bookkeeping related to indexing and size

1 and 2 have been neatly encapsulated in the Log::AppendThread subclass,
leaving the rest as the monolith that is the Log class.

This patch separates out 3 and 4 into a separate SegmentAllocator class.

This meant a few things:
- I've pushed the schema into the SegmentAllocator. This makes sense
  because the schema is only really needed by the WAL so it can write
  out the current schema of each segment. As such, only the entity that
  manages segment allocation ought to know about it.
- Many of the required Log members are put into a new LogContext,
  containing tablet ID, metrics, etc. We could also use this in the
  AppendThread if we wanted to, but I opted to keep this patch focused
  on the SegmentAllocator.
- The SegmentAllocator needs to call back into the Log in order to
  update the LogReader when it rolls over. Since the LogReader is
  protected by a lock in the Log, rather than providing a back-pointer
  to the Log class or the LogReader class, this passes thread-safe Log
  function calls into the SegmentAllocator.

I tried to keep the SegmentAllocator methods in the same locations as
their Log predecessors to keep the diff small. I'll put them closer
together in a follow-up patch.

There are no functional changes in this patch.

Change-Id: I08be4dbdd8e98b02278de76273e931c314b08161
---
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tserver/tablet_server-test.cc
5 files changed, 485 insertions(+), 384 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/79/14479/7
-- 
To view, visit http://gerrit.cloudera.org:8080/14479
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I08be4dbdd8e98b02278de76273e931c314b08161
Gerrit-Change-Number: 14479
Gerrit-PatchSet: 7
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] log: separate out allocation logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14479 )

Change subject: log: separate out allocation logic
......................................................................


Patch Set 1:

I'm not convinced this is much better, but it's something.


-- 
To view, visit http://gerrit.cloudera.org:8080/14479
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I08be4dbdd8e98b02278de76273e931c314b08161
Gerrit-Change-Number: 14479
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Comment-Date: Thu, 17 Oct 2019 03:28:03 +0000
Gerrit-HasComments: No

[kudu-CR] log: separate out allocation logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14479 )

Change subject: log: separate out allocation logic
......................................................................


Patch Set 5:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/14479/1/src/kudu/consensus/log.h
File src/kudu/consensus/log.h:

http://gerrit.cloudera.org:8080/#/c/14479/1/src/kudu/consensus/log.h@80
PS1, Line 80:   FsManager* fs_manager;
> Would be nice to take the opportunity to doc new functions (at least public
Done


http://gerrit.cloudera.org:8080/#/c/14479/1/src/kudu/consensus/log.h@132
PS1, Line 132:   // Shuts down the allocator threadpool. Note that this _doesn't_ close the
> This backpointer makes the code quite hard to read as you have to go back a
This backpointer wasn't really needed for Sync() as you've described. Updated to use function calls, per an offline discussion.


http://gerrit.cloudera.org:8080/#/c/14479/1/src/kudu/consensus/log.h@349
PS1, Line 349:   //    [302-???]  <open>   (counts as 0MB)
> Doc?
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/14479
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I08be4dbdd8e98b02278de76273e931c314b08161
Gerrit-Change-Number: 14479
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Sat, 02 Nov 2019 02:12:00 +0000
Gerrit-HasComments: Yes

[kudu-CR] log: separate out allocation logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14479 )

Change subject: log: separate out allocation logic
......................................................................


Patch Set 6:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/14479/5/src/kudu/consensus/log.cc
File src/kudu/consensus/log.cc:

http://gerrit.cloudera.org:8080/#/c/14479/5/src/kudu/consensus/log.cc@508
PS5, Line 508:   ctx.metrics.reset(metric_entity ? new LogMetrics(metric_entity) : nullptr);
> warning: std::move of the variable 'options' of the trivially-copyable type
Ack


http://gerrit.cloudera.org:8080/#/c/14479/5/src/kudu/consensus/log.cc@516
PS5, Line 516: 
> warning: std::move of the variable 'options' of the trivially-copyable type
Ack


http://gerrit.cloudera.org:8080/#/c/14479/5/src/kudu/consensus/log.cc@1163
PS5, Line 1163:   RETURN_NOT_OK_PREPEND(reader_add_segment_(std::move(readable_segment)), "add");
> warning: prefer ptr1 = std::move(ptr2) over ptr1.reset(ptr2.release()) [mis
Done


http://gerrit.cloudera.org:8080/#/c/14479/5/src/kudu/consensus/log.cc@1188
PS5, Line 1188: 
> warning: passing result of std::move() as a const reference argument; no mo
Done


http://gerrit.cloudera.org:8080/#/c/14479/5/src/kudu/consensus/log.cc@1193
PS5, Line 1193: 
> warning: passing result of std::move() as a const reference argument; no mo
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/14479
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I08be4dbdd8e98b02278de76273e931c314b08161
Gerrit-Change-Number: 14479
Gerrit-PatchSet: 6
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sun, 03 Nov 2019 06:44:53 +0000
Gerrit-HasComments: Yes

[kudu-CR] log: separate out allocation logic

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14479 )

Change subject: log: separate out allocation logic
......................................................................


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/14479/1/src/kudu/consensus/log.h
File src/kudu/consensus/log.h:

http://gerrit.cloudera.org:8080/#/c/14479/1/src/kudu/consensus/log.h@80
PS1, Line 80: class SegmentAllocator {
Would be nice to take the opportunity to doc new functions (at least public ones) that have no documentation.


http://gerrit.cloudera.org:8080/#/c/14479/1/src/kudu/consensus/log.h@132
PS1, Line 132:   Log* log_;
This backpointer makes the code quite hard to read as you have to go back and forth between Log and SegmentAllocator.

Would it be possible to avoid it by pushing even more state into SegmentAllocator? Log's immutable state (like options) could be copied into SegmentAllocator at construction time. And instances where SegmentAllocator needs to call back into Log could be handled by composition.

For example, I saw that SegmentAllocator::CloseCurrentSegment needs to call Log::Sync. To work around this dependency, define Log::CloseCurrentSegment that first calls Log::Sync, then calls SegmentAllocator::CloseCurrentSegment. That way SegmentAllocator is no longer responsible for Sync and has one less reason to need the backpointer.


http://gerrit.cloudera.org:8080/#/c/14479/1/src/kudu/consensus/log.h@349
PS1, Line 349:   Status AddEmptySegmentInReader(scoped_refptr<ReadableLogSegment> segment);
Doc?



-- 
To view, visit http://gerrit.cloudera.org:8080/14479
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I08be4dbdd8e98b02278de76273e931c314b08161
Gerrit-Change-Number: 14479
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Comment-Date: Thu, 17 Oct 2019 06:55:44 +0000
Gerrit-HasComments: Yes

[kudu-CR] log: separate out allocation logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14479 )

Change subject: log: separate out allocation logic
......................................................................


Patch Set 11:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14479/10/src/kudu/consensus/log.cc
File src/kudu/consensus/log.cc:

http://gerrit.cloudera.org:8080/#/c/14479/10/src/kudu/consensus/log.cc@810
PS10, Line 810:         RETURN_NOT_OK_PREPEND(hooks_->PostSyncIfFsyncEnabled(),
> error: use of undeclared identifier 'hooks' [clang-diagnostic-error]
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/14479
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I08be4dbdd8e98b02278de76273e931c314b08161
Gerrit-Change-Number: 14479
Gerrit-PatchSet: 11
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Sat, 09 Nov 2019 03:54:00 +0000
Gerrit-HasComments: Yes

[kudu-CR] log: separate out allocation logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Kudu Jenkins, Adar Dembo, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14479

to look at the new patch set (#6).

Change subject: log: separate out allocation logic
......................................................................

log: separate out allocation logic

The Log class has a few responsibilities:
1. adding batches to an in-memory queue asynchronously
2. picking off batches from the queue and writing them to disk
3. pre-allocating files in the background when a log segment grows too
   large
4. "rolling" onto a newly pre-allocated file once it is fully allocated
5. garbage collecting old log segments
6. a myriad of other bookkeeping related to indexing and size

1 and 2 have been neatly encapsulated in the Log::AppendThread subclass,
leaving the rest as the monolith that is the Log class.

This patch separates out 3 and 4 into a separate SegmentAllocator class.

This meant a few things:
- I've pushed the schema into the SegmentAllocator. This makes sense
  because the schema is only really needed by the WAL so it can write
  out the current schema of each segment. As such, only the entity that
  manages segment allocation ought to know about it.
- Many of the required Log members are put into a new LogContext,
  containing tablet ID, metrics, etc. We could also use this in the
  AppendThread if we wanted to, but I opted to keep this patch focused
  on the SegmentAllocator.
- The SegmentAllocator needs to call back into the Log in order to
  update the LogReader when it rolls over. Since the LogReader is
  protected by a lock in the Log, rather than providing a back-pointer
  to the Log class or the LogReader class, this passes thread-safe Log
  function calls into the SegmentAllocator.

I tried to keep the SegmentAllocator methods in the same locations as
their Log predecessors to keep the diff small. I'll put them closer
together in a follow-up patch.

There are no functional changes in this patch.

Change-Id: I08be4dbdd8e98b02278de76273e931c314b08161
---
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tserver/tablet_server-test.cc
5 files changed, 486 insertions(+), 384 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/79/14479/6
-- 
To view, visit http://gerrit.cloudera.org:8080/14479
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I08be4dbdd8e98b02278de76273e931c314b08161
Gerrit-Change-Number: 14479
Gerrit-PatchSet: 6
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] log: separate out allocation logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14479 )

Change subject: log: separate out allocation logic
......................................................................

log: separate out allocation logic

The Log class has a few responsibilities:
1. adding batches to an in-memory queue asynchronously
2. picking off batches from the queue and writing them to disk
3. pre-allocating files in the background when a log segment grows too
   large
4. "rolling" onto a newly pre-allocated file once it is fully allocated
5. garbage collecting old log segments
6. a myriad of other bookkeeping related to indexing and size

1 and 2 have been neatly encapsulated in the Log::AppendThread subclass,
leaving the rest as the monolith that is the Log class.

This patch separates out 3 and 4 into a separate SegmentAllocator class.

This meant a few things:
- I've pushed the schema into the SegmentAllocator. This makes sense
  because the schema is only really needed by the WAL so it can write
  out the current schema of each segment. As such, only the entity that
  manages segment allocation ought to know about it.
- Many of the required Log members are put into a new LogContext,
  containing tablet ID, metrics, etc. We could also use this in the
  AppendThread if we wanted to, but I opted to keep this patch focused
  on the SegmentAllocator.
- The SegmentAllocator needs to call back into the Log in order to
  update the LogReader when it rolls over. Since the LogReader is
  protected by a lock in the Log, rather than providing a back-pointer
  to the Log class or the LogReader class, this passes thread-safe Log
  function calls into the SegmentAllocator.

I tried to keep the SegmentAllocator methods in the same locations as
their Log predecessors to keep the diff small. I'll put them closer
together in a follow-up patch.

Change-Id: I08be4dbdd8e98b02278de76273e931c314b08161
Reviewed-on: http://gerrit.cloudera.org:8080/14479
Reviewed-by: Adar Dembo <ad...@cloudera.com>
Tested-by: Kudu Jenkins
---
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tserver/tablet_server-test.cc
5 files changed, 494 insertions(+), 388 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, approved
  Kudu Jenkins: Verified

-- 
To view, visit http://gerrit.cloudera.org:8080/14479
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I08be4dbdd8e98b02278de76273e931c314b08161
Gerrit-Change-Number: 14479
Gerrit-PatchSet: 12
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] log: separate out allocation logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Adar Dembo, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14479

to look at the new patch set (#11).

Change subject: log: separate out allocation logic
......................................................................

log: separate out allocation logic

The Log class has a few responsibilities:
1. adding batches to an in-memory queue asynchronously
2. picking off batches from the queue and writing them to disk
3. pre-allocating files in the background when a log segment grows too
   large
4. "rolling" onto a newly pre-allocated file once it is fully allocated
5. garbage collecting old log segments
6. a myriad of other bookkeeping related to indexing and size

1 and 2 have been neatly encapsulated in the Log::AppendThread subclass,
leaving the rest as the monolith that is the Log class.

This patch separates out 3 and 4 into a separate SegmentAllocator class.

This meant a few things:
- I've pushed the schema into the SegmentAllocator. This makes sense
  because the schema is only really needed by the WAL so it can write
  out the current schema of each segment. As such, only the entity that
  manages segment allocation ought to know about it.
- Many of the required Log members are put into a new LogContext,
  containing tablet ID, metrics, etc. We could also use this in the
  AppendThread if we wanted to, but I opted to keep this patch focused
  on the SegmentAllocator.
- The SegmentAllocator needs to call back into the Log in order to
  update the LogReader when it rolls over. Since the LogReader is
  protected by a lock in the Log, rather than providing a back-pointer
  to the Log class or the LogReader class, this passes thread-safe Log
  function calls into the SegmentAllocator.

I tried to keep the SegmentAllocator methods in the same locations as
their Log predecessors to keep the diff small. I'll put them closer
together in a follow-up patch.

Change-Id: I08be4dbdd8e98b02278de76273e931c314b08161
---
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tserver/tablet_server-test.cc
5 files changed, 494 insertions(+), 388 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/79/14479/11
-- 
To view, visit http://gerrit.cloudera.org:8080/14479
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I08be4dbdd8e98b02278de76273e931c314b08161
Gerrit-Change-Number: 14479
Gerrit-PatchSet: 11
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)

[kudu-CR] log: separate out allocation logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Adar Dembo, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14479

to look at the new patch set (#5).

Change subject: log: separate out allocation logic
......................................................................

log: separate out allocation logic

The Log class has a few responsibilities:
1. adding batches to an in-memory queue asynchronously
2. picking off batches from the queue and writing them to disk
3. pre-allocating files in the background when a log segment grows too
   large
4. "rolling" onto a newly pre-allocated file once it is fully allocated
5. garbage collecting old log segments
6. a myriad of other bookkeeping related to indexing and size

1 and 2 have been neatly encapsulated in the Log::AppendThread subclass,
leaving the rest as the monolith that is the Log class.

This patch separates out 3 and 4 into a separate SegmentAllocator class.

This meant a few things:
- I've pushed the schema into the SegmentAllocator. This makes sense
  because the schema is only really needed by the WAL so it can write
  out the current schema of each segment. As such, only the entity that
  manages segment allocation ought to know about it.
- Many of the required Log members are put into a new LogContext,
  containing tablet ID, metrics, etc. We could also use this in the
  AppendThread if we wanted to, but I opted to keep this patch focused
  on the SegmentAllocator.
- The SegmentAllocator needs to call back into the Log in order to
  update the LogReader when it rolls over. Since the LogReader is
  protected by a lock in the Log, rather than providing a back-pointer
  to the Log class or the LogReader class, this passes thread-safe Log
  function calls into the SegmentAllocator.

I tried to keep the SegmentAllocator methods in the same locations as
their Log predecessors to keep the diff small. I'll put them closer
together in a follow-up patch.

Change-Id: I08be4dbdd8e98b02278de76273e931c314b08161
---
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tserver/tablet_server-test.cc
5 files changed, 479 insertions(+), 378 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/79/14479/5
-- 
To view, visit http://gerrit.cloudera.org:8080/14479
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I08be4dbdd8e98b02278de76273e931c314b08161
Gerrit-Change-Number: 14479
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>

[kudu-CR] log: separate out allocation logic

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14479 )

Change subject: log: separate out allocation logic
......................................................................


Patch Set 8:

(5 comments)

Looks like an improvement, though I think it'll be clearer once the code has had a chance to settle.

http://gerrit.cloudera.org:8080/#/c/14479/8//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/14479/8//COMMIT_MSG@42
PS8, Line 42: There are no functional changes in this patch.
That seems unlikely given the nature of the refactoring.


http://gerrit.cloudera.org:8080/#/c/14479/8/src/kudu/consensus/log.h
File src/kudu/consensus/log.h:

http://gerrit.cloudera.org:8080/#/c/14479/8/src/kudu/consensus/log.h@103
PS8, Line 103:   SegmentAllocator(const LogOptions* opts,
Doc this, especially the functors.


http://gerrit.cloudera.org:8080/#/c/14479/8/src/kudu/consensus/log.cc
File src/kudu/consensus/log.cc:

http://gerrit.cloudera.org:8080/#/c/14479/8/src/kudu/consensus/log.cc@27
PS8, Line 27: #include <boost/bind.hpp>
            : #include <boost/function.hpp>
Can we use std bind/function for this?


http://gerrit.cloudera.org:8080/#/c/14479/8/src/kudu/consensus/log.cc@569
PS8, Line 569: Status SegmentAllocator::AsyncAllocateSegmentUnlocked() {
Assert that allocation_lock_ is held?


http://gerrit.cloudera.org:8080/#/c/14479/8/src/kudu/consensus/log.cc@999
PS8, Line 999:       if (ctx_.hooks) {
Could we push the hooks into the SegmentAllocator? Would it be OK to call PreClose() and PostClose() in CloseCurrentSegment()? And PostAppend() in WriteEntryBatch()?



-- 
To view, visit http://gerrit.cloudera.org:8080/14479
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I08be4dbdd8e98b02278de76273e931c314b08161
Gerrit-Change-Number: 14479
Gerrit-PatchSet: 8
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)
Gerrit-Comment-Date: Fri, 08 Nov 2019 23:59:15 +0000
Gerrit-HasComments: Yes

[kudu-CR] log: separate out allocation logic

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Alexey Serbin, Kudu Jenkins, Adar Dembo, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14479

to look at the new patch set (#9).

Change subject: log: separate out allocation logic
......................................................................

log: separate out allocation logic

The Log class has a few responsibilities:
1. adding batches to an in-memory queue asynchronously
2. picking off batches from the queue and writing them to disk
3. pre-allocating files in the background when a log segment grows too
   large
4. "rolling" onto a newly pre-allocated file once it is fully allocated
5. garbage collecting old log segments
6. a myriad of other bookkeeping related to indexing and size

1 and 2 have been neatly encapsulated in the Log::AppendThread subclass,
leaving the rest as the monolith that is the Log class.

This patch separates out 3 and 4 into a separate SegmentAllocator class.

This meant a few things:
- I've pushed the schema into the SegmentAllocator. This makes sense
  because the schema is only really needed by the WAL so it can write
  out the current schema of each segment. As such, only the entity that
  manages segment allocation ought to know about it.
- Many of the required Log members are put into a new LogContext,
  containing tablet ID, metrics, etc. We could also use this in the
  AppendThread if we wanted to, but I opted to keep this patch focused
  on the SegmentAllocator.
- The SegmentAllocator needs to call back into the Log in order to
  update the LogReader when it rolls over. Since the LogReader is
  protected by a lock in the Log, rather than providing a back-pointer
  to the Log class or the LogReader class, this passes thread-safe Log
  function calls into the SegmentAllocator.

I tried to keep the SegmentAllocator methods in the same locations as
their Log predecessors to keep the diff small. I'll put them closer
together in a follow-up patch.

Change-Id: I08be4dbdd8e98b02278de76273e931c314b08161
---
M src/kudu/consensus/log-test.cc
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tserver/tablet_server-test.cc
5 files changed, 494 insertions(+), 387 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/79/14479/9
-- 
To view, visit http://gerrit.cloudera.org:8080/14479
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I08be4dbdd8e98b02278de76273e931c314b08161
Gerrit-Change-Number: 14479
Gerrit-PatchSet: 9
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Tidy Bot (241)