You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Mike Percy (Code Review)" <ge...@cloudera.org> on 2017/08/17 23:11:30 UTC

[kudu-CR] consensus: Improve contract for API to fetch last-logged op id

Hello David Ribeiro Alves, Todd Lipcon, Alexey Serbin,

I'd like you to do a code review.  Please visit

    http://gerrit.cloudera.org:8080/7717

to review the following change.

Change subject: consensus: Improve contract for API to fetch last-logged op id
......................................................................

consensus: Improve contract for API to fetch last-logged op id

It's important that we differentiate between when we have a known
last-logged op and when we don't actually know what it is or whether we
have ever appended something to the local WAL. This applies both to the
TABLET_DATA_READY case, where this information is stored in the WAL, and
the TABLET_DATA_TOMBSTONED case, where this information is stored in the
superblock.

Cases where we are unable to determine the last-logged OpId from the WAL
when a replica is in TABLET_DATA_READY state:

* Early in the tablet replica lifecycle (before Raft is started).
* When a replica encounters an error during initialization.

Cases where we are unable to determine the last-logged OpId from the
TabletMetadata when a replica is in TABLET_DATA_TOMBSTONED state:

* If the replica was tombstoned while in a failed state.

Included in this patch are the following API improvements:

1. Delete Log::GetLatestEntryOpId(). Previously, this method would only
   return something other than MinimumOpId() if a log entry had been
   appended during the object's lifetime. It is abandoned in favor of
   RaftConsensus::GetLastOpId(RECEIVED_OPID) which delegates to
   PeerMessageQueue::GetLastOpIdInLog().
2. Merge PeerMessageQueue::Init() into the PeerMessageQueue constructor.
   This allows us to remove one lifecycle state and allows us to
   guarantee that, once the queue is constructed, we can always get a
   valid last-logged opid from it (see #1).
3. Make TabletMetadata::tombstone_last_logged_opid() return a
   boost::optional<OpId>. We need to clearly differentiate between when
   we know the last-logged opid and when we don't. We also consider
   MinimumOpId() to be equal to boost::none at superblock load time,
   since previous versions of Kudu may have written (0,0) into the
   TabletMetadata 'tombstone_last_logged_opid' field.

Change-Id: Ia4e4501a61cd40fdee0dc918b77675a0bc2515e7
---
M src/kudu/consensus/consensus_peers-test.cc
M src/kudu/consensus/consensus_queue-test.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/consensus_queue.h
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/consensus/mt-log-test.cc
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/consensus/raft_consensus_quorum-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica-test.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_source_session.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
18 files changed, 200 insertions(+), 224 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/17/7717/1
-- 
To view, visit http://gerrit.cloudera.org:8080/7717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ia4e4501a61cd40fdee0dc918b77675a0bc2515e7
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] consensus: Improve contract for API to fetch last-logged op id

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: consensus: Improve contract for API to fetch last-logged op id
......................................................................


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4e4501a61cd40fdee0dc918b77675a0bc2515e7
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] consensus: Improve contract for API to fetch last-logged op id

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: consensus: Improve contract for API to fetch last-logged op id
......................................................................


Patch Set 4: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/7717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4e4501a61cd40fdee0dc918b77675a0bc2515e7
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] consensus: Improve contract for API to fetch last-logged op id

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: consensus: Improve contract for API to fetch last-logged op id
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7717/1/src/kudu/tserver/tablet_copy_client.cc
File src/kudu/tserver/tablet_copy_client.cc:

Line 134:   if (last_logged_opid && last_logged_opid->term() > caller_term) {
> worth a log message indicating that it is allowing replacement of a failed 
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/7717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4e4501a61cd40fdee0dc918b77675a0bc2515e7
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] consensus: Improve contract for API to fetch last-logged op id

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Hello Todd Lipcon, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7717

to look at the new patch set (#4).

Change subject: consensus: Improve contract for API to fetch last-logged op id
......................................................................

consensus: Improve contract for API to fetch last-logged op id

It's important that we differentiate between when we have a known
last-logged op and when we don't actually know what it is or whether we
have ever appended something to the local WAL. This applies both to the
TABLET_DATA_READY case, where this information is stored in the WAL, and
the TABLET_DATA_TOMBSTONED case, where this information is stored in the
superblock.

Cases where we are unable to determine the last-logged OpId from the WAL
when a replica is in TABLET_DATA_READY state:

* Early in the tablet replica lifecycle (before Raft is started).
* When a replica encounters an error during initialization.

Cases where we are unable to determine the last-logged OpId from the
TabletMetadata when a replica is in TABLET_DATA_TOMBSTONED state:

* If the replica was tombstoned while in a failed state.

Included in this patch are the following API improvements:

1. Delete Log::GetLatestEntryOpId(). Previously, this method would only
   return something other than MinimumOpId() if a log entry had been
   appended during the object's lifetime. It is abandoned in favor of
   RaftConsensus::GetLastOpId(RECEIVED_OPID) which delegates to
   PeerMessageQueue::GetLastOpIdInLog().
2. Merge PeerMessageQueue::Init() into the PeerMessageQueue constructor.
   This allows us to remove one lifecycle state and allows us to
   guarantee that, once the queue is constructed, we can always get a
   valid last-logged opid from it (see #1).
3. Make TabletMetadata::tombstone_last_logged_opid() return a
   boost::optional<OpId>. We need to clearly differentiate between when
   we know the last-logged opid and when we don't. We also consider
   MinimumOpId() to be equal to boost::none at superblock load time,
   since previous versions of Kudu may have written (0,0) into the
   TabletMetadata 'tombstone_last_logged_opid' field.

Change-Id: Ia4e4501a61cd40fdee0dc918b77675a0bc2515e7
---
M src/kudu/consensus/consensus-test-util.h
M src/kudu/consensus/consensus_peers-test.cc
M src/kudu/consensus/consensus_queue-test.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/consensus_queue.h
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/consensus/mt-log-test.cc
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/consensus/raft_consensus_quorum-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica-test.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_source_session.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
19 files changed, 244 insertions(+), 242 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/17/7717/4
-- 
To view, visit http://gerrit.cloudera.org:8080/7717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia4e4501a61cd40fdee0dc918b77675a0bc2515e7
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] consensus: Improve contract for API to fetch last-logged op id

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: consensus: Improve contract for API to fetch last-logged op id
......................................................................


Patch Set 1:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/7717/1/src/kudu/consensus/consensus_peers-test.cc
File src/kudu/consensus/consensus_peers-test.cc:

Line 210:   ASSERT_OPID_EQ(first, message_queue_->GetLastOpIdInLog());
> warning: local copy '_left210' of the variable 'first' is never modified; c
Done


http://gerrit.cloudera.org:8080/#/c/7717/1/src/kudu/consensus/consensus_queue.cc
File src/kudu/consensus/consensus_queue.cc:

Line 125:       log_cache_(metric_entity, std::move(log), local_peer_pb_.permanent_uuid(), tablet_id_),
> warning: passing result of std::move() as a const reference argument; no mo
Done


Line 126:       metrics_(std::move(metric_entity)),
> warning: passing result of std::move() as a const reference argument; no mo
Done


Line 143:   log_cache_.Init(queue_state_.last_appended);
> mind adding a TODO to merge LogCache::Init into the LogCache ctor, given we
Done


http://gerrit.cloudera.org:8080/#/c/7717/1/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

Line 245:                                       queue.get(),
> in the case that one of the RETURN_NOT_OK calls below fails, then you'll en
Done


Line 250:   pending->SetInitialCommittedOpId(info.last_committed_id);
> I believe it was on purpose that this was called after the appending of the
Ah, thank you for the catch. This caused a bug that I was having trouble diagnosing.


PS1, Line 1492: opt_local_last_logged_opid) ? 
> you can use .value_or(MinimumOpId())
Actually, in this patch I can bypass this because we know RaftConsensus is running so we can use queue_->GetLastOpIdInLog()


Line 1493:                                                              : MinimumOpId();
> is this right? I thought, if we don't know our own local op id, then it wou
Not sure what I was thinking. Removed.


Line 2226:   if (!queue_ || !pending_) return boost::none;
> think it's worth a comment here explaining the cases where we could hit thi
Done


PS1, Line 2234: default:
              :       return boost::none;
> is there another valid value to pass here? not FATAL?
Hrm, yeah. It's a protobuf enum that is exposed to a remote read API via RPC, so for forward compatibility I'll make it a DFATAL and return boost::none.


http://gerrit.cloudera.org:8080/#/c/7717/1/src/kudu/tablet/tablet_metadata.h
File src/kudu/tablet/tablet_metadata.h:

Line 216:   void SetPreFlushCallback(StatusClosure callback) { pre_flush_callback_ = callback; }
> warning: parameter 'callback' is passed by value and only copied once; cons
Done


http://gerrit.cloudera.org:8080/#/c/7717/1/src/kudu/tserver/tablet_copy_client.cc
File src/kudu/tserver/tablet_copy_client.cc:

Line 134:   if (last_logged_opid && last_logged_opid->term() > caller_term) {
> same question as in the Vote case -- is it correct to allow replacement if 
I'm not sure what else we can do. If we have tombstoned a failed replica, we will not know the last-logged opid, and this covers that case.


Line 237:     if (last_logged_opid && last_logged_opid->term() > remote_cstate_->current_term()) {
> same
see above


http://gerrit.cloudera.org:8080/#/c/7717/1/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

Line 460:         if (last_logged_opid) {
> same question
see my response in the in other file


Line 474:         int64_t last_logged_term = opt_last_logged_opid->term();
> is this guaranteed to be non-none?
Ah, indeed no. I'll give it the same treatment as the others for now.


-- 
To view, visit http://gerrit.cloudera.org:8080/7717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4e4501a61cd40fdee0dc918b77675a0bc2515e7
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] consensus: Improve contract for API to fetch last-logged op id

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: consensus: Improve contract for API to fetch last-logged op id
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7717/1/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

Line 474:         int64_t last_logged_term = opt_last_logged_opid->term();
> Ah, indeed no. I'll give it the same treatment as the others for now.
Actually, since we are in TABLET_DATA_READY state, not knowing the last-logged opid would be unexpected. So we should be able to reject the TabletCopy request in that case.


-- 
To view, visit http://gerrit.cloudera.org:8080/7717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4e4501a61cd40fdee0dc918b77675a0bc2515e7
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] consensus: Improve contract for API to fetch last-logged op id

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Hello Todd Lipcon, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7717

to look at the new patch set (#3).

Change subject: consensus: Improve contract for API to fetch last-logged op id
......................................................................

consensus: Improve contract for API to fetch last-logged op id

It's important that we differentiate between when we have a known
last-logged op and when we don't actually know what it is or whether we
have ever appended something to the local WAL. This applies both to the
TABLET_DATA_READY case, where this information is stored in the WAL, and
the TABLET_DATA_TOMBSTONED case, where this information is stored in the
superblock.

Cases where we are unable to determine the last-logged OpId from the WAL
when a replica is in TABLET_DATA_READY state:

* Early in the tablet replica lifecycle (before Raft is started).
* When a replica encounters an error during initialization.

Cases where we are unable to determine the last-logged OpId from the
TabletMetadata when a replica is in TABLET_DATA_TOMBSTONED state:

* If the replica was tombstoned while in a failed state.

Included in this patch are the following API improvements:

1. Delete Log::GetLatestEntryOpId(). Previously, this method would only
   return something other than MinimumOpId() if a log entry had been
   appended during the object's lifetime. It is abandoned in favor of
   RaftConsensus::GetLastOpId(RECEIVED_OPID) which delegates to
   PeerMessageQueue::GetLastOpIdInLog().
2. Merge PeerMessageQueue::Init() into the PeerMessageQueue constructor.
   This allows us to remove one lifecycle state and allows us to
   guarantee that, once the queue is constructed, we can always get a
   valid last-logged opid from it (see #1).
3. Make TabletMetadata::tombstone_last_logged_opid() return a
   boost::optional<OpId>. We need to clearly differentiate between when
   we know the last-logged opid and when we don't. We also consider
   MinimumOpId() to be equal to boost::none at superblock load time,
   since previous versions of Kudu may have written (0,0) into the
   TabletMetadata 'tombstone_last_logged_opid' field.

Change-Id: Ia4e4501a61cd40fdee0dc918b77675a0bc2515e7
---
M src/kudu/consensus/consensus-test-util.h
M src/kudu/consensus/consensus_peers-test.cc
M src/kudu/consensus/consensus_queue-test.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/consensus_queue.h
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/consensus/mt-log-test.cc
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/consensus/raft_consensus_quorum-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica-test.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_source_session.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
19 files changed, 242 insertions(+), 242 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/17/7717/3
-- 
To view, visit http://gerrit.cloudera.org:8080/7717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia4e4501a61cd40fdee0dc918b77675a0bc2515e7
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] consensus: Improve contract for API to fetch last-logged op id

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: consensus: Improve contract for API to fetch last-logged op id
......................................................................


Patch Set 1:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/7717/1/src/kudu/consensus/consensus_queue.cc
File src/kudu/consensus/consensus_queue.cc:

Line 143:   log_cache_.Init(queue_state_.last_appended);
mind adding a TODO to merge LogCache::Init into the LogCache ctor, given we now have all the necessary parameters at construction time?


http://gerrit.cloudera.org:8080/#/c/7717/1/src/kudu/consensus/raft_consensus.cc
File src/kudu/consensus/raft_consensus.cc:

Line 245:                                       queue.get(),
in the case that one of the RETURN_NOT_OK calls below fails, then you'll end up with peer_manager_ constructed with a pointer to a destructed 'queue'. I think you may need to also defer storage of peer_manager_ until down below as well.


Line 250:   pending->SetInitialCommittedOpId(info.last_committed_id);
I believe it was on purpose that this was called after the appending of the pending ops to the PendingRound structure below.


PS1, Line 1492: opt_local_last_logged_opid) ? 
you can use .value_or(MinimumOpId())


Line 1493:                                                              : MinimumOpId();
is this right? I thought, if we don't know our own local op id, then it would be unsafe to vote ever, no? ie it should be MaximumOpId or something?


Line 2226:   if (!queue_ || !pending_) return boost::none;
think it's worth a comment here explaining the cases where we could hit this case


PS1, Line 2234: default:
              :       return boost::none;
is there another valid value to pass here? not FATAL?


http://gerrit.cloudera.org:8080/#/c/7717/1/src/kudu/tserver/tablet_copy_client.cc
File src/kudu/tserver/tablet_copy_client.cc:

Line 134:   if (last_logged_opid && last_logged_opid->term() > caller_term) {
same question as in the Vote case -- is it correct to allow replacement if our last opid is 'unknown'?


Line 237:     if (last_logged_opid && last_logged_opid->term() > remote_cstate_->current_term()) {
same


http://gerrit.cloudera.org:8080/#/c/7717/1/src/kudu/tserver/ts_tablet_manager.cc
File src/kudu/tserver/ts_tablet_manager.cc:

Line 460:         if (last_logged_opid) {
same question


Line 474:         int64_t last_logged_term = opt_last_logged_opid->term();
is this guaranteed to be non-none?


-- 
To view, visit http://gerrit.cloudera.org:8080/7717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4e4501a61cd40fdee0dc918b77675a0bc2515e7
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] consensus: Improve contract for API to fetch last-logged op id

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/7717

to look at the new patch set (#2).

Change subject: consensus: Improve contract for API to fetch last-logged op id
......................................................................

consensus: Improve contract for API to fetch last-logged op id

It's important that we differentiate between when we have a known
last-logged op and when we don't actually know what it is or whether we
have ever appended something to the local WAL. This applies both to the
TABLET_DATA_READY case, where this information is stored in the WAL, and
the TABLET_DATA_TOMBSTONED case, where this information is stored in the
superblock.

Cases where we are unable to determine the last-logged OpId from the WAL
when a replica is in TABLET_DATA_READY state:

* Early in the tablet replica lifecycle (before Raft is started).
* When a replica encounters an error during initialization.

Cases where we are unable to determine the last-logged OpId from the
TabletMetadata when a replica is in TABLET_DATA_TOMBSTONED state:

* If the replica was tombstoned while in a failed state.

Included in this patch are the following API improvements:

1. Delete Log::GetLatestEntryOpId(). Previously, this method would only
   return something other than MinimumOpId() if a log entry had been
   appended during the object's lifetime. It is abandoned in favor of
   RaftConsensus::GetLastOpId(RECEIVED_OPID) which delegates to
   PeerMessageQueue::GetLastOpIdInLog().
2. Merge PeerMessageQueue::Init() into the PeerMessageQueue constructor.
   This allows us to remove one lifecycle state and allows us to
   guarantee that, once the queue is constructed, we can always get a
   valid last-logged opid from it (see #1).
3. Make TabletMetadata::tombstone_last_logged_opid() return a
   boost::optional<OpId>. We need to clearly differentiate between when
   we know the last-logged opid and when we don't. We also consider
   MinimumOpId() to be equal to boost::none at superblock load time,
   since previous versions of Kudu may have written (0,0) into the
   TabletMetadata 'tombstone_last_logged_opid' field.

Change-Id: Ia4e4501a61cd40fdee0dc918b77675a0bc2515e7
---
M src/kudu/consensus/consensus-test-util.h
M src/kudu/consensus/consensus_peers-test.cc
M src/kudu/consensus/consensus_queue-test.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/consensus_queue.h
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/consensus/mt-log-test.cc
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/consensus/raft_consensus_quorum-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica-test.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_source_session.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
19 files changed, 227 insertions(+), 242 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/17/7717/2
-- 
To view, visit http://gerrit.cloudera.org:8080/7717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ia4e4501a61cd40fdee0dc918b77675a0bc2515e7
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] consensus: Improve contract for API to fetch last-logged op id

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has submitted this change and it was merged.

Change subject: consensus: Improve contract for API to fetch last-logged op id
......................................................................


consensus: Improve contract for API to fetch last-logged op id

It's important that we differentiate between when we have a known
last-logged op and when we don't actually know what it is or whether we
have ever appended something to the local WAL. This applies both to the
TABLET_DATA_READY case, where this information is stored in the WAL, and
the TABLET_DATA_TOMBSTONED case, where this information is stored in the
superblock.

Cases where we are unable to determine the last-logged OpId from the WAL
when a replica is in TABLET_DATA_READY state:

* Early in the tablet replica lifecycle (before Raft is started).
* When a replica encounters an error during initialization.

Cases where we are unable to determine the last-logged OpId from the
TabletMetadata when a replica is in TABLET_DATA_TOMBSTONED state:

* If the replica was tombstoned while in a failed state.

Included in this patch are the following API improvements:

1. Delete Log::GetLatestEntryOpId(). Previously, this method would only
   return something other than MinimumOpId() if a log entry had been
   appended during the object's lifetime. It is abandoned in favor of
   RaftConsensus::GetLastOpId(RECEIVED_OPID) which delegates to
   PeerMessageQueue::GetLastOpIdInLog().
2. Merge PeerMessageQueue::Init() into the PeerMessageQueue constructor.
   This allows us to remove one lifecycle state and allows us to
   guarantee that, once the queue is constructed, we can always get a
   valid last-logged opid from it (see #1).
3. Make TabletMetadata::tombstone_last_logged_opid() return a
   boost::optional<OpId>. We need to clearly differentiate between when
   we know the last-logged opid and when we don't. We also consider
   MinimumOpId() to be equal to boost::none at superblock load time,
   since previous versions of Kudu may have written (0,0) into the
   TabletMetadata 'tombstone_last_logged_opid' field.

Change-Id: Ia4e4501a61cd40fdee0dc918b77675a0bc2515e7
Reviewed-on: http://gerrit.cloudera.org:8080/7717
Reviewed-by: Todd Lipcon <to...@apache.org>
Tested-by: Kudu Jenkins
---
M src/kudu/consensus/consensus-test-util.h
M src/kudu/consensus/consensus_peers-test.cc
M src/kudu/consensus/consensus_queue-test.cc
M src/kudu/consensus/consensus_queue.cc
M src/kudu/consensus/consensus_queue.h
M src/kudu/consensus/log.cc
M src/kudu/consensus/log.h
M src/kudu/consensus/mt-log-test.cc
M src/kudu/consensus/raft_consensus.cc
M src/kudu/consensus/raft_consensus.h
M src/kudu/consensus/raft_consensus_quorum-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/tablet_replica-test.cc
M src/kudu/tools/kudu-tool-test.cc
M src/kudu/tserver/tablet_copy_client.cc
M src/kudu/tserver/tablet_copy_source_session.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/ts_tablet_manager.cc
19 files changed, 244 insertions(+), 242 deletions(-)

Approvals:
  Todd Lipcon: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/7717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ia4e4501a61cd40fdee0dc918b77675a0bc2515e7
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] consensus: Improve contract for API to fetch last-logged op id

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: consensus: Improve contract for API to fetch last-logged op id
......................................................................


Patch Set 3:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/7717/3/src/kudu/consensus/consensus_queue.cc
File src/kudu/consensus/consensus_queue.cc:

Line 114: PeerMessageQueue::PeerMessageQueue(scoped_refptr<MetricEntity> metric_entity,
> warning: the parameter 'metric_entity' is copied for each invocation but on
This has cascading effects and I don't want to fix all the downstream classes in this patch. We'll slowly work toward that in the future.


Line 115:                                    scoped_refptr<log::Log> log,
> warning: the parameter 'log' is copied for each invocation but only used as
same


-- 
To view, visit http://gerrit.cloudera.org:8080/7717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4e4501a61cd40fdee0dc918b77675a0bc2515e7
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] consensus: Improve contract for API to fetch last-logged op id

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change.

Change subject: consensus: Improve contract for API to fetch last-logged op id
......................................................................


Patch Set 4:

I had to manually rebase on top of Alexey's IWYU patch. The conflicts were very minimal.

-- 
To view, visit http://gerrit.cloudera.org:8080/7717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4e4501a61cd40fdee0dc918b77675a0bc2515e7
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] consensus: Improve contract for API to fetch last-logged op id

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: consensus: Improve contract for API to fetch last-logged op id
......................................................................


Patch Set 2: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/7717/1/src/kudu/tserver/tablet_copy_client.cc
File src/kudu/tserver/tablet_copy_client.cc:

Line 134:   if (last_logged_opid && last_logged_opid->term() > caller_term) {
> I'm not sure what else we can do. If we have tombstoned a failed replica, w
worth a log message indicating that it is allowing replacement of a failed replica? I seem to remember we convinced ourselves that this is probably slightly unsafe technically but that we should do it, but I think it would be good to dig up some of our thinking and put it in a comment and have some log message which indicates that we are "bending the rules" in some way here.


-- 
To view, visit http://gerrit.cloudera.org:8080/7717
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ia4e4501a61cd40fdee0dc918b77675a0bc2515e7
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes