You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Andrew Wong (Code Review)" <ge...@cloudera.org> on 2020/09/23 02:03:52 UTC

[kudu-CR] KUDU-2612 p11: persist txn metadata in superblock

Andrew Wong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/16492


Change subject: KUDU-2612 p11: persist txn metadata in superblock
......................................................................

KUDU-2612 p11: persist txn metadata in superblock

Currently, transaction metadata is entirely stored in the WALs; once a
transaction is begun, it anchors WALs indefinitely. This is untenable,
as any tablet that participates in a transaction would never be able to
GC its WALs.

This patch addresses this by storing transaction metadata in the tablet
superblock for the following participant ops:
- BEGIN_TXN: an empty record is added to the superblock for the given
  transaction ID. In the future, this can be used to store the owner of
  the transaction.
- FINALIZE_COMMIT: the commit timestamp is added to the superblock for
  the given transaction ID.
- ABORT_TXN: an abort bit is added to the superblock for the given
  transaction ID.

Upon applying each of these ops, the op's OpId is anchored in the WALs
until the tablet metadata is successfully flushed, ensuring that if we
don't flush the tablet metadata before restarting, we can rebuild any
ops that whose mutations had not been persisted by replaying the WALs.

What about BEGIN_COMMIT ops? While these ops will still need to be
anchored, BEGIN_COMMIT ops don't need to persist any long-term
information. As such, their anchoring is slightly different -- rather
than mutating the tablet metadata and unanchoring on a metadata flush,
BEGIN_COMMIT ops will update the in-memory state for the in-flight Txn,
and unanchor once the corresponding FINALIZE_COMMIT or ABORT_TXN begins
applying.

On bootstrap, the following rules are applied:
- If we've persisted terminal information (i.e. abort or commit) about a
  transaction ID, we can skip replaying any participant op for that
  transaction.
- If we otherwise have a persisted record for the transaction ID, we
  should start an open transaction and replay all participant ops for
  that transaction.
- If we have no record of a transaction persisted, we must replay all
  participant ops for the given transaction.

While storing things in the superblock is not ideal, there are further
improvements that can be made to reduce its size, e.g. after finalizing
a transaction, the transactions mutations may be merged in with the rest
of the tablet; once all mutations of a given transaction have been
merged, the transaction metadata may be removed from the metadata.

Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
---
M src/kudu/integration-tests/txn_participant-itest.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/ops/participant_op.cc
M src/kudu/tablet/ops/participant_op.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/txn_participant-test.cc
M src/kudu/tablet/txn_participant.cc
M src/kudu/tablet/txn_participant.h
13 files changed, 675 insertions(+), 119 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/92/16492/1
-- 
To view, visit http://gerrit.cloudera.org:8080/16492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
Gerrit-Change-Number: 16492
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>

[kudu-CR] KUDU-2612 p11: persist txn metadata in superblock

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16492

to look at the new patch set (#9).

Change subject: KUDU-2612 p11: persist txn metadata in superblock
......................................................................

KUDU-2612 p11: persist txn metadata in superblock

Currently, transaction metadata is entirely stored in the WALs; once a
transaction is begun, it anchors WALs indefinitely. This is untenable,
as any tablet that participates in a transaction would never be able to
GC its WALs.

This patch addresses this by storing transaction metadata in the tablet
superblock for the following participant ops:
- BEGIN_TXN: an empty record is added to the superblock for the given
  transaction ID. In the future, this can be used to store the owner of
  the transaction.
- FINALIZE_COMMIT: the commit timestamp is added to the superblock for
  the given transaction ID. The commit timestamp of a transaction will
  be used when scanning over mutations applied to a tablet instead of
  the apply timestamp.
- ABORT_TXN: an abort bit is added to the superblock for the given
  transaction ID.

Upon applying each of these ops, the op's OpId is anchored in the WALs
until the tablet metadata is successfully flushed, ensuring that if we
don't flush the tablet metadata before restarting, we can rebuild any
ops that whose mutations had not been persisted by replaying the WALs.

What about BEGIN_COMMIT ops? While these ops will still need to be
anchored, BEGIN_COMMIT ops don't need to persist any long-term
information. As such, their anchoring is slightly different -- rather
than mutating the tablet metadata and unanchoring on a metadata flush,
BEGIN_COMMIT ops will update the in-memory state for the in-flight Txn,
and unanchor once the corresponding FINALIZE_COMMIT or ABORT_TXN begins
applying.

On bootstrap, the following rules are applied:
- If we've persisted terminal information (i.e. abort or commit) about a
  transaction ID, we can skip replaying any participant op for that
  transaction.
- If we otherwise have a persisted record for the transaction ID, we
  should start an open transaction and replay all participant ops for
  that transaction.
- If we have no record of a transaction persisted, we must replay all
  participant ops for the given transaction.

While storing things in the superblock is not ideal, there are further
improvements that can be made to reduce its size, e.g. after finalizing
a transaction, the transactions mutations may be merged in with the rest
of the tablet; once all mutations of a given transaction have been
merged, the transaction metadata may be removed from the metadata.

Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
---
M src/kudu/integration-tests/txn_participant-itest.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/ops/participant_op.cc
M src/kudu/tablet/ops/participant_op.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/txn_participant-test.cc
M src/kudu/tablet/txn_participant.cc
M src/kudu/tablet/txn_participant.h
13 files changed, 735 insertions(+), 130 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/92/16492/9
-- 
To view, visit http://gerrit.cloudera.org:8080/16492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
Gerrit-Change-Number: 16492
Gerrit-PatchSet: 9
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] KUDU-2612 p11: persist txn metadata in superblock

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16492 )

Change subject: KUDU-2612 p11: persist txn metadata in superblock
......................................................................


Patch Set 11: Verified+1

The failed test is unrelated to this patch:

TabletCopyClientTest.TestLifeCycle


-- 
To view, visit http://gerrit.cloudera.org:8080/16492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
Gerrit-Change-Number: 16492
Gerrit-PatchSet: 11
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 06 Oct 2020 19:22:13 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-2612 p11: persist txn metadata in superblock

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/16492 )

Change subject: KUDU-2612 p11: persist txn metadata in superblock
......................................................................

KUDU-2612 p11: persist txn metadata in superblock

Currently, transaction metadata is entirely stored in the WALs; once a
transaction is begun, it anchors WALs indefinitely. This is untenable,
as any tablet that participates in a transaction would never be able to
GC its WALs.

This patch addresses this by storing transaction metadata in the tablet
superblock for the following participant ops:
- BEGIN_TXN: an empty record is added to the superblock for the given
  transaction ID. In the future, this can be used to store the owner of
  the transaction.
- FINALIZE_COMMIT: the commit timestamp is added to the superblock for
  the given transaction ID. The commit timestamp of a transaction will
  be used when scanning over mutations applied to a tablet instead of
  the apply timestamp.
- ABORT_TXN: an abort bit is added to the superblock for the given
  transaction ID.

Upon applying each of these ops, the op's OpId is anchored in the WALs
until the tablet metadata is successfully flushed, ensuring that if we
don't flush the tablet metadata before restarting, we can rebuild any
ops that whose mutations had not been persisted by replaying the WALs.

What about BEGIN_COMMIT ops? While these ops will still need to be
anchored, BEGIN_COMMIT ops don't need to persist any long-term
information. As such, their anchoring is slightly different -- rather
than mutating the tablet metadata and unanchoring on a metadata flush,
BEGIN_COMMIT ops will update the in-memory state for the in-flight Txn,
and unanchor once the corresponding FINALIZE_COMMIT or ABORT_TXN begins
applying.

On bootstrap, the following rules are applied:
- If we've persisted terminal information (i.e. abort or commit) about a
  transaction ID, we can skip replaying any participant op for that
  transaction.
- If we otherwise have a persisted record for the transaction ID, we
  should start an open transaction and replay all participant ops for
  that transaction.
- If we have no record of a transaction persisted, we must replay all
  participant ops for the given transaction.

While storing things in the superblock is not ideal, there are further
improvements that can be made to reduce its size, e.g. after finalizing
a transaction, the transactions mutations may be merged in with the rest
of the tablet; once all mutations of a given transaction have been
merged, the transaction metadata may be removed from the metadata.

Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
Reviewed-on: http://gerrit.cloudera.org:8080/16492
Tested-by: Andrew Wong <aw...@cloudera.com>
Reviewed-by: Alexey Serbin <as...@cloudera.com>
---
M src/kudu/integration-tests/txn_participant-itest.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/ops/participant_op.cc
M src/kudu/tablet/ops/participant_op.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/txn_participant-test.cc
M src/kudu/tablet/txn_participant.cc
M src/kudu/tablet/txn_participant.h
13 files changed, 739 insertions(+), 133 deletions(-)

Approvals:
  Andrew Wong: Verified
  Alexey Serbin: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/16492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
Gerrit-Change-Number: 16492
Gerrit-PatchSet: 12
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] KUDU-2612 p11: persist txn metadata in superblock

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/16492 )

Change subject: KUDU-2612 p11: persist txn metadata in superblock
......................................................................


Patch Set 7:

(11 comments)

http://gerrit.cloudera.org:8080/#/c/16492/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16492/3//COMMIT_MSG@17
PS3, Line 17: this can be used to store the owner of
            :   the transaction
> We currently use the owner on the TxnStatusManager to ensure that only the 
Thank you for the clarification!


http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/integration-tests/txn_participant-itest.cc
File src/kudu/integration-tests/txn_participant-itest.cc:

http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/integration-tests/txn_participant-itest.cc@231
PS7, Line 231:     FLAGS_flush_threshold_mb = 1;
             :     FLAGS_flush_threshold_secs = 1;
             :     FLAGS_log_segment_size_mb = 1;
nit: it would be nice to document the reasoning behind the customization of these flags


http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/integration-tests/txn_participant-itest.cc@234
PS7, Line 234:     NO_FATALS(TxnParticipantITest::SetUp());
nit: should we also set FLAGS_maintenance_manager_polling_interval_ms to some lower value (default is 250) ?


http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/tablet/ops/participant_op.cc
File src/kudu/tablet/ops/participant_op.cc:

http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/tablet/ops/participant_op.cc@181
PS7, Line 181: op.finalized_commit_timestamp();
nit: does it make sense to add

  DCHECK(op.has_finalized_commit_timestamp())

?


http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/tablet/tablet.h
File src/kudu/tablet/tablet.h:

http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/tablet/tablet.h@122
PS7, Line 122: std::unordered_set<int64_t>
nit: could we get away with just

  const std::unordered_set<int64_t>& in_flight_txn_ids = {}

?


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet.h
File src/kudu/tablet/tablet.h:

http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet.h@181
PS3, Line 181: Begins
> I don't think it's worth updating all such instances in such a large file, 
Yep, my point was to keep the tense as with the majority of comments in the file.  OK, I guess it's not a big deal anyways :)


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet_metadata.h
File src/kudu/tablet/tablet_metadata.h:

http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet_metadata.h@68
PS3, Line 68: ent state 
> See https://kudu.apache.org/docs/contributing.html#_pointers
SGTM: it seems memory usage benefits (i.e. using 8 instead of 16 bytes) is the crux here.


http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/tablet/txn_participant-test.cc
File src/kudu/tablet/txn_participant-test.cc:

http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/tablet/txn_participant-test.cc@448
PS7, Line 448:   ASSERT_OK(tablet_replica_->GetGCableDataSize(&gcable_size));
nit: do you mind adding an assert just before performing any txn-related participant ops to make sure GC-able size was 0 at that point?


http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/tablet/txn_participant-test.cc@462
PS7, Line 462: ASSERT_EQ(0, gcable_size);
Not sure whether whether I'm missing it, but I cannot see the call to tablet_replica_->GetGCableDataSize() before this assertion to update 'gcable_size'.

Also, if performing such a verification, should we set --log_min_segments_to_retain to unnatural low value of 0 to make sure we don't hit the case of non-finished segment in Log::AllocateSegmentAndRollOverForTests()?


http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/tablet/txn_participant-test.cc@468
PS7, Line 468: ASSERT_EQ(0, gcable_size);
Not sure I see how 'gcable_size' could be updated before this assertion.  Am I missing something?


http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/tablet/txn_participant-test.cc@506
PS7, Line 506: Test that rebuilding transaction state from the WALs and metadata
An incomplete sentence?



-- 
To view, visit http://gerrit.cloudera.org:8080/16492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
Gerrit-Change-Number: 16492
Gerrit-PatchSet: 7
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 30 Sep 2020 04:35:43 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2612 p11: persist txn metadata in superblock

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16492

to look at the new patch set (#8).

Change subject: KUDU-2612 p11: persist txn metadata in superblock
......................................................................

KUDU-2612 p11: persist txn metadata in superblock

Currently, transaction metadata is entirely stored in the WALs; once a
transaction is begun, it anchors WALs indefinitely. This is untenable,
as any tablet that participates in a transaction would never be able to
GC its WALs.

This patch addresses this by storing transaction metadata in the tablet
superblock for the following participant ops:
- BEGIN_TXN: an empty record is added to the superblock for the given
  transaction ID. In the future, this can be used to store the owner of
  the transaction.
- FINALIZE_COMMIT: the commit timestamp is added to the superblock for
  the given transaction ID. The commit timestamp of a transaction will
  be used when scanning over mutations applied to a tablet instead of
  the apply timestamp.
- ABORT_TXN: an abort bit is added to the superblock for the given
  transaction ID.

Upon applying each of these ops, the op's OpId is anchored in the WALs
until the tablet metadata is successfully flushed, ensuring that if we
don't flush the tablet metadata before restarting, we can rebuild any
ops that whose mutations had not been persisted by replaying the WALs.

What about BEGIN_COMMIT ops? While these ops will still need to be
anchored, BEGIN_COMMIT ops don't need to persist any long-term
information. As such, their anchoring is slightly different -- rather
than mutating the tablet metadata and unanchoring on a metadata flush,
BEGIN_COMMIT ops will update the in-memory state for the in-flight Txn,
and unanchor once the corresponding FINALIZE_COMMIT or ABORT_TXN begins
applying.

On bootstrap, the following rules are applied:
- If we've persisted terminal information (i.e. abort or commit) about a
  transaction ID, we can skip replaying any participant op for that
  transaction.
- If we otherwise have a persisted record for the transaction ID, we
  should start an open transaction and replay all participant ops for
  that transaction.
- If we have no record of a transaction persisted, we must replay all
  participant ops for the given transaction.

While storing things in the superblock is not ideal, there are further
improvements that can be made to reduce its size, e.g. after finalizing
a transaction, the transactions mutations may be merged in with the rest
of the tablet; once all mutations of a given transaction have been
merged, the transaction metadata may be removed from the metadata.

Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
---
M src/kudu/integration-tests/txn_participant-itest.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/ops/participant_op.cc
M src/kudu/tablet/ops/participant_op.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/txn_participant-test.cc
M src/kudu/tablet/txn_participant.cc
M src/kudu/tablet/txn_participant.h
13 files changed, 719 insertions(+), 131 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/92/16492/8
-- 
To view, visit http://gerrit.cloudera.org:8080/16492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
Gerrit-Change-Number: 16492
Gerrit-PatchSet: 8
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] KUDU-2612 p11: persist txn metadata in superblock

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16492

to look at the new patch set (#5).

Change subject: KUDU-2612 p11: persist txn metadata in superblock
......................................................................

KUDU-2612 p11: persist txn metadata in superblock

Currently, transaction metadata is entirely stored in the WALs; once a
transaction is begun, it anchors WALs indefinitely. This is untenable,
as any tablet that participates in a transaction would never be able to
GC its WALs.

This patch addresses this by storing transaction metadata in the tablet
superblock for the following participant ops:
- BEGIN_TXN: an empty record is added to the superblock for the given
  transaction ID. In the future, this can be used to store the owner of
  the transaction.
- FINALIZE_COMMIT: the commit timestamp is added to the superblock for
  the given transaction ID. The commit timestamp of a transaction will
  be used when scanning over mutations applied to a tablet instead of
  the apply timestamp.
- ABORT_TXN: an abort bit is added to the superblock for the given
  transaction ID.

Upon applying each of these ops, the op's OpId is anchored in the WALs
until the tablet metadata is successfully flushed, ensuring that if we
don't flush the tablet metadata before restarting, we can rebuild any
ops that whose mutations had not been persisted by replaying the WALs.

What about BEGIN_COMMIT ops? While these ops will still need to be
anchored, BEGIN_COMMIT ops don't need to persist any long-term
information. As such, their anchoring is slightly different -- rather
than mutating the tablet metadata and unanchoring on a metadata flush,
BEGIN_COMMIT ops will update the in-memory state for the in-flight Txn,
and unanchor once the corresponding FINALIZE_COMMIT or ABORT_TXN begins
applying.

On bootstrap, the following rules are applied:
- If we've persisted terminal information (i.e. abort or commit) about a
  transaction ID, we can skip replaying any participant op for that
  transaction.
- If we otherwise have a persisted record for the transaction ID, we
  should start an open transaction and replay all participant ops for
  that transaction.
- If we have no record of a transaction persisted, we must replay all
  participant ops for the given transaction.

While storing things in the superblock is not ideal, there are further
improvements that can be made to reduce its size, e.g. after finalizing
a transaction, the transactions mutations may be merged in with the rest
of the tablet; once all mutations of a given transaction have been
merged, the transaction metadata may be removed from the metadata.

Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
---
M src/kudu/integration-tests/txn_participant-itest.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/ops/participant_op.cc
M src/kudu/tablet/ops/participant_op.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/txn_participant-test.cc
M src/kudu/tablet/txn_participant.cc
M src/kudu/tablet/txn_participant.h
13 files changed, 678 insertions(+), 121 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/92/16492/5
-- 
To view, visit http://gerrit.cloudera.org:8080/16492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
Gerrit-Change-Number: 16492
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] KUDU-2612 p11: persist txn metadata in superblock

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16492

to look at the new patch set (#7).

Change subject: KUDU-2612 p11: persist txn metadata in superblock
......................................................................

KUDU-2612 p11: persist txn metadata in superblock

Currently, transaction metadata is entirely stored in the WALs; once a
transaction is begun, it anchors WALs indefinitely. This is untenable,
as any tablet that participates in a transaction would never be able to
GC its WALs.

This patch addresses this by storing transaction metadata in the tablet
superblock for the following participant ops:
- BEGIN_TXN: an empty record is added to the superblock for the given
  transaction ID. In the future, this can be used to store the owner of
  the transaction.
- FINALIZE_COMMIT: the commit timestamp is added to the superblock for
  the given transaction ID. The commit timestamp of a transaction will
  be used when scanning over mutations applied to a tablet instead of
  the apply timestamp.
- ABORT_TXN: an abort bit is added to the superblock for the given
  transaction ID.

Upon applying each of these ops, the op's OpId is anchored in the WALs
until the tablet metadata is successfully flushed, ensuring that if we
don't flush the tablet metadata before restarting, we can rebuild any
ops that whose mutations had not been persisted by replaying the WALs.

What about BEGIN_COMMIT ops? While these ops will still need to be
anchored, BEGIN_COMMIT ops don't need to persist any long-term
information. As such, their anchoring is slightly different -- rather
than mutating the tablet metadata and unanchoring on a metadata flush,
BEGIN_COMMIT ops will update the in-memory state for the in-flight Txn,
and unanchor once the corresponding FINALIZE_COMMIT or ABORT_TXN begins
applying.

On bootstrap, the following rules are applied:
- If we've persisted terminal information (i.e. abort or commit) about a
  transaction ID, we can skip replaying any participant op for that
  transaction.
- If we otherwise have a persisted record for the transaction ID, we
  should start an open transaction and replay all participant ops for
  that transaction.
- If we have no record of a transaction persisted, we must replay all
  participant ops for the given transaction.

While storing things in the superblock is not ideal, there are further
improvements that can be made to reduce its size, e.g. after finalizing
a transaction, the transactions mutations may be merged in with the rest
of the tablet; once all mutations of a given transaction have been
merged, the transaction metadata may be removed from the metadata.

Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
---
M src/kudu/integration-tests/txn_participant-itest.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/ops/participant_op.cc
M src/kudu/tablet/ops/participant_op.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/txn_participant-test.cc
M src/kudu/tablet/txn_participant.cc
M src/kudu/tablet/txn_participant.h
13 files changed, 680 insertions(+), 121 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/92/16492/7
-- 
To view, visit http://gerrit.cloudera.org:8080/16492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
Gerrit-Change-Number: 16492
Gerrit-PatchSet: 7
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] KUDU-2612 p11: persist txn metadata in superblock

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/16492 )

Change subject: KUDU-2612 p11: persist txn metadata in superblock
......................................................................


Patch Set 11: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/16492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
Gerrit-Change-Number: 16492
Gerrit-PatchSet: 11
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 07 Oct 2020 04:54:18 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-2612 p11: persist txn metadata in superblock

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has removed a vote on this change.

Change subject: KUDU-2612 p11: persist txn metadata in superblock
......................................................................


Removed Verified-1 by Kudu Jenkins (120)
-- 
To view, visit http://gerrit.cloudera.org:8080/16492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
Gerrit-Change-Number: 16492
Gerrit-PatchSet: 11
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] KUDU-2612 p11: persist txn metadata in superblock

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16492 )

Change subject: KUDU-2612 p11: persist txn metadata in superblock
......................................................................


Patch Set 5:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/16492/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16492/3//COMMIT_MSG@17
PS3, Line 17: this can be used to store the owner of
            :   the transaction
> Not related to this changelist directly, but I'm curios: what's the benefit
We currently use the owner on the TxnStatusManager to ensure that only the owner performs DDL on the transaction. I think it also may make sense to ensure only the owner performs DML on participants, in which case we either need to persist the owner on disk, or fetch it from the TxnStatusManager upon starting up.


http://gerrit.cloudera.org:8080/#/c/16492/3//COMMIT_MSG@19
PS3, Line 19: commit timestamp
> It would be nice if you could added more colors on the intended use of the 
Done


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/metadata.proto
File src/kudu/tablet/metadata.proto:

http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/metadata.proto@167
PS3, Line 167: Map from txn ID to metadata associated with the transaction. This is
             :   // updated on eac
> It would be nice to add more colors: what they key of the map is (transacti
Done


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/ops/participant_op.h
File src/kudu/tablet/ops/participant_op.h:

http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/ops/participant_op.h@76
PS3, Line 76: data.
> Is it worth documenting the role of the 'tablet' parameter here?
Done


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/ops/participant_op.cc
File src/kudu/tablet/ops/participant_op.cc:

http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/ops/participant_op.cc@232
PS3, Line 232:     return;
> nit: any reason not to move this after the 'if (PREDICT_FALSE(result == Op:
If you're referring to the ClearIfInitFailed(), an initialization failure can only happen if we've aborted -- the only state change that can leave us here is if we try to BEGIN_TXN and abort, we'd start with kInitialized and fail to get to kOpen. Added a comment.


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet.h
File src/kudu/tablet/tablet.h:

http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet.h@181
PS3, Line 181: Begins
> nit here and below: it seems the majority of methods here are documented us
I don't think it's worth updating all such instances in such a large file, especially if there isn't a clear winner between tenses. I personally try to stick to describing the method when writing comments about methods, and using imperative tense when writing in-line comments about what certain code snippets are doing.


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet_metadata.h
File src/kudu/tablet/tablet_metadata.h:

http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet_metadata.h@68
PS3, Line 68: RefCounted
> nit: why not to use std::shared_ptr instead of scoped_refptr and use make_s
See https://kudu.apache.org/docs/contributing.html#_pointers

The scoped_refptr stores the counter with the object. The main benefit I see for shared_ptr is the weak_ptr use-case, which I don't expect to have here.


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet_metadata.h@70
PS3, Line 70: bool aborted = false,
            :               boost::optional<int64_t> commit_ts
> Maybe, make these parameters to be 'false' and 'boost::none' by default?
Done


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet_metadata.h@73
PS3, Line 73: std::move
> nit: add std::move() ?
Done


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet_metadata.h@95
PS3, Line 95: ~TxnMetadata() = 
> nit: would the following be more idiomatic?
Done


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet_metadata.h@446
PS3, Line 446: std::vector<std::unique_ptr<log::MinLogIndexAnchorer>>
> Is it possible to switch to std::vector<log::MinLogIndexAnchorer> and use m
Not currently; MinLogIndexAnchorer doesn't allow copies or assignment. Defining a move constructor seems non-trivial.


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet_metadata.cc
File src/kudu/tablet/tablet_metadata.cc:

http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet_metadata.cc@511
PS3, Line 511: txn_metadata_by_txn_id_ = std::move(txn
> nit: why 'swap' is preferred instead of 'move' here?
Replied below.


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet_metadata.cc@624
PS3, Line 624: // Now that we've flushed, we 
> Do you mind adding a comment to explain why it's necessary to explicitly em
Done


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet_metadata.cc@839
PS3, Line 839: }
> nit: IIRC, we tend to uniformly use std::move() in cases like this.  Why 's
I was thinking we might not want to call any destructors under lock, as it adds more work, but it might be a needless optimization. Done


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet_metadata.cc@841
PS3, Line 841: ) {
> ditto
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/16492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
Gerrit-Change-Number: 16492
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 29 Sep 2020 00:12:19 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2612 p11: persist txn metadata in superblock

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16492

to look at the new patch set (#2).

Change subject: KUDU-2612 p11: persist txn metadata in superblock
......................................................................

KUDU-2612 p11: persist txn metadata in superblock

Currently, transaction metadata is entirely stored in the WALs; once a
transaction is begun, it anchors WALs indefinitely. This is untenable,
as any tablet that participates in a transaction would never be able to
GC its WALs.

This patch addresses this by storing transaction metadata in the tablet
superblock for the following participant ops:
- BEGIN_TXN: an empty record is added to the superblock for the given
  transaction ID. In the future, this can be used to store the owner of
  the transaction.
- FINALIZE_COMMIT: the commit timestamp is added to the superblock for
  the given transaction ID.
- ABORT_TXN: an abort bit is added to the superblock for the given
  transaction ID.

Upon applying each of these ops, the op's OpId is anchored in the WALs
until the tablet metadata is successfully flushed, ensuring that if we
don't flush the tablet metadata before restarting, we can rebuild any
ops that whose mutations had not been persisted by replaying the WALs.

What about BEGIN_COMMIT ops? While these ops will still need to be
anchored, BEGIN_COMMIT ops don't need to persist any long-term
information. As such, their anchoring is slightly different -- rather
than mutating the tablet metadata and unanchoring on a metadata flush,
BEGIN_COMMIT ops will update the in-memory state for the in-flight Txn,
and unanchor once the corresponding FINALIZE_COMMIT or ABORT_TXN begins
applying.

On bootstrap, the following rules are applied:
- If we've persisted terminal information (i.e. abort or commit) about a
  transaction ID, we can skip replaying any participant op for that
  transaction.
- If we otherwise have a persisted record for the transaction ID, we
  should start an open transaction and replay all participant ops for
  that transaction.
- If we have no record of a transaction persisted, we must replay all
  participant ops for the given transaction.

While storing things in the superblock is not ideal, there are further
improvements that can be made to reduce its size, e.g. after finalizing
a transaction, the transactions mutations may be merged in with the rest
of the tablet; once all mutations of a given transaction have been
merged, the transaction metadata may be removed from the metadata.

Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
---
M src/kudu/integration-tests/txn_participant-itest.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/ops/participant_op.cc
M src/kudu/tablet/ops/participant_op.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/txn_participant-test.cc
M src/kudu/tablet/txn_participant.cc
M src/kudu/tablet/txn_participant.h
13 files changed, 676 insertions(+), 119 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/92/16492/2
-- 
To view, visit http://gerrit.cloudera.org:8080/16492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
Gerrit-Change-Number: 16492
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] KUDU-2612 p11: persist txn metadata in superblock

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/16492 )

Change subject: KUDU-2612 p11: persist txn metadata in superblock
......................................................................


Patch Set 3:

(15 comments)

I did a first quick pass.  So far looks reasonable, but I'm not sure I have a clear big picture here, so my comments are just some nits.

http://gerrit.cloudera.org:8080/#/c/16492/3//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/16492/3//COMMIT_MSG@17
PS3, Line 17: this can be used to store the owner of
            :   the transaction
Not related to this changelist directly, but I'm curios: what's the benefit of storing the owner of a transaction in the superblock?  As of now, we have that information stored in the transaction status table and we use it to restrict access to transaction management operations.  Where we would use the information on the owner stored in the superblock?


http://gerrit.cloudera.org:8080/#/c/16492/3//COMMIT_MSG@19
PS3, Line 19: commit timestamp
It would be nice if you could added more colors on the intended use of the txn commit timestamp stored in the superblock.


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/metadata.proto
File src/kudu/tablet/metadata.proto:

http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/metadata.proto@167
PS3, Line 167: Metadata associated with each transaction that this tablet is a
             :   // participant in
It would be nice to add more colors: what they key of the map is (transaction id?) and how often this field is updated to reflect the changes in runtime?


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/ops/participant_op.h
File src/kudu/tablet/ops/participant_op.h:

http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/ops/participant_op.h@76
PS3, Line 76: tablet
Is it worth documenting the role of the 'tablet' parameter here?


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/ops/participant_op.cc
File src/kudu/tablet/ops/participant_op.cc:

http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/ops/participant_op.cc@232
PS3, Line 232:     txn_participant->ClearIfInitFailed(txn_id);
nit: any reason not to move this after the 'if (PREDICT_FALSE(result == Op::ABORTED))' clause?


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet.h
File src/kudu/tablet/tablet.h:

http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet.h@181
PS3, Line 181: Begins
nit here and below: it seems the majority of methods here are documented using a different tense, e.g. here it would be 'Begin the transactions, ...'.  Maybe, it's worth keeping it consistent across this file?


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet_metadata.h
File src/kudu/tablet/tablet_metadata.h:

http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet_metadata.h@68
PS3, Line 68: RefCounted
nit: why not to use std::shared_ptr instead of scoped_refptr and use make_shared to create related instances?  Why scoped_refptr is better?


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet_metadata.h@70
PS3, Line 70: bool aborted,
            :               boost::optional<int64_t> commit_ts
Maybe, make these parameters to be 'false' and 'boost::none' by default?


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet_metadata.h@73
PS3, Line 73: commit_ts
nit: add std::move() ?


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet_metadata.h@95
PS3, Line 95: ~TxnMetadata() {}
nit: would the following be more idiomatic?

  ~TxnMetadata() = default;


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet_metadata.h@446
PS3, Line 446: std::vector<std::unique_ptr<log::MinLogIndexAnchorer>>
Is it possible to switch to std::vector<log::MinLogIndexAnchorer> and use move semantics for passing parameters instead using std::unique_ptr?


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet_metadata.cc
File src/kudu/tablet/tablet_metadata.cc:

http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet_metadata.cc@511
PS3, Line 511: txn_metadata_by_txn_id_.swap(txn_metas)
nit: why 'swap' is preferred instead of 'move' here?


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet_metadata.cc@624
PS3, Line 624: anchors_needing_flush.clear();
Do you mind adding a comment to explain why it's necessary to explicitly empty the container?


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet_metadata.cc@839
PS3, Line 839: in_flight_txn_ids->swap(in_flights);
nit: IIRC, we tend to uniformly use std::move() in cases like this.  Why 'swap' is better for here?


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet_metadata.cc@841
PS3, Line 841: swap
ditto



-- 
To view, visit http://gerrit.cloudera.org:8080/16492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
Gerrit-Change-Number: 16492
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Thu, 24 Sep 2020 22:14:52 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2612 p11: persist txn metadata in superblock

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16492

to look at the new patch set (#11).

Change subject: KUDU-2612 p11: persist txn metadata in superblock
......................................................................

KUDU-2612 p11: persist txn metadata in superblock

Currently, transaction metadata is entirely stored in the WALs; once a
transaction is begun, it anchors WALs indefinitely. This is untenable,
as any tablet that participates in a transaction would never be able to
GC its WALs.

This patch addresses this by storing transaction metadata in the tablet
superblock for the following participant ops:
- BEGIN_TXN: an empty record is added to the superblock for the given
  transaction ID. In the future, this can be used to store the owner of
  the transaction.
- FINALIZE_COMMIT: the commit timestamp is added to the superblock for
  the given transaction ID. The commit timestamp of a transaction will
  be used when scanning over mutations applied to a tablet instead of
  the apply timestamp.
- ABORT_TXN: an abort bit is added to the superblock for the given
  transaction ID.

Upon applying each of these ops, the op's OpId is anchored in the WALs
until the tablet metadata is successfully flushed, ensuring that if we
don't flush the tablet metadata before restarting, we can rebuild any
ops that whose mutations had not been persisted by replaying the WALs.

What about BEGIN_COMMIT ops? While these ops will still need to be
anchored, BEGIN_COMMIT ops don't need to persist any long-term
information. As such, their anchoring is slightly different -- rather
than mutating the tablet metadata and unanchoring on a metadata flush,
BEGIN_COMMIT ops will update the in-memory state for the in-flight Txn,
and unanchor once the corresponding FINALIZE_COMMIT or ABORT_TXN begins
applying.

On bootstrap, the following rules are applied:
- If we've persisted terminal information (i.e. abort or commit) about a
  transaction ID, we can skip replaying any participant op for that
  transaction.
- If we otherwise have a persisted record for the transaction ID, we
  should start an open transaction and replay all participant ops for
  that transaction.
- If we have no record of a transaction persisted, we must replay all
  participant ops for the given transaction.

While storing things in the superblock is not ideal, there are further
improvements that can be made to reduce its size, e.g. after finalizing
a transaction, the transactions mutations may be merged in with the rest
of the tablet; once all mutations of a given transaction have been
merged, the transaction metadata may be removed from the metadata.

Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
---
M src/kudu/integration-tests/txn_participant-itest.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/ops/participant_op.cc
M src/kudu/tablet/ops/participant_op.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/txn_participant-test.cc
M src/kudu/tablet/txn_participant.cc
M src/kudu/tablet/txn_participant.h
13 files changed, 739 insertions(+), 133 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/92/16492/11
-- 
To view, visit http://gerrit.cloudera.org:8080/16492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
Gerrit-Change-Number: 16492
Gerrit-PatchSet: 11
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] KUDU-2612 p11: persist txn metadata in superblock

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has removed a vote on this change.

Change subject: KUDU-2612 p11: persist txn metadata in superblock
......................................................................


Removed Verified-1 by Kudu Jenkins (120)
-- 
To view, visit http://gerrit.cloudera.org:8080/16492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
Gerrit-Change-Number: 16492
Gerrit-PatchSet: 8
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] KUDU-2612 p11: persist txn metadata in superblock

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16492

to look at the new patch set (#10).

Change subject: KUDU-2612 p11: persist txn metadata in superblock
......................................................................

KUDU-2612 p11: persist txn metadata in superblock

Currently, transaction metadata is entirely stored in the WALs; once a
transaction is begun, it anchors WALs indefinitely. This is untenable,
as any tablet that participates in a transaction would never be able to
GC its WALs.

This patch addresses this by storing transaction metadata in the tablet
superblock for the following participant ops:
- BEGIN_TXN: an empty record is added to the superblock for the given
  transaction ID. In the future, this can be used to store the owner of
  the transaction.
- FINALIZE_COMMIT: the commit timestamp is added to the superblock for
  the given transaction ID. The commit timestamp of a transaction will
  be used when scanning over mutations applied to a tablet instead of
  the apply timestamp.
- ABORT_TXN: an abort bit is added to the superblock for the given
  transaction ID.

Upon applying each of these ops, the op's OpId is anchored in the WALs
until the tablet metadata is successfully flushed, ensuring that if we
don't flush the tablet metadata before restarting, we can rebuild any
ops that whose mutations had not been persisted by replaying the WALs.

What about BEGIN_COMMIT ops? While these ops will still need to be
anchored, BEGIN_COMMIT ops don't need to persist any long-term
information. As such, their anchoring is slightly different -- rather
than mutating the tablet metadata and unanchoring on a metadata flush,
BEGIN_COMMIT ops will update the in-memory state for the in-flight Txn,
and unanchor once the corresponding FINALIZE_COMMIT or ABORT_TXN begins
applying.

On bootstrap, the following rules are applied:
- If we've persisted terminal information (i.e. abort or commit) about a
  transaction ID, we can skip replaying any participant op for that
  transaction.
- If we otherwise have a persisted record for the transaction ID, we
  should start an open transaction and replay all participant ops for
  that transaction.
- If we have no record of a transaction persisted, we must replay all
  participant ops for the given transaction.

While storing things in the superblock is not ideal, there are further
improvements that can be made to reduce its size, e.g. after finalizing
a transaction, the transactions mutations may be merged in with the rest
of the tablet; once all mutations of a given transaction have been
merged, the transaction metadata may be removed from the metadata.

Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
---
M src/kudu/integration-tests/txn_participant-itest.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/ops/participant_op.cc
M src/kudu/tablet/ops/participant_op.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/txn_participant-test.cc
M src/kudu/tablet/txn_participant.cc
M src/kudu/tablet/txn_participant.h
13 files changed, 728 insertions(+), 131 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/92/16492/10
-- 
To view, visit http://gerrit.cloudera.org:8080/16492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
Gerrit-Change-Number: 16492
Gerrit-PatchSet: 10
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] KUDU-2612 p11: persist txn metadata in superblock

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/16492 )

Change subject: KUDU-2612 p11: persist txn metadata in superblock
......................................................................


Patch Set 8: Code-Review+2

(1 comment)

Looks good to me.

http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/tablet/tablet.h
File src/kudu/tablet/tablet.h:

http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/tablet/tablet.h@122
PS7, Line 122: std::unordered_set<int64_t>
> Unfortunately not, it seems. I tried that in an earlier revision and the AS
I see: probably this is something specific to older gcc's version.  Thank you for clarifying on this.



-- 
To view, visit http://gerrit.cloudera.org:8080/16492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
Gerrit-Change-Number: 16492
Gerrit-PatchSet: 8
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Tue, 06 Oct 2020 04:19:05 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2612 p11: persist txn metadata in superblock

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16492

to look at the new patch set (#4).

Change subject: KUDU-2612 p11: persist txn metadata in superblock
......................................................................

KUDU-2612 p11: persist txn metadata in superblock

Currently, transaction metadata is entirely stored in the WALs; once a
transaction is begun, it anchors WALs indefinitely. This is untenable,
as any tablet that participates in a transaction would never be able to
GC its WALs.

This patch addresses this by storing transaction metadata in the tablet
superblock for the following participant ops:
- BEGIN_TXN: an empty record is added to the superblock for the given
  transaction ID. In the future, this can be used to store the owner of
  the transaction.
- FINALIZE_COMMIT: the commit timestamp is added to the superblock for
  the given transaction ID. The commit timestamp of a transaction will
  be used when scanning over mutations applied to a tablet instead of
  the apply timestamp.
- ABORT_TXN: an abort bit is added to the superblock for the given
  transaction ID.

Upon applying each of these ops, the op's OpId is anchored in the WALs
until the tablet metadata is successfully flushed, ensuring that if we
don't flush the tablet metadata before restarting, we can rebuild any
ops that whose mutations had not been persisted by replaying the WALs.

What about BEGIN_COMMIT ops? While these ops will still need to be
anchored, BEGIN_COMMIT ops don't need to persist any long-term
information. As such, their anchoring is slightly different -- rather
than mutating the tablet metadata and unanchoring on a metadata flush,
BEGIN_COMMIT ops will update the in-memory state for the in-flight Txn,
and unanchor once the corresponding FINALIZE_COMMIT or ABORT_TXN begins
applying.

On bootstrap, the following rules are applied:
- If we've persisted terminal information (i.e. abort or commit) about a
  transaction ID, we can skip replaying any participant op for that
  transaction.
- If we otherwise have a persisted record for the transaction ID, we
  should start an open transaction and replay all participant ops for
  that transaction.
- If we have no record of a transaction persisted, we must replay all
  participant ops for the given transaction.

While storing things in the superblock is not ideal, there are further
improvements that can be made to reduce its size, e.g. after finalizing
a transaction, the transactions mutations may be merged in with the rest
of the tablet; once all mutations of a given transaction have been
merged, the transaction metadata may be removed from the metadata.

Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
---
M src/kudu/integration-tests/txn_participant-itest.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/ops/participant_op.cc
M src/kudu/tablet/ops/participant_op.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/txn_participant-test.cc
M src/kudu/tablet/txn_participant.cc
M src/kudu/tablet/txn_participant.h
13 files changed, 684 insertions(+), 121 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/92/16492/4
-- 
To view, visit http://gerrit.cloudera.org:8080/16492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
Gerrit-Change-Number: 16492
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] KUDU-2612 p11: persist txn metadata in superblock

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16492

to look at the new patch set (#3).

Change subject: KUDU-2612 p11: persist txn metadata in superblock
......................................................................

KUDU-2612 p11: persist txn metadata in superblock

Currently, transaction metadata is entirely stored in the WALs; once a
transaction is begun, it anchors WALs indefinitely. This is untenable,
as any tablet that participates in a transaction would never be able to
GC its WALs.

This patch addresses this by storing transaction metadata in the tablet
superblock for the following participant ops:
- BEGIN_TXN: an empty record is added to the superblock for the given
  transaction ID. In the future, this can be used to store the owner of
  the transaction.
- FINALIZE_COMMIT: the commit timestamp is added to the superblock for
  the given transaction ID.
- ABORT_TXN: an abort bit is added to the superblock for the given
  transaction ID.

Upon applying each of these ops, the op's OpId is anchored in the WALs
until the tablet metadata is successfully flushed, ensuring that if we
don't flush the tablet metadata before restarting, we can rebuild any
ops that whose mutations had not been persisted by replaying the WALs.

What about BEGIN_COMMIT ops? While these ops will still need to be
anchored, BEGIN_COMMIT ops don't need to persist any long-term
information. As such, their anchoring is slightly different -- rather
than mutating the tablet metadata and unanchoring on a metadata flush,
BEGIN_COMMIT ops will update the in-memory state for the in-flight Txn,
and unanchor once the corresponding FINALIZE_COMMIT or ABORT_TXN begins
applying.

On bootstrap, the following rules are applied:
- If we've persisted terminal information (i.e. abort or commit) about a
  transaction ID, we can skip replaying any participant op for that
  transaction.
- If we otherwise have a persisted record for the transaction ID, we
  should start an open transaction and replay all participant ops for
  that transaction.
- If we have no record of a transaction persisted, we must replay all
  participant ops for the given transaction.

While storing things in the superblock is not ideal, there are further
improvements that can be made to reduce its size, e.g. after finalizing
a transaction, the transactions mutations may be merged in with the rest
of the tablet; once all mutations of a given transaction have been
merged, the transaction metadata may be removed from the metadata.

Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
---
M src/kudu/integration-tests/txn_participant-itest.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/ops/participant_op.cc
M src/kudu/tablet/ops/participant_op.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/txn_participant-test.cc
M src/kudu/tablet/txn_participant.cc
M src/kudu/tablet/txn_participant.h
13 files changed, 676 insertions(+), 119 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/92/16492/3
-- 
To view, visit http://gerrit.cloudera.org:8080/16492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
Gerrit-Change-Number: 16492
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] KUDU-2612 p11: persist txn metadata in superblock

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16492 )

Change subject: KUDU-2612 p11: persist txn metadata in superblock
......................................................................


Patch Set 8: Verified+1

Failures seem to be related to the cluster ID races.


-- 
To view, visit http://gerrit.cloudera.org:8080/16492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
Gerrit-Change-Number: 16492
Gerrit-PatchSet: 8
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Thu, 01 Oct 2020 23:39:01 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-2612 p11: persist txn metadata in superblock

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/16492 )

Change subject: KUDU-2612 p11: persist txn metadata in superblock
......................................................................


Patch Set 8:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/integration-tests/txn_participant-itest.cc
File src/kudu/integration-tests/txn_participant-itest.cc:

http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/integration-tests/txn_participant-itest.cc@231
PS7, Line 231:       vector<Status> statuses = RunOnReplicas(followers, txn_id, op);
             :       for (int i = 0; i < statuses.size(); i++) {
             :         const auto& s = statuses[i
> nit: it would be nice to document the reasoning behind the customization of
Done


http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/integration-tests/txn_participant-itest.cc@234
PS7, Line 234:         ASSERT_TRUE(s.IsIllegalState()) << s.ToString();
> nit: should we also set FLAGS_maintenance_manager_polling_interval_ms to so
Done


http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/tablet/ops/participant_op.cc
File src/kudu/tablet/ops/participant_op.cc:

http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/tablet/ops/participant_op.cc@181
PS7, Line 181: commit_timestamp());
> nit: does it make sense to add
Done


http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/tablet/tablet.h
File src/kudu/tablet/tablet.h:

http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/tablet/tablet.h@122
PS7, Line 122: std::unordered_set<int64_t>
> nit: could we get away with just
Unfortunately not, it seems. I tried that in an earlier revision and the ASAN build failed to build because of it:

 /home/jenkins-slave/workspace/kudu-master/2/src/kudu/tablet/tablet.h:122:50: error: chosen constructor is explicit in copy-initialization
 22:25:20   Status Open(const std::unordered_set<int64_t>& in_flight_txn_ids = {});
 22:25:20                                                  ^                   ~~
 22:25:20 /usr/lib/gcc/x86_64-linux-gnu/4.8/../../../../include/c++/4.8/bits/unordered_set.h:132:7: note: explicit constructor declared here
 22:25:20       unordered_set(size_type __n = 10,
 22:25:20       ^
 22:25:20 /home/jenkins-slave/workspace/kudu-master/2/src/kudu/tablet/tablet.h:122:50: note: passing argument to parameter 
 'in_flight_txn_ids' here
 22:25:20   Status Open(const std::unordered_set<int64_t>& in_flight_txn_ids = {});


http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet.h
File src/kudu/tablet/tablet.h:

http://gerrit.cloudera.org:8080/#/c/16492/3/src/kudu/tablet/tablet.h@181
PS3, Line 181: Begins
> Yep, my point was to keep the tense as with the majority of comments in the
Yeah, I do try to be consistent when writing new code, and if there were only a fewer updates I might've gone through with it. Perhaps in a separate patch.


http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/tablet/txn_participant-test.cc
File src/kudu/tablet/txn_participant-test.cc:

http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/tablet/txn_participant-test.cc@448
PS7, Line 448:   int current_key = 0;
> nit: do you mind adding an assert just before performing any txn-related pa
Done


http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/tablet/txn_participant-test.cc@462
PS7, Line 462: // op, and WALs should be 
> Not sure whether whether I'm missing it, but I cannot see the call to table
Done


http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/tablet/txn_participant-test.cc@468
PS7, Line 468: ASSERT_OK(tablet_replica_-
> Not sure I see how 'gcable_size' could be updated before this assertion.  A
Nope, you didn't miss it. My mistake for not adding them.


http://gerrit.cloudera.org:8080/#/c/16492/7/src/kudu/tablet/txn_participant-test.cc@506
PS7, Line 506: SSERT_OK(tablet_replica_->GetGCableDataSize(&gcable_size));
> An incomplete sentence?
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/16492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
Gerrit-Change-Number: 16492
Gerrit-PatchSet: 8
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Thu, 01 Oct 2020 05:22:16 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2612 p11: persist txn metadata in superblock

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/16492

to look at the new patch set (#6).

Change subject: KUDU-2612 p11: persist txn metadata in superblock
......................................................................

KUDU-2612 p11: persist txn metadata in superblock

Currently, transaction metadata is entirely stored in the WALs; once a
transaction is begun, it anchors WALs indefinitely. This is untenable,
as any tablet that participates in a transaction would never be able to
GC its WALs.

This patch addresses this by storing transaction metadata in the tablet
superblock for the following participant ops:
- BEGIN_TXN: an empty record is added to the superblock for the given
  transaction ID. In the future, this can be used to store the owner of
  the transaction.
- FINALIZE_COMMIT: the commit timestamp is added to the superblock for
  the given transaction ID. The commit timestamp of a transaction will
  be used when scanning over mutations applied to a tablet instead of
  the apply timestamp.
- ABORT_TXN: an abort bit is added to the superblock for the given
  transaction ID.

Upon applying each of these ops, the op's OpId is anchored in the WALs
until the tablet metadata is successfully flushed, ensuring that if we
don't flush the tablet metadata before restarting, we can rebuild any
ops that whose mutations had not been persisted by replaying the WALs.

What about BEGIN_COMMIT ops? While these ops will still need to be
anchored, BEGIN_COMMIT ops don't need to persist any long-term
information. As such, their anchoring is slightly different -- rather
than mutating the tablet metadata and unanchoring on a metadata flush,
BEGIN_COMMIT ops will update the in-memory state for the in-flight Txn,
and unanchor once the corresponding FINALIZE_COMMIT or ABORT_TXN begins
applying.

On bootstrap, the following rules are applied:
- If we've persisted terminal information (i.e. abort or commit) about a
  transaction ID, we can skip replaying any participant op for that
  transaction.
- If we otherwise have a persisted record for the transaction ID, we
  should start an open transaction and replay all participant ops for
  that transaction.
- If we have no record of a transaction persisted, we must replay all
  participant ops for the given transaction.

While storing things in the superblock is not ideal, there are further
improvements that can be made to reduce its size, e.g. after finalizing
a transaction, the transactions mutations may be merged in with the rest
of the tablet; once all mutations of a given transaction have been
merged, the transaction metadata may be removed from the metadata.

Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
---
M src/kudu/integration-tests/txn_participant-itest.cc
M src/kudu/tablet/metadata.proto
M src/kudu/tablet/ops/participant_op.cc
M src/kudu/tablet/ops/participant_op.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tablet/tablet_metadata-test.cc
M src/kudu/tablet/tablet_metadata.cc
M src/kudu/tablet/tablet_metadata.h
M src/kudu/tablet/txn_participant-test.cc
M src/kudu/tablet/txn_participant.cc
M src/kudu/tablet/txn_participant.h
13 files changed, 678 insertions(+), 121 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/92/16492/6
-- 
To view, visit http://gerrit.cloudera.org:8080/16492
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2f32808aef10484f4e0ad3942bb005f61fbdb34a
Gerrit-Change-Number: 16492
Gerrit-PatchSet: 6
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins (120)