You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Alexey Serbin (Code Review)" <ge...@cloudera.org> on 2021/05/12 19:59:43 UTC

[kudu-CR] WIP KUDU-2612 automatically flush sessions on txn commit

Alexey Serbin has uploaded this change for review. ( http://gerrit.cloudera.org:8080/17431


Change subject: WIP KUDU-2612 automatically flush sessions on txn commit
......................................................................

WIP KUDU-2612 automatically flush sessions on txn commit

With this patch, all still-in-the-scope transactional sessions
are automatically flushed upon call of the KuduTransaction::Commit()
method for the corresponding transaction handle.

As for the KuduTransaction::StartCommit() method, it's now necessary
to flush all the transactional sessions created off the transaction
handle before calling the method, otherwise an Status::IllegalState()
would be returned.

As Andrew and I discussed offline, it might be an option to return
an error from KuduSession::Apply() for a transactional session whose
transaction has already started committing.  However, after looking
at this closer, I realized that would be tricky and would add
more complexity into a hot path of pushing write operations into a
session.  So, I opted not to perform that consistency check, relying
on KuduSession::Flush() and KuduSession::FlushAsync() reporting
on an error later on when accumulated write operations are being
submitted to the server side.

This patch also contains test for the newly introduced functionality.

WIP:
  * Add in-line documentation about the related behavior of a
    transactional KuduSession for KuduTransaction::Commit() and
    KuduTransaction::StartCommit()
  * Is it OK to keep just a weak pointer to a session in the
    transaction? Similar functionality might be trickier in Java
    client.  Overall, do we expect any issues with keeping shared_ptr,
    not weak_ptr for a session for a transaction?
  * Is it OK to accept an write operation in a transactional session
    (i.e. Apply() doesn't return an error), but return an error only
    later on in an attemp to flush a session?

Change-Id: I2480129a99fb19d16868e14f9b9e33c83e3d8e7f
---
M src/kudu/client/client-test.cc
M src/kudu/client/transaction-internal.cc
M src/kudu/client/transaction-internal.h
3 files changed, 634 insertions(+), 15 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/31/17431/1
-- 
To view, visit http://gerrit.cloudera.org:8080/17431
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I2480129a99fb19d16868e14f9b9e33c83e3d8e7f
Gerrit-Change-Number: 17431
Gerrit-PatchSet: 1
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>

[kudu-CR] KUDU-2612 automatically flush sessions on txn commit

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/17431 )

Change subject: KUDU-2612 automatically flush sessions on txn commit
......................................................................

KUDU-2612 automatically flush sessions on txn commit

With this patch, all transactional sessions created off a transction
handle are automatically flushed upon calling Commit() on the handle.

As for the KuduTransaction::StartCommit() method, it's now necessary
to flush all the transactional sessions created off the transaction
handle before calling the method, otherwise Status::IllegalState()
would be returned.

As Andrew and I discussed offline, it might be an option to return an
error from KuduSession::Apply() for a transactional session whose
transaction has already started committing.  However, after looking at
this closer, I realized that it would require either an atomic or an
extra synchronization primitive, bringing more complexity into the hot
path of applying write operations in the context of a session.  So,
I opted not to perform the consistency check as a part of the
KuduSession::Apply() method, rather relying on the logic of
KuduSession::Flush() and KuduSession::FlushAsync() methods instead.

Another design detail worth pointing at is that a KuduTransaction handle
keeps shared, not weak pointers to transaction sessions originated off
the handle (I did several back-and-forth iterations on this, though).
Even if using shared_ptr, not weak_ptr, no circular dependencies are
introduced since a transactional session doesn't keep a reference
to the corresponding transactional handle.  The shared_ptr-based
approach looks better than one with weak_ptr because
  (1) It might prevent a data loss due to a mistake in an application
      code, and it takes time to find and fix those.
  (2) It looks more portable and consistent if thinking about similar
      functionality to implement in the Java client.

This patch also contains several test scenarios for the newly introduced
functionality.

Change-Id: I2480129a99fb19d16868e14f9b9e33c83e3d8e7f
Reviewed-on: http://gerrit.cloudera.org:8080/17431
Tested-by: Alexey Serbin <as...@cloudera.com>
Reviewed-by: Andrew Wong <aw...@cloudera.com>
---
M src/kudu/client/client-test.cc
M src/kudu/client/client.h
M src/kudu/client/transaction-internal.cc
M src/kudu/client/transaction-internal.h
M src/kudu/integration-tests/txn_write_ops-itest.cc
5 files changed, 657 insertions(+), 61 deletions(-)

Approvals:
  Alexey Serbin: Verified
  Andrew Wong: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/17431
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I2480129a99fb19d16868e14f9b9e33c83e3d8e7f
Gerrit-Change-Number: 17431
Gerrit-PatchSet: 4
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] KUDU-2612 automatically flush sessions on txn commit

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/17431 )

Change subject: KUDU-2612 automatically flush sessions on txn commit
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17431/3/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

http://gerrit.cloudera.org:8080/#/c/17431/3/src/kudu/client/client-test.cc@7495
PS3, Line 7495: ,
> Ugh I glanced over the comment too quickly, and read it as Foo.Bar(), not {
np, thank you for the review



-- 
To view, visit http://gerrit.cloudera.org:8080/17431
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2480129a99fb19d16868e14f9b9e33c83e3d8e7f
Gerrit-Change-Number: 17431
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 14 May 2021 20:31:12 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2612 automatically flush sessions on txn commit

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17431 )

Change subject: KUDU-2612 automatically flush sessions on txn commit
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17431/3/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

http://gerrit.cloudera.org:8080/#/c/17431/3/src/kudu/client/client-test.cc@7495
PS3, Line 7495: ,
> Why period?  It's supposed to be a comma: that's the list of methods in the
Ugh I glanced over the comment too quickly, and read it as Foo.Bar(), not {Foo,Bar}(). Sorry for the noise.



-- 
To view, visit http://gerrit.cloudera.org:8080/17431
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2480129a99fb19d16868e14f9b9e33c83e3d8e7f
Gerrit-Change-Number: 17431
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 14 May 2021 20:26:46 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2612 automatically flush sessions on txn commit

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17431 )

Change subject: KUDU-2612 automatically flush sessions on txn commit
......................................................................


Patch Set 2:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/17431/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17431/2//COMMIT_MSG@22
PS2, Line 22: So, I opted
            : not to perform the consistency check as a part of KuduSession::Apply(),
            : rather relying on the logic of KuduSession::Flush() and
            : KuduSession::FlushAsync() methods instead.
> This means that the check performed in StartCommit() is best-effort, i.e. i
This is true even for Commit() (synchronous), right? Once calling commit, users can still apply more ops to each session?

I don't think it's worth being more strict here -- if this pops up as an issue with real users, they can complain and we can address it. At the very least, any post-commit flushes will already result in op errors, and I think that is reasonable behavior.


http://gerrit.cloudera.org:8080/#/c/17431/2/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

http://gerrit.cloudera.org:8080/#/c/17431/2/src/kudu/client/client-test.cc@7433
PS2, Line 7433: ASSERT_OK(session->SetFlushMode(KuduSession::MANUAL_FLUSH));
I do find it a bit odd for us to be implicitly flushing on Commit() even if a user has specified MANUAL_FLUSH mode. My impression of this mode is that the user has requested full control of all flushing matters. Given FlushMode() isn't constant, how would you feel about doing a best effort check for MANUAL_FLUSH mode, and calling some KuduSession::FlushForCommit() that takes into account the flush mode, only performing a flush in one of the AUTO modes?


http://gerrit.cloudera.org:8080/#/c/17431/2/src/kudu/client/client.h
File src/kudu/client/client.h:

http://gerrit.cloudera.org:8080/#/c/17431/2/src/kudu/client/client.h@385
PS2, Line 385: awaits for the
nit: either "awaits the" or "waits for the". Same below.


http://gerrit.cloudera.org:8080/#/c/17431/2/src/kudu/client/transaction-internal.cc
File src/kudu/client/transaction-internal.cc:

http://gerrit.cloudera.org:8080/#/c/17431/2/src/kudu/client/transaction-internal.cc@198
PS2, Line 198: // It's a rare case, but nothing prevents to call session.reset() or
             :         // session.swap(nullptr) for a transactional session in the user code.
Doesn't this just replace the pointer in the scoped_refptr instance with a nullptr, and not affect other instances of KuduSession*? I.e. if a user-code caller were to call my_session.swap(nullptr), wouldn't that only affect that instance of the scoped_refptr `my_session` and not other references to the same underlying session?



-- 
To view, visit http://gerrit.cloudera.org:8080/17431
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2480129a99fb19d16868e14f9b9e33c83e3d8e7f
Gerrit-Change-Number: 17431
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 14 May 2021 01:07:14 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2612 automatically flush sessions on txn commit

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has removed a vote on this change.

Change subject: KUDU-2612 automatically flush sessions on txn commit
......................................................................


Removed Verified-1 by Kudu Jenkins (120)
-- 
To view, visit http://gerrit.cloudera.org:8080/17431
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteVote
Gerrit-Change-Id: I2480129a99fb19d16868e14f9b9e33c83e3d8e7f
Gerrit-Change-Number: 17431
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] KUDU-2612 automatically flush sessions on txn commit

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Andrew Wong, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17431

to look at the new patch set (#3).

Change subject: KUDU-2612 automatically flush sessions on txn commit
......................................................................

KUDU-2612 automatically flush sessions on txn commit

With this patch, all transactional sessions created off a transction
handle are automatically flushed upon calling Commit() on the handle.

As for the KuduTransaction::StartCommit() method, it's now necessary
to flush all the transactional sessions created off the transaction
handle before calling the method, otherwise Status::IllegalState()
would be returned.

As Andrew and I discussed offline, it might be an option to return an
error from KuduSession::Apply() for a transactional session whose
transaction has already started committing.  However, after looking at
this closer, I realized that it would require either an atomic or an
extra synchronization primitive, bringing more complexity into the hot
path of applying write operations in the context of a session.  So,
I opted not to perform the consistency check as a part of the
KuduSession::Apply() method, rather relying on the logic of
KuduSession::Flush() and KuduSession::FlushAsync() methods instead.

Another design detail worth pointing at is that a KuduTransaction handle
keeps shared, not weak pointers to transaction sessions originated off
the handle (I did several back-and-forth iterations on this, though).
Even if using shared_ptr, not weak_ptr, no circular dependencies are
introduced since a transactional session doesn't keep a reference
to the corresponding transactional handle.  The shared_ptr-based
approach looks better than one with weak_ptr because
  (1) It might prevent a data loss due to a mistake in an application
      code, and it takes time to find and fix those.
  (2) It looks more portable and consistent if thinking about similar
      functionality to implement in the Java client.

This patch also contains several test scenarios for the newly introduced
functionality.

Change-Id: I2480129a99fb19d16868e14f9b9e33c83e3d8e7f
---
M src/kudu/client/client-test.cc
M src/kudu/client/client.h
M src/kudu/client/transaction-internal.cc
M src/kudu/client/transaction-internal.h
M src/kudu/integration-tests/txn_write_ops-itest.cc
5 files changed, 657 insertions(+), 61 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/31/17431/3
-- 
To view, visit http://gerrit.cloudera.org:8080/17431
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2480129a99fb19d16868e14f9b9e33c83e3d8e7f
Gerrit-Change-Number: 17431
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] KUDU-2612 automatically flush sessions on txn commit

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/17431 )

Change subject: KUDU-2612 automatically flush sessions on txn commit
......................................................................


Patch Set 2:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/17431/2/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

http://gerrit.cloudera.org:8080/#/c/17431/2/src/kudu/client/client-test.cc@7433
PS2, Line 7433: ASSERT_OK(session->SetFlushMode(KuduSession::MANUAL_FLUSH));
> Good point -- I think I've come around to keeping it as is. My initial impr
SGTM


http://gerrit.cloudera.org:8080/#/c/17431/2/src/kudu/client/client.h
File src/kudu/client/client.h:

http://gerrit.cloudera.org:8080/#/c/17431/2/src/kudu/client/client.h@385
PS2, Line 385: awaits for the
> nit: either "awaits the" or "waits for the". Same below.
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/17431
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2480129a99fb19d16868e14f9b9e33c83e3d8e7f
Gerrit-Change-Number: 17431
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 14 May 2021 05:12:49 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2612 automatically flush sessions on txn commit

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/17431 )

Change subject: KUDU-2612 automatically flush sessions on txn commit
......................................................................


Patch Set 2:

(3 comments)

Thank you for the feedback!

http://gerrit.cloudera.org:8080/#/c/17431/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17431/2//COMMIT_MSG@22
PS2, Line 22: So, I opted
            : not to perform the consistency check as a part of KuduSession::Apply(),
            : rather relying on the logic of KuduSession::Flush() and
            : KuduSession::FlushAsync() methods instead.
> This is true even for Commit() (synchronous), right? Once calling commit, u
Right -- it doesn't matter whether it's StartCommit() or Commit(), the point is that any concurrent KuduSession.Apply() might bet write operations left aside.  Yes, in other than AUTO_FLUSH_SYNC modes users of the API can still call KuduSession.Apply() after commit and not get error, and the error will be reported only upon KuduSession.Flush() called implicitly or explicitly.

As I mentioned, we can address that by adding an atomic boolean into a session (and that was so at some point in one of local revisions of this patch), but I'm not sure we want to protect from that.  I think our effort here is to protect against programmatic errors, rather than from very intricate made-up use cases.


http://gerrit.cloudera.org:8080/#/c/17431/2/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

http://gerrit.cloudera.org:8080/#/c/17431/2/src/kudu/client/client-test.cc@7433
PS2, Line 7433: ASSERT_OK(session->SetFlushMode(KuduSession::MANUAL_FLUSH));
> I do find it a bit odd for us to be implicitly flushing on Commit() even if
That looks like a viable option, but why would anybody add rows into a buffer via KuduSession.Apply() just to have those rows dropped on the floor later if they don't call KuduSession.Flush() before calling KuduTransaction.Commit()?  I guess in that sense KuduTransction.Commit() implies flushing all existing transaction session, and we just call Flush() to make life easier for the users of Kudu client API.  It would be great to do so also for StartCommit(), but the implementation has many more quirks there, that's why we opted just returning an error to warn people about possible data loss as early as possible.

Also, since the flush mode can be dynamically updated, that makes it a bit trickier to judge when to sample the mode of the session for the sort of verification that you mentioned.  I.e., what if the mode for the session was AUTO_FLUSH_BACKGROUND, and just before the commit they change it to MANUAL_FLUSH, but didn't add any new rows?

In other words, I think doing such a differentiation based on the flush mode would add more quirks and lessen the ease of use of the API.

Personally, I'd also change the behavior of KuduSession to automatically flush all buffered operations on destruction, but that's another story.

I guess I can add a note that auto-flushing behavior is there just for convenience and to avoid unintentionally dropping the data on the floor.

What do you think?


http://gerrit.cloudera.org:8080/#/c/17431/2/src/kudu/client/transaction-internal.cc
File src/kudu/client/transaction-internal.cc:

http://gerrit.cloudera.org:8080/#/c/17431/2/src/kudu/client/transaction-internal.cc@198
PS2, Line 198: // It's a rare case, but nothing prevents to call session.reset() or
             :         // session.swap(nullptr) for a transactional session in the user code.
> Doesn't this just replace the pointer in the scoped_refptr instance with a 
Good point -- that has become a BS when session is std::shared_ptr.  Sure, the operations in the user code doesn't affect the copy stored in the txn_sessions_.



-- 
To view, visit http://gerrit.cloudera.org:8080/17431
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2480129a99fb19d16868e14f9b9e33c83e3d8e7f
Gerrit-Change-Number: 17431
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 14 May 2021 03:02:52 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2612 automatically flush sessions on txn commit

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/17431 )

Change subject: KUDU-2612 automatically flush sessions on txn commit
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17431/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/17431/2//COMMIT_MSG@22
PS2, Line 22: So, I opted
            : not to perform the consistency check as a part of KuduSession::Apply(),
            : rather relying on the logic of KuduSession::Flush() and
            : KuduSession::FlushAsync() methods instead.
This means that the check performed in StartCommit() is best-effort, i.e. in case of concurrent activity on the session while the transaction being committed, it might happen that some operations are applied to the session, but they might not get to the server when the transaction is committed.

Is it good enough or we want to be strict on this and protect against concurrent activity on session handles while a transaction is being committed?



-- 
To view, visit http://gerrit.cloudera.org:8080/17431
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2480129a99fb19d16868e14f9b9e33c83e3d8e7f
Gerrit-Change-Number: 17431
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Thu, 13 May 2021 21:49:56 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2612 automatically flush sessions on txn commit

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17431 )

Change subject: KUDU-2612 automatically flush sessions on txn commit
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17431/2/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

http://gerrit.cloudera.org:8080/#/c/17431/2/src/kudu/client/client-test.cc@7433
PS2, Line 7433: ASSERT_OK(session->SetFlushMode(KuduSession::MANUAL_FLUSH));
> That looks like a viable option, but why would anybody add rows into a buff
Good point -- I think I've come around to keeping it as is. My initial impression was that MANUAL_FLUSH mode is meant to give users more control of _whether_ to flush at all, and when. I think it's more likely though that users mainly use MANUAL_FLUSH mode to control the size of batches, while still meaning to flush. In that sense, automatically flushing is indeed much less error prone.



-- 
To view, visit http://gerrit.cloudera.org:8080/17431
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2480129a99fb19d16868e14f9b9e33c83e3d8e7f
Gerrit-Change-Number: 17431
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 14 May 2021 03:28:30 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2612 automatically flush sessions on txn commit

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/17431 )

Change subject: KUDU-2612 automatically flush sessions on txn commit
......................................................................


Patch Set 3: Verified+1

unrelated TSAN warnings in TxnCommitITest.TestAbortRacingWithBotchedCommit


-- 
To view, visit http://gerrit.cloudera.org:8080/17431
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2480129a99fb19d16868e14f9b9e33c83e3d8e7f
Gerrit-Change-Number: 17431
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 14 May 2021 06:00:02 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-2612 automatically flush sessions on txn commit

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins, Andrew Wong, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/17431

to look at the new patch set (#2).

Change subject: KUDU-2612 automatically flush sessions on txn commit
......................................................................

KUDU-2612 automatically flush sessions on txn commit

With this patch, all transactional sessions created off a transction
handle are automatically flushed upon calling Commit() on the handle.

As for the KuduTransaction::StartCommit() method, it's now necessary
to flush all the transactional sessions created off the transaction
handle before calling the method, otherwise Status::IllegalState()
would be returned.

As Andrew and I discussed offline, it might be an option to return an
error from KuduSession::Apply() for a transactional session whose
transaction has already started committing.  However, after looking at
this closer, I realized that it would a bit tricky given it should be
thread-safe, so it would add more complexity into the hot path of
applying write operations in the context of a session.  So, I opted
not to perform the consistency check as a part of KuduSession::Apply(),
rather relying on the logic of KuduSession::Flush() and
KuduSession::FlushAsync() methods instead.

Another design detail worth pointing at is that a KuduTransaction handle
keeps shared, not weak pointers to transaction sessions originated off
the handle (I did several back-and-forth iterations on this, though).
Even if using shared_ptr, not weak_ptr, no circular dependencies are
introduced since a transactional session doesn't keep a reference
to the corresponding transactional handle.  The shared_ptr-based
approach looks better than one with weak_ptr because
  (1) It might prevent a data loss due to a mistake in an application
      code, and it takes time to find and fix those.
  (2) It looks more portable and consistent if thinking about similar
      functionality to implement in the Java client.

This patch also contains several test scenarios for the newly introduced
functionality.

Change-Id: I2480129a99fb19d16868e14f9b9e33c83e3d8e7f
---
M src/kudu/client/client-test.cc
M src/kudu/client/client.h
M src/kudu/client/transaction-internal.cc
M src/kudu/client/transaction-internal.h
M src/kudu/integration-tests/txn_write_ops-itest.cc
5 files changed, 670 insertions(+), 59 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/31/17431/2
-- 
To view, visit http://gerrit.cloudera.org:8080/17431
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I2480129a99fb19d16868e14f9b9e33c83e3d8e7f
Gerrit-Change-Number: 17431
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] KUDU-2612 automatically flush sessions on txn commit

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/17431 )

Change subject: KUDU-2612 automatically flush sessions on txn commit
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17431/3/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

http://gerrit.cloudera.org:8080/#/c/17431/3/src/kudu/client/client-test.cc@7495
PS3, Line 7495: ,
> nit: period
Why period?  It's supposed to be a comma: that's the list of methods in the bourne-shell format, i.e. it unfolds into KuduTransaction::Commit() KuduTransaction::StartCommit()



-- 
To view, visit http://gerrit.cloudera.org:8080/17431
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2480129a99fb19d16868e14f9b9e33c83e3d8e7f
Gerrit-Change-Number: 17431
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 14 May 2021 20:16:03 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2612 automatically flush sessions on txn commit

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/17431 )

Change subject: KUDU-2612 automatically flush sessions on txn commit
......................................................................


Patch Set 3: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/17431/3/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

http://gerrit.cloudera.org:8080/#/c/17431/3/src/kudu/client/client-test.cc@7495
PS3, Line 7495: ,
nit: period



-- 
To view, visit http://gerrit.cloudera.org:8080/17431
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I2480129a99fb19d16868e14f9b9e33c83e3d8e7f
Gerrit-Change-Number: 17431
Gerrit-PatchSet: 3
Gerrit-Owner: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 14 May 2021 20:00:19 +0000
Gerrit-HasComments: Yes