You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Adar Dembo (Code Review)" <ge...@cloudera.org> on 2018/11/02 05:14:37 UTC

[kudu-CR] delta store: support iteration with is deleted virtual column

Hello Mike Percy, David Ribeiro Alves, Grant Henke, Todd Lipcon,

I'd like you to do a code review. Please visit

    http://gerrit.cloudera.org:8080/11859

to review the following change.


Change subject: delta_store: support iteration with is_deleted virtual column
......................................................................

delta_store: support iteration with is_deleted virtual column

This patch adds support to delta stores (i.e. DMS and delta files) for
iterating with an IS_DELETED virtual column. A diff scan will include
IS_DELETED in the projection so that clients can tell if a row was deleted
in the requested time range.

Change-Id: Ibafcae9f9f82a6c2d9a8afedb8fbf07d6e3069ec
---
M src/kudu/common/schema.h
M src/kudu/tablet/delta_store.cc
M src/kudu/tablet/delta_store.h
M src/kudu/tablet/deltafile-test.cc
M src/kudu/tablet/memrowset.cc
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/tablet-test-util.h
7 files changed, 99 insertions(+), 26 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/59/11859/1
-- 
To view, visit http://gerrit.cloudera.org:8080/11859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Ibafcae9f9f82a6c2d9a8afedb8fbf07d6e3069ec
Gerrit-Change-Number: 11859
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] (04/05) delta store: support iteration with is deleted virtual column

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has removed Kudu Jenkins from this change.  ( http://gerrit.cloudera.org:8080/11859 )

Change subject: (04/05) delta_store: support iteration with is_deleted virtual column
......................................................................


Removed reviewer Kudu Jenkins with the following votes:

* Verified-1 by Kudu Jenkins (120)
-- 
To view, visit http://gerrit.cloudera.org:8080/11859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: Ibafcae9f9f82a6c2d9a8afedb8fbf07d6e3069ec
Gerrit-Change-Number: 11859
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] (04/05) delta store: support iteration with is deleted virtual column

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/11859 )

Change subject: (04/05) delta_store: support iteration with is_deleted virtual column
......................................................................


Patch Set 4: Code-Review+2

(1 comment)

http://gerrit.cloudera.org:8080/#/c/11859/3/src/kudu/tablet/delta_store.cc
File src/kudu/tablet/delta_store.cc:

http://gerrit.cloudera.org:8080/#/c/11859/3/src/kudu/tablet/delta_store.cc@249
PS3, Line 249: 
> Yes, it's guaranteed to be disjoint, and yes, we do need to set both delete
The comments you've added in rev 4 make this much more clear, that's great.



-- 
To view, visit http://gerrit.cloudera.org:8080/11859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibafcae9f9f82a6c2d9a8afedb8fbf07d6e3069ec
Gerrit-Change-Number: 11859
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Mon, 26 Nov 2018 20:03:35 +0000
Gerrit-HasComments: Yes

[kudu-CR] (04/05) delta store: support iteration with is deleted virtual column

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/11859 )

Change subject: (04/05) delta_store: support iteration with is_deleted virtual column
......................................................................

(04/05) delta_store: support iteration with is_deleted virtual column

This patch adds support to delta stores (i.e. DMS and delta files) for
iterating with an IS_DELETED virtual column. A diff scan will include
IS_DELETED in the projection so that clients can tell if a row was deleted
in the requested time range.

Change-Id: Ibafcae9f9f82a6c2d9a8afedb8fbf07d6e3069ec
Reviewed-on: http://gerrit.cloudera.org:8080/11859
Tested-by: Kudu Jenkins
Reviewed-by: Mike Percy <mp...@apache.org>
---
M src/kudu/common/schema.h
M src/kudu/tablet/delta_store.cc
M src/kudu/tablet/delta_store.h
M src/kudu/tablet/deltafile-test.cc
M src/kudu/tablet/memrowset.cc
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/tablet-test-util.h
7 files changed, 112 insertions(+), 26 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Mike Percy: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/11859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Ibafcae9f9f82a6c2d9a8afedb8fbf07d6e3069ec
Gerrit-Change-Number: 11859
Gerrit-PatchSet: 6
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] (04/05) delta store: support iteration with is deleted virtual column

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/11859 )

Change subject: (04/05) delta_store: support iteration with is_deleted virtual column
......................................................................


Patch Set 2: Verified+1

RELEASE build hung in a Python test. I posted http://gerrit.cloudera.org:8080/11863 to address that.


-- 
To view, visit http://gerrit.cloudera.org:8080/11859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibafcae9f9f82a6c2d9a8afedb8fbf07d6e3069ec
Gerrit-Change-Number: 11859
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Fri, 02 Nov 2018 16:10:30 +0000
Gerrit-HasComments: No

[kudu-CR] (04/05) delta store: support iteration with is deleted virtual column

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/11859 )

Change subject: (04/05) delta_store: support iteration with is_deleted virtual column
......................................................................


Patch Set 5: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/11859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibafcae9f9f82a6c2d9a8afedb8fbf07d6e3069ec
Gerrit-Change-Number: 11859
Gerrit-PatchSet: 5
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Fri, 07 Dec 2018 03:45:25 +0000
Gerrit-HasComments: No

[kudu-CR] (04/05) delta store: support iteration with is deleted virtual column

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/11859 )

Change subject: (04/05) delta_store: support iteration with is_deleted virtual column
......................................................................


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/11859/3/src/kudu/common/schema.h
File src/kudu/common/schema.h:

http://gerrit.cloudera.org:8080/#/c/11859/3/src/kudu/common/schema.h@870
PS3, Line 870: size_t
> should be int because of
Done


http://gerrit.cloudera.org:8080/#/c/11859/3/src/kudu/tablet/delta_store.h
File src/kudu/tablet/delta_store.h:

http://gerrit.cloudera.org:8080/#/c/11859/3/src/kudu/tablet/delta_store.h@356
PS3, Line 356: size_t
> should be int
Done


http://gerrit.cloudera.org:8080/#/c/11859/3/src/kudu/tablet/delta_store.cc
File src/kudu/tablet/delta_store.cc:

http://gerrit.cloudera.org:8080/#/c/11859/3/src/kudu/tablet/delta_store.cc@249
PS3, Line 249:       uint32_t idx_in_block = row_id - prev_prepared_idx_;
> is this guaranteed to be disjoint from rows in deleted_ ? If so I don't see
Yes, it's guaranteed to be disjoint, and yes, we do need to set both deleted rows to true and reinserted ones to false.

To understand why, consider how the DeltaIteratorMerger works. It's a vector of DeltaIterators and when we make our usual PrepareBlock()/ApplyUpdates() calls, the merger will fan them out to every DeltaIterator in order. As such, it's possible for the first iterator in the merger to delete row foo and the next iterator to reinsert row foo, which means it's important for rows that we think are reinserted to be explicitly unset _back_ to false.

That's the same reasoning behind the selection vector logic in ApplyDeletes. There's a comment in UpdateDeletionState that sort of hints at this; I'll try to make it more explicit.


http://gerrit.cloudera.org:8080/#/c/11859/3/src/kudu/tablet/memrowset.h
File src/kudu/tablet/memrowset.h:

http://gerrit.cloudera.org:8080/#/c/11859/3/src/kudu/tablet/memrowset.h@598
PS3, Line 598: size_t
> int here also I think
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/11859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibafcae9f9f82a6c2d9a8afedb8fbf07d6e3069ec
Gerrit-Change-Number: 11859
Gerrit-PatchSet: 3
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Wed, 21 Nov 2018 23:35:38 +0000
Gerrit-HasComments: Yes

[kudu-CR] (04/05) delta store: support iteration with is deleted virtual column

Posted by "Mike Percy (Code Review)" <ge...@cloudera.org>.
Mike Percy has posted comments on this change. ( http://gerrit.cloudera.org:8080/11859 )

Change subject: (04/05) delta_store: support iteration with is_deleted virtual column
......................................................................


Patch Set 3:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/11859/3/src/kudu/common/schema.h
File src/kudu/common/schema.h:

http://gerrit.cloudera.org:8080/#/c/11859/3/src/kudu/common/schema.h@870
PS3, Line 870: size_t
should be int because of

  static const int kColumnNotFound = -1;


http://gerrit.cloudera.org:8080/#/c/11859/3/src/kudu/tablet/delta_store.h
File src/kudu/tablet/delta_store.h:

http://gerrit.cloudera.org:8080/#/c/11859/3/src/kudu/tablet/delta_store.h@356
PS3, Line 356: size_t
should be int


http://gerrit.cloudera.org:8080/#/c/11859/3/src/kudu/tablet/delta_store.cc
File src/kudu/tablet/delta_store.cc:

http://gerrit.cloudera.org:8080/#/c/11859/3/src/kudu/tablet/delta_store.cc@249
PS3, Line 249:       uint32_t idx_in_block = row_id - prev_prepared_idx_;
is this guaranteed to be disjoint from rows in deleted_ ? If so I don't see the point in setting the vcol to false, if not disjoint just wondering why that is.


http://gerrit.cloudera.org:8080/#/c/11859/3/src/kudu/tablet/memrowset.h
File src/kudu/tablet/memrowset.h:

http://gerrit.cloudera.org:8080/#/c/11859/3/src/kudu/tablet/memrowset.h@598
PS3, Line 598: size_t
int here also I think



-- 
To view, visit http://gerrit.cloudera.org:8080/11859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Ibafcae9f9f82a6c2d9a8afedb8fbf07d6e3069ec
Gerrit-Change-Number: 11859
Gerrit-PatchSet: 3
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Wed, 21 Nov 2018 21:07:10 +0000
Gerrit-HasComments: Yes

[kudu-CR] (04/05) delta store: support iteration with is deleted virtual column

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Mike Percy, David Ribeiro Alves, Kudu Jenkins, Grant Henke, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11859

to look at the new patch set (#4).

Change subject: (04/05) delta_store: support iteration with is_deleted virtual column
......................................................................

(04/05) delta_store: support iteration with is_deleted virtual column

This patch adds support to delta stores (i.e. DMS and delta files) for
iterating with an IS_DELETED virtual column. A diff scan will include
IS_DELETED in the projection so that clients can tell if a row was deleted
in the requested time range.

Change-Id: Ibafcae9f9f82a6c2d9a8afedb8fbf07d6e3069ec
---
M src/kudu/common/schema.h
M src/kudu/tablet/delta_store.cc
M src/kudu/tablet/delta_store.h
M src/kudu/tablet/deltafile-test.cc
M src/kudu/tablet/memrowset.cc
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/tablet-test-util.h
7 files changed, 112 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/59/11859/4
-- 
To view, visit http://gerrit.cloudera.org:8080/11859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibafcae9f9f82a6c2d9a8afedb8fbf07d6e3069ec
Gerrit-Change-Number: 11859
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] (04/05) delta store: support iteration with is deleted virtual column

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Mike Percy, David Ribeiro Alves, Kudu Jenkins, Grant Henke, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/11859

to look at the new patch set (#2).

Change subject: (04/05) delta_store: support iteration with is_deleted virtual column
......................................................................

(04/05) delta_store: support iteration with is_deleted virtual column

This patch adds support to delta stores (i.e. DMS and delta files) for
iterating with an IS_DELETED virtual column. A diff scan will include
IS_DELETED in the projection so that clients can tell if a row was deleted
in the requested time range.

Change-Id: Ibafcae9f9f82a6c2d9a8afedb8fbf07d6e3069ec
---
M src/kudu/common/schema.h
M src/kudu/tablet/delta_store.cc
M src/kudu/tablet/delta_store.h
M src/kudu/tablet/deltafile-test.cc
M src/kudu/tablet/memrowset.cc
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/tablet-test-util.h
7 files changed, 99 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/59/11859/2
-- 
To view, visit http://gerrit.cloudera.org:8080/11859
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ibafcae9f9f82a6c2d9a8afedb8fbf07d6e3069ec
Gerrit-Change-Number: 11859
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>