You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Todd Lipcon (Code Review)" <ge...@cloudera.org> on 2016/09/16 18:15:27 UTC

[kudu-CR] WIP: KUDU-1623. Properly handle UPSERTS that only include PK column

Hello David Ribeiro Alves,

I'd like you to do a code review.  Please visit

    http://gerrit.cloudera.org:8080/4441

to review the following change.

Change subject: WIP: KUDU-1623. Properly handle UPSERTS that only include PK column
......................................................................

WIP: KUDU-1623. Properly handle UPSERTS that only include PK column

WIP because I need a commit message and some test code cleanup

Change-Id: Ic878f6e51ead5bf91335ddb47536e7c29de11ac4
---
M src/kudu/integration-tests/fuzz-itest.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tablet/tablet_random_access-test.cc
4 files changed, 186 insertions(+), 62 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/4441/1
-- 
To view, visit http://gerrit.cloudera.org:8080/4441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ic878f6e51ead5bf91335ddb47536e7c29de11ac4
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>

[kudu-CR] KUDU-1623. Properly handle UPSERTS that only include PK column

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/4441

to look at the new patch set (#2).

Change subject: KUDU-1623. Properly handle UPSERTS that only include PK column
......................................................................

KUDU-1623. Properly handle UPSERTS that only include PK column

This fixes a bug reported by Chris George in which the tablet server
would crash when handling an UPSERT which was converted into an empty
UPDATE. For example, in pseudo-SQL:

  CREATE TABLE t (
    x INT NOT NULL PRIMARY KEY
  );

  UPSERT INTO t VALUES (1);
  UPSERT INTO t VALUES (1);

Here the second upsert would detect the primary key already existed and
convert into an update containing any non-PK columns. Since there are no
non-PK columns, the update was "empty", and ended up as an empty
RowChangeList. In DEBUG builds, this fired a DCHECK, but in RELEASE
builds, we ended up with an empty RowChangeList in the DMS. At flush
time, this would cause other assertions to fire or crash.

The fix is relatively simple: if we see that an upsert converted into an
empty update, we just drop the delta rather than continuing to insert
into the delta store. This required loosing a check at bootstrap time,
where we previously assumed that every mutation must have been recorded
in at least one store.

The test changes are much larger than the fix itself due to some
refactoring. The main gist of the change is to introduce operations in
tablet_random_access-test and fuzz-itest where we send an UPSERT which
contains only the primary key column. These tests crashed prior to the
fix and pass now.

Change-Id: Ic878f6e51ead5bf91335ddb47536e7c29de11ac4
---
M src/kudu/integration-tests/fuzz-itest.cc
A src/kudu/tablet/key_value_test_schema.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tablet/tablet_random_access-test.cc
5 files changed, 227 insertions(+), 73 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/4441/2
-- 
To view, visit http://gerrit.cloudera.org:8080/4441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic878f6e51ead5bf91335ddb47536e7c29de11ac4
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins

[kudu-CR] KUDU-1623. Properly handle UPSERTS that only include PK column

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: KUDU-1623. Properly handle UPSERTS that only include PK column
......................................................................


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/4441/2/src/kudu/tablet/key_value_test_schema.h
File src/kudu/tablet/key_value_test_schema.h:

Line 33: struct KeyValueTestRow {
> nit: I find this choice of name confusing. How about: ExpectedTestKeyValueR
changed to ExpectedKeyValueRow


http://gerrit.cloudera.org:8080/#/c/4441/2/src/kudu/tablet/tablet.cc
File src/kudu/tablet/tablet.cc:

PS2, Line 462: bet
> typo
Done


Line 465:   if (buf.size() == 0) {
> it would be slightly clearer to do enc->is_empty() instead of testing the b
Done


http://gerrit.cloudera.org:8080/#/c/4441/2/src/kudu/tablet/tablet_random_access-test.cc
File src/kudu/tablet/tablet_random_access-test.cc:

Line 119:       VLOG(1) << RowOperationsPB::Type_Name(op.type) << " " << op.row->ToString();
> add log output before this to explain what this output is?
Done


Line 169:                                int val,
> hum, re wrapping in this case don't we usually align the variables with the
yea, just got messed up during some renames


-- 
To view, visit http://gerrit.cloudera.org:8080/4441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic878f6e51ead5bf91335ddb47536e7c29de11ac4
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1623. Properly handle UPSERTS that only include PK column

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-1623. Properly handle UPSERTS that only include PK column
......................................................................


Patch Set 3: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/4441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic878f6e51ead5bf91335ddb47536e7c29de11ac4
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-1623. Properly handle UPSERTS that only include PK column

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change.

Change subject: KUDU-1623. Properly handle UPSERTS that only include PK column
......................................................................


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/4441/2/src/kudu/tablet/key_value_test_schema.h
File src/kudu/tablet/key_value_test_schema.h:

Line 33: struct KeyValueTestRow {
nit: I find this choice of name confusing. How about: ExpectedTestKeyValueRow or something?


http://gerrit.cloudera.org:8080/#/c/4441/2/src/kudu/tablet/tablet.cc
File src/kudu/tablet/tablet.cc:

PS2, Line 462: bet
typo


Line 465:   if (buf.size() == 0) {
it would be slightly clearer to do enc->is_empty() instead of testing the buffer size


http://gerrit.cloudera.org:8080/#/c/4441/2/src/kudu/tablet/tablet_random_access-test.cc
File src/kudu/tablet/tablet_random_access-test.cc:

Line 119:       VLOG(1) << RowOperationsPB::Type_Name(op.type) << " " << op.row->ToString();
add log output before this to explain what this output is?


Line 169:                                int val,
hum, re wrapping in this case don't we usually align the variables with the first one, or am I misremembering?


-- 
To view, visit http://gerrit.cloudera.org:8080/4441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic878f6e51ead5bf91335ddb47536e7c29de11ac4
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1623. Properly handle UPSERTS that only include PK column

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has submitted this change and it was merged.

Change subject: KUDU-1623. Properly handle UPSERTS that only include PK column
......................................................................


KUDU-1623. Properly handle UPSERTS that only include PK column

This fixes a bug reported by Chris George in which the tablet server
would crash when handling an UPSERT which was converted into an empty
UPDATE. For example, in pseudo-SQL:

  CREATE TABLE t (
    x INT NOT NULL PRIMARY KEY
  );

  UPSERT INTO t VALUES (1);
  UPSERT INTO t VALUES (1);

Here the second upsert would detect the primary key already existed and
convert into an update containing any non-PK columns. Since there are no
non-PK columns, the update was "empty", and ended up as an empty
RowChangeList. In DEBUG builds, this fired a DCHECK, but in RELEASE
builds, we ended up with an empty RowChangeList in the DMS. At flush
time, this would cause other assertions to fire or crash.

The fix is relatively simple: if we see that an upsert converted into an
empty update, we just drop the delta rather than continuing to insert
into the delta store. This required loosing a check at bootstrap time,
where we previously assumed that every mutation must have been recorded
in at least one store.

The test changes are much larger than the fix itself due to some
refactoring. The main gist of the change is to introduce operations in
tablet_random_access-test and fuzz-itest where we send an UPSERT which
contains only the primary key column. These tests crashed prior to the
fix and pass now.

Change-Id: Ic878f6e51ead5bf91335ddb47536e7c29de11ac4
Reviewed-on: http://gerrit.cloudera.org:8080/4441
Reviewed-by: David Ribeiro Alves <dr...@apache.org>
Tested-by: Kudu Jenkins
---
M src/kudu/integration-tests/fuzz-itest.cc
A src/kudu/tablet/key_value_test_schema.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tablet/tablet_random_access-test.cc
5 files changed, 229 insertions(+), 73 deletions(-)

Approvals:
  David Ribeiro Alves: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/4441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ic878f6e51ead5bf91335ddb47536e7c29de11ac4
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1623. Properly handle UPSERTS that only include PK column

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: KUDU-1623. Properly handle UPSERTS that only include PK column
......................................................................


Patch Set 2:

btw looped fuzz-itest 500 times: http://dist-test.cloudera.org//job?job_id=todd.1474058979.7901

-- 
To view, visit http://gerrit.cloudera.org:8080/4441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ic878f6e51ead5bf91335ddb47536e7c29de11ac4
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-1623. Properly handle UPSERTS that only include PK column

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/4441

to look at the new patch set (#3).

Change subject: KUDU-1623. Properly handle UPSERTS that only include PK column
......................................................................

KUDU-1623. Properly handle UPSERTS that only include PK column

This fixes a bug reported by Chris George in which the tablet server
would crash when handling an UPSERT which was converted into an empty
UPDATE. For example, in pseudo-SQL:

  CREATE TABLE t (
    x INT NOT NULL PRIMARY KEY
  );

  UPSERT INTO t VALUES (1);
  UPSERT INTO t VALUES (1);

Here the second upsert would detect the primary key already existed and
convert into an update containing any non-PK columns. Since there are no
non-PK columns, the update was "empty", and ended up as an empty
RowChangeList. In DEBUG builds, this fired a DCHECK, but in RELEASE
builds, we ended up with an empty RowChangeList in the DMS. At flush
time, this would cause other assertions to fire or crash.

The fix is relatively simple: if we see that an upsert converted into an
empty update, we just drop the delta rather than continuing to insert
into the delta store. This required loosing a check at bootstrap time,
where we previously assumed that every mutation must have been recorded
in at least one store.

The test changes are much larger than the fix itself due to some
refactoring. The main gist of the change is to introduce operations in
tablet_random_access-test and fuzz-itest where we send an UPSERT which
contains only the primary key column. These tests crashed prior to the
fix and pass now.

Change-Id: Ic878f6e51ead5bf91335ddb47536e7c29de11ac4
---
M src/kudu/integration-tests/fuzz-itest.cc
A src/kudu/tablet/key_value_test_schema.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tablet/tablet_random_access-test.cc
5 files changed, 229 insertions(+), 73 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/41/4441/3
-- 
To view, visit http://gerrit.cloudera.org:8080/4441
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic878f6e51ead5bf91335ddb47536e7c29de11ac4
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <dr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>