You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Todd Lipcon (Code Review)" <ge...@cloudera.org> on 2016/12/12 14:02:37 UTC

[kudu-CR] WIP: KUDU-1775 (part 3): enforce max cell size and max PK size

Hello Adar Dembo,

I'd like you to do a code review.  Please visit

    http://gerrit.cloudera.org:8080/5475

to review the following change.

Change subject: WIP: KUDU-1775 (part 3): enforce max cell size and max PK size
......................................................................

WIP: KUDU-1775 (part 3): enforce max cell size and max PK size

This adds limits on the size of any individual cell as well as a limit
on the maximum size of an encoded key.

* Large cells are known to cause problems such as KUDU-1524 (crash
  during flush).
* Large PKs have been seen to cause problems where the key column's
  cfile footer increases beyond the hard-coded maximum size of 64KB.

If a user attempts to insert a non-conforming row, or update a cell to
be larger than the maximum, they will receive an InvalidArgument error
for that row.

This patch takes the approach of doing the validation at the tablet
layer rather than somewhere higher up the stack. The reasoning is that
we don't want to reject an entire batch of operations due to one bad
row, and we are not well equipped to set per-row errors anywhere except
at the tablet layer.

Because this is an incompatible change (it's possible that users have
been successfully using larger cells) the limits are added as
configurable flags, but tagged as unsafe.

WIP: I want to test what happens when restarting a tablet server if a
WAL contains a non-conforming update/insert. I also imagine we have some
tests which insert non-conforming data and we'll need to bump the
appropriate flag or modify the test.

Change-Id: Ib6b94cffd9c3efbe80a4b31e9272b376a4e41d15
---
M docs/schema_design.adoc
M src/kudu/client/client-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tserver/tablet_server-test.cc
5 files changed, 172 insertions(+), 24 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/5475/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: Ib6b94cffd9c3efbe80a4b31e9272b376a4e41d15
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>

[kudu-CR] KUDU-1775 (part 3): enforce max cell size and max PK size

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: KUDU-1775 (part 3): enforce max cell size and max PK size
......................................................................


Patch Set 3:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/5475/3/docs/schema_design.adoc
File docs/schema_design.adoc:

PS3, Line 455:   
Nit: extra space here


PS3, Line 459: 300 columns
I find '300 columns' to be a little strange when the Number of Columns recommendation above mentions '200 columns'. Is this discrepancy intentional? If so, may want to clarify a bit.


http://gerrit.cloudera.org:8080/#/c/5475/3/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

Line 3909: // Test trying to insert a row with an encoded key that is too large.
Perhaps add an equivalent for the Java client?

Not really sure why this is worth testing in the clients, to be honest; it's entirely a server-side change.


http://gerrit.cloudera.org:8080/#/c/5475/3/src/kudu/integration-tests/ts_recovery-itest.cc
File src/kudu/integration-tests/ts_recovery-itest.cc:

PS3, Line 232: configuratio
configuration


http://gerrit.cloudera.org:8080/#/c/5475/3/src/kudu/tablet/tablet.cc
File src/kudu/tablet/tablet.cc:

Line 414:   // TODO(todd): test what happens if we restart with an invalid op in our log!
Doesn't your new test in ts_recovery-itest do this, by reducing the max cell size and restarting?


PS3, Line 452:     // Don't do any validation on delete. This is important to allow users to
             :     // delete a row if a row with a too-large key was inserted on a previous version
             :     // that had no limits.
Is there really any validation to be done on delete? I guess we can look at the key size, but that's about it, right?


Line 480:   Status s = ValidateInsertOrUpsertUnlocked(*op);
Would be nice to track invalid ops in a metric, though not required for this patch.


http://gerrit.cloudera.org:8080/#/c/5475/3/src/kudu/tablet/tablet_bootstrap.cc
File src/kudu/tablet/tablet_bootstrap.cc:

PS3, Line 1267: we
"after we" ?


http://gerrit.cloudera.org:8080/#/c/5475/3/src/kudu/tserver/tablet_server-test.cc
File src/kudu/tserver/tablet_server-test.cc:

Line 328:   // This should generate one error into per_row_errors.
But it generates two errors; maybe this needs to be updated?


-- 
To view, visit http://gerrit.cloudera.org:8080/5475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib6b94cffd9c3efbe80a4b31e9272b376a4e41d15
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1775 (part 3): enforce max cell size and max PK size

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Hello Adar Dembo, Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/5475

to look at the new patch set (#5).

Change subject: KUDU-1775 (part 3): enforce max cell size and max PK size
......................................................................

KUDU-1775 (part 3): enforce max cell size and max PK size

This adds limits on the size of any individual cell as well as a limit
on the maximum size of an encoded key.

* Large cells are known to cause problems such as KUDU-1524 (crash
  during flush).
* Large PKs have been seen to cause problems where the key column's
  cfile footer increases beyond the hard-coded maximum size of 64KB.

If a user attempts to insert a non-conforming row, or update a cell to
be larger than the maximum, they will receive an InvalidArgument error
for that row.

This patch takes the approach of doing the validation at the tablet
layer rather than somewhere higher up the stack. The reasoning is that
we don't want to reject an entire batch of operations due to one bad
row, and we are not well equipped to set per-row errors anywhere except
at the tablet layer.

Because this is an incompatible change (it's possible that users have
been successfully using larger cells) the limits are added as
configurable flags, but tagged as unsafe. A new test verifies that, if
the user has insertions in the WAL that were previously allowed but now
are disallowed, the bootstrap fails without any crashes. It also tests
that by raising the flag to a higher value, bootstraps will succeed.
Making this test pass involved a small fix in bootstrap to properly
handle the case where an Apply failed on restart the previously
succeeded.

Change-Id: Ib6b94cffd9c3efbe80a4b31e9272b376a4e41d15
---
M docs/schema_design.adoc
M src/kudu/client/client-test.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tserver/tablet_server-test.cc
9 files changed, 276 insertions(+), 41 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/5475/5
-- 
To view, visit http://gerrit.cloudera.org:8080/5475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib6b94cffd9c3efbe80a4b31e9272b376a4e41d15
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1775 (part 3): enforce max cell size and max PK size

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: KUDU-1775 (part 3): enforce max cell size and max PK size
......................................................................


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5475/6/src/kudu/tablet/tablet.cc
File src/kudu/tablet/tablet.cc:

PS6, Line 435: FLAGS_max_cell_size_bytes
> Should this be FLAGS_max_encoded_key_size_bytes?
woops, good catch, will fix momentarily.


-- 
To view, visit http://gerrit.cloudera.org:8080/5475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib6b94cffd9c3efbe80a4b31e9272b376a4e41d15
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Silvius Rus <sr...@cloudera.com>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1775 (part 3): enforce max cell size and max PK size

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: KUDU-1775 (part 3): enforce max cell size and max PK size
......................................................................


Patch Set 5: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/5475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib6b94cffd9c3efbe80a4b31e9272b376a4e41d15
Gerrit-PatchSet: 5
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-1775 (part 3): enforce max cell size and max PK size

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/5475

to look at the new patch set (#3).

Change subject: KUDU-1775 (part 3): enforce max cell size and max PK size
......................................................................

KUDU-1775 (part 3): enforce max cell size and max PK size

This adds limits on the size of any individual cell as well as a limit
on the maximum size of an encoded key.

* Large cells are known to cause problems such as KUDU-1524 (crash
  during flush).
* Large PKs have been seen to cause problems where the key column's
  cfile footer increases beyond the hard-coded maximum size of 64KB.

If a user attempts to insert a non-conforming row, or update a cell to
be larger than the maximum, they will receive an InvalidArgument error
for that row.

This patch takes the approach of doing the validation at the tablet
layer rather than somewhere higher up the stack. The reasoning is that
we don't want to reject an entire batch of operations due to one bad
row, and we are not well equipped to set per-row errors anywhere except
at the tablet layer.

Because this is an incompatible change (it's possible that users have
been successfully using larger cells) the limits are added as
configurable flags, but tagged as unsafe. A new test verifies that, if
the user has insertions in the WAL that were previously allowed but now
are disallowed, the bootstrap fails without any crashes. It also tests
that by raising the flag to a higher value, bootstraps will succeed.
Making this test pass involved a small fix in bootstrap to properly
handle the case where an Apply failed on restart the previously
succeeded.

Change-Id: Ib6b94cffd9c3efbe80a4b31e9272b376a4e41d15
---
M docs/schema_design.adoc
M src/kudu/client/client-test.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tserver/tablet_server-test.cc
9 files changed, 273 insertions(+), 45 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/5475/3
-- 
To view, visit http://gerrit.cloudera.org:8080/5475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib6b94cffd9c3efbe80a4b31e9272b376a4e41d15
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins

[kudu-CR] WIP: KUDU-1775 (part 3): enforce max cell size and max PK size

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/5475

to look at the new patch set (#2).

Change subject: WIP: KUDU-1775 (part 3): enforce max cell size and max PK size
......................................................................

WIP: KUDU-1775 (part 3): enforce max cell size and max PK size

This adds limits on the size of any individual cell as well as a limit
on the maximum size of an encoded key.

* Large cells are known to cause problems such as KUDU-1524 (crash
  during flush).
* Large PKs have been seen to cause problems where the key column's
  cfile footer increases beyond the hard-coded maximum size of 64KB.

If a user attempts to insert a non-conforming row, or update a cell to
be larger than the maximum, they will receive an InvalidArgument error
for that row.

This patch takes the approach of doing the validation at the tablet
layer rather than somewhere higher up the stack. The reasoning is that
we don't want to reject an entire batch of operations due to one bad
row, and we are not well equipped to set per-row errors anywhere except
at the tablet layer.

Because this is an incompatible change (it's possible that users have
been successfully using larger cells) the limits are added as
configurable flags, but tagged as unsafe.

WIP: I want to test what happens when restarting a tablet server if a
WAL contains a non-conforming update/insert. I also imagine we have some
tests which insert non-conforming data and we'll need to bump the
appropriate flag or modify the test.

Change-Id: Ib6b94cffd9c3efbe80a4b31e9272b376a4e41d15
---
M docs/schema_design.adoc
M src/kudu/client/client-test.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/test_workload.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tserver/tablet_server-test.cc
7 files changed, 200 insertions(+), 33 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/5475/2
-- 
To view, visit http://gerrit.cloudera.org:8080/5475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib6b94cffd9c3efbe80a4b31e9272b376a4e41d15
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins

[kudu-CR] KUDU-1775 (part 3): enforce max cell size and max PK size

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: KUDU-1775 (part 3): enforce max cell size and max PK size
......................................................................


Patch Set 3:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/5475/3/docs/schema_design.adoc
File docs/schema_design.adoc:

PS3, Line 455:   
> Nit: extra space here
Done


PS3, Line 459: 300 columns
> I find '300 columns' to be a little strange when the Number of Columns reco
Done


http://gerrit.cloudera.org:8080/#/c/5475/3/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

Line 3909: // Test trying to insert a row with an encoded key that is too large.
> Perhaps add an equivalent for the Java client?
It was more convenient to write coverage for the "too-large-PK' case here in client-test rather than writing it in tablet_server-test, since I needed a schema with a composite key with multiple strings. All the utility code in tablet_server-test uses integer keys.

Agreed that it's a little goofy that this is really focusing on a tablet-server bit of code, so doesn't belong here from a naming perspective, but same is true of a bunch of other tests like TestMutateDeletedRow, etc.


http://gerrit.cloudera.org:8080/#/c/5475/3/src/kudu/integration-tests/ts_recovery-itest.cc
File src/kudu/integration-tests/ts_recovery-itest.cc:

PS3, Line 232: configuratio
> configuration
Done


http://gerrit.cloudera.org:8080/#/c/5475/3/src/kudu/tablet/tablet.cc
File src/kudu/tablet/tablet.cc:

Line 414:   // TODO(todd): test what happens if we restart with an invalid op in our log!
> Doesn't your new test in ts_recovery-itest do this, by reducing the max cel
Done


PS3, Line 452:     // Don't do any validation on delete. This is important to allow users to
             :     // delete a row if a row with a too-large key was inserted on a previous version
             :     // that had no limits.
> Is there really any validation to be done on delete? I guess we can look at
yea, that's what I meant here -- we could validate that the key is "legal", but it's probably best not to. Will rephrase.


Line 480:   Status s = ValidateInsertOrUpsertUnlocked(*op);
> Would be nice to track invalid ops in a metric, though not required for thi
added a TODO


http://gerrit.cloudera.org:8080/#/c/5475/3/src/kudu/tablet/tablet_bootstrap.cc
File src/kudu/tablet/tablet_bootstrap.cc:

Line 1263:       // TODO: get rid of redundant params below - they can be gotten from the Request
> warning: missing username/bug in TODO [google-readability-todo]
Done


PS3, Line 1267: we
> "after we" ?
Done


http://gerrit.cloudera.org:8080/#/c/5475/3/src/kudu/tserver/tablet_server-test.cc
File src/kudu/tserver/tablet_server-test.cc:

Line 328:   // This should generate one error into per_row_errors.
> But it generates two errors; maybe this needs to be updated?
Done


-- 
To view, visit http://gerrit.cloudera.org:8080/5475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib6b94cffd9c3efbe80a4b31e9272b376a4e41d15
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1775 (part 3): enforce max cell size and max PK size

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has submitted this change and it was merged.

Change subject: KUDU-1775 (part 3): enforce max cell size and max PK size
......................................................................


KUDU-1775 (part 3): enforce max cell size and max PK size

This adds limits on the size of any individual cell as well as a limit
on the maximum size of an encoded key.

* Large cells are known to cause problems such as KUDU-1524 (crash
  during flush).
* Large PKs have been seen to cause problems where the key column's
  cfile footer increases beyond the hard-coded maximum size of 64KB.

If a user attempts to insert a non-conforming row, or update a cell to
be larger than the maximum, they will receive an InvalidArgument error
for that row.

This patch takes the approach of doing the validation at the tablet
layer rather than somewhere higher up the stack. The reasoning is that
we don't want to reject an entire batch of operations due to one bad
row, and we are not well equipped to set per-row errors anywhere except
at the tablet layer.

Because this is an incompatible change (it's possible that users have
been successfully using larger cells) the limits are added as
configurable flags, but tagged as unsafe. A new test verifies that, if
the user has insertions in the WAL that were previously allowed but now
are disallowed, the bootstrap fails without any crashes. It also tests
that by raising the flag to a higher value, bootstraps will succeed.
Making this test pass involved a small fix in bootstrap to properly
handle the case where an Apply failed on restart the previously
succeeded.

Change-Id: Ib6b94cffd9c3efbe80a4b31e9272b376a4e41d15
Reviewed-on: http://gerrit.cloudera.org:8080/5475
Tested-by: Kudu Jenkins
Reviewed-by: Adar Dembo <ad...@cloudera.com>
---
M docs/schema_design.adoc
M src/kudu/client/client-test.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tserver/tablet_server-test.cc
9 files changed, 276 insertions(+), 41 deletions(-)

Approvals:
  Adar Dembo: Looks good to me, approved
  Kudu Jenkins: Verified



-- 
To view, visit http://gerrit.cloudera.org:8080/5475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: merged
Gerrit-Change-Id: Ib6b94cffd9c3efbe80a4b31e9272b376a4e41d15
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1775 (part 3): enforce max cell size and max PK size

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change.

Change subject: KUDU-1775 (part 3): enforce max cell size and max PK size
......................................................................


Patch Set 4: Code-Review+2

-- 
To view, visit http://gerrit.cloudera.org:8080/5475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib6b94cffd9c3efbe80a4b31e9272b376a4e41d15
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-1775 (part 3): enforce max cell size and max PK size

Posted by "Silvius Rus (Code Review)" <ge...@cloudera.org>.
Silvius Rus has posted comments on this change.

Change subject: KUDU-1775 (part 3): enforce max cell size and max PK size
......................................................................


Patch Set 6:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/5475/6/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

PS6, Line 3946: 65536
Should this be 16384?


http://gerrit.cloudera.org:8080/#/c/5475/6/src/kudu/tablet/tablet.cc
File src/kudu/tablet/tablet.cc:

PS6, Line 435: FLAGS_max_cell_size_bytes
Should this be FLAGS_max_encoded_key_size_bytes?


-- 
To view, visit http://gerrit.cloudera.org:8080/5475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib6b94cffd9c3efbe80a4b31e9272b376a4e41d15
Gerrit-PatchSet: 6
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Silvius Rus <sr...@cloudera.com>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1775 (part 3): enforce max cell size and max PK size

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: KUDU-1775 (part 3): enforce max cell size and max PK size
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5475/3/src/kudu/integration-tests/ts_recovery-itest.cc
File src/kudu/integration-tests/ts_recovery-itest.cc:

Line 442:   ASSERT_OK(ts->Restart());
oops, this deletion later in the file was junk


-- 
To view, visit http://gerrit.cloudera.org:8080/5475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Ib6b94cffd9c3efbe80a4b31e9272b376a4e41d15
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1775 (part 3): enforce max cell size and max PK size

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/5475

to look at the new patch set (#4).

Change subject: KUDU-1775 (part 3): enforce max cell size and max PK size
......................................................................

KUDU-1775 (part 3): enforce max cell size and max PK size

This adds limits on the size of any individual cell as well as a limit
on the maximum size of an encoded key.

* Large cells are known to cause problems such as KUDU-1524 (crash
  during flush).
* Large PKs have been seen to cause problems where the key column's
  cfile footer increases beyond the hard-coded maximum size of 64KB.

If a user attempts to insert a non-conforming row, or update a cell to
be larger than the maximum, they will receive an InvalidArgument error
for that row.

This patch takes the approach of doing the validation at the tablet
layer rather than somewhere higher up the stack. The reasoning is that
we don't want to reject an entire batch of operations due to one bad
row, and we are not well equipped to set per-row errors anywhere except
at the tablet layer.

Because this is an incompatible change (it's possible that users have
been successfully using larger cells) the limits are added as
configurable flags, but tagged as unsafe. A new test verifies that, if
the user has insertions in the WAL that were previously allowed but now
are disallowed, the bootstrap fails without any crashes. It also tests
that by raising the flag to a higher value, bootstraps will succeed.
Making this test pass involved a small fix in bootstrap to properly
handle the case where an Apply failed on restart the previously
succeeded.

Change-Id: Ib6b94cffd9c3efbe80a4b31e9272b376a4e41d15
---
M docs/schema_design.adoc
M src/kudu/client/client-test.cc
M src/kudu/integration-tests/raft_consensus-itest.cc
M src/kudu/integration-tests/test_workload.cc
M src/kudu/integration-tests/ts_recovery-itest.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tablet/tablet_bootstrap.cc
M src/kudu/tserver/tablet_server-test.cc
9 files changed, 276 insertions(+), 48 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/75/5475/4
-- 
To view, visit http://gerrit.cloudera.org:8080/5475
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ib6b94cffd9c3efbe80a4b31e9272b376a4e41d15
Gerrit-PatchSet: 4
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Mike Percy <mp...@apache.org>
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>