Posted to reviews@kudu.apache.org by "Todd Lipcon (Code Review)" <ge...@cloudera.org> on 2016/11/30 20:33:26 UTC

[kudu-CR] KUDU-1524. Add a workaround for unflushable large cells

Hello Dan Burkert, Jean-Daniel Cryans,

I'd like you to do a code review.  Please visit

    http://gerrit.cloudera.org:8080/5282

to review the following change.

Change subject: KUDU-1524. Add a workaround for unflushable large cells
......................................................................

KUDU-1524. Add a workaround for unflushable large cells

Previously, we had a hard-coded limit of 16MB for an individual cfile
block. This would cause a CHECK failure if someone inserted a cell
larger than this size.

We should probably limit large cells in the write path in a separate
patch, but it was also a bad idea to have this limit be a constant
instead of an 'unsafe' flag. This switches to using a flag for the value
so that, if we do end up in a situation like this, we can work around it
by bumping the flag instead of recompiling.
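
As a rough illustration of the constant-versus-flag point above, here is a
minimal sketch. It is not the code from this change: the global
g_max_cfile_block_size and the helper CheckBlockSize are hypothetical
stand-ins for the real flag (FLAGS_max_cfile_block_size), and returning an
error string rather than hitting a CHECK is an assumption made only for the
example.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for a runtime-settable flag. The real change uses
// FLAGS_max_cfile_block_size; this plain global is for illustration only.
static uint64_t g_max_cfile_block_size = 16ULL * 1024 * 1024;  // 16MB default

// Returns an empty string if the block fits under the current limit,
// otherwise a human-readable description of the violation.
std::string CheckBlockSize(uint64_t block_size) {
  if (block_size > g_max_cfile_block_size) {
    return "block size " + std::to_string(block_size) +
           " exceeds limit " + std::to_string(g_max_cfile_block_size);
  }
  return "";
}
```

With the default limit, a 20MB block is rejected; raising
g_max_cfile_block_size to 32MB at runtime lets the same block through
without a rebuild, which is the workaround the commit message describes.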

Change-Id: I245b52f2bc8b9d95716cacd340dca93f64846c73
---
M src/kudu/cfile/block_compression.cc
M src/kudu/cfile/block_compression.h
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/cfile_writer.cc
4 files changed, 28 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/82/5282/1
-- 
To view, visit http://gerrit.cloudera.org:8080/5282
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: newchange
Gerrit-Change-Id: I245b52f2bc8b9d95716cacd340dca93f64846c73
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>

[kudu-CR] KUDU-1524. Add a workaround for unflushable large cells

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has posted comments on this change.

Change subject: KUDU-1524. Add a workaround for unflushable large cells
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5282/1/src/kudu/cfile/block_compression.cc
File src/kudu/cfile/block_compression.cc:

Line 61:   if (ub_compressed_size > FLAGS_max_cfile_block_size) {
Why are you comparing the compressed size here in the write path, and the uncompressed size in the read path?



Gerrit-MessageType: comment
Gerrit-Change-Id: I245b52f2bc8b9d95716cacd340dca93f64846c73
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1524. Add a workaround for unflushable large cells

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has submitted this change and it was merged.

Change subject: KUDU-1524. Add a workaround for unflushable large cells
......................................................................


KUDU-1524. Add a workaround for unflushable large cells

Previously, we had a hard-coded limit of 16MB for an individual cfile
block. This would cause a CHECK failure if someone inserted a cell
larger than this size.

We should probably limit large cells in the write path in a separate
patch, but it was also a bad idea to have this limit be a constant
instead of an 'unsafe' flag. This switches to using a flag for the value
so that, if we do end up in a situation like this, we can work around it
by bumping the flag instead of recompiling.

This also fixes the size limiting to be symmetric: we now always check
the size of the *uncompressed* block, which ensures that if we're able
to write a block, we will later be able to read it.
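
The symmetry argument can be sketched as follows. This is a simplified model
under stated assumptions, not the actual cfile code: BlockSizes, g_limit,
OldWriteAllowed, SizeOk, WriteAllowed, and ReadAllowed are all hypothetical
names, and the "old" write check is reduced to a plain comparison on the
compressed size.

```cpp
#include <cstdint>

// Hypothetical limit; stands in for FLAGS_max_cfile_block_size.
static uint64_t g_limit = 16ULL * 1024 * 1024;

// Sizes of a single block before and after compression.
struct BlockSizes {
  uint64_t uncompressed;
  uint64_t compressed;
};

// Asymmetric variant (as questioned in review): the write path only looks
// at the compressed size, so a highly compressible block can slip through.
bool OldWriteAllowed(const BlockSizes& b) { return b.compressed <= g_limit; }

// Symmetric variant: one predicate on the uncompressed size, shared by the
// write path and the read path.
bool SizeOk(const BlockSizes& b) { return b.uncompressed <= g_limit; }
bool WriteAllowed(const BlockSizes& b) { return SizeOk(b); }
bool ReadAllowed(const BlockSizes& b) { return SizeOk(b); }
```

For a block that is 64MB uncompressed but 1MB compressed, OldWriteAllowed
accepts it while ReadAllowed rejects it, which is the write-but-cannot-read
case. With the shared predicate, WriteAllowed and ReadAllowed always agree.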

Change-Id: I245b52f2bc8b9d95716cacd340dca93f64846c73
Reviewed-on: http://gerrit.cloudera.org:8080/5282
Tested-by: Kudu Jenkins
Reviewed-by: Dan Burkert <da...@apache.org>
---
M src/kudu/cfile/block_compression.cc
M src/kudu/cfile/block_compression.h
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/cfile_writer.cc
4 files changed, 34 insertions(+), 27 deletions(-)

Approvals:
  Dan Burkert: Looks good to me, approved
  Kudu Jenkins: Verified




Gerrit-MessageType: merged
Gerrit-Change-Id: I245b52f2bc8b9d95716cacd340dca93f64846c73
Gerrit-PatchSet: 3
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1524. Add a workaround for unflushable large cells

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: KUDU-1524. Add a workaround for unflushable large cells
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5282/1/src/kudu/cfile/block_compression.cc
File src/kudu/cfile/block_compression.cc:

Line 61:   if (ub_compressed_size > FLAGS_max_cfile_block_size) {
> why are you comparing the compressed size here in the write path, and the u
Yeah, it's a little goofy, but the idea is that bad data shouldn't cause us to do some crazy memory allocation or somesuch. A lot of this should probably be rethought as we work on improving stability of large cells, but I figured I'd just keep the current behavior and make the limit configurable as a quick fix in case someone hits this issue again.



Gerrit-MessageType: comment
Gerrit-Change-Id: I245b52f2bc8b9d95716cacd340dca93f64846c73
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1524. Add a workaround for unflushable large cells

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has posted comments on this change.

Change subject: KUDU-1524. Add a workaround for unflushable large cells
......................................................................


Patch Set 2: Code-Review+2


Gerrit-MessageType: comment
Gerrit-Change-Id: I245b52f2bc8b9d95716cacd340dca93f64846c73
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No

[kudu-CR] KUDU-1524. Add a workaround for unflushable large cells

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Hello Kudu Jenkins,

I'd like you to reexamine a change.  Please visit

    http://gerrit.cloudera.org:8080/5282

to look at the new patch set (#2).

Change subject: KUDU-1524. Add a workaround for unflushable large cells
......................................................................

KUDU-1524. Add a workaround for unflushable large cells

Previously, we had a hard-coded limit of 16MB for an individual cfile
block. This would cause a CHECK failure if someone inserted a cell
larger than this size.

We should probably limit large cells in the write path in a separate
patch, but it was also a bad idea to have this limit be a constant
instead of an 'unsafe' flag. This switches to using a flag for the value
so that, if we do end up in a situation like this, we can work around it
by bumping the flag instead of recompiling.

This also fixes the size limiting to be symmetric: we now always check
the size of the *uncompressed* block, which ensures that if we're able
to write a block, we will later be able to read it.

Change-Id: I245b52f2bc8b9d95716cacd340dca93f64846c73
---
M src/kudu/cfile/block_compression.cc
M src/kudu/cfile/block_compression.h
M src/kudu/cfile/cfile_reader.cc
M src/kudu/cfile/cfile_writer.cc
4 files changed, 34 insertions(+), 27 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/82/5282/2

Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I245b52f2bc8b9d95716cacd340dca93f64846c73
Gerrit-PatchSet: 2
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-1524. Add a workaround for unflushable large cells

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change.

Change subject: KUDU-1524. Add a workaround for unflushable large cells
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5282/1/src/kudu/cfile/block_compression.cc
File src/kudu/cfile/block_compression.cc:

Line 61:   if (ub_compressed_size > FLAGS_max_cfile_block_size) {
> But why not limit on the uncompressed size here?  As it is a value can be w
That's a fair point. Will do.



Gerrit-MessageType: comment
Gerrit-Change-Id: I245b52f2bc8b9d95716cacd340dca93f64846c73
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1524. Add a workaround for unflushable large cells

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has posted comments on this change.

Change subject: KUDU-1524. Add a workaround for unflushable large cells
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/5282/1/src/kudu/cfile/block_compression.cc
File src/kudu/cfile/block_compression.cc:

Line 61:   if (ub_compressed_size > FLAGS_max_cfile_block_size) {
> yea, it's a little goofy, but the idea is so that bad data doesn't cause us
But why not limit on the uncompressed size here? As it is, a value can be written that then can't be read back.



Gerrit-MessageType: comment
Gerrit-Change-Id: I245b52f2bc8b9d95716cacd340dca93f64846c73
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: Yes

[kudu-CR] KUDU-1524. Add a workaround for unflushable large cells

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has posted comments on this change.

Change subject: KUDU-1524. Add a workaround for unflushable large cells
......................................................................


Patch Set 1:

ping


Gerrit-MessageType: comment
Gerrit-Change-Id: I245b52f2bc8b9d95716cacd340dca93f64846c73
Gerrit-PatchSet: 1
Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-Owner: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Jean-Daniel Cryans <jd...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-HasComments: No