You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Andrew Wong (Code Review)" <ge...@cloudera.org> on 2018/03/23 08:36:11 UTC

[kudu-CR] KUDU-16: enable server-side limits on scanners

Andrew Wong has uploaded this change for review. ( http://gerrit.cloudera.org:8080/9783


Change subject: KUDU-16: enable server-side limits on scanners
......................................................................

KUDU-16: enable server-side limits on scanners

This patch adds the functionality to specify limits to the number of
rows returned per tablet server scanner. Internally, each server-side
scanner now tracks the number of rows it has already returned and
adjusts the number of rows to serialize per batch based on this limit.

This patch also adds a public API to allow the specification of
per-client-side-scanner limits on the number of rows returned. Each
scanner will maintain a count of the number of rows already read, and
adjust the server-side limit upon sending the next scan request.

The existing uint64 protobuf field has been deprecated and replaced with
a new int64 field.

There are a few unit tests included to verify that the limits act as
expected. I also verified that lowering the limit reduces the number of
bytes read on disk.

This patch does not implement the behavior for scan tokens. That will
come in a later patch.

Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
---
M src/kudu/client/client-test.cc
M src/kudu/client/client.cc
M src/kudu/client/client.h
M src/kudu/client/scan_configuration.cc
M src/kudu/client/scan_configuration.h
M src/kudu/client/scanner-internal.cc
M src/kudu/client/scanner-internal.h
M src/kudu/common/scan_spec.cc
M src/kudu/common/scan_spec.h
M src/kudu/common/wire_protocol.cc
M src/kudu/common/wire_protocol.h
M src/kudu/tserver/scanners.cc
M src/kudu/tserver/scanners.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tserver.proto
16 files changed, 354 insertions(+), 76 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/9783/1
-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Dan Burkert, David Ribeiro Alves, Kudu Jenkins, Adar Dembo, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9783

to look at the new patch set (#7).

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................

KUDU-16 pt 1: add server-side limits on scanners

This patch adds the functionality to specify limits to the number of
rows returned per tablet server scanner. Internally, each server-side
scanner now tracks the number of rows it has already returned and
adjusts the number of rows to serialize per batch based on this limit.
Setting a non-positive limit is grounds for short circuiting the scan.

Note: Use of this functionality by the client will come in a later
patch.

Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
---
M src/kudu/common/scan_spec.cc
M src/kudu/common/scan_spec.h
M src/kudu/common/wire_protocol-test.cc
M src/kudu/common/wire_protocol.cc
M src/kudu/common/wire_protocol.h
M src/kudu/tserver/scanners.cc
M src/kudu/tserver/scanners.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tserver.proto
10 files changed, 219 insertions(+), 46 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/9783/7
-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 7
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9783 )

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9783/10/src/kudu/common/wire_protocol.cc
File src/kudu/common/wire_protocol.cc:

http://gerrit.cloudera.org:8080/#/c/9783/10/src/kudu/common/wire_protocol.cc@805
PS10, Line 805:                        int64_t max_num_rows_to_return,
> was thinking about this from a performance angle a bit more this morning. I
Hmmm interesting, so the idea would be to resize the selection vector post-scan, eh? Makes sense; I'll try that out.



-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 10
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Fri, 30 Mar 2018 19:53:26 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Dan Burkert, David Ribeiro Alves, Kudu Jenkins, Adar Dembo, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9783

to look at the new patch set (#9).

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................

KUDU-16 pt 1: add server-side limits on scanners

This patch adds the functionality to specify limits to the number of
rows returned per tablet server scanner. Internally, each server-side
scanner now tracks the number of rows it has already returned and
adjusts the number of rows to serialize per batch based on this limit.
Setting a non-positive limit is grounds for short circuiting the scan.

Note: Use of this functionality by the client will come in a later
patch.

Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
---
M src/kudu/common/scan_spec.cc
M src/kudu/common/scan_spec.h
M src/kudu/common/wire_protocol-test.cc
M src/kudu/common/wire_protocol.cc
M src/kudu/common/wire_protocol.h
M src/kudu/tserver/scanners.cc
M src/kudu/tserver/scanners.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tserver.proto
10 files changed, 238 insertions(+), 49 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/9783/9
-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 9
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Dan Burkert, David Ribeiro Alves, Kudu Jenkins, Adar Dembo, Grant Henke, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9783

to look at the new patch set (#15).

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................

KUDU-16 pt 1: add server-side limits on scanners

This patch adds the functionality to specify limits to the number of
rows returned per tablet server scanner. Internally, each server-side
scanner now tracks the number of rows it has already returned and
adjusts the number of rows to serialize per batch based on this limit.
Setting a non-positive limit is grounds for short circuiting the scan.

Note: Use of this functionality by the client will come in a later
patch.

Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
---
M src/kudu/common/rowblock.cc
M src/kudu/common/rowblock.h
M src/kudu/common/scan_spec.cc
M src/kudu/common/scan_spec.h
M src/kudu/tserver/scanners.cc
M src/kudu/tserver/scanners.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tserver.proto
9 files changed, 184 insertions(+), 32 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/9783/15
-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 15
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, David Ribeiro Alves, Kudu Jenkins, Adar Dembo, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9783

to look at the new patch set (#4).

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................

KUDU-16 pt 1: add server-side limits on scanners

This patch adds the functionality to specify limits to the number of
rows returned per tablet server scanner. Internally, each server-side
scanner now tracks the number of rows it has already returned and
adjusts the number of rows to serialize per batch based on this limit.

The existing uint64 protobuf field has been deprecated and replaced with
a new int64 field.

Note: Use of this functionality by the client will come in a later
patch.

Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
---
M src/kudu/common/scan_spec.cc
M src/kudu/common/scan_spec.h
M src/kudu/common/wire_protocol-test.cc
M src/kudu/common/wire_protocol.cc
M src/kudu/common/wire_protocol.h
M src/kudu/tserver/scanners.cc
M src/kudu/tserver/scanners.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tserver.proto
10 files changed, 171 insertions(+), 37 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/9783/4
-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 4
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Dan Burkert, David Ribeiro Alves, Kudu Jenkins, Adar Dembo, Grant Henke, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9783

to look at the new patch set (#11).

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................

KUDU-16 pt 1: add server-side limits on scanners

This patch adds the functionality to specify limits to the number of
rows returned per tablet server scanner. Internally, each server-side
scanner now tracks the number of rows it has already returned and
adjusts the number of rows to serialize per batch based on this limit.
Setting a non-positive limit is grounds for short circuiting the scan.

Note: Use of this functionality by the client will come in a later
patch.

Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
---
M src/kudu/common/rowblock.cc
M src/kudu/common/rowblock.h
M src/kudu/common/scan_spec.cc
M src/kudu/common/scan_spec.h
M src/kudu/common/wire_protocol.cc
M src/kudu/common/wire_protocol.h
M src/kudu/tserver/scanners.cc
M src/kudu/tserver/scanners.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tserver.proto
11 files changed, 182 insertions(+), 35 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/9783/11
-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 11
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-16: enable server-side limits on scanners

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/9783 )

Change subject: KUDU-16: enable server-side limits on scanners
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9783/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9783/1//COMMIT_MSG@7
PS1, Line 7: KUDU-16: enable server-side limits on scanners
Could you split the client-side portion out of this patch? You're already testing the new server-side functionality with raw scan RPCs, so there's no reason why you can't treat it as a fully encapsulated patch.


http://gerrit.cloudera.org:8080/#/c/9783/1//COMMIT_MSG@19
PS1, Line 19: The existing uint64 protobuf field has been deprecated and replaced with
            : a new int64 field.
Why? What's wrong with reusing the existing field?



-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Fri, 23 Mar 2018 18:32:08 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/9783 )

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................


Patch Set 15: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 15
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Wed, 04 Apr 2018 00:21:51 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/9783 )

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................


Patch Set 9:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/9783/9/src/kudu/common/wire_protocol-test.cc
File src/kudu/common/wire_protocol-test.cc:

http://gerrit.cloudera.org:8080/#/c/9783/9/src/kudu/common/wire_protocol-test.cc@323
PS9, Line 323:                     boost::none /* max_num_rows_to_return */, true /* pad timestamps */);
if you write this as: /* max_num_rows_to_return= */boost::none then clang-tidy will check that the param name matches


http://gerrit.cloudera.org:8080/#/c/9783/9/src/kudu/common/wire_protocol.cc
File src/kudu/common/wire_protocol.cc:

http://gerrit.cloudera.org:8080/#/c/9783/9/src/kudu/common/wire_protocol.cc@762
PS9, Line 762: max_num_rows_to_return
we can get rid of some branching here by doing this once up front:

int max_rows = max_num_rows_to_return.value_or(block.nrows());

(this is a very hot function on the scan path)


http://gerrit.cloudera.org:8080/#/c/9783/9/src/kudu/common/wire_protocol.cc@771
PS9, Line 771: (*max_num_rows_to_return - rows_returned)
can also avoid this by updating a 'max_rows' above incrementally


http://gerrit.cloudera.org:8080/#/c/9783/9/src/kudu/common/wire_protocol.cc@838
PS9, Line 838: num_rows
what's the purpose of allowing this to be negative in this function? can't it just be checked at a higher level and be a DCHECK here to simplify some stuff a bit?


http://gerrit.cloudera.org:8080/#/c/9783/9/src/kudu/tserver/tablet_server-test.cc
File src/kudu/tserver/tablet_server-test.cc:

http://gerrit.cloudera.org:8080/#/c/9783/9/src/kudu/tserver/tablet_server-test.cc@2072
PS9, Line 2072: lmits
typo


http://gerrit.cloudera.org:8080/#/c/9783/9/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

http://gerrit.cloudera.org:8080/#/c/9783/9/src/kudu/tserver/tablet_service.cc@485
PS9, Line 485: optional
maybe this can be non-optional and simplify code paths by just setting it to row_block.nrows() here and get rid of the optional processing deeper down?


http://gerrit.cloudera.org:8080/#/c/9783/9/src/kudu/tserver/tablet_service.cc@502
PS9, Line 502: num_rows_returned_
do we still need this member if it's redundant with the one inside the scannner?



-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 9
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Fri, 30 Mar 2018 01:53:48 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Dan Burkert, David Ribeiro Alves, Kudu Jenkins, Adar Dembo, Grant Henke, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9783

to look at the new patch set (#12).

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................

KUDU-16 pt 1: add server-side limits on scanners

This patch adds the functionality to specify limits to the number of
rows returned per tablet server scanner. Internally, each server-side
scanner now tracks the number of rows it has already returned and
adjusts the number of rows to serialize per batch based on this limit.
Setting a non-positive limit is grounds for short circuiting the scan.

Note: Use of this functionality by the client will come in a later
patch.

Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
---
M src/kudu/common/rowblock.cc
M src/kudu/common/rowblock.h
M src/kudu/common/scan_spec.cc
M src/kudu/common/scan_spec.h
M src/kudu/tserver/scanners.cc
M src/kudu/tserver/scanners.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tserver.proto
9 files changed, 181 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/9783/12
-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 12
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, David Ribeiro Alves, Kudu Jenkins, Adar Dembo, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9783

to look at the new patch set (#5).

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................

KUDU-16 pt 1: add server-side limits on scanners

This patch adds the functionality to specify limits to the number of
rows returned per tablet server scanner. Internally, each server-side
scanner now tracks the number of rows it has already returned and
adjusts the number of rows to serialize per batch based on this limit.

The existing uint64 protobuf field has been deprecated and replaced with
a new int64 field.

Note: Use of this functionality by the client will come in a later
patch.

Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
---
M src/kudu/common/scan_spec.cc
M src/kudu/common/scan_spec.h
M src/kudu/common/wire_protocol-test.cc
M src/kudu/common/wire_protocol.cc
M src/kudu/common/wire_protocol.h
M src/kudu/tserver/scanners.cc
M src/kudu/tserver/scanners.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tserver.proto
10 files changed, 171 insertions(+), 39 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/9783/5
-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9783 )

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................


Patch Set 9:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/9783/9/src/kudu/common/wire_protocol-test.cc
File src/kudu/common/wire_protocol-test.cc:

http://gerrit.cloudera.org:8080/#/c/9783/9/src/kudu/common/wire_protocol-test.cc@323
PS9, Line 323:                     boost::none /* max_num_rows_to_return */, true /* pad timestamps */);
> if you write this as: /* max_num_rows_to_return= */boost::none then clang-t
Done


http://gerrit.cloudera.org:8080/#/c/9783/9/src/kudu/common/wire_protocol.cc
File src/kudu/common/wire_protocol.cc:

http://gerrit.cloudera.org:8080/#/c/9783/9/src/kudu/common/wire_protocol.cc@762
PS9, Line 762: max_num_rows_to_return
> we can get rid of some branching here by doing this once up front:
Done


http://gerrit.cloudera.org:8080/#/c/9783/9/src/kudu/common/wire_protocol.cc@771
PS9, Line 771: (*max_num_rows_to_return - rows_returned)
> can also avoid this by updating a 'max_rows' above incrementally
Done


http://gerrit.cloudera.org:8080/#/c/9783/9/src/kudu/common/wire_protocol.cc@838
PS9, Line 838: num_rows
> what's the purpose of allowing this to be negative in this function? can't 
Done


http://gerrit.cloudera.org:8080/#/c/9783/9/src/kudu/tserver/tablet_server-test.cc
File src/kudu/tserver/tablet_server-test.cc:

http://gerrit.cloudera.org:8080/#/c/9783/9/src/kudu/tserver/tablet_server-test.cc@2072
PS9, Line 2072: lmits
> typo
Done


http://gerrit.cloudera.org:8080/#/c/9783/9/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

http://gerrit.cloudera.org:8080/#/c/9783/9/src/kudu/tserver/tablet_service.cc@485
PS9, Line 485: optional
> maybe this can be non-optional and simplify code paths by just setting it t
Nice, I like it.


http://gerrit.cloudera.org:8080/#/c/9783/9/src/kudu/tserver/tablet_service.cc@502
PS9, Line 502: num_rows_returned_
> do we still need this member if it's redundant with the one inside the scan
It's necessary for the scanned rows metrics. AFAICT, scanners (and their internal count) can batches multiple scans, whereas the metrics get updated after each HandleRowBlock() (per each batch's result collector). I'm not sure there's elegant way to refactor this out.



-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 9
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Fri, 30 Mar 2018 07:52:49 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Dan Burkert, David Ribeiro Alves, Kudu Jenkins, Adar Dembo, Grant Henke, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9783

to look at the new patch set (#14).

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................

KUDU-16 pt 1: add server-side limits on scanners

This patch adds the functionality to specify limits to the number of
rows returned per tablet server scanner. Internally, each server-side
scanner now tracks the number of rows it has already returned and
adjusts the number of rows to serialize per batch based on this limit.
Setting a non-positive limit is grounds for short circuiting the scan.

Note: Use of this functionality by the client will come in a later
patch.

Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
---
M src/kudu/common/rowblock.cc
M src/kudu/common/rowblock.h
M src/kudu/common/scan_spec.cc
M src/kudu/common/scan_spec.h
M src/kudu/tserver/scanners.cc
M src/kudu/tserver/scanners.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tserver.proto
9 files changed, 182 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/9783/14
-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 14
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Dan Burkert, David Ribeiro Alves, Kudu Jenkins, Adar Dembo, Grant Henke, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9783

to look at the new patch set (#13).

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................

KUDU-16 pt 1: add server-side limits on scanners

This patch adds the functionality to specify limits to the number of
rows returned per tablet server scanner. Internally, each server-side
scanner now tracks the number of rows it has already returned and
adjusts the number of rows to serialize per batch based on this limit.
Setting a non-positive limit is grounds for short circuiting the scan.

Note: Use of this functionality by the client will come in a later
patch.

Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
---
M src/kudu/common/rowblock.cc
M src/kudu/common/rowblock.h
M src/kudu/common/scan_spec.cc
M src/kudu/common/scan_spec.h
M src/kudu/tserver/scanners.cc
M src/kudu/tserver/scanners.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tserver.proto
9 files changed, 181 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/9783/13
-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 13
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, David Ribeiro Alves, Kudu Jenkins, Adar Dembo, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9783

to look at the new patch set (#3).

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................

KUDU-16 pt 1: add server-side limits on scanners

This patch adds the functionality to specify limits to the number of
rows returned per tablet server scanner. Internally, each server-side
scanner now tracks the number of rows it has already returned and
adjusts the number of rows to serialize per batch based on this limit.

The existing uint64 protobuf field has been deprecated and replaced with
a new int64 field.

Note: Use of this functionality by the client will come in a later
patch.

Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
---
M src/kudu/common/scan_spec.cc
M src/kudu/common/scan_spec.h
M src/kudu/common/wire_protocol.cc
M src/kudu/common/wire_protocol.h
M src/kudu/tserver/scanners.cc
M src/kudu/tserver/scanners.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tserver.proto
9 files changed, 173 insertions(+), 37 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/9783/3
-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9783 )

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................


Patch Set 3:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/9783/2/src/kudu/common/scan_spec.h
File src/kudu/common/scan_spec.h:

http://gerrit.cloudera.org:8080/#/c/9783/2/src/kudu/common/scan_spec.h@42
PS2, Line 42:       lower_bound_partition_key_(),
> warning: initializer for member 'lower_bound_partition_key_' is redundant [
Done


http://gerrit.cloudera.org:8080/#/c/9783/2/src/kudu/common/scan_spec.h@43
PS2, Line 43:       exclusive_upper_bound_partition_key_(),
> warning: initializer for member 'exclusive_upper_bound_partition_key_' is r
Done


http://gerrit.cloudera.org:8080/#/c/9783/2/src/kudu/common/scan_spec.h@45
PS2, Line 45:       has_limit_(false),
> do you really need this. can't you get away with just having a has_limit() 
I think having limit = 0 meaning "no limit" is not intuitive, although yes some invalid limit (e.g. -1) would work. This is actually an artifact of the fact that I implemented this with uint64_t first and didn't want limit = 0 to mean "no limit".

Even with int64_t, having this as a separate field also simplifies cases where a user might specify a negative limit. Sure, we could disallow that and return an error or something, but that should be done at the client level. On the server side, I think it makes sense to just short-circuit (which isn't how it is now, but let me change that), and that's a harder call to make if we base short circuiting off of the value alone e.g. if we have -1 be the default, should we scan everything, or scan nothing? I suppose if we make the default LLONG_MAX, we could have it scan nothing. WDYT?


http://gerrit.cloudera.org:8080/#/c/9783/2/src/kudu/common/scan_spec.cc
File src/kudu/common/scan_spec.cc:

http://gerrit.cloudera.org:8080/#/c/9783/2/src/kudu/common/scan_spec.cc@72
PS2, Line 72: has_limit_ && limit_ <= 0
> right, same point. this seems redudant. could just call the has_limit() fun
Addressed in the other comment


http://gerrit.cloudera.org:8080/#/c/9783/2/src/kudu/common/wire_protocol.h
File src/kudu/common/wire_protocol.h:

http://gerrit.cloudera.org:8080/#/c/9783/2/src/kudu/common/wire_protocol.h@157
PS2, Line 157: const boost::optional<int64_t>& max_num_rows_to_return = boost::none
> likely best move this before the pad_ argument. it's possible we remove it 
Done


http://gerrit.cloudera.org:8080/#/c/9783/2/src/kudu/tserver/scanners.h
File src/kudu/tserver/scanners.h:

http://gerrit.cloudera.org:8080/#/c/9783/2/src/kudu/tserver/scanners.h@368
PS2, Line 368: 
             :   // The number of rows that 
> docs
Done


http://gerrit.cloudera.org:8080/#/c/9783/2/src/kudu/tserver/tablet_server-test.cc
File src/kudu/tserver/tablet_server-test.cc:

http://gerrit.cloudera.org:8080/#/c/9783/2/src/kudu/tserver/tablet_server-test.cc@2102
PS2, Line 2102:   const int64_t kHighLimit = kNumRows * 2;
> warning: either cast from 'int' to 'const int64_t' (aka 'const long') is in
Done


http://gerrit.cloudera.org:8080/#/c/9783/2/src/kudu/tserver/tablet_server-test.cc@2107
PS2, Line 2107:   int num_rows = AllowSlowTests() ? 10000 : 1000;
> warning: missing username/bug in TODO [google-readability-todo]
This comment is five years old and I think is out of datae.


http://gerrit.cloudera.org:8080/#/c/9783/2/src/kudu/tserver/tserver.proto
File src/kudu/tserver/tserver.proto:

http://gerrit.cloudera.org:8080/#/c/9783/2/src/kudu/tserver/tserver.proto@229
PS2, Line 229:   repeated ColumnRangePredicatePB DEPRECATED_range_predicates = 3;
> seems like an unrelated change that could go in a separate change set?
Done


http://gerrit.cloudera.org:8080/#/c/9783/2/src/kudu/tserver/tserver.proto@282
PS2, Line 282:   // The maximum number of rows to scan with the new scanner.
             :   //
             :   // The scanner will automatically stop yielding results and close itself
             :   // after reaching this num
> how does this relate to 'batch_size_bytes' in the scan request below. we sh
It doesn't really relate `batch_size_bytes`. It imposes a limit to the total limit returned by a scanner, whereas `batch_size_bytes` imposes how many can be returned at a time. I updated the comment to maybe make it more clear that this is a property of the scanner (whereas batch_size_bytes is a property of each scan).

I'm not sure I follow the second point. AFAIK we don't, we'll scan for as long as there are more rowsets to scan.



-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 3
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Fri, 23 Mar 2018 21:39:56 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9783 )

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9783/9/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

http://gerrit.cloudera.org:8080/#/c/9783/9/src/kudu/tserver/tablet_service.cc@502
PS9, Line 502:  Returns number of
> It's necessary for the scanned rows metrics. AFAICT, scanners (and their in
I meant, scanners can *span* multiple *batches*, and the row collectors don't, and so we currently update metrics once per batch.



-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 10
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Fri, 30 Mar 2018 17:37:59 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/9783 )

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................


Patch Set 14:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9783/14/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

http://gerrit.cloudera.org:8080/#/c/9783/14/src/kudu/tserver/tablet_service.cc@484
PS14, Line 484: row_block.selection_vector()->CountSelected()
nit: can you store this in a temporary local int instead of calling it twice? I don't think the compiler will be smart enough to know it's constant.



-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 14
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Tue, 03 Apr 2018 22:17:54 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9783 )

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................


Patch Set 15:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9783/14/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

http://gerrit.cloudera.org:8080/#/c/9783/14/src/kudu/tserver/tablet_service.cc@484
PS14, Line 484:  row_block.selection_vector()->CountSelected(
> nit: can you store this in a temporary local int instead of calling it twic
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 15
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Tue, 03 Apr 2018 22:21:28 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Dan Burkert, David Ribeiro Alves, Kudu Jenkins, Adar Dembo, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9783

to look at the new patch set (#8).

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................

KUDU-16 pt 1: add server-side limits on scanners

This patch adds the functionality to specify limits to the number of
rows returned per tablet server scanner. Internally, each server-side
scanner now tracks the number of rows it has already returned and
adjusts the number of rows to serialize per batch based on this limit.
Setting a non-positive limit is grounds for short circuiting the scan.

Note: Use of this functionality by the client will come in a later
patch.

Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
---
M src/kudu/common/scan_spec.cc
M src/kudu/common/scan_spec.h
M src/kudu/common/wire_protocol-test.cc
M src/kudu/common/wire_protocol.cc
M src/kudu/common/wire_protocol.h
M src/kudu/tserver/scanners.cc
M src/kudu/tserver/scanners.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tserver.proto
10 files changed, 238 insertions(+), 49 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/9783/8
-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 8
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/9783 )

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9783/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9783/1//COMMIT_MSG@19
PS1, Line 19: 
            : Change-Id: I3ca5dd
> I'd like to avoid using unsigned types for non-bit patterns, unless you mea
I don't really care about unsigned vs. signed, but I thought it'd be nice to reuse the existing proto field. I don't think doing so will have any effect on backwards compatibility.



-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Fri, 23 Mar 2018 20:11:13 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/9783 )

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................

KUDU-16 pt 1: add server-side limits on scanners

This patch adds the functionality to specify limits to the number of
rows returned per tablet server scanner. Internally, each server-side
scanner now tracks the number of rows it has already returned and
adjusts the number of rows to serialize per batch based on this limit.
Setting a non-positive limit is grounds for short circuiting the scan.

Note: Use of this functionality by the client will come in a later
patch.

Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Reviewed-on: http://gerrit.cloudera.org:8080/9783
Tested-by: Kudu Jenkins
Reviewed-by: Todd Lipcon <to...@apache.org>
---
M src/kudu/common/rowblock.cc
M src/kudu/common/rowblock.h
M src/kudu/common/scan_spec.cc
M src/kudu/common/scan_spec.h
M src/kudu/tserver/scanners.cc
M src/kudu/tserver/scanners.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tserver.proto
9 files changed, 184 insertions(+), 32 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Todd Lipcon: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 16
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9783 )

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................


Patch Set 1:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/9783/1//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/9783/1//COMMIT_MSG@7
PS1, Line 7: KUDU-16: enable server-side limits on scanners
> Could you split the client-side portion out of this patch? You're already t
Yeah completely fair.


http://gerrit.cloudera.org:8080/#/c/9783/1//COMMIT_MSG@19
PS1, Line 19: The existing uint64 protobuf field has been deprecated and replaced with
            : a new int64 field.
> Why? What's wrong with reusing the existing field?
I'd like to avoid using unsigned types for non-bit patterns, unless you mean keep the proto field uint64 and use int64_t internally.



-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 1
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Fri, 23 Mar 2018 19:24:48 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Kudu Jenkins, Adar Dembo, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9783

to look at the new patch set (#2).

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................

KUDU-16 pt 1: add server-side limits on scanners

This patch adds the functionality to specify limits to the number of
rows returned per tablet server scanner. Internally, each server-side
scanner now tracks the number of rows it has already returned and
adjusts the number of rows to serialize per batch based on this limit.

The existing uint64 protobuf field has been deprecated and replaced with
a new int64 field.

Note: Use of this functionality by the client will come in a later
patch.

Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
---
M src/kudu/common/scan_spec.cc
M src/kudu/common/scan_spec.h
M src/kudu/common/wire_protocol.cc
M src/kudu/common/wire_protocol.h
M src/kudu/tserver/scanners.cc
M src/kudu/tserver/scanners.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tserver.proto
9 files changed, 170 insertions(+), 36 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/9783/2
-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Dan Burkert, David Ribeiro Alves, Kudu Jenkins, Adar Dembo, Grant Henke, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9783

to look at the new patch set (#10).

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................

KUDU-16 pt 1: add server-side limits on scanners

This patch adds the functionality to specify limits to the number of
rows returned per tablet server scanner. Internally, each server-side
scanner now tracks the number of rows it has already returned and
adjusts the number of rows to serialize per batch based on this limit.
Setting a non-positive limit is grounds for short circuiting the scan.

Note: Use of this functionality by the client will come in a later
patch.

Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
---
M src/kudu/common/scan_spec.cc
M src/kudu/common/scan_spec.h
M src/kudu/common/wire_protocol-test.cc
M src/kudu/common/wire_protocol.cc
M src/kudu/common/wire_protocol.h
M src/kudu/tserver/scanners.cc
M src/kudu/tserver/scanners.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tserver.proto
10 files changed, 233 insertions(+), 53 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/9783/10
-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 10
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.
Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/9783 )

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................


Patch Set 10:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/9783/10/src/kudu/common/wire_protocol.cc
File src/kudu/common/wire_protocol.cc:

http://gerrit.cloudera.org:8080/#/c/9783/10/src/kudu/common/wire_protocol.cc@805
PS10, Line 805:                        int64_t max_num_rows_to_return,
was thinking about this from a performance angle a bit more this morning. I think it would be faster to actually do the limiting on the selection vector of the RowBlock rather than adding new conditionals into the CopyColumn code. That way the branches happen once up front instead of on every column, and the stuff like 'num_rows' calculation would just be adjusted automatically. Thoughts?



-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 10
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Fri, 30 Mar 2018 18:56:03 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9783 )

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................


Patch Set 5:

Ah, failed because I swapped the order of some args and it killed a test. Should be fine now.


-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Fri, 23 Mar 2018 22:59:33 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "David Ribeiro Alves (Code Review)" <ge...@cloudera.org>.
David Ribeiro Alves has posted comments on this change. ( http://gerrit.cloudera.org:8080/9783 )

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/9783/2/src/kudu/common/scan_spec.h
File src/kudu/common/scan_spec.h:

http://gerrit.cloudera.org:8080/#/c/9783/2/src/kudu/common/scan_spec.h@45
PS2, Line 45:       has_limit_(false),
do you really need this. can't you get away with just having a has_limit() function that tests if the limit is != 0


http://gerrit.cloudera.org:8080/#/c/9783/2/src/kudu/common/scan_spec.cc
File src/kudu/common/scan_spec.cc:

http://gerrit.cloudera.org:8080/#/c/9783/2/src/kudu/common/scan_spec.cc@72
PS2, Line 72: has_limit_ && limit_ == 0
right, same point. this seems redudant. could just call the has_limit() function


http://gerrit.cloudera.org:8080/#/c/9783/2/src/kudu/common/wire_protocol.h
File src/kudu/common/wire_protocol.h:

http://gerrit.cloudera.org:8080/#/c/9783/2/src/kudu/common/wire_protocol.h@157
PS2, Line 157: const boost::optional<int64_t>& max_num_rows_to_return = boost::none
likely best move this before the pad_ argument. it's possible we remove it in the future but still keep this one.


http://gerrit.cloudera.org:8080/#/c/9783/2/src/kudu/tserver/scanners.h
File src/kudu/tserver/scanners.h:

http://gerrit.cloudera.org:8080/#/c/9783/2/src/kudu/tserver/scanners.h@368
PS2, Line 368: 
             :   int64_t num_rows_returned_;
docs


http://gerrit.cloudera.org:8080/#/c/9783/2/src/kudu/tserver/tserver.proto
File src/kudu/tserver/tserver.proto:

http://gerrit.cloudera.org:8080/#/c/9783/2/src/kudu/tserver/tserver.proto@229
PS2, Line 229:   repeated ColumnRangePredicatePB DEPRECATED_range_predicates = 3 [deprecated = true];
seems like an unrelated change that could go in a separate change set?


http://gerrit.cloudera.org:8080/#/c/9783/2/src/kudu/tserver/tserver.proto@282
PS2, Line 282:   // The maximum number of rows to scan.
             :   // The scanner will automatically stop yielding results and close
             :   // itself after reaching this number of result rows.
             :   optional int64 limit = 15;
how does this relate to 'batch_size_bytes' in the scan request below. we should at least doc it.
also, did we have a limit before that was server side? if yes might be worth pondering moving it here (and making it default), since it's easier to find.



-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 2
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Fri, 23 Mar 2018 20:01:43 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9783 )

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................


Patch Set 9:

The pre-commit failures appear unrelated.


-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 9
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Thu, 29 Mar 2018 23:24:57 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/9783 )

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................


Patch Set 5:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/9783/1/src/kudu/common/scan_spec.h
File src/kudu/common/scan_spec.h:

http://gerrit.cloudera.org:8080/#/c/9783/1/src/kudu/common/scan_spec.h@42
PS1, Line 42:       cache_blocks_(true),
> warning: initializer for member 'lower_bound_partition_key_' is redundant [
Done


http://gerrit.cloudera.org:8080/#/c/9783/1/src/kudu/common/scan_spec.h@43
PS1, Line 43:       has_limit_(false),
> warning: initializer for member 'exclusive_upper_bound_partition_key_' is r
Done


http://gerrit.cloudera.org:8080/#/c/9783/1/src/kudu/common/scan_spec.h@172
PS1, Line 172: 
> It would be cleaner to use an optional<int64_t> if you don't want to specia
Done


http://gerrit.cloudera.org:8080/#/c/9783/1/src/kudu/common/scan_spec.cc
File src/kudu/common/scan_spec.cc:

http://gerrit.cloudera.org:8080/#/c/9783/1/src/kudu/common/scan_spec.cc@132
PS1, Line 132:   const string limit = has_limit_ ? Substitute(" LIMIT $0", limit_) : "";
> nice.
Ack


http://gerrit.cloudera.org:8080/#/c/9783/1/src/kudu/common/wire_protocol.cc
File src/kudu/common/wire_protocol.cc:

http://gerrit.cloudera.org:8080/#/c/9783/1/src/kudu/common/wire_protocol.cc@771
PS1, Line 771:     for (int i = 0; i < run_size && !has_fulfilled_limit(); i++) {
> You can avoid adding another runtime check in the inner loop by hoisting th
Done. Slightly different impl, but along the same lines.


http://gerrit.cloudera.org:8080/#/c/9783/1/src/kudu/tserver/tablet_server-test.cc
File src/kudu/tserver/tablet_server-test.cc:

http://gerrit.cloudera.org:8080/#/c/9783/1/src/kudu/tserver/tablet_server-test.cc@2102
PS1, Line 2102:   const int64_t kHighLimit = kNumRows * 2;
> warning: either cast from 'int' to 'const int64_t' (aka 'const long') is in
Done


http://gerrit.cloudera.org:8080/#/c/9783/1/src/kudu/tserver/tablet_server-test.cc@2107
PS1, Line 2107:   int num_rows = AllowSlowTests() ? 10000 : 1000;
> warning: missing username/bug in TODO [google-readability-todo]
Done


http://gerrit.cloudera.org:8080/#/c/9783/1/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

http://gerrit.cloudera.org:8080/#/c/9783/1/src/kudu/tserver/tablet_service.cc@494
PS1, Line 494: limit do
> s/limitted/limited
Done


http://gerrit.cloudera.org:8080/#/c/9783/1/src/kudu/tserver/tserver.proto
File src/kudu/tserver/tserver.proto:

http://gerrit.cloudera.org:8080/#/c/9783/1/src/kudu/tserver/tserver.proto@229
PS1, Line 229:   // DEPRECATED: use column_predicates field.
> Thanks for adding this.  Going forward I'd be in favor of using the depreca
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Tue, 27 Mar 2018 23:07:18 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.
Dan Burkert has posted comments on this change. ( http://gerrit.cloudera.org:8080/9783 )

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................


Patch Set 5:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/9783/1/src/kudu/common/scan_spec.h
File src/kudu/common/scan_spec.h:

http://gerrit.cloudera.org:8080/#/c/9783/1/src/kudu/common/scan_spec.h@172
PS1, Line 172: 
It would be cleaner to use an optional<int64_t> if you don't want to special case 0 (I wouldn't be against the special case).


http://gerrit.cloudera.org:8080/#/c/9783/1/src/kudu/common/scan_spec.cc
File src/kudu/common/scan_spec.cc:

http://gerrit.cloudera.org:8080/#/c/9783/1/src/kudu/common/scan_spec.cc@132
PS1, Line 132:   const string limit = has_limit_ ? Substitute(" LIMIT $0", limit_) : "";
nice.


http://gerrit.cloudera.org:8080/#/c/9783/1/src/kudu/common/wire_protocol.cc
File src/kudu/common/wire_protocol.cc:

http://gerrit.cloudera.org:8080/#/c/9783/1/src/kudu/common/wire_protocol.cc@771
PS1, Line 771:     for (int i = 0; i < run_size && !has_fulfilled_limit(); i++) {
You can avoid adding another runtime check in the inner loop by hoisting the check:

    auto limit = max_num_rows_to_return && max_num_rows_to_return - rows_returned > limit ? max_num_rows_to_return - rows_returned : limit;
    for (int64_t i = 0; i < limit; i++) {
    ...


http://gerrit.cloudera.org:8080/#/c/9783/1/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

http://gerrit.cloudera.org:8080/#/c/9783/1/src/kudu/tserver/tablet_service.cc@494
PS1, Line 494: limit do
s/limitted/limited


http://gerrit.cloudera.org:8080/#/c/9783/1/src/kudu/tserver/tserver.proto
File src/kudu/tserver/tserver.proto:

http://gerrit.cloudera.org:8080/#/c/9783/1/src/kudu/tserver/tserver.proto@229
PS1, Line 229:   // DEPRECATED: use column_predicates field.
Thanks for adding this.  Going forward I'd be in favor of using the deprecated=true option and _not_ renaming the field with the DEPRECATED_ prefix.  I think the reason we did that was that we didn't know about the deprecated option.

That being said, in this case I think you shouldn't introduce a new field, you should just reuse the existing one.  You can change the type, however I don't see a downside to keeping it unsigned?



-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 5
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot
Gerrit-Comment-Date: Sat, 24 Mar 2018 11:31:13 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-16 pt 1: add server-side limits on scanners

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Hello Tidy Bot, Dan Burkert, David Ribeiro Alves, Kudu Jenkins, Adar Dembo, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/9783

to look at the new patch set (#6).

Change subject: KUDU-16 pt 1: add server-side limits on scanners
......................................................................

KUDU-16 pt 1: add server-side limits on scanners

This patch adds the functionality to specify limits to the number of
rows returned per tablet server scanner. Internally, each server-side
scanner now tracks the number of rows it has already returned and
adjusts the number of rows to serialize per batch based on this limit.

Note: Use of this functionality by the client will come in a later
patch.

Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
---
M src/kudu/common/scan_spec.cc
M src/kudu/common/scan_spec.h
M src/kudu/common/wire_protocol-test.cc
M src/kudu/common/wire_protocol.cc
M src/kudu/common/wire_protocol.h
M src/kudu/tserver/scanners.cc
M src/kudu/tserver/scanners.h
M src/kudu/tserver/tablet_server-test.cc
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tserver.proto
10 files changed, 172 insertions(+), 46 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/83/9783/6
-- 
To view, visit http://gerrit.cloudera.org:8080/9783
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I3ca5ddff09f4910d12f80259bd11e4027f99aa7c
Gerrit-Change-Number: 9783
Gerrit-PatchSet: 6
Gerrit-Owner: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: David Ribeiro Alves <da...@gmail.com>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Tidy Bot