You are viewing a plain text version of this content. The canonical link for it is here.

Posted to reviews@kudu.apache.org by "Xu Yao (Code Review)" <ge...@cloudera.org> on 2018/05/15 08:40:57 UTC

[kudu-CR] KUDU-2437 Generate ScanToken from small chunks in tablet

Xu Yao has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10406


Change subject: KUDU-2437 Generate ScanToken from small chunks in tablet
......................................................................

KUDU-2437 Generate ScanToken from small chunks in tablet

When reading data in a kudu table using spark, if there is a
large amount of data in the tablet, reading the data takes a long time.

The reason is that KuduRDD uses a tablet to generate
the scanToken, so a spark task needs to process all the data in a tablet. So:

1. Split the Tablet into many small chunks by some primary keys
2. Report the primary keys to Master
3. Client get the primary keys from Master, and generate the scanToken

Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
---
M java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java
M java/kudu-client/src/main/java/org/apache/kudu/client/LocatedTablet.java
M java/kudu-client/src/main/java/org/apache/kudu/client/RemoteTablet.java
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/tablet/rowset_tree-test.cc
M src/kudu/tablet/rowset_tree.cc
M src/kudu/tablet/rowset_tree.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tserver/heartbeater.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
16 files changed, 305 insertions(+), 8 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/06/10406/1
-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 1
Gerrit-Owner: Xu Yao <oc...@gmail.com>

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Yao Xu (Code Review)" <ge...@cloudera.org>.

Hello Dan Burkert, Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10406

to look at the new patch set (#17).

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................

KUDU-2437 Split a tablet into primary key ranges by size

When reading data in a kudu table using spark,
if there is a large amount of data in the tablet,
reading the data takes a long time. The reason
is that KuduRDD uses a tablet to generate the
scanToken, so a spark task needs to process all
the data in a tablet.

TabletServer should provide an RPC interface,
which can be split tablet into multiple primary
key ranges by size. The kudu-client can choose
whether to perform parallel scan according to
the case.

Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
---
M src/kudu/common/CMakeLists.txt
M src/kudu/common/common.proto
A src/kudu/common/key_range.cc
A src/kudu/common/key_range.h
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/rowset_info.cc
M src/kudu/tablet/rowset_info.h
M src/kudu/tablet/tablet-harness.h
M src/kudu/tablet/tablet-test-util.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tablet_service.h
M src/kudu/tserver/tserver.proto
M src/kudu/tserver/tserver_service.proto
23 files changed, 732 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/06/10406/17
-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 17
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Yao Xu (Code Review)" <ge...@cloudera.org>.

Hello Dan Burkert, Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10406

to look at the new patch set (#15).

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................

KUDU-2437 Split a tablet into primary key ranges by size

When reading data in a kudu table using spark,
if there is a large amount of data in the tablet,
reading the data takes a long time. The reason
is that KuduRDD uses a tablet to generate the
scanToken, so a spark task needs to process all
the data in a tablet.

TabletServer should provide an RPC interface,
which can be split tablet into multiple primary
key ranges by size. The kudu-client can choose
whether to perform parallel scan according to
the case.

Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
---
M src/kudu/common/CMakeLists.txt
M src/kudu/common/common.proto
A src/kudu/common/key_range.cc
A src/kudu/common/key_range.h
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/rowset_info.cc
M src/kudu/tablet/rowset_info.h
M src/kudu/tablet/tablet-harness.h
M src/kudu/tablet/tablet-test-util.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tablet_service.h
M src/kudu/tserver/tserver.proto
M src/kudu/tserver/tserver_service.proto
23 files changed, 730 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/06/10406/15
-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 15
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Yao Xu (Code Review)" <ge...@cloudera.org>.

Hello Dan Burkert, Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10406

to look at the new patch set (#21).

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................

KUDU-2437 Split a tablet into primary key ranges by size

When reading data in a kudu table using spark,
if there is a large amount of data in the tablet,
reading the data takes a long time. The reason
is that KuduRDD uses a tablet to generate the
scanToken, so a spark task needs to process all
the data in a tablet.

TabletServer should provide an RPC interface,
which can be split tablet into multiple primary
key ranges by size. The kudu-client can choose
whether to perform parallel scan according to
the case.

Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
---
M src/kudu/common/CMakeLists.txt
M src/kudu/common/common.proto
A src/kudu/common/key_range.cc
A src/kudu/common/key_range.h
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/rowset_info.cc
M src/kudu/tablet/rowset_info.h
M src/kudu/tablet/tablet-harness.h
M src/kudu/tablet/tablet-test-util.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tablet_service.h
M src/kudu/tserver/tserver.proto
M src/kudu/tserver/tserver_service.proto
23 files changed, 728 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/06/10406/21
-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 21
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Yao Xu (Code Review)" <ge...@cloudera.org>.

Yao Xu has posted comments on this change. ( http://gerrit.cloudera.org:8080/10406 )

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................


Patch Set 23:

(2 comments)

Hi, Adar Dembo
Thanks for review. I have updated the patch. :)

http://gerrit.cloudera.org:8080/#/c/10406/22/src/kudu/tablet/rowset_info.h
File src/kudu/tablet/rowset_info.h:

http://gerrit.cloudera.org:8080/#/c/10406/22/src/kudu/tablet/rowset_info.h@59
PS22, Line 59:                             Slice start_key,
             :                             Slice stop_key,
> Since you're passing these by copy, no need for 'const'; the caller and cal
Done


http://gerrit.cloudera.org:8080/#/c/10406/22/src/kudu/tablet/tablet.h
File src/kudu/tablet/tablet.h:

http://gerrit.cloudera.org:8080/#/c/10406/22/src/kudu/tablet/tablet.h@441
PS22, Line 441: column_
> Nit: 'column_ids'
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 23
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>
Gerrit-Comment-Date: Tue, 18 Sep 2018 05:27:26 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Yao Xu (Code Review)" <ge...@cloudera.org>.

Hello Dan Burkert, Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10406

to look at the new patch set (#20).

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................

KUDU-2437 Split a tablet into primary key ranges by size

When reading data in a kudu table using spark,
if there is a large amount of data in the tablet,
reading the data takes a long time. The reason
is that KuduRDD uses a tablet to generate the
scanToken, so a spark task needs to process all
the data in a tablet.

TabletServer should provide an RPC interface,
which can be split tablet into multiple primary
key ranges by size. The kudu-client can choose
whether to perform parallel scan according to
the case.

Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
---
M src/kudu/common/CMakeLists.txt
M src/kudu/common/common.proto
A src/kudu/common/key_range.cc
A src/kudu/common/key_range.h
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/rowset_info.cc
M src/kudu/tablet/rowset_info.h
M src/kudu/tablet/tablet-harness.h
M src/kudu/tablet/tablet-test-util.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tablet_service.h
M src/kudu/tserver/tserver.proto
M src/kudu/tserver/tserver_service.proto
23 files changed, 721 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/06/10406/20
-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 20
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Yao Xu (Code Review)" <ge...@cloudera.org>.

Hello Dan Burkert, Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10406

to look at the new patch set (#19).

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................

KUDU-2437 Split a tablet into primary key ranges by size

When reading data in a kudu table using spark,
if there is a large amount of data in the tablet,
reading the data takes a long time. The reason
is that KuduRDD uses a tablet to generate the
scanToken, so a spark task needs to process all
the data in a tablet.

TabletServer should provide an RPC interface,
which can be split tablet into multiple primary
key ranges by size. The kudu-client can choose
whether to perform parallel scan according to
the case.

Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
---
M src/kudu/common/CMakeLists.txt
M src/kudu/common/common.proto
A src/kudu/common/key_range.cc
A src/kudu/common/key_range.h
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/rowset_info.cc
M src/kudu/tablet/rowset_info.h
M src/kudu/tablet/tablet-harness.h
M src/kudu/tablet/tablet-test-util.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tablet_service.h
M src/kudu/tserver/tserver.proto
M src/kudu/tserver/tserver_service.proto
23 files changed, 732 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/06/10406/19
-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 19
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.

Dan Burkert has posted comments on this change. ( http://gerrit.cloudera.org:8080/10406 )

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................


Patch Set 14:

(21 comments)

Nice!  Really like this optimization.

http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/common/key_range.h
File src/kudu/common/key_range.h:

http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/common/key_range.h@17
PS14, Line 17: #ifndef KUDU_COMMON_KEY_RANGE_H
Prefer using '#pragma once'  for new header files.


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/common/key_range.h@30
PS14, Line 30: const std::string& start_key,
             :            const std::string& stop_key,
Take the keys by value and move them into the field.  This allows the caller to potentially save a copy if they move the key into the new range.


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/cfile_set.h
File src/kudu/tablet/cfile_set.h:

http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/cfile_set.h@100
PS14, Line 100:   // Excludes the ad hoc index and bloomfiles.
I think this note about the index and bloomfiles being excluded is obvious in this case, so consider removing it.


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/rowset.h
File src/kudu/tablet/rowset.h:

http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/rowset.h@140
PS14, Line 140:   virtual uint64_t OnDiskBaseDataColumnSize(const ColumnId& col_id) const = 0;
Given that ColumnId is only 4-bytes and trivially copyable its probably better for these APIs to take the ColumnId by value (here and elsewhere).


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/rowset.cc
File src/kudu/tablet/rowset.cc:

http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/rowset.cc@230
PS14, Line 230:   // The actual value of this doesn't matter, since it won't be selected
This comment probably doesn't apply.


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/rowset_info.h
File src/kudu/tablet/rowset_info.h:

http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/rowset_info.h@53
PS14, Line 53:   static void SplitKeyRange(const RowSetTree& tree,
Add doc


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/rowset_info.cc
File src/kudu/tablet/rowset_info.cc:

http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/rowset_info.cc@151
PS14, Line 151: double WidthByDataSize(const Slice& prev, const Slice& next,
Add a comment, such as:

    Computes the the "width" of an interval as above, for the provided columns in the rowsets.


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/rowset_info.cc@152
PS14, Line 152:                        const vector<ColumnId>& col_ids,
Consider adding this parameter last, so that the parameter list of the original method is just a prefix of this ones.


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/rowset_info.cc@158
PS14, Line 158: const auto& col_id
copy by value here as well


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/rowset_info.cc@276
PS14, Line 276:  &
Keep the & with the type


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/rowset_info.cc@279
PS14, Line 279:   // if start_key greater than stop_key then return without split
Does this happen in practice?  Maybe we should make this an invariant that's validated at a higher level and put a CHECK here.


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/rowset_info.cc@283
PS14, Line 283: 
It would be great to add a doc here about the algorithm being used, similar to lines 198 - 212.


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/rowset_info.cc@311
PS14, Line 311:         if (interval_size / 2 > target_chunk_size - chunk_size &&
I expect that in most cases the target chunk size will be much greater than the DRS size (e.g. I'd expect hundreds of MiB or more chunk sizes, and DRSs are capped at 32MiB).

Given that, maybe it makes sense to keep this simple and reset before the cap is exceeded (with the exception that empty chunks).  That way I think you can make this conditional check just 'last_bound != prev'.


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/tablet-test.cc
File src/kudu/tablet/tablet-test.cc:

http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/tablet-test.cc@1159
PS14, Line 1159:   RowSetVector new_rowset = {
It'd be good to add tests for a few more tablet layout scenarios, namely:

- Tablet with 0 rowsets
- Tablet with 1 rowset
- Tablet with non-overlapping rowsets (with and without gaps between)
- Tablet with rowset whose start is the minimum value


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/tablet-test.cc@1176
PS14, Line 1176:     ASSERT_EQ(result.size(), range.size());
               :     for (size_t idx = 0; idx < result.size(); ++idx) {
               :       ASSERT_STREQ(result[idx].start_primary_key().c_str(),
               :                    range[idx].start_primary_key().c_str());
               :       ASSERT_STREQ(result[idx].stop_primary_key().c_str(),
               :                    range[idx].stop_primary_key().c_str());
               :       ASSERT_EQ(result[idx].size_bytes(), range[idx].size_bytes());
               :     }
Consider extracting these checks into a local lambda to reduce the boilerplate:

auto assert_chunks = [] (vector<KeyRange> expected, vector<KeyRange> actual) {
    ...
}


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/tablet-test.cc@1187
PS14, Line 1187:     std::vector<KeyRange> result = {
Perhaps the algorithm should be modified to combine these 0 sized chunks with the adjacent chunk?  I think in a typical application it will be a waste of resources to schedule/execute workers that end up scanning 0 rows.


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tserver/tablet_service.cc@139
PS14, Line 139: DEFINE_int32(split_default_chunk_size_bytes, 256 * 1024 * 1024,
Not sure its a good idea to have a default for this defined by the tablet server.  I imagine the application calling the api will always have a better idea of what chunk size is appropriate.


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tserver/tablet_service.cc@1439
PS14, Line 1439:   }
Validate the start key is less than the stop key, if they are both set (as discussed in previous comment).


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tserver/tablet_service.cc@1441
PS14, Line 1441:   vector<ColumnId> column_ids;
validate that the column IDs are valid according to the tablet schema.


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tserver/tablet_service.cc@1448
PS14, Line 1448:   uint64_t target_chunk_size_bytes = req->has_target_chunk_size_bytes() ?
I'd just return an error here.  I'd prefer to be a little more restrictive at first with this API, since we can't later remove this fallback.


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tserver/tserver.proto
File src/kudu/tserver/tserver.proto:

http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tserver/tserver.proto@388
PS14, Line 388: // A split key range request. Split tablet to key ranges, the request
s/to key/into primary key



-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 14
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>
Gerrit-Comment-Date: Fri, 22 Jun 2018 18:21:03 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Yao Xu (Code Review)" <ge...@cloudera.org>.

Hello Dan Burkert, Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10406

to look at the new patch set (#16).

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................

KUDU-2437 Split a tablet into primary key ranges by size

When reading data in a kudu table using spark,
if there is a large amount of data in the tablet,
reading the data takes a long time. The reason
is that KuduRDD uses a tablet to generate the
scanToken, so a spark task needs to process all
the data in a tablet.

TabletServer should provide an RPC interface,
which can be split tablet into multiple primary
key ranges by size. The kudu-client can choose
whether to perform parallel scan according to
the case.

Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
---
M src/kudu/common/CMakeLists.txt
M src/kudu/common/common.proto
A src/kudu/common/key_range.cc
A src/kudu/common/key_range.h
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/rowset_info.cc
M src/kudu/tablet/rowset_info.h
M src/kudu/tablet/tablet-harness.h
M src/kudu/tablet/tablet-test-util.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tablet_service.h
M src/kudu/tserver/tserver.proto
M src/kudu/tserver/tserver_service.proto
23 files changed, 731 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/06/10406/16
-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 16
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Yao Xu (Code Review)" <ge...@cloudera.org>.

Yao Xu has posted comments on this change. ( http://gerrit.cloudera.org:8080/10406 )

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................


Patch Set 22:

> (8 comments)
 > 
 > I've got a high-level question about the kind of parallelism you're
 > seeking.
 > 
 > If I'm reading the patch correctly, the tablet chunk size input is
 > the encoded and compressed base data size of a particular query's
 > projected columns. This is effectively chunking by the amount of IO
 > the query is expected to need, ignoring deltas because it's hard to
 > calculate their impact without reading them all from disk.
 > 
 > Why did you choose this particular input? How is it better for a
 > client like Spark than, say, a chunking approach that produced
 > equally sized result sets (which would need to consider
 > decoded/decompressed data, plus perhaps input from 
 > MemRowSet/DeltaMemStores, and maybe delta files too)?

Sorry for late reply. I think it's better that use the decoded/decompressed data size. But the chunking approach costs a lot. So, I use the encoded/compressed data size to chunking.


-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 22
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>
Gerrit-Comment-Date: Sun, 16 Sep 2018 11:16:12 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Yao Xu (Code Review)" <ge...@cloudera.org>.

Hello Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10406

to look at the new patch set (#10).

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................

KUDU-2437 Split a tablet into primary key ranges by size

When reading data in a kudu table using spark,
if there is a large amount of data in the tablet,
reading the data takes a long time. The reason
is that KuduRDD uses a tablet to generate the
scanToken, so a spark task needs to process all
the data in a tablet.

TabletServer should provide an RPC interface,
which can be split tablet into multiple primary
key ranges by size. The kudu-client can choose
whether to perform parallel scan according to
the case.

Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
---
M src/kudu/common/CMakeLists.txt
M src/kudu/common/common.proto
A src/kudu/common/key_range.cc
A src/kudu/common/key_range.h
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/rowset_info.cc
M src/kudu/tablet/rowset_info.h
M src/kudu/tablet/tablet-harness.h
M src/kudu/tablet/tablet-test-util.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tablet_service.h
M src/kudu/tserver/tserver.proto
M src/kudu/tserver/tserver_service.proto
23 files changed, 607 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/06/10406/10
-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 10
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.

Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/10406 )

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................


Patch Set 23: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 23
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>
Gerrit-Comment-Date: Tue, 18 Sep 2018 17:02:58 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Yao Xu (Code Review)" <ge...@cloudera.org>.

Hello Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10406

to look at the new patch set (#14).

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................

KUDU-2437 Split a tablet into primary key ranges by size

When reading data in a kudu table using spark,
if there is a large amount of data in the tablet,
reading the data takes a long time. The reason
is that KuduRDD uses a tablet to generate the
scanToken, so a spark task needs to process all
the data in a tablet.

TabletServer should provide an RPC interface,
which can be split tablet into multiple primary
key ranges by size. The kudu-client can choose
whether to perform parallel scan according to
the case.

Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
---
M src/kudu/common/CMakeLists.txt
M src/kudu/common/common.proto
A src/kudu/common/key_range.cc
A src/kudu/common/key_range.h
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/rowset_info.cc
M src/kudu/tablet/rowset_info.h
M src/kudu/tablet/tablet-harness.h
M src/kudu/tablet/tablet-test-util.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tablet_service.h
M src/kudu/tserver/tserver.proto
M src/kudu/tserver/tserver_service.proto
23 files changed, 608 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/06/10406/14
-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 14
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>

[kudu-CR] KUDU-2437 Generate ScanToken from small chunks in tablet

Posted by "Xu Yao (Code Review)" <ge...@cloudera.org>.

Hello Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10406

to look at the new patch set (#3).

Change subject: KUDU-2437 Generate ScanToken from small chunks in tablet
......................................................................

KUDU-2437 Generate ScanToken from small chunks in tablet

When reading data in a kudu table using spark, if there is a
large amount of data in the tablet, reading the data takes a long time.

The reason is that KuduRDD uses a tablet to generate
the scanToken, so a spark task needs to process all the data in a tablet. So:

1. Split the Tablet into many small chunks by some primary keys
2. Report the primary keys to Master
3. Client get the primary keys from Master, and generate the scanToken

Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
---
M java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java
M java/kudu-client/src/main/java/org/apache/kudu/client/LocatedTablet.java
M java/kudu-client/src/main/java/org/apache/kudu/client/RemoteTablet.java
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/tablet/rowset_tree-test.cc
M src/kudu/tablet/rowset_tree.cc
M src/kudu/tablet/rowset_tree.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tserver/heartbeater.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
16 files changed, 309 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/06/10406/3
-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 3
Gerrit-Owner: Xu Yao <oc...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Yao Xu (Code Review)" <ge...@cloudera.org>.

Hello Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10406

to look at the new patch set (#5).

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................

KUDU-2437 Split a tablet into primary key ranges by size

When reading data in a kudu table using spark,
if there is a large amount of data in the tablet,
reading the data takes a long time. The reason
is that KuduRDD uses a tablet to generate the
scanToken, so a spark task needs to process all
the data in a tablet.

TabletServer should provide an RPC interface,
which can be split tablet into multiple primary
key ranges by size. The kudu-client can choose
whether to perform parallel scan according to
the case.

Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
---
M src/kudu/common/CMakeLists.txt
M src/kudu/common/common.proto
A src/kudu/common/key_range.cc
A src/kudu/common/key_range.h
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/rowset_info.cc
M src/kudu/tablet/rowset_info.h
M src/kudu/tablet/tablet-harness.h
M src/kudu/tablet/tablet-test-util.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tablet_service.h
M src/kudu/tserver/tserver.proto
M src/kudu/tserver/tserver_service.proto
23 files changed, 598 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/06/10406/5
-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 5
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.

Adar Dembo has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10406 )

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................

KUDU-2437 Split a tablet into primary key ranges by size

When reading data in a kudu table using spark,
if there is a large amount of data in the tablet,
reading the data takes a long time. The reason
is that KuduRDD uses a tablet to generate the
scanToken, so a spark task needs to process all
the data in a tablet.

TabletServer should provide an RPC interface,
which can be split tablet into multiple primary
key ranges by size. The kudu-client can choose
whether to perform parallel scan according to
the case.

Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Reviewed-on: http://gerrit.cloudera.org:8080/10406
Tested-by: Kudu Jenkins
Reviewed-by: Adar Dembo <ad...@cloudera.com>
---
M src/kudu/common/CMakeLists.txt
M src/kudu/common/common.proto
A src/kudu/common/key_range.cc
A src/kudu/common/key_range.h
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/rowset_info.cc
M src/kudu/tablet/rowset_info.h
M src/kudu/tablet/tablet-harness.h
M src/kudu/tablet/tablet-test-util.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tablet_service.h
M src/kudu/tserver/tserver.proto
M src/kudu/tserver/tserver_service.proto
23 files changed, 734 insertions(+), 3 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Adar Dembo: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 24
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>

[kudu-CR] KUDU-2437 Generate ScanToken from small chunks in tablet

Posted by "Yao Xu (Code Review)" <ge...@cloudera.org>.

Yao Xu has posted comments on this change. ( http://gerrit.cloudera.org:8080/10406 )

Change subject: KUDU-2437 Generate ScanToken from small chunks in tablet
......................................................................


Patch Set 4:

Thank Todd for the question pointed out, we will modify the patch according to your design.


-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 4
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>
Gerrit-Comment-Date: Fri, 18 May 2018 06:39:17 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Yao Xu (Code Review)" <ge...@cloudera.org>.

Yao Xu has posted comments on this change. ( http://gerrit.cloudera.org:8080/10406 )

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................


Patch Set 19:

(21 comments)

Thanks for review, I already updated the patch.

http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/common/key_range.h
File src/kudu/common/key_range.h:

http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/common/key_range.h@17
PS14, Line 17: #pragma once
> Prefer using '#pragma once'  for new header files.
Done


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/common/key_range.h@30
PS14, Line 30: const std::string& start_key,
             :            const std::string& stop_key,
> Take the keys by value and move them into the field.  This allows the calle
Done


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/cfile_set.h
File src/kudu/tablet/cfile_set.h:

http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/cfile_set.h@100
PS14, Line 100:   uint64_t OnDiskColumnDataSize(const ColumnId& col_id) const;
> I think this note about the index and bloomfiles being excluded is obvious 
Done


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/rowset.h
File src/kudu/tablet/rowset.h:

http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/rowset.h@140
PS14, Line 140:   virtual uint64_t OnDiskBaseDataColumnSize(const ColumnId& col_id) const = 0;
> Given that ColumnId is only 4-bytes and trivially copyable its
 > probably better for these APIs to take the ColumnId by value (here
 > and elsewhere).

Is the ColumnId parameter to be passed as a value?


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/rowset.cc
File src/kudu/tablet/rowset.cc:

http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/rowset.cc@230
PS14, Line 230:   uint64_t size = 0;
> This comment probably doesn't apply.
Done


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/rowset_info.h
File src/kudu/tablet/rowset_info.h:

http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/rowset_info.h@53
PS14, Line 53:   // Split [start_key, stop_key) into primary key ranges by chunk size
> Add doc
Done


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/rowset_info.cc
File src/kudu/tablet/rowset_info.cc:

http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/rowset_info.cc@151
PS14, Line 151: // Computes the "width" of an interval as above, for the provided columns in the rowsets.
> Add a comment, such as:
Done


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/rowset_info.cc@152
PS14, Line 152: double WidthByDataSize(const Slice& prev, const Slice& next,
> Consider adding this parameter last, so that the parameter list of the orig
Done


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/rowset_info.cc@158
PS14, Line 158: e fraction = Strin
> copy by value here as well
Done


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/rowset_info.cc@276
PS14, Line 276: 
> Keep the & with the type
Done


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/rowset_info.cc@279
PS14, Line 279:                                vector<KeyRange>* ranges) {
> Does this happen in practice?  Maybe we should make this an invariant that'
Done


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/rowset_info.cc@283
PS14, Line 283:   }
> It would be great to add a doc here about the algorithm being used, similar
Done


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/rowset_info.cc@311
PS14, Line 311:     if (prev.compare(next) < 0) {
> I expect that in most cases the target chunk size will be much
 > greater than the DRS size (e.g. I'd expect hundreds of MiB or more
 > chunk sizes, and DRSs are capped at 32MiB).
 > 
 > Given that, maybe it makes sense to keep this simple and reset
 > before the cap is exceeded (with the exception that empty chunks). 
 > That way I think you can make this conditional check just
 > 'last_bound != prev'.

I want KeyRange size to be as close as possible to the target size. Let me think about it.


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/tablet-test.cc
File src/kudu/tablet/tablet-test.cc:

http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/tablet-test.cc@1159
PS14, Line 1159:   }
> It'd be good to add tests for a few more tablet layout scenarios, namely:
Done


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/tablet-test.cc@1176
PS14, Line 1176:   {
               :     std::vector<KeyRange> result = {
               :         KeyRange("", "2", 2000),
               :         KeyRange("2", "5", 6000),
               :         KeyRange("5", "6", 2000),
               :         KeyRange("6", "", 3000)
               :     };
               :     s
> Consider extracting these checks into a local lambda to reduce the boilerpl
Done


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/tablet-test.cc@1187
PS14, Line 1187:   }
> Perhaps the algorithm should be modified to combine these 0 sized
 > chunks with the adjacent chunk?  I think in a typical application
 > it will be a waste of resources to schedule/execute workers that
 > end up scanning 0 rows.

I want to ensure that the size of KeyRange is close to the target chunk size, so keep the empty KeyRange, because there may be data in this interval in MemRowSet.


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tserver/tablet_service.cc@139
PS14, Line 139: using google::protobuf::RepeatedPtrField;
> Not sure its a good idea to have a default for this defined by the tablet s
Done


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tserver/tablet_service.cc@1439
PS14, Line 1439:     if (start->encoded_key().compare(stop->encoded_key()) > 0) {
> Validate the start key is less than the stop key, if they are both set (as 
Done


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tserver/tablet_service.cc@1441
PS14, Line 1441:                            Status::IllegalState("Invalid primary key range"),
> validate that the column IDs are valid according to the tablet schema.
Done


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tserver/tablet_service.cc@1448
PS14, Line 1448:   // Get ColumnId from request
> I'd just return an error here.  I'd prefer to be a little more restrictive 
Done


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tserver/tserver.proto
File src/kudu/tserver/tserver.proto:

http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tserver/tserver.proto@388
PS14, Line 388: // A split key range request. Split tablet to key ranges, the request
> s/to key/into primary key
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 19
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>
Gerrit-Comment-Date: Mon, 09 Jul 2018 15:14:20 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.

Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/10406 )

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................


Patch Set 21:

(8 comments)

I've got a high-level question about the kind of parallelism you're seeking.

If I'm reading the patch correctly, the tablet chunk size input is the encoded and compressed base data size of a particular query's projected columns. This is effectively chunking by the amount of IO the query is expected to need, ignoring deltas because it's hard to calculate their impact without reading them all from disk.

Why did you choose this particular input? How is it better for a client like Spark than, say, a chunking approach that produced equally sized result sets (which would need to consider decoded/decompressed data, plus perhaps input from  MemRowSet/DeltaMemStores, and maybe delta files too)?

http://gerrit.cloudera.org:8080/#/c/10406/21/src/kudu/common/key_range.h
File src/kudu/common/key_range.h:

http://gerrit.cloudera.org:8080/#/c/10406/21/src/kudu/common/key_range.h@29
PS21, Line 29: public:
Nit: indent this by one character. Same for "private".


http://gerrit.cloudera.org:8080/#/c/10406/21/src/kudu/common/key_range.h@54
PS21, Line 54:     std::string start_key_;
             :     std::string stop_key_;
             :     uint64_t size_bytes_;
Nit: indentation should match that of methods (i.e. two characters).


http://gerrit.cloudera.org:8080/#/c/10406/21/src/kudu/tablet/rowset_info.h
File src/kudu/tablet/rowset_info.h:

http://gerrit.cloudera.org:8080/#/c/10406/21/src/kudu/tablet/rowset_info.h@53
PS21, Line 53:   // Split [start_key, stop_key) into primary key ranges by chunk size.
Doc what effect col_ids has on the operation.


http://gerrit.cloudera.org:8080/#/c/10406/21/src/kudu/tablet/rowset_info.h@55
PS21, Line 55:                             const Slice& start_key,
             :                             const Slice& stop_key,
Nit: because they're small, we generally pass Slice by copy rather than by reference.


http://gerrit.cloudera.org:8080/#/c/10406/21/src/kudu/tablet/rowset_info.h@57
PS21, Line 57:                             const std::vector<ColumnId> &col_ids,
Nit: "const Foo&" (rather than "const Foo &").


http://gerrit.cloudera.org:8080/#/c/10406/21/src/kudu/tablet/rowset_info.cc
File src/kudu/tablet/rowset_info.cc:

http://gerrit.cloudera.org:8080/#/c/10406/21/src/kudu/tablet/rowset_info.cc@305
PS21, Line 305: RowSetTree::RSEndpoint
Nit: can use auto here.


http://gerrit.cloudera.org:8080/#/c/10406/21/src/kudu/tablet/rowset_info.cc@369
PS21, Line 369: uint64_t RowSetInfo::size_bytes(const ColumnId& col_id) const {
              :   return extra_->rowset->OnDiskBaseDataColumnSize(col_id);
              : }
Could this be inlined in the header like size_bytes()?


http://gerrit.cloudera.org:8080/#/c/10406/21/src/kudu/tablet/tablet.h
File src/kudu/tablet/tablet.h:

http://gerrit.cloudera.org:8080/#/c/10406/21/src/kudu/tablet/tablet.h@435
PS21, Line 435:   void SplitKeyRange(const EncodedKey* start_key,
Please document the effect of this function.



-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 21
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>
Gerrit-Comment-Date: Tue, 17 Jul 2018 18:47:29 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Yao Xu (Code Review)" <ge...@cloudera.org>.

Hello Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10406

to look at the new patch set (#13).

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................

KUDU-2437 Split a tablet into primary key ranges by size

When reading data in a kudu table using spark,
if there is a large amount of data in the tablet,
reading the data takes a long time. The reason
is that KuduRDD uses a tablet to generate the
scanToken, so a spark task needs to process all
the data in a tablet.

TabletServer should provide an RPC interface,
which can be split tablet into multiple primary
key ranges by size. The kudu-client can choose
whether to perform parallel scan according to
the case.

Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
---
M src/kudu/common/CMakeLists.txt
M src/kudu/common/common.proto
A src/kudu/common/key_range.cc
A src/kudu/common/key_range.h
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/rowset_info.cc
M src/kudu/tablet/rowset_info.h
M src/kudu/tablet/tablet-harness.h
M src/kudu/tablet/tablet-test-util.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tablet_service.h
M src/kudu/tserver/tserver.proto
M src/kudu/tserver/tserver_service.proto
23 files changed, 608 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/06/10406/13
-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 13
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Yao Xu (Code Review)" <ge...@cloudera.org>.

Hello Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10406

to look at the new patch set (#6).

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................

KUDU-2437 Split a tablet into primary key ranges by size

When reading data in a kudu table using spark,
if there is a large amount of data in the tablet,
reading the data takes a long time. The reason
is that KuduRDD uses a tablet to generate the
scanToken, so a spark task needs to process all
the data in a tablet.

TabletServer should provide an RPC interface,
which can be split tablet into multiple primary
key ranges by size. The kudu-client can choose
whether to perform parallel scan according to
the case.

Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
---
M src/kudu/common/CMakeLists.txt
M src/kudu/common/common.proto
A src/kudu/common/key_range.cc
A src/kudu/common/key_range.h
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/rowset_info.cc
M src/kudu/tablet/rowset_info.h
M src/kudu/tablet/tablet-harness.h
M src/kudu/tablet/tablet-test-util.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tablet_service.h
M src/kudu/tserver/tserver.proto
M src/kudu/tserver/tserver_service.proto
23 files changed, 598 insertions(+), 2 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/06/10406/6
-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 6
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>

[kudu-CR] KUDU-2437 Generate ScanToken from small chunks in tablet

Posted by "Xu Yao (Code Review)" <ge...@cloudera.org>.

Hello Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10406

to look at the new patch set (#4).

Change subject: KUDU-2437 Generate ScanToken from small chunks in tablet
......................................................................

KUDU-2437 Generate ScanToken from small chunks in tablet

When reading data in a kudu table using spark, if there is a
large amount of data in the tablet, reading the data takes a long time.

The reason is that KuduRDD uses a tablet to generate
the scanToken, so a spark task needs to process all the data in a tablet. So:

1. TS report the DRS bounds info to Master
2. Client get the bounds info from Master
3. Client generate the scanToken by bounds info of tablet（set LowerBoundPrimaryKey and UpperBoundPrimaryKey）

Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
---
M java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java
M java/kudu-client/src/main/java/org/apache/kudu/client/LocatedTablet.java
M java/kudu-client/src/main/java/org/apache/kudu/client/RemoteTablet.java
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/tablet/rowset_tree-test.cc
M src/kudu/tablet/rowset_tree.cc
M src/kudu/tablet/rowset_tree.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tserver/heartbeater.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
16 files changed, 310 insertions(+), 10 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/06/10406/4
-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 4
Gerrit-Owner: Xu Yao <oc...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Yao Xu (Code Review)" <ge...@cloudera.org>.

Hello Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10406

to look at the new patch set (#9).

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................

KUDU-2437 Split a tablet into primary key ranges by size

When reading data in a kudu table using spark,
if there is a large amount of data in the tablet,
reading the data takes a long time. The reason
is that KuduRDD uses a tablet to generate the
scanToken, so a spark task needs to process all
the data in a tablet.

TabletServer should provide an RPC interface,
which can be split tablet into multiple primary
key ranges by size. The kudu-client can choose
whether to perform parallel scan according to
the case.

Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
---
M src/kudu/common/CMakeLists.txt
M src/kudu/common/common.proto
A src/kudu/common/key_range.cc
A src/kudu/common/key_range.h
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/rowset_info.cc
M src/kudu/tablet/rowset_info.h
M src/kudu/tablet/tablet-harness.h
M src/kudu/tablet/tablet-test-util.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tablet_service.h
M src/kudu/tserver/tserver.proto
M src/kudu/tserver/tserver_service.proto
23 files changed, 607 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/06/10406/9
-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 9
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Yao Xu (Code Review)" <ge...@cloudera.org>.

Hello Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10406

to look at the new patch set (#7).

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................

KUDU-2437 Split a tablet into primary key ranges by size

When reading data in a kudu table using spark,
if there is a large amount of data in the tablet,
reading the data takes a long time. The reason
is that KuduRDD uses a tablet to generate the
scanToken, so a spark task needs to process all
the data in a tablet.

TabletServer should provide an RPC interface,
which can be split tablet into multiple primary
key ranges by size. The kudu-client can choose
whether to perform parallel scan according to
the case.

Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
---
M src/kudu/common/CMakeLists.txt
M src/kudu/common/common.proto
A src/kudu/common/key_range.cc
A src/kudu/common/key_range.h
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/rowset_info.cc
M src/kudu/tablet/rowset_info.h
M src/kudu/tablet/tablet-harness.h
M src/kudu/tablet/tablet-test-util.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tablet_service.h
M src/kudu/tserver/tserver.proto
M src/kudu/tserver/tserver_service.proto
23 files changed, 602 insertions(+), 5 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/06/10406/7
-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 7
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>

[kudu-CR] KUDU-2437 Generate ScanToken from small chunks in tablet

Posted by "Xu Yao (Code Review)" <ge...@cloudera.org>.

Hello Kudu Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10406

to look at the new patch set (#2).

Change subject: KUDU-2437 Generate ScanToken from small chunks in tablet
......................................................................

KUDU-2437 Generate ScanToken from small chunks in tablet

When reading data in a kudu table using spark, if there is a
large amount of data in the tablet, reading the data takes a long time.

The reason is that KuduRDD uses a tablet to generate
the scanToken, so a spark task needs to process all the data in a tablet. So:

1. Split the Tablet into many small chunks by some primary keys
2. Report the primary keys to Master
3. Client get the primary keys from Master, and generate the scanToken

Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
---
M java/kudu-client/src/main/java/org/apache/kudu/client/KuduScanToken.java
M java/kudu-client/src/main/java/org/apache/kudu/client/LocatedTablet.java
M java/kudu-client/src/main/java/org/apache/kudu/client/RemoteTablet.java
M java/kudu-spark/src/main/scala/org/apache/kudu/spark/kudu/KuduRDD.scala
M src/kudu/master/catalog_manager.cc
M src/kudu/master/catalog_manager.h
M src/kudu/master/master.proto
M src/kudu/master/master_service.cc
M src/kudu/tablet/rowset_tree-test.cc
M src/kudu/tablet/rowset_tree.cc
M src/kudu/tablet/rowset_tree.h
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tserver/heartbeater.cc
M src/kudu/tserver/ts_tablet_manager.cc
M src/kudu/tserver/ts_tablet_manager.h
16 files changed, 306 insertions(+), 8 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/06/10406/2
-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 2
Gerrit-Owner: Xu Yao <oc...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins

Re: [kudu-CR] KUDU-2437 Generate ScanToken from small chunks in tablet

Posted by 徐瑶 <oc...@gmail.com>.

Hi Todd,

Thank you for the question pointed out, we will modify the patch according
to your design.

Yao

2018-05-18 1:13 GMT+08:00 Todd Lipcon (Code Review) <ge...@cloudera.org>:

> Todd Lipcon *posted comments* on this change.
>
> View Change <http://gerrit.cloudera.org:8080/10406>
>
> Patch set 4:
>
> Hi Xu. Thanks for working on this! It will be really great to decouple the
> available parallelism from the number of partitions.
>
> I have a couple concerns about the design in this patch, though.
>
> 1) periodic reporting of the bounds to the master: for our eventual
> scalability goals of thousands of nodes, each with thousands of tablets, I
> think this may be too much information to try to keep in memory on the
> master and to report on a periodic basis. Especially considering that many
> tablets may be cold (bounds not changing), it's a lot of redundant
> processing. Additionally, with it kept only in memory (not persisted) this
> means that shortly following a restart or failover, the master will not
> have the chunk bounds populated and the query plans submitted by users may
> change dramatically.
>
> 2) including the bounds in every GetTabletLocations call is also somewhat
> expensive from a size/performance perspective. In the case of writes or
> scans, they aren't needed. It would be better to avoid the expense for
> these common cases and only worry about the bounds for the specific use
> case you are targeting (generating splits for tasks)
>
>
> 3) from a security perspective, I'm not sure it's safe to include the
> bounds information in the 'getTabletLocations' RPC response. When we start
> adding more fine-grained authorization it's likely that we will want to
> provide location information to administrators/tools (eg to facilitate
> balancing or other ops workflows) but those administrators may not have
> access to read the underlying data. The bounds themselves can likely expose
> user data (eg imagine that the primary key is an email address or
> government ID number that should not be exposed)
>
>
> I'd like to consider an alternate design in which we add a new RPC to the
> tablet server itself, something like:
>
> message SplitKeyRangeRequest {
>   bytes tablet_id
>   bytes start_pk
>   bytes end_pk
>
>   // Number of bytes to try to return in each chunk. This is a hint.
>   // The tablet server may return chunks larger or smaller than this value.
>   int64 target_chunk_size_bytes
>
>   // The columns to consider when chunking.
>   // If specified, then the size estimate used for 'target_chunk_size_bytes'
>   // should only include these columns. This can be used if a query will
>   // only scan a certain subset of the columns.
>   repeated int32 col_ids
> }
>
> (perhaps the 'col_ids' feature could be left out initially)
>
> message SplitKeyRangeResponse {
>   repeated bytes split_pks
>   repeated int64 size_estimates
> }
>
> The advantages I see for this design are:
>
>    - no need to centralize the split metadata (so we can be more
>    fine-grained without worrying about master load or memory)
>    - for the purposes of locality scheduling, we only need the partition
>    locations, so the "planner" (spark driver or impala front-end) doesn't care
>    about the sub-partition splits. The individual tasks can then contact their
>    local tserver to further sub-divide the range
>    - on-demand computation of the splits means we will be able to do a
>    better job of lazily opening cold tablets later.
>
> What do you think?
>
>
> To view, visit change 10406 <http://gerrit.cloudera.org:8080/10406>. To
> unsubscribe, visit settings <http://gerrit.cloudera.org:8080/settings>.
> Gerrit-Project: kudu
> Gerrit-Branch: master
> Gerrit-MessageType: comment
> Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
> Gerrit-Change-Number: 10406
> Gerrit-PatchSet: 4
> Gerrit-Owner: Xu Yao <oc...@gmail.com>
> Gerrit-Reviewer: Kudu Jenkins
> Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
> Gerrit-Comment-Date: Thu, 17 May 2018 17:13:07 +0000
> Gerrit-HasComments: No
>

[kudu-CR] KUDU-2437 Generate ScanToken from small chunks in tablet

Posted by "Todd Lipcon (Code Review)" <ge...@cloudera.org>.

Todd Lipcon has posted comments on this change. ( http://gerrit.cloudera.org:8080/10406 )

Change subject: KUDU-2437 Generate ScanToken from small chunks in tablet
......................................................................

Patch Set 4:

Hi Xu. Thanks for working on this! It will be really great to decouple the available parallelism from the number of partitions.

I have a couple concerns about the design in this patch, though.

1) periodic reporting of the bounds to the master: for our eventual scalability goals of thousands of nodes, each with thousands of tablets, I think this may be too much information to try to keep in memory on the master and to report on a periodic basis. Especially considering that many tablets may be cold (bounds not changing), it's a lot of redundant processing. Additionally, with it kept only in memory (not persisted) this means that shortly following a restart or failover, the master will not have the chunk bounds populated and the query plans submitted by users may change dramatically.

2) including the bounds in every GetTabletLocations call is also somewhat expensive from a size/performance perspective. In the case of writes or scans, they aren't needed. It would be better to avoid the expense for these common cases and only worry about the bounds for the specific use case you are targeting (generating splits for tasks)

3) from a security perspective, I'm not sure it's safe to include the bounds information in the 'getTabletLocations' RPC response. When we start adding more fine-grained authorization it's likely that we will want to provide location information to administrators/tools (eg to facilitate balancing or other ops workflows) but those administrators may not have access to read the underlying data. The bounds themselves can likely expose user data (eg imagine that the primary key is an email address or government ID number that should not be exposed)

I'd like to consider an alternate design in which we add a new RPC to the tablet server itself, something like:

message SplitKeyRangeRequest {
bytes tablet_id
bytes start_pk
bytes end_pk

// Number of bytes to try to return in each chunk. This is a hint.
// The tablet server may return chunks larger or smaller than this value.
int64 target_chunk_size_bytes

// The columns to consider when chunking.
// If specified, then the size estimate used for 'target_chunk_size_bytes'
// should only include these columns. This can be used if a query will
// only scan a certain subset of the columns.
repeated int32 col_ids
}

(perhaps the 'col_ids' feature could be left out initially)

message SplitKeyRangeResponse {
repeated bytes split_pks
repeated int64 size_estimates
}

The advantages I see for this design are:
- no need to centralize the split metadata (so we can be more fine-grained without worrying about master load or memory)
- for the purposes of locality scheduling, we only need the partition locations, so the "planner" (spark driver or impala front-end) doesn't care about the sub-partition splits. The individual tasks can then contact their local tserver to further sub-divide the range
- on-demand computation of the splits means we will be able to do a better job of lazily opening cold tablets later.

What do you think?

--
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 4
Gerrit-Owner: Xu Yao <oc...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Comment-Date: Thu, 17 May 2018 17:13:07 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Yao Xu (Code Review)" <ge...@cloudera.org>.

Yao Xu has posted comments on this change. ( http://gerrit.cloudera.org:8080/10406 )

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................


Patch Set 22:

(10 comments)

> (2 comments)
 > 
 > This is looking really good!  Just a few nits left.

Sorry for late reply. Thanks for review.

http://gerrit.cloudera.org:8080/#/c/10406/21/src/kudu/common/key_range.h
File src/kudu/common/key_range.h:

http://gerrit.cloudera.org:8080/#/c/10406/21/src/kudu/common/key_range.h@29
PS21, Line 29:  public:
> Nit: indent this by one character. Same for "private".
Done


http://gerrit.cloudera.org:8080/#/c/10406/21/src/kudu/common/key_range.h@54
PS21, Line 54:   std::string start_key_;
             :   std::string stop_key_;
             :   uint64_t size_bytes_;
> Nit: indentation should match that of methods (i.e. two characters).
Done


http://gerrit.cloudera.org:8080/#/c/10406/21/src/kudu/tablet/rowset_info.h
File src/kudu/tablet/rowset_info.h:

http://gerrit.cloudera.org:8080/#/c/10406/21/src/kudu/tablet/rowset_info.h@53
PS21, Line 53:   // Split [start_key, stop_key) into primary key ranges by chunk size.
> Doc what effect col_ids has on the operation.
Done


http://gerrit.cloudera.org:8080/#/c/10406/21/src/kudu/tablet/rowset_info.h@55
PS21, Line 55:   // If col_ids specified, then the size estimate used for 'target_chunk_size'
             :   // should only include these columns. This can b
> Nit: because they're small, we generally pass Slice by copy rather than by 
Done


http://gerrit.cloudera.org:8080/#/c/10406/21/src/kudu/tablet/rowset_info.h@57
PS21, Line 57:   // only scan a certain subset of the columns.
> Nit: "const Foo&" (rather than "const Foo &").
Done


http://gerrit.cloudera.org:8080/#/c/10406/21/src/kudu/tablet/rowset_info.cc
File src/kudu/tablet/rowset_info.cc:

http://gerrit.cloudera.org:8080/#/c/10406/21/src/kudu/tablet/rowset_info.cc@305
PS21, Line 305: auto& rse : tree.key_e
> Nit: can use auto here.
Done


http://gerrit.cloudera.org:8080/#/c/10406/21/src/kudu/tablet/rowset_info.cc@369
PS21, Line 369: uint64_t RowSetInfo::size_bytes(const ColumnId& col_id) const {
              :   return extra_->rowset->OnDiskBaseDataColumnSize(col_id);
              : }
> Could this be inlined in the header like size_bytes()?
OnDiskBaseDataColumnSize is a virtual function, so I think size_bytes(const ColumnId&) can't be inlined.


http://gerrit.cloudera.org:8080/#/c/10406/21/src/kudu/tablet/tablet.h
File src/kudu/tablet/tablet.h:

http://gerrit.cloudera.org:8080/#/c/10406/21/src/kudu/tablet/tablet.h@435
PS21, Line 435: 
> Please document the effect of this function.
Done


http://gerrit.cloudera.org:8080/#/c/10406/21/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

http://gerrit.cloudera.org:8080/#/c/10406/21/src/kudu/tserver/tablet_service.cc@1422
PS21, Line 1422:                            context);
> INVALID_CONFIG is used for a specific Raft algorithm issue, I think UNKNOWN
Done


http://gerrit.cloudera.org:8080/#/c/10406/21/src/kudu/tserver/tablet_service.cc@1469
PS21, Line 1469: uta
> use the Schema::kColumnNotFound constant here
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 22
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>
Gerrit-Comment-Date: Sun, 16 Sep 2018 10:54:43 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Yao Xu (Code Review)" <ge...@cloudera.org>.

Hello Dan Burkert, Kudu Jenkins, Adar Dembo, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10406

to look at the new patch set (#22).

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................

KUDU-2437 Split a tablet into primary key ranges by size

When reading data in a kudu table using spark,
if there is a large amount of data in the tablet,
reading the data takes a long time. The reason
is that KuduRDD uses a tablet to generate the
scanToken, so a spark task needs to process all
the data in a tablet.

TabletServer should provide an RPC interface,
which can be split tablet into multiple primary
key ranges by size. The kudu-client can choose
whether to perform parallel scan according to
the case.

Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
---
M src/kudu/common/CMakeLists.txt
M src/kudu/common/common.proto
A src/kudu/common/key_range.cc
A src/kudu/common/key_range.h
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/rowset_info.cc
M src/kudu/tablet/rowset_info.h
M src/kudu/tablet/tablet-harness.h
M src/kudu/tablet/tablet-test-util.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tablet_service.h
M src/kudu/tserver/tserver.proto
M src/kudu/tserver/tserver_service.proto
23 files changed, 734 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/06/10406/22
-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 22
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Yao Xu (Code Review)" <ge...@cloudera.org>.

Hello Dan Burkert, Kudu Jenkins, Adar Dembo, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10406

to look at the new patch set (#23).

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................

KUDU-2437 Split a tablet into primary key ranges by size

When reading data in a kudu table using spark,
if there is a large amount of data in the tablet,
reading the data takes a long time. The reason
is that KuduRDD uses a tablet to generate the
scanToken, so a spark task needs to process all
the data in a tablet.

TabletServer should provide an RPC interface,
which can be split tablet into multiple primary
key ranges by size. The kudu-client can choose
whether to perform parallel scan according to
the case.

Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
---
M src/kudu/common/CMakeLists.txt
M src/kudu/common/common.proto
A src/kudu/common/key_range.cc
A src/kudu/common/key_range.h
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/rowset_info.cc
M src/kudu/tablet/rowset_info.h
M src/kudu/tablet/tablet-harness.h
M src/kudu/tablet/tablet-test-util.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tablet_service.h
M src/kudu/tserver/tserver.proto
M src/kudu/tserver/tserver_service.proto
23 files changed, 734 insertions(+), 3 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/06/10406/23
-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 23
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Yao Xu (Code Review)" <ge...@cloudera.org>.

Hello Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10406

to look at the new patch set (#8).

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................

KUDU-2437 Split a tablet into primary key ranges by size

When reading data in a kudu table using spark,
if there is a large amount of data in the tablet,
reading the data takes a long time. The reason
is that KuduRDD uses a tablet to generate the
scanToken, so a spark task needs to process all
the data in a tablet.

TabletServer should provide an RPC interface,
which can be split tablet into multiple primary
key ranges by size. The kudu-client can choose
whether to perform parallel scan according to
the case.

Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
---
M src/kudu/common/CMakeLists.txt
M src/kudu/common/common.proto
A src/kudu/common/key_range.cc
A src/kudu/common/key_range.h
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/rowset_info.cc
M src/kudu/tablet/rowset_info.h
M src/kudu/tablet/tablet-harness.h
M src/kudu/tablet/tablet-test-util.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tablet_service.h
M src/kudu/tserver/tserver.proto
M src/kudu/tserver/tserver_service.proto
23 files changed, 606 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/06/10406/8
-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 8
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Yao Xu (Code Review)" <ge...@cloudera.org>.

Hello Dan Burkert, Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10406

to look at the new patch set (#18).

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................

KUDU-2437 Split a tablet into primary key ranges by size

When reading data in a kudu table using spark,
if there is a large amount of data in the tablet,
reading the data takes a long time. The reason
is that KuduRDD uses a tablet to generate the
scanToken, so a spark task needs to process all
the data in a tablet.

TabletServer should provide an RPC interface,
which can be split tablet into multiple primary
key ranges by size. The kudu-client can choose
whether to perform parallel scan according to
the case.

Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
---
M src/kudu/common/CMakeLists.txt
M src/kudu/common/common.proto
A src/kudu/common/key_range.cc
A src/kudu/common/key_range.h
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/rowset_info.cc
M src/kudu/tablet/rowset_info.h
M src/kudu/tablet/tablet-harness.h
M src/kudu/tablet/tablet-test-util.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tablet_service.h
M src/kudu/tserver/tserver.proto
M src/kudu/tserver/tserver_service.proto
23 files changed, 730 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/06/10406/18
-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 18
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Yao Xu (Code Review)" <ge...@cloudera.org>.

Hello Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10406

to look at the new patch set (#12).

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................

KUDU-2437 Split a tablet into primary key ranges by size

When reading data in a kudu table using spark,
if there is a large amount of data in the tablet,
reading the data takes a long time. The reason
is that KuduRDD uses a tablet to generate the
scanToken, so a spark task needs to process all
the data in a tablet.

TabletServer should provide an RPC interface,
which can be split tablet into multiple primary
key ranges by size. The kudu-client can choose
whether to perform parallel scan according to
the case.

Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
---
M src/kudu/common/CMakeLists.txt
M src/kudu/common/common.proto
A src/kudu/common/key_range.cc
A src/kudu/common/key_range.h
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/rowset_info.cc
M src/kudu/tablet/rowset_info.h
M src/kudu/tablet/tablet-harness.h
M src/kudu/tablet/tablet-test-util.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tablet_service.h
M src/kudu/tserver/tserver.proto
M src/kudu/tserver/tserver_service.proto
23 files changed, 608 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/06/10406/12
-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 12
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.

Dan Burkert has posted comments on this change. ( http://gerrit.cloudera.org:8080/10406 )

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................


Patch Set 19:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/common/key_range.h
File src/kudu/common/key_range.h:

http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/common/key_range.h@30
PS14, Line 30: const std::string& start_key,
             :            const std::string& stop_key,
> Done
start/stop_key should be 'std::string' not 'const std::string&' here, otherwise it's always copying (std::move on a cref is a copy).


http://gerrit.cloudera.org:8080/#/c/10406/19/src/kudu/tablet/rowset_info.h
File src/kudu/tablet/rowset_info.h:

http://gerrit.cloudera.org:8080/#/c/10406/19/src/kudu/tablet/rowset_info.h@53
PS19, Line 53: size
add period


http://gerrit.cloudera.org:8080/#/c/10406/19/src/kudu/tablet/rowset_info.cc
File src/kudu/tablet/rowset_info.cc:

http://gerrit.cloudera.org:8080/#/c/10406/19/src/kudu/tablet/rowset_info.cc@282
PS19, Line 282:     CHECK(start_key.compare(stop_key) <= 0);
Probably cleaner to put both conditions in the CHECK, e.g.

stop_key.empty() || start_key.compare...


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/tablet-test.cc
File src/kudu/tablet/tablet-test.cc:

http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/tablet-test.cc@1187
PS14, Line 1187:   }
> > Perhaps the algorithm should be modified to combine these 0 sized
That's true of every interval, though?  Is there a reason to expect that the empty intervals have significantly more MRS data?


http://gerrit.cloudera.org:8080/#/c/10406/19/src/kudu/tablet/tablet-test.cc
File src/kudu/tablet/tablet-test.cc:

http://gerrit.cloudera.org:8080/#/c/10406/19/src/kudu/tablet/tablet-test.cc@1150
PS19, Line 1150: AsserChunks
'Assert'


http://gerrit.cloudera.org:8080/#/c/10406/19/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

http://gerrit.cloudera.org:8080/#/c/10406/19/src/kudu/tserver/tablet_service.cc@1450
PS19, Line 1450:   for (const ColumnSchemaPB& column : req->columns()) {
Sorry I missed this before, but user requests should never include column IDs, instead this API should be based on column names.  You can model this based on how new scan requests are handled here: https://github.com/apache/kudu/blob/master/src/kudu/tserver/tablet_service.cc?utf8=%E2%9C%93#L1692-L1702


http://gerrit.cloudera.org:8080/#/c/10406/19/src/kudu/tserver/tablet_service.cc@1466
PS19, Line 1466: req->has_target_chunk_size_bytes() || req->target_chunk_size_bytes() == 0
The first clause here can be dropped, since protobuf will return 0 if it's unset.


http://gerrit.cloudera.org:8080/#/c/10406/19/src/kudu/tserver/tablet_service.cc@1468
PS19, Line 1468: targe
target



-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 19
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>
Gerrit-Comment-Date: Fri, 13 Jul 2018 00:11:11 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Dan Burkert (Code Review)" <ge...@cloudera.org>.

Dan Burkert has posted comments on this change. ( http://gerrit.cloudera.org:8080/10406 )

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................


Patch Set 21:

(2 comments)

This is looking really good!  Just a few nits left.

http://gerrit.cloudera.org:8080/#/c/10406/21/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

http://gerrit.cloudera.org:8080/#/c/10406/21/src/kudu/tserver/tablet_service.cc@1422
PS21, Line 1422:                            TabletServerErrorPB::INVALID_CONFIG,
INVALID_CONFIG is used for a specific Raft algorithm issue, I think UNKNOWN_ERROR is probably the best bet for these invalid arg cases.


http://gerrit.cloudera.org:8080/#/c/10406/21/src/kudu/tserver/tablet_service.cc@1469
PS21, Line 1469:  -1
use the Schema::kColumnNotFound constant here



-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 21
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>
Gerrit-Comment-Date: Tue, 17 Jul 2018 17:52:38 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Yao Xu (Code Review)" <ge...@cloudera.org>.

Hello Kudu Jenkins, Todd Lipcon, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10406

to look at the new patch set (#11).

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................

KUDU-2437 Split a tablet into primary key ranges by size

When reading data in a kudu table using spark,
if there is a large amount of data in the tablet,
reading the data takes a long time. The reason
is that KuduRDD uses a tablet to generate the
scanToken, so a spark task needs to process all
the data in a tablet.

TabletServer should provide an RPC interface,
which can be split tablet into multiple primary
key ranges by size. The kudu-client can choose
whether to perform parallel scan according to
the case.

Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
---
M src/kudu/common/CMakeLists.txt
M src/kudu/common/common.proto
A src/kudu/common/key_range.cc
A src/kudu/common/key_range.h
M src/kudu/tablet/cfile_set.cc
M src/kudu/tablet/cfile_set.h
M src/kudu/tablet/diskrowset.cc
M src/kudu/tablet/diskrowset.h
M src/kudu/tablet/memrowset.h
M src/kudu/tablet/mock-rowsets.h
M src/kudu/tablet/rowset.cc
M src/kudu/tablet/rowset.h
M src/kudu/tablet/rowset_info.cc
M src/kudu/tablet/rowset_info.h
M src/kudu/tablet/tablet-harness.h
M src/kudu/tablet/tablet-test-util.h
M src/kudu/tablet/tablet-test.cc
M src/kudu/tablet/tablet.cc
M src/kudu/tablet/tablet.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/tserver/tablet_service.h
M src/kudu/tserver/tserver.proto
M src/kudu/tserver/tserver_service.proto
23 files changed, 608 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/06/10406/11
-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 11
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Yao Xu (Code Review)" <ge...@cloudera.org>.

Yao Xu has posted comments on this change. ( http://gerrit.cloudera.org:8080/10406 )

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................


Patch Set 21:

(8 comments)

http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/common/key_range.h
File src/kudu/common/key_range.h:

http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/common/key_range.h@30
PS14, Line 30: std::string start_key,
             :            std::string stop_key,
> start/stop_key should be 'std::string' not 'const std::string&' here, other
Done


http://gerrit.cloudera.org:8080/#/c/10406/19/src/kudu/tablet/rowset_info.h
File src/kudu/tablet/rowset_info.h:

http://gerrit.cloudera.org:8080/#/c/10406/19/src/kudu/tablet/rowset_info.h@53
PS19, Line 53: size
> add period
Done


http://gerrit.cloudera.org:8080/#/c/10406/19/src/kudu/tablet/rowset_info.cc
File src/kudu/tablet/rowset_info.cc:

http://gerrit.cloudera.org:8080/#/c/10406/19/src/kudu/tablet/rowset_info.cc@282
PS19, Line 282: 
> Probably cleaner to put both conditions in the CHECK, e.g.
Done


http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/tablet-test.cc
File src/kudu/tablet/tablet-test.cc:

http://gerrit.cloudera.org:8080/#/c/10406/14/src/kudu/tablet/tablet-test.cc@1187
PS14, Line 1187:   }
> That's true of every interval, though?  Is there a reason to expect that th
I modified the algorithm to combine these 0 sized chunks with adjacent chunk.


http://gerrit.cloudera.org:8080/#/c/10406/19/src/kudu/tablet/tablet-test.cc
File src/kudu/tablet/tablet-test.cc:

http://gerrit.cloudera.org:8080/#/c/10406/19/src/kudu/tablet/tablet-test.cc@1150
PS19, Line 1150: AssertChunk
> 'Assert'
Done


http://gerrit.cloudera.org:8080/#/c/10406/19/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

http://gerrit.cloudera.org:8080/#/c/10406/19/src/kudu/tserver/tablet_service.cc@1450
PS19, Line 1450:   s = ColumnPBsToSchema(req->columns(), &schema);
> Sorry I missed this before, but user requests should never include column I
Done


http://gerrit.cloudera.org:8080/#/c/10406/19/src/kudu/tserver/tablet_service.cc@1466
PS19, Line 1466: r<ColumnId> column_ids;
> The first clause here can be dropped, since protobuf will return 0 if it's 
Done


http://gerrit.cloudera.org:8080/#/c/10406/19/src/kudu/tserver/tablet_service.cc@1468
PS19, Line 1468: 
> target
Done



-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 21
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>
Gerrit-Comment-Date: Sat, 14 Jul 2018 03:19:10 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2437 Split a tablet into primary key ranges by size

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.

Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/10406 )

Change subject: KUDU-2437 Split a tablet into primary key ranges by size
......................................................................


Patch Set 22:

(2 comments)

Looks good, just two more nits.

http://gerrit.cloudera.org:8080/#/c/10406/22/src/kudu/tablet/rowset_info.h
File src/kudu/tablet/rowset_info.h:

http://gerrit.cloudera.org:8080/#/c/10406/22/src/kudu/tablet/rowset_info.h@59
PS22, Line 59:                             const Slice start_key,
             :                             const Slice stop_key,
Since you're passing these by copy, no need for 'const'; the caller and callee by definition have their own copies and so 'const' doesn't add any protection for the caller.


http://gerrit.cloudera.org:8080/#/c/10406/22/src/kudu/tablet/tablet.h
File src/kudu/tablet/tablet.h:

http://gerrit.cloudera.org:8080/#/c/10406/22/src/kudu/tablet/tablet.h@441
PS22, Line 441: col_ids
Nit: 'column_ids'



-- 
To view, visit http://gerrit.cloudera.org:8080/10406
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I9ec4395919f4b54102e458ef5154334c08412e8a
Gerrit-Change-Number: 10406
Gerrit-PatchSet: 22
Gerrit-Owner: Yao Xu <oc...@gmail.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Dan Burkert <da...@apache.org>
Gerrit-Reviewer: Kudu Jenkins
Gerrit-Reviewer: Todd Lipcon <to...@apache.org>
Gerrit-Reviewer: Yao Xu <oc...@gmail.com>
Gerrit-Comment-Date: Mon, 17 Sep 2018 22:38:06 +0000
Gerrit-HasComments: Yes