You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@kudu.apache.org by "Adar Dembo (Code Review)" <ge...@cloudera.org> on 2019/10/18 00:15:01 UTC

[kudu-CR] WIP: fix ORDERED scan regression when scanned columns are out-of-order

Hello Attila Bukor, Andrew Wong, Grant Henke,

I'd like you to do a code review. Please visit

    http://gerrit.cloudera.org:8080/14503

to review the following change.


Change subject: WIP: fix ORDERED scan regression when scanned columns are out-of-order
......................................................................

WIP: fix ORDERED scan regression when scanned columns are out-of-order

The new test in client-test is just a POC; proper tests need to be written.

Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
---
M src/kudu/client/client-test.cc
M src/kudu/common/generic_iterators.cc
M src/kudu/common/schema.cc
M src/kudu/common/schema.h
M src/kudu/tserver/tablet_service.cc
5 files changed, 189 insertions(+), 4 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/03/14503/1
-- 
To view, visit http://gerrit.cloudera.org:8080/14503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
Gerrit-Change-Number: 14503
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>

[kudu-CR] KUDU-2980: fix ORDERED scans when key columns are misordered in projection

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Attila Bukor, Kudu Jenkins, Andrew Wong, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14503

to look at the new patch set (#6).

Change subject: KUDU-2980: fix ORDERED scans when key columns are misordered in projection
......................................................................

KUDU-2980: fix ORDERED scans when key columns are misordered in projection

This patch fixes KUDU-2980 by taking a stricter approach to how key columns
appear in ORDERED scan projections. Any key columns already in the
client-provided projection are completely ignored. Instead, when we rebuild
the projection server-side, the first section (i.e. the part with the key
columns) is built using the tablet's schema. After that, we incorporate any
remaining columns from the client-provided projection, dropping any key
columns. Finally, we add any columns necessary to satisfy predicates.

The new fuzz test reproduces the bug very easily, and is expansive enough to
provide additional coverage for funky projection and predicate ordering. I
looped it 1000 times in dist-test without failure.

Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
---
M src/kudu/client/client-test.cc
M src/kudu/common/generic_iterators.cc
M src/kudu/common/partial_row.h
M src/kudu/integration-tests/data_gen_util.cc
M src/kudu/integration-tests/data_gen_util.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/util/random_util.h
7 files changed, 195 insertions(+), 32 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/03/14503/6
-- 
To view, visit http://gerrit.cloudera.org:8080/14503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
Gerrit-Change-Number: 14503
Gerrit-PatchSet: 6
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] WIP: fix ORDERED scan regression when scanned columns are out-of-order

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Attila Bukor, Kudu Jenkins, Andrew Wong, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14503

to look at the new patch set (#2).

Change subject: WIP: fix ORDERED scan regression when scanned columns are out-of-order
......................................................................

WIP: fix ORDERED scan regression when scanned columns are out-of-order

The new test in client-test is just a POC; proper tests need to be written.

Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
---
M src/kudu/client/client-test.cc
M src/kudu/common/generic_iterators.cc
M src/kudu/tserver/tablet_service.cc
3 files changed, 178 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/03/14503/2
-- 
To view, visit http://gerrit.cloudera.org:8080/14503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
Gerrit-Change-Number: 14503
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] KUDU-2980: fix ORDERED scans when key columns are misordered in projection

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has removed Kudu Jenkins from this change.  ( http://gerrit.cloudera.org:8080/14503 )

Change subject: KUDU-2980: fix ORDERED scans when key columns are misordered in projection
......................................................................


Removed reviewer Kudu Jenkins with the following votes:

* Verified-1 by Kudu Jenkins (120)
-- 
To view, visit http://gerrit.cloudera.org:8080/14503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: deleteReviewer
Gerrit-Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
Gerrit-Change-Number: 14503
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>

[kudu-CR] KUDU-2980: fix ORDERED scans when key columns are misordered in projection

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14503 )

Change subject: KUDU-2980: fix ORDERED scans when key columns are misordered in projection
......................................................................


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc@6440
PS4, Line 6440: SelectRandomSubset
You may be looking for ReservoirSample here; SelectRandomSubset returns _at least_ num_cols_to_project.



-- 
To view, visit http://gerrit.cloudera.org:8080/14503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
Gerrit-Change-Number: 14503
Gerrit-PatchSet: 5
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 23 Oct 2019 17:26:48 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2980: fix ORDERED scans when key columns are misordered in projection

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14503 )

Change subject: KUDU-2980: fix ORDERED scans when key columns are misordered in projection
......................................................................


Patch Set 4: Code-Review+1

(4 comments)

LGTM, just a nit and some questions.

http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc@6391
PS4, Line 6391:   // Generate a somewhat random schema with half of the columns in the primary key.
Maybe I'm missing something, but isn't this not random at all, in that this generates the same schema every time? It's cycling through the exact same types with the same starting type, and so it's wholly defined by that type ordering and a couple of constants, no?


http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc@6445
PS4, Line 6445:   // Insert some rows with randomized keys, flushing the tablet after each one.
Do you think it's worth flushing less frequently than once per row so we get rowsets that span more than one key?


http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc@6485
PS4, Line 6485: all_col_names
Just verifying my understanding -- this is where the magic of this test is, right? We may end up predicating on a key column that isn't being projected, and so previously, a key being a part of missing_cols could be reordered to the end?


http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/tserver/tablet_service.cc@2332
PS4, Line 2332: It doesn't know which of its columns are key columns. 
nit: maybe rephrase this since it's not true for ORDERED scans?



-- 
To view, visit http://gerrit.cloudera.org:8080/14503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
Gerrit-Change-Number: 14503
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Comment-Date: Wed, 23 Oct 2019 05:45:04 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2980: fix ORDERED scans when key columns are misordered in projection

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14503 )

Change subject: KUDU-2980: fix ORDERED scans when key columns are misordered in projection
......................................................................


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc@6440
PS4, Line 6440: , string, Random>(
> Good catch, reading comprehension fail on my part.
Yeah, that should do it.



-- 
To view, visit http://gerrit.cloudera.org:8080/14503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
Gerrit-Change-Number: 14503
Gerrit-PatchSet: 6
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 23 Oct 2019 18:22:12 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2980: fix ORDERED scans when key columns are misordered in projection

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Attila Bukor, Kudu Jenkins, Andrew Wong, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14503

to look at the new patch set (#4).

Change subject: KUDU-2980: fix ORDERED scans when key columns are misordered in projection
......................................................................

KUDU-2980: fix ORDERED scans when key columns are misordered in projection

This patch fixes KUDU-2980 by taking a stricter approach to how key columns
appear in ORDERED scan projections. Any key columns already in the
client-provided projection are completely ignored. Instead, when we rebuild
the projection server-side, the first section (i.e. the part with the key
columns) is built using the tablet's schema. After that, we incorporate any
remaining columns from the client-provided projection, dropping any key
columns. Finally, we add any columns necessary to satisfy predicates.

The new fuzz test reproduces the bug very easily, and is expansive enough to
provide additional coverage for funky projection and predicate ordering. I
looped it 1000 times in dist-test without failure.

Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
---
M src/kudu/client/client-test.cc
M src/kudu/common/generic_iterators.cc
M src/kudu/common/partial_row.h
M src/kudu/integration-tests/data_gen_util.cc
M src/kudu/integration-tests/data_gen_util.h
M src/kudu/tserver/tablet_service.cc
6 files changed, 185 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/03/14503/4
-- 
To view, visit http://gerrit.cloudera.org:8080/14503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
Gerrit-Change-Number: 14503
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] KUDU-2980: fix ORDERED scans when key columns are misordered in projection

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14503 )

Change subject: KUDU-2980: fix ORDERED scans when key columns are misordered in projection
......................................................................


Patch Set 4:

(2 comments)

http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc@6439
PS4, Line 6439: num_cols_to_project
> I don't think so.
I meant 'break' (it was a typo), and I assume the answer is that it doesn't break if running the test with empty projections :)


http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc@6491
PS4, Line 6491:   // Use a fault tolerant scan half the time.
              :   if (rng.OneIn(2)) {
              :     ASSERT_OK(scanner->SetFaultTolerant());
              :   }
> Not sure I understand. Are you suggesting parameterizing the test itself on
The difference is that with explicitly parameterized test, it would be easier to understand whether it happened on a FT or non-FT scan, if the test fails.  That might help with debugging and troubleshooting.



-- 
To view, visit http://gerrit.cloudera.org:8080/14503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
Gerrit-Change-Number: 14503
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 23 Oct 2019 18:20:00 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2980: fix ORDERED scans when key columns are misordered in projection

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14503 )

Change subject: KUDU-2980: fix ORDERED scans when key columns are misordered in projection
......................................................................


Patch Set 4: Verified+1

Overriding Jenkins, test failure was a known (and unrelated) flake.


-- 
To view, visit http://gerrit.cloudera.org:8080/14503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
Gerrit-Change-Number: 14503
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 23 Oct 2019 04:50:57 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-2980: fix ORDERED scans when key columns are misordered in projection

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14503 )

Change subject: KUDU-2980: fix ORDERED scans when key columns are misordered in projection
......................................................................


Patch Set 4:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc@6391
PS4, Line 6391:   // Generate a somewhat random schema with half of the columns in the primary key.
> Maybe I'm missing something, but isn't this not random at all, in that this
Oops yeah I can't remember whether I wrote the comment before the code, or whether I just plain goofed.

In any case, given that the projection itself is random, I don't see a need for the schema to be. I just wrote the code in this way so that the number of columns could be adjusted without having to change any other schema setup code.


http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc@6391
PS4, Line 6391: random
> It looks like a pretty deterministic schema based on the circular next_type
Ack


http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc@6401
PS4, Line 6401: ->NotNull();
> Is it important to test the case when nullable columns are present as well?
Not for this test: the projection setup code doesn't care about nullability.


http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc@6409
PS4, Line 6409:       case KuduColumnSchema::INT8: data_type = KuduColumnSchema::INT16; break;
              :       case KuduColumnSchema::INT16: data_type = KuduColumnSchema::INT32; break;
              :       case KuduColumnSchema::INT32: data_type = KuduColumnSchema::INT64; break;
              :       case KuduColumnSchema::INT64: data_type = KuduColumnSchema::STRING; break;
              :       case KuduColumnSchema::STRING: data_type = KuduColumnSchema::INT8; break;
              :       default: LOG(FATAL) << "Unexpected data type " << data_type;
> Does it make sense to add some randomness here?  For example, choose next t
See what I wrote Andrew earlier: I don't think it's necessary because the projection itself is random, and that's where the type transitions (between columns) are a little more meaningful.


http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc@6439
PS4, Line 6439: num_cols_to_project
> Does it brake if Uniform() returns 0?
I don't think so.


http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc@6445
PS4, Line 6445:   // Insert some rows with randomized keys, flushing the tablet after each one.
> Do you think it's worth flushing less frequently than once per row so we ge
I'll add some randomness to the flush to get the best of both worlds.


http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc@6473
PS4, Line 6473: // Leave one row in the tablet's MRS.
> Could you add an explanation why it's important?
Done


http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc@6485
PS4, Line 6485: all_col_names
> Just verifying my understanding -- this is where the magic of this test is,
Yes, this is one part of it. But even if we had no predicates, the randomized projection could have also exercised the bug if it included a subset of the key columns out of order.


http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc@6491
PS4, Line 6491:   // Use a fault tolerant scan half the time.
              :   if (rng.OneIn(2)) {
              :     ASSERT_OK(scanner->SetFaultTolerant());
              :   }
> This is a fuzz test, but maybe it's better to have this as a parameter anyw
Not sure I understand. Are you suggesting parameterizing the test itself on FT vs. non-FT? Is that fundamentally different than the implicit "parameterization" that's already done here?


http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc@6499
PS4, Line 6499: unordered_set
> Could you comment on why it's important (or actually not so important) to h
Sure


http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/common/generic_iterators.cc
File src/kudu/common/generic_iterators.cc:

http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/common/generic_iterators.cc@1174
PS4, Line 1174: !disallow_pushdown_for_tests_
> nit: could be wrapped into PREDICT_FALSE() and, probably, the predicates mi
Done


http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/common/generic_iterators.cc@1176
PS4, Line 1176: num_columns - spec->predicates().size()
> Is it guaranteed that num_columns <= spec->predicates().size()?  If not, ma
I don't know, but I'll add a DCHECK.


http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/common/generic_iterators.cc@1198
PS4, Line 1198:     non_predicate_column_indexes_.reserve(num_columns);
              :     for (int32_t col_idx = 0; col_idx < num_columns; col_idx++) {
              :       non_predicate_column_indexes_.emplace_back(col_idx);
              :     }
> syntactic sugar nit: consider replacing with
Done


http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/tserver/tablet_service.cc@2332
PS4, Line 2332: It doesn't know which of its columns are key columns. 
> nit: maybe rephrase this since it's not true for ORDERED scans?
It's true for both though: when we build the projection from the client (i.e. L2272, and even up until we reach this section of code), we don't know which columns belong to the primary key and which don't. But, for ORDERED scans we need to know what the primary key is.

The old code looked up the projection's column names in the tablet schema to figure that out as it built the server-side projection (L2343).

The new code just adds the tablet schema's key columns verbatim to the server-side projection (L2337-2341), but notice how that's only done for ORDERED scans, not UNORDERED ones.



-- 
To view, visit http://gerrit.cloudera.org:8080/14503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
Gerrit-Change-Number: 14503
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Comment-Date: Wed, 23 Oct 2019 17:12:05 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2980: fix ORDERED scans when key columns are misordered in projection

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14503 )

Change subject: KUDU-2980: fix ORDERED scans when key columns are misordered in projection
......................................................................


Patch Set 6: Code-Review+2


-- 
To view, visit http://gerrit.cloudera.org:8080/14503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
Gerrit-Change-Number: 14503
Gerrit-PatchSet: 6
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 23 Oct 2019 21:03:16 +0000
Gerrit-HasComments: No

[kudu-CR] KUDU-2980: fix ORDERED scans when key columns are misordered in projection

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Attila Bukor, Kudu Jenkins, Andrew Wong, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14503

to look at the new patch set (#3).

Change subject: KUDU-2980: fix ORDERED scans when key columns are misordered in projection
......................................................................

KUDU-2980: fix ORDERED scans when key columns are misordered in projection

This patch fixes KUDU-2980 by taking a stricter approach to how key columns
appear in ORDERED scan projections. Any key columns already in the
client-provided projection are completely ignored. Instead, when we rebuild
the projection server-side, the first section (i.e. the part with the key
columns) is built using the tablet's schema. After that, we incorporate any
remaining columns from the client-provided projection, dropping any key
columns. Finally, we add any columns necessary to satisfy predicates.

The new fuzz test reproduces the bug very easily, and is expansive enough to
provide additional coverage for funky projection and predicate ordering. I
looped it 1000 times in dist-test without failure.

Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
---
M src/kudu/client/client-test.cc
M src/kudu/common/generic_iterators.cc
M src/kudu/common/partial_row.h
M src/kudu/integration-tests/data_gen_util.cc
M src/kudu/integration-tests/data_gen_util.h
M src/kudu/tserver/tablet_service.cc
6 files changed, 184 insertions(+), 26 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/03/14503/3
-- 
To view, visit http://gerrit.cloudera.org:8080/14503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
Gerrit-Change-Number: 14503
Gerrit-PatchSet: 3
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] WIP: fix ORDERED scan regression when scanned columns are out-of-order

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14503 )

Change subject: WIP: fix ORDERED scan regression when scanned columns are out-of-order
......................................................................


Patch Set 2:

(5 comments)

http://gerrit.cloudera.org:8080/#/c/14503/1/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

http://gerrit.cloudera.org:8080/#/c/14503/1/src/kudu/client/client-test.cc@6379
PS1, Line 6379: CDH82238
> nit: maybe, turn this into something upstream-like?
I'm going to junk this whole test and replace it in an upcoming revision.


http://gerrit.cloudera.org:8080/#/c/14503/1/src/kudu/client/client-test.cc@6380
PS1, Line 6380:   KuduSchemaBuilder b;
> nit: maybe replace the column names with something not field-specific?
Ack


http://gerrit.cloudera.org:8080/#/c/14503/1/src/kudu/common/schema.cc
File src/kudu/common/schema.cc:

http://gerrit.cloudera.org:8080/#/c/14503/1/src/kudu/common/schema.cc@706
PS1, Line 706: } // namespace kudu
> Is it the entire order of the schema that we care about? Or just the keys? 
It is just the keys. I thought the sorting approach would be net simpler to understand, but the sort comparator (and associated comment) are confusing.

Here's a different approach, which I also find a bit confusing, but in a different way.


http://gerrit.cloudera.org:8080/#/c/14503/1/src/kudu/common/schema.cc@721
PS1, Line 721: 
             : 
> Are we guaranteed that non-existent columns are not a part of the key range
Ack


http://gerrit.cloudera.org:8080/#/c/14503/1/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

http://gerrit.cloudera.org:8080/#/c/14503/1/src/kudu/tserver/tablet_service.cc@2338
PS1, Line 2338:       // CHECK_OK is safe because the tablet schema has no duplicate columns.
              :       CHECK_OK(projection_builder.AddColumn(col, /* is_key= */ true));
              :     }
              :     for (int i = 0; i < projection.num_columns(); i++) {
              :       const auto& col = projection.column(i);
              :       // Any key columns in the projection will be ignored.
> nit: can you move this down by the ORDERED condition, since this only reall
Ack



-- 
To view, visit http://gerrit.cloudera.org:8080/14503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
Gerrit-Change-Number: 14503
Gerrit-PatchSet: 2
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Sat, 19 Oct 2019 01:31:50 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2980: fix ORDERED scans when key columns are misordered in projection

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14503 )

Change subject: KUDU-2980: fix ORDERED scans when key columns are misordered in projection
......................................................................


Patch Set 5:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc@6440
PS4, Line 6440: SelectRandomSubset
> You may be looking for ReservoirSample here; SelectRandomSubset returns _at
Good catch, reading comprehension fail on my part.

I think SelectRandomSubset is still what I want though; I just need to pass 0 in for min_to_return. Is that right?



-- 
To view, visit http://gerrit.cloudera.org:8080/14503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
Gerrit-Change-Number: 14503
Gerrit-PatchSet: 5
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 23 Oct 2019 17:42:42 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2980: fix ORDERED scans when key columns are misordered in projection

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14503 )

Change subject: KUDU-2980: fix ORDERED scans when key columns are misordered in projection
......................................................................


Patch Set 4:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc@6391
PS4, Line 6391: random
It looks like a pretty deterministic schema based on the circular next_type transition, no?


http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc@6401
PS4, Line 6401: ->NotNull();
Is it important to test the case when nullable columns are present as well?


http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc@6409
PS4, Line 6409:       case KuduColumnSchema::INT8: data_type = KuduColumnSchema::INT16; break;
              :       case KuduColumnSchema::INT16: data_type = KuduColumnSchema::INT32; break;
              :       case KuduColumnSchema::INT32: data_type = KuduColumnSchema::INT64; break;
              :       case KuduColumnSchema::INT64: data_type = KuduColumnSchema::STRING; break;
              :       case KuduColumnSchema::STRING: data_type = KuduColumnSchema::INT8; break;
              :       default: LOG(FATAL) << "Unexpected data type " << data_type;
Does it make sense to add some randomness here?  For example, choose next type based on rand() % num_types?  In that case there might be cases when columns of the same type are adjacent in the sequence.


http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc@6439
PS4, Line 6439: num_cols_to_project
Does it brake if Uniform() returns 0?


http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc@6491
PS4, Line 6491:   // Use a fault tolerant scan half the time.
              :   if (rng.OneIn(2)) {
              :     ASSERT_OK(scanner->SetFaultTolerant());
              :   }
This is a fuzz test, but maybe it's better to have this as a parameter anyways since it's not the part of the targeted fuzzing?


http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc@6499
PS4, Line 6499: unordered_set
Could you comment on why it's important (or actually not so important) to have unrodered_set, and not just set, while performing the comparison?



-- 
To view, visit http://gerrit.cloudera.org:8080/14503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
Gerrit-Change-Number: 14503
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Comment-Date: Wed, 23 Oct 2019 05:59:10 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2980: fix ORDERED scans when key columns are misordered in projection

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14503 )

Change subject: KUDU-2980: fix ORDERED scans when key columns are misordered in projection
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc@6491
PS4, Line 6491:   // Use a fault tolerant scan half the time.
              :   if (rng.OneIn(2)) {
              :     ASSERT_OK(scanner->SetFaultTolerant());
              :   }
> Yeah but that's not enough to repro the (hypothetical) test failure: you'd 
SGTM



-- 
To view, visit http://gerrit.cloudera.org:8080/14503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
Gerrit-Change-Number: 14503
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 23 Oct 2019 21:32:07 +0000
Gerrit-HasComments: Yes

[kudu-CR] WIP: fix ORDERED scan regression when scanned columns are out-of-order

Posted by "Andrew Wong (Code Review)" <ge...@cloudera.org>.
Andrew Wong has posted comments on this change. ( http://gerrit.cloudera.org:8080/14503 )

Change subject: WIP: fix ORDERED scan regression when scanned columns are out-of-order
......................................................................


Patch Set 1:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/14503/1/src/kudu/common/schema.cc
File src/kudu/common/schema.cc:

http://gerrit.cloudera.org:8080/#/c/14503/1/src/kudu/common/schema.cc@706
PS1, Line 706: void SchemaBuilder::SortByReferenceSchema(const Schema& reference) {
Is it the entire order of the schema that we care about? Or just the keys? If it's just the keys, could we just pull the keys from the reference schema, remove them from `cols_` by name, and then add them to the front?

Correct me if i'm wrong, I'm assuming the broken invariant shows up in Schema::Compare(), where it's comparing the first num_key_columns_ columns


http://gerrit.cloudera.org:8080/#/c/14503/1/src/kudu/common/schema.cc@721
PS1, Line 721:   // Such columns are ignored for the purposes of the sort, though their
             :   // relative order is preserved thanks to stable_sort.
Are we guaranteed that non-existent columns are not a part of the key range?


http://gerrit.cloudera.org:8080/#/c/14503/1/src/kudu/tserver/tablet_service.cc
File src/kudu/tserver/tablet_service.cc:

http://gerrit.cloudera.org:8080/#/c/14503/1/src/kudu/tserver/tablet_service.cc@2338
PS1, Line 2338:   // There's no guarantee that the order of columns in the projection matches
              :   // the order of columns in the tablet schema. This is a problem for ORDERED
              :   // scans, which use the projection's key columns to decode rowset bounds, and
              :   // those bounds are always in tablet schema column order. In such cases we
              :   // must sort the projection using the tablet schema, otherwise the decoding
              :   // will fail with either a "key too short" or "missing separator" error.
nit: can you move this down by the ORDERED condition, since this only really seems relevant there?

Also it might be clearer to give an obvious example of why the order might be off, e.g.

"There's no guarantee that the projection column order matches the tablet schema's column order (e.g. if the primary keys were not a part of the project but we've manually added the primary keys because this is an ORDERED scan). This is a problem for ORDERED scans..."



-- 
To view, visit http://gerrit.cloudera.org:8080/14503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
Gerrit-Change-Number: 14503
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 18 Oct 2019 01:26:06 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2980: fix ORDERED scans when key columns are misordered in projection

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14503 )

Change subject: KUDU-2980: fix ORDERED scans when key columns are misordered in projection
......................................................................


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc@6491
PS4, Line 6491:   if (rng.OneIn(2)) {
              :     ASSERT_OK(scanner->SetFaultTolerant());
              :   }
              : 
> The difference is that with explicitly parameterized test, it would be easi
Sort of agree, but for tests like these we've also been happy with the logged RNG seed (from SeedRandom). Given that the test is single-threaded, reusing the RNG seed should yield a similar scan every time.

Barring that, I'd rather log the NewScanRequestPB in the test output as the basis for determinism, but unfortunately I haven't found a good way to extract that, either from the RPC call or from the KuduScanner itself.



-- 
To view, visit http://gerrit.cloudera.org:8080/14503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
Gerrit-Change-Number: 14503
Gerrit-PatchSet: 6
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 23 Oct 2019 18:36:19 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2980: fix ORDERED scans when key columns are misordered in projection

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Hello Alexey Serbin, Attila Bukor, Andrew Wong, Grant Henke, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/14503

to look at the new patch set (#5).

Change subject: KUDU-2980: fix ORDERED scans when key columns are misordered in projection
......................................................................

KUDU-2980: fix ORDERED scans when key columns are misordered in projection

This patch fixes KUDU-2980 by taking a stricter approach to how key columns
appear in ORDERED scan projections. Any key columns already in the
client-provided projection are completely ignored. Instead, when we rebuild
the projection server-side, the first section (i.e. the part with the key
columns) is built using the tablet's schema. After that, we incorporate any
remaining columns from the client-provided projection, dropping any key
columns. Finally, we add any columns necessary to satisfy predicates.

The new fuzz test reproduces the bug very easily, and is expansive enough to
provide additional coverage for funky projection and predicate ordering. I
looped it 1000 times in dist-test without failure.

Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
---
M src/kudu/client/client-test.cc
M src/kudu/common/generic_iterators.cc
M src/kudu/common/partial_row.h
M src/kudu/integration-tests/data_gen_util.cc
M src/kudu/integration-tests/data_gen_util.h
M src/kudu/tserver/tablet_service.cc
6 files changed, 195 insertions(+), 31 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/kudu refs/changes/03/14503/5
-- 
To view, visit http://gerrit.cloudera.org:8080/14503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
Gerrit-Change-Number: 14503
Gerrit-PatchSet: 5
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>

[kudu-CR] KUDU-2980: fix ORDERED scans when key columns are misordered in projection

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has posted comments on this change. ( http://gerrit.cloudera.org:8080/14503 )

Change subject: KUDU-2980: fix ORDERED scans when key columns are misordered in projection
......................................................................


Patch Set 6:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc@6491
PS4, Line 6491:   if (rng.OneIn(2)) {
              :     ASSERT_OK(scanner->SetFaultTolerant());
              :   }
              : 
> Maybe, just add a log line right here then:
Yeah but that's not enough to repro the (hypothetical) test failure: you'd need to know the entirety of the NewScanRequestPB (i.e. projection and predicates), and possibly the PK values of the rows inserted. Only the PRNG seed will give you enough information.

I guess what I'm saying is that I don't find it terribly interesting to focus on the FT vs. non-FT aspect of the test; I think it's more useful to consider all of its knobs holistically.



-- 
To view, visit http://gerrit.cloudera.org:8080/14503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
Gerrit-Change-Number: 14503
Gerrit-PatchSet: 6
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 23 Oct 2019 20:09:10 +0000
Gerrit-HasComments: Yes

[kudu-CR] WIP: fix ORDERED scan regression when scanned columns are out-of-order

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14503 )

Change subject: WIP: fix ORDERED scan regression when scanned columns are out-of-order
......................................................................


Patch Set 1:

(2 comments)

I took a quick look: my feedback is not about the essence of the change, but some nits.

http://gerrit.cloudera.org:8080/#/c/14503/1/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

http://gerrit.cloudera.org:8080/#/c/14503/1/src/kudu/client/client-test.cc@6379
PS1, Line 6379: CDH82238
nit: maybe, turn this into something upstream-like?


http://gerrit.cloudera.org:8080/#/c/14503/1/src/kudu/client/client-test.cc@6380
PS1, Line 6380:   KuduSchemaBuilder b;
nit: maybe replace the column names with something not field-specific?



-- 
To view, visit http://gerrit.cloudera.org:8080/14503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
Gerrit-Change-Number: 14503
Gerrit-PatchSet: 1
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Fri, 18 Oct 2019 02:57:23 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2980: fix ORDERED scans when key columns are misordered in projection

Posted by "Adar Dembo (Code Review)" <ge...@cloudera.org>.
Adar Dembo has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/14503 )

Change subject: KUDU-2980: fix ORDERED scans when key columns are misordered in projection
......................................................................

KUDU-2980: fix ORDERED scans when key columns are misordered in projection

This patch fixes KUDU-2980 by taking a stricter approach to how key columns
appear in ORDERED scan projections. Any key columns already in the
client-provided projection are completely ignored. Instead, when we rebuild
the projection server-side, the first section (i.e. the part with the key
columns) is built using the tablet's schema. After that, we incorporate any
remaining columns from the client-provided projection, dropping any key
columns. Finally, we add any columns necessary to satisfy predicates.

The new fuzz test reproduces the bug very easily, and is expansive enough to
provide additional coverage for funky projection and predicate ordering. I
looped it 1000 times in dist-test without failure.

Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
Reviewed-on: http://gerrit.cloudera.org:8080/14503
Tested-by: Kudu Jenkins
Reviewed-by: Andrew Wong <aw...@cloudera.com>
---
M src/kudu/client/client-test.cc
M src/kudu/common/generic_iterators.cc
M src/kudu/common/partial_row.h
M src/kudu/integration-tests/data_gen_util.cc
M src/kudu/integration-tests/data_gen_util.h
M src/kudu/tserver/tablet_service.cc
M src/kudu/util/random_util.h
7 files changed, 195 insertions(+), 32 deletions(-)

Approvals:
  Kudu Jenkins: Verified
  Andrew Wong: Looks good to me, approved

-- 
To view, visit http://gerrit.cloudera.org:8080/14503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
Gerrit-Change-Number: 14503
Gerrit-PatchSet: 7
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)

[kudu-CR] KUDU-2980: fix ORDERED scans when key columns are misordered in projection

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14503 )

Change subject: KUDU-2980: fix ORDERED scans when key columns are misordered in projection
......................................................................


Patch Set 4:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc@6491
PS4, Line 6491:   // Use a fault tolerant scan half the time.
              :   if (rng.OneIn(2)) {
              :     ASSERT_OK(scanner->SetFaultTolerant());
              :   }
> Sort of agree, but for tests like these we've also been happy with the logg
Maybe, just add a log line right here then:

if (rng.OneIn(2)) {
  ASSERT_OK(scanner->SetFaultTolerant());
  LOG_INFO("running fault-tolerant scan");
} else {
  LOG_INFO("running non-fault-tolerant scan");
}

?



-- 
To view, visit http://gerrit.cloudera.org:8080/14503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
Gerrit-Change-Number: 14503
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Comment-Date: Wed, 23 Oct 2019 20:01:59 +0000
Gerrit-HasComments: Yes

[kudu-CR] KUDU-2980: fix ORDERED scans when key columns are misordered in projection

Posted by "Alexey Serbin (Code Review)" <ge...@cloudera.org>.
Alexey Serbin has posted comments on this change. ( http://gerrit.cloudera.org:8080/14503 )

Change subject: KUDU-2980: fix ORDERED scans when key columns are misordered in projection
......................................................................


Patch Set 4:

(4 comments)

http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc
File src/kudu/client/client-test.cc:

http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/client/client-test.cc@6473
PS4, Line 6473: // Leave one row in the tablet's MRS.
Could you add an explanation why it's important?


http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/common/generic_iterators.cc
File src/kudu/common/generic_iterators.cc:

http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/common/generic_iterators.cc@1174
PS4, Line 1174: !disallow_pushdown_for_tests_
nit: could be wrapped into PREDICT_FALSE() and, probably, the predicates might be swapped in order


http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/common/generic_iterators.cc@1176
PS4, Line 1176: num_columns - spec->predicates().size()
Is it guaranteed that num_columns <= spec->predicates().size()?  If not, maybe at least add DCHECK()


http://gerrit.cloudera.org:8080/#/c/14503/4/src/kudu/common/generic_iterators.cc@1198
PS4, Line 1198:     non_predicate_column_indexes_.reserve(num_columns);
              :     for (int32_t col_idx = 0; col_idx < num_columns; col_idx++) {
              :       non_predicate_column_indexes_.emplace_back(col_idx);
              :     }
syntactic sugar nit: consider replacing with

  non_predicate_column_indexes_.resize(num_columns);
  std::iota(non_predicate_column_indexes_.begin(), non_predicate_column_indexes_.end(), 0);



-- 
To view, visit http://gerrit.cloudera.org:8080/14503
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: kudu
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Idb213f466c7d01b6953a863d8aa53a34b5ac8893
Gerrit-Change-Number: 14503
Gerrit-PatchSet: 4
Gerrit-Owner: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Adar Dembo <ad...@cloudera.com>
Gerrit-Reviewer: Alexey Serbin <as...@cloudera.com>
Gerrit-Reviewer: Andrew Wong <aw...@cloudera.com>
Gerrit-Reviewer: Attila Bukor <ab...@apache.org>
Gerrit-Reviewer: Grant Henke <gr...@apache.org>
Gerrit-Comment-Date: Wed, 23 Oct 2019 06:22:32 +0000
Gerrit-HasComments: Yes