You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@impala.apache.org by "Marcel Kornacker (Code Review)" <ge...@cloudera.org> on 2016/03/08 02:16:26 UTC

[Impala-CR](cdh5-2.5.0_5.7.0) Runtime filters tests

Marcel Kornacker has posted comments on this change.

Change subject: Runtime filters tests
......................................................................

Patch Set 3:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/2297/3/testdata/workloads/functional-query/queries/QueryTest/runtime_filters.test
File testdata/workloads/functional-query/queries/QueryTest/runtime_filters.test:

Line 2: ---- COMMENT
what's the reason for adding a new section for this? we can already put comments anywhere, and you added blank lines between test cases, which to me adds enough visual separation. the -- comment line feels superfluous.

Line 12: select count(*) from alltypes p join [BROADCAST] alltypestiny b
if you're not testing the planner, always use straight_join to avoid unwanted plan changes

Line 17: row_regex: .*RowsRead: 2.43K .*
how does this work? is it looking for any line that fits the regex?

anyway, that's a bit indirect. it would really be better to have an 'assert' on a particular scalar value from a runtime profile, like so

<path>.RowsRead <comparison-op> 2.43K

of course not in this patch, but please file a jira, sounds like a useful ramp-up task.

Line 28: row_regex: .*RowsRead: 206 .*
is the fact that a filter/a number of filters was applied recorded more directly? that would be useful here (and probably also for diagnostics of a live system).

Line 30:
does this really need to have 2 blank lines? you could also expand the ==== for better visual separation.

i'm a bit reluctant to add a new stylistic element/convention that's only maintained in a few test files.

Line 97: row_regex: .*Files rejected: 8 .*
does this add anything on top of rowsread? it seems like this is essentially testing the number of files that make up the tables. (ie, it shouldn't be here)

Line 196: row_regex: .*Files rejected: 7 .*
'files rejected' seems a bit brittle (we could change the number of files in a partition) and indirect, since you really care about the number of partitions. should we record the latter instead/in addition?

Line 219: # Test case 9: filters do pass through ROJ.
foj case missing

Line 277: row_regex: .*RowsRead: 333 .*
so this passes if any scan returned 333 rows?

Line 285: # join as its root, which produces filters for the scan of t1.
what's the plan for this? it can't have an hj at the root (there's an agg).

http://gerrit.cloudera.org:8080/#/c/2297/3/testdata/workloads/functional-query/queries/QueryTest/runtime_filters_wait.test
File testdata/workloads/functional-query/queries/QueryTest/runtime_filters_wait.test:

Line 4: # Regression test for IMPALA-3078: Don't wait for global filters
why not roll this into runtime-filters.test?

Line 6: # but fail the test if the query took more than a minute.
the latter part of that sentence isn't obvious from the test case

http://gerrit.cloudera.org:8080/#/c/2297/3/testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters.test
File testdata/workloads/functional-query/queries/QueryTest/runtime_row_filters.test:

Line 31:
does this really need to have 2 blank lines?

Line 117: row_regex: .*Rows rejected: 2.43K .*
why 'rows rejected' here? (as opposed to 'rows read' elsewhere)

--
To view, visit http://gerrit.cloudera.org:8080/2297
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I94d617c6d23ffa394a6eb7ead56f1cfb701e0d90
Gerrit-PatchSet: 3
Gerrit-Project: Impala
Gerrit-Branch: cdh5-2.5.0_5.7.0
Gerrit-Owner: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Henry Robinson <he...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <ma...@cloudera.com>
Gerrit-Reviewer: Mostafa Mokhtar <mm...@cloudera.com>
Gerrit-HasComments: Yes