Posted to reviews@impala.apache.org by "Fang-Yu Rao (Code Review)" <ge...@cloudera.org> on 2019/06/25 18:44:29 UTC

[Impala-ASF-CR] IMPALA-8698: Disable row count estimate to avoid a flaky test

Fang-Yu Rao has uploaded this change for review. ( http://gerrit.cloudera.org:8080/13727 )


Change subject: IMPALA-8698: Disable row count estimate to avoid a flaky test
......................................................................

IMPALA-8698: Disable row count estimate to avoid a flaky test

Disabled the row count estimate for an hdfs table for the EE test
test_bloom_filters to avoid a flaky test due to a previous patchset
(IMPALA-7608).

Testing:
Have run the revised EE test on a local dev box.

Change-Id: I8342bc20a6b7935823d2a8bac2b42afaa1a8aae0
---
M testdata/workloads/functional-query/queries/QueryTest/bloom_filters.test
1 file changed, 1 insertion(+), 0 deletions(-)



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/13727/2
-- 
To view, visit http://gerrit.cloudera.org:8080/13727
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I8342bc20a6b7935823d2a8bac2b42afaa1a8aae0
Gerrit-Change-Number: 13727
Gerrit-PatchSet: 2
Gerrit-Owner: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-8698: Disable row count estimate to avoid a flaky test

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13727 )

Change subject: IMPALA-8698: Disable row count estimate to avoid a flaky test
......................................................................


Patch Set 2:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13727/2/testdata/workloads/functional-query/queries/QueryTest/bloom_filters.test
File testdata/workloads/functional-query/queries/QueryTest/bloom_filters.test:

http://gerrit.cloudera.org:8080/#/c/13727/2/testdata/workloads/functional-query/queries/QueryTest/bloom_filters.test@131
PS2, Line 131: SET DISABLE_HDFS_NUM_ROWS_ESTIMATE=1;
> Hi Bikram, thank you very much for your suggestion! According to your sugge
Yeah, I like the idea of making the test produce the same value for the same reason for all file formats (rather than exercising a different code path).

Anyway, I think it's fine that we merged the original fix, since at least that will fix the failing tests and should be fairly stable.

We could implement one of the above ideas in a follow-on patch if it works. It's better in principle I think, but at least the test is stable now.




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8342bc20a6b7935823d2a8bac2b42afaa1a8aae0
Gerrit-Change-Number: 13727
Gerrit-PatchSet: 2
Gerrit-Owner: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 26 Jun 2019 06:21:55 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8698: Disable row count estimate to avoid a flaky test

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13727 )

Change subject: IMPALA-8698: Disable row count estimate to avoid a flaky test
......................................................................


Patch Set 3: Verified+1



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8342bc20a6b7935823d2a8bac2b42afaa1a8aae0
Gerrit-Change-Number: 13727
Gerrit-PatchSet: 3
Gerrit-Owner: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Wed, 26 Jun 2019 01:58:09 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8698: Disable row count estimate to avoid a flaky test

Posted by "Fang-Yu Rao (Code Review)" <ge...@cloudera.org>.
Fang-Yu Rao has posted comments on this change. ( http://gerrit.cloudera.org:8080/13727 )

Change subject: IMPALA-8698: Disable row count estimate to avoid a flaky test
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13727/2/testdata/workloads/functional-query/queries/QueryTest/bloom_filters.test
File testdata/workloads/functional-query/queries/QueryTest/bloom_filters.test:

http://gerrit.cloudera.org:8080/#/c/13727/2/testdata/workloads/functional-query/queries/QueryTest/bloom_filters.test@131
PS2, Line 131: SET DISABLE_HDFS_NUM_ROWS_ESTIMATE=1;
> instead of disabling this feature, I would recommend changing the query to 
Hi Bikram, thank you very much for your suggestion! Following it, I ran the two SQL statements below on my local dev box. Note that the table "functional_seq_gzip.alltypes" is an HDFS table without stats.

1. The original statement in bloom_filters.test.

$IMPALA_HOME/bin/impala-shell.sh -q \
"SET RUNTIME_FILTER_MODE=GLOBAL;
SET RUNTIME_FILTER_WAIT_TIME_MS=30000;
SET RUNTIME_FILTER_MIN_SIZE=4KB;
SET RUNTIME_BLOOM_FILTER_SIZE=4KB;
select STRAIGHT_JOIN count(*)
from functional_seq_gzip.alltypes a
join [SHUFFLE]
functional_seq_gzip.alltypes b
on a.id = b.id; PROFILE;" -p | less

2. The revised statement following your suggestion (note that I changed the limit from 7,300 to 3, so each sub-query returns at most 3 rows instead of 7,300).

$IMPALA_HOME/bin/impala-shell.sh -q \
"SET RUNTIME_FILTER_MODE=GLOBAL;
SET RUNTIME_FILTER_WAIT_TIME_MS=30000;
SET RUNTIME_FILTER_MIN_SIZE=4KB;
SET RUNTIME_BLOOM_FILTER_SIZE=4KB;
select STRAIGHT_JOIN count(*)
from (select * from functional_seq_gzip.alltypes limit 3) a
join [SHUFFLE]
(select * from functional_seq_gzip.alltypes limit 3) b
on a.id = b.id; PROFILE;" -p | less

However, on my local dev box, the size of "Filter 0" is 8.00 KB in both cases, so restricting the sizes of the sub-queries does not seem to help here. But I might be wrong. Any further suggestions or ideas would be very welcome. :-)
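
For anyone repeating this comparison, a small script can pull the filter sizes out of the two profiles instead of paging through them with less. This is only a sketch: the helper name filter_sizes is hypothetical, and the line shape "Filter 0 (8.00 KB)" is assumed from the size quoted above; the real profile layout may differ between Impala versions.

```python
import re

# Assumed shape of a profile entry, e.g. "Filter 0 (8.00 KB)".
# This pattern is an assumption based on the size quoted in this thread,
# not a documented Impala profile format.
FILTER_SIZE_RE = re.compile(r"Filter (\d+) \((\d+(?:\.\d+)?) (B|KB|MB|GB)\)")


def filter_sizes(profile_text):
    """Return {filter_id: (value, unit)} for every filter size in the text."""
    sizes = {}
    for match in FILTER_SIZE_RE.finditer(profile_text):
        filter_id = int(match.group(1))
        # Keep the first occurrence; a profile may mention a filter repeatedly.
        sizes.setdefault(filter_id, (float(match.group(2)), match.group(3)))
    return sizes
```

Redirecting the output of each impala-shell command above to a file and calling filter_sizes() on both files gives two dictionaries that can be compared directly.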




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8342bc20a6b7935823d2a8bac2b42afaa1a8aae0
Gerrit-Change-Number: 13727
Gerrit-PatchSet: 3
Gerrit-Owner: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Tue, 25 Jun 2019 23:21:44 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8698: Disable row count estimate to avoid a flaky test

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13727 )

Change subject: IMPALA-8698: Disable row count estimate to avoid a flaky test
......................................................................


Patch Set 3:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/4542/ DRY_RUN=false



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8342bc20a6b7935823d2a8bac2b42afaa1a8aae0
Gerrit-Change-Number: 13727
Gerrit-PatchSet: 3
Gerrit-Owner: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Tue, 25 Jun 2019 20:25:18 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8698: Disable row count estimate to avoid a flaky test

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/13727 )

Change subject: IMPALA-8698: Disable row count estimate to avoid a flaky test
......................................................................

IMPALA-8698: Disable row count estimate to avoid a flaky test

Disabled the row count estimate for an hdfs table for the EE test
test_bloom_filters to avoid a flaky test due to a previous patchset
(IMPALA-7608).

Testing:
Have run the revised EE test on a local dev box.

Change-Id: I8342bc20a6b7935823d2a8bac2b42afaa1a8aae0
Reviewed-on: http://gerrit.cloudera.org:8080/13727
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M testdata/workloads/functional-query/queries/QueryTest/bloom_filters.test
1 file changed, 1 insertion(+), 0 deletions(-)

Approvals:
  Impala Public Jenkins: Looks good to me, approved; Verified


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I8342bc20a6b7935823d2a8bac2b42afaa1a8aae0
Gerrit-Change-Number: 13727
Gerrit-PatchSet: 4
Gerrit-Owner: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>

[Impala-ASF-CR] IMPALA-8698: Disable row count estimate to avoid a flaky test

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/13727 )

Change subject: IMPALA-8698: Disable row count estimate to avoid a flaky test
......................................................................


Patch Set 2: Code-Review+2



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8342bc20a6b7935823d2a8bac2b42afaa1a8aae0
Gerrit-Change-Number: 13727
Gerrit-PatchSet: 2
Gerrit-Owner: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Tue, 25 Jun 2019 20:25:06 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8698: Disable row count estimate to avoid a flaky test

Posted by "Bikramjeet Vig (Code Review)" <ge...@cloudera.org>.
Bikramjeet Vig has posted comments on this change. ( http://gerrit.cloudera.org:8080/13727 )

Change subject: IMPALA-8698: Disable row count estimate to avoid a flaky test
......................................................................


Patch Set 3:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/13727/2/testdata/workloads/functional-query/queries/QueryTest/bloom_filters.test
File testdata/workloads/functional-query/queries/QueryTest/bloom_filters.test:

http://gerrit.cloudera.org:8080/#/c/13727/2/testdata/workloads/functional-query/queries/QueryTest/bloom_filters.test@131
PS2, Line 131: SET DISABLE_HDFS_NUM_ROWS_ESTIMATE=1;
Instead of disabling this feature, I would recommend changing the query to have a deterministic row count like this:
select STRAIGHT_JOIN count(*) from (select * from alltypes limit 7300) a join [SHUFFLE] (select * from alltypes limit 7300) b on a.id = b.id;
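
To try the same query shape against several tables or limits, the statement can be generated rather than hand-edited. The sketch below is illustrative only: the helper name deterministic_join_query is hypothetical and not part of the Impala test framework, and whether the capped sub-queries actually stabilise the planner's estimate is the assumption being tested, not something the code guarantees.

```python
def deterministic_join_query(table, limit):
    """Build the suggested self-join with both inputs capped at `limit` rows.

    Hypothetical helper for experimentation; `table` is inserted verbatim,
    so pass only trusted table names.
    """
    if limit <= 0:
        raise ValueError("limit must be a positive integer")
    capped = "(select * from {0} limit {1})".format(table, limit)
    return ("select STRAIGHT_JOIN count(*) from {0} a "
            "join [SHUFFLE] {0} b on a.id = b.id").format(capped)
```

For example, deterministic_join_query("alltypes", 7300) yields the statement above without the trailing semicolon.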




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8342bc20a6b7935823d2a8bac2b42afaa1a8aae0
Gerrit-Change-Number: 13727
Gerrit-PatchSet: 3
Gerrit-Owner: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Tue, 25 Jun 2019 20:47:31 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-8698: Disable row count estimate to avoid a flaky test

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13727 )

Change subject: IMPALA-8698: Disable row count estimate to avoid a flaky test
......................................................................


Patch Set 2:

Build Successful 

https://jenkins.impala.io/job/gerrit-code-review-checks/3742/ : Initial code review checks passed. Use gerrit-verify-dryrun-external or gerrit-verify-dryrun to run full precommit tests.



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8342bc20a6b7935823d2a8bac2b42afaa1a8aae0
Gerrit-Change-Number: 13727
Gerrit-PatchSet: 2
Gerrit-Owner: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Tue, 25 Jun 2019 19:57:27 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-8698: Disable row count estimate to avoid a flaky test

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/13727 )

Change subject: IMPALA-8698: Disable row count estimate to avoid a flaky test
......................................................................


Patch Set 3: Code-Review+2



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I8342bc20a6b7935823d2a8bac2b42afaa1a8aae0
Gerrit-Change-Number: 13727
Gerrit-PatchSet: 3
Gerrit-Owner: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Bikramjeet Vig <bi...@cloudera.com>
Gerrit-Reviewer: Fang-Yu Rao <fa...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Tue, 25 Jun 2019 20:25:17 +0000
Gerrit-HasComments: No