Posted to reviews@impala.apache.org by "Vuk Ercegovac (Code Review)" <ge...@cloudera.org> on 2018/06/05 02:18:27 UTC

[Impala-ASF-CR] IMPALA-6956: deflake and logging for query expiration test

Vuk Ercegovac has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10602 )


Change subject: IMPALA-6956: deflake and logging for query_expiration test
......................................................................

IMPALA-6956: deflake and logging for query_expiration test

There was a recent flake where the number of in-flight queries
that were executing differed from the expected number.
The reason for the false negative is that one of the queries
expired before the check for in-flight queries: it took too long
to issue the queries.

This change modifies the timeout for the expired query to not expire
so quickly. Additional logging is added to the check for in-flight
queries so that we can distinguish the case of too few queries vs.
queries that have the wrong state.

Change-Id: I01a8762d28ad920b9ec8a0b1b82469618c66768f
---
M tests/custom_cluster/test_query_expiration.py
1 file changed, 7 insertions(+), 4 deletions(-)
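
As an illustration of the logging described in the commit message, the following is a minimal sketch of a check that reports separately when there are too few (or too many) in-flight queries and when queries are present but in the wrong state. The function name, the dict keys "query_id" and "state", and the "RUNNING" default are assumptions made for this example; they are not taken from the actual test.

```python
import logging

LOG = logging.getLogger("test_query_expiration_sketch")


def describe_in_flight_mismatch(queries, expected_count, expected_state="RUNNING"):
    """Return None if the in-flight queries look as expected, else a message.

    `queries` is a list of dicts with hypothetical "query_id" and "state"
    keys; the real test may expose this information differently.
    """
    # Case 1: the number of in-flight queries is off.
    if len(queries) != expected_count:
        msg = "expected %d in-flight queries, found %d: %s" % (
            expected_count, len(queries), [q["query_id"] for q in queries])
        LOG.error(msg)
        return msg

    # Case 2: the count is right, but some queries are in the wrong state.
    wrong = [(q["query_id"], q["state"]) for q in queries
             if q["state"] != expected_state]
    if wrong:
        msg = "wrong state (expected %s): %s" % (expected_state, wrong)
        LOG.error(msg)
        return msg
    return None
```

A test could then assert that the function returns None and use the returned message as the assertion text, so a failure shows which of the two cases occurred.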



  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/10602/1
-- 
To view, visit http://gerrit.cloudera.org:8080/10602
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I01a8762d28ad920b9ec8a0b1b82469618c66768f
Gerrit-Change-Number: 10602
Gerrit-PatchSet: 1
Gerrit-Owner: Vuk Ercegovac <ve...@cloudera.com>

[Impala-ASF-CR] IMPALA-6956: deflake and logging for query expiration test

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10602 )

Change subject: IMPALA-6956: deflake and logging for query_expiration test
......................................................................


Patch Set 1:

(1 comment)

Thanks for looking at this, sorry for creating the problem in the first place.

http://gerrit.cloudera.org:8080/#/c/10602/1/tests/custom_cluster/test_query_expiration.py
File tests/custom_cluster/test_query_expiration.py:

http://gerrit.cloudera.org:8080/#/c/10602/1/tests/custom_cluster/test_query_expiration.py@58
PS1, Line 58:     client.execute("SET EXEC_TIME_LIMIT_S=3")
I understand this fixes one of the problems (this query expires too soon), but I don't see how this change fixes the query that was slow to start. Should we be waiting for all four queries to get into the RUNNING state?
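
For illustration, waiting for all queries to reach the RUNNING state could be sketched as a polling loop with a deadline. This is only a sketch under stated assumptions: `get_states` is a hypothetical callable that returns the current state string of each in-flight query, and the timeout and polling interval are arbitrary values, not ones used by the real test.

```python
import time


def wait_for_all_running(get_states, expected_count, timeout_s=10.0, interval_s=0.1):
    """Poll until `expected_count` queries are all RUNNING, or time out.

    `get_states` is a hypothetical callable returning a list of state
    strings, one per in-flight query. Returns True on success and False
    if the deadline passes first.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        states = get_states()
        if len(states) == expected_count and all(s == "RUNNING" for s in states):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

Note that a query with a short time limit can still expire while the loop waits for the others, so a wait like this narrows the race but does not remove it.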




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I01a8762d28ad920b9ec8a0b1b82469618c66768f
Gerrit-Change-Number: 10602
Gerrit-PatchSet: 1
Gerrit-Owner: Vuk Ercegovac <ve...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Comment-Date: Tue, 05 Jun 2018 03:24:59 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-6956: deflake and logging for query expiration test

Posted by "Vuk Ercegovac (Code Review)" <ge...@cloudera.org>.
Hello Tim Armstrong, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/10602

to look at the new patch set (#2).

Change subject: IMPALA-6956: deflake and logging for query_expiration test
......................................................................

IMPALA-6956: deflake and logging for query_expiration test

Change-Id: I01a8762d28ad920b9ec8a0b1b82469618c66768f
---
M tests/custom_cluster/test_query_expiration.py
1 file changed, 9 insertions(+), 6 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/02/10602/2

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I01a8762d28ad920b9ec8a0b1b82469618c66768f
Gerrit-Change-Number: 10602
Gerrit-PatchSet: 2
Gerrit-Owner: Vuk Ercegovac <ve...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Vuk Ercegovac <ve...@cloudera.com>

[Impala-ASF-CR] IMPALA-6956: deflake and logging for query expiration test

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10602 )

Change subject: IMPALA-6956: deflake and logging for query_expiration test
......................................................................


Patch Set 2: Verified+1



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I01a8762d28ad920b9ec8a0b1b82469618c66768f
Gerrit-Change-Number: 10602
Gerrit-PatchSet: 2
Gerrit-Owner: Vuk Ercegovac <ve...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Vuk Ercegovac <ve...@cloudera.com>
Gerrit-Comment-Date: Wed, 06 Jun 2018 03:19:46 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-6956: deflake and logging for query expiration test

Posted by "Tim Armstrong (Code Review)" <ge...@cloudera.org>.
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10602 )

Change subject: IMPALA-6956: deflake and logging for query_expiration test
......................................................................


Patch Set 2: Code-Review+2



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I01a8762d28ad920b9ec8a0b1b82469618c66768f
Gerrit-Change-Number: 10602
Gerrit-PatchSet: 2
Gerrit-Owner: Vuk Ercegovac <ve...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Vuk Ercegovac <ve...@cloudera.com>
Gerrit-Comment-Date: Tue, 05 Jun 2018 23:48:06 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-6956: deflake and logging for query expiration test

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has submitted this change and it was merged. ( http://gerrit.cloudera.org:8080/10602 )

Change subject: IMPALA-6956: deflake and logging for query_expiration test
......................................................................

IMPALA-6956: deflake and logging for query_expiration test

Change-Id: I01a8762d28ad920b9ec8a0b1b82469618c66768f
Reviewed-on: http://gerrit.cloudera.org:8080/10602
Reviewed-by: Tim Armstrong <ta...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
M tests/custom_cluster/test_query_expiration.py
1 file changed, 9 insertions(+), 6 deletions(-)

Approvals:
  Tim Armstrong: Looks good to me, approved
  Impala Public Jenkins: Verified


Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I01a8762d28ad920b9ec8a0b1b82469618c66768f
Gerrit-Change-Number: 10602
Gerrit-PatchSet: 3
Gerrit-Owner: Vuk Ercegovac <ve...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Vuk Ercegovac <ve...@cloudera.com>

[Impala-ASF-CR] IMPALA-6956: deflake and logging for query expiration test

Posted by "Vuk Ercegovac (Code Review)" <ge...@cloudera.org>.
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10602 )

Change subject: IMPALA-6956: deflake and logging for query_expiration test
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10602/1/tests/custom_cluster/test_query_expiration.py
File tests/custom_cluster/test_query_expiration.py:

http://gerrit.cloudera.org:8080/#/c/10602/1/tests/custom_cluster/test_query_expiration.py@58
PS1, Line 58:     client.execute("SET EXEC_TIME_LIMIT_S=3")
> I understand this fixes one of the problem - this query expires too soon, b
The problem I saw was that it just took longer than assumed to get these four queries (and SET calls) issued. No particular query was slower than usual; their combined time was slower (perhaps an overloaded instance?). The easiest, though not foolproof, option is to loosen the time limit so that the query on L59 expires later.

Waiting for a state would require that the "waiter" start immediately and record events as they occur. However, many of the ways we observe the state also involve retries and timeouts, which can be error-prone when making timing assumptions. Perhaps I can just extract start/stop/state for all queries and determine correctness via interval ordering?
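
One way to read the interval-ordering idea is sketched below: collect each query's start time, end time, time limit, and whether it expired, then check the intervals after the fact instead of timing the checks while the queries run. The record layout, the field names, and the `slack_s` tolerance are assumptions made for this example; the real test may not expose this data in this form.

```python
from collections import namedtuple

# Hypothetical per-query record; times are in seconds on a shared clock.
QueryRecord = namedtuple("QueryRecord", "query_id start_s end_s limit_s expired")


def find_violations(records, slack_s=1.0):
    """Return (query_id, reason) pairs for queries whose interval is inconsistent.

    An expired query must have run for at least its limit, and a query with
    a limit that did not expire must not have run past the limit plus slack.
    """
    bad = []
    for r in records:
        ran_s = r.end_s - r.start_s
        if r.limit_s is None:
            # A query without a limit should never be reported as expired.
            if r.expired:
                bad.append((r.query_id, "expired without a limit"))
            continue
        if r.expired and ran_s < r.limit_s:
            bad.append((r.query_id, "expired early"))
        elif not r.expired and ran_s > r.limit_s + slack_s:
            bad.append((r.query_id, "outlived its limit"))
    return bad
```

Because the check runs on recorded intervals, it does not depend on how long the test took to issue the queries, which was the source of the flake discussed here.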




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I01a8762d28ad920b9ec8a0b1b82469618c66768f
Gerrit-Change-Number: 10602
Gerrit-PatchSet: 1
Gerrit-Owner: Vuk Ercegovac <ve...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Vuk Ercegovac <ve...@cloudera.com>
Gerrit-Comment-Date: Tue, 05 Jun 2018 06:23:44 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-6956: deflake and logging for query expiration test

Posted by "Impala Public Jenkins (Code Review)" <ge...@cloudera.org>.
Impala Public Jenkins has posted comments on this change. ( http://gerrit.cloudera.org:8080/10602 )

Change subject: IMPALA-6956: deflake and logging for query_expiration test
......................................................................


Patch Set 2:

Build started: https://jenkins.impala.io/job/gerrit-verify-dryrun/2603/



Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I01a8762d28ad920b9ec8a0b1b82469618c66768f
Gerrit-Change-Number: 10602
Gerrit-PatchSet: 2
Gerrit-Owner: Vuk Ercegovac <ve...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <im...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Vuk Ercegovac <ve...@cloudera.com>
Gerrit-Comment-Date: Tue, 05 Jun 2018 23:52:00 +0000
Gerrit-HasComments: No

[Impala-ASF-CR] IMPALA-6956: deflake and logging for query expiration test

Posted by "Vuk Ercegovac (Code Review)" <ge...@cloudera.org>.
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10602 )

Change subject: IMPALA-6956: deflake and logging for query_expiration test
......................................................................


Patch Set 1:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/10602/1/tests/custom_cluster/test_query_expiration.py
File tests/custom_cluster/test_query_expiration.py:

http://gerrit.cloudera.org:8080/#/c/10602/1/tests/custom_cluster/test_query_expiration.py@58
PS1, Line 58:     client.execute("SET EXEC_TIME_LIMIT_S=3")
> The problem I saw was that it just took longer than assumed to get these fo
Ah, you're right about the second query. There were two tests: one failed with one query off, and the other with both of these queries off.
I updated the other query as well.




Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I01a8762d28ad920b9ec8a0b1b82469618c66768f
Gerrit-Change-Number: 10602
Gerrit-PatchSet: 1
Gerrit-Owner: Vuk Ercegovac <ve...@cloudera.com>
Gerrit-Reviewer: Tim Armstrong <ta...@cloudera.com>
Gerrit-Reviewer: Vuk Ercegovac <ve...@cloudera.com>
Gerrit-Comment-Date: Tue, 05 Jun 2018 23:42:44 +0000
Gerrit-HasComments: Yes