Posted to issues-all@impala.apache.org by "ASF subversion and git services (Jira)" <ji...@apache.org> on 2021/08/18 14:34:00 UTC
[jira] [Commented] (IMPALA-10783) run_and_verify_query_cancellation_test flakiness and improper error handling in TestImpalaShell

    [ https://issues.apache.org/jira/browse/IMPALA-10783?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17401101#comment-17401101 ] 

ASF subversion and git services commented on IMPALA-10783:
----------------------------------------------------------

Commit a9c8166694b285188df945707469fce2c83d24ee in impala's branch refs/heads/master from Bikramjeet Vig
[ https://gitbox.apache.org/repos/asf?p=impala.git;h=a9c8166 ]

IMPALA-10783: Fixed flakiness in run_and_verify_query_cancellation_test

The issue was that when impala-shell is started in a separate
process and the test encounters an error, the process lingers on.
A long-running query can then hold on to resources and potentially
affect other tests running on the Impala cluster.
This patch makes sure that the impala-shell process is killed
regardless of any errors encountered.

Change-Id: I9f6d22d639921051cde5675fae1845bedb61c8cc
Reviewed-on: http://gerrit.cloudera.org:8080/17768
Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
Tested-by: Impala Public Jenkins <im...@cloudera.com>
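
The cleanup pattern the commit message describes can be sketched as follows. This is a minimal illustration, not the code from the patch: the helper name run_with_guaranteed_cleanup and the check_fn callback are hypothetical, and only standard-library subprocess calls are used. The point is that the child process is killed in a finally block, so it cannot outlive a failed assertion.

```python
import subprocess


def run_with_guaranteed_cleanup(cmd, check_fn):
    """Start cmd as a child process, run check_fn(proc), and always kill the child.

    Hypothetical helper for illustration only; cmd stands in for an
    impala-shell command line and check_fn for the test's verification steps.
    """
    proc = subprocess.Popen(cmd)
    try:
        return check_fn(proc)
    finally:
        # This runs whether check_fn returned normally or raised.
        if proc.poll() is None:
            proc.kill()
        # Reap the child so it does not linger as a zombie.
        proc.wait()
```

Any exception raised by check_fn still propagates to the caller, so the test fails as before; the only difference is that the child process is gone by the time the failure is reported.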


> run_and_verify_query_cancellation_test flakiness and improper error handling in TestImpalaShell
> -----------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-10783
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10783
>             Project: IMPALA
>          Issue Type: Bug
>    Affects Versions: Impala 4.0.0
>            Reporter: Bikramjeet Vig
>            Assignee: Bikramjeet Vig
>            Priority: Major
>              Labels: flaky-test
>
> Some tests in TestImpalaShell run impala-shell in a separate process but don't handle the case where the test fails and the impala-shell process lingers on.
> One such test, run_and_verify_query_cancellation_test, failed due to flakiness. Because it ran a query that returned a large result, the impala-shell process lingered on while fetching results. The query held on to resources and starved the cluster of memory, which caused other tests to fail because not enough memory was available.
> The flakiness in run_and_verify_query_cancellation_test was:
> {noformat}
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/shell/test_shell_commandline.py:414: in test_query_cancellation_during_wait_to_finish
>     self.run_and_verify_query_cancellation_test(vector, stmt, "RUNNING")
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/shell/test_shell_commandline.py:422: in run_and_verify_query_cancellation_test
>     wait_for_query_state(vector, stmt, cancel_at_state)
> /data/jenkins/workspace/impala-asf-master-core-asan/repos/Impala/tests/shell/util.py:330: in wait_for_query_state
>     raise Exception(exc_text)
> E   Exception: The found in flight query is not the one under test: set all
> {noformat}
> The test checked for running queries too soon, while impala-shell was still starting up. impala-shell runs "set all" when it starts; the test picked up that query and raised an error because it was not the query under test.
> The lingering query caused other tests to fail with errors like:
> {noformat}
> query_test/test_tpcds_queries.py:107: in test_tpcds_q18a
>     self.run_test_case(self.get_workload() + '-q18a', vector)
> common/impala_test_suite.py:678: in run_test_case
>     result = exec_fn(query, user=test_section.get('USER', '').strip() or None)
> common/impala_test_suite.py:616: in __exec_in_impala
>     result = self.__execute_query(target_impalad_client, query, user=user)
> common/impala_test_suite.py:936: in __execute_query
>     return impalad_client.execute(query, user=user)
> common/impala_connection.py:205: in execute
>     return self.__beeswax_client.execute(sql_stmt, user=user)
> beeswax/impala_beeswax.py:189: in execute
>     handle = self.__execute_query(query_string.strip(), user=user)
> beeswax/impala_beeswax.py:367: in __execute_query
>     self.wait_for_finished(handle)
> beeswax/impala_beeswax.py:388: in wait_for_finished
>     raise ImpalaBeeswaxException("Query aborted:" + error_log, None)
> E   ImpalaBeeswaxException: ImpalaBeeswaxException:
> E    Query aborted:Failed to get minimum memory reservation of 452.19 MB on daemon impala-ec2-centos74-m5-4xlarge-ondemand-191d.vpc.cloudera.com:27002 for query 394b7f96d554f99c:6882496c00000000 due to following error: Failed to increase reservation by 452.19 MB because it would exceed the applicable reservation limit for the "Process" ReservationTracker: reservation_limit=10.20 GB reservation=9.91 GB used_reservation=0 child_reservations=9.91 GB
> E   The top 5 queries that allocated memory under this tracker are:
> E   Query(fa4ece9474a3f865:1b284e6700000000): Reservation=9.60 GB ReservationLimit=9.60 GB OtherMemory=118.01 MB Total=9.71 GB Peak=9.71 GB
> E   Query(534d07950247ae68:6f5a410d00000000): Reservation=123.50 MB ReservationLimit=9.60 GB OtherMemory=2.68 MB Total=126.18 MB Peak=317.02 MB
> E   Query(2e4f087aa8263e23:e697d8e800000000): Reservation=50.81 MB ReservationLimit=9.60 GB OtherMemory=42.62 MB Total=93.43 MB Peak=173.74 MB
> E   Query(6e459d892dfa5050:5959219b00000000): Reservation=28.88 MB ReservationLimit=9.60 GB OtherMemory=18.77 MB Total=47.64 MB Peak=53.11 MB
> E   Query(ad455bea2e0adc64:2b0bbf3500000000): Reservation=17.94 MB ReservationLimit=9.60 GB OtherMemory=15.22 MB Total=33.16 MB Peak=163.99 MB
> E   
> E   
> E   
> E   
> E   
> E   Memory is likely oversubscribed. Reducing query concurrency or configuring admission control may help avoid this error.
> {noformat}
> Logs confirmed that fa4ece9474a3f865:1b284e6700000000 is the query id of the query that run_and_verify_query_cancellation_test ran.
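
The startup race described above (the check seeing "set all" before the test's own statement is in flight) could in principle be tolerated by polling and skipping unrelated queries rather than failing on the first mismatch. The sketch below is an assumption-laden illustration, not what the commit changed and not the real wait_for_query_state in tests/shell/util.py: the function name, the list_in_flight callable, and the dict keys "stmt" and "state" are all hypothetical.

```python
import time


def wait_for_matching_query(list_in_flight, stmt, state, timeout_s=10.0, interval_s=0.1):
    """Poll until stmt is in flight in the given state, ignoring other queries.

    list_in_flight is a hypothetical callable returning a list of dicts with
    "stmt" and "state" keys; it stands in for however the test lists queries.
    """
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        for query in list_in_flight():
            # Unrelated statements, such as the shell's startup "set all",
            # are skipped instead of being treated as an error.
            if query["stmt"] == stmt and query["state"] == state:
                return query
        time.sleep(interval_s)
    raise TimeoutError("query %r did not reach state %s" % (stmt, state))
```

The timeout keeps the check bounded: if the statement under test never shows up, the helper raises instead of waiting forever.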


