Posted to issues-all@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2021/05/17 00:55:00 UTC

[jira] [Updated] (IMPALA-10704) test_retry_query_result_cacheing_failed and test_retry_query_set_query_in_flight_failed are flaky

     [ https://issues.apache.org/jira/browse/IMPALA-10704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Quanlong Huang updated IMPALA-10704:
------------------------------------
    Description: 
These two tests were added in IMPALA-10413. The failures were seen in:
 * https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/13844/
 * https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/13878/

The failures are
{code}
tests/custom_cluster/test_query_retries.py:761: in test_retry_query_result_cacheing_failed
    lambda: self.cluster.get_first_impalad().service.get_num_in_flight_queries() == 1)
tests/common/impala_test_suite.py:1128: in assert_eventually
    count, timeout_s, error_msg_str))
E   Timeout: Check failed to return True after 0 tries and 60 seconds

tests/custom_cluster/test_query_retries.py:775: in test_retry_query_set_query_in_flight_failed
    lambda: self.cluster.get_first_impalad().service.get_num_in_flight_queries() == 1)
tests/common/impala_test_suite.py:1128: in assert_eventually
    count, timeout_s, error_msg_str))
E   Timeout: Check failed to return True after 0 tries and 60 seconds
{code}
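
One possible reading of "after 0 tries" in the messages above is that the first evaluation of the check itself took longer than the timeout (for example, a request to the impalad that blocked), so the polling loop never got to a second attempt. This is an assumption and is not confirmed by the Jenkins output. The sketch below illustrates the effect with a hypothetical `poll` helper and a hypothetical `slow_false_check`; neither is Impala's actual `assert_eventually`.

```python
import time


def poll(timeout_s, period_s, condition):
    """Hypothetical polling helper (not Impala's assert_eventually).

    Returns (succeeded, tries), where tries counts the sleeps taken
    between evaluations of condition().
    """
    start = time.monotonic()
    tries = 0
    while True:
        if condition():
            return True, tries
        if time.monotonic() - start >= timeout_s:
            return False, tries
        time.sleep(period_s)
        tries += 1


def slow_false_check(delay_s):
    """Build a check that blocks for delay_s seconds and then returns False."""
    def check():
        time.sleep(delay_s)
        return False
    return check


# The check blocks for longer than the timeout, so the loop gives up
# before its first sleep and reports zero tries.
blocked_result = poll(0.05, 0.01, slow_false_check(0.1))

# A check that returns False quickly is retried several times instead.
fast_result = poll(0.05, 0.01, lambda: False)
```

Under this assumption, a count of 0 points at the check call itself being slow rather than at the condition merely staying false.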

Another problem: when I ran them manually locally, the original query hung in the RETRYING state. See the attached screenshot.
The test code is also problematic: it only checks that one query is in flight and never waits for that query to finish:
{code:python}
  @pytest.mark.execute_serially
  @CustomClusterTestSuite.with_args(
      impalad_args="--debug_actions=QUERY_RETRY_SET_RESULT_CACHE:FAIL",
      statestored_args="--statestore_heartbeat_frequency_ms=60000")
  def test_retry_query_result_cacheing_failed(self):
    """Test setting up results cacheing failed."""

    self.cluster.impalads[1].kill()
    query = "select count(*) from tpch_parquet.lineitem"
    self.hs2_client.set_configuration({'retry_failed_queries': 'true'})
    self.hs2_client.set_configuration_option('impala.resultset.cache.size', '1024')
    self.hs2_client.execute_async(query)
    self.assert_eventually(60, 0.1, 
        lambda: self.cluster.get_first_impalad().service.get_num_in_flight_queries() == 1)
{code}
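
A minimal sketch of the stricter check described above: first wait for the query to appear as in flight, then also wait for it to leave the in-flight set, so a query stuck in RETRYING fails the test instead of passing it. `FakeImpaladService`, `run_check`, and this standalone `assert_eventually` are hypothetical stand-ins written for illustration; they are not the Impala test framework's classes, and the actual fix may look different.

```python
import time


class Timeout(Exception):
    """Raised when a polled condition never becomes true."""


def assert_eventually(timeout_s, period_s, condition):
    """Poll condition() until it returns True or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    tries = 0
    while time.monotonic() < deadline:
        if condition():
            return tries
        tries += 1
        time.sleep(period_s)
    raise Timeout(
        "Check failed to return True after %d tries and %s seconds"
        % (tries, timeout_s))


class FakeImpaladService(object):
    """Hypothetical stand-in for the impalad debug service.

    Reports one in-flight query for the first polls_in_flight calls,
    then reports none, as if the query had finished.
    """

    def __init__(self, polls_in_flight):
        self._remaining = polls_in_flight

    def get_num_in_flight_queries(self):
        if self._remaining > 0:
            self._remaining -= 1
            return 1
        return 0


def run_check(service, timeout_s=1.0, period_s=0.01):
    # Step 1: the query shows up as in flight. This is all the quoted test asserts.
    assert_eventually(timeout_s, period_s,
                      lambda: service.get_num_in_flight_queries() == 1)
    # Step 2: the query also leaves the in-flight set, i.e. it did not hang.
    assert_eventually(timeout_s, period_s,
                      lambda: service.get_num_in_flight_queries() == 0)
    return True
```

With a service whose query eventually finishes, `run_check` returns `True`; with one whose query never leaves the in-flight set, the second step raises `Timeout`.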

  was:
These two tests were added in IMPALA-10413. The failures were seen in:
 * https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/13844/
 * https://jenkins.impala.io/job/ubuntu-16.04-from-scratch/13878/

When I ran them manually locally, the original query hung in the RETRYING state. See the attached screenshot.


> test_retry_query_result_cacheing_failed and test_retry_query_set_query_in_flight_failed are flaky
> -------------------------------------------------------------------------------------------------
>
>                 Key: IMPALA-10704
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10704
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Backend
>            Reporter: Quanlong Huang
>            Assignee: Quanlong Huang
>            Priority: Major
>         Attachments: test_retry_query_result_cacheing_failed.png
>


