Posted to issues-all@impala.apache.org by "Quanlong Huang (Jira)" <ji...@apache.org> on 2020/08/10 06:52:00 UTC

[jira] [Created] (IMPALA-10065) Hit DCHECK when retrying a query in FINISHED state

Quanlong Huang created IMPALA-10065:
---------------------------------------

             Summary: Hit DCHECK when retrying a query in FINISHED state
                 Key: IMPALA-10065
                 URL: https://issues.apache.org/jira/browse/IMPALA-10065
             Project: IMPALA
          Issue Type: Sub-task
            Reporter: Quanlong Huang
            Assignee: Quanlong Huang


Queries go into the FINISHED state as soon as rows are available, regardless of whether the client has fetched any results. If the client hasn't called fetch on the query, the query should still be retryable. However, retrying such a query hits a DCHECK at https://github.com/apache/impala/blob/a0057788c5c2300f58b6615a27116b8331171e06/be/src/runtime/query-driver.cc#L131-L135

This can be reproduced by modifying test_retries_from_cancellation_pool in tests/custom_cluster/test_query_retries.py:
{code}
diff --git a/tests/custom_cluster/test_query_retries.py b/tests/custom_cluster/test_query_retries.py
index 54f2334..ae57068 100644
--- a/tests/custom_cluster/test_query_retries.py
+++ b/tests/custom_cluster/test_query_retries.py
@@ -69,21 +69,23 @@ class TestQueryRetries(CustomClusterTestSuite):
     # The following query executes slowly, and does minimal TransmitData RPCs, so it is
     # likely that the statestore detects that the impalad has been killed before a
     # TransmitData RPC has occurred.
-    query = "select count(*) from functional.alltypes where bool_col = sleep(50)"
+    query = "select count(*) from functional.alltypestiny union all select count(*) from functional.alltypes where bool_col = sleep(50)"
 
     # Launch the query, wait for it to start running, and then kill an impalad.
     handle = self.execute_query_async(query,
         query_options={'retry_failed_queries': 'true'})
-    self.wait_for_state(handle, self.client.QUERY_STATES['RUNNING'], 60)
+    self.wait_for_state(handle, self.client.QUERY_STATES['FINISHED'], 60)
 
     # Kill a random impalad (but not the one executing the actual query).
     self.__kill_random_impalad()
+    time.sleep(10)
 
     # Validate the query results.
     results = self.client.fetch(query, handle)
     assert results.success
-    assert len(results.data) == 1
-    assert int(results.data[0]) == 3650
+    assert len(results.data) == 2
+    assert int(results.data[0]) == 8
+    assert int(results.data[1]) == 3650
 
     # Validate the live exec summary.
     retried_query_id = self.__get_retried_query_id_from_summary(handle)
{code}
The change chooses another query that has two UNION operands. The query enters the FINISHED state once the first operand finishes, while the second operand is still running. When we then kill an impalad, the coordinator hits the DCHECK.

We should support retrying a FINISHED (but actually running) query that hasn't returned any results. This is required by IMPALA-9225.
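
The intended rule can be sketched as follows. This is only an illustration in Python, not Impala's actual C++ code: QueryState, is_retryable and rows_fetched are hypothetical names introduced here, and the sketch assumes retryability depends only on the query state and on whether any rows have been returned to the client.

```python
from enum import Enum


class QueryState(Enum):
    # Hypothetical, simplified subset of client-visible query states.
    RUNNING = "RUNNING"
    FINISHED = "FINISHED"
    EXCEPTION = "EXCEPTION"


def is_retryable(state, rows_fetched):
    """Return True if a failed query could still be retried transparently.

    Assumption: once any row has reached the client, a retry could return
    inconsistent results, so the query is no longer retryable.
    """
    if rows_fetched > 0:
        return False
    # FINISHED only means rows are available, not that the client has
    # consumed them, so it is treated the same as RUNNING here.
    return state in (QueryState.RUNNING, QueryState.FINISHED)
```

Under this rule the reproduction above (FINISHED state, no fetch issued yet) would be retried instead of hitting the DCHECK.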


