Posted to commits@impala.apache.org by st...@apache.org on 2020/07/22 23:30:10 UTC

[impala] branch master updated (ee3f053 -> 4502097)

This is an automated email from the ASF dual-hosted git repository.

stakiar pushed a change to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git.


    from ee3f053  IMPALA-9964: Fill file descriptors properly in setFileMetadataFromFS
     new ea95691  IMPALA-9953: Shell should continue fetching even when 0 rows are returned
     new 4502097  IMPALA-9799: Add retries to TestFetchFirst get_num_in_flight_queries calls

The 2 revisions listed above as "new" are entirely new to this
repository and will be described in separate emails.  The revisions
listed as "add" were already present in the repository and have only
been added to this reference.


Summary of changes:
 shell/impala_shell.py                 |  2 +-
 tests/hs2/test_fetch_first.py         |  8 ++++++--
 tests/shell/test_shell_commandline.py | 16 ++++++++++++++++
 3 files changed, 23 insertions(+), 3 deletions(-)


[impala] 01/02: IMPALA-9953: Shell should continue fetching even when 0 rows are returned

Posted by st...@apache.org.

stakiar pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit ea95691b775ef0edd032a2590a119e6841cc2129
Author: Sahil Takiar <ta...@gmail.com>
AuthorDate: Mon Jul 20 12:35:26 2020 -0700

    IMPALA-9953: Shell should continue fetching even when 0 rows are returned
    
    The Impala shell stops fetching rows if it receives a batch that
    contains 0 rows. This is incorrect because a batch with 0 rows can be
    returned if the fetch request hits a timeout. Instead, the shell should
    rely on the value of has_rows / hasMoreRows to determine when to stop
    issuing fetch requests.
    
    Tests:
    * Added a regression test to test_shell_commandline.py
    * Ran all shell tests
    
    Change-Id: I5f8527aea9e433f8cf426435c0ba41355bbf9d88
    Reviewed-on: http://gerrit.cloudera.org:8080/16222
    Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 shell/impala_shell.py                 |  2 +-
 tests/shell/test_shell_commandline.py | 16 ++++++++++++++++
 2 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/shell/impala_shell.py b/shell/impala_shell.py
index e0d8026..8c18408 100755
--- a/shell/impala_shell.py
+++ b/shell/impala_shell.py
@@ -1183,7 +1183,7 @@ class ImpalaShell(cmd.Cmd, object):
         for rows in rows_fetched:
           # IMPALA-4418: Break out of the loop to prevent printing an unnecessary empty line.
           if len(rows) == 0:
-            break
+            continue
           self.output_stream.write(rows)
           num_rows += len(rows)
 
diff --git a/tests/shell/test_shell_commandline.py b/tests/shell/test_shell_commandline.py
index 1782990..673c7c2 100644
--- a/tests/shell/test_shell_commandline.py
+++ b/tests/shell/test_shell_commandline.py
@@ -1025,3 +1025,19 @@ class TestImpalaShell(ImpalaTestSuite):
       result = run_impala_shell_cmd(vector, ['-q', query, '-B', '--fetch_size', '512'])
       result_rows = result.stdout.strip().split('\n')
       assert len(result_rows) == 1024
+
+  def test_result_spooling_timeout(self, vector):
+    """Regression test for IMPALA-9953. Validates that if a fetch timeout occurs in the
+    middle of reading rows from Impala that all rows are still printed by the Impala
+    shell."""
+    # This query was stolen from __test_fetch_timeout in test_fetch_timeout.py. The query
+    # has a large delay between RowBatch production. So a fetch timeout will occur while
+    # fetching rows.
+    query_options = "set num_nodes=1; \
+                     set fetch_rows_timeout_ms=1; \
+                     set batch_size=1; \
+                     set spool_query_results=true;"
+    query = "select bool_col, avg(id) from functional.alltypes group by bool_col"
+    result = run_impala_shell_cmd(vector, ['-q', query_options + query, '-B'])
+    result_rows = result.stdout.strip().split('\n')
+    assert len(result_rows) == 2
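
The effect of replacing `break` with `continue` can be illustrated outside the shell. The following is a minimal sketch, not the shell's actual code: `fetch_batches`, `collect_rows`, and the `(rows, has_more_rows)` response pairs are hypothetical stand-ins for the fetch RPC results, assuming the generator itself stops once the server reports that no rows remain.

```python
def fetch_batches(responses):
    """Yield each batch of rows until the server reports no more rows.

    `responses` is a hypothetical list of (rows, has_more_rows) pairs that
    stands in for the results of successive fetch requests.
    """
    for rows, has_more_rows in responses:
        yield rows
        if not has_more_rows:
            return


def collect_rows(responses, stop_on_empty_batch):
    """Gather rows from all batches, optionally stopping at the first empty one."""
    collected = []
    for rows in fetch_batches(responses):
        if len(rows) == 0:
            if stop_on_empty_batch:
                break  # behaviour before the patch: an empty batch ends fetching
            continue  # behaviour after the patch: skip it and keep fetching
        collected.extend(rows)
    return collected


# A fetch timeout in the middle of the result set shows up as an empty batch
# even though more rows are still to come.
responses = [
    (["row-1"], True),
    ([], True),
    (["row-2"], False),
]

rows_before_patch = collect_rows(responses, stop_on_empty_batch=True)
rows_after_patch = collect_rows(responses, stop_on_empty_batch=False)
```

With `break`, the second row is silently dropped; with `continue`, termination is left to the has-more-rows signal, as the commit message describes.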


[impala] 02/02: IMPALA-9799: Add retries to TestFetchFirst get_num_in_flight_queries calls

Posted by st...@apache.org.

stakiar pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit 4502097a2d6bc9d41da185762d231f7fe413224b
Author: Sahil Takiar <ta...@gmail.com>
AuthorDate: Mon Jul 20 12:59:12 2020 -0700

    IMPALA-9799: Add retries to TestFetchFirst get_num_in_flight_queries calls
    
    The calls to get_num_in_flight_queries in TestFetchFirst are flaky
    because they expect the number of in flight queries to drop to 0
    immediately. This might not always be true, especially in ASAN builds
    where Impala is generally slower.
    
    This patch wraps the call to get_num_in_flight_queries in
    ImpalaTestSuite.assert_eventually, which adds retries to the calls to
    get_num_in_flight_queries.
    
    Testing:
    * Ran tests/hs2/test_fetch_first.py locally
    
    Change-Id: I349f861e8219e62311e8d4e0bfbd8f3618f0fa46
    Reviewed-on: http://gerrit.cloudera.org:8080/16218
    Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 tests/hs2/test_fetch_first.py | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)
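
The retry pattern the patch relies on can be sketched as a standalone polling helper. This is an illustrative assumption, not the body of ImpalaTestSuite.assert_eventually: only the argument order (timeout, period, condition, error message) is taken from the calls in the diff below, and the in-flight counter is simulated.

```python
import time


def assert_eventually(timeout_s, period_s, condition, error_msg=None):
    """Poll `condition` every `period_s` seconds until it returns True.

    Raises AssertionError if it is still False once `timeout_s` seconds
    have passed. Hypothetical helper written for illustration only.
    """
    deadline = time.monotonic() + timeout_s
    while not condition():
        if time.monotonic() >= deadline:
            raise AssertionError(error_msg or "condition did not become true in time")
        time.sleep(period_s)


# Simulated metric: the number of in-flight queries drops to 0 only after a
# couple of polls, which is what made the one-shot assert flaky.
_samples = [2, 1, 0]


def get_num_in_flight_queries():
    """Return the next simulated sample, repeating the last one forever."""
    return _samples.pop(0) if len(_samples) > 1 else _samples[0]


assert_eventually(1, 0.01,
                  lambda: 0 == get_num_in_flight_queries(),
                  "Num in flight queries did not reach 0")
```

A one-shot `assert 0 == get_num_in_flight_queries()` would fail on the first sample here, whereas the polling version passes once the count settles.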

diff --git a/tests/hs2/test_fetch_first.py b/tests/hs2/test_fetch_first.py
index 9358092..b925c66 100644
--- a/tests/hs2/test_fetch_first.py
+++ b/tests/hs2/test_fetch_first.py
@@ -50,7 +50,9 @@ class TestFetchFirst(HS2TestSuite):
         TCLIService.TStatusCode.ERROR_STATUS,
         "Invalid value 'bad_number' for 'impala.resultset.cache.size' option")
     self.__verify_num_cached_rows(0)
-    assert 0 == impalad.get_num_in_flight_queries()
+    self.assert_eventually(30, 1,
+        lambda: 0 == impalad.get_num_in_flight_queries(),
+        "Num in flight queries did not reach 0")
 
     # Test that a result-cache size exceeding the per-Impalad maximum returns an error.
     # The default maximum result-cache size is 100000.
@@ -60,7 +62,9 @@ class TestFetchFirst(HS2TestSuite):
         TCLIService.TStatusCode.ERROR_STATUS,
         "Requested result-cache size of 100001 exceeds Impala's maximum of 100000")
     self.__verify_num_cached_rows(0)
-    assert 0 == impalad.get_num_in_flight_queries()
+    self.assert_eventually(30, 1,
+        lambda: 0 == impalad.get_num_in_flight_queries(),
+        "Num in flight queries did not reach 0")
 
   def __verify_num_cached_rows(self, num_cached_rows):
     """Asserts that Impala has the given number of rows in its result set cache. Also