Posted to commits@impala.apache.org by st...@apache.org on 2024/01/16 01:01:24 UTC

(impala) 01/03: IMPALA-12698: Restrict check_deleted_file_fd() for fixing flaky tests

This is an automated email from the ASF dual-hosted git repository.

stigahuang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit 02d004a12166c5549d591b7352ec1463c5ee8ba3
Author: Yida Wu <yi...@cloudera.com>
AuthorDate: Sun Jan 14 17:51:02 2024 -0800

    IMPALA-12698: Restrict check_deleted_file_fd() for fixing flaky tests
    
    The check_deleted_file_fd() function introduced in IMPALA-12681 aims
    to detect a remote-spilling bug in which local temporary file
    handles were not released after deletion. However, the tests that
    use this function are flaky in exhaustive builds: occasionally,
    handles to some HDFS files are not released promptly after deletion.
    Locally, I observed that these files are eventually removed from
    /proc/xx/fd within a few minutes. The reason is not yet clear.
    
    To fix the flaky build failures, this patch restricts
    check_deleted_file_fd() to files whose paths contain the keyword
    "scratch". Because the HDFS files are eventually removed and the
    issue does not seem urgent, a separate Jira will be filed to track
    and investigate this behavior.
    
    Testing:
    Reran the tests a couple of times; they passed.
    
    Change-Id: I55f5aa1cdbc0c74f6c7ebd25575e71d2b238bf98
    Reviewed-on: http://gerrit.cloudera.org:8080/20898
    Reviewed-by: Csaba Ringhofer <cs...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 tests/custom_cluster/test_scratch_disk.py | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/tests/custom_cluster/test_scratch_disk.py b/tests/custom_cluster/test_scratch_disk.py
index a00fff04d..d1ab74fc8 100644
--- a/tests/custom_cluster/test_scratch_disk.py
+++ b/tests/custom_cluster/test_scratch_disk.py
@@ -283,7 +283,12 @@ class TestScratchDir(CustomClusterTestSuite):
 
   def find_deleted_files_in_fd(self, pid):
     fd_path = "/proc/{}/fd".format(pid)
-    command = "find {} -ls | grep '(deleted)'".format(fd_path)
+    # Look for entries that contain both 'scratch' and '(deleted)'.
+    # The match is limited to 'scratch' because, as seen in IMPALA-12698,
+    # the process may still hold references to deleted HDFS files, which
+    # are released within a few minutes. This restriction avoids
+    # false positives.
+    command = "find {} -ls | grep -E 'scratch.*(deleted)'".format(fd_path)
     try:
       result = subprocess.check_output(command, shell=True)
       return result.strip()
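
For illustration only, the same filter can be sketched in pure Python without shelling out to find and grep. This is not part of the patch: the helper names, the keyword parameter, and the example paths are hypothetical, and the sketch assumes a Linux /proc filesystem where the link target of a deleted file ends in " (deleted)". Note that under grep -E the unescaped parentheses in 'scratch.*(deleted)' act as a group, so that pattern matches the word "deleted" with or without literal parentheses; the sketch below checks for the literal suffix instead.

```python
import os

# Suffix the Linux kernel appends to /proc/<pid>/fd link targets
# when the underlying file has been unlinked.
DELETED_SUFFIX = " (deleted)"


def is_deleted_scratch_target(target, keyword="scratch"):
    """Return True if a link target names a deleted file whose path
    contains `keyword` (hypothetical helper, not from the patch)."""
    if not target.endswith(DELETED_SUFFIX):
        return False
    path = target[:-len(DELETED_SUFFIX)]
    return keyword in path


def find_deleted_scratch_fds(pid, keyword="scratch"):
    """List (fd, target) pairs for deleted files still held open by `pid`.

    Assumes Linux and permission to read /proc/<pid>/fd.
    """
    fd_dir = "/proc/{}/fd".format(pid)
    matches = []
    for name in os.listdir(fd_dir):
        try:
            target = os.readlink(os.path.join(fd_dir, name))
        except OSError:
            # The descriptor was closed between listdir() and readlink().
            continue
        if is_deleted_scratch_target(target, keyword):
            matches.append((name, target))
    return matches
```

A test could then assert that find_deleted_scratch_fds(pid) returns an empty list, which ignores lingering HDFS handles in the same way the restricted grep pattern does.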