Posted to commits@impala.apache.org by st...@apache.org on 2023/08/09 23:43:07 UTC
[impala] 01/02: IMPALA-12311: Remove extra newlines in the updated golden file

This is an automated email from the ASF dual-hosted git repository.

stigahuang pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/impala.git

commit 826da820347387bd0587324cd29d44c9a29dd166
Author: Fang-Yu Rao <fa...@cloudera.com>
AuthorDate: Mon Jul 31 16:52:17 2023 -0700

    IMPALA-12311: Remove extra newlines in the updated golden file
    
    This patch removes extra newlines added to subsections when we parse
    test queries in an end-to-end test file. Specifically, in
    parse_test_file_text(), we append an extra newline in every subsection
    of a test query, resulting in one extra newline in the updated golden
    file if we add '--update_results' when running this test file to produce
    the updated golden file. This can be seen by looking at the updated
    golden file under $IMPALA_HOME/logs/ee_tests after running the
    following command.
    
    $IMPALA_HOME/bin/impala-py.test \
    --update_results \
    $IMPALA_HOME/tests/query_test/test_tpcds_queries.py::TestTpcdsDecimalV2Query::test_tpcds_q1
    
    The extra newline is needed only when verifying the RESULTS,
    DML_RESULTS and ERRORS subsections, where it disambiguates the case of
    no lines from a single line with no text. It is not needed once the
    verification is done.
    
    To remove these extra newlines, we strip them in write_test_file(),
    where the updated golden file is written, since this requires fewer
    changes. An alternative would be to add the extra newline only for the
    3 subsections mentioned above and also to remove the last newline added
    in join_section_lines(), which is called when the actual contents do
    not match the expected contents specified in the original golden file
    in the subsections of ERRORS, TYPES, LABELS, RESULTS and DML_RESULTS.
    Additional changes to TestTestFileParser would also be required if we
    adopted the alternative.
    
    Testing:
     - Verified that the extra newlines are removed after this patch.
    
    Change-Id: Ic7668a437267bd76afecba8f87ead32d82580414
    Reviewed-on: http://gerrit.cloudera.org:8080/20272
    Reviewed-by: Impala Public Jenkins <im...@cloudera.com>
    Tested-by: Impala Public Jenkins <im...@cloudera.com>
---
 tests/util/test_file_parser.py | 13 +++++++++++--
 1 file changed, 11 insertions(+), 2 deletions(-)
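
The behaviour described in the commit message can be illustrated with a small standalone sketch. It is not the code in tests/util/test_file_parser.py: join_section_lines() is copied from the diff, but split_section_lines_sketch() and render_section() are hypothetical stand-ins written for this illustration, and the delimiter values "----" and "====" are assumed here. The sketch shows why the trailing newline separates "no lines" from "one empty line", and how stripping at write time avoids the blank line before the section delimiter.

```python
# Assumed delimiter values, for illustration only.
SUBSECTION_DELIMITER = "----"
SECTION_DELIMITER = "===="


def join_section_lines(lines):
    # Same one-liner as in the diff: always ends with a trailing newline.
    return '\n'.join(lines) + '\n'


def split_section_lines_sketch(section_str):
    # Hypothetical inverse of join_section_lines(), written for this sketch.
    # An empty string means "no lines"; anything else must end in a newline.
    if section_str == '':
        return []
    assert section_str.endswith('\n')
    return section_str[:-1].split('\n')


def render_section(section_name, section_str, strip_value):
    # Hypothetical, reduced version of the write path in write_test_file().
    # strip_value=False mimics the old behaviour, True the patched one.
    out = ["%s %s" % (SUBSECTION_DELIMITER, section_name)]
    value = section_str.strip() if strip_value else section_str
    if value.strip():
        out.append(value)
    out.append(SECTION_DELIMITER)
    return '\n'.join(out)


# The trailing newline is what tells the two cases apart during verification.
no_lines = split_section_lines_sketch('')
one_empty_line = split_section_lines_sketch('\n')

# A subsection as it looks after parsing: contents plus the extra newline.
parsed = join_section_lines(['a', 'b'])

before_patch = render_section('RESULTS', parsed, strip_value=False)
after_patch = render_section('RESULTS', parsed, strip_value=True)
```

With these definitions, before_patch contains an empty line between "b" and the section delimiter, while after_patch does not. Note that str.strip() also removes leading whitespace and any other trailing whitespace, not only the one added newline.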

diff --git a/tests/util/test_file_parser.py b/tests/util/test_file_parser.py
index 256bc5277..3fe347f7d 100644
--- a/tests/util/test_file_parser.py
+++ b/tests/util/test_file_parser.py
@@ -208,6 +208,9 @@ def parse_test_file_text(text, valid_section_names, skip_unknown_sections=True):
       if len(lines_content) != 0:
         # Add trailing newline to last line if present. This disambiguates between the
         # case of no lines versus a single line with no text.
+        # This extra newline is needed only for the verification of the subsections of
+        # RESULTS, DML_RESULTS and ERRORS and will be removed in write_test_file()
+        # when '--update_results' is added to output the updated golden file.
         subsection_str += "\n"
 
       if subsection_name not in valid_section_names:
@@ -294,6 +297,9 @@ def split_section_lines(section_str):
 def join_section_lines(lines):
   """
   The inverse of split_section_lines().
+  The extra newline at the end will be removed in write_test_file() so that when the
+  actual contents of a subsection do not match the expected contents, we won't see
+  extra newlines in those subsections (ERRORS, TYPES, LABELS, RESULTS and DML_RESULTS).
   """
   return '\n'.join(lines) + '\n'
 
@@ -323,8 +329,11 @@ def write_test_file(test_file_name, test_file_sections, encoding=None):
         if section_name == 'RESULTS' and test_case.get('VERIFIER'):
           full_section_name = '%s: %s' % (section_name, test_case['VERIFIER'])
         test_file_text.append("%s %s" % (SUBSECTION_DELIMITER, full_section_name))
-        section_value = ''.join(test_case[section_name])
-        if section_value.strip():
+        # We remove the extra newlines added in parse_test_file_text() so that in the
+        # updated golden file we will not see an extra newline at the end of each
+        # subsection.
+        section_value = ''.join(test_case[section_name]).strip()
+        if section_value:
           test_file_text.append(section_value)
     test_file_text.append(SECTION_DELIMITER)
     test_file.write(('\n').join(test_file_text))