You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@impala.apache.org by "Zoltán Borók-Nagy (Jira)" <ji...@apache.org> on 2021/06/11 17:29:00 UTC

[jira] [Created] (IMPALA-10747) test_runtime_filters.py::test_row_filters failed in dockerised test

Zoltán Borók-Nagy created IMPALA-10747:
------------------------------------------

             Summary: test_runtime_filters.py::test_row_filters failed in dockerised test
                 Key: IMPALA-10747
                 URL: https://issues.apache.org/jira/browse/IMPALA-10747
             Project: IMPALA
          Issue Type: Bug
            Reporter: Zoltán Borók-Nagy


test_runtime_filters.py::test_row_filters failed with the following stack trace:

{noformat}
query_test/test_runtime_filters.py:341: in test_row_filters
    test_file_vars={'$RUNTIME_FILTER_WAIT_TIME_MS': str(WAIT_TIME_MS)})
common/impala_test_suite.py:734: in run_test_case
    update_section=pytest.config.option.update_results)
common/test_result_verifier.py:636: in verify_runtime_profile
    actual))
E   AssertionError: Did not find matches for lines in runtime profile:
E   EXPECTED LINES:
E   row_regex: .*Rows processed: 16.38K.*
{noformat}

The job was:
https://jenkins.impala.io/job/ubuntu-16.04-dockerised-tests/4320/testReport/

It's similar to IMPALA-6004 and IMPALA-6712. Those were fixed by increasing the runtime filter wait time. It's currently 60 seconds in regular builds and 200 seconds in slow builds:
https://github.com/apache/impala/blob/c65d7861d9ae28f6fc592727ff699a8155dcda2c/tests/query_test/test_runtime_filters.py#L37

The profile contains:

{noformat}
Runtime filters: Not all filters arrived (arrived: [], missing [0]), waited for 59s361ms. Arrival delay: 1m.
{noformat}

This was the only test that failed in that build, and the whole build took 4 hr 17 min which is normal. So other tests didn't experience slowness.

There was only a single runtime filter that was generated by 02:HASH JOIN.

{noformat}
E   Operator                 #Hosts  #Inst  Avg Time  Max Time   #Rows  Est. #Rows   Peak Mem  Est. Peak Mem  Detail                  
E   ----------------------------------------------------------------------------------------------------------------------------------
E   F03:ROOT                      1      1   0.000ns   0.000ns                        4.01 MB        4.00 MB                          
E   07:AGGREGATE                  1      1   3.999ms   3.999ms       1           1   16.00 KB       16.00 KB  FINALIZE                
E   06:EXCHANGE                   1      1   0.000ns   0.000ns       3           1   32.00 KB       16.00 KB  UNPARTITIONED           
E   F02:EXCHANGE SENDER           3      3   0.000ns   0.000ns                       16.00 KB              0                          
E   03:AGGREGATE                  3      3   0.000ns   0.000ns       3           1   24.00 KB       16.00 KB                          
E   02:HASH JOIN                  3      3  14s217ms  18s739ms  51.50M       7.74M   68.06 MB      169.06 MB  INNER JOIN, PARTITIONED 
E   |--05:EXCHANGE                3      3   8s303ms  13s715ms   6.00M       6.00M   13.90 MB       10.12 MB  HASH(b.l_comment)       
E   |  F01:EXCHANGE SENDER        3      3  36s758ms  44s115ms                      209.53 KB              0                          
E   |  01:SCAN HDFS               3      3  13s637ms  15s775ms   6.00M       6.00M   29.96 MB       80.00 MB  tpch_parquet.lineitem b 
E   04:EXCHANGE                   3      3   4s874ms   7s223ms   6.00M       6.00M   12.27 MB       10.12 MB  HASH(a.l_comment)       
E   F00:EXCHANGE SENDER           3      3  23s495ms  31s755ms                      209.53 KB              0                          
E   00:SCAN HDFS                  3      3      1m4s      1m8s   6.00M       6.00M   29.96 MB       80.00 MB  tpch_parquet.lineitem a
{noformat}

The Max Time of F01:EXCHANGE SENDER was 44s115ms (non-child time).

The HASH JOIN BUILDER above the EXCHANGE SENDER had a non-child total time 19s851ms.

The profiles of all HASH_JOIN_NODE operators (all 3 of the 3 fragment instances) has:

{noformat}
Runtime filters: 1 of 1 Runtime Filter Published
{noformat}

So it seems like the filters were published, but 60 sec still wasn't enough.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)