Posted to issues@impala.apache.org by "Michael Brown (JIRA)" <ji...@apache.org> on 2018/03/19 16:40:00 UTC

[jira] [Created] (IMPALA-6701) stress test compute stats binary search can't find a start point

Michael Brown created IMPALA-6701:
-------------------------------------

             Summary: stress test compute stats binary search can't find a start point
                 Key: IMPALA-6701
                 URL: https://issues.apache.org/jira/browse/IMPALA-6701
             Project: IMPALA
          Issue Type: Bug
    Affects Versions: Impala 2.11.0, Impala 2.10.0, Impala 2.9.0, Impala 2.8.0, Impala 3.0, Impala 2.12.0
            Reporter: Michael Brown


The stress test recently took 9 hours to run its mem_limit binary search over compute stats statements.

The stress test cannot find a start point for mem_limit for compute stats statements, because Impala does not support explain for them:
{noformat}
[localhost:21000] > explain compute stats tpch.lineitem;
Query: explain compute stats tpch.lineitem
ERROR: AnalysisException: Syntax error in line 1:
explain compute stats tpch.lineitem
        ^
Encountered: COMPUTE
Expected: CREATE, DELETE, INSERT, SELECT, UPDATE, UPSERT, VALUES, WITH

CAUSED BY: Exception: Syntax error

[localhost:21000] >
{noformat}
The stress test has skipped the memory estimate for compute stats statements ever since it began supporting them:
{noformat}
1370 def estimate_query_mem_mb_usage(query, query_runner):
1371   """Runs an explain plan then extracts and returns the estimated memory needed to run
1372   the query.
1373   """
1374   with query_runner.impalad_conn.cursor() as cursor:
1375     LOG.debug("Using %s database", query.db_name)
1376     if query.db_name:
1377       cursor.execute('USE ' + query.db_name)
1378     if query.query_type == QueryType.COMPUTE_STATS:
1379       # Running "explain" on compute stats is not supported by Impala.
1380       return
{noformat}
This means the binary search starts from the full memory limit of the impalad, 77308 MB in the log below:
{noformat}
2018-03-17 08:00:38,684 12313 MainThread INFO:concurrent_select[1164]:Collecting runtime info for query compute_stats_call_center_mt_dop_1: 
COMPUTE STATS call_center
2018-03-17 08:00:38,925 12313 MainThread DEBUG:concurrent_select[1375]:Using tpcds_300_decimal_parquet database
2018-03-17 08:00:38,925 12313 MainThread DEBUG:db_connection[203]:IMPALA: USE tpcds_300_decimal_parquet
2018-03-17 08:00:39,007 12313 MainThread INFO:hiveserver2[265]:Closing active operation
2018-03-17 08:00:39,123 12313 MainThread INFO:concurrent_select[1247]:Finding a starting point for binary search
2018-03-17 08:00:39,148 12313 MainThread DEBUG:concurrent_select[866]:Using tpcds_300_decimal_parquet database
2018-03-17 08:00:39,148 12313 MainThread DEBUG:db_connection[203]:IMPALA: USE tpcds_300_decimal_parquet
2018-03-17 08:00:39,206 12313 MainThread DEBUG:db_connection[203]:IMPALA: SET MT_DOP=1
2018-03-17 08:00:39,333 12313 MainThread DEBUG:db_connection[203]:IMPALA: SET ABORT_ON_ERROR=1
2018-03-17 08:00:39,416 12313 MainThread DEBUG:concurrent_select[878]:Setting mem limit to 77308 MB
2018-03-17 08:00:39,416 12313 MainThread DEBUG:db_connection[203]:IMPALA: SET MEM_LIMIT=77308M
2018-03-17 08:00:39,503 12313 MainThread DEBUG:concurrent_select[882]:Running query with 77308 MB mem limit at vc0718.halxg.cloudera.com with timeout secs 9223372036854775807:
COMPUTE STATS call_center
2018-03-17 08:00:39,741 12313 MainThread DEBUG:concurrent_select[890]:Query id is 3b4213033bf2359c:d44b29c500000000
2018-03-17 08:00:41,084 12313 MainThread INFO:hiveserver2[265]:Closing active operation
2018-03-17 08:00:41,202 12313 MainThread DEBUG:concurrent_select[1209]:Spilled: False
2018-03-17 08:00:41,202 12313 MainThread INFO:concurrent_select[1267]:Finding minimum memory required to avoid spilling
2018-03-17 08:00:41,227 12313 MainThread DEBUG:concurrent_select[866]:Using tpcds_300_decimal_parquet database
2018-03-17 08:00:41,227 12313 MainThread DEBUG:db_connection[203]:IMPALA: USE tpcds_300_decimal_parquet
2018-03-17 08:00:41,286 12313 MainThread DEBUG:db_connection[203]:IMPALA: SET MT_DOP=1
2018-03-17 08:00:41,367 12313 MainThread DEBUG:db_connection[203]:IMPALA: SET ABORT_ON_ERROR=1
2018-03-17 08:00:41,449 12313 MainThread DEBUG:concurrent_select[878]:Setting mem limit to 38654 MB
2018-03-17 08:00:41,449 12313 MainThread DEBUG:db_connection[203]:IMPALA: SET MEM_LIMIT=38654M
2018-03-17 08:00:41,530 12313 MainThread DEBUG:concurrent_select[882]:Running query with 38654 MB mem limit at vc0718.halxg.cloudera.com with timeout secs 9223372036854775807:
COMPUTE STATS call_center
2018-03-17 08:00:41,589 12313 MainThread DEBUG:concurrent_select[890]:Query id is 74db40c3f221cf3:d67997c00000000
2018-03-17 08:00:42,184 12313 MainThread INFO:hiveserver2[265]:Closing active operation
{noformat}
This has always been the case, but no one really looked into it until now.
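
To illustrate why the missing start point is costly, here is a minimal sketch of a bisection over mem_limit. It is not the stress test's code: find_min_mem_mb, runs_ok, the 1 MB tolerance, the 100 MB requirement, and the 256 MB estimate are all hypothetical. Only the 77308 MB upper bound comes from the log above. Each probe stands in for one full execution of the statement, so the number of probes is what drives the wall-clock cost.

```python
def find_min_mem_mb(runs_ok, upper_mb, tolerance_mb=1):
    """Bisect for the smallest mem limit (in MB) at which runs_ok(limit) is True.

    runs_ok is a hypothetical stand-in for one real execution of the
    statement under SET MEM_LIMIT. upper_mb must be a limit that is
    already known to work. Returns (min_mb, number_of_probes).
    """
    lower_mb = 0          # assumed to fail: no statement runs with 0 MB
    probes = 0
    while upper_mb - lower_mb > tolerance_mb:
        mid_mb = (lower_mb + upper_mb) // 2
        probes += 1
        if runs_ok(mid_mb):
            upper_mb = mid_mb   # mid works, so the minimum is at or below it
        else:
            lower_mb = mid_mb   # mid fails, so the minimum is above it
    return upper_mb, probes


def needs_100_mb(limit_mb):
    # Hypothetical statement that needs at least 100 MB to run.
    return limit_mb >= 100


# Starting from the full impalad limit seen in the log (first probe: 38654 MB).
full_min, full_probes = find_min_mem_mb(needs_100_mb, 77308)
# Starting from a hypothetical explain-style estimate of 256 MB.
est_min, est_probes = find_min_mem_mb(needs_100_mb, 256)

print("from full limit:", full_min, "MB after", full_probes, "probes")
print("from estimate:  ", est_min, "MB after", est_probes, "probes")
```

Both searches converge on the same minimum, but the one that starts from the full limit needs roughly twice as many probes in this toy setup, and its early probes run at far larger limits than the statement needs.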

It's important to get this fixed soon as we expand the environments where our stress tests run. Before, it was a very infrequent cost, but at least in my downstream environment, that is rapidly changing.


