Posted to issues-all@impala.apache.org by "carolinchen (Jira)" <ji...@apache.org> on 2021/10/12 09:18:00 UTC

[jira] [Created] (IMPALA-10964) Add query option that limits skew query in runtime

carolinchen created IMPALA-10964:
------------------------------------

             Summary: Add query option that limits skew query in runtime
                 Key: IMPALA-10964
                 URL: https://issues.apache.org/jira/browse/IMPALA-10964
             Project: IMPALA
          Issue Type: Improvement
          Components: Backend
    Affects Versions: Impala 4.0.0
            Reporter: carolinchen
             Fix For: Impala 4.0.1


Reject queries whose skew value grows too large during execution.
Query skew is the situation in which some nodes fall significantly behind the others while a SQL query is executed concurrently.

There are two kinds of skew:
1. Row skew, which may be caused by poorly written SQL or uneven task distribution.
2. Time skew, which may be caused by differences in capability between execution nodes.
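
The issue does not say how a skew value is computed. As a rough sketch only, assuming skew is measured as the largest per-node figure divided by the per-node mean (an assumption for illustration, not Impala's definition), both kinds of skew could be quantified as below. The function name skew_ratio and the sample numbers are hypothetical.

```python
from statistics import mean


def skew_ratio(per_node_values):
    """Return max/mean of the per-node values (assumed skew metric).

    1.0 means the work is perfectly even; larger values mean one node
    carries more than its share. Returns 1.0 when there is nothing to
    compare (no nodes, or all values are zero).
    """
    values = list(per_node_values)
    if not values:
        return 1.0
    avg = mean(values)
    if avg == 0:
        return 1.0
    return max(values) / avg


# Hypothetical per-node figures for one query.
rows_per_node = [1000, 1000, 1000, 5000]   # one node processes far more rows
seconds_per_node = [2.0, 2.0, 2.0, 2.0]    # all nodes take the same time

row_skew = skew_ratio(rows_per_node)       # 5000 / 2000 = 2.5
time_skew = skew_ratio(seconds_per_node)   # 2.0 / 2.0 = 1.0
```

Other metrics (for example max/median, or a coefficient of variation) would serve equally well; max/mean is used here only because it is easy to read.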

Query skew has two effects:
1. The skewed node may execute slowly, which slows down the progress of the whole query.
2. The skewed node may exhaust a large share of system resources (I/O, memory, RPC), which
affects other queries on the same host or in the same query pool.

When the skew value reaches an unreasonable range, it affects the cluster's status and other running queries. This option is a mechanism to protect the cluster from potentially harmful queries (similar in purpose to, e.g., mem_limit).

In our environment, we added a SKEW_LIMIT query option to limit skewed queries.
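
The issue does not describe the semantics of SKEW_LIMIT. A minimal sketch of how such a runtime check might behave, assuming the limit is a threshold on a periodically sampled skew value and that a limit of 0 disables the check (both are assumptions). QuerySkewExceeded, check_skew_limit and run_with_skew_limit are hypothetical names, not Impala APIs.

```python
class QuerySkewExceeded(Exception):
    """Raised when a running query's skew passes the configured limit."""


def check_skew_limit(skew_value, skew_limit):
    """Reject the query if skew_value exceeds skew_limit.

    skew_limit <= 0 is treated as "no limit" (assumed convention).
    """
    if skew_limit > 0 and skew_value > skew_limit:
        raise QuerySkewExceeded(
            f"skew {skew_value:.2f} exceeds limit {skew_limit:.2f}"
        )


def run_with_skew_limit(skew_samples, skew_limit):
    """Check each periodic skew sample taken while the query runs.

    Returns "REJECTED" as soon as one sample exceeds the limit,
    otherwise "FINISHED".
    """
    try:
        for skew_value in skew_samples:
            check_skew_limit(skew_value, skew_limit)
    except QuerySkewExceeded:
        return "REJECTED"
    return "FINISHED"
```

With a limit of 2.0, a query whose sampled skew climbs through 1.1, 1.4 and 3.2 would be rejected at the third sample, while one that stays at or below 2.0 runs to completion.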



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscribe@impala.apache.org
For additional commands, e-mail: issues-all-help@impala.apache.org