Posted to issues@impala.apache.org by "Tim Armstrong (JIRA)" <ji...@apache.org> on 2017/11/15 22:14:00 UTC

[jira] [Created] (IMPALA-6186) Consider adding aggregate global and per-pool limits to scratch space utilisation

Tim Armstrong created IMPALA-6186:
-------------------------------------

             Summary: Consider adding aggregate global and per-pool limits to scratch space utilisation
                 Key: IMPALA-6186
                 URL: https://issues.apache.org/jira/browse/IMPALA-6186
             Project: IMPALA
          Issue Type: Improvement
          Components: Backend
            Reporter: Tim Armstrong


Currently scratch_limit provides a per-query limit on scratch space usage. A global or per-pool default scratch_limit can be set to provide some degree of control over how much scratch space different queries can use. 

However, this approach has limitations:
* We cannot directly control the aggregate consumption of all queries, or of all queries in a pool. For example, we cannot limit "ad-hoc" queries to 50GB total scratch space, or limit all Impala queries to 200GB total scratch space.
* We cannot control how much space Impala will use on each disk. There may be reasons to set different limits on different disks, e.g. 10GB on /data1 and 20GB on /data2. Or we may simply want to avoid skew if one disk goes offline: with a 20GB aggregate limit, if one disk fails, the effective limit for the other disks increases.
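
To make the skew concern concrete, here is a rough arithmetic sketch in Python. It assumes scratch usage is spread evenly across the healthy disks, which is a simplification for illustration and not a statement about how Impala actually places scratch files. The function name and the disk count of 4 are hypothetical.

```python
def per_disk_share_gb(aggregate_limit_gb, healthy_disks):
    """Worst-case even share of an aggregate scratch limit per healthy disk.

    Assumption: usage is spread evenly across healthy disks. This is an
    illustration only, not Impala's actual placement policy.
    """
    if healthy_disks <= 0:
        raise ValueError("need at least one healthy disk")
    return aggregate_limit_gb / healthy_disks


# Hypothetical node with 4 scratch disks and a 20GB aggregate limit.
before_failure = per_disk_share_gb(20, 4)  # 5.0 GB per disk
after_failure = per_disk_share_gb(20, 3)   # roughly 6.67 GB per disk
```

Under that assumption, losing one of four disks raises the share each remaining disk may have to absorb by about a third, which an aggregate-only limit does nothing to prevent.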

The design will require some significant thought to get the knobs right. For example, one option is to allow global limits per disk, plus per-pool limits on aggregate use across all disks.
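
A minimal sketch of how that option might behave, written in Python purely for illustration. The class ScratchTracker, its methods, and the pool and disk names are all hypothetical; none of this is Impala code or an existing Impala API. A reservation succeeds only if it fits under both the global limit of the target disk and the aggregate limit of the pool.

```python
GB = 1024 ** 3


class ScratchTracker:
    """Hypothetical accounting for the option above (not Impala's code).

    disk_limits: global byte limit per scratch disk.
    pool_limits: aggregate byte limit per pool, summed over all disks.
    """

    def __init__(self, disk_limits, pool_limits):
        self.disk_limits = dict(disk_limits)
        self.pool_limits = dict(pool_limits)
        self.disk_used = {disk: 0 for disk in self.disk_limits}
        self.pool_used = {pool: 0 for pool in self.pool_limits}

    def try_reserve(self, pool, disk, nbytes):
        """Reserve nbytes on disk for pool; return False if a limit blocks it."""
        if self.disk_used[disk] + nbytes > self.disk_limits[disk]:
            return False  # would exceed the global per-disk limit
        if self.pool_used[pool] + nbytes > self.pool_limits[pool]:
            return False  # would exceed the pool's aggregate limit
        self.disk_used[disk] += nbytes
        self.pool_used[pool] += nbytes
        return True

    def release(self, pool, disk, nbytes):
        """Return previously reserved bytes to both counters."""
        self.disk_used[disk] -= nbytes
        self.pool_used[pool] -= nbytes


# Example values; the 25GB pool limit is made up for this sketch.
tracker = ScratchTracker(
    disk_limits={"/data1": 10 * GB, "/data2": 20 * GB},
    pool_limits={"ad-hoc": 25 * GB},
)
ok_first = tracker.try_reserve("ad-hoc", "/data1", 8 * GB)    # fits both limits
over_disk = tracker.try_reserve("ad-hoc", "/data1", 4 * GB)   # /data1 would reach 12GB
over_pool = tracker.try_reserve("ad-hoc", "/data2", 18 * GB)  # pool would reach 26GB
```

Because the per-disk limits are fixed per disk rather than derived from an aggregate, a disk going offline does not raise the limit on the remaining disks in this sketch.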



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)