Posted to issues-all@impala.apache.org by "Peter Ebert (Jira)" <ji...@apache.org> on 2021/08/04 19:35:00 UTC

[jira] [Created] (IMPALA-10841) Pin Specific Directories in the Remote Data Cache

Peter Ebert created IMPALA-10841:
------------------------------------

             Summary: Pin Specific Directories in the Remote Data Cache
                 Key: IMPALA-10841
                 URL: https://issues.apache.org/jira/browse/IMPALA-10841
             Project: IMPALA
          Issue Type: New Feature
    Affects Versions: Impala 3.4.0
            Reporter: Peter Ebert


For remote reads, a warm cache is important for best performance. It would be helpful to have a feature that forces a directory into the cache and pins it there, regardless of the eviction policy (LRU or LIRS), so that SLAs can be met.

I suggest pinning at the directory level because this would give us the flexibility to select a specific db, table, and/or partition(s) to keep in the cache.

Ideally this would fire at startup. There may be some overlap here with other jiras about warming the cache, so another option could be a flag that specifies whether the warmed data may later be evicted from the cache (by LRU/LIRS) or not. Pinning has the advantage for high-SLA workloads that may not be read frequently enough to stay in the cache.
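
To make the request concrete, here is a minimal sketch of the behaviour I have in mind, written as a toy Python model rather than against Impala's actual data cache code. The class name PinnedLruCache, its methods, and the example paths are all hypothetical. It models plain LRU only (not LIRS) and tracks sizes, not file contents. Entries under a pinned directory are skipped when choosing eviction victims; a directory that is only warmed at startup, and allowed to be evicted later, would simply be loaded without being listed in pinned_dirs.

```python
from collections import OrderedDict


class PinnedLruCache:
    """Toy byte-budgeted LRU cache; entries under pinned directories are never evicted.

    Hypothetical illustration only, not Impala's data cache implementation.
    """

    def __init__(self, capacity_bytes, pinned_dirs=()):
        self.capacity_bytes = capacity_bytes
        # Add a trailing slash so "/wh/db/t1" does not also match "/wh/db/t10".
        self.pinned_dirs = tuple(d.rstrip("/") + "/" for d in pinned_dirs)
        self.entries = OrderedDict()  # path -> size in bytes, least recently used first
        self.used_bytes = 0

    def is_pinned(self, path):
        return path.startswith(self.pinned_dirs)

    def get(self, path):
        """Return True on a cache hit and mark the entry as most recently used."""
        if path not in self.entries:
            return False
        self.entries.move_to_end(path)
        return True

    def put(self, path, size):
        """Insert or update an entry; return False if only pinned data could be evicted."""
        old_size = self.entries.get(path, 0)
        needed = self.used_bytes - old_size + size - self.capacity_bytes
        victims = []
        # Walk from least to most recently used, skipping pinned entries.
        for candidate, candidate_size in self.entries.items():
            if needed <= 0:
                break
            if candidate == path or self.is_pinned(candidate):
                continue
            victims.append(candidate)
            needed -= candidate_size
        if needed > 0:
            return False  # not enough unpinned data to make room
        for victim in victims:
            self.used_bytes -= self.entries.pop(victim)
        self.used_bytes += size - old_size
        self.entries[path] = size
        self.entries.move_to_end(path)
        return True
```

In this sketch, a put that would require evicting pinned data is rejected instead of silently breaking the pin. Whether the real cache should reject, bypass the cache for that read, or log a warning is an open design question.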

We would also have to think about scenarios where the directory is bigger than the cache, and how to handle or warn about that.
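
One possible way to handle the oversized case is to validate the pin list before loading anything. The sketch below is only an illustration in Python: the function check_pin_budget and its max_pinned_fraction parameter are hypothetical names, and the 0.8 default is an arbitrary assumption about how much of the cache should be allowed to hold pinned data. It takes directory sizes as input and returns warning messages rather than failing.

```python
def check_pin_budget(dir_sizes, capacity_bytes, max_pinned_fraction=0.8):
    """Return warning strings for pin requests that do not fit the cache.

    dir_sizes maps a directory path to its total size in bytes.
    max_pinned_fraction is an assumed tunable, not an existing Impala flag.
    """
    warnings = []
    budget = int(capacity_bytes * max_pinned_fraction)
    total = 0
    for directory, size in dir_sizes.items():
        # A single directory larger than the whole cache can never be fully pinned.
        if size > capacity_bytes:
            warnings.append(
                f"{directory}: {size} bytes exceeds cache capacity of {capacity_bytes} bytes"
            )
        total += size
    # Pinning everything would leave too little room for ordinary cached reads.
    if total > budget:
        warnings.append(
            f"pinned total of {total} bytes exceeds pin budget of {budget} bytes"
        )
    return warnings
```

An empty result means every requested directory fits within the budget; any other result could be logged at startup so the operator knows the pin request cannot be fully honoured.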



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
