Posted to issues-all@impala.apache.org by "Joe McDonnell (Jira)" <ji...@apache.org> on 2020/04/04 00:13:00 UTC

[jira] [Resolved] (IMPALA-8690) Better eviction algorithm for data cache

     [ https://issues.apache.org/jira/browse/IMPALA-8690?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Joe McDonnell resolved IMPALA-8690.
-----------------------------------
    Fix Version/s: Impala 4.0
       Resolution: Fixed

We added LIRS as an eviction algorithm. It is currently not the default. We can track further improvements in separate JIRAs.

> Better eviction algorithm for data cache
> ----------------------------------------
>
>                 Key: IMPALA-8690
>                 URL: https://issues.apache.org/jira/browse/IMPALA-8690
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 3.3.0
>            Reporter: Michael Ho
>            Assignee: Joe McDonnell
>            Priority: Major
>             Fix For: Impala 4.0
>
>
> With the current implementation of the data cache, all accessed data is cached regardless of the access pattern. The current LRU eviction algorithm is not resistant to scan traffic, so if some users scan a big fact table, many of the heavily accessed items will inevitably be evicted. We should adopt a better eviction algorithm (e.g. LRFU or another well-known one from the literature). It would be nice to evaluate it against some users' traces now that IMPALA-8542 is fixed.
> In the short run, we probably need some workaround (e.g. query hints to disable caching for certain tables). Will file a separate jira for it.
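
The scan problem described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the `LRUCache` class, the `hot_entries_surviving_scan` helper, and the capacities are hypothetical and are not taken from Impala's data cache code. It shows a plain LRU cache losing its entire hot working set after a single large scan of keys that are never reused.

```python
from collections import OrderedDict


class LRUCache:
    """Minimal LRU cache keyed by block id (illustrative, not Impala's data cache)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # oldest entry first, most recent last

    def access(self, key):
        """Return True on a hit; on a miss, insert the key and evict the LRU entry if over capacity."""
        if key in self.entries:
            self.entries.move_to_end(key)  # mark as most recently used
            return True
        self.entries[key] = None
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict the least recently used entry
        return False


def hot_entries_surviving_scan(capacity=100, hot_keys=50, scan_keys=1000):
    """Warm the cache with a hot set, run a one-pass scan, and count hot keys still cached."""
    cache = LRUCache(capacity)
    hot = [("hot", i) for i in range(hot_keys)]
    for _ in range(10):  # the hot set is accessed repeatedly
        for key in hot:
            cache.access(key)
    for i in range(scan_keys):  # each scan key is touched exactly once
        cache.access(("scan", i))
    return sum(1 for key in hot if key in cache.entries)


if __name__ == "__main__":
    print(hot_entries_surviving_scan())              # scan much larger than the cache
    print(hot_entries_surviving_scan(scan_keys=30))  # scan fits alongside the hot set
```

In this toy setup, a scan larger than the cache evicts every hot entry even though none of the scanned keys is accessed again, while a scan that fits in the spare capacity leaves the hot set intact. A scan-resistant policy aims to keep the frequently reused entries in the first case.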



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-all-unsubscribe@impala.apache.org
For additional commands, e-mail: issues-all-help@impala.apache.org