Posted to issues@impala.apache.org by "Yida Wu (Jira)" <ji...@apache.org> on 2022/12/16 18:00:00 UTC

[jira] [Created] (IMPALA-11805) Expected codegen cache size is less than the actual allocation

Yida Wu created IMPALA-11805:
--------------------------------

             Summary: Expected codegen cache size is less than the actual allocation
                 Key: IMPALA-11805
                 URL: https://issues.apache.org/jira/browse/IMPALA-11805
             Project: IMPALA
          Issue Type: Bug
          Components: Backend
    Affects Versions: Impala 4.3.0
            Reporter: Yida Wu
            Assignee: Yida Wu


IMPALA-11470 implemented a cache for codegen functions. However, the expected size of a cache entry is much smaller than the actual allocation, according to the data in the tcmalloc memory tracker. This could lead to unexpected query failures when the memory tracker hits its capacity.

The current way to estimate the memory consumption of a codegen cache entry, which is mainly the memory consumption of the llvm::ExecutionEngine stored in each entry, is to use the customized ImpalaMCJITMemoryManager (https://github.com/apache/impala/blob/f705496e34ac474e8e1c999619e3b928c5e39e0f/be/src/codegen/mcjit-mem-mgr.h#L60) to accumulate the bytes requested whenever the execution engine allocates a code or data section. In practice, however, the execution engine allocates far more bytes than that.
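To make the accounting gap concrete, here is a minimal sketch of the pattern described above. It does not use LLVM. PageRoundingAllocator and TrackingMemoryManager are hypothetical names, and the 4 KiB page rounding is only an assumed source of overhead chosen for illustration; this report does not identify the real cause of the difference.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for whatever allocates memory underneath the
// execution engine. ASSUMPTION for illustration only: each request is
// rounded up to a whole 4 KiB page.
class PageRoundingAllocator {
 public:
  static constexpr std::size_t kPageSize = 4096;

  // Reserves memory for a request of `size` bytes and returns the number
  // of bytes actually reserved.
  std::size_t Reserve(std::size_t size) {
    assert(size > 0);
    std::size_t pages = (size + kPageSize - 1) / kPageSize;
    std::size_t reserved = pages * kPageSize;
    actual_bytes_ += reserved;
    return reserved;
  }

  std::size_t actual_bytes() const { return actual_bytes_; }

 private:
  std::size_t actual_bytes_ = 0;
};

// Hypothetical tracker that sums only the sizes requested for code and data
// sections, in the spirit of the bytes_allocated() counter mentioned above.
class TrackingMemoryManager {
 public:
  explicit TrackingMemoryManager(PageRoundingAllocator* alloc)
      : alloc_(alloc) {}

  void AllocateCodeSection(std::size_t size) { Allocate(size); }
  void AllocateDataSection(std::size_t size) { Allocate(size); }

  // Sum of requested sizes: the "expected" figure, not the real footprint.
  std::size_t bytes_allocated() const { return bytes_allocated_; }

 private:
  void Allocate(std::size_t size) {
    bytes_allocated_ += size;  // counts the request only
    alloc_->Reserve(size);     // the allocator may reserve more than that
  }

  PageRoundingAllocator* alloc_;
  std::size_t bytes_allocated_ = 0;
};
```

With a 1000-byte code section and a 200-byte data section, the tracker in this sketch reports 1200 bytes while the toy allocator has reserved two full pages (8192 bytes). Any overhead outside the section requests is invisible to a counter of this kind.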

Tested with TPC-H and TPC-DS queries: in normal mode, the final consumption can be 3 to 4 times the expectation, and it is worse in optimal mode. The main discrepancy is between memory_manager_->bytes_allocated() and the actual execution engine allocation. In normal mode the expected size also includes the size of the key, which is accurate, so the relative error is smaller.
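The effect of the key on the ratio can be shown with a small worked example. The numbers below are hypothetical, not measurements from this issue, and the sketch assumes the key contributes a negligible amount to the entry size in optimal mode. EntrySize and UnderestimateFactor are illustrative names, not Impala code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical breakdown of one codegen cache entry, in bytes.
struct EntrySize {
  std::size_t key_bytes;             // accounted accurately
  std::size_t tracked_engine_bytes;  // what the memory manager counted
  std::size_t actual_engine_bytes;   // what the engine really consumed
};

// Ratio of the actual entry size to the expected entry size.
double UnderestimateFactor(const EntrySize& e) {
  std::size_t expected = e.key_bytes + e.tracked_engine_bytes;
  std::size_t actual = e.key_bytes + e.actual_engine_bytes;
  assert(expected > 0);
  return static_cast<double>(actual) / static_cast<double>(expected);
}
```

For example, with a tracked engine size of 1000 bytes and an actual engine size of 13000 bytes, an entry that also carries an accurately counted 3000-byte key is underestimated by a factor of 16000 / 4000 = 4, while the same entry with no key contribution is underestimated by 13000 / 1000 = 13. The accurate key term dilutes the error, which is consistent with normal mode looking better than optimal mode.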

When the execution engine exists only for a short period at runtime, the issue isn't that bad. However, once it becomes part of a long-lived cache entry, it can cause more problems by consuming much more memory than expected.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)