Posted to issues@impala.apache.org by "Ye Zihao (Jira)" <ji...@apache.org> on 2023/02/01 02:33:00 UTC
[jira] [Created] (IMPALA-11886) Data cache should support asynchronous writes
Ye Zihao created IMPALA-11886:
---------------------------------
Summary: Data cache should support asynchronous writes
Key: IMPALA-11886
URL: https://issues.apache.org/jira/browse/IMPALA-11886
Project: IMPALA
Issue Type: Improvement
Affects Versions: Impala 4.3.0
Reporter: Ye Zihao
Assignee: Ye Zihao
Currently, writes to the data cache are performed synchronously with HDFS file reads, and both are handled by the remote HDFS IO threads. In other words, when a cache miss occurs, the IO thread that performs the read must also write the data to the cache, which can degrade query performance in some cases.
Therefore, the data cache should be able to defer writes to another thread (or thread pool) that performs them asynchronously, so that the IO thread only has to copy the data into a temporary buffer and can return it to the scanner immediately. The extra memory consumed by these temporary buffers also needs to be bounded.
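A minimal sketch of the proposal, written in Python for brevity and not taken from Impala's code. The class name AsyncCacheWriter, its methods, the max_pending_bytes parameter, and the dict standing in for the cache are all hypothetical. It assumes that an over-budget write is simply skipped rather than blocking the IO thread; the issue itself does not specify that policy.

```python
import queue
import threading


class AsyncCacheWriter:
    """Hypothetical sketch: defers cache writes to a background thread."""

    def __init__(self, store, max_pending_bytes):
        self._store = store                        # stand-in for the real data cache
        self._max_pending_bytes = max_pending_bytes
        self._pending_bytes = 0                    # bytes held in temporary buffers
        self._lock = threading.Lock()
        self._queue = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def try_store(self, key, data):
        """Called by the IO thread on a cache miss; never waits for the cache write."""
        size = len(data)
        with self._lock:
            if self._pending_bytes + size > self._max_pending_bytes:
                return False                       # over budget: skip caching (assumed policy)
            self._pending_bytes += size
        # bytes() copies mutable inputs (bytearray, memoryview) into a temporary buffer.
        self._queue.put((key, bytes(data)))
        return True

    def _run(self):
        while True:
            item = self._queue.get()
            if item is None:                       # shutdown sentinel
                self._queue.task_done()
                return
            key, buf = item
            try:
                self._store[key] = buf             # the deferred cache write
            finally:
                with self._lock:
                    self._pending_bytes -= len(buf)  # release the buffer budget
                self._queue.task_done()

    def flush(self):
        """Block until all queued writes have been applied."""
        self._queue.join()

    def close(self):
        """Stop the worker after it drains the writes queued so far."""
        self._queue.put(None)
        self._worker.join()
```

In this sketch the IO thread would call try_store() after a cache miss and return the data to the scanner regardless of the result. A False return only means the data was not cached this time, which keeps the memory held by temporary buffers at or below max_pending_bytes.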
--
This message was sent by Atlassian Jira
(v8.20.10#820010)