Posted to issues@impala.apache.org by "Juan Yu (JIRA)" <ji...@apache.org> on 2018/01/02 22:18:00 UTC
[jira] [Created] (IMPALA-6361) File handle cache should be shared across multiple IO threads
Juan Yu created IMPALA-6361:
-------------------------------
Summary: File handle cache should be shared across multiple IO threads
Key: IMPALA-6361
URL: https://issues.apache.org/jira/browse/IMPALA-6361
Project: IMPALA
Issue Type: Improvement
Components: Backend
Affects Versions: Impala 2.10.0
Reporter: Juan Yu
A file handle can only be used by one thread at a time; it cannot be shared across multiple IO threads because of a statistics tracking issue. As a result, multiple file handles for the same file are created and added to the cache. This still adds NameNode (NN) load and reduces the number of files that can be cached.
We should investigate a way to share a file handle across threads while maintaining appropriate statistics.
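One possible direction, shown only as a rough sketch: keep the handle itself free of mutable per-read state and have each IO thread record its statistics in an object it owns. Everything below is hypothetical and for illustration only. SharedFileHandle, ReaderStats, PRead and ReadConcurrently are made-up names, not Impala or libhdfs APIs, and the read is simulated. The sketch assumes the underlying read is positional and safe to call concurrently, which would need to be verified for the real HDFS handle.

```cpp
#include <algorithm>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical per-reader statistics; not an Impala or libhdfs type.
struct ReaderStats {
  int64_t bytes_read = 0;
  int64_t read_calls = 0;
};

// Hypothetical stand-in for a cached file handle. It holds no mutable
// per-read state, so concurrent positional reads do not interfere.
class SharedFileHandle {
 public:
  explicit SharedFileHandle(int64_t file_size) : file_size_(file_size) {}

  // Simulated positional read: returns the number of bytes that would be
  // read and records them in the caller's stats, not in the handle.
  int64_t PRead(int64_t offset, int64_t len, ReaderStats* stats) const {
    if (offset < 0 || len <= 0 || offset >= file_size_) return 0;
    const int64_t n = std::min(len, file_size_ - offset);
    stats->bytes_read += n;
    ++stats->read_calls;
    return n;
  }

 private:
  const int64_t file_size_;
};

// Runs num_threads readers against one shared handle. Each thread writes
// only to its own ReaderStats slot, so no locking is needed for the stats.
std::vector<ReaderStats> ReadConcurrently(const SharedFileHandle& handle,
                                          int num_threads,
                                          int reads_per_thread,
                                          int64_t read_len) {
  std::vector<ReaderStats> stats(num_threads);
  std::vector<std::thread> threads;
  for (int t = 0; t < num_threads; ++t) {
    threads.emplace_back([&handle, &stats, t, reads_per_thread, read_len] {
      for (int i = 0; i < reads_per_thread; ++i) {
        handle.PRead(i * read_len, read_len, &stats[t]);
      }
    });
  }
  for (auto& th : threads) th.join();
  return stats;
}
```

With this shape, one cached handle per file would serve all IO threads, and per-thread (or per-scan-range) counters could be merged after the reads complete. Whether the real handle permits this depends on where its statistics are kept.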
The efficiency of the file handle cache could also be improved. For example, reducing the size of the HDFS file handle itself would reduce the memory footprint and allow the cache to hold more entries in the same amount of memory.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)