Posted to issues@impala.apache.org by "Tim Armstrong (Jira)" <ji...@apache.org> on 2020/12/14 19:19:00 UTC

[jira] [Resolved] (IMPALA-6361) File handle cache should be shared across multiple IO threads

     [ https://issues.apache.org/jira/browse/IMPALA-6361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Tim Armstrong resolved IMPALA-6361.
-----------------------------------
    Resolution: Duplicate

> File handle cache should be shared across multiple IO threads
> -------------------------------------------------------------
>
>                 Key: IMPALA-6361
>                 URL: https://issues.apache.org/jira/browse/IMPALA-6361
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Backend
>    Affects Versions: Impala 2.10.0
>            Reporter: Juan Yu
>            Priority: Major
>
> A file handle can only be used by one thread at a time; it cannot be shared across multiple IO threads because of a statistics tracking issue. As a result, multiple file handles for the same file are created and added to the cache. This still adds NameNode (NN) load and reduces the number of files that can be cached.
> We should investigate a way to share a file handle across threads while maintaining appropriate statistics.
> Another possible improvement is the efficiency of the file handle cache itself. For example, reducing the size of the HDFS file handle would reduce the memory footprint and allow the cache to hold more entries in the same amount of memory.
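
The sharing idea in the description can be illustrated with a minimal sketch. It is written in Python rather than Impala's C++ backend, and every name in it (SharedHandle, HandleCache, record_read, the reader labels) is hypothetical and not taken from Impala. It assumes the statistics problem can be handled by keeping one counter per reader on a single shared handle, so the cache holds one entry per file instead of one per thread.

```python
import threading
from collections import defaultdict


class SharedHandle:
    """Hypothetical stand-in for one cached file handle shared by many IO threads."""

    def __init__(self, path):
        self.path = path
        self._lock = threading.Lock()
        # Statistics are kept per reader, so sharing the handle does not
        # mix up the byte counts of different IO threads.
        self._bytes_read = defaultdict(int)

    def record_read(self, reader, nbytes):
        """Attribute nbytes to the given reader label (assumed unique per thread)."""
        with self._lock:
            self._bytes_read[reader] += nbytes

    def bytes_read_by(self, reader):
        with self._lock:
            return self._bytes_read[reader]

    def total_bytes_read(self):
        with self._lock:
            return sum(self._bytes_read.values())


class HandleCache:
    """Hypothetical cache that stores a single shared handle per file path."""

    def __init__(self):
        self._lock = threading.Lock()
        self._handles = {}
        self.opens = 0  # counts simulated opens, i.e. round trips to the NameNode

    def get(self, path):
        with self._lock:
            handle = self._handles.get(path)
            if handle is None:
                handle = SharedHandle(path)
                self._handles[path] = handle
                self.opens += 1
            return handle


def run_demo(num_threads=4, reads_per_thread=10, read_size=100):
    """Have several threads read through the same cached handle."""
    cache = HandleCache()

    def worker(reader):
        handle = cache.get("/data/part-0")
        for _ in range(reads_per_thread):
            handle.record_read(reader, read_size)

    threads = [
        threading.Thread(target=worker, args=(f"io-{i}",))
        for i in range(num_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return cache, cache.get("/data/part-0")
```

In this sketch the file is opened once no matter how many threads read it, while each reader's byte count stays separate. A single lock per handle is the simplest choice here; whether that level of contention is acceptable for a real IO path is not something the issue text settles.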



--
This message was sent by Atlassian Jira
(v8.3.4#803005)