Posted to issues@impala.apache.org by "Joe McDonnell (JIRA)" <ji...@apache.org> on 2017/05/22 23:54:04 UTC

[jira] [Created] (IMPALA-5352) File handle cache needs timeout based eviction

Joe McDonnell created IMPALA-5352:
-------------------------------------

             Summary: File handle cache needs timeout based eviction
                 Key: IMPALA-5352
                 URL: https://issues.apache.org/jira/browse/IMPALA-5352
             Project: IMPALA
          Issue Type: Bug
          Components: Backend
    Affects Versions: Impala 2.9.0
            Reporter: Joe McDonnell
            Assignee: Joe McDonnell


The file handle cache currently keeps file handles open indefinitely whenever the cache is below its maximum capacity, so a handle can stay open for weeks or months. Because local files are accessed directly, an open file handle can prevent the disk blocks from being freed, even after the file is deleted through HDFS. The file handle cache should implement a timeout so that a file handle that has not been used recently is evicted. The timeout should be configurable, and it may be desirable for the default to take into account HDFS's fs.trash.interval.
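
To make the proposal concrete, here is a minimal sketch of timeout-based eviction layered on an LRU cache. It is illustrative only and is not Impala's implementation: the class name TimedHandleCache, its parameters (capacity, timeout_s, clock), and the opener callback are all hypothetical, and a real backend would also need locking and a background eviction thread.

```python
import time
from collections import OrderedDict


class TimedHandleCache:
    """Toy cache (hypothetical) that closes handles idle for more than timeout_s."""

    def __init__(self, capacity, timeout_s, clock=time.monotonic):
        self.capacity = capacity
        self.timeout_s = timeout_s
        self.clock = clock
        # key -> (handle, last_used), ordered least recently used first.
        self._entries = OrderedDict()

    def get(self, key, opener):
        """Return the cached handle for key, calling opener(key) on a miss."""
        now = self.clock()
        self.evict_expired(now)
        if key in self._entries:
            handle, _ = self._entries.pop(key)
        else:
            handle = opener(key)
            # Capacity-based eviction still applies when the cache is full.
            while self._entries and len(self._entries) >= self.capacity:
                self._close_oldest()
        # Re-inserting moves the entry to the most recently used position.
        self._entries[key] = (handle, now)
        return handle

    def evict_expired(self, now=None):
        """Close every handle idle longer than timeout_s; return how many."""
        now = self.clock() if now is None else now
        evicted = 0
        while self._entries:
            oldest_key = next(iter(self._entries))
            _, last_used = self._entries[oldest_key]
            if now - last_used <= self.timeout_s:
                break  # Entries are ordered by last use, so the rest are fresher.
            self._close_oldest()
            evicted += 1
        return evicted

    def _close_oldest(self):
        _, (handle, _) = self._entries.popitem(last=False)
        handle.close()

    def __len__(self):
        return len(self._entries)
```

With this shape, an idle handle is closed even when the cache never fills up, which is the behavior the issue asks for. Passing the clock in as a parameter is a design choice made here so that the timeout can be exercised without sleeping.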

Additionally, when a file is replaced or appended to, its mtime increases. File handles cached under the old mtime will no longer be accessed, but nothing guarantees that they are aged out of the cache. These stale handles should be aged out more aggressively.
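
One possible reading of "more aggressively" is to close handles for an older mtime as soon as the same path is requested with a newer one. The sketch below shows that idea under stated assumptions: MtimeAwareHandleCache, its get(path, mtime, opener) signature, and handle_count are hypothetical names, and the issue itself does not specify the mechanism.

```python
class MtimeAwareHandleCache:
    """Toy cache (hypothetical) that drops handles for superseded mtimes."""

    def __init__(self):
        # path -> {mtime: handle}
        self._by_path = {}

    def get(self, path, mtime, opener):
        """Return a handle for (path, mtime), closing handles with older mtimes."""
        versions = self._by_path.setdefault(path, {})
        # A newer mtime means the file was replaced or appended to, so
        # handles opened against an older mtime will not be requested again.
        for old_mtime in [m for m in versions if m < mtime]:
            versions.pop(old_mtime).close()
        if mtime not in versions:
            versions[mtime] = opener(path)
        return versions[mtime]

    def handle_count(self):
        """Return the total number of open handles across all paths."""
        return sum(len(v) for v in self._by_path.values())
```

Evicting on the next access to the same path is only one option. A file that is rewritten and never read again would still need the idle timeout described above to release its handle.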



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)