Posted to issues@hive.apache.org by "Ádám Szita (Jira)" <ji...@apache.org> on 2022/07/27 12:55:00 UTC

[jira] [Assigned] (HIVE-26432) Improve LlapCacheAwareFs by caching file status information

     [ https://issues.apache.org/jira/browse/HIVE-26432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ádám Szita reassigned HIVE-26432:
---------------------------------


> Improve LlapCacheAwareFs by caching file status information
> -----------------------------------------------------------
>
>                 Key: HIVE-26432
>                 URL: https://issues.apache.org/jira/browse/HIVE-26432
>             Project: Hive
>          Issue Type: Improvement
>            Reporter: Ádám Szita
>            Assignee: Ádám Szita
>            Priority: Major
>
> LlapCacheAwareFs currently wraps the InputStreams used to read non-ORC file formats when LLAP caching is enabled.
> File content is cached by the calculated file ID and the required offsets within the file, and is later served from the cache. However, because LlapCacheAwareFs acts as a FileSystem, it sometimes receives listStatus / getFileStatus calls too, and these are only proxied to the original FS. If such an operation is slow on the original FS, e.g. listing on S3, performance suffers. (This does not apply to the ORC integration with the LLAP cache, since that does not act as a FS.)
> I propose that we cache the file status information as well as the content.
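
For illustration only, the proposal could be sketched roughly as follows. This is not the HIVE-26432 patch and does not use the Hadoop or Hive APIs: Status, CachingStatusSource and invalidate are hypothetical stand-ins, and the slow remote lookup is simulated with a counter. The sketch only shows the general idea of answering repeated getFileStatus-style calls from memory instead of proxying each one to the original FS.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Function;

// Hypothetical sketch: memoize status lookups so that repeated calls
// skip the slow backing store. Not the actual Hive implementation.
class StatusCacheSketch {

  // Stand-in for a file status record (assumption: only these fields matter here).
  static final class Status {
    final String path;
    final long length;
    final long modificationTime;

    Status(String path, long length, long modificationTime) {
      this.path = path;
      this.length = length;
      this.modificationTime = modificationTime;
    }
  }

  // Wraps a slow lookup (for example a status call against S3) with an in-memory map.
  static final class CachingStatusSource {
    private final Function<String, Status> delegate;
    private final Map<String, Status> cache = new ConcurrentHashMap<>();

    CachingStatusSource(Function<String, Status> delegate) {
      this.delegate = delegate;
    }

    // The first call per path goes to the delegate; later calls are served from the map.
    Status getFileStatus(String path) {
      return cache.computeIfAbsent(path, delegate);
    }

    // Drops a cached entry, e.g. after the file is known to have changed.
    void invalidate(String path) {
      cache.remove(path);
    }
  }

  public static void main(String[] args) {
    AtomicInteger remoteCalls = new AtomicInteger();
    CachingStatusSource fs = new CachingStatusSource(p -> {
      remoteCalls.incrementAndGet(); // simulates one slow remote round trip
      return new Status(p, 1024L, 0L);
    });

    fs.getFileStatus("s3a://bucket/table/part-0");
    fs.getFileStatus("s3a://bucket/table/part-0");
    System.out.println("remote calls: " + remoteCalls.get());
  }
}
```

A real implementation would also have to decide when cached status becomes stale (eviction, size limits, invalidation on writes); the invalidate method above is only a placeholder for that decision.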



--
This message was sent by Atlassian Jira
(v8.20.10#820010)