Posted to common-issues@hadoop.apache.org by "Arun Ravi M V (Jira)" <ji...@apache.org> on 2019/08/30 13:12:00 UTC

[jira] [Comment Edited] (HADOOP-16540) Pluggable Filesystem Caching Support in FileSystem Class

    [ https://issues.apache.org/jira/browse/HADOOP-16540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16919525#comment-16919525 ] 

Arun Ravi M V edited comment on HADOOP-16540 at 8/30/19 1:11 PM:
-----------------------------------------------------------------

Yes, you are right: the main reason for this ticket is user credentials. I have a situation where a large number of datasets (a few thousand) are located in a single S3 bucket. I am trying to introduce role-based access control here, but AWS policies have size limitations and cannot be used as the only solution. In this case, I would like to define caching per dataset (bucket + root S3 prefix) instead of at the S3 bucket level.


> Pluggable Filesystem Caching Support in FileSystem Class
> --------------------------------------------------------
>
>                 Key: HADOOP-16540
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16540
>             Project: Hadoop Common
>          Issue Type: New Feature
>          Components: fs
>    Affects Versions: 3.3.0
>            Reporter: Arun Ravi M V
>            Priority: Major
>
> Provide an option to use a custom cache class in the FileSystem class. Currently, caching is enabled by default and uses the URI scheme and authority to determine whether to create a new FS instance for the given URI or to fetch an existing one from the cache.
> In the case of the AWS S3 FS implementation, the authority of an S3 path is the bucket name, i.e. the FileSystem object is cached at the bucket level. Providing custom caching logic would let the user cache at some prefix level and give more flexibility.
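
To make the request concrete, here is a minimal, self-contained sketch of the difference between a bucket-level and a dataset-level cache key. It is not Hadoop code: PrefixCacheKeySketch, bucketKey, datasetKey and PluggableCache are hypothetical names, the bucket and dataset names are made up, and the sketch assumes the dataset root is the first path component. It only models the scheme and authority described above and ignores anything else the real cache may take into account.

```java
import java.net.URI;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

// Hypothetical illustration only; none of these names are Hadoop APIs.
final class PrefixCacheKeySketch {

    // Bucket-level key: scheme + authority, as described in the issue.
    static String bucketKey(URI uri) {
        return uri.getScheme() + "://" + uri.getAuthority();
    }

    // Dataset-level key: scheme + authority + first path component.
    // Assumption: the dataset root is the first segment of the path.
    static String datasetKey(URI uri) {
        String path = uri.getPath() == null ? "" : uri.getPath();
        String[] parts = path.split("/");
        // parts[0] is empty because the path starts with "/".
        String root = parts.length > 1 ? parts[1] : "";
        return bucketKey(uri) + "/" + root;
    }

    // A tiny cache whose key derivation is supplied by the caller.
    static final class PluggableCache<V> {
        private final Map<String, V> entries = new ConcurrentHashMap<>();
        private final Function<URI, String> keyFn;

        PluggableCache(Function<URI, String> keyFn) {
            this.keyFn = keyFn;
        }

        // Returns the cached value for the URI's key, creating it on first use.
        V get(URI uri, Function<URI, V> factory) {
            return entries.computeIfAbsent(keyFn.apply(uri), k -> factory.apply(uri));
        }

        int size() {
            return entries.size();
        }
    }

    public static void main(String[] args) {
        URI a = URI.create("s3a://my-bucket/datasetA/part-0000");
        URI b = URI.create("s3a://my-bucket/datasetB/part-0000");

        PluggableCache<Object> perBucket = new PluggableCache<>(PrefixCacheKeySketch::bucketKey);
        PluggableCache<Object> perDataset = new PluggableCache<>(PrefixCacheKeySketch::datasetKey);

        for (URI uri : new URI[] {a, b}) {
            perBucket.get(uri, u -> new Object());
            perDataset.get(uri, u -> new Object());
        }

        System.out.println("bucket-level entries:  " + perBucket.size());
        System.out.println("dataset-level entries: " + perDataset.size());
    }
}
```

With a bucket-level key, both paths share one cached instance and therefore one set of credentials. With a dataset-level key, each dataset root gets its own instance, which is what per-dataset role-based access would need.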


