Posted to issues@ozone.apache.org by "Yiqun Lin (Jira)" <ji...@apache.org> on 2020/09/03 16:05:00 UTC

[jira] [Comment Edited] (HDDS-2939) Ozone FS namespace

    [ https://issues.apache.org/jira/browse/HDDS-2939?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17190253#comment-17190253 ] 

Yiqun Lin edited comment on HDDS-2939 at 9/3/20, 4:04 PM:
----------------------------------------------------------

Some discussion about the latest status of HDDS-2939, which I asked about on the mailing list.
 From [~rakeshr]:
{quote}Presently, I am working on the directory cache design and upgrade design.
 These two tasks are very important as the first one would help to *reduce
 the performance penalties on the path traversal*. Later one is to provide
 an efficient way to make a smooth upgrade experience to the users.
{quote}
Here the directory cache is used to avoid the additional lookup overhead. The latest design of the directory cache hasn't been attached yet, so these are just some thoughts from me:

I think two types of mapping cache would be useful:
 * <KeyName, KeyInfo>, e.g. </vol1/buck1/a/b/c/d/file1, KeyInfo>, so that we can skip the traversal from the dir table to the key table.
 * <DirName, List<KeyInfo>>, used for the listStatus scenario. Listing files can be a very expensive call under the Ozone FS namespace.
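
To make the two mappings concrete, here is a minimal sketch, not the actual design. KeyInfo below is a hypothetical stand-in for the real key metadata, and the two loader functions are assumptions that stand in for the dir table / key table lookups in the DB.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical stand-in for the key metadata; only what the sketch needs.
final class KeyInfo {
    final String path;
    final long size;

    KeyInfo(String path, long size) {
        this.path = path;
        this.size = size;
    }
}

// Sketch of the two mappings: full key path -> KeyInfo, and dir -> children.
final class NamespaceCacheSketch {
    private final Map<String, KeyInfo> keyCache = new HashMap<>();
    private final Map<String, List<KeyInfo>> dirListingCache = new HashMap<>();

    // Assumed loaders that represent the traversal through the DB tables.
    private final Function<String, KeyInfo> keyLoader;
    private final Function<String, List<KeyInfo>> listingLoader;

    // Counts how often a call fell through to the underlying store.
    int storeLookups = 0;

    NamespaceCacheSketch(Function<String, KeyInfo> keyLoader,
                         Function<String, List<KeyInfo>> listingLoader) {
        this.keyLoader = keyLoader;
        this.listingLoader = listingLoader;
    }

    // <KeyName, KeyInfo>: a hit skips the dir table -> key table traversal.
    KeyInfo getKey(String fullPath) {
        KeyInfo cached = keyCache.get(fullPath);
        if (cached != null) {
            return cached;
        }
        storeLookups++;
        KeyInfo loaded = keyLoader.apply(fullPath);
        if (loaded != null) {
            keyCache.put(fullPath, loaded);
        }
        return loaded;
    }

    // <DirName, List<KeyInfo>>: a hit avoids re-listing the directory.
    List<KeyInfo> listStatus(String dirPath) {
        List<KeyInfo> cached = dirListingCache.get(dirPath);
        if (cached != null) {
            return cached;
        }
        storeLookups++;
        List<KeyInfo> loaded = listingLoader.apply(dirPath);
        if (loaded != null) {
            dirListingCache.put(dirPath, loaded);
        }
        return loaded;
    }
}
```

This sketch has no bound on size and never invalidates entries, so it only shows the shape of the two maps.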

The cache introduced here can speed up metadata access, but there are two aspects we need to consider:
 * Cache entry eviction policy. We cannot cache all the dir/file entries.
 * Consistency between the dir cache and the underlying store. A cache entry becomes stale when the DB store is updated but the change is not synced to the corresponding cache entry. A cache refresh interval can be introduced here: only when a cache entry has not been updated for longer than the given refresh interval do we refresh it by querying the DB table. Users can set different refresh intervals to get the cache freshness their scenarios need. They can also disable the cache by setting the interval to 0, which means each query goes directly to the DB.
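
A rough sketch of how the two points could fit together, under my own assumptions: LRU is picked here only as an example eviction policy (built on java.util.LinkedHashMap in access order), and the class name, the dbLoader function and the injected clock are all hypothetical. An interval of 0 bypasses the cache entirely.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.Function;
import java.util.function.LongSupplier;

// Hypothetical bounded cache with a per-entry refresh interval.
final class RefreshingLruCache<K, V> {
    // A cached value together with the time it was loaded from the DB.
    private static final class Timestamped<T> {
        final T value;
        final long loadedAtMillis;

        Timestamped(T value, long loadedAtMillis) {
            this.value = value;
            this.loadedAtMillis = loadedAtMillis;
        }
    }

    private final long refreshIntervalMillis;
    private final Function<K, V> dbLoader;   // assumed stand-in for a DB table query
    private final LongSupplier clock;        // injected so tests can control time
    private final Map<K, Timestamped<V>> entries;

    RefreshingLruCache(int maxEntries, long refreshIntervalMillis,
                       Function<K, V> dbLoader, LongSupplier clock) {
        this.refreshIntervalMillis = refreshIntervalMillis;
        this.dbLoader = dbLoader;
        this.clock = clock;
        // accessOrder=true keeps the least recently used entry first,
        // and removeEldestEntry evicts it once the bound is exceeded.
        this.entries = new LinkedHashMap<K, Timestamped<V>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<K, Timestamped<V>> eldest) {
                return size() > maxEntries;
            }
        };
    }

    synchronized V get(K key) {
        long now = clock.getAsLong();
        if (refreshIntervalMillis > 0) {
            Timestamped<V> cached = entries.get(key);
            // Serve from cache while the entry is within the refresh interval.
            if (cached != null && now - cached.loadedAtMillis <= refreshIntervalMillis) {
                return cached.value;
            }
        }
        // Missing, stale, or cache disabled: query the DB.
        V fresh = dbLoader.apply(key);
        if (refreshIntervalMillis > 0) {
            entries.put(key, new Timestamped<>(fresh, now));
        }
        return fresh;
    }

    synchronized int size() {
        return entries.size();
    }
}
```

Note that within the refresh interval a reader can still see a stale entry, so the interval is effectively the staleness bound users are choosing.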


> Ozone FS namespace
> ------------------
>
>                 Key: HDDS-2939
>                 URL: https://issues.apache.org/jira/browse/HDDS-2939
>             Project: Hadoop Distributed Data Store
>          Issue Type: New Feature
>          Components: Ozone Manager
>            Reporter: Supratim Deka
>            Assignee: Rakesh Radhakrishnan
>            Priority: Major
>              Labels: Triaged
>         Attachments: Ozone FS Namespace Proposal v1.0.docx
>
>
> Create the structures and metadata layout required to support efficient FS namespace operations in Ozone - operations involving folders/directories required to support the Hadoop compatible Filesystem interface.
> The details are described in the attached document. The work is divided up into sub-tasks as per the task list in the document.


