Posted to hdfs-dev@hadoop.apache.org by "Bharat Viswanadham (Jira)" <ji...@apache.org> on 2019/11/20 19:22:00 UTC
[jira] [Resolved] (HDDS-2241) Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipelines for a key
[ https://issues.apache.org/jira/browse/HDDS-2241?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Bharat Viswanadham resolved HDDS-2241.
--------------------------------------
Fix Version/s: 0.5.0
Resolution: Fixed
> Optimize the refresh pipeline logic used by KeyManagerImpl to obtain the pipelines for a key
> --------------------------------------------------------------------------------------------
>
> Key: HDDS-2241
> URL: https://issues.apache.org/jira/browse/HDDS-2241
> Project: Hadoop Distributed Data Store
> Issue Type: Bug
> Components: Ozone Manager
> Reporter: Aravindan Vijayan
> Assignee: Aravindan Vijayan
> Priority: Major
> Labels: pull-request-available
> Fix For: 0.5.0
>
> Time Spent: 20m
> Remaining Estimate: 0h
>
> Currently, while looking up a key, the Ozone Manager gets the pipeline information from SCM through a separate RPC for every block in the key. For large files (> 1 GB), this can add up to a large number of RPC calls. This can be optimized in a couple of ways:
> * We can implement a batch getContainerWithPipeline API in SCM with which we can get the pipeline information for all the blocks of a file in one request. To bound the number of containers passed to SCM in a single call, we can use a fixed container batch size on the OM side. _Here, Number of calls = 1 (or k, depending on the batch size)_
> * Instead, a simpler change would be to keep a method-local map of ContainerID -> Pipeline built from the SCM responses, so that we don't make repeated calls to SCM for the same containerID while looking up a key (sketched below). _Here, Number of calls = Number of unique containerIDs_
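> A minimal sketch of the second, map-based approach (this is not the actual KeyManagerImpl code; ScmClient, Pipeline, and BlockLocation below are hypothetical stand-ins for the real Ozone types):
> {code:java}
> import java.util.HashMap;
> import java.util.List;
> import java.util.Map;
>
> public class PipelineRefreshSketch {
>
>   /** Hypothetical stand-in for the SCM client call mentioned above. */
>   interface ScmClient {
>     Pipeline getContainerWithPipeline(long containerId);
>   }
>
>   /** Hypothetical stand-in for a pipeline descriptor returned by SCM. */
>   static class Pipeline { }
>
>   /** Hypothetical stand-in for one block location of a key. */
>   static class BlockLocation {
>     final long containerId;
>     Pipeline pipeline;
>     BlockLocation(long containerId) { this.containerId = containerId; }
>   }
>
>   /**
>    * Refreshes pipelines for all blocks of a key, calling SCM only once
>    * per unique containerID instead of once per block.
>    */
>   static void refreshPipelines(List<BlockLocation> blocks, ScmClient scm) {
>     // Method-local cache: ContainerID -> Pipeline.
>     Map<Long, Pipeline> cache = new HashMap<>();
>     for (BlockLocation block : blocks) {
>       // computeIfAbsent issues the RPC only on a cache miss.
>       block.pipeline =
>           cache.computeIfAbsent(block.containerId, scm::getContainerWithPipeline);
>     }
>   }
> }
> {code}
> With this shape, the number of RPCs drops from one per block to one per distinct container; the batch API option would instead group those distinct containers into one request (or k requests) per key.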
--
This message was sent by Atlassian Jira
(v8.3.4#803005)
---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: hdfs-dev-help@hadoop.apache.org