Posted to issues@ozone.apache.org by "Wei-Chiu Chuang (Jira)" <ji...@apache.org> on 2021/06/24 15:15:00 UTC

[jira] [Updated] (HDDS-5384) OM refreshPipeline should not invoke the expensive OmKeyLocationInfoGroup.getLocationList()

     [ https://issues.apache.org/jira/browse/HDDS-5384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HDDS-5384:
----------------------------------
    Attachment: om_liststatus_alloc_before.svg
                om_liststatus_cpu_before.svg
                om_listatus_alloc_after.svg
                om_liststatus_cpu_after.svg

> OM refreshPipeline should not invoke the expensive OmKeyLocationInfoGroup.getLocationList()
> -------------------------------------------------------------------------------------------
>
>                 Key: HDDS-5384
>                 URL: https://issues.apache.org/jira/browse/HDDS-5384
>             Project: Apache Ozone
>          Issue Type: Improvement
>          Components: OM
>            Reporter: Wei-Chiu Chuang
>            Assignee: Wei-Chiu Chuang
>            Priority: Major
>         Attachments: om_listatus_alloc_after.svg, om_liststatus_alloc_before.svg, om_liststatus_cpu_after.svg, om_liststatus_cpu_before.svg
>
>
> The OM's refreshPipeline implementation (used by listStatus) iterates over OmKeyLocationInfoGroup.getLocationList(), which is a very expensive call. It iterates over a collection of lists of objects, allocates a new list, and performs operations on each element. In short, it's an O(n) method in terms of both space and time complexity.
> There are many places in the Ozone code that use this method. Most usages iterate over the generated list without modifying it. We should instead return the collection of lists, which is O(1).
> I have a client that issues many listStatus calls to examine the effect. Before the change, refreshPipeline accounts for 8.65% of heap usage. After: 1.95%.
> CPU cost: before: 8.18%; after: 4.22%.
> We should refrain from invoking getLocationList() as much as possible. But given its wide usage in the code, I chose not to remove all usages, to avoid destabilizing the code. Instead, I changed the usage in refreshPipeline to demonstrate the impact.
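
The difference between the two access patterns described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual Ozone code: LocationGroupSketch is a hypothetical stand-in for OmKeyLocationInfoGroup, strings stand in for the location objects, and locationListsView() is a made-up name for the proposed accessor that returns the collection of lists.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for OmKeyLocationInfoGroup; not the real Ozone class. */
final class LocationGroupSketch {
  // Locations grouped by version; strings stand in for location objects.
  private final Map<Long, List<String>> locationsByVersion = new LinkedHashMap<>();

  void add(long version, List<String> locations) {
    locationsByVersion.put(version, new ArrayList<>(locations));
  }

  /** O(n) time and space: allocates a new list and copies every element per call. */
  List<String> getLocationList() {
    List<String> flat = new ArrayList<>();
    for (List<String> perVersion : locationsByVersion.values()) {
      flat.addAll(perVersion);
    }
    return flat;
  }

  /**
   * O(1): wraps the existing collection of lists in a read-only view.
   * Nothing is copied; the inner lists are shared with this object.
   * (Hypothetical accessor name, chosen for this sketch.)
   */
  Collection<List<String>> locationListsView() {
    return Collections.unmodifiableCollection(locationsByVersion.values());
  }

  /** Read-only traversal of the kind described above, without the flattened copy. */
  static int countLocations(LocationGroupSketch group) {
    int count = 0;
    for (List<String> perVersion : group.locationListsView()) {
      count += perVersion.size();
    }
    return count;
  }
}
```

A caller that only reads the locations can use the nested loop in countLocations() in place of a loop over getLocationList(), so no per-call list is allocated.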



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscribe@ozone.apache.org
For additional commands, e-mail: issues-help@ozone.apache.org