Posted to issues@ozone.apache.org by "Istvan Fajth (Jira)" <ji...@apache.org> on 2020/06/12 11:49:00 UTC

[jira] [Comment Edited] (HDDS-3721) Implement getContentSummary to provide replicated size properly to dfs -du command

    [ https://issues.apache.org/jira/browse/HDDS-3721?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17134151#comment-17134151 ] 

Istvan Fajth edited comment on HDDS-3721 at 6/12/20, 11:48 AM:
---------------------------------------------------------------

At first I thought this was simply a client-side problem, but after going into the details a bit, I realised that there might be a reason why HDFS has this on the server side. I started to look into that, but then had to put this one aside for a while.

The benefit of approaching this from the client side is that it stays on the client side and avoids a heavy implementation on the OM side. On the other hand, it is painfully slow, and the runtime scales with the number of elements in a directory: it was running for ~25 seconds on a folder with 82k files in 3.5k subfolders.
The other problem with the client-side approach is that it leads to 4 calls per subdirectory (14k calls in this case): 1 READ_BUCKET, then 1 GET_FILE_STATUS (to see whether it is a file or a directory), then, if it is a directory, 1 READ_BUCKET again, and finally a LIST_STATUS. The server side can hardly control or throttle these, as they come from the client side, possibly from multiple clients at the same time.
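
Just to make the call pattern concrete, this is roughly the kind of recursion I mean on the client side (the class and method names are made up for the sketch, it ignores error handling, and the READ_BUCKET operations happen on the OM side behind these calls):
{code:java}
import java.io.IOException;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

// Rough client-side sketch: recursively sum the replicated size through the
// generic FileSystem API. Every directory costs extra round trips
// (getFileStatus + listStatus), which is what makes this slow on large trees.
public final class ClientSideDuSketch {
  static long replicatedSize(FileSystem fs, Path path) throws IOException {
    FileStatus status = fs.getFileStatus(path);          // one RPC
    if (status.isFile()) {
      return status.getLen() * status.getReplication();
    }
    long total = 0;
    for (FileStatus child : fs.listStatus(path)) {       // one more RPC per directory
      total += child.isFile()
          ? child.getLen() * child.getReplication()
          : replicatedSize(fs, child.getPath());         // recurse into subdirectories
    }
    return total;
  }
}
{code}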


The benefit of having something similar in the OM API is that it is just one call, and we can do throttling and any kind of optimisation on the OM side as needed; we might even cache the values if that becomes necessary.
The problem with this approach is that it possibly requires a lock, and it is an operation that could block OM for too long... I am unsure though whether we even need the read lock.
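
For comparison, the FileSystem-facing side of the OM-based approach could be a single round trip; a rough sketch only (OmDirectorySummary and omSummary(...) are made-up placeholders for whatever new OM API this would be, ContentSummary.Builder is the existing Hadoop one):
{code:java}
// Sketch only: this would override FileSystem#getContentSummary in the o3fs
// implementation, delegating the whole tree walk to OM so that throttling
// and caching can live entirely on the OM side.
public ContentSummary getContentSummary(Path f) throws IOException {
  OmDirectorySummary summary = omSummary(pathToKey(f)); // one OM round trip instead of 4 per directory
  return new ContentSummary.Builder()
      .length(summary.getLogicalSize())                  // un-replicated bytes
      .spaceConsumed(summary.getReplicatedSize())        // bytes multiplied by the replication factor
      .fileCount(summary.getFileCount())
      .directoryCount(summary.getDirectoryCount())
      .build();
}
{code}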


[~arp], can you give some insight into why you would like to avoid implementing this on the OM side, and perhaps why it was implemented on the server side for HDFS in the end?



> Implement getContentSummary to provide replicated size properly to dfs -du command
> ----------------------------------------------------------------------------------
>
>                 Key: HDDS-3721
>                 URL: https://issues.apache.org/jira/browse/HDDS-3721
>             Project: Hadoop Distributed Data Store
>          Issue Type: Improvement
>            Reporter: Istvan Fajth
>            Assignee: Istvan Fajth
>            Priority: Major
>              Labels: Triaged
>
> Currently, when you run the hdfs dfs -du command against a path on Ozone, it uses the default implementation from the FileSystem class in the Hadoop project, which does not take the replication factor into account. In DistributedFileSystem and in a couple of other FileSystem implementations there is an override that calculates the full replicated size properly.
> Currently the output is something like this for a folder whose files have a replication factor of 3:
> {code}
> hdfs dfs -du -s -h o3fs://perfbucket.volume.ozone1/terasort/datagen
> 931.3 G  931.3 G  o3fs://perfbucket.volume.ozone1/terasort/datagen
> {code}
> In Ozone's case the command should also report the replicated size as the second number, so something around 2.7 TB in this case.
> In order to do so, we should implement getContentSummary and calculate the replicated size properly in the response.
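>
> After the fix the output should look roughly like this (the 2.7 T figure is simply ~3x the size above, not a measured value):
> {code}
> hdfs dfs -du -s -h o3fs://perfbucket.volume.ozone1/terasort/datagen
> 931.3 G  2.7 T  o3fs://perfbucket.volume.ozone1/terasort/datagen
> {code}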



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

---------------------------------------------------------------------
To unsubscribe, e-mail: ozone-issues-unsubscribe@hadoop.apache.org
For additional commands, e-mail: ozone-issues-help@hadoop.apache.org