Posted to issues@ozone.apache.org by "Stephen O'Donnell (Jira)" <ji...@apache.org> on 2020/07/08 17:44:00 UTC

[jira] [Commented] (HDDS-1930) Test Topology Aware Job scheduling with Ozone Topology

    [ https://issues.apache.org/jira/browse/HDDS-1930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17153798#comment-17153798 ] 

Stephen O'Donnell commented on HDDS-1930:
-----------------------------------------

On an 8-datanode cluster (not ideal for Ozone, as the datanode count should be a multiple of 3), I ran teragen and then terasort:

{code}
ozone sh volume create o3://ozone1/teragen
ozone sh bucket create o3://ozone1/teragen/bucket

yarn jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar teragen -Dmapreduce.job.maps=8 1000000000 o3fs://bucket.teragen.ozone1/test3

yarn jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar terasort -Dmapreduce.job.maps=30 o3fs://bucket.teragen.ozone1/test3 o3fs://bucket.teragen.ozone1/sort-test3
{code}

The job counters at the end reported:

{code}
		Killed map tasks=244
		Launched map tasks=620
		Launched reduce tasks=1
		Other local map tasks=244
		Data-local map tasks=321
		Rack-local map tasks=55
{code}

So 620 - 244 = 376 map containers completed successfully; 321 of them were "Data-local" and the remaining 55 were "Rack-local". I would like to run a few more tests to confirm, but based on this it does seem that the locality information is making it through to YARN, and that YARN is picking nodes local to the data or at least in the same rack.
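
To make that arithmetic easy to repeat on later runs, here is a minimal Python sketch that parses the counter lines and reports the locality split. It is only an illustration: the helper names (parse_counters, locality_summary) are hypothetical, and it assumes, as the calculation above does, that killed map attempts are excluded and that every completed map task is counted as either data-local or rack-local.

```python
import re

# Counter lines copied from the terasort job output above.
COUNTERS_TEXT = """
Killed map tasks=244
Launched map tasks=620
Launched reduce tasks=1
Other local map tasks=244
Data-local map tasks=321
Rack-local map tasks=55
"""


def parse_counters(text):
    """Parse 'Name=value' counter lines into a dict of ints (hypothetical helper)."""
    counters = {}
    for line in text.splitlines():
        match = re.match(r"\s*(.+?)=(\d+)\s*$", line)
        if match:
            counters[match.group(1)] = int(match.group(2))
    return counters


def locality_summary(counters):
    """Compute completed map tasks and their locality percentages.

    Assumption: completed = launched - killed, as in the comment above.
    """
    completed = counters["Launched map tasks"] - counters["Killed map tasks"]
    data_local = counters["Data-local map tasks"]
    rack_local = counters["Rack-local map tasks"]
    return {
        "completed": completed,
        "data_local_pct": round(100.0 * data_local / completed, 1),
        "rack_local_pct": round(100.0 * rack_local / completed, 1),
    }


summary = locality_summary(parse_counters(COUNTERS_TEXT))
print(summary)
```

For the counters above this gives 376 completed map tasks, of which roughly 85.4% were data-local and 14.6% rack-local.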

> Test Topology Aware Job scheduling with Ozone Topology
> ------------------------------------------------------
>
>                 Key: HDDS-1930
>                 URL: https://issues.apache.org/jira/browse/HDDS-1930
>             Project: Hadoop Distributed Data Store
>          Issue Type: Sub-task
>            Reporter: Xiaoyu Yao
>            Assignee: Stephen O'Donnell
>            Priority: Major
>             Fix For: 0.6.0
>
>
> My initial results with Terasort do not seem to report the counter properly: most of the requests are handled as rack-local and none as node-local. This ticket is opened to add more system testing to validate the feature.
> Total Allocated Containers: 3778
> Each table cell represents the number of NodeLocal/RackLocal/OffSwitch containers satisfied by NodeLocal/RackLocal/OffSwitch resource requests.
>                                           Node Local Request  Rack Local Request  Off Switch Request
> Num Node Local Containers (satisfied by)  0
> Num Rack Local Containers (satisfied by)  0                   3648
> Num Off Switch Containers (satisfied by)  0                   96                  34


