Posted to yarn-dev@hadoop.apache.org by "Huangkaixuan (JIRA)" <ji...@apache.org> on 2017/03/06 07:50:32 UTC

[jira] [Created] (YARN-6289) yarn got little data locality

Huangkaixuan created YARN-6289:
----------------------------------

             Summary: yarn got little data locality
                 Key: YARN-6289
                 URL: https://issues.apache.org/jira/browse/YARN-6289
             Project: Hadoop YARN
          Issue Type: Improvement
          Components: capacity scheduler
         Environment: Hardware configuration
CPU: 2 x Intel(R) Xeon(R) E5-2620 v2 @ 2.10GHz /15M Cache 6-Core 12-Thread 
Memory: 128GB Memory (16x8GB) 1600MHz
Disk: 600GBx2 3.5-inch with RAID-1
Network bandwidth: 968Mb/s
Software configuration
Spark-1.6.2, Hadoop-2.7.1

            Reporter: Huangkaixuan
            Priority: Minor


When I ran this experiment, a wordcount job on the file with both Spark and MapReduce, I noticed that the job did not get data locality on every run. Task placement appeared to be random, even though no other job was running on the cluster. I expected the task to always be placed on the single machine that holds the data block, but that did not happen.
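
One way to quantify the behaviour described above is to compare the host each task ran on with the set of hosts holding the input block, and report the fraction of runs that were node-local. The following is only a minimal sketch: the function name, the host names, and the sample values are hypothetical placeholders for illustration, not measurements from this cluster.

```python
def locality_fraction(task_hosts, block_hosts):
    """Return the fraction of tasks that ran on a host holding the input block.

    task_hosts:  list of host names, one per observed task run (hypothetical input)
    block_hosts: set of host names that store a replica of the block
    """
    if not task_hosts:
        return 0.0
    local = sum(1 for host in task_hosts if host in block_hosts)
    return local / len(task_hosts)


# Made-up illustration only: a single replica on "node1" and five job runs.
block_hosts = {"node1"}
observed_task_hosts = ["node1", "node3", "node1", "node2", "node4"]
print(locality_fraction(observed_task_hosts, block_hosts))
```

With the expectation stated above (every task placed on the machine holding the block), this fraction would be 1.0 for every batch of runs; anything lower indicates non-local placement.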



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-dev-unsubscribe@hadoop.apache.org
For additional commands, e-mail: yarn-dev-help@hadoop.apache.org