Posted to common-dev@hadoop.apache.org by "Doug Cutting (JIRA)" <ji...@apache.org> on 2009/06/19 19:42:07 UTC

[jira] Commented: (HADOOP-6087) distcp can support bandwidth limiting

    [ https://issues.apache.org/jira/browse/HADOOP-6087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721899#action_12721899 ] 

Doug Cutting commented on HADOOP-6087:
--------------------------------------

Some comments:
 - 'sleeptime' should be accessed through 'getSleeptime()' to be thread safe, no?  Or maybe use an int for the sleep time, since updates to an int are atomic.
 - getNumRunningMaps() is expensive to call from each node at each interval, since reports for all tasks must be retrieved from the JT.  Better would be to just fetch the job's counters each time, since they're constant-sized, not proportional to the number of tasks.  You'd need to add a maps_completed counter, then use the difference between TOTAL_LAUNCHED_MAPS and that counter to calculate the number running.
 - The interval to contact the JT might be randomized a bit, so that not all tasks hit it at the same time, e.g., by adding a random value that's up to 10% of the specified value.
 - When InterruptedException is caught, a thread should generally exit, not simply log a warning.  If things will no longer work correctly without the thread, then it should somehow cause other dependent threads to fail too.
 - getNumRunningMaps() should either return a correct value or throw an exception.  If it cannot contact the JT or if the task does not know its Id, it should fail, no?
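
To make the suggestions above concrete, here is a rough sketch, not the patch itself.  It assumes a hypothetical CounterSource interface standing in for whatever call fetches the job's counters from the JT; the class and method names (BandwidthPollSketch, runningMaps, jitteredIntervalMs, Poller) are made up for illustration.  It shows the running-map count derived from two counters, an interval with up to 10% random jitter, and a polling thread that exits on interrupt and records any other failure instead of just logging it.

```java
import java.io.IOException;
import java.util.Random;

class BandwidthPollSketch {

    /** Hypothetical stand-in for reading the job's counters from the JT. */
    interface CounterSource {
        long totalLaunchedMaps() throws IOException;
        long mapsCompleted() throws IOException;
    }

    /** Running maps = launched minus completed; fails rather than guessing. */
    static long runningMaps(long totalLaunchedMaps, long mapsCompleted) {
        if (mapsCompleted < 0 || totalLaunchedMaps < mapsCompleted) {
            throw new IllegalStateException("inconsistent counters: launched="
                + totalLaunchedMaps + ", completed=" + mapsCompleted);
        }
        return totalLaunchedMaps - mapsCompleted;
    }

    /** Base interval plus a random extra of less than 10% of the base. */
    static long jitteredIntervalMs(long baseMs, Random rng) {
        return baseMs + (long) (rng.nextDouble() * baseMs * 0.10);
    }

    /** Polls the counters until interrupted or until a read fails. */
    static class Poller implements Runnable {
        private final CounterSource counters;
        private final long baseIntervalMs;
        private final Random rng = new Random();
        // volatile so that other threads see the latest values
        private volatile long running = -1;
        private volatile Exception failure;

        Poller(CounterSource counters, long baseIntervalMs) {
            this.counters = counters;
            this.baseIntervalMs = baseIntervalMs;
        }

        long getRunning() { return running; }

        /** Non-null once polling has failed; dependent threads should check it. */
        Exception getFailure() { return failure; }

        @Override
        public void run() {
            try {
                while (!Thread.currentThread().isInterrupted()) {
                    running = runningMaps(counters.totalLaunchedMaps(),
                                          counters.mapsCompleted());
                    Thread.sleep(jitteredIntervalMs(baseIntervalMs, rng));
                }
            } catch (InterruptedException e) {
                // Restore the interrupt flag and exit the thread.
                Thread.currentThread().interrupt();
            } catch (IOException | RuntimeException e) {
                // Record the failure so callers can fail too, then exit.
                failure = e;
            }
        }
    }
}
```

A copier thread that depends on the poller would check getFailure() before using getRunning() and abort if it is non-null, so a dead poller cannot leave the throttle running on a stale value.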


> distcp can support bandwidth limiting
> -------------------------------------
>
>                 Key: HADOOP-6087
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6087
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: tools/distcp
>    Affects Versions: 0.21.0
>            Reporter: Ravi Gummadi
>            Assignee: Ravi Gummadi
>             Fix For: 0.21.0
>
>         Attachments: d_bw.patch
>
>
> distcp should support an option for user to specify the bandwidth limit for the distcp job.
