Posted to user@hadoop.apache.org by manoj <ma...@gmail.com> on 2015/08/19 19:33:16 UTC

App Master takes ~30min to re-schedule task attempts.

Hello all,

I'm running Apache Hadoop 2.6.0.
I'm trying to remove a node from a Hadoop cluster and then add it back.
The task attempts on the removed node are rescheduled only after
30 min.

During this 30 min period, it looks like the App Master keeps trying to
connect to the node that was removed (see the log below). After about
30 min it reschedules the task attempts from the lost node, and the job
eventually succeeds.

How can I reduce the 30 min wait time?

.....
......
2015-08-14 11:25:21,662 INFO [ContainerLauncher #7]
org.apache.hadoop.ipc.Client: Retrying connect to server:
host172/XX.XX.XX.XX:36158. Already tried 0 time(s); retry policy is
RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000
MILLISECONDS)
......
......
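
As a rough sanity check on the retry policy printed in the log line above,
the sketch below parses maxRetries and sleepTime and compares the resulting
sleep budget with the ~30 min delay. The names LOG_LINE and
sleep_budget_seconds are hypothetical helpers for illustration, not part of
Hadoop. The calculation counts only the fixed sleeps between attempts, not
any per-attempt connect timeout, so it is a lower-bound illustration rather
than a diagnosis.

```python
import re

# Retry policy text copied from the log excerpt above.
LOG_LINE = (
    "Already tried 0 time(s); retry policy is "
    "RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 "
    "MILLISECONDS)"
)


def sleep_budget_seconds(line):
    """Return the total sleep time (seconds) one pass of the policy allows.

    Hypothetical helper: only multiplies the two values found in the log
    line; it does not model connect timeouts or any outer retry layer.
    """
    match = re.search(r"maxRetries=(\d+), sleepTime=(\d+)\s+MILLISECONDS", line)
    if match is None:
        raise ValueError("no retry policy found in line")
    max_retries, sleep_ms = int(match.group(1)), int(match.group(2))
    return max_retries * sleep_ms / 1000.0


budget = sleep_budget_seconds(LOG_LINE)
observed = 30 * 60  # the ~30 min delay reported in this post, in seconds

print(f"sleep budget per pass: {budget:.0f} s")
print(f"observed delay / sleep budget: {observed / budget:.0f}x")
```

One pass of the logged policy sleeps for far less than the observed delay,
which suggests (as an assumption, not something the log confirms) that most
of the wait comes from per-attempt timeouts or from repeated passes rather
than from this fixed-sleep policy alone.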

-Thanks
--Manoj Kumar M