Posted to user@hadoop.apache.org by Nan Zhu <zh...@gmail.com> on 2013/10/24 05:35:18 UTC

dynamically resizing the Hadoop cluster?

Hi, all

I’m running a Hadoop cluster on AWS EC2.

I would like to dynamically resize the cluster to reduce cost. Is there any solution to achieve this?

E.g., I would like to cut the cluster size in half. Is it safe to just shut down the instances? (If some tasks are still running on them, can I rely on speculative execution to re-run them on the other nodes?)
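
For reference, by "speculative execution" I mean the standard Hadoop 1.x
switches below; a minimal mapred-site.xml sketch, with both properties at
their defaults:

  <!-- re-launch straggling map/reduce attempts on other nodes -->
  <property>
    <name>mapred.map.tasks.speculative.execution</name>
    <value>true</value>
  </property>
  <property>
    <name>mapred.reduce.tasks.speculative.execution</name>
    <value>true</value>
  </property>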

I cannot use EMR, since I’m running a customized version of Hadoop.

Best,

--  
Nan Zhu
School of Computer Science,
McGill University



Re: dynamically resizing the Hadoop cluster?

Posted by Bryan Beaudreault <bb...@hubspot.com>.
It seems like you may want to look into Amazon's EMR (Elastic MapReduce),
which does much of what you are trying to do. It's worth a look since
you're already storing your data in S3 and using EC2 for your cluster(s).
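
For anyone who can run a stock cluster, resizing EMR is a one-liner. A
minimal sketch with the AWS CLI (assuming a current awscli; the cluster
and instance-group IDs below are placeholders, which you can look up with
"aws emr list-clusters" and "aws emr list-instance-groups"):

  # shrink one instance group of a running EMR cluster to 5 nodes
  aws emr modify-instance-groups \
    --cluster-id j-XXXXXXXXXXXXX \
    --instance-groups InstanceGroupId=ig-XXXXXXXXXXXXX,InstanceCount=5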


On Thu, Oct 24, 2013 at 5:07 PM, Nan Zhu <zh...@gmail.com> wrote:

> Good explanation,
>
> Thank you, Ravi
>
> Best,
>
>
> On Thu, Oct 24, 2013 at 4:51 PM, Ravi Prakash <ra...@ymail.com> wrote:
>
>> Hi Nan!
>>
>> If the task trackers stop heartbeating back to the JobTracker, the
>> JobTracker will mark them as dead and reschedule the tasks which were
>> running on that TaskTracker. Admittedly there is some delay between when
>> the TaskTrackers stop heartbeating back and when the JobTracker marks them
>> dead. This is controlled by the mapred.tasktracker.expiry.interval
>> parameter (I'm assuming you are using Hadoop 1.x).
>>
>> HTH
>> Ravi
>>
>>
>>
>>
>>
>>
>>   On Thursday, October 24, 2013 1:21 PM, Nan Zhu <zh...@gmail.com>
>> wrote:
>>  Hi, Ravi,
>>
>> Thank you for the reply
>>
>> Actually I'm not running HDFS on EC2, instead I use S3 to store data
>>
>> I'm curious about that, if some nodes are decommissioned, the JobTracker
>> will deal those tasks which originally ran on them as "too slow" (since no
>> progress for a long time) so to run speculative execution OR it directly
>> treats them as "belonging to a running Job and ran on a dead TaskTracker"?
>>
>> Best,
>>
>> Nan
>>
>>
>>
>>
>>
>>
>> On Thu, Oct 24, 2013 at 2:04 PM, Ravi Prakash <ra...@ymail.com> wrote:
>>
>> Hi Nan!
>>
>> Usually nodes are decommissioned slowly over some period of time so as
>> not to disrupt the running jobs. When a node is decommissioned, the
>> NameNode must re-replicate all under-replicated blocks. Rather than
>> suddenly remove half the nodes, you might want to take a few nodes offline
>> at a time. Hadoop should be able to handle rescheduling tasks on nodes no
>> longer available (even without speculative execution. Speculative execution
>> is for something else).
>>
>> HTH
>> Ravi
>>
>>
>>   On Wednesday, October 23, 2013 10:26 PM, Nan Zhu <zh...@gmail.com>
>> wrote:
>>   Hi, all
>>
>> I’m running a Hadoop cluster on AWS EC2,
>>
>> I would like to dynamically resizing the cluster so as to reduce the
>> cost, is there any solution to achieve this?
>>
>> E.g. I would like to cut the cluster size with a half, is it safe to just
>> shutdown the instances (if some tasks are just running on them, can I rely
>> on the speculative execution to re-run them on the other nodes?)
>>
>> I cannot use EMR, since I’m running a customized version of Hadoop
>>
>> Best,
>>
>> --
>> Nan Zhu
>> School of Computer Science,
>> McGill University
>>
>>
>>
>>
>>
>>
>>
>> --
>> Nan Zhu
>> School of Computer Science,
>> McGill University
>> E-Mail: zhunanmcgill@gmail.com <zh...@gmail.com>
>>
>>
>>
>
>
> --
> Nan Zhu
> School of Computer Science,
> McGill University
> E-Mail: zhunanmcgill@gmail.com <zh...@gmail.com>
>

Re: dynamically resizing the Hadoop cluster?

Posted by Nan Zhu <zh...@gmail.com>.
Good explanation,

Thank you, Ravi

Best,


On Thu, Oct 24, 2013 at 4:51 PM, Ravi Prakash <ra...@ymail.com> wrote:

> Hi Nan!
>
> If the task trackers stop heartbeating back to the JobTracker, the
> JobTracker will mark them as dead and reschedule the tasks which were
> running on that TaskTracker. Admittedly there is some delay between when
> the TaskTrackers stop heartbeating back and when the JobTracker marks them
> dead. This is controlled by the mapred.tasktracker.expiry.interval
> parameter (I'm assuming you are using Hadoop 1.x).
>
> HTH
> Ravi
>
>
>
>
>
>
>   On Thursday, October 24, 2013 1:21 PM, Nan Zhu <zh...@gmail.com>
> wrote:
>  Hi, Ravi,
>
> Thank you for the reply
>
> Actually I'm not running HDFS on EC2, instead I use S3 to store data
>
> I'm curious about that, if some nodes are decommissioned, the JobTracker
> will deal those tasks which originally ran on them as "too slow" (since no
> progress for a long time) so to run speculative execution OR it directly
> treats them as "belonging to a running Job and ran on a dead TaskTracker"?
>
> Best,
>
> Nan
>
>
>
>
>
>
> On Thu, Oct 24, 2013 at 2:04 PM, Ravi Prakash <ra...@ymail.com> wrote:
>
> Hi Nan!
>
> Usually nodes are decommissioned slowly over some period of time so as not
> to disrupt the running jobs. When a node is decommissioned, the NameNode
> must re-replicate all under-replicated blocks. Rather than suddenly remove
> half the nodes, you might want to take a few nodes offline at a time.
> Hadoop should be able to handle rescheduling tasks on nodes no longer
> available (even without speculative execution. Speculative execution is for
> something else).
>
> HTH
> Ravi
>
>
>   On Wednesday, October 23, 2013 10:26 PM, Nan Zhu <zh...@gmail.com>
> wrote:
>   Hi, all
>
> I’m running a Hadoop cluster on AWS EC2,
>
> I would like to dynamically resizing the cluster so as to reduce the cost,
> is there any solution to achieve this?
>
> E.g. I would like to cut the cluster size with a half, is it safe to just
> shutdown the instances (if some tasks are just running on them, can I rely
> on the speculative execution to re-run them on the other nodes?)
>
> I cannot use EMR, since I’m running a customized version of Hadoop
>
> Best,
>
> --
> Nan Zhu
> School of Computer Science,
> McGill University
>
>
>
>
>
>
>
> --
> Nan Zhu
> School of Computer Science,
> McGill University
> E-Mail: zhunanmcgill@gmail.com <zh...@gmail.com>
>
>
>


-- 
Nan Zhu
School of Computer Science,
McGill University
E-Mail: zhunanmcgill@gmail.com <zh...@gmail.com>

Re: dynamically resizing the Hadoop cluster?

Posted by Ravi Prakash <ra...@ymail.com>.
Hi Nan!

If the TaskTrackers stop heartbeating to the JobTracker, the JobTracker will mark them as dead and reschedule the tasks that were running on them. Admittedly, there is some delay between when the TaskTrackers stop heartbeating and when the JobTracker marks them dead; this is controlled by the mapred.tasktracker.expiry.interval parameter (I'm assuming you are using Hadoop 1.x).
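
For reference, a minimal mapred-site.xml sketch (Hadoop 1.x; the value is
in milliseconds, and the stock default is 600000, i.e. 10 minutes; it is
shown lowered here purely as an illustration):

  <!-- how long the JobTracker waits after a TaskTracker's last
       heartbeat before declaring it lost and rescheduling its tasks -->
  <property>
    <name>mapred.tasktracker.expiry.interval</name>
    <value>120000</value>
  </property>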

HTH
Ravi

On Thursday, October 24, 2013 1:21 PM, Nan Zhu <zh...@gmail.com> wrote:
 
Hi, Ravi,

Thank you for the reply

Actually I'm not running HDFS on EC2, instead I use S3 to store data

I'm curious about that, if some nodes are decommissioned, the JobTracker will deal those tasks which originally ran on them as "too slow" (since no progress for a long time) so to run speculative execution OR it directly treats them as "belonging to a running Job and ran on a dead TaskTracker"?

Best,

Nan

On Thu, Oct 24, 2013 at 2:04 PM, Ravi Prakash <ra...@ymail.com> wrote:

Hi Nan!
>
>
>Usually nodes are decommissioned slowly over some period of time so as not to disrupt the running jobs. When a node is decommissioned, the NameNode must re-replicate all under-replicated blocks. Rather than suddenly remove half the nodes, you might want to take a few nodes offline at a time. Hadoop should be able to handle rescheduling tasks on nodes no longer available (even without speculative execution. Speculative execution is for something else). 
>
>
>
>HTH
>Ravi
>
>
>
>
>On Wednesday, October 23, 2013 10:26 PM, Nan Zhu <zh...@gmail.com> wrote:
> 
>Hi, all
>
>
>I’m running a Hadoop cluster on AWS EC2, 
>
>
>I would like to dynamically resizing the cluster so as to reduce the cost, is there any solution to achieve this? 
>
>
>E.g. I would like to cut the cluster size with a half, is it safe to just shutdown the instances (if some tasks are just running on them, can I rely on the speculative execution to re-run them on the other nodes?)
>
>
>I cannot use EMR, since I’m running a customized version of Hadoop 
>
>
>Best,
>
>
>-- 
>Nan Zhu
>School of Computer Science,
>McGill University
>
>
>
>
>
>


-- 

Nan Zhu
School of Computer Science, 
McGill University
E-Mail: zhunanmcgill@gmail.com

Re: dynamically resizing the Hadoop cluster?

Posted by Nan Zhu <zh...@gmail.com>.
Hi, Ravi,

Thank you for the reply

Actually, I'm not running HDFS on EC2; instead I use S3 to store the data.

I'm curious about one thing: if some nodes are decommissioned, does the
JobTracker treat the tasks that were running on them as "too slow" (since
they make no progress for a long time) and trigger speculative execution,
OR does it directly treat them as "belonging to a running job but run on a
dead TaskTracker"?

Best,

Nan

On Thu, Oct 24, 2013 at 2:04 PM, Ravi Prakash <ra...@ymail.com> wrote:

> Hi Nan!
>
> Usually nodes are decommissioned slowly over some period of time so as not
> to disrupt the running jobs. When a node is decommissioned, the NameNode
> must re-replicate all under-replicated blocks. Rather than suddenly remove
> half the nodes, you might want to take a few nodes offline at a time.
> Hadoop should be able to handle rescheduling tasks on nodes no longer
> available (even without speculative execution. Speculative execution is for
> something else).
>
> HTH
> Ravi
>
>
>   On Wednesday, October 23, 2013 10:26 PM, Nan Zhu <zh...@gmail.com>
> wrote:
>   Hi, all
>
> I’m running a Hadoop cluster on AWS EC2,
>
> I would like to dynamically resizing the cluster so as to reduce the cost,
> is there any solution to achieve this?
>
> E.g. I would like to cut the cluster size with a half, is it safe to just
> shutdown the instances (if some tasks are just running on them, can I rely
> on the speculative execution to re-run them on the other nodes?)
>
> I cannot use EMR, since I’m running a customized version of Hadoop
>
> Best,
>
> --
> Nan Zhu
> School of Computer Science,
> McGill University
>
>
>
>
>


-- 
Nan Zhu
School of Computer Science,
McGill University
E-Mail: zhunanmcgill@gmail.com <zh...@gmail.com>

Re: dynamically resizing the Hadoop cluster?

Posted by Ravi Prakash <ra...@ymail.com>.
Hi Nan!

Usually nodes are decommissioned slowly, over some period of time, so as not to disrupt the running jobs. When a node is decommissioned, the NameNode must re-replicate all of its now under-replicated blocks. Rather than suddenly removing half the nodes, you might want to take a few nodes offline at a time. Hadoop should be able to handle rescheduling tasks from nodes that are no longer available (even without speculative execution; speculative execution is for something else).
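
For the record, a graceful decommission in Hadoop 1.x is driven by exclude
files plus a refresh. A minimal sketch using the stock 1.x property and
command names (the file paths are placeholders; and since you keep your
data in S3 rather than HDFS, the mradmin half is the part that matters
for you):

  <!-- hdfs-site.xml / mapred-site.xml: point Hadoop at exclude files -->
  <property>
    <name>dfs.hosts.exclude</name>
    <value>/etc/hadoop/conf/dfs.exclude</value>
  </property>
  <property>
    <name>mapred.hosts.exclude</name>
    <value>/etc/hadoop/conf/mapred.exclude</value>
  </property>

  # add the hostnames you want to retire to both exclude files, then:
  hadoop dfsadmin -refreshNodes   # NameNode begins decommissioning DataNodes
  hadoop mradmin -refreshNodes    # JobTracker stops scheduling on those TTs
  # watch the NameNode web UI until the nodes show "Decommissioned";
  # only then is it safe to terminate the EC2 instances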


HTH
Ravi




On Wednesday, October 23, 2013 10:26 PM, Nan Zhu <zh...@gmail.com> wrote:
 
Hi, all

I’m running a Hadoop cluster on AWS EC2, 

I would like to dynamically resizing the cluster so as to reduce the cost, is there any solution to achieve this? 

E.g. I would like to cut the cluster size with a half, is it safe to just shutdown the instances (if some tasks are just running on them, can I rely on the speculative execution to re-run them on the other nodes?)

I cannot use EMR, since I’m running a customized version of Hadoop 

Best,

-- 
Nan Zhu
School of Computer Science,
McGill University
