Posted to mapreduce-user@hadoop.apache.org by Sundeep Kambhampati <ka...@cse.ohio-state.edu> on 2013/07/18 19:02:50 UTC
Fault tolerance and Speculative Execution
Hi all,
Is it true that Hadoop 'always' starts the same map task multiple times in
order to be fault tolerant? That is, the same task is launched on several
machines so that even if a node fails, the task is still available
on another node, and if no node fails, the redundant task that finishes
late is killed. If that is true, how can I change the configuration so that
Hadoop does or does not do it?
Speculative execution, on the other hand, does what I explained above
(redundant map tasks), but only after all the map tasks are scheduled:
if some nodes are free, it starts redundant map tasks for those that are
running slow. Is that always true? How do I change the configuration to
enable or disable it?
I am using Hadoop-1.1.2, in case the version matters.
I would really appreciate it if someone could help me with this. Thank you.
Regards
Sundeep
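Editor's sketch: the behaviour described in the second paragraph of the question (start a backup attempt only once the original looks slow, keep whichever attempt finishes first, kill the other) can be illustrated with plain Java threads. This is a toy model and not Hadoop code. The class name, method names, and the fixed "patience" timeout are all hypothetical; the timeout simply stands in for whatever slowness heuristic the framework actually applies.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

public class SpeculativeSketch {

    // One attempt at a task: sleeps for the given time, then reports its own name.
    // The sleep is interrupted if the attempt is killed.
    static Callable<String> attempt(String name, long millis) {
        return () -> {
            Thread.sleep(millis);
            return name;
        };
    }

    // Runs the original attempt; if it has not finished within patienceMillis
    // (hypothetical stand-in for a "this task is slow" decision), launches a
    // backup attempt and returns the name of whichever attempt finishes first.
    static String run(long originalMillis, long patienceMillis, long backupMillis)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        CompletionService<String> done = new ExecutorCompletionService<>(pool);
        try {
            done.submit(attempt("original", originalMillis));
            Future<String> first = done.poll(patienceMillis, TimeUnit.MILLISECONDS);
            if (first == null) {
                // The original looks slow: start a redundant attempt and race them.
                done.submit(attempt("backup", backupMillis));
                first = done.take();
            }
            return first.get();
        } finally {
            // Kill the attempt that lost the race (no-op if nothing is running).
            pool.shutdownNow();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(run(2000, 100, 100)); // slow original, so the backup wins
        System.out.println(run(50, 500, 100));   // fast original, no backup is launched
    }
}
```

Note that no backup is started when the original attempt is healthy, which is the difference from the "always launch every task several times" behaviour asked about in the first paragraph.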
Re: Fault tolerance and Speculative Execution
Posted by Harsh J <ha...@cloudera.com>.
The finer grained controls are available in 2.x MR2 today. Pretty sure
they weren't deprecated/removed, but may have been replaced by better
names.
On Thu, Jul 18, 2013 at 11:07 PM, German Florez-Larrahondo
<ge...@samsung.com> wrote:
> Also, a simple explanation of how speculative execution works and what are
> the key settings can be found here:
> http://books.google.com/books?id=Wu_xeGdU4G8C&pg=PA216&dq=hadoop+the+definitive+guide+speculative+execution&hl=en&sa=X&ei=lyLoUd7bDojk9gTC1YDIBA&ved=0CDwQ6AEwAA
>
> In addition, there used to be other parameters (slownodethreshold,
> slowtaskthreshold & speculativecap)
> http://coffee2idea.blogspot.com/2011/11/hadoop-speculative-execution.html
> but I believe they were deprecated...
>
> Regards
> German
> ./g
>
>
> -----Original Message-----
> From: Harsh J [mailto:harsh@cloudera.com]
> Sent: Thursday, July 18, 2013 12:11 PM
> To: <us...@hadoop.apache.org>
> Subject: Re: Fault tolerance and Speculative Execution
>
> What you describe in the first paragraph is not true.
>
> Speculative execution API toggles are listed in the documentation:
> http://hadoop.apache.org/docs/stable/mapred_tutorial.html#Job+Configuration
> and in the mapred-default page in property form:
> http://hadoop.apache.org/docs/stable/mapred-default.html. Speculative
> execution is enabled by default.
>
> On Thu, Jul 18, 2013 at 10:32 PM, Sundeep Kambhampati
> <ka...@cse.ohio-state.edu> wrote:
>> Hi all,
>> Is it true that Hadoop 'always' starts same map tasks multiple times
>> in order to be fault tolerant. i.e. same task is launched on several
>> machines so that even if a node fails then same task would be
>> available on other node. And in case no node fails redundant task that
>> finishes late is killed.
>> If it is true how can I change that configuration for Hadoop to do it
>> or not do it.
>>
>> Speculative execution on the other hand does what I explained above
>> (redundant map tasks) but only after all the map tasks are scheduled
>> and if some nodes are free it starts redundant map tasks for those
>> which are running slow. Is it always true? How do change this
>> configuration enable/disable.
>>
>> I am using Hadoop-1.1.2 incase version matters.
>>
>> I really appreciate if someone could help me with this. Thank you.
>>
>> Regards
>> Sundeep
>>
>>
>
>
>
> --
> Harsh J
>
--
Harsh J
RE: Fault tolerance and Speculative Execution
Posted by German Florez-Larrahondo <ge...@samsung.com>.
Also, a simple explanation of how speculative execution works and what the
key settings are can be found here:
http://books.google.com/books?id=Wu_xeGdU4G8C&pg=PA216&dq=hadoop+the+definitive+guide+speculative+execution&hl=en&sa=X&ei=lyLoUd7bDojk9gTC1YDIBA&ved=0CDwQ6AEwAA
In addition, there used to be other parameters (slownodethreshold,
slowtaskthreshold & speculativecap)
http://coffee2idea.blogspot.com/2011/11/hadoop-speculative-execution.html
but I believe they were deprecated...
Regards
German
./g
-----Original Message-----
From: Harsh J [mailto:harsh@cloudera.com]
Sent: Thursday, July 18, 2013 12:11 PM
To: <us...@hadoop.apache.org>
Subject: Re: Fault tolerance and Speculative Execution
What you describe in the first paragraph is not true.
Speculative execution API toggles are listed in the documentation:
http://hadoop.apache.org/docs/stable/mapred_tutorial.html#Job+Configuration
and in the mapred-default page in property form:
http://hadoop.apache.org/docs/stable/mapred-default.html. Speculative
execution is enabled by default.
On Thu, Jul 18, 2013 at 10:32 PM, Sundeep Kambhampati
<ka...@cse.ohio-state.edu> wrote:
> Hi all,
> Is it true that Hadoop 'always' starts same map tasks multiple times
> in order to be fault tolerant. i.e. same task is launched on several
> machines so that even if a node fails then same task would be
> available on other node. And in case no node fails redundant task that
> finishes late is killed.
> If it is true how can I change that configuration for Hadoop to do it
> or not do it.
>
> Speculative execution on the other hand does what I explained above
> (redundant map tasks) but only after all the map tasks are scheduled
> and if some nodes are free it starts redundant map tasks for those
> which are running slow. Is it always true? How do change this
> configuration enable/disable.
>
> I am using Hadoop-1.1.2 incase version matters.
>
> I really appreciate if someone could help me with this. Thank you.
>
> Regards
> Sundeep
>
>
--
Harsh J
Re: Fault tolerance and Speculative Execution
Posted by Harsh J <ha...@cloudera.com>.
What you describe in the first paragraph is not true.
Speculative execution API toggles are listed in the documentation:
http://hadoop.apache.org/docs/stable/mapred_tutorial.html#Job+Configuration
and in the mapred-default page in property form:
http://hadoop.apache.org/docs/stable/mapred-default.html. Speculative
execution is enabled by default.
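Editor's sketch: in property form, the toggles referred to above would look roughly like the fragment below for a Hadoop 1.x release such as 1.1.2. The property names are the 1.x ones; 2.x releases use mapreduce.map.speculative and mapreduce.reduce.speculative instead, so check the mapred-default page for the release in use. The value false is only an example showing how to switch the feature off, since the default is true.

```xml
<!-- Example only: disable speculative execution for map and reduce tasks.
     Place in mapred-site.xml or set on the job's configuration (Hadoop 1.x names). -->
<property>
  <name>mapred.map.tasks.speculative.execution</name>
  <value>false</value>
</property>
<property>
  <name>mapred.reduce.tasks.speculative.execution</name>
  <value>false</value>
</property>
```

The same switches are exposed on the old-API job object as JobConf.setMapSpeculativeExecution(boolean) and JobConf.setReduceSpeculativeExecution(boolean).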
On Thu, Jul 18, 2013 at 10:32 PM, Sundeep Kambhampati
<ka...@cse.ohio-state.edu> wrote:
> Hi all,
> Is it true that Hadoop 'always' starts same map tasks multiple times in
> order to be fault tolerant. i.e. same task is launched on several machines
> so that even if a node fails then same task would be available on other
> node. And in case no node fails redundant task that finishes late is killed.
> If it is true how can I change that configuration for Hadoop to do it or not
> do it.
>
> Speculative execution on the other hand does what I explained above
> (redundant map tasks) but only after all the map tasks are scheduled and if
> some nodes are free it starts redundant map tasks for those which are
> running slow. Is it always true? How do change this configuration
> enable/disable.
>
> I am using Hadoop-1.1.2 incase version matters.
>
> I really appreciate if someone could help me with this. Thank you.
>
> Regards
> Sundeep
>
>
--
Harsh J