Posted to common-user@hadoop.apache.org by Da Zheng <zh...@gmail.com> on 2010/11/28 08:40:13 UTC
delay the execution of reducers
Hello,
I found that in Hadoop, reducers start when a fraction of the mappers
is complete. In my case, however, I would like reducers to start only when all
mappers are complete. I searched the Hadoop configuration parameters and found
mapred.reduce.slowstart.completed.maps, which seems to do what I want. But no
matter what value (0.99, 1.00, etc.) I set for
mapred.reduce.slowstart.completed.maps, reducers always start executing when
about 10% of the mappers are complete.
Did I set the right parameter? Is there another parameter I can use for this
purpose?
Thanks,
Da
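[Editor's note] The scheduling rule being described can be captured in a small sketch. This is only an illustration of the behavior the thread discusses, not Hadoop's actual JobTracker code; the default of 0.05 is taken from a later reply in the thread.

```python
# Illustrative model of the behavior described above: reduce tasks are
# scheduled once the fraction of completed map tasks reaches the
# slowstart threshold. This is NOT Hadoop's actual scheduler code.

def reduces_can_start(completed_maps, total_maps, slowstart=0.05):
    """Return True once completed_maps / total_maps >= slowstart.

    `slowstart` plays the role of mapred.reduce.slowstart.completed.maps;
    0.05 (5% of maps complete) matches the value reported later in the
    thread as the effective default.
    """
    if total_maps == 0:
        return True  # nothing to wait for
    return completed_maps / total_maps >= slowstart

# With slowstart=1.0, reducers wait until every map task has finished.
```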
Re: delay the execution of reducers
Posted by Hemanth Yamijala <yh...@gmail.com>.
Hi,
> Changing the parameter for a specific job works better for me.
>
> But I was asking, in general, in which configuration file(s) I should change
> the value of a parameter.
> For parameters in hdfs-site.xml, I should change the configuration file on
> each machine. But for parameters in mapred-site.xml, it seems enough to
> change the configuration file on the machine where the job is launched.
Ideally, if you know which processes need to read a configuration
value, you can set it in the configuration files on the nodes running
those processes. For instance, if you know a parameter is only required
by the NameNode, you can set it in hdfs-site.xml on the NameNode,
and so on. If in doubt, though, it almost always helps to set the same
value in the configuration files on all nodes.
Thanks
Hemanth
> Thanks,
> Da
>
> On 11/29/2010 01:31 PM, Arun C Murthy wrote:
>>
>> Just set it for your job.
>>
>> In your launching program do something like:
>>
>> jobConf.setFloat("mapred.reduce.slowstart.completed.maps", 0.5);
>>
>> On Nov 29, 2010, at 9:46 AM, Da Zheng wrote:
>>
>>> On 11/29/2010 05:42 AM, Chandraprakash Bhagtani wrote:
>>>>
>>>> you can see whether your property is in effect by looking at the
>>>> following
>>>> URL
>>>> http://<jobtracker-host>:50030/jobconf.jsp?jobid=<job-id>
>>>>
>>>> replace <jobtracker-host> with your JobTracker host and <job-id> with the
>>>> ID of the running job
>>>>
>>>> have you restarted mapreduce after changing mapred-site.xml?
>>>>
>>> It shows me the value is still 0.05. I am a little confused. Since
>>> hadoop in each machine has configuration files, which configuration
>>> files should I change? For mapred-site.xml, I only need to change the
>>> one in the master node? (I always start my MapReduce program from the
>>> master node). What about other configuration files such as core-site.xml
>>> and hdfs-site.xml? I guess I have to change them on all machines in the
>>> cluster.
>>>
>>> Thanks,
>>> Da
>>
>
>
Re: delay the execution of reducers
Posted by Da Zheng <zh...@gmail.com>.
Changing the parameter for a specific job works better for me.
But I was asking, in general, in which configuration file(s) I should
change the value of a parameter.
For parameters in hdfs-site.xml, I should change the configuration file
on each machine. But for parameters in mapred-site.xml, it seems enough
to change the configuration file on the machine where the job is launched.
Thanks,
Da
On 11/29/2010 01:31 PM, Arun C Murthy wrote:
> Just set it for your job.
>
> In your launching program do something like:
>
> jobConf.setFloat("mapred.reduce.slowstart.completed.maps", 0.5);
>
> On Nov 29, 2010, at 9:46 AM, Da Zheng wrote:
>
>> On 11/29/2010 05:42 AM, Chandraprakash Bhagtani wrote:
>>> you can see whether your property is in effect by looking at the
>>> following
>>> URL
>>> http://<jobtracker-host>:50030/jobconf.jsp?jobid=<job-id>
>>>
>>> replace <jobtracker-host> with your JobTracker host and <job-id> with the
>>> ID of the running job
>>>
>>> have you restarted mapreduce after changing mapred-site.xml?
>>>
>> It shows me the value is still 0.05. I am a little confused. Since
>> hadoop in each machine has configuration files, which configuration
>> files should I change? For mapred-site.xml, I only need to change the
>> one in the master node? (I always start my MapReduce program from the
>> master node). What about other configuration files such as core-site.xml
>> and hdfs-site.xml? I guess I have to change them on all machines in the
>> cluster.
>>
>> Thanks,
>> Da
>
Re: delay the execution of reducers
Posted by Arun C Murthy <ac...@yahoo-inc.com>.
Just set it for your job.
In your launching program do something like:
jobConf.setFloat("mapred.reduce.slowstart.completed.maps", 0.5);
On Nov 29, 2010, at 9:46 AM, Da Zheng wrote:
> On 11/29/2010 05:42 AM, Chandraprakash Bhagtani wrote:
>> you can see whether your property is in effect by looking at the
>> following
>> URL
>> http://<jobtracker-host>:50030/jobconf.jsp?jobid=<job-id>
>>
>> replace <jobtracker-host> with your JobTracker host and <job-id> with
>> the ID of the running job
>>
>> have you restarted mapreduce after changing mapred-site.xml?
>>
> It shows me the value is still 0.05. I am a little confused. Since
> hadoop in each machine has configuration files, which configuration
> files should I change? For mapred-site.xml, I only need to change the
> one in the master node? (I always start my MapReduce program from the
> master node). What about other configuration files such as core-
> site.xml
> and hdfs-site.xml? I guess I have to change them on all machines in
> the
> cluster.
>
> Thanks,
> Da
Re: delay the execution of reducers
Posted by Da Zheng <zh...@gmail.com>.
On 11/29/2010 05:42 AM, Chandraprakash Bhagtani wrote:
> you can see whether your property is in effect by looking at the following
> URL
> http://<jobtracker-host>:50030/jobconf.jsp?jobid=<job-id>
>
> replace <jobtracker-host> with your JobTracker host and <job-id> with the
> ID of the running job
>
> have you restarted mapreduce after changing mapred-site.xml?
>
It shows me the value is still 0.05. I am a little confused: since
Hadoop on each machine has its own configuration files, which
files should I change? For mapred-site.xml, do I only need to change the
one on the master node? (I always launch my MapReduce program from the
master node.) What about other configuration files such as core-site.xml
and hdfs-site.xml? I guess I have to change them on all machines in the
cluster.
Thanks,
Da
Re: delay the execution of reducers
Posted by Chandraprakash Bhagtani <cp...@gmail.com>.
You can see whether your property is in effect by looking at the following
URL:
http://<jobtracker-host>:50030/jobconf.jsp?jobid=<job-id>
Replace <jobtracker-host> with your JobTracker host and <job-id> with the
ID of the running job.
Have you restarted MapReduce after changing mapred-site.xml?
On Mon, Nov 29, 2010 at 6:56 AM, li ping <li...@gmail.com> wrote:
> org.apache.hadoop.mapred.JobInProgress
>
> Maybe you find this class.
>
> On Mon, Nov 29, 2010 at 4:36 AM, Da Zheng <zh...@gmail.com> wrote:
>
> > I have a problem with subscribing mapreduce mailing list.
> >
> > I use hadoop-0.20.2. I have added this parameter to mapred-site.xml. Is
> > there any way for me to check whether the parameter has been read and
> > activated?
> >
> > BTW, what do you mean by opening a jira?
> >
> > Thanks,
> > Da
> >
> >
> > On 11/28/2010 05:03 AM, Arun C Murthy wrote:
> >
> >> Moving to mapreduce-user@, bcc common-user@. Please use project
> >> specific lists.
> >>
> >> mapreduce.reduce.slowstart.completed.maps is the right knob. Which
> version
> >> of hadoop are you running? If it isn't working, please open a jira.
> Thanks.
> >>
> >> Arun
> >>
> >> On Nov 27, 2010, at 11:40 PM, Da Zheng wrote:
> >>
> >> Hello,
> >>>
> >>> I found in Hadoop that reducers starts when a fraction of the number of
> >>> mappers
> >>> is complete. However, in my case, I hope reducers to start only when
> all
> >>> mappers
> >>> are complete. I searched for Hadoop configuration parameters, and found
> >>> mapred.reduce.slowstart.completed.maps, which seems to do what I want.
> >>> But no
> >>> matter what value (0.99, 1.00, etc) I set to
> >>> mapred.reduce.slowstart.completed.maps, reducers always start to
> execute
> >>> when
> >>> about 10% of mappers are complete.
> >>>
> >>> Do I set the right parameter? Is there any other parameter I can use
> for
> >>> this
> >>> purpose?
> >>>
> >>> Thanks,
> >>> Da
> >>>
> >>
> >>
> >
>
>
> --
> -----李平
>
--
Thanks & Regards,
Chandra Prakash Bhagtani,
Nokia India Pvt. Ltd.
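[Editor's note] The manual check above can be scripted. A minimal sketch follows; the table layout it assumes (property name and value in adjacent <td> cells) is an approximation of what jobconf.jsp renders, not a supported API, so treat it as illustrative.

```python
# Pull one property's value out of jobconf.jsp-style HTML.
# The assumed markup (<td>name</td><td>value</td>) is a guess at the
# page layout, not a documented format.
import re

def extract_property(jobconf_html, name):
    """Return the value of `name` from jobconf.jsp-style HTML, or None."""
    pattern = re.compile(
        r"<td>\s*{}\s*</td>\s*<td>\s*([^<]+?)\s*</td>".format(re.escape(name)))
    match = pattern.search(jobconf_html)
    return match.group(1) if match else None
```

Against a live cluster, you would fetch the jobconf.jsp URL above (for example with urllib) and pass the response body to extract_property.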
Re: delay the execution of reducers
Posted by li ping <li...@gmail.com>.
org.apache.hadoop.mapred.JobInProgress
Maybe you will find this class helpful.
On Mon, Nov 29, 2010 at 4:36 AM, Da Zheng <zh...@gmail.com> wrote:
> I have a problem with subscribing mapreduce mailing list.
>
> I use hadoop-0.20.2. I have added this parameter to mapred-site.xml. Is
> there any way for me to check whether the parameter has been read and
> activated?
>
> BTW, what do you mean by opening a jira?
>
> Thanks,
> Da
>
>
> On 11/28/2010 05:03 AM, Arun C Murthy wrote:
>
>> Moving to mapreduce-user@, bcc common-user@. Please use project
>> specific lists.
>>
>> mapreduce.reduce.slowstart.completed.maps is the right knob. Which version
>> of hadoop are you running? If it isn't working, please open a jira. Thanks.
>>
>> Arun
>>
>> On Nov 27, 2010, at 11:40 PM, Da Zheng wrote:
>>
>> Hello,
>>>
>>> I found in Hadoop that reducers starts when a fraction of the number of
>>> mappers
>>> is complete. However, in my case, I hope reducers to start only when all
>>> mappers
>>> are complete. I searched for Hadoop configuration parameters, and found
>>> mapred.reduce.slowstart.completed.maps, which seems to do what I want.
>>> But no
>>> matter what value (0.99, 1.00, etc) I set to
>>> mapred.reduce.slowstart.completed.maps, reducers always start to execute
>>> when
>>> about 10% of mappers are complete.
>>>
>>> Do I set the right parameter? Is there any other parameter I can use for
>>> this
>>> purpose?
>>>
>>> Thanks,
>>> Da
>>>
>>
>>
>
--
-----李平
Re: delay the execution of reducers
Posted by Da Zheng <zh...@gmail.com>.
I have a problem with subscribing to the mapreduce mailing list.
I use hadoop-0.20.2 and have added this parameter to mapred-site.xml. Is
there any way for me to check whether the parameter has been read and is
in effect?
BTW, what do you mean by opening a jira?
Thanks,
Da
On 11/28/2010 05:03 AM, Arun C Murthy wrote:
> Moving to mapreduce-user@, bcc common-user@. Please use project
> specific lists.
>
> mapreduce.reduce.slowstart.completed.maps is the right knob. Which
> version of hadoop are you running? If it isn't working, please open a
> jira. Thanks.
>
> Arun
>
> On Nov 27, 2010, at 11:40 PM, Da Zheng wrote:
>
>> Hello,
>>
>> I found in Hadoop that reducers starts when a fraction of the number
>> of mappers
>> is complete. However, in my case, I hope reducers to start only when
>> all mappers
>> are complete. I searched for Hadoop configuration parameters, and found
>> mapred.reduce.slowstart.completed.maps, which seems to do what I
>> want. But no
>> matter what value (0.99, 1.00, etc) I set to
>> mapred.reduce.slowstart.completed.maps, reducers always start to
>> execute when
>> about 10% of mappers are complete.
>>
>> Do I set the right parameter? Is there any other parameter I can use
>> for this
>> purpose?
>>
>> Thanks,
>> Da
>
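[Editor's note] For anyone reproducing this, setting the value cluster-wide means an entry like the following in mapred-site.xml, using the 0.20-era property name discussed in this thread (newer Hadoop releases use a renamed mapreduce.* key for the same knob):

```xml
<!-- mapred-site.xml: delay reduce launch until all maps have finished -->
<property>
  <name>mapred.reduce.slowstart.completed.maps</name>
  <value>1.00</value>
</property>
```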
Re: delay the execution of reducers
Posted by Arun C Murthy <ac...@yahoo-inc.com>.
Moving to mapreduce-user@, bcc common-user@. Please use project
specific lists.
mapreduce.reduce.slowstart.completed.maps is the right knob. Which
version of hadoop are you running? If it isn't working, please open a
jira. Thanks.
Arun
On Nov 27, 2010, at 11:40 PM, Da Zheng wrote:
> Hello,
>
> I found that in Hadoop, reducers start when a fraction of the mappers
> is complete. In my case, however, I would like reducers to start only
> when all mappers are complete. I searched the Hadoop configuration
> parameters and found
> mapred.reduce.slowstart.completed.maps, which seems to do what I want.
> But no matter what value (0.99, 1.00, etc.) I set for
> mapred.reduce.slowstart.completed.maps, reducers always start
> executing when about 10% of the mappers are complete.
>
> Did I set the right parameter? Is there another parameter I can use
> for this purpose?
>
> Thanks,
> Da