Posted to user@mrunit.apache.org by Dipesh Khakhkhar <di...@gmail.com> on 2012/11/28 23:59:13 UTC

How to set number of reduce tasks in MRUnit's mocked context object

I'm calling the getNumReduceTasks() method in my reducer. In my MRUnit test
the framework creates a mock context object, but when I call
getNumReduceTasks() on it, it returns 0.

I have tried setting the value on the Configuration object too, but it didn't help.

How can I make getNumReduceTasks() return a desired number here?

Thanks.

Re: How to set number of reduce tasks in MRUnit's mocked context object

Posted by Dave Beech <da...@paraliatech.com>.
Ah sorry, it must just be on the trunk and not yet released. My mistake. Watch this space for a new version soon.

For now you could build a snapshot from the source, or, if you want the reduce task method to be mocked internally, feel free to create a JIRA.

cheers
Dave

On 29 Nov 2012, at 22:52, Dipesh Khakhkhar <di...@gmail.com> wrote:

> @Tom 
> 
> Thanks for replying. You have got my use case correctly - I am using this in my reducer.
> 
> I do set number of reducer in my Job and there I'm using abstraction i.e. setNumReduceTasks
> 
>  /**
>    * Set the number of reduce tasks for the job.
>    * @param tasks the number of reduce tasks
>    * @throws IllegalStateException if the job is submitted
>    */
>   public void setNumReduceTasks(int tasks) throws IllegalStateException {
>     ensureState(JobState.DEFINE);
>     conf.setNumReduceTasks(tasks);
>   }
> 
> I'm thinking better not to change my code there as it is working fine and does not depend on any changes in the configuration name (in future hadoop releases). But the value which I'm setting there, I'm able get it in my reducer so I'm changing reducer to use that directly instead of context.getNumReduceTasks(). This resulted in running my unit test correctly.
> 
> Thanks for the suggestion - I only tried to set configuration using both 
> 
> reduceDriver.getConfiguration().set("mapreduce.job.reduces", "10"); 
> reduceDriver.getConfiguration().setInt("mapreduce.job.reduces", 10); 
> 
> Tried the following as well - 
> 
> reduceDriver.getConfiguration().set("mapred.reduce.tasks", "10"); 
> reduceDriver.getConfiguration().setInt("mapred.reduce.tasks", 10); 
> 
> And kept my reducer code same to retrieve it i.e. used context.getNumReduceTasks() and still it gave me 0.
> 
> When I see the actual code path for the above code in Hadoop_1.0.3, it leads to JobConf class and there I see the following definition
> 
> /**
>    * Get configured the number of reduce tasks for this job. Defaults to 
>    * <code>1</code>.
>    * 
>    * @return the number of reduce tasks for this job.
>    */
>   public int getNumReduceTasks() { return getInt("mapred.reduce.tasks", 1); }
> 
> The reason it didn't actually return 1 - ideally it should though because it is being invoked using a mocked context object (Reducer.Context -- derived form Context). Correct?
> 
> So i believe that even mocked context object should return 1 as per the contract of the above call. 
> 
> Please inform me if you would like to file a bug for this.
> 
> @Dave
> 
> Thanks for replying. There is no getContext method for MapDriver class otherwise this would have been much simpler to mock.
> 
> Thanks.
> 
> 
> 
> 
> On Thu, Nov 29, 2012 at 4:16 AM, Dave Beech <da...@paraliatech.com> wrote:
> Hi Dipesh
> 
> The Context in a mrunit test is actually a mock object (created with Mockito). Only some of the methods are set-up internally to provide return values, and getNumReduceTasks isn't one of them. But, you can set this up yourself in test code. 
> 
> e.g.
> Mockito.when(mapDriver.getContext().getNumReduceTasks()).thenReturn(10);
> 
> Cheers,
> Dave
> 
> 
> On 29 November 2012 03:10, Tom Wheeler <tw...@cloudera.com> wrote:
> Hi Dipesh,
> 
> OK, I think I understand what you're saying. I am going to restate it
> just so you'll be sure I've got it.
> 
> Your mapper (or reducer) is trying to check the return value of the
> context.getNumReduceTasks() method, but it's returning 0 in all cases.
>  Although this wouldn't be an issue for most unit tests, your mapper
> is doing some computation on this value so you need MRUnit to return
> something other than 0 so you can test your code.  Does that sound
> right?
> 
> If so, I cannot say offhand whether what you're seeing is a bug or a
> feature that just hasn't been implemented yet.  I think I can offer a
> workaround for you try, though it may be kind of a hack.
> 
> Whenever you call methods like setNumReduceTasks, it's really just a
> convenient way of setting a property that Hadoop interprets.
> According to the Hadoop Streaming guide [1], the corresponding
> property here ought to be num.reduce.tasks.  Therefore, instead of
> checking for getNumReduceTasks() in your mapper code, try checking the
> return value of this:
> 
>     context.getConfiguration().get("mapreduce.job.reduces")
> 
> And then in the setup of your corresponding unit test, set that value
> to whatever you want it to be:
> 
>     mapDriver.getConfiguration().setInt("mapreduce.job.reduces", 1);
> 
> I've verified that the property set this way in MRUnit 0.9.0 is
> returned with the same value, though I didn't verify much beyond that.
> 
> [1] http://hadoop.apache.org/docs/mapreduce/current/streaming.html#Specifying+the+Number+of+Reducers
> 
> On Wed, Nov 28, 2012 at 8:06 PM, Dipesh Khakhkhar
> <di...@gmail.com> wrote:
> > Hi Tom,
> >
> > Thanks for replying. I completely agree with you - there will be only one
> > Reduce task in unit test and when we query the mock object to get number of
> > reduce task it should return 1 instead of zero.
> >
> > I'm using to calculate a custom counter and since mocked Context object
> > returns it 0 my test is failing.
> >
> > Can we set it externally this value using MRUnit 0.9*?
> >
> > Thanks.
> > -Dipesh
> 
> 

Re: How to set number of reduce tasks in MRUnit's mocked context object

Posted by Dipesh Khakhkhar <di...@gmail.com>.
@Tom

Thanks for replying. You have understood my use case correctly - I am using
this in my reducer.

I do set the number of reducers in my Job, and there I'm using the
abstraction, i.e. setNumReduceTasks:

  /**
   * Set the number of reduce tasks for the job.
   * @param tasks the number of reduce tasks
   * @throws IllegalStateException if the job is submitted
   */
  public void setNumReduceTasks(int tasks) throws IllegalStateException {
    ensureState(JobState.DEFINE);
    conf.setNumReduceTasks(tasks);
  }

I think it is better not to change my code there, as it is working fine and
does not depend on the configuration property name (which may change in
future Hadoop releases). But the value I'm setting there is also available
to me in my reducer, so I'm changing the reducer to use that directly instead
of context.getNumReduceTasks(). With this change my unit test runs correctly.

Thanks for the suggestion - I just tried setting the configuration using both
of the following:

reduceDriver.getConfiguration().set("mapreduce.job.reduces", "10");
reduceDriver.getConfiguration().setInt("mapreduce.job.reduces", 10);

I tried the following as well:

reduceDriver.getConfiguration().set("mapred.reduce.tasks", "10");
reduceDriver.getConfiguration().setInt("mapred.reduce.tasks", 10);

I kept my reducer code the same, i.e. it still used
context.getNumReduceTasks() to retrieve the value, and it still gave me 0.

When I follow the actual code path for the above call in Hadoop 1.0.3, it
leads to the JobConf class, where I see the following definition:

  /**
   * Get configured the number of reduce tasks for this job. Defaults to
   * <code>1</code>.
   *
   * @return the number of reduce tasks for this job.
   */
  public int getNumReduceTasks() { return getInt("mapred.reduce.tasks", 1); }

It didn't actually return 1 (though ideally it should) because it is being
invoked on a mocked context object (Reducer.Context, derived from Context).
Correct?

So I believe that even the mocked context object should return 1, as per the
contract of the above call.
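
A minimal sketch of the difference described above, using only the JDK. None
of these types come from Hadoop or MRUnit: ReduceCountSource, FakeConf,
ConfBackedContext and UnstubbedContext are hypothetical stand-ins. The point
is only that a context which delegates to the configuration picks up the
default of 1 and any value set later, while an unstubbed mock never consults
the configuration at all.

```java
import java.util.HashMap;
import java.util.Map;

final class ConfBackedSketch {
    // Hypothetical stand-in for the one context method discussed here.
    interface ReduceCountSource {
        int getNumReduceTasks();
    }

    // Hypothetical stand-in for a configuration object with typed accessors.
    static final class FakeConf {
        private final Map<String, String> values = new HashMap<>();

        void setInt(String name, int value) {
            values.put(name, Integer.toString(value));
        }

        int getInt(String name, int defaultValue) {
            String raw = values.get(name);
            return raw == null ? defaultValue : Integer.parseInt(raw);
        }
    }

    // Delegates to the configuration, like the JobConf method quoted above.
    static final class ConfBackedContext implements ReduceCountSource {
        private final FakeConf conf;

        ConfBackedContext(FakeConf conf) {
            this.conf = conf;
        }

        @Override
        public int getNumReduceTasks() {
            return conf.getInt("mapred.reduce.tasks", 1);
        }
    }

    // Imitates an unstubbed mock: it ignores the configuration entirely.
    static final class UnstubbedContext implements ReduceCountSource {
        @Override
        public int getNumReduceTasks() {
            return 0;
        }
    }

    public static void main(String[] args) {
        FakeConf conf = new FakeConf();
        ReduceCountSource backed = new ConfBackedContext(conf);
        ReduceCountSource unstubbed = new UnstubbedContext();

        System.out.println(backed.getNumReduceTasks());    // prints 1 (default)
        conf.setInt("mapred.reduce.tasks", 10);
        System.out.println(backed.getNumReduceTasks());    // prints 10
        System.out.println(unstubbed.getNumReduceTasks()); // prints 0
    }
}
```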

Please let me know if you would like me to file a bug for this.

@Dave

Thanks for replying. There is no getContext method in the MapDriver class;
otherwise this would have been much simpler to mock.

Thanks.




On Thu, Nov 29, 2012 at 4:16 AM, Dave Beech <da...@paraliatech.com> wrote:

> Hi Dipesh
>
> The Context in a mrunit test is actually a mock object (created with
> Mockito). Only some of the methods are set-up internally to provide return
> values, and getNumReduceTasks isn't one of them. But, you can set this up
> yourself in test code.
>
> e.g.
> Mockito.when(mapDriver.getContext().getNumReduceTasks()).thenReturn(10);
>
> Cheers,
> Dave
>
>
> On 29 November 2012 03:10, Tom Wheeler <tw...@cloudera.com> wrote:
>
>> Hi Dipesh,
>>
>> OK, I think I understand what you're saying. I am going to restate it
>> just so you'll be sure I've got it.
>>
>> Your mapper (or reducer) is trying to check the return value of the
>> context.getNumReduceTasks() method, but it's returning 0 in all cases.
>>  Although this wouldn't be an issue for most unit tests, your mapper
>> is doing some computation on this value so you need MRUnit to return
>> something other than 0 so you can test your code.  Does that sound
>> right?
>>
>> If so, I cannot say offhand whether what you're seeing is a bug or a
>> feature that just hasn't been implemented yet.  I think I can offer a
>> workaround for you try, though it may be kind of a hack.
>>
>> Whenever you call methods like setNumReduceTasks, it's really just a
>> convenient way of setting a property that Hadoop interprets.
>> According to the Hadoop Streaming guide [1], the corresponding
>> property here ought to be num.reduce.tasks.  Therefore, instead of
>> checking for getNumReduceTasks() in your mapper code, try checking the
>> return value of this:
>>
>>     context.getConfiguration().get("mapreduce.job.reduces")
>>
>> And then in the setup of your corresponding unit test, set that value
>> to whatever you want it to be:
>>
>>     mapDriver.getConfiguration().setInt("mapreduce.job.reduces", 1);
>>
>> I've verified that the property set this way in MRUnit 0.9.0 is
>> returned with the same value, though I didn't verify much beyond that.
>>
>> [1]
>> http://hadoop.apache.org/docs/mapreduce/current/streaming.html#Specifying+the+Number+of+Reducers
>>
>> On Wed, Nov 28, 2012 at 8:06 PM, Dipesh Khakhkhar
>> <di...@gmail.com> wrote:
>> > Hi Tom,
>> >
>> > Thanks for replying. I completely agree with you - there will be only
>> one
>> > Reduce task in unit test and when we query the mock object to get
>> number of
>> > reduce task it should return 1 instead of zero.
>> >
>> > I'm using to calculate a custom counter and since mocked Context object
>> > returns it 0 my test is failing.
>> >
>> > Can we set it externally this value using MRUnit 0.9*?
>> >
>> > Thanks.
>> > -Dipesh
>>
>
>

Re: How to set number of reduce tasks in MRUnit's mocked context object

Posted by Dave Beech <da...@paraliatech.com>.
Hi Dipesh

The Context in an MRUnit test is actually a mock object (created with
Mockito). Only some of the methods are set up internally to provide return
values, and getNumReduceTasks isn't one of them. But you can set this up
yourself in test code.

e.g.
Mockito.when(mapDriver.getContext().getNumReduceTasks()).thenReturn(10);
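
To illustrate why an unstubbed method yields 0, here is a small sketch that
uses only the JDK. It does not use Mockito or MRUnit: TaskContext and
MockSketch are hypothetical names, and the dynamic proxy merely imitates the
way a mock answers an int method with 0 until a return value has been
registered for it.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for the part of the context used here.
interface TaskContext {
    int getNumReduceTasks();
}

final class MockSketch {
    // Canned return values, keyed by method name.
    private final Map<String, Object> stubs = new HashMap<>();

    // Register a return value (a rough analogue of when(...).thenReturn(...)).
    void stub(String methodName, Object value) {
        stubs.put(methodName, value);
    }

    // Build a proxy that answers from the stubs, or with 0 for unstubbed
    // int methods. Other return types are not handled in this sketch.
    TaskContext context() {
        InvocationHandler handler = (proxy, method, args) -> {
            if (stubs.containsKey(method.getName())) {
                return stubs.get(method.getName());
            }
            return method.getReturnType() == int.class ? Integer.valueOf(0) : null;
        };
        return (TaskContext) Proxy.newProxyInstance(
                TaskContext.class.getClassLoader(),
                new Class<?>[] {TaskContext.class},
                handler);
    }

    public static void main(String[] args) {
        MockSketch mock = new MockSketch();
        TaskContext context = mock.context();

        System.out.println(context.getNumReduceTasks()); // prints 0 (unstubbed)
        mock.stub("getNumReduceTasks", 10);
        System.out.println(context.getNumReduceTasks()); // prints 10 (stubbed)
    }
}
```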

Cheers,
Dave


On 29 November 2012 03:10, Tom Wheeler <tw...@cloudera.com> wrote:

> Hi Dipesh,
>
> OK, I think I understand what you're saying. I am going to restate it
> just so you'll be sure I've got it.
>
> Your mapper (or reducer) is trying to check the return value of the
> context.getNumReduceTasks() method, but it's returning 0 in all cases.
>  Although this wouldn't be an issue for most unit tests, your mapper
> is doing some computation on this value so you need MRUnit to return
> something other than 0 so you can test your code.  Does that sound
> right?
>
> If so, I cannot say offhand whether what you're seeing is a bug or a
> feature that just hasn't been implemented yet.  I think I can offer a
> workaround for you try, though it may be kind of a hack.
>
> Whenever you call methods like setNumReduceTasks, it's really just a
> convenient way of setting a property that Hadoop interprets.
> According to the Hadoop Streaming guide [1], the corresponding
> property here ought to be num.reduce.tasks.  Therefore, instead of
> checking for getNumReduceTasks() in your mapper code, try checking the
> return value of this:
>
>     context.getConfiguration().get("mapreduce.job.reduces")
>
> And then in the setup of your corresponding unit test, set that value
> to whatever you want it to be:
>
>     mapDriver.getConfiguration().setInt("mapreduce.job.reduces", 1);
>
> I've verified that the property set this way in MRUnit 0.9.0 is
> returned with the same value, though I didn't verify much beyond that.
>
> [1]
> http://hadoop.apache.org/docs/mapreduce/current/streaming.html#Specifying+the+Number+of+Reducers
>
> On Wed, Nov 28, 2012 at 8:06 PM, Dipesh Khakhkhar
> <di...@gmail.com> wrote:
> > Hi Tom,
> >
> > Thanks for replying. I completely agree with you - there will be only one
> > Reduce task in unit test and when we query the mock object to get number
> of
> > reduce task it should return 1 instead of zero.
> >
> > I'm using to calculate a custom counter and since mocked Context object
> > returns it 0 my test is failing.
> >
> > Can we set it externally this value using MRUnit 0.9*?
> >
> > Thanks.
> > -Dipesh
>

Re: How to set number of reduce tasks in MRUnit's mocked context object

Posted by Tom Wheeler <tw...@cloudera.com>.
Hi Dipesh,

OK, I think I understand what you're saying. I am going to restate it
just so you'll be sure I've got it.

Your mapper (or reducer) is trying to check the return value of the
context.getNumReduceTasks() method, but it's returning 0 in all cases.
Although this wouldn't be an issue for most unit tests, your mapper
is doing some computation on this value so you need MRUnit to return
something other than 0 so you can test your code.  Does that sound
right?

If so, I cannot say offhand whether what you're seeing is a bug or a
feature that just hasn't been implemented yet.  I think I can offer a
workaround for you to try, though it may be kind of a hack.

Whenever you call methods like setNumReduceTasks, it's really just a
convenient way of setting a property that Hadoop interprets.
According to the Hadoop Streaming guide [1], the corresponding
property here ought to be mapreduce.job.reduces.  Therefore, instead of
checking for getNumReduceTasks() in your mapper code, try checking the
return value of this:

    context.getConfiguration().get("mapreduce.job.reduces")

And then in the setup of your corresponding unit test, set that value
to whatever you want it to be:

    mapDriver.getConfiguration().setInt("mapreduce.job.reduces", 1);

I've verified that the property set this way in MRUnit 0.9.0 is
returned with the same value, though I didn't verify much beyond that.
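
The workaround above can be sketched with plain JDK types. This is only an
illustration under assumptions: java.util.Properties stands in for Hadoop's
Configuration, and ReducerCountSketch, numReduceTasks and perReducerShare are
hypothetical names, as is the per-reducer computation itself. Only the
property name is taken from the message.

```java
import java.util.Properties;

final class ReducerCountSketch {
    // Property name as used in the workaround above.
    static final String REDUCES_KEY = "mapreduce.job.reduces";

    // Read the reducer count from the configuration, falling back to a
    // caller-supplied default when the property is not set.
    static int numReduceTasks(Properties conf, int defaultValue) {
        String raw = conf.getProperty(REDUCES_KEY);
        return raw == null ? defaultValue : Integer.parseInt(raw.trim());
    }

    // Hypothetical computation that depends on the reducer count; a count
    // of 0 would make the division fail, so it is rejected explicitly.
    static long perReducerShare(long totalRecords, Properties conf) {
        int reducers = numReduceTasks(conf, 1);
        if (reducers <= 0) {
            throw new IllegalArgumentException("reducer count must be positive");
        }
        return totalRecords / reducers;
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        System.out.println(numReduceTasks(conf, 1));     // prints 1 (unset)

        // What the test setup would do before exercising the code under test.
        conf.setProperty(REDUCES_KEY, "10");
        System.out.println(numReduceTasks(conf, 1));     // prints 10
        System.out.println(perReducerShare(100L, conf)); // prints 10
    }
}
```

Reading the value through the configuration keeps the code under test
independent of whether the context object is real or mocked, at the cost of
depending on the property name.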

[1] http://hadoop.apache.org/docs/mapreduce/current/streaming.html#Specifying+the+Number+of+Reducers

On Wed, Nov 28, 2012 at 8:06 PM, Dipesh Khakhkhar
<di...@gmail.com> wrote:
> Hi Tom,
>
> Thanks for replying. I completely agree with you - there will be only one
> Reduce task in unit test and when we query the mock object to get number of
> reduce task it should return 1 instead of zero.
>
> I'm using to calculate a custom counter and since mocked Context object
> returns it 0 my test is failing.
>
> Can we set it externally this value using MRUnit 0.9*?
>
> Thanks.
> -Dipesh

Re: How to set number of reduce tasks in MRUnit's mocked context object

Posted by Dipesh Khakhkhar <di...@gmail.com>.
Hi Tom,

Thanks for replying. I completely agree with you - there will be only one
reduce task in a unit test, and when we query the mock object for the number
of reduce tasks it should return 1 instead of zero.

I'm using it to calculate a custom counter, and since the mocked Context
object returns 0, my test is failing.

Can we set this value externally using MRUnit 0.9*?

Thanks.
-Dipesh

On Wed, Nov 28, 2012 at 3:32 PM, Tom Wheeler <tw...@cloudera.com> wrote:

> Hi Dipesh,
>
> I haven't looked at that part of the code, but based on my
> understanding of MRUnit, it doesn't (generally) make sense to have
> multiple reduce tasks in a unit test.  If your goal is to test a
> custom partitioner, then I know that's not yet supported (MRUNIT-128).
>
> You could maybe make the case that in a unit test you should be able
> to set an arbitrary value in the configuration object and then be able
> to to retrieve that using the corresponding get method.  Maybe you
> could clarify what you want to accomplish by setting the number of
> reduce tasks.
>
> Tom Wheeler
>
> On Wed, Nov 28, 2012 at 5:11 PM, Dipesh Khakhkhar
> <di...@gmail.com> wrote:
> > When I am running a reducer test - then this method should return 1
> without
> > requiring any further mocking. Correct? Same is true for Mapper (I have
> not
> > tried it though).
> >
> > Will it require an enhancement in the framework or it can be done
> > externally.
> >
> > Thanks.
> >
> >
> > On Wed, Nov 28, 2012 at 2:59 PM, Dipesh Khakhkhar <
> dipeshsoftware@gmail.com>
> > wrote:
> >>
> >> I'm calling getNumReduceTasks() method in my reducer and in my MRUnit
> >> test, framework has created a mock context object but when I call
> >> getNumReduceTasks it is returning 0.
> >>
> >> I have tried setting it to Configuration object too but it didn't help.
> >>
> >> How can i set getNumReduceTasks() to return a desired number here?
> >>
> >> Thanks.
> >
> >
>

Re: How to set number of reduce tasks in MRUnit's mocked context object

Posted by Tom Wheeler <tw...@cloudera.com>.
Hi Dipesh,

I haven't looked at that part of the code, but based on my
understanding of MRUnit, it doesn't (generally) make sense to have
multiple reduce tasks in a unit test.  If your goal is to test a
custom partitioner, then I know that's not yet supported (MRUNIT-128).

You could maybe make the case that in a unit test you should be able
to set an arbitrary value in the configuration object and then be able
to retrieve that using the corresponding get method.  Maybe you
could clarify what you want to accomplish by setting the number of
reduce tasks.

Tom Wheeler

On Wed, Nov 28, 2012 at 5:11 PM, Dipesh Khakhkhar
<di...@gmail.com> wrote:
> When I am running a reducer test - then this method should return 1 without
> requiring any further mocking. Correct? Same is true for Mapper (I have not
> tried it though).
>
> Will it require an enhancement in the framework or it can be done
> externally.
>
> Thanks.
>
>
> On Wed, Nov 28, 2012 at 2:59 PM, Dipesh Khakhkhar <di...@gmail.com>
> wrote:
>>
>> I'm calling getNumReduceTasks() method in my reducer and in my MRUnit
>> test, framework has created a mock context object but when I call
>> getNumReduceTasks it is returning 0.
>>
>> I have tried setting it to Configuration object too but it didn't help.
>>
>> How can i set getNumReduceTasks() to return a desired number here?
>>
>> Thanks.
>
>

Re: How to set number of reduce tasks in MRUnit's mocked context object

Posted by Dipesh Khakhkhar <di...@gmail.com>.
When I am running a reducer test, this method should return 1 without
requiring any further mocking. Correct? The same is true for a Mapper (I have
not tried it though).

Will it require an enhancement in the framework, or can it be done
externally?

Thanks.

On Wed, Nov 28, 2012 at 2:59 PM, Dipesh Khakhkhar
<di...@gmail.com> wrote:

> I'm calling getNumReduceTasks() method in my reducer and in my MRUnit
> test, framework has created a mock context object but when I call
> getNumReduceTasks it is returning 0.
>
> I have tried setting it to Configuration object too but it didn't help.
>
> How can i set getNumReduceTasks() to return a desired number here?
>
> Thanks.
>