Posted to common-user@hadoop.apache.org by David Rosenstrauch <da...@darose.net> on 2010/08/04 17:38:13 UTC
Fwd: Partitioner in Hadoop 0.20
Someone sent this email to the common-user list a while back, but it
seems to have slipped through the cracks. We're starting to dig into
some hard-core Hadoop development and just ran into the same issue.
Does anyone know if there's a particular reason why the new Partitioner
class doesn't implement JobConfigurable? (And, if not, whether there
are any plans to fix this omission?) We're working on a somewhat complex
partitioner, and it would be extremely helpful to be able to pass it
some parameters via the JobConf.
Thanks,
DR
-------- Original Message --------
Subject: Partitioner in Hadoop 0.20
Date: Wed, 30 Jun 2010 00:05:52 -0400
From: Saptarshi Guha <sa...@gmail.com>
Reply-To: common-user@hadoop.apache.org, saptarshi.guha@gmail.com
To: common-user@hadoop.apache.org
Hello,
In Hadoop 0.20.2 (current), the Partitioner class does not extend
JobConfigurable (as it did in Hadoop pre-0.19).
Does this mean there isn't a way to set some configurable options for the
partitioner?
http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapreduce/Partitioner.html
(old: http://hadoop.apache.org/common/docs/current/api/org/apache/hadoop/mapred/Partitioner.html)
Regards
Saptarshi
Re: Partitioner in Hadoop 0.20
Posted by Owen O'Malley <om...@apache.org>.
On Aug 4, 2010, at 10:58 AM, David Rosenstrauch wrote:
> So my partitioner needs to implement Configurable, then, not
> JobConfigurable. Thanks much!
ReflectionUtils.newInstance will use either Configurable or
JobConfigurable (or both!). So implementing either one will work fine.
-- Owen
Re: Partitioner in Hadoop 0.20
Posted by David Rosenstrauch <da...@darose.net>.
On 08/04/2010 01:55 PM, Wilkes, Chris wrote:
> On Aug 4, 2010, at 10:50 AM, David Rosenstrauch wrote:
>
>> On 08/04/2010 12:30 PM, Owen O'Malley wrote:
>>>
>>> On Aug 4, 2010, at 8:38 AM, David Rosenstrauch wrote:
>>>
>>>> Anyone know if there's any particular reason why the new Partitioner
>>>> class doesn't implement JobConfigurable? (And, if not, whether there's
>>>> any plans to fix this omission?) We're working on a somewhat complex
>>>> partitioner, and it would be extremely helpful to be able to pass it
>>>> some parms via the jobconf.
>>>
>>> The short answer is that it doesn't need to. If you make your
>>> partitioner either Configured or JobConfigurable, it will be configured.
>>> The API class doesn't depend on it precisely because it is not required
>>> for all partitioners.
>>>
>>> -- Owen
>>
>> ? Not sure I understand correctly ... can you pls clarify?
>>
>> So if I make my custom partitioner implement JobConfigurable, then its
>> configure(JobConf) method will automagically get called and allow me
>> to configure it with info in the jobConf that's passed in? (Note that
>> making it extend from Configured is not an option, since it needs to
>> extend from org.apache.hadoop.mapreduce.Partitioner.)
>>
>
>
> The partitioner is instantiated by ReflectionUtils.newInstance(clazz,
> job), which calls setConf() on the newly created object if it
> implements Configurable.
>
> Chris
So my partitioner needs to implement Configurable, then, not
JobConfigurable. Thanks much!
DR
Re: Partitioner in Hadoop 0.20
Posted by "Wilkes, Chris" <cw...@gmail.com>.
On Aug 4, 2010, at 10:50 AM, David Rosenstrauch wrote:
> On 08/04/2010 12:30 PM, Owen O'Malley wrote:
>>
>> On Aug 4, 2010, at 8:38 AM, David Rosenstrauch wrote:
>>
>>> Anyone know if there's any particular reason why the new Partitioner
>>> class doesn't implement JobConfigurable? (And, if not, whether there's
>>> any plans to fix this omission?) We're working on a somewhat complex
>>> partitioner, and it would be extremely helpful to be able to pass it
>>> some parms via the jobconf.
>>
>> The short answer is that it doesn't need to. If you make your
>> partitioner either Configured or JobConfigurable, it will be configured.
>> The API class doesn't depend on it precisely because it is not required
>> for all partitioners.
>>
>> -- Owen
>
> ? Not sure I understand correctly ... can you pls clarify?
>
> So if I make my custom partitioner implement JobConfigurable, then
> its configure(JobConf) method will automagically get called and
> allow me to configure it with info in the jobConf that's passed in?
> (Note that making it extend from Configured is not an option, since
> it needs to extend from org.apache.hadoop.mapreduce.Partitioner.)
>
The partitioner is instantiated by ReflectionUtils.newInstance(clazz,
job), which calls setConf() on the newly created object if it
implements Configurable.
Chris
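
To make the mechanism concrete, here is a minimal sketch of the
"instantiate, then configure if Configurable" step. It is not Hadoop's
code: Conf, Configurable, Partitioner and newInstance below are local
stand-ins so the example compiles without Hadoop on the classpath, and
PrefixPartitioner and the "example.partitioner.prefix.length" key are
made up for illustration. In a real job the class would presumably
extend org.apache.hadoop.mapreduce.Partitioner and implement
org.apache.hadoop.conf.Configurable, whose hook is
setConf(Configuration).

```java
import java.util.HashMap;
import java.util.Map;

public class ConfigurablePartitionerSketch {

    /** Stand-in for Hadoop's Configuration: a small bag of string settings. */
    static class Conf {
        private final Map<String, String> values = new HashMap<>();

        void setInt(String name, int value) {
            values.put(name, Integer.toString(value));
        }

        int getInt(String name, int defaultValue) {
            String v = values.get(name);
            return v == null ? defaultValue : Integer.parseInt(v);
        }
    }

    /** Stand-in for Hadoop's Configurable interface. */
    interface Configurable {
        void setConf(Conf conf);
        Conf getConf();
    }

    /** Stand-in for the new-API abstract Partitioner class. */
    abstract static class Partitioner<K, V> {
        public abstract int getPartition(K key, V value, int numPartitions);
    }

    /** Hypothetical partitioner: routes keys by their first N characters. */
    static class PrefixPartitioner extends Partitioner<String, Integer>
            implements Configurable {
        // Made-up property name, used only in this sketch.
        static final String PREFIX_LENGTH = "example.partitioner.prefix.length";

        private Conf conf;
        private int prefixLength = 1; // used if setConf is never called

        @Override
        public void setConf(Conf conf) {
            this.conf = conf;
            this.prefixLength = conf.getInt(PREFIX_LENGTH, 1);
        }

        @Override
        public Conf getConf() {
            return conf;
        }

        @Override
        public int getPartition(String key, Integer value, int numPartitions) {
            String prefix = key.substring(0, Math.min(prefixLength, key.length()));
            // Mask the sign bit so the result is never negative.
            return (prefix.hashCode() & Integer.MAX_VALUE) % numPartitions;
        }
    }

    /** Mimics the step described above: create the object, then configure it. */
    static <T> T newInstance(Class<T> clazz, Conf conf) {
        try {
            T instance = clazz.getDeclaredConstructor().newInstance();
            if (conf != null && instance instanceof Configurable) {
                ((Configurable) instance).setConf(conf);
            }
            return instance;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot instantiate " + clazz, e);
        }
    }

    public static void main(String[] args) {
        Conf conf = new Conf();
        conf.setInt(PrefixPartitioner.PREFIX_LENGTH, 3);
        PrefixPartitioner p = newInstance(PrefixPartitioner.class, conf);
        // Keys sharing their first three characters land in the same partition.
        System.out.println(
            p.getPartition("hadoop-a", 1, 8) == p.getPartition("hadoop-b", 1, 8)); // prints true
    }
}
```

The point of the sketch is that the partitioner base class itself needs
no configuration hook: the factory checks the interface on the concrete
object, so only partitioners that want parameters have to implement it.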
Re: Partitioner in Hadoop 0.20
Posted by David Rosenstrauch <da...@darose.net>.
On 08/04/2010 12:30 PM, Owen O'Malley wrote:
>
> On Aug 4, 2010, at 8:38 AM, David Rosenstrauch wrote:
>
>> Anyone know if there's any particular reason why the new Partitioner
>> class doesn't implement JobConfigurable? (And, if not, whether there's
>> any plans to fix this omission?) We're working on a somewhat complex
>> partitioner, and it would be extremely helpful to be able to pass it
>> some parms via the jobconf.
>
> The short answer is that it doesn't need to. If you make your
> partitioner either Configured or JobConfigurable, it will be configured.
> The API class doesn't depend on it precisely because it is not required
> for all partitioners.
>
> -- Owen
? Not sure I understand correctly ... can you pls clarify?
So if I make my custom partitioner implement JobConfigurable, then its
configure(JobConf) method will automagically get called and allow me to
configure it with info in the jobConf that's passed in? (Note that
making it extend from Configured is not an option, since it needs to
extend from org.apache.hadoop.mapreduce.Partitioner.)
Thanks,
DR
Re: Partitioner in Hadoop 0.20
Posted by Owen O'Malley <om...@apache.org>.
On Aug 4, 2010, at 8:38 AM, David Rosenstrauch wrote:
> Anyone know if there's any particular reason why the new Partitioner
> class doesn't implement JobConfigurable? (And, if not, whether
> there's any plans to fix this omission?) We're working on a
> somewhat complex partitioner, and it would be extremely helpful to
> be able to pass it some parms via the jobconf.
The short answer is that it doesn't need to. If you make your
partitioner either Configured or JobConfigurable, it will be
configured. The API class doesn't depend on it precisely because it is
not required for all partitioners.
-- Owen