Posted to dev@mahout.apache.org by Lance Norskog <go...@gmail.com> on 2010/12/20 04:54:22 UTC

Code nit- not worth JIRA entry

The old Hadoop JobConf class is a subclass of the new Configuration
class. Unfortunately, some classes still require a JobConf even though
they don't use any of its methods and treat it as a plain Configuration.
This means that when your code has only a Configuration, you can't use
one of these old-school classes, even though it would work fine.

The org.apache.mahout.common.parameters.Parameter classes should all
take Configuration in their constructors, if they have constructors
(some don't). Some still take a JobConf and then upcast it in the
super() call.
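
To illustrate the point, here is a minimal sketch. The Configuration and
JobConf classes in it are hypothetical stand-ins for the Hadoop ones (only
the subclass relationship is taken from the real API), and
SketchDoubleParameter is an invented name, not the Mahout class. It only
shows that a constructor declared against the base type accepts both kinds
of caller.

```java
import java.util.HashMap;
import java.util.Map;

public class ParameterSketch {

  // Hypothetical stand-in for org.apache.hadoop.conf.Configuration:
  // for this purpose, just a key-value container.
  static class Configuration {
    private final Map<String, String> values = new HashMap<>();

    void set(String key, String value) {
      values.put(key, value);
    }

    String get(String key) {
      return values.get(key);
    }
  }

  // Hypothetical stand-in for the old org.apache.hadoop.mapred.JobConf,
  // which is a subclass of Configuration.
  static class JobConf extends Configuration {
  }

  // Invented illustration class (not Mahout's DoubleParameter). It only
  // calls get(), so it asks for the base type rather than JobConf.
  static class SketchDoubleParameter {
    private final double value;

    SketchDoubleParameter(Configuration conf, String name, double defaultValue) {
      String raw = conf.get(name);
      this.value = (raw == null) ? defaultValue : Double.parseDouble(raw);
    }

    double get() {
      return value;
    }
  }

  public static void main(String[] args) {
    Configuration conf = new Configuration();
    conf.set("threshold", "0.25");

    // Nothing is set on this one, so the default is used.
    JobConf jobConf = new JobConf();

    // Both calls compile because JobConf is-a Configuration. Had the
    // constructor been declared with JobConf, the first call would not.
    System.out.println(new SketchDoubleParameter(conf, "threshold", 1.0).get());
    System.out.println(new SketchDoubleParameter(jobConf, "threshold", 1.0).get());
  }
}
```

Declaring the parameter as Configuration costs the old-API callers nothing,
since they can still pass a JobConf.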

This makes it hard to write a DistanceMeasure that takes a parameter.
TanimotoDistanceMeasure takes two: a PathParameter, which is in the
Configuration list below, and a Vector parameter built with
ClassParameter, which is also in that list. I'm using the same
ClassParameter dodge to make a Double parameter that works with a
Configuration.

These take a JobConf in the constructor:
StringParameter
IntegerParameter
FileParameter
DoubleParameter
	
This takes a Configuration in some of its methods:
Parametered interface

These take a Configuration in the constructor:
PathParameter
CompositeParameter
ClassParameter
AbstractParameter

I have not swept the code base for other instances of this. Since I'm
not a committer, there's not much point.

-- 
Lance Norskog
goksron@gmail.com

Re: Code nit- not worth JIRA entry

Posted by Sean Owen <sr...@gmail.com>.
I just committed a change that removes most usage of JobConf and
replaces it with Configuration. In theory this changes almost nothing,
since both are just key-value containers, and the tests pass.

I think I can remove more. But does this solve the problem so far?

It's another small step toward getting rid of the old, deprecated
Hadoop APIs. It's mostly the Bayes jobs that are still on the old
code. Robin, what are your thoughts on converting these? We talked
about doing this for 0.4 over 6 months ago, but that seems to have
stalled.

On Mon, Dec 20, 2010 at 9:50 AM, Sean Owen <sr...@gmail.com> wrote:
> In theory we should be entirely on new, undeprecated Hadoop APIs (the
> 0.20.x+ APIs). In practice that's not so and I don't know how or when the
> stragglers will be updated. But the goal is most certainly to not use
> JobConf at all.
> However as we've agreed to require 0.20.2, it's fair game to use
> Configuration where possible. I'll have a look to see just how much can be
> updated. My code inspections ought to have picked up a lot of those already.

Re: Code nit- not worth JIRA entry

Posted by Sean Owen <sr...@gmail.com>.
In theory we should be entirely on new, undeprecated Hadoop APIs (the
0.20.x+ APIs). In practice that's not so and I don't know how or when the
stragglers will be updated. But the goal is most certainly to not use
JobConf at all.

However as we've agreed to require 0.20.2, it's fair game to use
Configuration where possible. I'll have a look to see just how much can be
updated. My code inspections ought to have picked up a lot of those already.
