Posted to dev@mahout.apache.org by "Sean Owen (JIRA)" <ji...@apache.org> on 2011/09/13 11:12:09 UTC

[jira] [Resolved] (MAHOUT-663) Rationalize hadoop job creation with respect to setJarByClass

     [ https://issues.apache.org/jira/browse/MAHOUT-663?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Sean Owen resolved MAHOUT-663.
------------------------------

    Resolution: Not A Problem
      Assignee: Sean Owen

I suggest we resolve this as "already working as desired", since if you are using AbstractJob you can already get the Job and reconfigure it as you like. (Would a command-line option to pick the class containing the jar be useful?)

Not everything uses AbstractJob, but I've given up that fight. I just don't think the authors of the remaining code are going to get around to converting it. So this idea wouldn't help you with those jobs.

> Rationalize hadoop job creation with respect to setJarByClass
> -------------------------------------------------------------
>
>                 Key: MAHOUT-663
>                 URL: https://issues.apache.org/jira/browse/MAHOUT-663
>             Project: Mahout
>          Issue Type: Bug
>          Components: build
>    Affects Versions: 0.4, 0.5
>            Reporter: Benson Margulies
>            Assignee: Sean Owen
>             Fix For: 0.6
>
>
> Mahout includes a series of driver classes that create Hadoop jobs via static methods.
> Each one of these calls job.setJarByClass(itself.class).
> Unfortunately, this subverts the Hadoop support for putting additional jars in the lib directory of a job jar, since the class passed in is not a class that lives in the ordinary section of the job jar.
> The effect of this is to force users of Mahout (and Mahout's own example job jar) to unpack the mahout-core jar into the main section, instead of just treating it as a 'lib' dependency.
> It seems to me that each static job creator should be refactored into a public function that returns a Job object (and does NOT call waitForCompletion), plus the existing wrapper, which would call the new function and then submit the job. Users could call the new functions and make their own call to setJarByClass.
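
The refactoring proposed in the description could look roughly like the sketch below. It is an illustration, not Mahout code: ExampleDriver, MyApp, createJob and runJob are hypothetical names. To compile without Hadoop on the classpath, Job here is a small stand-in that mimics only setJarByClass and waitForCompletion from org.apache.hadoop.mapreduce.Job; getJarClass and isSubmitted exist only in the stand-in so the effect can be inspected.

```java
/**
 * Stand-in for org.apache.hadoop.mapreduce.Job (an assumption, so the sketch
 * compiles on its own). Only setJarByClass and waitForCompletion mirror real
 * Hadoop methods; the two getters are for demonstration only.
 */
final class Job {
    private final String name;
    private Class<?> jarClass;
    private boolean submitted;

    Job(String name) {
        this.name = name;
    }

    /** Records the class whose containing jar would be shipped as the job jar. */
    void setJarByClass(Class<?> cls) {
        this.jarClass = cls;
    }

    /** Pretends to submit the job and block until it finishes. */
    boolean waitForCompletion(boolean verbose) {
        submitted = true;
        return true;
    }

    String getName() {
        return name;
    }

    Class<?> getJarClass() {
        return jarClass;
    }

    boolean isSubmitted() {
        return submitted;
    }
}

/** Hypothetical driver, standing in for one of the static job creators. */
final class ExampleDriver {
    private ExampleDriver() {
    }

    /** New entry point: configures the job and returns it WITHOUT submitting. */
    static Job createJob(String name) {
        Job job = new Job(name);
        // Default as today; the caller is free to override it before submitting.
        job.setJarByClass(ExampleDriver.class);
        // Mapper, reducer, input and output paths would be configured here.
        return job;
    }

    /** Existing-style wrapper: same behaviour as before for current callers. */
    static boolean runJob(String name) {
        return createJob(name).waitForCompletion(true);
    }
}

/** Hypothetical user class that lives in the main section of the job jar. */
final class MyApp {
    private MyApp() {
    }

    static Job prepare() {
        Job job = ExampleDriver.createJob("example");
        // Point Hadoop at the user's own jar, so the library jar can stay in lib/.
        job.setJarByClass(MyApp.class);
        return job;
    }
}
```

With this split, a user would call MyApp.prepare() and then waitForCompletion themselves, while existing callers of the wrapper keep the current behaviour.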
