Posted to dev@whirr.apache.org by "Andrei Savu (Updated) (JIRA)" <ji...@apache.org> on 2011/12/09 12:40:40 UTC

[jira] [Updated] (WHIRR-384) Add Mahout as a service

     [ https://issues.apache.org/jira/browse/WHIRR-384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrei Savu updated WHIRR-384:
------------------------------

    Attachment: WHIRR-384.patch

Thanks Frank! Here is a slightly updated patch (improved test logging, added as a dependency of the CLI).

+1 from me. Works like a charm both on aws-ec2 & cloudservers-uk. 
                
> Add Mahout as a service
> -----------------------
>
>                 Key: WHIRR-384
>                 URL: https://issues.apache.org/jira/browse/WHIRR-384
>             Project: Whirr
>          Issue Type: New Feature
>          Components: new service
>    Affects Versions: 0.7.0
>            Reporter: Frank Scholten
>             Fix For: 0.8.0
>
>         Attachments: WHIRR-384-mahout-client.patch, WHIRR-384-mahout-home.patch, WHIRR-384.patch
>
>
> Here is an initial patch to support Mahout as a Whirr service.
> I created the role 'mahout-home' which can be used to install the binary Mahout distribution on a Hadoop namenode.
> By combining this role with configuration for a Hadoop cluster you can SSH into the namenode, su to root and start running Mahout jobs via the mahout script immediately.
> The 'mahout-home' role has two properties:
>   - Mahout version: whirr.mahout.version
>   - URL of the Mahout binary distribution tarball: whirr.mahout.tarball.url
> Note that I used a snapshot version of Mahout for testing (revision 1169784), because there were some problems with the Mahout script in 0.5 that have been fixed on trunk; see MAHOUT-680. To test, you can set the tarball property to this link: http://dl.dropbox.com/u/13436484/mahout-distribution-0.6-SNAPSHOT.tar.gz
> I used configure actions and the onBeforeConfigure(). If there is a better way to express this with the Whirr API let me know.
> Currently I am investigating a 'mahout-jar' role, which installs the Mahout examples job jar under $HADOOP_HOME/lib on a tasktracker node. I already have some code for putting the jar in place, but when running a job from my local machine I still get ClassNotFoundExceptions. I believe this is because Hadoop has already started before the jar is put in the lib dir, so the jar won't be picked up, but I have to investigate some more. From WHIRR-221 I understood that there is no support (yet?) for ordering of services, but if you have an idea on how to fix this, let me know.
> Comments and suggestions welcome!
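
For illustration, a recipe combining the 'mahout-home' role with a small Hadoop cluster might look roughly like the sketch below. The two whirr.mahout.* property names and the tarball URL come from the description above. The cluster name, the instance layout, the credential placeholders and the exact version string are assumptions made for this example, not values taken from the patch.

```properties
# Hypothetical cluster name, chosen for this example only
whirr.cluster-name=mahout-test

# Assumed layout: 'mahout-home' placed on the namenode, as the description suggests
whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker+mahout-home,2 hadoop-datanode+hadoop-tasktracker

# Provider and credentials (aws-ec2 is one of the providers mentioned in the comment)
whirr.provider=aws-ec2
whirr.identity=${env:AWS_ACCESS_KEY_ID}
whirr.credential=${env:AWS_SECRET_ACCESS_KEY}

# Properties of the 'mahout-home' role; the version value is an assumption
whirr.mahout.version=0.6-SNAPSHOT
whirr.mahout.tarball.url=http://dl.dropbox.com/u/13436484/mahout-distribution-0.6-SNAPSHOT.tar.gz
```

With a file like this, the cluster would presumably be started the usual Whirr way by passing the file to the launch-cluster command via --config, after which one can SSH into the namenode and run the mahout script as described.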

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira