Posted to user@mahout.apache.org by ma...@googlemail.com on 2010/07/02 17:26:15 UTC

Mahout running on Hadoop

Hi there,

I am new to Mahout and am currently evaluating the framework for our project.
I like that Mahout can run on Hadoop, which prepares us for a large number of
users and a large amount of data. However, I am looking for an easy way to
migrate to Hadoop. So far, I have only found tutorials that describe how to
copy the input and output files to Hadoop. As a user, I do not really want to
deal with copying files; I would rather stay with the API and the Recommender
class.

Is this possible at all? Is there any documentation on this issue?

Best regards,
Matthias

Re: Mahout running on Hadoop

Posted by Sean Owen <sr...@gmail.com>.

In general, the Hadoop-based implementations are completely different
creatures. Code from the regular online versions doesn't port and the
computation needs to be structured quite differently. They're almost
different libraries.

There's one hybrid, and that is the pseudo-distributed recommender
bits in org.apache.mahout.cf.taste.hadoop.pseudo. This is a way to run
many normal, non-distributed Recommenders on Hadoop. The computation
isn't actually distributed; it's just that many instances are run. The
issue there is that eventually your data outgrows what can be loaded on
one machine, but it could still be useful.
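
To make the "many instances are run" idea concrete, here is a rough,
language-neutral sketch in Python. It is not Mahout code and does not use the
org.apache.mahout.cf.taste.hadoop.pseudo classes; the function names, the toy
co-occurrence scoring and the sample data are all made up for illustration.
The point it tries to show is that every worker loads the full set of
preferences and only the list of users is split up.

```python
from collections import defaultdict

# Made-up sample data: user -> set of items the user liked.
PREFS = {
    "u1": {"a", "b"},
    "u2": {"a", "b", "c"},
    "u3": {"b", "d"},
    "u4": {"c", "d"},
}


def recommend(prefs, user, how_many=2):
    """Toy non-distributed recommender (hypothetical, not a Mahout algorithm).

    Scores each unseen item by how strongly its owners overlap with the user.
    """
    seen = prefs[user]
    scores = defaultdict(int)
    for other, items in prefs.items():
        if other == user:
            continue
        overlap = len(seen & items)
        if overlap == 0:
            continue
        for item in items - seen:
            scores[item] += overlap
    # Highest score first; ties broken by item name for stable output.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:how_many]]


def run_worker(prefs, user_slice):
    """One worker: holds ALL preferences in memory, serves only its users."""
    return {user: recommend(prefs, user) for user in user_slice}


def pseudo_distributed(prefs, num_workers):
    """Split the users across workers; the data itself is not partitioned."""
    users = sorted(prefs)
    slices = [users[i::num_workers] for i in range(num_workers)]
    results = {}
    for user_slice in slices:  # on a cluster these would run in parallel
        results.update(run_worker(prefs, user_slice))
    return results
```

Because each worker needs the whole of PREFS, adding workers spreads the
computation but not the memory, which is the limit described above.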

Otherwise, no, there's no way to just move a given implementation to
Hadoop in a fully distributed way. Some algorithms just won't be
distributable.

On Sun, Jul 4, 2010 at 9:00 PM, Ted Dunning <te...@gmail.com> wrote:
> The recommendation capabilities are the best integrated and most
> interchangeable parts of Mahout.
>
> You should be able to start with entirely on-line recommendations and switch
> to off-line methods fairly transparently as you scale.  In addition, you
> should be able to use off-line precomputation with hadoop and still use
> non-hadoop based methods for experiments.
>
> Sean should probably comment on the details, but I am pretty sure that the
> statement above is a good summary.
>
> 2010/7/4 Matthias Böhmer <ma...@m-boehmer.de>
>
>> Yes, right! I have a non-Hadoop implementation using the API and I am
>> wondering which steps I have to take to move to a Hadoop-based
>> implementation. It seems like I have to change my application code,
>> right? Or is there a way to keep my application code as it is, e.g.
>> for running tests without Hadoop?
>>
>> 2010/7/2 Ted Dunning <te...@gmail.com>:
>> > By this, do you mean migrate from using the Mahout recommendation framework
>> > without hadoop to using the Mahout recommendation framework with Hadoop?
>> >
>> > On Fri, Jul 2, 2010 at 8:26 AM, <ma...@googlemail.com> wrote:
>> >
>> >> However, I am currently looking for an easy way of how to migrate to
>> >> Hadoop.
>> >

Re: Mahout running on Hadoop

Posted by Ted Dunning <te...@gmail.com>.

The recommendation capabilities are the best integrated and most
interchangeable parts of Mahout.

You should be able to start with entirely on-line recommendations and switch
to off-line methods fairly transparently as you scale.  In addition, you
should be able to use off-line precomputation with hadoop and still use
non-hadoop based methods for experiments.

Sean should probably comment on the details, but I am pretty sure that the
statement above is a good summary.
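
One way to read "switch fairly transparently" is that the application only
depends on a small recommend call, and what sits behind it can change. The
Python sketch below illustrates that pattern only. It is not Mahout's API:
the class names, batch_precompute, top_pick and the sample data are invented
for the example, and the "offline job" here is just a local loop standing in
for a Hadoop job that would write its results to files.

```python
class OnlineRecommender:
    """Computes recommendations on demand from in-memory preferences."""

    def __init__(self, prefs):
        self.prefs = prefs

    def recommend(self, user, how_many):
        seen = self.prefs.get(user, set())
        counts = {}
        for other, items in self.prefs.items():
            # Only users who share at least one item contribute.
            if other != user and seen & items:
                for item in items - seen:
                    counts[item] = counts.get(item, 0) + 1
        ranked = sorted(counts, key=lambda item: (-counts[item], item))
        return ranked[:how_many]


class PrecomputedRecommender:
    """Serves results that an offline batch job produced earlier."""

    def __init__(self, table):
        self.table = table

    def recommend(self, user, how_many):
        return self.table.get(user, [])[:how_many]


def batch_precompute(prefs, how_many):
    """Stand-in for the offline job (assumption: a real one runs on Hadoop)."""
    online = OnlineRecommender(prefs)
    return {user: online.recommend(user, how_many) for user in prefs}


def top_pick(recommender, user):
    """Application code: depends only on recommend(user, how_many)."""
    picks = recommender.recommend(user, 1)
    return picks[0] if picks else None


# Made-up sample data: user -> set of liked items.
SAMPLE_PREFS = {"u1": {"a", "b"}, "u2": {"a", "b", "c"}, "u3": {"b", "d"}}
online = OnlineRecommender(SAMPLE_PREFS)
offline = PrecomputedRecommender(batch_precompute(SAMPLE_PREFS, 5))
```

Here top_pick(online, "u1") and top_pick(offline, "u1") go through the same
application code, so tests and experiments can use the in-memory version
while production reads precomputed results.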

2010/7/4 Matthias Böhmer <ma...@m-boehmer.de>

> Yes, right! I have a non-Hadoop implementation using the API and I am
> wondering which steps I have to take to move to a Hadoop-based
> implementation. It seems like I have to change my application code,
> right? Or is there a way to keep my application code as it is, e.g.
> for running tests without Hadoop?
>
> 2010/7/2 Ted Dunning <te...@gmail.com>:
> > By this, do you mean migrate from using the Mahout recommendation framework
> > without hadoop to using the Mahout recommendation framework with Hadoop?
> >
> > On Fri, Jul 2, 2010 at 8:26 AM, <ma...@googlemail.com> wrote:
> >
> >> However, I am currently looking for an easy way of how to migrate to
> >> Hadoop.
> >

Re: Mahout running on Hadoop

Posted by Matthias Böhmer <ma...@m-boehmer.de>.

Yes, right! I have a non-Hadoop implementation using the API and I am
wondering which steps I have to take to move to a Hadoop-based
implementation. It seems like I have to change my application code,
right? Or is there a way to keep my application code as it is, e.g.
for running tests without Hadoop?

2010/7/2 Ted Dunning <te...@gmail.com>:
> By this, do you mean migrate from using the Mahout recommendation framework
> without hadoop to using the Mahout recommendation framework with Hadoop?
>
> On Fri, Jul 2, 2010 at 8:26 AM, <ma...@googlemail.com> wrote:
>
>> However, I am currently looking for an easy way of how to migrate to
>> Hadoop.
>

Re: Mahout running on Hadoop

Posted by Ted Dunning <te...@gmail.com>.

By this, do you mean migrate from using the Mahout recommendation framework
without hadoop to using the Mahout recommendation framework with Hadoop?

On Fri, Jul 2, 2010 at 8:26 AM, <ma...@googlemail.com> wrote:

> However, I am currently looking for an easy way of how to migrate to
> Hadoop.