Posted to user@mahout.apache.org by Matt Mitchell <go...@gmail.com> on 2012/08/04 20:28:47 UTC

mahout on aws elastic map reduce

Hi,

I'm digging around trying to find info on running Mahout on AWS
Elastic MapReduce. Does anyone know of a step-by-step article/tutorial?
I'm interested in running "itemsimilarity", "recommenditembased" and
"recommendfactorized".

Thanks!

- Matt

Re: mahout on aws elastic map reduce

Posted by Matt Mitchell <go...@gmail.com>.
I was finally able to get things running. I had completely overlooked
the similarity algorithm parameter, and once I set that, it worked
great! It'd be nice if Mahout complained immediately when a required
parameter is missing, rather than only at the phase that first needs
it, especially in multi-phase MapReduce jobs.

- Matt
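
Until the job itself fails fast, a small pre-flight check on the client
side can catch a missing option before the step is submitted. The sketch
below is only an illustration: the thread does not name the parameter, so
the assumption here is that it is ItemSimilarityJob's
--similarityClassname option, and the list of "required" options is a
guess, not taken from Mahout itself. The function names are hypothetical.

```python
# Assumed set of options the job cannot run without. The thread only says
# "the similarity algorithm parameter"; --similarityClassname is an assumption.
REQUIRED_OPTIONS = ("--input", "--output", "--similarityClassname")


def missing_options(job_args, required=REQUIRED_OPTIONS):
    """Return the required option names that do not appear in job_args."""
    present = {arg for arg in job_args if arg.startswith("--")}
    return [opt for opt in required if opt not in present]


def check_or_raise(job_args):
    """Raise ValueError before submission if a required option is absent."""
    missing = missing_options(job_args)
    if missing:
        raise ValueError("missing required option(s): " + ", ".join(missing))
    return job_args
```

Run against the arguments from the failing command in this thread, the
check reports --similarityClassname as missing instead of letting the job
run for ten minutes first.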

On Sat, Aug 4, 2012 at 5:11 PM, Matt Mitchell <go...@gmail.com> wrote:
> I'm attempting to run the
> org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob on
> AWS EMR.
>
> I can see that things are working by looking at the task logs.
> However, after it runs for about 10 minutes, it dies. The only log
> file is stdout, and it's empty.
>
> Does this look right -- using the ruby client:
>
> ./elastic-mapreduce -j JOB_ID --jar
> s3n://mm.lib/mahout-core-0.6-job.jar --main-class
> org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob
> --arg --input --arg s3n://mm.input-data/data.csv --arg --output --arg
> s3n://mm.output-data/ --arg --tempDir --arg tempDir4 --access-id
> ACCESS_KEY --private-key PRIVATE_KEY
>
> One question... should the S3 output directory already exist?
>
> - Matt
>
> On Sat, Aug 4, 2012 at 3:18 PM, Matt Mitchell <go...@gmail.com> wrote:
>> Thanks :) Of course, I found this as soon as I posted!
>>
>> https://cwiki.apache.org/MAHOUT/mahout-on-elastic-mapreduce.html
>>
>> - Matt
>>
>> On Sat, Aug 4, 2012 at 2:34 PM, Sebastian Schelter <ss...@apache.org> wrote:
>>> It's pretty simple: upload the Mahout jar and your data to S3 and click
>>> together a custom MapReduce step pointing to the ItemSimilarityJob class.
>>> On 04.08.2012 20:29, "Matt Mitchell" <go...@gmail.com> wrote:
>>>
>>>> Hi,
>>>>
>>>> I'm digging around trying to find info on running mahout on AWS's
>>>> Elastic Map Reduce. Anyone know of a step-by-step article/tutorial?
>>>> I'm interested in running "itemsimilarity", "recommenditembased" and
>>>> "recommendfactorized".
>>>>
>>>> Thanks!
>>>>
>>>> - Matt
>>>>

Re: mahout on aws elastic map reduce

Posted by Matt Mitchell <go...@gmail.com>.
I'm attempting to run the
org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob on
AWS EMR.

I can see that things are working by looking at the task logs.
However, after it runs for about 10 minutes, it dies. The only log
file is stdout, and it's empty.

Does this look right? I'm using the Ruby client:

./elastic-mapreduce -j JOB_ID \
  --jar s3n://mm.lib/mahout-core-0.6-job.jar \
  --main-class org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob \
  --arg --input --arg s3n://mm.input-data/data.csv \
  --arg --output --arg s3n://mm.output-data/ \
  --arg --tempDir --arg tempDir4 \
  --access-id ACCESS_KEY --private-key PRIVATE_KEY
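
The command above passes each job argument to the Ruby client behind its
own --arg flag, which is easy to get wrong by hand. As a rough sketch (the
helper names are hypothetical, and nothing here talks to AWS), the
expansion could be generated like this:

```python
import shlex


def to_cli_args(job_args):
    """Prefix every job argument with --arg, as the command above does."""
    cli = []
    for value in job_args:
        cli.extend(["--arg", value])
    return cli


def build_command(job_id, jar, main_class, job_args):
    """Assemble the elastic-mapreduce invocation as a list of tokens."""
    return (["./elastic-mapreduce", "-j", job_id,
             "--jar", jar, "--main-class", main_class]
            + to_cli_args(job_args))


def as_shell(tokens):
    """Join tokens into one shell-safe line for copy and paste."""
    return " ".join(shlex.quote(t) for t in tokens)
```

The credentials flags (--access-id, --private-key) are left out on
purpose; they would be appended the same way as in the original command.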

One question... should the S3 output directory already exist?

- Matt
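
On the output directory question: Hadoop's FileOutputFormat refuses to
start a job whose output directory already exists, so pointing each run
at a path that does not exist yet is usually the safer choice. Whether
that check behaves the same for a bare bucket root such as
s3n://mm.output-data/ is not settled in this thread. A minimal sketch of
generating a fresh prefix per run follows; the function name and the path
layout are assumptions made for illustration.

```python
from datetime import datetime, timezone


def fresh_output_uri(bucket, job_name, now=None):
    """Build an s3n:// output URI that is unique per run via a UTC timestamp."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y%m%dT%H%M%SZ")
    return "s3n://{}/{}/{}/".format(bucket, job_name, stamp)
```

The returned URI would then replace the bucket root as the value passed
after --output.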

Re: mahout on aws elastic map reduce

Posted by Matt Mitchell <go...@gmail.com>.
Thanks :) Of course, I found this as soon as I posted!

https://cwiki.apache.org/MAHOUT/mahout-on-elastic-mapreduce.html

- Matt

Re: mahout on aws elastic map reduce

Posted by Sebastian Schelter <ss...@apache.org>.
It's pretty simple: upload the Mahout jar and your data to S3 and click
together a custom MapReduce step pointing to the ItemSimilarityJob class.
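
For anyone scripting this rather than clicking through the console, the
custom step described above boils down to a jar location, a main class and
an argument list. The sketch below only builds such a description as a
plain dict shaped like the StepConfig structure of the EMR AddJobFlowSteps
API; it submits nothing. The function name, the step name, the
--similarityClassname option and its example value are assumptions, not
something stated in this thread.

```python
MAIN_CLASS = "org.apache.mahout.cf.taste.hadoop.similarity.item.ItemSimilarityJob"


def item_similarity_step(jar_uri, input_uri, output_uri, similarity,
                         temp_dir="temp"):
    """Describe a custom jar step that runs Mahout's ItemSimilarityJob."""
    return {
        "Name": "Mahout ItemSimilarityJob",      # free-form label (assumed)
        "ActionOnFailure": "CANCEL_AND_WAIT",    # keep the cluster for debugging
        "HadoopJarStep": {
            "Jar": jar_uri,                      # the uploaded Mahout job jar
            "MainClass": MAIN_CLASS,
            "Args": [
                "--input", input_uri,
                "--output", output_uri,
                "--tempDir", temp_dir,
                # Assumed option name and value for the similarity measure.
                "--similarityClassname", similarity,
            ],
        },
    }
```

A call such as item_similarity_step("s3n://mm.lib/mahout-core-0.6-job.jar",
"s3n://mm.input-data/data.csv", "s3n://mm.output-data/run1/",
"SIMILARITY_COOCCURRENCE") yields a dict that could be handed to whichever
EMR client is in use.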