Posted to user@mahout.apache.org by Otis Gospodnetic <ot...@yahoo.com> on 2009/04/01 05:32:25 UTC

Re: mahout for news recommendation?

It's the former.  Taste is still not parallelized, but other parts of Mahout are, and they make use of Hadoop.

 Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



----- Original Message ----
> From: Vinicius Carvalho <vi...@gmail.com>
> To: mahout-user@lucene.apache.org
> Sent: Tuesday, March 31, 2009 12:57:35 PM
> Subject: Re: mahout for news recommendation?
> 
> "Just to clarify a little bit, the CF part of Mahout is intended for real
> time, while the other parts (clustering, classification) are batch."
> 
> Sorry to just bump in the discussion. I've started with Taste a few months
> ago to use in my MD project. I've found mahout and I'm still studying hadoop
> first.
> 
> My question is: So the item recommender inside mahout runs on a single node?
> Or does it use the map-reduce features from hadoop?
> 
> Sorry for the dumb question.
> 
> Regards
> 
> On Tue, Mar 31, 2009 at 1:32 PM, Tim Bass wrote:
> 
> > Most prior-work in news related classification has been done with
> > Bayesian classifiers / networks.
> >
> > I kindly suggest that if you are interested in processing RSS, you use
> > Bayesian classifiers as your core.
> >
> >
> >
> > On Tue, Mar 31, 2009 at 10:39 PM, Jason Rennie wrote:
> > > Sorry for my misunderstanding.  Thanks for the clarification!
> > >
> > > Jason
> > >
> > > On Tue, Mar 31, 2009 at 10:22 AM, Grant Ingersoll 
> > >wrote:
> > >
> > >>
> > >> On Mar 31, 2009, at 9:47 AM, Jason Rennie wrote:
> > >>
> > >>
> > >>> Note that if you want the system to exhibit real-time feedback, Mahout
> > may
> > >>> not be what you want since it is intended for batch-processing, IIUC.
> > >>>
> > >>>
> > >> Just to clarify a little bit, the CF part of Mahout is intended for real
> > >> time, while the other parts (clustering, classification) are batch.
> > >>
> > >
> > >
> > >
> > > --
> > > Jason Rennie
> > > Research Scientist, ITA Software
> > > http://www.itasoftware.com/
> > >
> >
> 
> 
> 
> -- 
> The intuitive mind is a sacred gift and the
> rational mind is a faithful servant. We have
> created a society that honors the servant and
> has forgotten the gift.


Re: mahout for news recommendation?

Posted by Joshua Bronson <ja...@gmail.com>.
On Thu, Apr 2, 2009 at 7:11 AM, Sean Owen <sr...@gmail.com> wrote:

> On Thu, Apr 2, 2009 at 3:58 AM, Joshua Bronson <ja...@gmail.com>
> wrote:
> > ...I assumed the instructions had left out the step of running "svn
> > checkout http://svn.apache.org/repos/asf/lucene/mahout/trunk/". Was this
> > assumption incorrect?
>
> Well you do need some copy of the Mahout distro, whether from SVN or
> a tarball. I imagine the latter is actually more common.


Where would you get a tarball, by the way? There is none linked to from
http://lucene.apache.org/mahout/releases.html. As for getting it from SVN,
the "Version Control" link under "Resources" in the sidebar of
http://lucene.apache.org/mahout/ points to a ViewVC instance (
http://svn.apache.org/viewvc/lucene/mahout/). I had to dig around a tiny bit
to find http://svn.apache.org/repos/asf/lucene/mahout/trunk/.


> The instructions are indeed silent on this and assume you start with a copy of
> the distro from some source.
>

I think it would be helpful to include this step in the demo explicitly.


> > I did have to "mkdir
> > -p trunk/taste-web/src/main/resources/org/apache/mahout/cf/taste/example/grouplens"
> > before I could copy the .dat files there as
> > the trunk/taste-web/src/main/resources directory of the checkout doesn't
> > contain anything in it. Did I go off on the wrong track?
>
> Possible, the locations have been moving about and not sure those
> changes are in sync with my brain or the documentation. This sounds
> right, and I guess it works? then we should make the directory in SVN.
>

+1.
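
For anyone following along, the directory step discussed above might look
roughly like this. It is only a sketch: MAHOUT_HOME and DATA_DIR are
placeholder variables (assumptions, not names from the docs), and the loop
simply copies whatever .dat files happen to be in DATA_DIR.

```shell
# Sketch only: MAHOUT_HOME and DATA_DIR are placeholder locations, not paths from the docs.
MAHOUT_HOME="${MAHOUT_HOME:-$PWD/trunk}"
DATA_DIR="${DATA_DIR:-$PWD/grouplens-data}"
TARGET="$MAHOUT_HOME/taste-web/src/main/resources/org/apache/mahout/cf/taste/example/grouplens"

# Create the whole directory chain; -p makes this a no-op if it already exists.
mkdir -p "$TARGET"

# Copy any .dat files that are present; do nothing if there are none yet.
for f in "$DATA_DIR"/*.dat; do
  if [ -e "$f" ]; then
    cp "$f" "$TARGET"/
  fi
done
```

Using mkdir -p means the step is safe to repeat once the directory is
eventually added to SVN.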

Re: mahout for news recommendation?

Posted by Sean Owen <sr...@gmail.com>.
On Thu, Apr 2, 2009 at 3:58 AM, Joshua Bronson <ja...@gmail.com> wrote:
> ...I assumed the instructions had left out the step of running "svn
> checkout http://svn.apache.org/repos/asf/lucene/mahout/trunk/". Was this
> assumption incorrect?

Well, you do need some copy of the Mahout distro, whether from SVN or a
tarball. I imagine the latter is actually more common. The instructions
are indeed silent on this and assume you start with a copy of the distro
from some source.

>
> I did have to "mkdir
> -p trunk/taste-web/src/main/resources/org/apache/mahout/cf/taste/example/grouplens"
> before I could copy the .dat files there as
> the trunk/taste-web/src/main/resources directory of the checkout doesn't
> contain anything in it. Did I go off on the wrong track?

Possibly; the locations have been moving about and I'm not sure those
changes are in sync with my brain or the documentation. This sounds
right, and I guess it works? Then we should make the directory in SVN.

Re: mahout for news recommendation?

Posted by Sean Owen <sr...@gmail.com>.
> Great, thanks! I forgot to mention that the Requirements section lists J2SE
> 5.0, but actually 6.0 is required (BayesClassifier.java uses a
> java.util.Deque, for instance).

Yeah, while the CF part doesn't use Java 6 classes, it does use
@Override on interface methods now. It really should say Java 6 now.


> For sure. On that note, http://lucene.apache.org/mahout/releases.html should
> also be updated to link to some tarballs or something no?

Yes once there is a first release, indeed.


> - After obtaining a compatible version of mvn (2.0.10), I had to
> "export JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Versions/1.6/Home"
> to get mvn -version to report "Java version: 1.6.0_07" rather than 1.5.0_16.
> The mvn install step will fail if it's not using java 1.6.

I feel like this is controlled on Macs by a little Java utility under
Applications > Utilities. Same effect. Indeed this needs to be
mentioned along with the Maven note.

Re: mahout for news recommendation?

Posted by Joshua Bronson <ja...@gmail.com>.
On Fri, Apr 3, 2009 at 1:25 AM, Sean Owen <sr...@gmail.com> wrote:

> I can take care of it shortly when I'm back.


Great, thanks! I forgot to mention that the Requirements section lists J2SE
5.0, but actually 6.0 is required (BayesClassifier.java uses a
java.util.Deque, for instance).


> The only tweak I'd make is that 'step 0' is just to obtain the distro,
> which is not necessarily done via SVN and in fact won't be done that way for
> most people.


For sure. On that note, http://lucene.apache.org/mahout/releases.html should
also be updated to link to some tarballs or something, no?


> I also see that my 'how to roll your own server' section is now out of date
> too and needs fixing.
>


A couple other notes on how I got things set up on my Mac:

- After obtaining a compatible version of mvn (2.0.10), I had to
"export JAVA_HOME=/System/Library/Frameworks/JavaVM.framework/Versions/1.6/Home"
to get mvn -version to report "Java version: 1.6.0_07" rather than 1.5.0_16.
The mvn install step will fail if it's not using java 1.6.

- To get mahout into Eclipse, I ran "mvn eclipse:eclipse" from the root of
the mahout distro, then in Eclipse I went to File > Import..., chose
"Existing Projects into Workspace", pointed it to the mahout distro, and
everything imported beautifully. I also had to change the JVM Eclipse was
using from 1.5 to 1.6 in Eclipse > Preferences > Java > Installed JREs.

Hope this helps someone.
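
The JAVA_HOME step above could be sketched as follows. The path is the one
quoted in this message and will only exist on a Mac with Apple's Java 6
installed, so the snippet checks for it first. maven_java_version is a
made-up helper name, and it assumes the "Java version: ..." line format
shown in this thread.

```shell
# Path taken from the message above; it only exists on Macs with Apple's Java 6 installed.
JAVA6_HOME="/System/Library/Frameworks/JavaVM.framework/Versions/1.6/Home"
if [ -d "$JAVA6_HOME" ]; then
  export JAVA_HOME="$JAVA6_HOME"
fi

# Hypothetical helper: read "mvn -version" style output on stdin and
# print just the version from its "Java version: ..." line.
maven_java_version() {
  sed -n 's/^Java version: *\([0-9][0-9._]*\).*/\1/p'
}

# Only ask Maven if it is actually installed on this machine.
if command -v mvn >/dev/null 2>&1; then
  mvn -version | maven_java_version
fi
```

If the printed version starts with 1.5, Maven is still picking up the older
JVM and the install step is likely to fail as described.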

Re: mahout for news recommendation?

Posted by Matthew Runo <mr...@zappos.com>.
That's what I needed. Just a more complete "how to get started".

Thanks for your time!

Matthew Runo
Software Engineer, Zappos.com
mruno@zappos.com - 702-943-7833

On Apr 6, 2009, at 11:22 PM, Sean Owen wrote:

> I have updated the documentation per this thread, which will soon
> improve this page:
> http://lucene.apache.org/mahout/taste.html
>
> Is this what you're after? it has some walkthroughs and how-tos.  
> What else?
>
> On Fri, Apr 3, 2009 at 5:31 PM, Matthew Runo <mr...@zappos.com> wrote:
>> I'd love to have a better "Taste for dummies" getting started  
>> section. I too
>> am on OS X, and with this thread can finally compile Taste, but the
>> instructions are still vague (to me). I'd love to help, but I think
>> it'd be
>> /great/ to have something really easy to follow along with, ala..
>>
>> 1. Create a file in the format, userId,itemID. Call it "input.txt".
>> 2. Create a class.. <code for class that reads in input.txt>
>> 3. How to build mahout
>> 4. How to build taste
>> 5. instructions for running the class
>> 6....
>>
>> I think this would help a lot of people.
>>
>> Thanks for your time!
>>
>> Matthew Runo
>> Software Engineer, Zappos.com
>> mruno@zappos.com - 702-943-7833
>>
>> On Apr 2, 2009, at 10:25 PM, Sean Owen wrote:
>>
>>> I can take care of it shortly when I'm back. The only tweak I'd make
>>> is that 'step 0' is just to obtain the distro, which is not
>>> necessarily done via SVN and in fact won't be done that way for most
>>> people.
>>>
>>> I also see that my 'how to roll your own server' section is now  
>>> out of
>>> date too and needs fixing.
>>>
>>
>>
>


Re: mahout for news recommendation?

Posted by Sean Owen <sr...@gmail.com>.
I have updated the documentation per this thread, which will soon
improve this page:
http://lucene.apache.org/mahout/taste.html

Is this what you're after? It has some walkthroughs and how-tos. What else?

On Fri, Apr 3, 2009 at 5:31 PM, Matthew Runo <mr...@zappos.com> wrote:
> I'd love to have a better "Taste for dummies" getting started section. I too
> am on OS X, and with this thread can finally compile Taste, but the
> instructions are still vague (to me). I'd love to help, but I think it'd be
> /great/ to have something really easy to follow along with, ala..
>
> 1. Create a file in the format, userId,itemID. Call it "input.txt".
> 2. Create a class.. <code for class that reads in input.txt>
> 3. How to build mahout
> 4. How to build taste
> 5. instructions for running the class
> 6....
>
> I think this would help a lot of people.
>
> Thanks for your time!
>
> Matthew Runo
> Software Engineer, Zappos.com
> mruno@zappos.com - 702-943-7833
>
> On Apr 2, 2009, at 10:25 PM, Sean Owen wrote:
>
>> I can take care of it shortly when I'm back. The only tweak I'd make
>> is that 'step 0' is just to obtain the distro, which is not
>> necessarily done via SVN and in fact won't be done that way for most
>> people.
>>
>> I also see that my 'how to roll your own server' section is now out of
>> date too and needs fixing.
>>
>
>

Re: mahout for news recommendation?

Posted by Matthew Runo <mr...@zappos.com>.
I'd love to have a better "Taste for dummies" getting started section.
I too am on OS X, and with this thread can finally compile Taste, but
the instructions are still vague (to me). I'd love to help, but I
think it'd be /great/ to have something really easy to follow along
with, a la...

1. Create a file in the format, userId,itemID. Call it "input.txt".
2. Create a class.. <code for class that reads in input.txt>
3. How to build mahout
4. How to build taste
5. instructions for running the class
6....

I think this would help a lot of people.
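
As a starting point for step 1, something like the following could create
and sanity-check the file. This is only a sketch: the IDs are invented, and
the two-column layout is just the one described in the list above. Whether
Taste's file-based data model also expects a preference value column is
something to check against the Taste documentation.

```shell
# Made-up IDs, one "userID,itemID" pair per line, as described in step 1.
cat > input.txt <<'EOF'
1,101
1,102
2,101
2,103
3,102
EOF

# Sanity check: report any line that does not have exactly two comma-separated fields.
awk -F, 'NF != 2 { printf "line %d: expected 2 fields, got %d\n", NR, NF; bad = 1 }
         END { exit bad + 0 }' input.txt
```

The awk check prints nothing and exits successfully when every line has
two fields.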

Thanks for your time!

Matthew Runo
Software Engineer, Zappos.com
mruno@zappos.com - 702-943-7833

On Apr 2, 2009, at 10:25 PM, Sean Owen wrote:

> I can take care of it shortly when I'm back. The only tweak I'd make
> is that 'step 0' is just to obtain the distro, which is not
> necessarily done via SVN and in fact won't be done that way for most
> people.
>
> I also see that my 'how to roll your own server' section is now out of
> date too and needs fixing.
>


Re: mahout for news recommendation?

Posted by Sean Owen <sr...@gmail.com>.
I can take care of it shortly when I'm back. The only tweak I'd make
is that 'step 0' is just to obtain the distro, which is not
necessarily done via SVN and in fact won't be done that way for most
people.

I also see that my 'how to roll your own server' section is now out of
date too and needs fixing.

Re: mahout for news recommendation?

Posted by Joshua Bronson <ja...@gmail.com>.
On Thu, Apr 2, 2009 at 5:55 PM, Grant Ingersoll <gs...@apache.org> wrote:

>
> On Apr 2, 2009, at 4:05 PM, Joshua Bronson wrote:
>
>  On Thu, Apr 2, 2009 at 3:30 PM, Grant Ingersoll <gs...@apache.org>
>> wrote:
>>
>>
>>> On Apr 2, 2009, at 2:08 PM, Joshua Bronson wrote:
>>>
>>> The machine I'm having problems with the demo on is a Macbook with
>>>
>>>> Apple-distributed java tools:
>>>>
>>>> $ uname -a
>>>>
>>>> Darwin voodoo.openplans.org 9.6.0 Darwin Kernel Version 9.6.0: Mon Nov
>>>> 24
>>>> 17:37:00 PST 2008; root:xnu-1228.9.59~1/RELEASE_I386 i386 i386
>>>> MacBook2,1
>>>> Darwin
>>>>
>>>>
>>>>
>>>> $ for i in mvn java javac; do which $i; $i -version; echo; done
>>>> /usr/bin/mvn
>>>> Maven version: 2.0.6
>>>>
>>>>
>>> Please upgrade to 2.0.9 or later (2.0.10 is the latest).   By all
>>> accounts
>>> 2.0.6 is a real dog:
>>>
>>> http://www.lucidimagination.com/search/document/3e7dfef6281482dd/packaging_step_taking_forever_is_this_right
>>>
>>>
>> Actually there's a later version than 2.0.10: 2.1.0, which I tried and
>> which
>> also seems not to work with the "mvn package" step. I switched to 2.0.10
>> and
>> it worked. Have a look at http://paste.pocoo.org/show/110778/ if you get
>> a
>> chance.
>>
>
> Arg, so much for back compatibility.
>
> Let's put it this way, I use 2.0.9 and 2.0.10 and that's what I used when
> writing the Maven stuff.
>
>
>>
>> Now that I've got the demo running on my own machine, I'm looking forward
>> to
>> working with taste further. Will report back with my results.
>>
>
>
> Very cool.  Sorry for the intermittent pain!


No worries. http://lucene.apache.org/mahout/taste.html should be updated
though to save others from similar pain in the future (especially
considering that mvn 2.0.6 is the version that ships with Mac OS 10.5). To
summarize:

- "mvn 2.0.9 or 2.0.10" should be added to the Requirements list
- A "step 0" should be added for running "svn co
http://svn.apache.org/repos/asf/lucene/mahout/trunk/"
- someone with commit access should svn mkdir all the missing
directories under
trunk/taste-web/src/main/resources/org/apache/mahout/cf/taste/example/grouplens.

I'd be happy to help with this.

Josh
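
The Maven requirement summarized above could be expressed as a small check
like this. It is a sketch only: maven_version_ok is a hypothetical helper,
and the accepted versions are just the ones reported as working in this
thread (2.0.9 and 2.0.10), not an official compatibility list.

```shell
# Hypothetical helper: succeed only for the Maven versions reported as working in this thread.
maven_version_ok() {
  case "$1" in
    2.0.9|2.0.10) return 0 ;;
    *) return 1 ;;
  esac
}

# Walk through the versions mentioned in the thread.
for v in 2.0.6 2.0.9 2.0.10 2.1.0; do
  if maven_version_ok "$v"; then
    echo "$v: reported to work"
  else
    echo "$v: reported as problematic in this thread"
  fi
done
```

A whitelist is used rather than a "2.0.9 or later" comparison because 2.1.0
was also reported to fail at the "mvn package" step.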

Re: mahout for news recommendation?

Posted by Grant Ingersoll <gs...@apache.org>.
On Apr 2, 2009, at 4:05 PM, Joshua Bronson wrote:

> On Thu, Apr 2, 2009 at 3:30 PM, Grant Ingersoll  
> <gs...@apache.org> wrote:
>
>>
>> On Apr 2, 2009, at 2:08 PM, Joshua Bronson wrote:
>>
>> The machine I'm having problems with the demo on is a Macbook with
>>> Apple-distributed java tools:
>>>
>>> $ uname -a
>>>
>>> Darwin voodoo.openplans.org 9.6.0 Darwin Kernel Version 9.6.0: Mon  
>>> Nov 24
>>> 17:37:00 PST 2008; root:xnu-1228.9.59~1/RELEASE_I386 i386 i386  
>>> MacBook2,1
>>> Darwin
>>>
>>>
>>>
>>> $ for i in mvn java javac; do which $i; $i -version; echo; done
>>> /usr/bin/mvn
>>> Maven version: 2.0.6
>>>
>>
>> Please upgrade to 2.0.9 or later (2.0.10 is the latest).   By all  
>> accounts
>> 2.0.6 is a real dog:
>> http://www.lucidimagination.com/search/document/3e7dfef6281482dd/packaging_step_taking_forever_is_this_right
>>
>
> Actually there's a later version than 2.0.10: 2.1.0, which I tried  
> and which
> also seems not to work with the "mvn package" step. I switched to  
> 2.0.10 and
> it worked. Have a look at http://paste.pocoo.org/show/110778/ if you  
> get a
> chance.

Arg, so much for back compatibility.

Let's put it this way, I use 2.0.9 and 2.0.10 and that's what I used  
when writing the Maven stuff.

>
>
> Now that I've got the demo running on my own machine, I'm looking  
> forward to
> working with taste further. Will report back with my results.


Very cool.  Sorry for the intermittent pain!


--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:
http://www.lucidimagination.com/search


Re: mahout for news recommendation?

Posted by Joshua Bronson <ja...@gmail.com>.
On Thu, Apr 2, 2009 at 3:30 PM, Grant Ingersoll <gs...@apache.org> wrote:

>
> On Apr 2, 2009, at 2:08 PM, Joshua Bronson wrote:
>
>  The machine I'm having problems with the demo on is a Macbook with
>> Apple-distributed java tools:
>>
>> $ uname -a
>>
>> Darwin voodoo.openplans.org 9.6.0 Darwin Kernel Version 9.6.0: Mon Nov 24
>> 17:37:00 PST 2008; root:xnu-1228.9.59~1/RELEASE_I386 i386 i386 MacBook2,1
>> Darwin
>>
>>
>>
>> $ for i in mvn java javac; do which $i; $i -version; echo; done
>> /usr/bin/mvn
>> Maven version: 2.0.6
>>
>
> Please upgrade to 2.0.9 or later (2.0.10 is the latest).   By all accounts
> 2.0.6 is a real dog:
> http://www.lucidimagination.com/search/document/3e7dfef6281482dd/packaging_step_taking_forever_is_this_right
>

Actually there's a later version than 2.0.10: 2.1.0, which I tried and which
also seems not to work with the "mvn package" step. I switched to 2.0.10 and
it worked. Have a look at http://paste.pocoo.org/show/110778/ if you get a
chance.

Now that I've got the demo running on my own machine, I'm looking forward to
working with taste further. Will report back with my results.

Re: mahout for news recommendation?

Posted by Grant Ingersoll <gs...@apache.org>.
On Apr 2, 2009, at 2:08 PM, Joshua Bronson wrote:

> The machine I'm having problems with the demo on is a Macbook with
> Apple-distributed java tools:
>
> $ uname -a
>
> Darwin voodoo.openplans.org 9.6.0 Darwin Kernel Version 9.6.0: Mon  
> Nov 24
> 17:37:00 PST 2008; root:xnu-1228.9.59~1/RELEASE_I386 i386 i386  
> MacBook2,1
> Darwin
>
>
>
> $ for i in mvn java javac; do which $i; $i -version; echo; done
> /usr/bin/mvn
> Maven version: 2.0.6

Please upgrade to 2.0.9 or later (2.0.10 is the latest).   By all  
accounts 2.0.6 is a real dog: http://www.lucidimagination.com/search/document/3e7dfef6281482dd/packaging_step_taking_forever_is_this_right

-Grant

Re: mahout for news recommendation?

Posted by Joshua Bronson <ja...@gmail.com>.
The machine I'm having problems with the demo on is a Macbook with
Apple-distributed java tools:

$ uname -a

Darwin voodoo.openplans.org 9.6.0 Darwin Kernel Version 9.6.0: Mon Nov 24
17:37:00 PST 2008; root:xnu-1228.9.59~1/RELEASE_I386 i386 i386 MacBook2,1
Darwin



$ for i in mvn java javac; do which $i; $i -version; echo; done
/usr/bin/mvn
Maven version: 2.0.6

/usr/bin/java
java version "1.6.0_07"
Java(TM) SE Runtime Environment (build 1.6.0_07-b06-153)
Java HotSpot(TM) 64-Bit Server VM (build 1.6.0_07-b06-57, mixed mode)

/usr/bin/javac
javac 1.6.0_07




I retried the demo on a Gentoo Linux virtual machine, however, and all went
well. Here is the info on that machine:

$ uname -a
Linux dev.melkjug.org 2.6.21-xen #1 SMP Tue May 20 03:08:24 EDT 2008 x86_64
Intel(R) Xeon(R) CPU E5430 @ 2.66GHz GenuineIntel GNU/Linux


$ for i in mvn java javac; do which $i; $i -version; echo; done
/usr/bin/mvn
Maven version: 2.0.9
Java version: 1.6.0_11
OS name: "linux" version: "2.6.21-xen" arch: "amd64" Family: "unix"

/usr/bin/java
java version "1.6.0_11"
Java(TM) SE Runtime Environment (build 1.6.0_11-b03)
Java HotSpot(TM) 64-Bit Server VM (build 11.0-b16, mixed mode)

/usr/bin/javac
javac 1.6.0_11



Note the difference in the Maven versions. Is the demo not compatible
with Maven <= 2.0.6?


On Thu, Apr 2, 2009 at 8:51 AM, Grant Ingersoll <gs...@apache.org> wrote:

> Hmm, when I check out clean and do install it works fine.
>
> What platform are you on?
>
>
>
> On Apr 2, 2009, at 8:12 AM, Grant Ingersoll wrote:
>
>  What version of Maven do you have?
>>
>> On Apr 1, 2009, at 4:27 PM, Joshua Bronson wrote:
>>
>>  You mean you're supposed to do step 4 *before* step 8?!? ;p
>>> I did run mvn install, and though I got a bunch of warnings like the
>>> following:
>>>
>>> [WARNING] Entry:
>>>
>>>>
>>>> mahout-0.2-SNAPSHOT/usr/local/melk/mahout/core/src/main/java/org/apache/mahout/cf/taste/impl/common/
>>>> longer than 100 characters.
>>>>
>>>>
>>> after a couple hours it said it completed successfully:
>>>
>>> [INFO]
>>>
>>>> ------------------------------------------------------------------------
>>>>
>>>>  [INFO] Reactor Summary:
>>>
>>>>
>>>>  [INFO]
>>>
>>>> ------------------------------------------------------------------------
>>>>
>>>>  [INFO] Mahout core ........................................... SUCCESS
>>>
>>>> [8:46.665s]
>>>>
>>>>  [INFO] Mahout Taste Webapp ................................... SUCCESS
>>>
>>>> [55.496s]
>>>>
>>>>  [INFO] Mahout examples ....................................... SUCCESS
>>>
>>>> [55.317s]
>>>>
>>>>  [INFO] Apache Lucene Mahout .................................. SUCCESS
>>>
>>>> [2:02:03.392s]
>>>>
>>>>  [INFO]
>>>
>>>> ------------------------------------------------------------------------
>>>>
>>>>  [INFO]
>>>
>>>> ------------------------------------------------------------------------
>>>>
>>>>  [INFO] BUILD SUCCESSFUL
>>>
>>>>
>>>>  [INFO]
>>>
>>>> ------------------------------------------------------------------------
>>>>
>>>>  [INFO] Total time: 132 minutes 41 seconds
>>>
>>>>
>>>>  [INFO] Finished at: Wed Apr 01 00:59:27 EDT 2009
>>>
>>>>
>>>>  [INFO] Final Memory: 61M/80M
>>>
>>>>
>>>>  [INFO]
>>>
>>>> ------------------------------------------------------------------------
>>>>
>>>>
>>>
>>> So I proceeded through steps 5, 6, and 7, and then step 8's "mvn package"
>>> command failed with the output I linked to.
>>>
>>> Just for the heck of it I tried "mvn install" again (from the top-level
>>> directory) and after getting a bunch of the "longer-than-100-characters"
>>> warnings again, this time after 7 minutes it failed with:
>>>
>>> [ERROR] BUILD ERROR
>>>
>>>>
>>>>  [INFO]
>>>
>>>> ------------------------------------------------------------------------
>>>>
>>>>  [INFO] Failed to create assembly: Error creating assembly archive
>>> project: A
>>>
>>>> tar file cannot include itself.
>>>>
>>>>
>>>
>>> I posted the full transcript of my console session at
>>> http://melkjug.org/_static/grouplens-install-log.txt. Seems like
>>> something
>>> funky's going on with tar, but I'm not sure what.
>>>
>>>
>>> On Wed, Apr 1, 2009 at 12:11 PM, Grant Ingersoll <gsingers@apache.org
>>> >wrote:
>>>
>>>  Do a "mvn install" from the top level directory first:
>>>> http://lucene.apache.org/mahout/taste.html#demo
>>>>
>>>> HTH,
>>>> Grant
>>>>
>>>>
>>>> On Apr 1, 2009, at 11:35 AM, Joshua Bronson wrote:
>>>>
>>>> Thanks all for the good info. Taste definitely sounds like a promising
>>>>
>>>>> direction for us to go in for our recommendation service.
>>>>> I'm working through the installation of the GroupLens demo, but the mvn
>>>>> package step is failing with the output at
>>>>> http://paste.pocoo.org/show/110618/. Haven't looked into this yet,
>>>>> just
>>>>> thought I'd post to the list first with my progress. If anyone else
>>>>> uses
>>>>> IRC, I've created (and am currently the only one in) the #mahout
>>>>> channel
>>>>> on
>>>>> freenode. Hope to see some of you in there!
>>>>>
>>>>> Josh
>>>>>
>>>>> On Wed, Apr 1, 2009 at 5:48 AM, Sean Owen <sr...@gmail.com> wrote:
>>>>>
>>>>> Couple clarifications -
>>>>>
>>>>>>
>>>>>> The CF components are oriented to on-line, real-time use, though of
>>>>>> course
>>>>>> one can trivially build a batch job out of that. That is what I did
>>>>>> with
>>>>>> the
>>>>>> EC2 image that cranks out recommendations for all users.
>>>>>>
>>>>>> The CF component is also already parallelized as much as is practical.
>>>>>> There
>>>>>> are already Hadoop jobs for parallel, batch operation.
>>>>>>
>>>>>> Finally if you have some external notion of item similarity, like text
>>>>>> similarity between articles, you can and should include this info by
>>>>>> creating an ItemSimilarity with this knowledge. In that case you want
>>>>>> to
>>>>>> use
>>>>>> an item-based recommender, since it is only in such a case that
>>>>>> item-based
>>>>>> recommenders have a distinct advantage.
>>>>>>
>>>>>> On Apr 1, 2009 10:32 AM, "Otis Gospodnetic" <
>>>>>> otis_gospodnetic@yahoo.com>
>>>>>> wrote:
>>>>>>
>>>>>>
>>>>>> it's the former.  Taste is still not parallelized, but other parts of
>>>>>> Mahout
>>>>>> are, and they make use of Hadoop.
>>>>>>
>>>>>>
>>>>>>  --------------------------
>>>> Grant Ingersoll
>>>> http://www.lucidimagination.com/
>>>>
>>>> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using
>>>> Solr/Lucene:
>>>> http://www.lucidimagination.com/search
>>>>
>>>>
>>>>
>> --------------------------
>> Grant Ingersoll
>> http://www.lucidimagination.com/
>>
>> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using
>> Solr/Lucene:
>> http://www.lucidimagination.com/search
>>
>>
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com/
>
> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using
> Solr/Lucene:
> http://www.lucidimagination.com/search
>
>

Re: mahout for news recommendation?

Posted by Grant Ingersoll <gs...@apache.org>.
Hmm, when I check out clean and do install it works fine.

What platform are you on?


On Apr 2, 2009, at 8:12 AM, Grant Ingersoll wrote:

> What version of Maven do you have?
>
> On Apr 1, 2009, at 4:27 PM, Joshua Bronson wrote:
>
>> You mean you're supposed to do step 4 *before* step 8?!? ;p
>> I did run mvn install, and though I got a bunch of warnings like the
>> following:
>>
>> [WARNING] Entry:
>>> mahout-0.2-SNAPSHOT/usr/local/melk/mahout/core/src/main/java/org/ 
>>> apache/mahout/cf/taste/impl/common/
>>> longer than 100 characters.
>>>
>>
>> after a couple hours it said it completed successfully:
>>
>> [INFO]
>>> ------------------------------------------------------------------------
>>>
>> [INFO] Reactor Summary:
>>>
>> [INFO]
>>> ------------------------------------------------------------------------
>>>
>> [INFO] Mahout core ...........................................  
>> SUCCESS
>>> [8:46.665s]
>>>
>> [INFO] Mahout Taste Webapp ...................................  
>> SUCCESS
>>> [55.496s]
>>>
>> [INFO] Mahout examples .......................................  
>> SUCCESS
>>> [55.317s]
>>>
>> [INFO] Apache Lucene Mahout ..................................  
>> SUCCESS
>>> [2:02:03.392s]
>>>
>> [INFO]
>>> ------------------------------------------------------------------------
>>>
>> [INFO]
>>> ------------------------------------------------------------------------
>>>
>> [INFO] BUILD SUCCESSFUL
>>>
>> [INFO]
>>> ------------------------------------------------------------------------
>>>
>> [INFO] Total time: 132 minutes 41 seconds
>>>
>> [INFO] Finished at: Wed Apr 01 00:59:27 EDT 2009
>>>
>> [INFO] Final Memory: 61M/80M
>>>
>> [INFO]
>>> ------------------------------------------------------------------------
>>>
>>
>>
>> So I proceeded through steps 5, 6, and 7, and then step 8's "mvn  
>> package"
>> command failed with the output I linked to.
>>
>> Just for the heck of it I tried "mvn install" again (from the top- 
>> level
>> directory) and after getting a bunch of the "longer-than-100- 
>> characters"
>> warnings again, this time after 7 minutes it failed with:
>>
>> [ERROR] BUILD ERROR
>>>
>> [INFO]
>>> ------------------------------------------------------------------------
>>>
>> [INFO] Failed to create assembly: Error creating assembly archive  
>> project: A
>>> tar file cannot include itself.
>>>
>>
>>
>> I posted the full transcript of my console session at
>> http://melkjug.org/_static/grouplens-install-log.txt. Seems like  
>> something
>> funky's going on with tar, but I'm not sure what.
>>
>>
>> On Wed, Apr 1, 2009 at 12:11 PM, Grant Ingersoll  
>> <gs...@apache.org>wrote:
>>
>>> Do a "mvn install" from the top level directory first:
>>> http://lucene.apache.org/mahout/taste.html#demo
>>>
>>> HTH,
>>> Grant
>>>
>>>
>>> On Apr 1, 2009, at 11:35 AM, Joshua Bronson wrote:
>>>
>>> Thanks all for the good info. Taste definitely sounds like a  
>>> promising
>>>> direction for us to go in for our recommendation service.
>>>> I'm working through the installation of the GroupLens demo, but  
>>>> the mvn
>>>> package step is failing with the output at
>>>> http://paste.pocoo.org/show/110618/. Haven't looked into this  
>>>> yet, just
>>>> thought I'd post to the list first with my progress. If anyone  
>>>> else uses
>>>> IRC, I've created (and am currently the only one in) the #mahout  
>>>> channel
>>>> on
>>>> freenode. Hope to see some of you in there!
>>>>
>>>> Josh
>>>>
>>>> On Wed, Apr 1, 2009 at 5:48 AM, Sean Owen <sr...@gmail.com> wrote:
>>>>
>>>> Couple clarifications -
>>>>>
>>>>> The CF components are oriented to on-line, real-time use, though  
>>>>> of
>>>>> course
>>>>> one can trivially build a batch job out of that. That is what I  
>>>>> did with
>>>>> the
>>>>> EC2 image that cranks out recommendations for all users.
>>>>>
>>>>> The CF component is also already parallelized as much as is  
>>>>> practical.
>>>>> There
>>>>> are already Hadoop jobs for parallel, batch operation.
>>>>>
>>>>> Finally if you have some external notion of item similarity,  
>>>>> like text
>>>>> similarity between articles, you can and should include this  
>>>>> info by
>>>>> creating an ItemSimilarity with this knowledge. In that case you  
>>>>> want to
>>>>> use
>>>>> an item-based recommender, since it is only in such a case that
>>>>> item-based
>>>>> recommenders have a distinct advantage.
>>>>>
>>>>> On Apr 1, 2009 10:32 AM, "Otis Gospodnetic" <otis_gospodnetic@yahoo.com 
>>>>> >
>>>>> wrote:
>>>>>
>>>>>
>>>>> it's the former.  Taste is still not parallelized, but other  
>>>>> parts of
>>>>> Mahout
>>>>> are, and they make use of Hadoop.
>>>>>
>>>>>
>>> --------------------------
>>> Grant Ingersoll
>>> http://www.lucidimagination.com/
>>>
>>> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
>>> using
>>> Solr/Lucene:
>>> http://www.lucidimagination.com/search
>>>
>>>
>
> --------------------------
> Grant Ingersoll
> http://www.lucidimagination.com/
>
> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
> using Solr/Lucene:
> http://www.lucidimagination.com/search
>

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:
http://www.lucidimagination.com/search


Re: mahout for news recommendation?

Posted by Grant Ingersoll <gs...@apache.org>.
What version of Maven do you have?

On Apr 1, 2009, at 4:27 PM, Joshua Bronson wrote:

> You mean you're supposed to do step 4 *before* step 8?!? ;p
> I did run mvn install, and though I got a bunch of warnings like the
> following:
>
> [WARNING] Entry:
>> mahout-0.2-SNAPSHOT/usr/local/melk/mahout/core/src/main/java/org/ 
>> apache/mahout/cf/taste/impl/common/
>> longer than 100 characters.
>>
>
> after a couple hours it said it completed successfully:
>
> [INFO] ------------------------------------------------------------------------
> [INFO] Reactor Summary:
> [INFO] ------------------------------------------------------------------------
> [INFO] Mahout core ........................................... SUCCESS [8:46.665s]
> [INFO] Mahout Taste Webapp ................................... SUCCESS [55.496s]
> [INFO] Mahout examples ....................................... SUCCESS [55.317s]
> [INFO] Apache Lucene Mahout .................................. SUCCESS [2:02:03.392s]
> [INFO] ------------------------------------------------------------------------
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESSFUL
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 132 minutes 41 seconds
> [INFO] Finished at: Wed Apr 01 00:59:27 EDT 2009
> [INFO] Final Memory: 61M/80M
> [INFO] ------------------------------------------------------------------------
>
>
> So I proceeded through steps 5, 6, and 7, and then step 8's "mvn package"
> command failed with the output I linked to.
>
> Just for the heck of it I tried "mvn install" again (from the top-level
> directory) and after getting a bunch of the "longer-than-100-characters"
> warnings again, this time after 7 minutes it failed with:
>
> [ERROR] BUILD ERROR
> [INFO] ------------------------------------------------------------------------
> [INFO] Failed to create assembly: Error creating assembly archive project: A tar file cannot include itself.
>
>
> I posted the full transcript of my console session at
> http://melkjug.org/_static/grouplens-install-log.txt. Seems like something
> funky's going on with tar, but I'm not sure what.
>
>
> On Wed, Apr 1, 2009 at 12:11 PM, Grant Ingersoll <gs...@apache.org> wrote:
>
>> Do a "mvn install" from the top level directory first:
>> http://lucene.apache.org/mahout/taste.html#demo
>>
>> HTH,
>> Grant
>>
>>
>> On Apr 1, 2009, at 11:35 AM, Joshua Bronson wrote:
>>
>>> Thanks all for the good info. Taste definitely sounds like a promising
>>> direction for us to go in for our recommendation service.
>>> I'm working through the installation of the GroupLens demo, but the mvn
>>> package step is failing with the output at
>>> http://paste.pocoo.org/show/110618/. Haven't looked into this yet, just
>>> thought I'd post to the list first with my progress. If anyone else uses
>>> IRC, I've created (and am currently the only one in) the #mahout channel on
>>> freenode. Hope to see some of you in there!
>>>
>>> Josh
>>>
>>> On Wed, Apr 1, 2009 at 5:48 AM, Sean Owen <sr...@gmail.com> wrote:
>>>
>>>> Couple clarifications -
>>>>
>>>> The CF components are oriented to on-line, real-time use, though of course
>>>> one can trivially build a batch job out of that. That is what I did with the
>>>> EC2 image that cranks out recommendations for all users.
>>>>
>>>> The CF component is also already parallelized as much as is practical. There
>>>> are already Hadoop jobs for parallel, batch operation.
>>>>
>>>> Finally if you have some external notion of item similarity, like text
>>>> similarity between articles, you can and should include this info by
>>>> creating an ItemSimilarity with this knowledge. In that case you want to use
>>>> an item-based recommender, since it is only in such a case that item-based
>>>> recommenders have a distinct advantage.
>>>>
>>>> On Apr 1, 2009 10:32 AM, "Otis Gospodnetic" <otis_gospodnetic@yahoo.com> wrote:
>>>>
>>>> it's the former.  Taste is still not parallelized, but other parts of Mahout
>>>> are, and they make use of Hadoop.
>>>>
>>>>

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:
http://www.lucidimagination.com/search


Re: mahout for news recommendation?

Posted by Joshua Bronson <ja...@gmail.com>.
Oh, it just occurred to me I should have said that /usr/local/melk/mahout is
a checkout of http://svn.apache.org/repos/asf/lucene/mahout/trunk/. When I
got to step 2, which says...

> 2. Unpack the archive and copy movies.dat and ratings.dat to
> trunk/taste-web/src/main/resources/org/apache/mahout/cf/taste/example/grouplens under
> the Mahout distribution directory.


...I assumed the instructions had left out the step of running "svn
checkout http://svn.apache.org/repos/asf/lucene/mahout/trunk/". Was this
assumption incorrect?

I did have to run
"mkdir -p trunk/taste-web/src/main/resources/org/apache/mahout/cf/taste/example/grouplens"
before I could copy the .dat files there, because the
trunk/taste-web/src/main/resources directory of the checkout is empty.
Did I go off on the wrong track?

On Wed, Apr 1, 2009 at 4:27 PM, Joshua Bronson <ja...@gmail.com> wrote:

> You mean you're supposed to do step 4 *before* step 8?!? ;p
> I did run mvn install, and though I got a bunch of warnings like the
> following:
>
> [WARNING] Entry: mahout-0.2-SNAPSHOT/usr/local/melk/mahout/core/src/main/java/org/apache/mahout/cf/taste/impl/common/ longer than 100 characters.
>
> after a couple hours it said it completed successfully:
>
> [INFO] ------------------------------------------------------------------------
> [INFO] Reactor Summary:
> [INFO] ------------------------------------------------------------------------
> [INFO] Mahout core ........................................... SUCCESS [8:46.665s]
> [INFO] Mahout Taste Webapp ................................... SUCCESS [55.496s]
> [INFO] Mahout examples ....................................... SUCCESS [55.317s]
> [INFO] Apache Lucene Mahout .................................. SUCCESS [2:02:03.392s]
> [INFO] ------------------------------------------------------------------------
> [INFO] ------------------------------------------------------------------------
> [INFO] BUILD SUCCESSFUL
> [INFO] ------------------------------------------------------------------------
> [INFO] Total time: 132 minutes 41 seconds
> [INFO] Finished at: Wed Apr 01 00:59:27 EDT 2009
> [INFO] Final Memory: 61M/80M
> [INFO] ------------------------------------------------------------------------
>
>
> So I proceeded through steps 5, 6, and 7, and then step 8's "mvn package"
> command failed with the output I linked to.
>
> Just for the heck of it I tried "mvn install" again (from the top-level
> directory) and after getting a bunch of the "longer-than-100-characters"
> warnings again, this time after 7 minutes it failed with:
>
> [ERROR] BUILD ERROR
> [INFO] ------------------------------------------------------------------------
> [INFO] Failed to create assembly: Error creating assembly archive project: A tar file cannot include itself.
>
>
> I posted the full transcript of my console session at
> http://melkjug.org/_static/grouplens-install-log.txt. Seems like something
> funky's going on with tar, but I'm not sure what.
>
>
> On Wed, Apr 1, 2009 at 12:11 PM, Grant Ingersoll <gs...@apache.org> wrote:
>
>> Do a "mvn install" from the top level directory first:
>> http://lucene.apache.org/mahout/taste.html#demo
>>
>> HTH,
>> Grant
>>
>>
>> On Apr 1, 2009, at 11:35 AM, Joshua Bronson wrote:
>>
>>  Thanks all for the good info. Taste definitely sounds like a promising
>>> direction for us to go in for our recommendation service.
>>> I'm working through the installation of the GroupLens demo, but the mvn
>>> package step is failing with the output at
>>> http://paste.pocoo.org/show/110618/. Haven't looked into this yet, just
>>> thought I'd post to the list first with my progress. If anyone else uses
>>> IRC, I've created (and am currently the only one in) the #mahout channel
>>> on
>>> freenode. Hope to see some of you in there!
>>>
>>> Josh
>>>
>>> On Wed, Apr 1, 2009 at 5:48 AM, Sean Owen <sr...@gmail.com> wrote:
>>>
>>>  Couple clarifications -
>>>>
>>>> The CF components are oriented to on-line, real-time use, though of
>>>> course
>>>> one can trivially build a batch job out of that. That is what I did with
>>>> the
>>>> EC2 image that cranks out recommendations for all users.
>>>>
>>>> The CF component is also already parallelized as much as is practical.
>>>> There
>>>> are already Hadoop jobs for parallel, batch operation.
>>>>
>>>> Finally if you have some external notion of item similarity, like text
>>>> similarity between articles, you can and should include this info by
>>>> creating an ItemSimilarity with this knowledge. In that case you want to
>>>> use
>>>> an item-based recommender, since it is only in such a case that
>>>> item-based
>>>> recommenders have a distinct advantage.
>>>>
>>>> On Apr 1, 2009 10:32 AM, "Otis Gospodnetic" <otis_gospodnetic@yahoo.com> wrote:
>>>>
>>>>
>>>> it's the former.  Taste is still not parallelized, but other parts of
>>>> Mahout
>>>> are, and they make use of Hadoop.
>>>>
>>>>
>

Re: mahout for news recommendation?

Posted by Joshua Bronson <ja...@gmail.com>.
You mean you're supposed to do step 4 *before* step 8?!? ;p
I did run mvn install, and though I got a bunch of warnings like the
following:

[WARNING] Entry: mahout-0.2-SNAPSHOT/usr/local/melk/mahout/core/src/main/java/org/apache/mahout/cf/taste/impl/common/ longer than 100 characters.

after a couple hours it said it completed successfully:

[INFO] ------------------------------------------------------------------------
[INFO] Reactor Summary:
[INFO] ------------------------------------------------------------------------
[INFO] Mahout core ........................................... SUCCESS [8:46.665s]
[INFO] Mahout Taste Webapp ................................... SUCCESS [55.496s]
[INFO] Mahout examples ....................................... SUCCESS [55.317s]
[INFO] Apache Lucene Mahout .................................. SUCCESS [2:02:03.392s]
[INFO] ------------------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESSFUL
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 132 minutes 41 seconds
[INFO] Finished at: Wed Apr 01 00:59:27 EDT 2009
[INFO] Final Memory: 61M/80M
[INFO] ------------------------------------------------------------------------


So I proceeded through steps 5, 6, and 7, and then step 8's "mvn package"
command failed with the output I linked to.

Just for the heck of it I tried "mvn install" again (from the top-level
directory) and after getting a bunch of the "longer-than-100-characters"
warnings again, this time after 7 minutes it failed with:

[ERROR] BUILD ERROR
[INFO] ------------------------------------------------------------------------
[INFO] Failed to create assembly: Error creating assembly archive project: A tar file cannot include itself.


I posted the full transcript of my console session at
http://melkjug.org/_static/grouplens-install-log.txt. Seems like something
funky's going on with tar, but I'm not sure what.
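
The message "A tar file cannot include itself" reads as if the archive being written ended up among the files being archived, for example when an earlier build left its output inside the project tree. That reading is an assumption; the thread does not confirm the cause. A minimal sketch of such a check, using only java.nio.file and hypothetical names (TarSelfInclusionSketch, archiveInsideSource, and the example archive path), might look like:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper for illustration; not part of Maven or Mahout.
final class TarSelfInclusionSketch {

    // Returns true when the archive file would be written somewhere
    // underneath the directory that is being archived.
    static boolean archiveInsideSource(Path sourceDir, Path archiveFile) {
        Path source = sourceDir.toAbsolutePath().normalize();
        Path archive = archiveFile.toAbsolutePath().normalize();
        // Path.startsWith compares whole path components, not raw characters.
        return archive.startsWith(source);
    }

    public static void main(String[] args) {
        // The checkout directory comes from the thread; the archive name is made up.
        Path checkout = Paths.get("/usr/local/melk/mahout");
        Path archive = Paths.get("/usr/local/melk/mahout/target/example.tar.gz");
        System.out.println("archive inside source tree: "
                + archiveInsideSource(checkout, archive));
    }
}
```

If the check returns true for a real build, writing the archive outside the archived directory or excluding the build output directory from the archive would be the usual things to try.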


On Wed, Apr 1, 2009 at 12:11 PM, Grant Ingersoll <gs...@apache.org> wrote:

> Do a "mvn install" from the top level directory first:
> http://lucene.apache.org/mahout/taste.html#demo
>
> HTH,
> Grant
>
>
> On Apr 1, 2009, at 11:35 AM, Joshua Bronson wrote:
>
>  Thanks all for the good info. Taste definitely sounds like a promising
>> direction for us to go in for our recommendation service.
>> I'm working through the installation of the GroupLens demo, but the mvn
>> package step is failing with the output at
>> http://paste.pocoo.org/show/110618/. Haven't looked into this yet, just
>> thought I'd post to the list first with my progress. If anyone else uses
>> IRC, I've created (and am currently the only one in) the #mahout channel
>> on
>> freenode. Hope to see some of you in there!
>>
>> Josh
>>
>> On Wed, Apr 1, 2009 at 5:48 AM, Sean Owen <sr...@gmail.com> wrote:
>>
>>  Couple clarifications -
>>>
>>> The CF components are oriented to on-line, real-time use, though of
>>> course
>>> one can trivially build a batch job out of that. That is what I did with
>>> the
>>> EC2 image that cranks out recommendations for all users.
>>>
>>> The CF component is also already parallelized as much as is practical.
>>> There
>>> are already Hadoop jobs for parallel, batch operation.
>>>
>>> Finally if you have some external notion of item similarity, like text
>>> similarity between articles, you can and should include this info by
>>> creating an ItemSimilarity with this knowledge. In that case you want to
>>> use
>>> an item-based recommender, since it is only in such a case that
>>> item-based
>>> recommenders have a distinct advantage.
>>>
>>> On Apr 1, 2009 10:32 AM, "Otis Gospodnetic" <ot...@yahoo.com>
>>> wrote:
>>>
>>>
>>> it's the former.  Taste is still not parallelized, but other parts of
>>> Mahout
>>> are, and they make use of Hadoop.
>>>
>>>

Re: mahout for news recommendation?

Posted by Grant Ingersoll <gs...@apache.org>.
Do a "mvn install" from the top level directory first: http://lucene.apache.org/mahout/taste.html#demo

HTH,
Grant

On Apr 1, 2009, at 11:35 AM, Joshua Bronson wrote:

> Thanks all for the good info. Taste definitely sounds like a promising
> direction for us to go in for our recommendation service.
> I'm working through the installation of the GroupLens demo, but the mvn
> package step is failing with the output at
> http://paste.pocoo.org/show/110618/. Haven't looked into this yet, just
> thought I'd post to the list first with my progress. If anyone else uses
> IRC, I've created (and am currently the only one in) the #mahout channel on
> freenode. Hope to see some of you in there!
>
> Josh
>
> On Wed, Apr 1, 2009 at 5:48 AM, Sean Owen <sr...@gmail.com> wrote:
>
>> Couple clarifications -
>>
>> The CF components are oriented to on-line, real-time use, though of course
>> one can trivially build a batch job out of that. That is what I did with the
>> EC2 image that cranks out recommendations for all users.
>>
>> The CF component is also already parallelized as much as is practical. There
>> are already Hadoop jobs for parallel, batch operation.
>>
>> Finally if you have some external notion of item similarity, like text
>> similarity between articles, you can and should include this info by
>> creating an ItemSimilarity with this knowledge. In that case you want to use
>> an item-based recommender, since it is only in such a case that item-based
>> recommenders have a distinct advantage.
>>
>> On Apr 1, 2009 10:32 AM, "Otis Gospodnetic" <otis_gospodnetic@yahoo.com> wrote:
>>
>> it's the former.  Taste is still not parallelized, but other parts of Mahout
>> are, and they make use of Hadoop.
>>

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:
http://www.lucidimagination.com/search


Re: mahout for news recommendation?

Posted by Joshua Bronson <ja...@gmail.com>.
Thanks all for the good info. Taste definitely sounds like a promising
direction for us to go in for our recommendation service.
I'm working through the installation of the GroupLens demo, but the mvn
package step is failing with the output at
http://paste.pocoo.org/show/110618/. Haven't looked into this yet, just
thought I'd post to the list first with my progress. If anyone else uses
IRC, I've created (and am currently the only one in) the #mahout channel on
freenode. Hope to see some of you in there!

Josh

On Wed, Apr 1, 2009 at 5:48 AM, Sean Owen <sr...@gmail.com> wrote:

> Couple clarifications -
>
> The CF components are oriented to on-line, real-time use, though of course
> one can trivially build a batch job out of that. That is what I did with
> the
> EC2 image that cranks out recommendations for all users.
>
> The CF component is also already parallelized as much as is practical.
> There
> are already Hadoop jobs for parallel, batch operation.
>
> Finally if you have some external notion of item similarity, like text
> similarity between articles, you can and should include this info by
> creating an ItemSimilarity with this knowledge. In that case you want to
> use
> an item-based recommender, since it is only in such a case that item-based
> recommenders have a distinct advantage.
>
> On Apr 1, 2009 10:32 AM, "Otis Gospodnetic" <ot...@yahoo.com>
> wrote:
>
>
> it's the former.  Taste is still not parallelized, but other parts of
> Mahout
> are, and they make use of Hadoop.
>

Re: mahout for news recommendation?

Posted by Sean Owen <sr...@gmail.com>.
Couple clarifications -

The CF components are oriented to on-line, real-time use, though of course
one can trivially build a batch job out of that. That is what I did with the
EC2 image that cranks out recommendations for all users.
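
As a rough illustration of building a batch job on top of an on-line recommender, the sketch below simply repeats the per-user call for every known user and collects the results. The OnlineRecommender interface and the recommendForAll helper are hypothetical names invented for this example; they are not Mahout's Recommender API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for an on-line recommender; NOT Mahout's actual interface.
interface OnlineRecommender {
    List<String> recommend(String userId, int howMany);
}

final class BatchFromOnlineSketch {

    // Batch mode here is just the on-line call repeated for every user.
    static Map<String, List<String>> recommendForAll(
            Collection<String> userIds, OnlineRecommender recommender, int howMany) {
        Map<String, List<String>> results = new LinkedHashMap<>();
        for (String userId : userIds) {
            results.put(userId, recommender.recommend(userId, howMany));
        }
        return results;
    }

    public static void main(String[] args) {
        // Toy recommender (an assumption for the demo): returns placeholder item IDs.
        OnlineRecommender toy = (userId, howMany) -> {
            List<String> items = new ArrayList<>();
            for (int i = 1; i <= howMany; i++) {
                items.add("item-" + i + "-for-" + userId);
            }
            return items;
        };
        Map<String, List<String>> all =
                recommendForAll(Arrays.asList("alice", "bob"), toy, 2);
        all.forEach((user, items) -> System.out.println(user + " -> " + items));
    }
}
```

Because each user's call is independent of the others, the same loop could in principle be split across machines, which is the kind of batch operation the next paragraph refers to.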

The CF component is also already parallelized as much as is practical. There
are already Hadoop jobs for parallel, batch operation.

Finally if you have some external notion of item similarity, like text
similarity between articles, you can and should include this info by
creating an ItemSimilarity with this knowledge. In that case you want to use
an item-based recommender, since it is only in such a case that item-based
recommenders have a distinct advantage.
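
To make the idea of feeding external item similarity into an item-based recommender concrete, here is a self-contained sketch. It does not use Mahout: ExternalItemSimilarity, TextCosineSimilarity and ItemBasedSketch are hypothetical names, and cosine similarity over word counts is only one assumed choice of "text similarity between articles". The estimate is a similarity-weighted average of the user's existing ratings.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for an item-similarity component; NOT Mahout's ItemSimilarity.
interface ExternalItemSimilarity {
    double similarity(String itemA, String itemB);
}

// Cosine similarity over word counts of each article's text (assumed metric).
final class TextCosineSimilarity implements ExternalItemSimilarity {
    private final Map<String, Map<String, Integer>> termCounts = new HashMap<>();

    void addArticle(String itemId, String text) {
        Map<String, Integer> counts = new HashMap<>();
        for (String token : text.toLowerCase().split("\\W+")) {
            if (!token.isEmpty()) {
                counts.merge(token, 1, Integer::sum);
            }
        }
        termCounts.put(itemId, counts);
    }

    @Override
    public double similarity(String itemA, String itemB) {
        Map<String, Integer> a =
                termCounts.getOrDefault(itemA, Collections.<String, Integer>emptyMap());
        Map<String, Integer> b =
                termCounts.getOrDefault(itemB, Collections.<String, Integer>emptyMap());
        double dot = 0, normA = 0, normB = 0;
        for (Map.Entry<String, Integer> e : a.entrySet()) {
            normA += e.getValue() * e.getValue();
            Integer other = b.get(e.getKey());
            if (other != null) {
                dot += e.getValue() * other;
            }
        }
        for (int v : b.values()) {
            normB += v * v;
        }
        if (normA == 0 || normB == 0) {
            return 0.0; // unknown or empty article: treat as unrelated
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}

final class ItemBasedSketch {
    // Similarity-weighted average of the ratings the user has already given.
    static double estimate(Map<String, Double> userRatings, String candidate,
                           ExternalItemSimilarity sim) {
        double weighted = 0, total = 0;
        for (Map.Entry<String, Double> e : userRatings.entrySet()) {
            double s = sim.similarity(candidate, e.getKey());
            if (s > 0) {
                weighted += s * e.getValue();
                total += s;
            }
        }
        return total == 0 ? Double.NaN : weighted / total;
    }

    public static void main(String[] args) {
        TextCosineSimilarity sim = new TextCosineSimilarity();
        sim.addArticle("news-1", "hadoop cluster release announced");
        sim.addArticle("news-2", "new hadoop release for large cluster");
        sim.addArticle("news-3", "football final scores");
        Map<String, Double> ratings = new HashMap<>();
        ratings.put("news-2", 5.0);
        ratings.put("news-3", 1.0);
        System.out.println("estimate for news-1: "
                + estimate(ratings, "news-1", sim));
    }
}
```

The point of the design is that the similarity comes from article text rather than from co-rating data, so a brand-new article with no ratings can still be scored against what a user has already rated.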

On Apr 1, 2009 10:32 AM, "Otis Gospodnetic" <ot...@yahoo.com>
wrote:


it's the former.  Taste is still not parallelized, but other parts of Mahout
are, and they make use of Hadoop.

 Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
