Posted to dev@mahout.apache.org by Isabel Drost <ap...@isabel-drost.de> on 2008/04/01 08:02:21 UTC
Re: Fast Feather Track
On Monday 31 March 2008, Karl Wettin wrote:
> I think it is worth listing all the algorithms people have submitted as
> GSoC proposals. It is an amazingly large group of people when you
> consider how long the project has been around.
+1 Thanks for the comment - added them. Looks really impressive now -
unfortunately I guess the list was outdated at the moment I wrote it down ;)
> I also think you should add an introduction slide to ML so people who
> do not yet know they can benefit from it will understand. Perhaps that
> is the same thing as the "Problem setting"? I'll rant on though.
+1 Thanks for the rant. It should be the same as "Problem setting". Waking
up this morning, I think the essential part, learning models from data, is
still missing despite the many application examples. Will add that this
afternoon.
> Nutch has an ngram based language identifier. Lucene has a "more like
> this" feature. Carrot clusters search results. LingPipe does a whole lot
> of things with text I think many would like to see in Mahout.
Any other examples? I will add these to the next version. (Did not have that
mail when I made the corresponding slide.)
> One important thing is that people might not be aware that they store
> structured minable data. There are a lot of faceted classifications,
> tags, ratings and what not that are not used to their full potential.
I tried to give a few examples on the "Problem setting" slide. Maybe this
slide can move further back into some "We need you/what can you do with
Mahout" context, and in its place under "Problem setting" I would put a slide
on learning models from data. Thanks for the examples you gave.
Isabel
--
If you wait long enough, it will go away... after having done its damage. If it
was bad, it will be back.
|\ _,,,---,,_ Web: <http://www.isabel-drost.de>
/,`.-'`' -. ;-;;,_
|,4- ) )-,_..;\ ( `'-'
'---''(_/--' `-'\_) (fL) IM: <xm...@spaceboyz.net>
Re: Fast Feather Track
Posted by Karl Wettin <ka...@gmail.com>.
Isabel Drost wrote:
> On Monday 31 March 2008, Karl Wettin wrote:
>> Nutch has an ngram based language identifier. Lucene has a "more like
>> this" feature. Carrot clusters search results. LingPipe does a whole lot
>> of things with text I think many would like to see in Mahout.
>
> Any other examples? I will add these to the next version. (Did not have that
> mail when I made the corresponding slide.)
Some "did you mean" features must count as machine learning. It is a nice
example because it needs no data other than users correcting their own typos,
accepting/declining suggestions and inspecting results. (Reinforcement
learning)
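To make the idea concrete, here is a rough sketch of how such a "did you
mean" could learn from nothing but user behaviour. Everything in it is made
up for illustration (the DidYouMean class, observe_correction, feedback,
suggest and the min_count threshold are hypothetical names, not anything in
Lucene, Nutch or Mahout), and it is plain counting with a +1/-1 score rather
than a real reinforcement learner:

```python
from collections import Counter, defaultdict


class DidYouMean:
    """Hypothetical helper that learns suggestions from users' own corrections."""

    def __init__(self, min_count=2):
        # corrections[typed][fixed] = score for suggesting `fixed` when a
        # user types `typed`. min_count is an assumed evidence threshold.
        self.corrections = defaultdict(Counter)
        self.min_count = min_count

    def observe_correction(self, typed, fixed):
        """Record that a user searched for `typed` and then retyped it as `fixed`."""
        if typed != fixed:
            self.corrections[typed][fixed] += 1

    def feedback(self, typed, suggested, accepted):
        """Raise the score of an accepted suggestion, lower a declined one."""
        self.corrections[typed][suggested] += 1 if accepted else -1

    def suggest(self, typed):
        """Return the best-scoring correction, or None if the evidence is weak."""
        candidates = self.corrections.get(typed)
        if not candidates:
            return None
        fixed, score = candidates.most_common(1)[0]
        return fixed if score >= self.min_count else None
```

Feeding it pairs of consecutive queries from a search log and calling
feedback() whenever a suggestion is clicked or ignored would be enough to get
started; a real system would also want to look at whether the corrected query
actually returned useful results.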
karl