Posted to user@mahout.apache.org by Apurv Verma <da...@gmail.com> on 2011/10/05 17:07:46 UTC

Request for Assistance.

Hi all,
 I am interested in becoming a contributor to Mahout. Unfortunately, I
have not yet taken a course in Machine Learning. I am taking a course in
Artificial Intelligence this semester.
I am also *not* conversant with Hadoop and MapReduce, though I have heard of
them and have long wanted to learn them. Can someone please guide me (mentor
informally) so that I get a sense of direction and am able to develop
the skill set required to contribute to this project within the next 6
months?

--
thanks and regards,

Apurv Verma
B. Tech.(CSE)
IIT- Ropar

Linked In: http://in.linkedin.com/in/apurv5

Re: Request for Assistance.

Posted by Joanne Sun <jo...@gmail.com>.
How can I contribute to it? It is already implemented, but maybe not in Mahout.

On Wed, Oct 5, 2011 at 8:02 PM, Lance Norskog <go...@gmail.com> wrote:
> I stand (privately) corrected:
>
> ----------------------------------------
> Actually, on-line algorithms are ones that use constant memory and constant
> time per input record.  This may mean that they can run on demand, or not.
>
> What is most important is that run-time will be linear or better in input
> size and if it succeeds on small input, it will succeed on large input
> (eventually).
>
> http://en.wikipedia.org/wiki/Online_algorithm
>  ------------------------
>
> If you wish to contribute something really cool to Mahout, I recently found
> this online implementation of Singular Value Decomposition (SVD). It updates
> the SVD incrementally, and is used in a recommender. This algorithm has also
> been used to track shapes in real-time in computer vision.
>
> http://www.merl.com/publications/TR2003-014/
>
>
> On Wed, Oct 5, 2011 at 3:51 PM, Lance Norskog <go...@gmail.com> wrote:
>
>> There are many "online" algorithms which do not use Hadoop or map/reduce.
>> Online means that the code runs on demand; Hadoop jobs create large
>> datasets. Online algorithms may use the output of Hadoop, or process input
>> data directly.
>>
>> Try learning an area of the online algorithms like recommender systems or
>> classifiers- you will learn much more quickly. You will find that Hadoop is
>> an amazing time-waster.
>>
>> Lance
>>
>>
>> On Wed, Oct 5, 2011 at 11:35 AM, Isabel Drost <is...@apache.org> wrote:
>>
>>>
>>> First of all welcome also from my side.
>>>
>>> On 05.10.2011 Apurv Verma wrote:
>>> >  I am interested in becoming a contributor to Mahout.
>>>
>>> Actually we have a "How to contribute page" on our wiki that might help
>>> you:
>>>
>>> https://cwiki.apache.org/MAHOUT/how-to-contribute.html
>>>
>>> I guess the general take away is to start using Mahout for your own
>>> projects. As
>>> with any software you use sooner or later you will find stuff that bothers
>>> you:
>>> Missing documentation, extensions you need to make here and there, subtle
>>> bugs.
>>>
>>>
>>> >  But unfortunately I have not had any course in Machine Learning still.
>>> I am
>>> >  having a course in Artificial Intelligence this semester.
>>>
>>> While it is certainly a great help to have some machine learning
>>> background, you
>>> do not need a PhD to start contributing to Mahout. Any infrastructure
>>> improvements that do not change the inner algorithms but make it easier to
>>> integrate Mahout and re-use it are highly welcome.
>>>
>>>
>>> > I am also *not* conversant with hadoop and mapreduce though I have heard
>>> of
>>> > it and have long wanted to learn it. Can someone please guide me (mentor
>>> > informally) so that I may get a sense and direction and I am able to
>>> > develop the skills set required to contribute to this project within the
>>> > next 6 months.
>>>
>>> You have taken a very good first step by contacting the mailing list. Try
>>> to
>>> figure out an area that you would like to use Mahout for, start working in
>>> that
>>> direction, if you come across any questions that cannot be answered by a
>>> trivial
>>> search in the mailing list archives don't be shy to ask on list. When
>>> getting
>>> more proficient answer questions other new-comers may have, start
>>> reviewing
>>> patches and maybe even contribute your own improvements.
>>>
>>> Isabel
>>>
>>
>>
>>
>> --
>> Lance Norskog
>> goksron@gmail.com
>>
>>
>
>
> --
> Lance Norskog
> goksron@gmail.com
>

Re: Request for Assistance.

Posted by Lance Norskog <go...@gmail.com>.
I stand (privately) corrected:

----------------------------------------
Actually, on-line algorithms are ones that use constant memory and constant
time per input record.  This may mean that they can run on demand, or not.

What is most important is that run-time will be linear or better in input
size and if it succeeds on small input, it will succeed on large input
(eventually).

http://en.wikipedia.org/wiki/Online_algorithm
 ------------------------
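
To make the definition above concrete, here is a minimal sketch of an online algorithm in that sense: a running mean and variance computed with Welford's method. It is only an illustration and does not come from this thread or from Mahout's code base; the class name RunningStats and its methods are hypothetical. Each record is seen once, the state is three numbers regardless of input size, and each update takes constant time.

```python
class RunningStats:
    """Welford's online mean/variance: O(1) memory, O(1) time per record.

    Hypothetical illustration; not part of Mahout.
    """

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x):
        # Fold one record into the state; no earlier record is kept.
        self.count += 1
        delta = x - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (x - self.mean)

    def variance(self):
        # Sample variance; undefined (NaN) for fewer than two records.
        if self.count < 2:
            return float("nan")
        return self._m2 / (self.count - 1)


stats = RunningStats()
for value in [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]:
    stats.update(value)
```

Because the state never grows, the same loop would work unchanged on a stream of any length, which is the "if it succeeds on small input, it will succeed on large input" property described above.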

If you wish to contribute something really cool to Mahout, I recently found
this online implementation of Singular Value Decomposition (SVD). It updates
the SVD incrementally, and is used in a recommender. This algorithm has also
been used to track shapes in real-time in computer vision.

http://www.merl.com/publications/TR2003-014/


On Wed, Oct 5, 2011 at 3:51 PM, Lance Norskog <go...@gmail.com> wrote:

> There are many "online" algorithms which do not use Hadoop or map/reduce.
> Online means that the code runs on demand; Hadoop jobs create large
> datasets. Online algorithms may use the output of Hadoop, or process input
> data directly.
>
> Try learning an area of the online algorithms like recommender systems or
> classifiers- you will learn much more quickly. You will find that Hadoop is
> an amazing time-waster.
>
> Lance
>
>
> On Wed, Oct 5, 2011 at 11:35 AM, Isabel Drost <is...@apache.org> wrote:
>
>>
>> First of all welcome also from my side.
>>
>> On 05.10.2011 Apurv Verma wrote:
>> >  I am interested in becoming a contributor to Mahout.
>>
>> Actually we have a "How to contribute page" on our wiki that might help
>> you:
>>
>> https://cwiki.apache.org/MAHOUT/how-to-contribute.html
>>
>> I guess the general take away is to start using Mahout for your own
>> projects. As
>> with any software you use sooner or later you will find stuff that bothers
>> you:
>> Missing documentation, extensions you need to make here and there, subtle
>> bugs.
>>
>>
>> >  But unfortunately I have not had any course in Machine Learning still.
>> I am
>> >  having a course in Artificial Intelligence this semester.
>>
>> While it is certainly a great help to have some machine learning
>> background, you
>> do not need a PhD to start contributing to Mahout. Any infrastructure
>> improvements that do not change the inner algorithms but make it easier to
>> integrate Mahout and re-use it are highly welcome.
>>
>>
>> > I am also *not* conversant with hadoop and mapreduce though I have heard
>> of
>> > it and have long wanted to learn it. Can someone please guide me (mentor
>> > informally) so that I may get a sense and direction and I am able to
>> > develop the skills set required to contribute to this project within the
>> > next 6 months.
>>
>> You have taken a very good first step by contacting the mailing list. Try
>> to
>> figure out an area that you would like to use Mahout for, start working in
>> that
>> direction, if you come across any questions that cannot be answered by a
>> trivial
>> search in the mailing list archives don't be shy to ask on list. When
>> getting
>> more proficient answer questions other new-comers may have, start
>> reviewing
>> patches and maybe even contribute your own improvements.
>>
>> Isabel
>>
>
>
>
> --
> Lance Norskog
> goksron@gmail.com
>
>


-- 
Lance Norskog
goksron@gmail.com

Re: Request for Assistance.

Posted by Lance Norskog <go...@gmail.com>.
There are many "online" algorithms which do not use Hadoop or map/reduce.
Online means that the code runs on demand; Hadoop jobs create large
datasets. Online algorithms may use the output of Hadoop, or process input
data directly.

Try learning an area of the online algorithms like recommender systems or
classifiers; you will learn much more quickly. You will find that Hadoop is
an amazing time-waster.

Lance

On Wed, Oct 5, 2011 at 11:35 AM, Isabel Drost <is...@apache.org> wrote:

>
> First of all welcome also from my side.
>
> On 05.10.2011 Apurv Verma wrote:
> >  I am interested in becoming a contributor to Mahout.
>
> Actually we have a "How to contribute page" on our wiki that might help
> you:
>
> https://cwiki.apache.org/MAHOUT/how-to-contribute.html
>
> I guess the general take away is to start using Mahout for your own
> projects. As
> with any software you use sooner or later you will find stuff that bothers
> you:
> Missing documentation, extensions you need to make here and there, subtle
> bugs.
>
>
> >  But unfortunately I have not had any course in Machine Learning still. I
> am
> >  having a course in Artificial Intelligence this semester.
>
> While it is certainly a great help to have some machine learning
> background, you
> do not need a PhD to start contributing to Mahout. Any infrastructure
> improvements that do not change the inner algorithms but make it easier to
> integrate Mahout and re-use it are highly welcome.
>
>
> > I am also *not* conversant with hadoop and mapreduce though I have heard
> of
> > it and have long wanted to learn it. Can someone please guide me (mentor
> > informally) so that I may get a sense and direction and I am able to
> > develop the skills set required to contribute to this project within the
> > next 6 months.
>
> You have taken a very good first step by contacting the mailing list. Try
> to
> figure out an area that you would like to use Mahout for, start working in
> that
> direction, if you come across any questions that cannot be answered by a
> trivial
> search in the mailing list archives don't be shy to ask on list. When
> getting
> more proficient answer questions other new-comers may have, start reviewing
> patches and maybe even contribute your own improvements.
>
> Isabel
>



-- 
Lance Norskog
goksron@gmail.com

Re: Request for Assistance.

Posted by Isabel Drost <is...@apache.org>.
First of all welcome also from my side.

On 05.10.2011 Apurv Verma wrote:
>  I am interested in becoming a contributor to Mahout.

We have a "How to contribute" page on our wiki that might help you:

https://cwiki.apache.org/MAHOUT/how-to-contribute.html

I guess the general takeaway is to start using Mahout for your own projects. As
with any software you use, sooner or later you will find things that bother you:
missing documentation, extensions you need to make here and there, subtle bugs.


>  But unfortunately I have not had any course in Machine Learning still. I am
>  having a course in Artificial Intelligence this semester.

While it is certainly a great help to have some machine learning background, you 
do not need a PhD to start contributing to Mahout. Any infrastructure 
improvements that do not change the inner algorithms but make it easier to 
integrate Mahout and re-use it are highly welcome.


> I am also *not* conversant with hadoop and mapreduce though I have heard of
> it and have long wanted to learn it. Can someone please guide me (mentor
> informally) so that I may get a sense and direction and I am able to
> develop the skills set required to contribute to this project within the
> next 6 months.

You have taken a very good first step by contacting the mailing list. Try to
figure out an area that you would like to use Mahout for and start working in that
direction. If you come across any questions that cannot be answered by a trivial
search of the mailing list archives, don't be shy to ask on the list. As you become
more proficient, answer questions other newcomers may have, start reviewing
patches, and maybe even contribute your own improvements.

Isabel

Re: Request for Assistance.

Posted by Chris Schilling <ch...@thecleversense.com>.
Thanks for those links Matthew!

On Wed, Oct 5, 2011 at 8:13 AM, Matthew Runo <mr...@zappos.com> wrote:

> Hello -
>
> Welcome to the list :)
>
> You might want to check out http://www.ai-class.com/ and
> http://www.ml-class.org. They're online versions of Stanford classes on
> Machine Learning and Artificial Intelligence. They both start soon too!
>
> I'm sure that they'll give you a good starting point for work on Mahout.
> I'm planning on taking both myself!
>
> -Matthew Runo
>
> On Oct 5, 2011, at 8:07 AM, Apurv Verma wrote:
>
> Hii all,
> I am interested in becoming a contributor to Mahout. But unfortunately I
> have not had any course in Machine Learning still. I am having a course in
> Artificial Intelligence this semester.
> I am also *not* conversant with hadoop and mapreduce though I have heard of
> it and have long wanted to learn it. Can someone please guide me (mentor
> informally) so that I may get a sense and direction and I am able to
> develop
> the skills set required to contribute to this project within the next 6
> months.
>
> --
> thanks and regards,
>
> Apurv Verma
> B. Tech.(CSE)
> IIT- Ropar
>
> Linked In: http://in.linkedin.com/in/apurv5
>
>


-- 
Chris Schilling
Sr. Data Fiend
Clever Sense, Inc.
"Curating the World Around You!"
--------------------------------------------------------------
Winner of the 2011 Fortune Brainstorm Start-up
Idol<http://tech.fortune.cnn.com/2011/07/20/startup-idol-brainstorm-clever-sense/>

Wanna join the Clever Team? We're
hiring!<http://www.thecleversense.com/jobs.html>
--------------------------------------------------------------

Re: Request for Assistance.

Posted by Matthew Runo <mr...@zappos.com>.
Hello -

Welcome to the list :)

You might want to check out http://www.ai-class.com/ and http://www.ml-class.org. They're online versions of Stanford classes on Machine Learning and Artificial Intelligence. They both start soon too!

I'm sure that they'll give you a good starting point for work on Mahout. I'm planning on taking both myself!

-Matthew Runo

On Oct 5, 2011, at 8:07 AM, Apurv Verma wrote:

Hii all,
I am interested in becoming a contributor to Mahout. But unfortunately I
have not had any course in Machine Learning still. I am having a course in
Artificial Intelligence this semester.
I am also *not* conversant with hadoop and mapreduce though I have heard of
it and have long wanted to learn it. Can someone please guide me (mentor
informally) so that I may get a sense and direction and I am able to develop
the skills set required to contribute to this project within the next 6
months.

--
thanks and regards,

Apurv Verma
B. Tech.(CSE)
IIT- Ropar

Linked In: http://in.linkedin.com/in/apurv5