Posted to common-dev@hadoop.apache.org by ankit nadig <an...@gmail.com> on 2013/09/23 18:07:42 UTC

Newbie:Need guidance

Hi,
   I'm a newbie. I want to learn and contribute to Hadoop. I've set up a
single-node cluster on Ubuntu 12.04, and I know C and C++ and am currently
learning Java. I haven't read any documentation and am new to open source
as such.

Sorry for wasting your time, and if this is the wrong place for this mail,
but can you give me any guidance on how to proceed?

Thank you.

Re: Newbie:Need guidance

Posted by Jay Vyas <ja...@gmail.com>.
I just created the JIRA :) It's brand new, so it won't be useful just yet.
Also, similar to Steve's comment, it is for learning the ecosystem,
not the underlying plumbing of distributed Java apps.
https://issues.apache.org/jira/browse/BIGTOP-1089


On Tue, Sep 24, 2013 at 11:31 AM, ankit nadig <an...@gmail.com> wrote:

> thanks a lot!



-- 
Jay Vyas
http://jayunit100.blogspot.com

Re: Newbie:Need guidance

Posted by ankit nadig <an...@gmail.com>.
Thanks a lot!


On Tue, Sep 24, 2013 at 6:49 PM, Jay Vyas <ja...@gmail.com> wrote:

> And also, if you want to help out: we are developing blueprints in the
> Bigtop project specifically for people who want to learn what real-world
> big data workflows look like.

Re: Newbie:Need guidance

Posted by Jay Vyas <ja...@gmail.com>.
And also, if you want to help out: we are developing blueprints in the Bigtop project specifically for people who want to learn what real-world big data workflows look like.


> On Sep 24, 2013, at 4:52 AM, Steve Loughran <st...@hortonworks.com> wrote:

Re: Newbie:Need guidance

Posted by Steve Loughran <st...@hortonworks.com>.
Hi.

You need to know that we don't really consider Hadoop a good place to learn
about Java or distributed systems programming: it is simply too complex.
It's like learning C by writing Linux kernel device drivers, so we
explicitly warn against trying to do this:

http://wiki.apache.org/hadoop/HadoopIsNot

That said, we do welcome new developers, and there is even a touch of C
code lurking in there too.

What I'd recommend is you start not by delving into the code of Hadoop, but
by learning how to use it: a tool to answer questions about data; a
platform you can build bigger applications from.

This leads to two possible projects:

1. Think of something you are curious about and for which you can grab
public datasets. A lot of government open datasets are really
interesting, especially when merged with other datasets. Then analyse the
data: if you can find something interesting and new, then that's something
you can talk about and get known for.

2. Try writing a web application using Hadoop and its NoSQL database(s) as
the back end: either a web or mobile device front end, HBase/Accumulo at the
back, HDFS underneath. This will give you experience in how the stack fits
together.
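
For project 1, the analysis can be prototyped locally before it goes anywhere
near a cluster. Below is a minimal sketch in Python of the map and reduce steps
in the style Hadoop Streaming uses: the mapper emits key/value pairs, the
framework sorts them by key, and the reducer aggregates each key. The CSV
layout, the "region" column, and the helper names map_records and
reduce_counts are all assumptions made up for illustration; they are not part
of any real dataset or of the Hadoop API.

```python
# Local sketch of a map/reduce-style count over a CSV dataset.
# ASSUMPTIONS: the "region" column and both function names are hypothetical.
import csv
import io
from itertools import groupby


def map_records(lines, column="region"):
    """Map step: emit one (key, 1) pair per CSV record."""
    for row in csv.DictReader(lines):
        key = (row.get(column) or "").strip()
        if key:  # skip records with no value in the chosen column
            yield key, 1


def reduce_counts(pairs):
    """Reduce step: sum the values for each key.

    Hadoop sorts mapper output by key before the reducer sees it;
    sorted() stands in for that shuffle-and-sort phase here.
    """
    for key, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield key, sum(value for _, value in group)


if __name__ == "__main__":
    # Tiny in-memory stand-in for a downloaded public dataset.
    sample = io.StringIO(
        "region,population\n"
        "north,120\n"
        "south,80\n"
        "north,45\n"
    )
    for key, total in reduce_counts(map_records(sample)):
        print(f"{key}\t{total}")
```

Run as a script, this prints one tab-separated line per region (north with a
count of 2, south with a count of 1). Keeping the two steps as separate
functions means the same logic could later be split into a mapper script and
a reducer script that read from standard input.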

Doing either of these not only introduces you to the world of Hadoop &
friends, it introduces you to the concepts gradually, rather than
dropping you into source code which is not only big and complex, but whose
main test setup (a few tens of servers) is a big investment on its own,
though renting cluster time from a cloud provider can provide an emulation
of that rack of machines.

It will also make it clear where Hadoop is lacking today, perhaps in some
of the APIs, perhaps in the web site and its experience on tablets and
phones. Coming at those problems with the experience of actual needs will
help shape your thinking about what should be done.

Finally, while getting started with Hadoop, yes, you do need to read that
documentation, and sign up to the Hadoop user list
[http://hadoop.apache.org/mailing_lists.html#User] if you want to get help
getting things to work, code against Hadoop, etc. Questions like that to
the dev list just get ignored (sorry!).


-Steve



-- 
CONFIDENTIALITY NOTICE
NOTICE: This message is intended for the use of the individual or entity to 
which it is addressed and may contain information that is confidential, 
privileged and exempt from disclosure under applicable law. If the reader 
of this message is not the intended recipient, you are hereby notified that 
any printing, copying, dissemination, distribution, disclosure or 
forwarding of this communication is strictly prohibited. If you have 
received this communication in error, please contact the sender immediately 
and delete it from your system. Thank You.