Posted to general@hadoop.apache.org by Vidur Goyal <vi...@students.iiit.ac.in> on 2010/06/07 13:38:46 UTC

HDFS Source Code

Hi,

I am experimenting with the HDFS APIs. I was wondering if somebody could
help me understand the HDFS source code by pointing me to any relevant
documentation or suggesting where to start.

Thanks,
Vidur



Re: HDFS Source Code

Posted by Owen O'Malley <om...@apache.org>.
Please move this discussion over to hdfs-dev@hadoop.apache.org.

Thanks,
    Owen

Re: HDFS Source Code

Posted by Josh Patterson <jo...@cloudera.com>.
Ahmad,
A nice overview of Map Reduce is at:

http://hadoop.apache.org/common/docs/current/mapred_tutorial.html

In terms of replacing the entire communication framework under Map Reduce,
that's a considerably different and more complex task than simply talking to
HDFS from Java. You have a large number of interlocking classes built on
Hadoop's communication system; replacing it would be like trying to replace
the frame of an automobile. In other words, you might try, but you run a
high risk that a large percentage of the parts will not work correctly with
the new frame.

If you were to tackle such a large task, it would be less a matter of
"knowing a certain set of classes to work with" and more a matter of
understanding how a large part of Hadoop works.
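
To make the coupling concrete, here is a toy sketch in plain Java. It is not
Hadoop code, and every name in it (PluggableRpcSketch, HeartbeatProtocol,
Transport, LoopbackTransport, getProxy) is made up for illustration. It
assumes a design in which callers only ever see a protocol interface and a
dynamic proxy forwards each call to a swappable transport. As far as I
understand, Hadoop's own RPC layer (org.apache.hadoop.ipc) hands out dynamic
proxies in a similar spirit, but its wire format, threading and the HTTP
paths are spread across many more classes than this suggests.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;

// Toy sketch only: none of these types exist in Hadoop.
class PluggableRpcSketch {

    /** Hypothetical protocol, loosely modeled on a tracker heartbeat. */
    public interface HeartbeatProtocol {
        String heartbeat(String trackerName, int freeSlots);
    }

    /** Hypothetical seam where a custom communication library would plug in. */
    public interface Transport {
        Object call(Method method, Object[] args) throws Exception;
    }

    /** Stand-in transport: invokes the server object in-process, no sockets. */
    public static final class LoopbackTransport implements Transport {
        private final Object server;

        public LoopbackTransport(Object server) {
            this.server = server;
        }

        @Override
        public Object call(Method method, Object[] args) throws Exception {
            // A real transport would serialize the method name and arguments,
            // send them over the wire, and deserialize the reply.
            return method.invoke(server, args);
        }
    }

    /** Builds a client-side proxy that forwards every call to the transport. */
    public static <T> T getProxy(Class<T> protocol, Transport transport) {
        InvocationHandler handler = (proxy, method, args) -> transport.call(method, args);
        Object instance = Proxy.newProxyInstance(
                protocol.getClassLoader(), new Class<?>[] {protocol}, handler);
        return protocol.cast(instance);
    }

    public static void main(String[] args) {
        HeartbeatProtocol server = (name, slots) -> name + " has " + slots + " free slots";
        HeartbeatProtocol client = getProxy(HeartbeatProtocol.class, new LoopbackTransport(server));
        System.out.println(client.heartbeat("tracker_1", 2));
    }
}
```

In a design like this, swapping the communication library means writing one
new Transport. The hard part in Hadoop is that no single seam of this kind
covers everything, which is why you end up needing to understand so much of
the system.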

Josh

On Mon, Jun 7, 2010 at 12:20 PM, Ahmad Shahzad <as...@gmail.com> wrote:

> Hi,
>    I am also interested in seeing what is going on under the hood. Thanks
> for sharing this information.
>
> Could you please explain the same for Map Reduce? I mean, how can I figure
> out how the job tracker and task trackers communicate with each other? Which
> classes in the Hadoop API are responsible for that? In more detail, I would
> like to know how I can change the communication mechanism of Hadoop Map
> Reduce to use my own communication library instead of regular sockets and
> HTTP.
>
> Regards,
> Ahmad Shahzad
>

Re: HDFS Source Code

Posted by Ahmad Shahzad <as...@gmail.com>.
Hi,
    I am also interested in seeing what is going on under the hood. Thanks
for sharing this information.

Could you please explain the same for Map Reduce? I mean, how can I figure
out how the job tracker and task trackers communicate with each other? Which
classes in the Hadoop API are responsible for that? In more detail, I would
like to know how I can change the communication mechanism of Hadoop Map
Reduce to use my own communication library instead of regular sockets and
HTTP.

Regards,
Ahmad Shahzad

Re: HDFS Source Code

Posted by Josh Patterson <jo...@cloudera.com>.
Vidur,
Probably a great place to get started would be:

http://hadoop.apache.org/common/docs/current/hdfs_design.html

Then, after reading that, you could take a look at the API docs:

<http://hadoop.apache.org/common/docs/r0.20.0/>
http://hadoop.apache.org/common/docs/current/api/

and then at a few ways to access HDFS from other languages:

http://wiki.apache.org/hadoop/HDFS-APIs

Beyond that, I'd download the source from SVN and take a look at

org.apache.hadoop.hdfs.DFSClient.java

to get a feel for how most apps talk to HDFS. After that you could look at:

org.apache.hadoop.fs.FsShell

to get a feel for how the shell, itself a Java program, talks to DFSClient.

Beyond that, just reading more source code and compiling your own
experiments that read and write to HDFS is the best way to get a feel for
what's going on under the hood.

Josh Patterson

Solutions Architect
Cloudera




On Mon, Jun 7, 2010 at 7:38 AM, Vidur Goyal <vi...@students.iiit.ac.in> wrote:

> Hi,
>
> I am experimenting with the HDFS APIs. I was wondering if somebody could
> help me understand the HDFS source code by pointing me to any relevant
> documentation or suggesting where to start.
>
> Thanks,
> Vidur