Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2006/09/21 18:09:48 UTC

[Lucene-hadoop Wiki] Update of "FrontPage" by mbloore

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by mbloore:
http://wiki.apache.org/lucene-hadoop/FrontPage

The comment on the change is:
typo fixes

------------------------------------------------------------------------------
  and assigns each fragment to a ''map task''. The framework also distributes the many map tasks
  across the cluster of nodes on which it operates. Each map task consumes key/value pairs
  from its assigned fragment and produces a set of intermediate key/value pairs. For each
- input key/value pair ''(K,V)'', the map task invokes a user defined ''map function'' that transmutes
+ input key/value pair ''(K,V)'' the map task invokes a user-defined ''map function'' that transmutes
  the input into a different key/value pair ''(K',V')''.
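
  A minimal sketch of such a user-defined ''map function'', using the familiar word-count example and the
  org.apache.hadoop.mapred API; the class name below is only illustrative and exact signatures vary between
  Hadoop releases. It transmutes each input pair (byte offset, line of text) into one intermediate pair per word:
  {{{
import java.io.IOException;
import java.util.StringTokenizer;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.Mapper;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reporter;

// Illustrative map function: (offset, line) -> (word, 1) pairs.
public class WordCountMap extends MapReduceBase
    implements Mapper<LongWritable, Text, Text, IntWritable> {

  private static final IntWritable ONE = new IntWritable(1);

  public void map(LongWritable key, Text value,
                  OutputCollector<Text, IntWritable> output, Reporter reporter)
      throws IOException {
    StringTokenizer words = new StringTokenizer(value.toString());
    while (words.hasMoreTokens()) {
      // Emit an intermediate (K',V') pair for every word in the line.
      output.collect(new Text(words.nextToken()), ONE);
    }
  }
}
  }}}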
  
  Following the map phase the framework sorts the intermediate data set by key and produces
  a set of ''(K',V'*)'' tuples so that all the values associated with a particular key appear
- together. It also partitions the set of tuples into as a number of fragments equal to the
+ together. It also partitions the set of tuples into a number of fragments equal to the
  number of reduce tasks.
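
  The assignment of an intermediate key to a fragment is typically done by hashing the key, in the spirit
  of Hadoop's default hash partitioner; the sketch below is illustrative rather than the framework's actual code:
  {{{
// Illustrative hash partitioning: maps an intermediate key K' to one of
// the fragments, i.e. to one reduce task.
public class HashPartitionSketch {

  static int partitionFor(Object key, int numReduceTasks) {
    // Mask off the sign bit so the modulo result is a valid, non-negative index.
    return (key.hashCode() & Integer.MAX_VALUE) % numReduceTasks;
  }

  public static void main(String[] args) {
    // All values for the same key hash to the same fragment, so they all
    // end up at the same reduce task.
    System.out.println(partitionFor("hadoop", 4));
  }
}
  }}}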
  
  In the reduce phase, each ''reduce task'' consumes the fragment of ''(K',V'*)'' tuples assigned to it.
- For each such tuple it invokes a user defined ''reduce function'' that transmutes the tuple into
+ For each such tuple it invokes a user-defined ''reduce function'' that transmutes the tuple into
  an output key/value pair ''(K,V)''. Once again, the framework distributes the many reduce
  tasks across the cluster of nodes and deals with shipping the appropriate fragment of
  intermediate data to each reduce task.
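
  A matching sketch of a user-defined ''reduce function'' for the word-count example above, again assuming
  the org.apache.hadoop.mapred API with an illustrative class name; it collapses one ''(K',V'*)'' tuple,
  a word together with all of its counts, into a single output pair:
  {{{
import java.io.IOException;
import java.util.Iterator;

import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.MapReduceBase;
import org.apache.hadoop.mapred.OutputCollector;
import org.apache.hadoop.mapred.Reducer;
import org.apache.hadoop.mapred.Reporter;

// Illustrative reduce function: (word, [1, 1, ...]) -> (word, total).
public class WordCountReduce extends MapReduceBase
    implements Reducer<Text, IntWritable, Text, IntWritable> {

  public void reduce(Text key, Iterator<IntWritable> values,
                     OutputCollector<Text, IntWritable> output, Reporter reporter)
      throws IOException {
    int sum = 0;
    while (values.hasNext()) {
      sum += values.next().get();
    }
    output.collect(key, new IntWritable(sum));
  }
}
  }}}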
  
- Tasks in each phase are executed in a fault tolerant manner, if a node(s) fail in the middle
+ Tasks in each phase are executed in a fault-tolerant manner; if nodes fail in the middle
  of a computation the tasks assigned to them are re-distributed among the remaining nodes.
  Having many map and reduce tasks enables good load balancing and allows failed tasks to be
  re-run with small runtime overhead.
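
  The number of reduce tasks, and a hint for the number of map tasks, can be requested per job; this sketch
  assumes the old JobConf API (the framework ultimately derives the real map count from the input splits):
  {{{
import org.apache.hadoop.mapred.JobConf;

public class TaskCountHint {
  public static void main(String[] args) {
    JobConf conf = new JobConf();
    conf.setNumMapTasks(100);    // hint only: many small map tasks spread the load
    conf.setNumReduceTasks(10);  // fixes the number of reduce tasks and output fragments
    System.out.println("reduce tasks: " + conf.getNumReduceTasks());
  }
}
  }}}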
@@ -48, +48 @@

  The Hadoop Map/Reduce framework has a master/slave architecture. It has a single master
  server or ''jobtracker'' and several slave servers or ''tasktrackers'', one per node in the cluster.
  The ''jobtracker'' is the point of interaction between users and the framework. Users submit
- map/reduce jobs to the ''jobtracker'' which puts them in a queue of pending jobs and executes
+ map/reduce jobs to the ''jobtracker'', which puts them in a queue of pending jobs and executes
- them on a first come first serve basis. The ''jobtracker'' manages the assignment of map and
+ them on a first-come/first-served basis. The ''jobtracker'' manages the assignment of map and
  reduce tasks to the ''tasktrackers''. The ''tasktrackers'' execute tasks upon instruction from the
  jobtracker and also handle data motion between the map and reduce phases.
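
  A sketch of how a job might be handed to the ''jobtracker'' through the classic JobConf/JobClient API,
  reusing the illustrative word-count classes sketched above; helper method names vary between Hadoop
  releases, and the input/output paths are placeholders:
  {{{
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.IntWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapred.FileInputFormat;
import org.apache.hadoop.mapred.FileOutputFormat;
import org.apache.hadoop.mapred.JobClient;
import org.apache.hadoop.mapred.JobConf;

public class WordCountDriver {
  public static void main(String[] args) throws Exception {
    JobConf conf = new JobConf(WordCountDriver.class);
    conf.setJobName("wordcount");

    conf.setMapperClass(WordCountMap.class);      // illustrative map class from above
    conf.setReducerClass(WordCountReduce.class);  // illustrative reduce class from above
    conf.setOutputKeyClass(Text.class);
    conf.setOutputValueClass(IntWritable.class);

    FileInputFormat.setInputPaths(conf, new Path(args[0]));
    FileOutputFormat.setOutputPath(conf, new Path(args[1]));

    // Submits the job to the jobtracker's queue of pending jobs and
    // blocks until it completes.
    JobClient.runJob(conf);
  }
}
  }}}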
  
@@ -73, +73 @@

  The ''Namenode'' makes filesystem namespace operations like opening, closing, renaming etc.
  of files and directories available via an RPC interface. It also determines the mapping of
  blocks to ''Datanodes''. The ''Datanodes'' are responsible for serving read and write
- requests from filesystem clients, they also perform block creation, deletion and replication
+ requests from filesystem clients; they also perform block creation, deletion, and replication
  upon instruction from the ''Namenode''.
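
  From a client's point of view those namespace operations are reached through the FileSystem API; a minimal
  sketch, assuming HDFS is configured as the default filesystem and using placeholder paths. Each call below
  is resolved by the ''Namenode'', while the ''Datanodes'' move the actual block data:
  {{{
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class NamespaceOps {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();
    FileSystem fs = FileSystem.get(conf);  // the configured default filesystem

    fs.mkdirs(new Path("/user/demo"));                             // create a directory
    fs.rename(new Path("/user/demo"), new Path("/user/archive"));  // rename it
    fs.delete(new Path("/user/archive"), true);                    // recursive delete

    fs.close();
  }
}
  }}}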