Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2011/12/22 04:44:49 UTC
[Hadoop Wiki] Update of "FrontPage" by jiuzheyang

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The "FrontPage" page has been changed by jiuzheyang:
http://wiki.apache.org/hadoop/FrontPage?action=diff&rev1=290&rev2=291

+ [http://www.thomassabostockists.com/ thomas sabo] is one of the famous jewelry brand. [http://www.thomassabostockists.com/ thomas sabo charms] is one of the most popular series. [http://www.thomassabostockists.com/ thomas sabo uk] give you more choice.
- = Apache Hadoop =
- [[http://hadoop.apache.org/|Apache Hadoop]] is a framework for running applications on large clusters built of commodity hardware. The Hadoop framework transparently provides applications with both reliability and data motion. Hadoop implements a computational paradigm named [[HadoopMapReduce|Map/Reduce]], where the application is divided into many small fragments of work, each of which may be executed or re-executed on any node in the cluster. In addition, it provides a distributed file system ([[DFS|HDFS]]) that stores data on the compute nodes, providing very high aggregate bandwidth across the cluster. Both MapReduce and the Hadoop Distributed File System are designed so that node failures are automatically handled by the framework.
  
- == General Information ==
-  * [[http://hadoop.apache.org/|Official Apache Hadoop Website]]: download, bug-tracking, mailing-lists, etc.
-  * [[ProjectDescription|Overview]] of Apache Hadoop
-  * [[FAQ]] Frequently Asked Questions.
-  * [[HadoopIsNot|What Hadoop is not]]
-  * [[Distributions and Commercial Support]] for Hadoop (RPMs, Debs, AMIs, etc)
-  * [[HadoopPresentations|Presentations]], [[Books|books]], [[HadoopArticles|articles]] and [[Papers|papers]] about Hadoop
-  * PoweredBy, a growing list of sites and applications powered by Apache Hadoop
-  * Support
-   * [[Help|Getting help from the hadoop community]].
-   * [[Support|People and companies for hire]].
-  * [[Conferences|Hadoop Community Events and Conferences]]
-   * HadoopUserGroups (HUGs)
-   * HadoopSummit
-   * HadoopWorld
-   * HadoopMeetupAtApacheCon
- 
- === Related-Projects ===
-  * [[HBase]], a Bigtable-like structured storage system for Hadoop HDFS
-  * [[http://wiki.apache.org/pig/|Apache Pig]] is a high-level data-flow language and execution framework for parallel computation. It is built on top of Hadoop Core.
-  * [[Hive]] a data warehouse infrastructure which allows sql-like adhoc querying of data (in any format) stored in Hadoop
-  * ZooKeeper is a high-performance coordination service for distributed applications.
-  * [[http://wiki.apache.org/hama|Hama]], a distributed computing framework similar to Google's Pregel, based on BSP (Bulk Synchronous Parallel) computing techniques for massive scientific computations.
-  * [[http://lucene.apache.org/mahout|Mahout]], scalable Machine Learning algorithms using Hadoop
- 
- == User Documentation ==
-  * [[HadoopJavaVersions|Available Java Runtime Environments for Hadoop]]
-  * ImportantConcepts
-  * GettingStartedWithHadoop (lots of details and explanation)
-  * QuickStart (for those who just want it to work ''now'')
-  * [[http://hadoop.apache.org/core/docs/current/commands_manual.html|Command Line Options]] for the Hadoop shell scripts.
-  * [[HadoopOverview|Hadoop Code Overview]]
-  * [[TroubleShooting|Troubleshooting]] What to do when things go wrong
- 
- === Setting up a Hadoop Cluster ===
-  * [[Setup|Setting up a Hadoop Cluster]]
-  * [[Running_Hadoop_On_OS_X_10.5_64-bit_(Single-Node_Cluster)]]
-  * HowToConfigure Hadoop software
-  * [[WebApp_URLs|WebApps for monitoring your system]]
-  * [[NameNodeFailover|How to handle name node failure]]
-  * [[GangliaMetrics|How to get metrics into ganglia]]
-  * [[LargeClusterTips|Tips for managing a large cluster]]
-  * [[DiskSetup|Disk Setup: some suggestions]]
-  * [[PerformanceTuning|Performance:]] getting extra throughput
-  * [[topology_rack_awareness_scripts|Topology Scripts / Rack Awareness]]
- * [[http://www.buzzbacklinks.com|buy backlink]]
- * [[http://www.buzzbacklinks.com|backlink services]]
- 
-  * Virtual Clusters including Amazon AWS
-   * [[Virtual Hadoop]] -the theory
-   * How to set up a [[VirtualCluster|Virtual Cluster]]
-   * Running Hadoop on [[AmazonEC2]]
-   * Running Hadoop with AmazonS3
- 
- 
- === Tutorials ===
-  * [[Running_Hadoop_On_Ubuntu_Linux_(Single-Node_Cluster)]] A tutorial on installing, configuring and running Hadoop on a single Ubuntu Linux machine.
-  * [[http://www.cloudera.com/hadoop-training-basic|Cloudera basic training]]
-  * [[http://v-lad.org/Tutorials/Hadoop/00%20-%20Intro.html|Hadoop Windows/Eclipse Tutorial]]: How to develop Hadoop with Eclipse on Windows.
-  * [[http://developer.yahoo.com/hadoop/tutorial/|Yahoo! Hadoop Tutorial]]: Hadoop setup, HDFS, and [[HadoopMapReduce|MapReduce]]
- 
- === MapReduce ===
- The MapReduce algorithm is the foundational algorithm of Hadoop, and is critical to understand.
-  * HadoopMapReduce
-  * HadoopMapRedClasses
-  * HowManyMapsAndReduces
-  * TaskExecutionEnvironment
-  * HowToDebugMapReducePrograms
- 
-  * Examples
-   * WordCount
-   * [[PythonWordCount|Python Word Count]]
-   * [[C++WordCount|C/C++ Word Count]]
-   * [[Grep]]
-   * [[Sort]]
-   * RandomWriter
-   * [[HadoopDfsReadWriteExample|How to read from and write to HDFS]]
- 
-  * Benchmarks
-   * [[HardwareBenchmarks|Hardware benchmarks]]
-   * [[DataProcessingBenchmarks|Data processing benchmarks]]
- 
- 
- == Contributed parts of the Hadoop codebase ==
-  These are independent modules that are in the Hadoop codebase but not tightly integrated with the main project -yet.
-   * HadoopStreaming (Useful for using Hadoop with other programming languages)
-   * DistributedLucene, a Proposal for a distributed Lucene index in Hadoop
-   * [[MountableHDFS]], Fuse-DFS & other Tools to mount HDFS as a standard filesystem on Linux (and some other Unix OSs)
-   * [[HDFS-APIs]] in Perl, Python, PHP and other languages.
-   * [[Chukwa]] a data collection, storage, and analysis framework
-   * [[EclipsePlugIn|The Apache Hadoop Plugin for Eclipse]] (An Eclipse plug-in that simplifies the creation and deployment of MapReduce programs with an HDFS Administrative feature)
-   * [[HDFS-RAID]] Erasure Coding in HDFS
- 
- == Developer Documentation ==
-  * [[Roadmap]], listing release plans.
-  * HowToContribute
-  * HowToDevelopUnitTests
-  * HowToUseInjectionFramework
-  * HowToUseSystemTestFramework
-  * HowToSetupYourDevelopmentEnvironment
-  * HowToUseConcurrencyAnalysisTools
-  * [[HowToUseJCarder]]
-  * [[CodeReviewChecklist|HowToCodeReview]]
-  * [[Jira]] usage guidelines
-  * HowToCommit
-  * HowToRelease
-  * HudsonBuildServer
-  * HowToSetupUbuntuBuildMachine
-  * DevelopmentHints
-  * ProjectSuggestions
-  * [[HadoopUnderIDEA|Building/Testing under IntelliJ IDEA]]
-  * [[GitAndHadoop|Git And Hadoop]]
-  * ProjectSplit
- 
- == Related Resources ==
-  * [[http://wiki.apache.org/nutch/NutchHadoopTutorial|Nutch Hadoop Tutorial]] (Useful for understanding Hadoop in an application context)
-  * [[http://www.alphaworks.ibm.com/tech/mapreducetools|IBM MapReduce Tools for Eclipse]] - Out of date. Use the Eclipse Plugin in the MapReduce/Contrib instead
-  * Hadoop IRC channel is #hadoop at irc.freenode.net.
-  * [[http://www.tom-doehler.de/wordpress/index.php/2007/12/19/spring-and-hadoop/|Using Spring and Hadoop]] (Discussion of possibilities to use Hadoop and Dependency Injection with Spring)
-  * [[http://www.wheregridenginelives.com/content/big-data-big-compute-grid-engine-and-hadoop-0|Univa Grid Engine Integration]] A blog post about the integration of Hadoop with the Grid Engine successor Univa Grid Engine
-  * [[http://philippeadjiman.com/blog/the-hadoop-tutorial-series/|Hadoop Tutorial Series]] Learning progressively important core Hadoop concepts with hands-on experiments using the Cloudera Virtual Machine
-  * [[http://pydoop.sourceforge.net|Pydoop]] A Python MapReduce and HDFS API for Hadoop.
-  * [[https://github.com/klbostee/dumbo/wiki|Dumbo]] Dumbo is a project that allows you to easily write and run Hadoop programs in Python.
-  * [[http://www.asterdata.com/news/091001-Aster-Hadoop-connector.php|Hadoop distributed file system]] New Hadoop Connector Enables Ultra-Fast Transfer of Data between Hadoop and Aster Data's MPP Data Warehouse.
-  * [[CUDA On Hadoop|Hadoop + CUDA]]
-  * [[http://kazman.shidler.hawaii.edu/ArchDoc.html|HDFS Architecture Documentation]] An overview of the HDFS architecture, intended for contributors.
- 
- ----
- CategoryHomepage
-