Posted to general@hadoop.apache.org by "Xiaoyi Lu@cse.osu" <lu...@cse.ohio-state.edu> on 2015/05/27 08:09:41 UTC
[HiBD] Announcing the release of RDMA for Apache Hadoop-2.x 0.9.7
The High-Performance Big Data (HiBD) team is pleased to announce the
release of the RDMA for Apache Hadoop-2.x 0.9.7 package (for the
Hadoop 2.x series) with the following features.
New features compared to RDMA for Apache Hadoop-2.x 0.9.6 are:
- Compliant with Apache Hadoop 2.6.0 and Hortonworks Data
  Platform (HDP) 2.2.0.0 APIs and applications
- Plugin-based architecture supporting RDMA-based designs for
  HDFS (HHH, HHH-M, HHH-L), MapReduce, MapReduce over Lustre
  and RPC, etc.
    - Plugin for Apache Hadoop distribution (tested with 2.6.0)
    - Plugin for Hortonworks Data Platform (HDP) (tested with 2.2.0.0)
- Supports deploying Hadoop with Slurm and PBS in different
  running modes (HHH, HHH-M, HHH-L, and MapReduce over Lustre)
The complete set of features for RDMA for Apache Hadoop-2.x 0.9.7:
- Based on Apache Hadoop 2.6.0
- High performance design with native InfiniBand and RoCE support
  at the verbs level for HDFS, MapReduce, and RPC components
- Compliant with Apache Hadoop 2.6.0 and Hortonworks Data
  Platform (HDP) 2.2.0.0 APIs and applications
- Plugin-based architecture supporting RDMA-based designs for
  HDFS (HHH, HHH-M, HHH-L), MapReduce, MapReduce over Lustre
  and RPC, etc.
    - Plugin for Apache Hadoop distribution (tested with 2.6.0)
    - Plugin for Hortonworks Data Platform (HDP) (tested with 2.2.0.0)
- Supports deploying Hadoop with Slurm and PBS in different
  running modes (HHH, HHH-M, HHH-L, and MapReduce over Lustre)
- Easily configurable for different running modes (HHH, HHH-M, HHH-L,
  and MapReduce over Lustre) and different protocols (native InfiniBand,
  RoCE, and IPoIB)
- On-demand connection setup
- HDFS over native InfiniBand and RoCE
    - RDMA-based write
    - RDMA-based replication
    - Parallel replication support
    - Overlapping in different stages of write and replication
    - Enhanced hybrid HDFS design with in-memory and heterogeneous
      storage (HHH)
        - Supports three modes of operation
            - HHH (default) with I/O operations over RAM disk, SSD, and HDD
            - HHH-M (in-memory) with I/O operations in-memory
            - HHH-L (Lustre-integrated) with I/O operations in local
              storage and Lustre
        - Policies to efficiently utilize heterogeneous storage
          devices (RAM Disk, SSD, HDD, and Lustre)
            - Greedy and Balanced policies support
            - Automatic policy selection based on available storage types
        - Hybrid replication (in-memory and persistent storage) for
          HHH default mode
        - Memory replication (in-memory only with lazy persistence) for
          HHH-M mode
        - Lustre-based fault-tolerance for HHH-L mode
            - No HDFS replication
            - Reduced local storage space usage
- MapReduce over native InfiniBand and RoCE
    - RDMA-based shuffle
    - Pre-fetching and caching of map output
    - In-memory merge
    - Advanced optimization in overlapping
        - map, shuffle, and merge
        - shuffle, merge, and reduce
    - Optional disk-assisted shuffle
    - High performance design of MapReduce over Lustre
        - Supports two shuffle approaches
            - Lustre read based shuffle
            - RDMA based shuffle
        - Hybrid shuffle based on both shuffle approaches
            - Configurable distribution support
        - In-memory merge and overlapping of different phases
- RPC over native InfiniBand and RoCE
    - JVM-bypassed buffer management
    - RDMA or send/recv based adaptive communication
    - Intelligent buffer allocation and adjustment for serialization
- Tested with
    - Mellanox InfiniBand adapters (DDR, QDR, and FDR)
    - RoCE support with Mellanox adapters
    - Various multi-core platforms
    - RAM Disks, SSDs, HDDs, and Lustre
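
The feature list above mentions Greedy and Balanced policies for placing
data on heterogeneous storage (RAM Disk, SSD, HDD) but does not define
them. The following is only a minimal sketch under assumed semantics, not
the HiBD implementation: "greedy" is taken to mean filling the fastest
tier first, and "balanced" is taken to mean always choosing the tier with
the most free space. The names Tier, greedy_place, balanced_place, and
example_tiers are hypothetical and exist only for this illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Tier:
    """Hypothetical storage tier; not a class from the RDMA for Apache Hadoop package."""
    name: str         # e.g. "RAM_DISK", "SSD", "HDD"
    free_blocks: int  # remaining capacity, counted in blocks


def greedy_place(tiers: List[Tier], n_blocks: int) -> List[str]:
    """Assumed 'greedy' policy: fill the fastest tier first, then spill over.

    `tiers` must be ordered from fastest to slowest.
    """
    placement = []
    for _ in range(n_blocks):
        for tier in tiers:
            if tier.free_blocks > 0:
                tier.free_blocks -= 1
                placement.append(tier.name)
                break
        else:
            # The inner loop found no tier with free space.
            raise RuntimeError("no storage capacity left")
    return placement


def balanced_place(tiers: List[Tier], n_blocks: int) -> List[str]:
    """Assumed 'balanced' policy: always pick the tier with the most free space.

    Ties go to the tier listed first (the faster one).
    """
    placement = []
    for _ in range(n_blocks):
        tier = max(tiers, key=lambda t: t.free_blocks)
        if tier.free_blocks == 0:
            raise RuntimeError("no storage capacity left")
        tier.free_blocks -= 1
        placement.append(tier.name)
    return placement


def example_tiers() -> List[Tier]:
    """Fresh example tiers, fastest first; the capacities are made up."""
    return [Tier("RAM_DISK", 2), Tier("SSD", 2), Tier("HDD", 4)]


if __name__ == "__main__":
    print("greedy:  ", greedy_place(example_tiers(), 5))
    print("balanced:", balanced_place(example_tiers(), 4))
```

Under these assumptions, the greedy variant favors the fastest devices
until they fill up, while the balanced variant trades some speed for a
more even use of capacity. The actual policies and their configuration
are described in the user guide.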
To download the RDMA for Apache Hadoop-2.x 0.9.7 package and the
associated user guide, please visit the following URL:
http://hibd.cse.ohio-state.edu
Sample performance numbers for benchmarks using RDMA for Apache
Hadoop-2.x 0.9.7 can be viewed on the `Performance' tab of the
above website.
All questions, feedback, and bug reports are welcome. Please post them
to the rdma-hadoop-discuss mailing list (rdma-hadoop-discuss at
cse.ohio-state.edu).
Thanks,
The High-Performance Big Data (HiBD) Team