Posted to general@hadoop.apache.org by "Xiaoyi Lu@cse.osu" <lu...@cse.ohio-state.edu> on 2015/03/24 01:08:04 UTC
[HiBD] Announcing the release of RDMA for Apache Hadoop-2.x 0.9.6
The High-Performance Big Data (HiBD) team is pleased to announce the
release of the RDMA for Apache Hadoop-2.x 0.9.6 package (for the
Hadoop 2.x series) with the following features.
* RDMA for Apache Hadoop-2.x 0.9.6 Features
- Based on Apache Hadoop 2.6.0
- High performance design with native InfiniBand and RoCE support
  at the verbs level for HDFS, MapReduce, and RPC components
- Compliant with Apache Hadoop 2.6.0 APIs and applications
- Easily configurable for different running modes (HHH, HHH-M, HHH-L,
  and MapReduce over Lustre) and different protocols (native InfiniBand,
  RoCE, and IPoIB)
- On-demand connection setup
- HDFS over native InfiniBand and RoCE
    - RDMA-based write
    - RDMA-based replication
    - Parallel replication support
    - Overlapping in different stages of write and replication
    - Enhanced hybrid HDFS design with in-memory and heterogeneous
      storage (HHH)
        - Supports three modes of operations
            - HHH (default) with I/O operations over RAM disk, SSD, and HDD
            - HHH-M (in-memory) with I/O operations in-memory
            - HHH-L (Lustre-integrated) with I/O operations in local
              storage and Lustre
        - Policies to efficiently utilize heterogeneous storage
          devices (RAM Disk, SSD, HDD, and Lustre)
            - Greedy and Balanced policies support
            - Automatic policy selection based on available storage types
        - Hybrid replication (in-memory and persistent storage) for
          HHH default mode
        - Memory replication (in-memory only with lazy persistence) for
          HHH-M mode
        - Lustre-based fault-tolerance for HHH-L mode
            - No HDFS replication
            - Reduced local storage space usage
- MapReduce over native InfiniBand and RoCE
    - RDMA-based shuffle
    - Pre-fetching and caching of map output
    - In-memory merge
    - Advanced optimization in overlapping
        - map, shuffle, and merge
        - shuffle, merge, and reduce
    - Optional disk-assisted shuffle
    - High performance design of MapReduce over Lustre
        - Supports two shuffle approaches
            - Lustre read based shuffle
            - RDMA based shuffle
        - Hybrid shuffle based on both shuffle approaches
            - Configurable distribution support
        - In-memory merge and overlapping of different phases
- RPC over native InfiniBand and RoCE
    - JVM-bypassed buffer management
    - RDMA or send/recv based adaptive communication
    - Intelligent buffer allocation and adjustment for serialization
- Tested with
    - Mellanox InfiniBand adapters (DDR, QDR, and FDR)
    - RoCE support with Mellanox adapters
    - Various multi-core platforms
    - RAM Disks, SSDs, HDDs, and Lustre
* Bug Fixes (since RDMA for Apache Hadoop-2.x 0.9.5)
- Fix a hang when running WordCount-like benchmarks
    - Thanks to Amit Sangroya@TCS for reporting the issue
- Fix an issue with the NameNode running in HA-enabled mode
    - Thanks to Qihu Yang@AsiaInfo for reporting the issue
To download the RDMA for Apache Hadoop-2.x 0.9.6 package and the
associated user guide, please visit the following URL:
http://hibd.cse.ohio-state.edu
Sample performance numbers for benchmarks using RDMA for Apache
Hadoop-2.x 0.9.6 can be viewed on the `Performance' tab of the
above website.
All questions, feedback, and bug reports are welcome. Please post them
to the rdma-hadoop-discuss mailing list (rdma-hadoop-discuss at
cse.ohio-state.edu).
Thanks,
The High-Performance Big Data (HiBD) Team