Posted to common-user@hadoop.apache.org by "Dai, Jason" <ja...@intel.com> on 2012/10/25 14:38:28 UTC

Release of HiBench 2.2 (a Hadoop benchmark suite)

Hi,

I would like to announce the availability of HiBench 2.2 at https://github.com/intel-hadoop/hibench. Since the release of HiBench 2.1, we have received a lot of good feedback, and HiBench 2.2 updates v2.1 based on that feedback, including:

1)      Build automatic data generators for the Nutch indexing and Bayesian classification workloads. In HiBench 2.1 these workloads used fixed input data sets and could not easily be scaled up or down.

2)      Change the PageRank workload to the implementation contained in the Pegasus project (http://www.cs.cmu.edu/~pegasus/). The previous PageRank workload in HiBench 2.1 came from Mahout 0.6 and could run out of memory with large input data; Mahout has also since dropped support for PageRank (see MAHOUT-1049<https://issues.apache.org/jira/browse/MAHOUT-1049>).

3)      Upgrade the machine learning workloads (K-means clustering and Bayesian classification) to Mahout 0.7, which fixes many bugs in Mahout 0.6 (the version used in HiBench 2.1).

Thanks,
-Jason

_____________________________________________
From: Dai, Jason
Sent: Thursday, June 14, 2012 12:27 AM
To: common-user@hadoop.apache.org
Subject: Open source of HiBench 2.1 (a Hadoop benchmark suite)


Hi,

HiBench, a Hadoop benchmark suite developed by Intel, is used extensively for Hadoop benchmarking, tuning and optimization, both inside Intel and by our customers and partners. It consists of a set of representative Hadoop programs, including both micro-benchmarks and more "real world" applications (e.g., search, machine learning and Hive queries).

We have made HiBench 2.1 available under the Apache License 2.0 at https://github.com/hibench/HiBench-2.1, and would like to get your feedback on how it can be further improved. BTW, please stop by the Intel booth if you are at Hadoop Summit, so that we can have more interactive discussions on both HiBench and HiTune (our Hadoop performance analyzer, open sourced at https://github.com/hitune/hitune).

Thanks,
-Jason