Posted to user@hadoop.apache.org by Arjun Bakshi <ba...@mail.uc.edu> on 2014/08/11 17:39:02 UTC
Examining effect of changing block placement policy.
Hi,
I've made some changes to the default block placement policy and want to
see how they affect a cluster. Any suggestions on how I can compare the
cluster's behavior before and after making these changes?
I read up a bit on Rumen and GridMix in my search for tools that would
help me benchmark things on a cluster. As far as I know, I need some job
traces to get the ball rolling. I've googled for sample job traces but
didn't find anything. I found this page:
http://ftp.pdl.cmu.edu/pub/datasets/hla/dataset.html but I'm not sure
how to use the data there.
I don't have a ton of data, or a bunch of queries I could run on it. My
best idea so far is to run a bunch of sorts on different input sizes,
and word counts on different combinations of files, with jobs submitted
at exponentially distributed inter-arrival times. I'm planning to do
this on AWS EC2's free tier.
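To make the workload idea concrete, here is a rough Python sketch of how such a job schedule could be generated. It is only an illustration: the function name build_schedule, the job labels, the mean gap, and the input sizes are all made up for the example and do not come from any Hadoop tool. It only produces a schedule; actually submitting the jobs is left out.

```python
import random


def build_schedule(num_jobs, mean_gap_s, input_sizes_mb, seed=0):
    """Return a list of (start_time_s, job_type, input_size_mb) tuples.

    Gaps between consecutive jobs are drawn from an exponential
    distribution with mean `mean_gap_s` seconds. All names and values
    here are hypothetical and chosen only for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the schedule is repeatable
    schedule = []
    t = 0.0
    for _ in range(num_jobs):
        # expovariate takes the rate (1 / mean), not the mean itself
        t += rng.expovariate(1.0 / mean_gap_s)
        job_type = rng.choice(["sort", "wordcount"])
        size_mb = rng.choice(input_sizes_mb)
        schedule.append((t, job_type, size_mb))
    return schedule


if __name__ == "__main__":
    for start, job, size in build_schedule(10, 60.0, [64, 256, 1024]):
        print(f"{start:8.1f}s  {job:<9}  {size} MB")
```

Keeping the seed fixed means the identical schedule can be replayed against the cluster before and after the policy change, so any difference in run times is less likely to come from the workload itself.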
Any suggestions on how to observe the effects of changing the policy
would be appreciated.
Thank you,
Arjun