Posted to dev@hbase.apache.org by Mingjie Lai <mi...@trendmicro.com> on 2011/01/21 21:40:09 UTC

YCSB tests for HBase on Whirr (was: Report to Apache board: first cut)

Guys,
There is a discussion about testing HBase with YCSB on Whirr or EC2.
I'm sending it to @dev so more people can get involved.

Lars,
I have an automated YCSB test for HBase running on EC2. It was derived
from Andy and Eugene's HBase EC2 scripts. What I added:
- YCSB test support
- building and uploading a new HBase jar, triggered by SCM (git) changes
- emailing YCSB test results to configured recipients
- running automatically as a daily cron job
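
The change-triggered build in the list above could be sketched roughly as follows. This is only an illustration, not the code from the hbase-ec2 scripts: the function names and the state file (for example `last_built_rev.txt`) are made up, and the only external command assumed is `git rev-parse HEAD`.

```python
import subprocess
from pathlib import Path


def current_revision(repo_dir):
    """Return the commit id that HEAD points at in repo_dir."""
    out = subprocess.run(
        ["git", "rev-parse", "HEAD"],
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    return out.stdout.strip()


def needs_rebuild(head_rev, state_file):
    """True when head_rev differs from the revision recorded in state_file."""
    path = Path(state_file)
    if not path.exists():
        return True  # nothing has been built yet
    return path.read_text().strip() != head_rev


def record_build(head_rev, state_file):
    """Remember the revision that was just built and uploaded."""
    Path(state_file).write_text(head_rev + "\n")
```

A daily cron job could call `current_revision()` on the checkout, skip the jar build when `needs_rebuild()` returns False, and call `record_build()` after a successful upload.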

You can take a look at https://github.com/mlai/hbase-ec2/tree/ycsb for
more detail.

We do want to move the scripts to Whirr, but right now we lack the
resources to do the job. It also seems that a Whirr HBase bug has been
reported, although I haven't checked the details. So there is no
further progress toward Whirr support right now.

 >> Reporting back the results will be a bit more challenging as usually
 >> you spin down the cluster at the end.
I also struggled with the best way to present the results of an
automated test. I picked the simplest way -- sending the results by
email -- so that I can avoid the problem of saving the data
somewhere.

But it could be extended to support Hudson. Right now it downloads the
result files after the YCSB tests finish and parses them locally,
where I grab the details to use as the email content. I think
Hudson can use the same files to present results.
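
As a rough sketch of that parse-then-email step (not the actual script): the code below assumes the result file contains YCSB's usual summary lines of the form `[SECTION], Metric, value`, for example `[OVERALL], Throughput(ops/sec), 98.9`. The function names, subject line and addresses are hypothetical.

```python
from email.message import EmailMessage


def parse_ycsb_summary(text):
    """Collect '[SECTION], Metric, value' lines into {section: {metric: value}}."""
    summary = {}
    for line in text.splitlines():
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 3 or not parts[0].startswith("["):
            continue  # not a summary line
        section, metric, value = parts
        if metric.isdigit() or metric.startswith(">"):
            continue  # latency histogram bucket such as "[READ], 0, 997"
        try:
            number = float(value)
        except ValueError:
            continue
        summary.setdefault(section.strip("[]"), {})[metric] = number
    return summary


def build_report(summary, sender, recipients):
    """Turn the parsed summary into a plain-text email message."""
    msg = EmailMessage()
    msg["Subject"] = "YCSB test results"
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    lines = []
    for section in sorted(summary):
        for metric in sorted(summary[section]):
            lines.append(f"{section:10s} {metric:28s} {summary[section][metric]:.2f}")
    msg.set_content("\n".join(lines))
    return msg
```

The resulting message could be handed to `smtplib.SMTP(...).send_message(msg)`, and the same parsed dictionary could just as well be written to a file for Hudson to pick up.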

 >> And we do
 >> not want to keep the cluster running unnecessarily for a built-in web
 >> interface to browse the results etc.
Totally agree, we want to terminate the cluster as soon as the test
finishes.

Here is an example of a test result:
http://pastebin.com/f08bRCkY

What do you think, Lars?

Thanks,
Mingjie


-------- Original Message --------
Subject: 	Re: Report to Apache board: first cut
Date: 	Fri, 21 Jan 2011 09:46:46 -0800
From: 	Stack <st...@duboce.net>

	

	


+1 to Todd's suggestion (and change subject -- smile)
St.Ack

On Fri, Jan 21, 2011 at 8:19 AM, Todd Lipcon<to...@cloudera.com>  wrote:
>  Should we move this discussion to the dev list at large?
>
>  Our QA team is also starting to look at at least smoke testing HBase on a
>  cluster. We should coordinate efforts!
>
>  On Fri, Jan 21, 2011 at 12:56 AM, Lars George<la...@gmail.com>  wrote:
>
>>  Hi Andy,
>>
>>  I assumed as much from our previous conversations. I sent Eugene the
>>  details on Whirr and using HBase with it. Unfortunately, JClouds
>>  cannot yet ship the scripts from the local directory, but
>>  that is coming soon. In the meantime we need to use a "public" S3
>>  based repo that has a copy. He had that set up last time we got HBase
>>  running together using Whirr. I think he is pretty much set, we simply
>>  need to add a specific "test" role that allows us to start the cluster
>>  and when "test" is part of the template we can not only start the
>>  cluster but invoke whatever test we need. In effect we could have
>>  "test-ycsb-basic", "test-ycsb-workload-5050", "test-mvn-test" (for the
>>  built-in tests) and so on to start this. That has the advantage of
>>  being able to use various templates to test different cluster setups
>>  against equally different test scenarios.
>>
>>  Reporting back the results will be a bit more challenging as usually
>>  you spin down the cluster at the end. We could grab whatever the test
>>  results are and upload them back to an S3 repo or so? I am not sure if
>>  there is a common interface for that which would make sense given
>>  YCSB! and the Surefire reports are different end results. And we do
>>  not want to keep the cluster running unnecessarily for a built-in web
>>  interface to browse the results etc. Nice would be some Hudson
>>  integration which would spin up clusters and then retain the test
>>  results? Sorry for not having a clear idea here, though I assume you
>>  already have a much better plan, so just throwing it out there.
>>
>>  If this makes sense I could also add those tests into the Whirr HBase
>>  service itself so that it gets shipped with Whirr for everyone to
>>  execute. That way the test scripts would evolve with the project.
>>
>>  Eugene and Mingjie, what is your take on this? Looking forward to hearing from
>>  you.
>>
>>  Regards,
>>  Lars
>>
>>  On Fri, Jan 21, 2011 at 1:35 AM, Andrew Purtell<ap...@apache.org>
>>  wrote:
>>  >  I've talked with our guys about doing exactly this Lars.
>>  >
>>  >  Best regards,
>>  >
>>  >      - Andy
>>  >
>>  >  Problems worthy of attack prove their worth by hitting back.
>>  >    - Piet Hein (via Tom White)
>>  >
>>  >
>>  >  --- On Tue, 1/18/11, Lars George<la...@gmail.com>  wrote:
>>  >
>>  >>  From: Lars George<la...@gmail.com>
>>  >>  Subject: Re: Report to Apache board: first cut
>>  >>  To: private@hbase.apache.org
>>  >>  Date: Tuesday, January 18, 2011, 12:23 PM
>>  >>  I would love to chime in and help but
>>  >>  am in Israel on a customer stint
>>  >>  working 12 hour days.
>>  >>
>>  >>  My plan is to use Whirr and a custom init script to automate testing
>>  >>  of HBase on a dynamic, on-demand cluster. I need good tests though
>>  >>  besides the junit ones. I would love to run something more useful,
>>  >>  could be YCSB! or some such. Could you send me what you are usually
>>  >>  using so I could put this all together so that others can do burn-ins
>>  >>  as well?
>>  >>
>>  >>  Thanks,
>>  >>  Lars
>>  >
>>  >
>>  >
>>  >
>>  >
>>
>
>
>
>  --
>  Todd Lipcon
>  Software Engineer, Cloudera
>


TREND MICRO EMAIL NOTICE
The information contained in this email and any attachments is confidential and may be subject to copyright or other intellectual property protection. If you are not the intended recipient, you are not authorized to use or disclose this information, and we request that you notify us by reply mail or telephone and delete the original message from your mail system.

Re: YCSB tests for HBase on Whirr (was: Report to Apache board: first cut)

Posted by Lars George <la...@gmail.com>.
Hi Mingjie,

Let me look into this tomorrow, but I assume it will be no problem
to port those over to Whirr, since porting the original EC2 scripts
was equally easy. I will see if I can create a JIRA for the idea of
adding template roles for various tests.

I'll keep you posted. Thank you for the update and the work!

Lars


Re: YCSB tests for HBase on Whirr (was: Report to Apache board: first cut)

Posted by Himanshu Vashishtha <hv...@cs.ualberta.ca>.
Hello Mingjie,
This comes at a very apt time for me. I will be evaluating HBase on EC2
using YCSB, and will run MapReduce jobs there. For instance, I
will evaluate some simple aggregation ones (1512) with MapReduce jobs, coprocessors,
and pure HBase APIs (like Scan + client-side processing).

I have things running locally and will move to EC2 pretty soon (by today).
Right now I have zero experience with setting up HBase on EC2. I may be bugging you
guys in case I get stuck. :)

Thanks,
Himanshu


Re: YCSB tests for HBase on Whirr (was: Report to Apache board: first cut)

Posted by Lars George <la...@gmail.com>.
Let me ask around internally (with the Hudson gods).

On Mon, Jan 24, 2011 at 12:48 AM, Andrew Purtell <ap...@apache.org> wrote:
> A Hudson plugin that uses Whirr to dynamically build an HBase cluster and run YCSB, then present the results, and also fail the build on configurable out-of-range values ... this would be super awesome.
>
>  - Andy
>
>
> --- On Fri, 1/21/11, Ted Dunning <td...@maprtech.com> wrote:
>
>> From: Ted Dunning <td...@maprtech.com>
>> Subject: Re: YCSB tests for HBase on Whirr (was: Report to Apache board: first cut)
>> To: dev@hbase.apache.org
>> Date: Friday, January 21, 2011, 2:43 PM
>> Nice work!

Re: YCSB tests for HBase on Whirr (was: Report to Apache board: first cut)

Posted by Andrew Purtell <ap...@apache.org>.
A Hudson plugin that uses Whirr to dynamically build an HBase cluster and run YCSB, then present the results, and also fail the build on configurable out-of-range values ... this would be super awesome.

  - Andy
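
To make the "fail the build on configurable out-of-range values" idea concrete, here is a minimal sketch. It is not an existing Hudson plugin or API: the function name, the shape of the `limits` table, and the metric names are all assumptions, and the results are taken to be a nested dict of the form {section: {metric: value}}.

```python
def out_of_range(summary, limits):
    """Return a list of violations; an empty list means the build passes.

    limits maps (section, metric) to a (low, high) pair; None disables a bound.
    """
    violations = []
    for (section, metric), (low, high) in limits.items():
        value = summary.get(section, {}).get(metric)
        if value is None:
            violations.append(f"{section}/{metric}: missing from results")
        elif low is not None and value < low:
            violations.append(f"{section}/{metric}: {value} is below {low}")
        elif high is not None and value > high:
            violations.append(f"{section}/{metric}: {value} is above {high}")
    return violations
```

A build step could print the returned list and exit with a non-zero status when it is not empty, which is enough for a CI server to mark the run as failed.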


--- On Fri, 1/21/11, Ted Dunning <td...@maprtech.com> wrote:

> From: Ted Dunning <td...@maprtech.com>
> Subject: Re: YCSB tests for HBase on Whirr (was: Report to Apache board: first cut)
> To: dev@hbase.apache.org
> Date: Friday, January 21, 2011, 2:43 PM
> Nice work!
> 
> On Fri, Jan 21, 2011 at 12:40 PM, Mingjie Lai <mi...@trendmicro.com>wrote:
> 
> > Guys.
> > There is a discussion regarding testing HBASE with
> YCSB on Whirr or EC2.
> > Send to @dev so more people can be involved.
> >
> > Lars.
> > I have an automatic YCSB test for HBase running on
> EC2. It was derived from
> > Andy and Eugene's HBase EC2 script. What I added
> include:
> > - YCSB test support
> > - build and upload new HBase jar triggered by SCM(git)
> changes
> > - email YCSB test results to configured recipients
> > - automatically running as a daily cron job
> >
> > You can take a look at: https://github.com/mlai/hbase-ec2/tree/ycsb for
> > more detail.
> >
> > We do want to move the script to support Whirr, but
> right now we're lack of
> > resources to do the job. Also It seems there is a
> Whirr HBase bug reported
> > although I haven't exactly checked the detail. So
> there is no further
> > progress toward Whirr support right now.
> >
> > >> Reporting back the results will be a bit more
> challenging as usually
> > >> you spin down the cluster at end.
> > I was also bothered a lot for what could be best way
> to present the result
> > from an automatic test. I picked the simplest way --
> sending result by
> > emails, so that I can avoid the problem to save the
> data to somewhere.
> >
> > But it could be extended to support Hudson. Right now
> it downloads the
> > result files locally after YCSB tests finished, and
> parse the result locally
> > where I grab the detail of results as email contents.
> I think hudson can use
> > the same files to present results.
> >
> > >> And we do
> > >> not want to keep the cluster running
> unnecessarily for a build in web
> > >> interface to browse the results etc.
> > Totally agree, we want to terminate the cluster as
> soon as the test
> > finished.
> >
> > Here is an example of a test result:
> > http://pastebin.com/f08bRCkY
> >
> > What do you think, Lars?
> >
> > Thanks,
> > Mingjie
> >
> >
> > -------- Original Message --------
> > Subject:        Re: Report to
> Apache board: first cut
> > Date:   Fri, 21 Jan 2011 09:46:46
> -0800
> > From:   Stack <st...@duboce.net>
> >
> >
> >
> >
> >
> >
> > +1 to Todd suggestion (and change subject -- smile)
> > St.Ack
> >
> > On Fri, Jan 21, 2011 at 8:19 AM, Todd Lipcon<to...@cloudera.com> 
> wrote:
> >
> >>  Should we move this discussion to the dev
> list at large?
> >>
> >>  Our QA team is also starting to look at at
> least smoke testing HBase on a
> >>  cluster. We should coordinate efforts!
> >>
> >>  On Fri, Jan 21, 2011 at 12:56 AM, Lars
> George<la...@gmail.com>
> >>  wrote:
> >>
> >>   Hi Andy,
> >>>
> >>>  I assumed as much from our previous conversations. I sent Eugene the
> >>>  details on Whirr and using HBase with it. Unfortunately, JClouds
> >>>  cannot yet ship the scripts from the local directory, but
> >>>  that is coming soon. In the meantime we need to use a "public" S3
> >>>  based repo that has a copy. He had that set up last time we got HBase
> >>>  running together using Whirr. I think he is pretty much set; we simply
> >>>  need to add a specific "test" role that allows us to start the cluster
> >>>  and when "test" is part of the template we can not only start the
> >>>  cluster but invoke whatever test we need. In effect we could have
> >>>  "test-ycsb-basic", "test-ycsb-workload-5050", "test-mvn-test" (for the
> >>>  built-in tests) and so on to start this. That has the advantage of
> >>>  being able to use various templates to test different cluster setups
> >>>  against equally different test scenarios.
> >>>
> >>>  Reporting back the results will be a bit more challenging as usually
> >>>  you spin down the cluster at the end. We could grab whatever the test
> >>>  results are and upload them back to an S3 repo or so? I am not sure if
> >>>  there is a common interface for that which would make sense given
> >>>  YCSB! and the Surefire reports are different end results. And we do
> >>>  not want to keep the cluster running unnecessarily for a built-in web
> >>>  interface to browse the results etc. Some Hudson integration that
> >>>  spins up clusters and then retains the test results would be nice.
> >>>  Sorry for not having a clear idea here, though I assume you
> >>>  already have a much better plan, so just throwing it out there.
> >>>
> >>>  If this makes sense I could also add those tests into the Whirr HBase
> >>>  service itself so that it gets shipped with Whirr for everyone to
> >>>  execute. That way the test scripts would evolve with the project.
> >>>
> >>>  Eugene and Mingjie, what is your take on this? Looking forward to
> >>>  hearing from you.
> >>>
> >>>  Regards,
> >>>  Lars
> >>>
> >>>  On Fri, Jan 21, 2011 at 1:35 AM, Andrew Purtell<ap...@apache.org>
> >>>  wrote:
> >>>  >  I've talked with our guys about doing exactly this, Lars.
> >>>  >
> >>>  >  Best regards,
> >>>  >
> >>>  >      - Andy
> >>>  >
> >>>  >  Problems worthy of attack prove their worth by hitting back.
> >>>  >    - Piet Hein (via Tom White)
> >>>  >
> >>>  >
> >>>  >  --- On Tue, 1/18/11, Lars George<la...@gmail.com>  wrote:
> >>>  >
> >>>  >>  From: Lars George<la...@gmail.com>
> >>>  >>  Subject: Re: Report to Apache board: first cut
> >>>  >>  To: private@hbase.apache.org
> >>>  >>  Date: Tuesday, January 18, 2011, 12:23 PM
> >>>  >>  I would love to chime in and help but
> >>>  >>  am in Israel on a customer stint
> >>>  >>  working 12 hour days.
> >>>  >>
> >>>  >>  My plan is to use Whirr and a custom init script to automate testing
> >>>  >>  of HBase on a dynamic, on-demand cluster. I need good tests though
> >>>  >>  besides the junit ones. I would love to run something more useful,
> >>>  >>  could be YCSB! or some such. Could you send me what you are usually
> >>>  >>  using so I could put this all together so that others can do
> >>>  >>  burn-ins as well?
> >>>  >>
> >>>  >>  Thanks,
> >>>  >>  Lars
> >>>  >
> >>>  >
> >>>  >
> >>>  >
> >>>  >
> >>>
> >>>
> >>
> >>
> >>  --
> >>  Todd Lipcon
> >>  Software Engineer, Cloudera
> >>
> >>
> >
> > TREND MICRO EMAIL NOTICE
> > The information contained in this email and any attachments is confidential
> > and may be subject to copyright or other intellectual property protection.
> > If you are not the intended recipient, you are not authorized to use or
> > disclose this information, and we request that you notify us by reply mail
> > or telephone and delete the original message from your mail system.
> >
> 


      

Re: YCSB tests for HBase on Whirr (was: Report to Apache board: first cut)

Posted by Ted Dunning <td...@maprtech.com>.
Nice work!

On Fri, Jan 21, 2011 at 12:40 PM, Mingjie Lai <mi...@trendmicro.com> wrote:

> Guys.
> There is a discussion regarding testing HBase with YCSB on Whirr or EC2.
> Sending to dev@ so more people can be involved.
>
> Lars.
> I have an automatic YCSB test for HBase running on EC2. It was derived from
> Andy and Eugene's HBase EC2 script. What I added includes:
> - YCSB test support
> - building and uploading a new HBase jar, triggered by SCM (git) changes
> - emailing YCSB test results to configured recipients
> - running automatically as a daily cron job
>
> You can take a look at: https://github.com/mlai/hbase-ec2/tree/ycsb for
> more detail.
>
> We do want to move the script to support Whirr, but right now we lack the
> resources to do the job. Also, it seems there is a Whirr HBase bug reported,
> although I haven't checked the details. So there is no further
> progress toward Whirr support right now.
>
> >> Reporting back the results will be a bit more challenging as usually
> >> you spin down the cluster at end.
> I also struggled with finding the best way to present the results
> of an automatic test. I picked the simplest way -- sending the results by
> email, so that I can avoid the problem of saving the data somewhere.
>
> But it could be extended to support Hudson. Right now it downloads the
> result files after the YCSB tests finish and parses them locally,
> where I extract the result details for the email body. I think Hudson can
> use the same files to present results.
>
> >> And we do
> >> not want to keep the cluster running unnecessarily for a build in web
> >> interface to browse the results etc.
> Totally agree, we want to terminate the cluster as soon as the test
> finishes.
>
> Here is an example of a test result:
> http://pastebin.com/f08bRCkY
>
> What do you think, Lars?
>
> Thanks,
> Mingjie
>
>
> -------- Original Message --------
> Subject:        Re: Report to Apache board: first cut
> Date:   Fri, 21 Jan 2011 09:46:46 -0800
> From:   Stack <st...@duboce.net>
>
> +1 to Todd's suggestion (and change the subject -- smile)
> St.Ack
>
> On Fri, Jan 21, 2011 at 8:19 AM, Todd Lipcon<to...@cloudera.com>  wrote:
>
>>  Should we move this discussion to the dev list at large?
>>
>>  Our QA team is also starting to look at at least smoke testing HBase on a
>>  cluster. We should coordinate efforts!
>>
>>  On Fri, Jan 21, 2011 at 12:56 AM, Lars George<la...@gmail.com>
>>  wrote:
>>
>>   Hi Andy,
>>>
>>>  I assumed as much from our previous conversations. I sent Eugene the
>>>  details on Whirr and using HBase with it. Unfortunately, JClouds
>>>  cannot yet ship the scripts from the local directory, but
>>>  that is coming soon. In the meantime we need to use a "public" S3
>>>  based repo that has a copy. He had that set up last time we got HBase
>>>  running together using Whirr. I think he is pretty much set; we simply
>>>  need to add a specific "test" role that allows us to start the cluster
>>>  and when "test" is part of the template we can not only start the
>>>  cluster but invoke whatever test we need. In effect we could have
>>>  "test-ycsb-basic", "test-ycsb-workload-5050", "test-mvn-test" (for the
>>>  built-in tests) and so on to start this. That has the advantage of
>>>  being able to use various templates to test different cluster setups
>>>  against equally different test scenarios.
>>>
>>>  Reporting back the results will be a bit more challenging as usually
>>>  you spin down the cluster at the end. We could grab whatever the test
>>>  results are and upload them back to an S3 repo or so? I am not sure if
>>>  there is a common interface for that which would make sense given
>>>  YCSB! and the Surefire reports are different end results. And we do
>>>  not want to keep the cluster running unnecessarily for a built-in web
>>>  interface to browse the results etc. Some Hudson integration that
>>>  spins up clusters and then retains the test results would be nice.
>>>  Sorry for not having a clear idea here, though I assume you
>>>  already have a much better plan, so just throwing it out there.
>>>
>>>  If this makes sense I could also add those tests into the Whirr HBase
>>>  service itself so that it gets shipped with Whirr for everyone to
>>>  execute. That way the test scripts would evolve with the project.
>>>
>>>  Eugene and Mingjie, what is your take on this? Looking forward to
>>>  hearing from you.
>>>
>>>  Regards,
>>>  Lars
>>>
>>>  On Fri, Jan 21, 2011 at 1:35 AM, Andrew Purtell<ap...@apache.org>
>>>  wrote:
>>>  >  I've talked with our guys about doing exactly this, Lars.
>>>  >
>>>  >  Best regards,
>>>  >
>>>  >      - Andy
>>>  >
>>>  >  Problems worthy of attack prove their worth by hitting back.
>>>  >    - Piet Hein (via Tom White)
>>>  >
>>>  >
>>>  >  --- On Tue, 1/18/11, Lars George<la...@gmail.com>  wrote:
>>>  >
>>>  >>  From: Lars George<la...@gmail.com>
>>>  >>  Subject: Re: Report to Apache board: first cut
>>>  >>  To: private@hbase.apache.org
>>>  >>  Date: Tuesday, January 18, 2011, 12:23 PM
>>>  >>  I would love to chime in and help but
>>>  >>  am in Israel on a customer stint
>>>  >>  working 12 hour days.
>>>  >>
>>>  >>  My plan is to use Whirr and a custom init script to automate testing
>>>  >>  of HBase on a dynamic, on-demand cluster. I need good tests though
>>>  >>  besides the junit ones. I would love to run something more useful,
>>>  >>  could be YCSB! or some such. Could you send me what you are usually
>>>  >>  using so I could put this all together so that others can do
>>>  >>  burn-ins as well?
>>>  >>
>>>  >>  Thanks,
>>>  >>  Lars
>>>  >
>>>  >
>>>  >
>>>  >
>>>  >
>>>
>>>
>>
>>
>>  --
>>  Todd Lipcon
>>  Software Engineer, Cloudera
>>
>>
>
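
The quoted message describes downloading the YCSB result files after a run
and parsing them locally to build the email body. A minimal sketch of that
parsing step follows. It is not the code from the hbase-ec2 scripts; it
assumes YCSB's usual "[SECTION], metric, value" output lines, and the
function names and the sample values are made up for illustration.

```python
from collections import defaultdict


def parse_ycsb_output(text):
    """Group '[SECTION], metric, value' lines into {section: {metric: value}}.

    Assumes the usual YCSB client output layout; anything that does not
    match (log noise, banners, blank lines) is skipped.
    """
    results = defaultdict(dict)
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("["):
            continue
        parts = [p.strip() for p in line.split(",")]
        if len(parts) != 3:
            continue
        section, metric, raw = parts
        try:
            value = float(raw)
        except ValueError:
            continue  # value column is not numeric; ignore the line
        results[section.strip("[]")][metric] = value
    return dict(results)


def format_summary(results):
    """Render the parsed results as plain text, e.g. for an email body."""
    lines = []
    for section in sorted(results):
        for metric in sorted(results[section]):
            lines.append("%s %s = %g" % (section, metric, results[section][metric]))
    return "\n".join(lines)


# Hypothetical sample, not taken from a real run.
SAMPLE = """\
YCSB Client 0.1
[OVERALL], RunTime(ms), 10110.0
[OVERALL], Throughput(ops/sec), 98.9
[UPDATE], Operations, 500
[UPDATE], AverageLatency(ms), 4.2
"""

if __name__ == "__main__":
    print(format_summary(parse_ycsb_output(SAMPLE)))
```

Because the summary is built from files that already sit on the local
machine, the same files could be handed to Hudson or uploaded to S3 after
the cluster has been terminated.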