Posted to dev@hbase.apache.org by Jesse Yates <je...@gmail.com> on 2011/10/03 22:55:09 UTC

Speeding up tests

Hey everyone,

There has been a bunch of work recently on speeding up the tests to make
it easier for developers to iterate quickly on new features and fixes. Part of
the problem is that the test suite takes anywhere from 1-2 hrs to run and
has some apparently non-deterministic hanging of tests.

TL;DR To speed it all up, the attack plan would be:
(1) Move long running tests to be integration tests,
(2) Use a build server for patches so people only run unit tests locally,
(3) Add unit tests when integration tests are breaking a lot but unit
tests pass,
(4) Go from forked to single jvm unit tests,
(5) Add in surefire parallelization,
(6) Entertain using HBaseTestingUtilFactory.

I recently chatted with Stack and Doug about ways around this. Here is what
we came up with:

(1) Split the long running tests from the medium and short (2 mins max) tests
and move the former to be IntegrationTests. This was based on Todd's suggestion
in HBASE-4438. Whether we name them integration tests or long running
functional tests, they would become part of the 'mvn verify' rather than the
'mvn test' suite of tests. The starting point would be to use Doug's
spreadsheet from HBASE-4448 and, when it's done, the script from HBASE-4480.

Right now, that means a LOT of tests are going to shift, but it means that when
developers run 'mvn test' the amount of time spent running unit tests will
be cut down dramatically (hopefully to somewhere under 10-15 mins).
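
To make the split in (1) concrete, here is a minimal sketch of picking the
integration test candidates from measured runtimes. The 2 minute threshold is
the one from the proposal; the class name, method name, and the test names and
timings are all made up for illustration and are not taken from the
spreadsheet or the script.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class TestSplitter {
    // Threshold from the proposal: anything over 2 minutes is a candidate
    // for the integration ('mvn verify') suite.
    static final long MAX_UNIT_TEST_SECONDS = 120;

    /** Returns the names of tests whose measured runtime exceeds the threshold. */
    static List<String> integrationCandidates(Map<String, Long> runtimeSeconds) {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Long> e : runtimeSeconds.entrySet()) {
            if (e.getValue() > MAX_UNIT_TEST_SECONDS) {
                result.add(e.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical test names and timings, for illustration only.
        Map<String, Long> runtimes = new LinkedHashMap<>();
        runtimes.put("TestFast", 4L);
        runtimes.put("TestMedium", 95L);
        runtimes.put("TestSlow", 340L);
        System.out.println(integrationCandidates(runtimes)); // prints [TestSlow]
    }
}
```

Everything at or under the threshold stays in the 'mvn test' suite; only the
tests above it would move.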

There is an implicit problem here: if the soon to be integration tests
capture functionality that is not covered by unit tests, then people may
incorrectly think that they are not breaking things. Therefore, we would do
(2) and (3):

(2) Add a patch continuous integration server that goes and actually builds
and tests patches as they come in. This would run 'mvn verify' and ensure
that the patch actually isn't breaking high level/complex functionality. It
would be a requirement before patches are committed that they pass this
build.

(3) If we find that the unit tests aren't covering a certain level of
functionality that is constantly breaking on the build server, we add more
unit tests of the breaking functionality to ensure the unit tests are more
complete and provide more assurances to developers when running them.

This would be an ongoing process of comparing the integration tests vs. the
unit tests.

(4) Once we have a true unit test suite, we should be able to go from
'forked' jvm mode back to a single jvm for running tests. Unit tests should
not do crazy fault injection or full failure scenarios, so they should be able
to clean up after themselves. This means we are going to get some speedup
from not spinning up a new jvm for each test.

(5) Once we are running in non-forked mode, we can try turning on
parallelized test execution in surefire, which does parallel runs in a
single jvm.

(6) Once things all run in a single jvm, using the HBaseTestUtilFactory
(HBASE-4448) makes sense, so we can reuse the mini clusters across tests.
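
As a rough illustration of the reuse idea in (6), a shared cluster could be
handed out by a reference-counted holder: started on the first request and
shut down when the last user releases it. This is only a sketch under that
assumption, not the HBASE-4448 code. FakeCluster is a stand-in and not an
HBase class, and ClusterFactory, acquire and release are hypothetical names.

```java
public class SharedClusterSketch {
    /** Stand-in for a mini cluster; NOT an HBase class. */
    static class FakeCluster {
        boolean running;
        void start() { running = true; }
        void shutdown() { running = false; }
    }

    /** Hypothetical holder: starts on first acquire, stops on last release. */
    static class ClusterFactory {
        private FakeCluster cluster;
        private int users;
        int starts; // how many times a cluster was actually started

        synchronized FakeCluster acquire() {
            if (cluster == null) {
                cluster = new FakeCluster();
                cluster.start();
                starts++;
            }
            users++;
            return cluster;
        }

        synchronized void release() {
            if (users == 0) {
                throw new IllegalStateException("release without acquire");
            }
            users--;
            if (users == 0) {
                cluster.shutdown();
                cluster = null;
            }
        }
    }

    public static void main(String[] args) {
        ClusterFactory factory = new ClusterFactory();
        FakeCluster a = factory.acquire(); // first test starts the cluster
        FakeCluster b = factory.acquire(); // second test reuses it
        System.out.println(a == b);        // prints true
        factory.release();
        System.out.println(a.running);     // prints true, one user is left
        factory.release();
        System.out.println(a.running);     // prints false, last user is gone
    }
}
```

The catch is that this only helps if tests leave the shared cluster in a clean
state, which is why it comes after the move to well-behaved unit tests.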

What does everyone think of this approach?

Thanks!

-Jesse Yates

Re: Speeding up tests

Posted by Jesse Yates <je...@gmail.com>.

Yeah, they would still run in forked mode. There are also a lot of cases
where we are testing edge failure scenarios and injecting errors into the
cluster that really need to be in their own jvm.

We are trying to cut down on everything that we can to speed up the build.
Agreed that a lot of the time is spent on spin up/down of mini-clusters, so if
we can just use one across multiple tests, we can see a big speed up. When we
move them into single jvm mode there are definitely going to be some issues,
but let's cross that bridge when we come to it :)

-Jesse Yates

On Mon, Oct 3, 2011 at 5:25 PM, Jonathan Hsieh <jo...@cloudera.com> wrote:

> I'm assuming that the longrunning/integration tests will still run as new
> jvm instances?
>
> I've been working on hbase offline recovery tools whose tests (in their
> current incarnation) require hbase cluster shutdown and restart. I assume
> these would be integration tests. I've also found that, at least in these
> cases, it is mini cluster spinup and spindown that is more costly than jvm
> spin up. I've also found that there is some file handle leakage in the mini
> cluster utility classes and other possible leakage due to statics and
> whatnot, such as HConnectionManager, which may make using a single jvm
> infeasible for the long running tests until that gets fixed.
>
> Jon.
>

Re: Speeding up tests

Posted by Jonathan Hsieh <jo...@cloudera.com>.

I'm assuming that the longrunning/integration tests will still run as new
jvm instances?

I've been working on hbase offline recovery tools whose tests (in their
current incarnation) require hbase cluster shutdown and restart. I assume
these would be integration tests. I've also found that, at least in these
cases, it is mini cluster spinup and spindown that is more costly than jvm
spin up. I've also found that there is some file handle leakage in the mini
cluster utility classes and other possible leakage due to statics and
whatnot, such as HConnectionManager, which may make using a single jvm
infeasible for the long running tests until that gets fixed.

Jon.

-- 
// Jonathan Hsieh (shay)
// Software Engineer, Cloudera
// jon@cloudera.com

RE: Speeding up tests

Posted by Ramkrishna S Vasudevan <ra...@huawei.com>.

Hi Jesse,
Thanks for the write up.
I am using the script in HBASE-4480 widely, but I have a problem.
Sometimes test cases get killed by maven because they take a long time
and do not have a timeout property set.

If such a testcase does not complete (it hangs), then maven kills the
whole run and we are not able to proceed with the other testcases.
Could you help me with this?
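
To illustrate the kind of behaviour being asked for: a hung test would be cut
off on its own and reported, and the remaining tests would still run. JUnit 4
offers a per-test limit through @Test(timeout = ...). The sketch below shows
the same idea with only java.util.concurrent; TimeoutRunner and runWithTimeout
are hypothetical names and are not part of maven, surefire, or the HBASE-4480
script.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class TimeoutRunner {
    /** Runs the task and returns "PASSED", "FAILED", or "TIMED_OUT". */
    static String runWithTimeout(Callable<Void> task, long timeoutMillis) {
        // Daemon thread, so a task that ignores interrupts cannot keep the jvm alive.
        ExecutorService executor = Executors.newSingleThreadExecutor(r -> {
            Thread t = new Thread(r);
            t.setDaemon(true);
            return t;
        });
        Future<Void> future = executor.submit(task);
        try {
            future.get(timeoutMillis, TimeUnit.MILLISECONDS);
            return "PASSED";
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the hung task
            return "TIMED_OUT";
        } catch (ExecutionException e) {
            return "FAILED"; // the task itself threw
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "FAILED";
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        // A quick task, a task that hangs, and a task that throws.
        System.out.println(runWithTimeout(() -> null, 1000));
        System.out.println(runWithTimeout(() -> { Thread.sleep(60_000); return null; }, 100));
        System.out.println(runWithTimeout(() -> { throw new IllegalStateException("boom"); }, 1000));
    }
}
```

With something like this, a hang becomes one TIMED_OUT result instead of the
whole run being killed.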

And do you have a list of flaky testcases? I have prepared a list; maybe
you can add to it.
            <include>**/TestActiveMasterManager*.java</include> 
            <include>**/TestMasterFailover*.java</include> 
            <include>**/TestMasterRestartAfterDisablingTable*.java</include>
            <include>**/TestLogsCleaner*.java</include>
            <include>**/TestRestartCluster*.java</include> 
            <include>**/TestMasterAddressManager*.java</include> 
            <include>**/TestLogRolling*.java</include> 
            <include>**/TestRegionRebalancing*.java</include> 
            <include>**/TestZKTable*.java</include> 
            <include>**/TestZooKeeperNodeTracker*.java</include> 
            <include>**/TestMergeTool*.java</include> 
            <include>**/TestMergeTable*.java</include> 
            <include>**/TestHBaseFsck*.java</include> 
            <include>**/TestThriftServer*.java</include> 
            <include>**/TestFullLogReconstruction*.java</include> 
            <include>**/TestReplicationSink*.java</include> 
            <include>**/TestReplicationSourceManager*.java</include> 
            <include>**/TestMasterReplication*.java</include> 
            <include>**/TestMultiSlaveReplication*.java</include> 
            <include>**/TestSplitLogWorker*.java</include> 
            <include>**/TestSplitTransaction*.java</include> 
            <include>**/TestSplitTransactionOnCluster*.java</include> 
            <include>**/TestRollingRestart*.java</include> 
            <include>**/TestSplitLogManager*.java</include> 
            <include>**/TestAdmin*.java</include>





-----Original Message-----
From: Doug Meil [mailto:doug.meil@explorysmedical.com] 
Sent: Tuesday, October 04, 2011 6:04 AM
To: dev@hbase.apache.org
Subject: Re: Speeding up tests

Thanks Jesse.  Great write up!





Re: Speeding up tests

Posted by Doug Meil <do...@explorysmedical.com>.

Thanks Jesse.  Great write up!



