Posted to dev@hbase.apache.org by Ted Yu <yu...@gmail.com> on 2011/09/26 23:48:43 UTC
HBASE-4480 Was: maintaining stable HBase build

I have an updated script that runs multiple (related) tests repeatedly.
Will attach to HBASE-4480

FYI
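
As a rough illustration of the idea (a sketch only, not the script attached
to HBASE-4480), running several related tests repeatedly could look like the
following. The function name run_tests_repeatedly and the MVN override
variable are hypothetical; the sketch assumes Surefire's -Dtest property
accepts a comma-separated list of test names.

```shell
#!/bin/bash
# Sketch: run a group of related tests repeatedly, stopping at the first failure.
# usage: run_tests_repeatedly <number of repetitions> <test> [<test> ...]
# MVN is a hypothetical override for the Maven command (defaults to mvn).
run_tests_repeatedly() {
  local reps=$1
  shift
  local tests i
  # Join the remaining arguments with commas for -Dtest.
  tests=$(IFS=,; echo "$*")
  for (( i = 1; i <= reps; i++ )); do
    if ! nice -n 10 ${MVN:-mvn} test -Dtest="$tests"; then
      echo "run $i of $reps failed for: $tests"
      return 1
    fi
  done
}
```

For example: run_tests_repeatedly 10 TestMasterFailover TestMasterRestartAfterDisablingTable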

On Mon, Sep 26, 2011 at 10:26 AM, Ted Yu <yu...@gmail.com> wrote:

> That would be nice Jesse.
>
> Thanks
>
>
> On Mon, Sep 26, 2011 at 10:16 AM, Jesse Yates <je...@gmail.com> wrote:
>
>> Ted,
>>
>> There is a ticket (HBASE-4480) up for wrapping tests in a retry script for
>> failed tests (though no work has been done on it yet). Maybe we can
>> incorporate this script into that ticket?
>>
>> -Jesse Yates
>>
>> On Mon, Sep 26, 2011 at 2:08 AM, Ted Yu <yu...@gmail.com> wrote:
>>
>> > Below is a simple script to repeatedly run a unit test.
>> > I suggest using it or similar script on the new unit test(s) in future
>> > patches.
>> >
>> > #!/bin/bash
>> > # script to run test repeatedly
>> > # usage: ./runtest.sh <name of test> <number of repetitions>
>> > #
>> > for (( i = 1; i <= $2; i++ ))
>> > do
>> >  # run at lowered priority; quote the test name so it is passed intact
>> >  nice -n 10 mvn test -Dtest="$1"
>> >  if [ $? -ne 0 ]; then
>> >    echo "$1 failed"
>> >    exit 1
>> >  fi
>> > done
>> >
>> > Thanks
>> >
>> > On Sun, Sep 25, 2011 at 2:27 PM, lars hofhansl <lh...@yahoo.com>
>> > wrote:
>> >
>> > > At Salesforce we call these "flappers" and they are considered almost
>> > > worse than failing tests,
>> > > as they add noise to a test run without adding confidence.
>> > > A test that fails once in, say, 10 runs is worthless.
>> > >
>> > >
>> > >
>> > > ________________________________
>> > > From: Ted Yu <yu...@gmail.com>
>> > > To: dev@hbase.apache.org
>> > > Sent: Sunday, September 25, 2011 1:41 PM
>> > > Subject: Re: maintaining stable HBase build
>> > >
>> > > As of 1:38 PST Sunday, the three builds all passed.
>> > >
>> > > I think we have some tests that exhibit non-deterministic behavior.
>> > >
>> > > I suggest committers space out patch submissions by about 2 hours so
>> > > that we can more easily identify the patch(es) that break the build.
>> > >
>> > > Thanks
>> > >
>> > > On Sun, Sep 25, 2011 at 7:45 AM, Ted Yu <yu...@gmail.com> wrote:
>> > >
>> > > > I wrote a short blog:
>> > > >
>> > > > http://zhihongyu.blogspot.com/2011/09/streamlining-patch-submission.html
>> > > >
>> > > > It is geared towards contributors.
>> > > >
>> > > > Cheers
>> > > >
>> > > >
>> > > > On Sat, Sep 24, 2011 at 9:16 AM, Ramakrishna S Vasudevan 00902313 <
>> > > > ramakrishnas@huawei.com> wrote:
>> > > >
>> > > >> Hi
>> > > >>
>> > > >> Ted, I agree with you.  Pasting the testcase results in JIRA is also
>> > > >> fine, mainly when some testcases fail when run locally; if we feel
>> > > >> a failure is not due to the fix we have added, we can mention that
>> > > >> as well.  I think it is better to run on a Linux box than on a
>> > > >> Windows machine.
>> > > >>
>> > > >> +1 for your suggestion Ted.
>> > > >>
>> > > >> Can we add a feature like the one in HDFS, where Jenkins
>> > > >> automatically runs the testcases when we submit a patch?
>> > > >>
>> > > >> At least until this is done, I will go with your suggestion.
>> > > >>
>> > > >> Regards
>> > > >> Ram
>> > > >>
>> > > >> ----- Original Message -----
>> > > >> From: Ted Yu <yu...@gmail.com>
>> > > >> Date: Saturday, September 24, 2011 4:22 pm
>> > > >> Subject: maintaining stable HBase build
>> > > >> To: dev@hbase.apache.org
>> > > >>
>> > > >> > Hi,
>> > > >> > I want to bring the importance of maintaining a stable HBase build
>> > > >> > to our attention.
>> > > >> > A stable HBase build is important, not just for the next release
>> > > >> > but also
>> > > >> > for authors of the pending patches to verify the correctness of
>> > > >> > their work.
>> > > >> >
>> > > >> > At some time on Thursday (Sept 22nd) 0.90, 0.92 and TRUNK builds
>> > > >> > were all
>> > > >> > blue. Now they're all red.
>> > > >> >
>> > > >> > I don't mind fixing Jenkins build. But if we collectively adopt
>> > > >> > some good
>> > > >> > practice, it would be easier to achieve the goal of having stable
>> > > >> > builds.
>> > > >> > For contributors, I understand that it takes so much time to run
>> > > >> > the whole test suite that he/she may not have the luxury of doing
>> > > >> > this - Apache Jenkins wouldn't do it when you press the Submit
>> > > >> > Patch button.
>> > > >> > If this is the case (let's call it scenario A), please use Eclipse
>> > > >> > (or other tool) to identify tests that exercise the classes/methods
>> > > >> > in your patch and run them. Also clearly state what tests you ran
>> > > >> > in the JIRA.
>> > > >> >
>> > > >> > If you have a Linux box where you can run whole test suite, it
>> > > >> > would be nice
>> > > >> > to utilize such resource and run whole suite. Then please state
>> > > >> > this fact on
>> > > >> > the JIRA as well.
>> > > >> > Considering Todd's suggestion of holding off commit for 24 hours
>> > > >> > after code
>> > > >> > review, 2 hour test run isn't that long.
>> > > >> >
>> > > >> > Sometimes you may see the following (from 0.92 build 18):
>> > > >> >
>> > > >> > Tests run: 1004, Failures: 0, Errors: 0, Skipped: 21
>> > > >> >
>> > > >> > [INFO] ------------------------------------------------------------------------
>> > > >> > [INFO] BUILD FAILURE
>> > > >> > [INFO] ------------------------------------------------------------------------
>> > > >> > [INFO] Total time: 1:51:41.797s
>> > > >> >
>> > > >> > You should examine the test summary above these lines and find out
>> > > >> > which test(s) hung. For this case it was TestMasterFailover:
>> > > >> >
>> > > >> > Running org.apache.hadoop.hbase.master.TestMasterFailover
>> > > >> > Running org.apache.hadoop.hbase.master.TestMasterRestartAfterDisablingTable
>> > > >> > Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 32.265 sec
>> > > >> >
>> > > >> > I think a script should be developed that parses test output and
>> > > >> > identifies hanging test(s).
>> > > >> >
>> > > >> > For scenario A, I hope the committer would run the test suite.
>> > > >> > The net effect would be a statement on the JIRA, saying all tests
>> > > >> > passed.
>> > > >> > Your comments/suggestions are welcome.
>> > > >> >
>> > > >>
>> > > >
>> > > >
>> > >
>> >
>>
>
>
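
The original message in this thread suggests a script that parses test
output and identifies hanging test(s). A minimal sketch of that idea follows,
under stated assumptions: tests run serially, each test class prints a
"Running <class>" line when it starts, and a per-class "Tests run: ... Time
elapsed: ..." line when it finishes, as in the output quoted above. The
function name find_hung_tests is hypothetical, and other Surefire versions or
parallel runs may format or order the output differently.

```shell
#!/bin/bash
# Sketch: list test classes that started but never reported a result.
# usage: find_hung_tests <file containing captured mvn test output>
find_hung_tests() {
  awk '
    # "Running <class>" marks the start of a test class.
    $1 == "Running" && NF == 2 {
      # If the previous class never reported, treat it as hung.
      if (pending != "") print pending
      pending = $2
      next
    }
    # The per-class summary line (it carries "Time elapsed") closes the class.
    /Tests run:.*Time elapsed:/ { pending = "" }
    # A class still pending at end of output never reported either.
    END { if (pending != "") print pending }
  ' "$1"
}
```

For example, after saving the console output with "mvn test > build.log 2>&1",
calling find_hung_tests build.log prints one class name per suspected hang.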