Posted to dev@hive.apache.org by Edward Capriolo <ed...@gmail.com> on 2012/06/11 04:29:13 UTC

Turn around on patches that do not need full unit testing

Hive's unit tests take a long time. There are many simple patches we
can get into Hive earlier if we drop the notion of running the full
test suite to QA every patch. For example:

https://issues.apache.org/jira/browse/HIVE-3081  --> spelling mistakes
that involved types

https://issues.apache.org/jira/browse/HIVE-3061  --> patches with code cleanup

https://issues.apache.org/jira/browse/HIVE-3048  --> patches that are
one or two lines of code

https://issues.apache.org/jira/browse/HIVE-2288  --> patches that are
only additive

Also, I do not believe we should kick a patch back to someone for every
tiny change. For example, suppose someone commits 9000 lines of code
with one typo. I have seen similar situations where the status gets
reverted back to OPEN. It takes the person working on it a day to get
back into the patch again, and by the time someone comes back around
to reviewing it, another 3 days might go by.

This is similar to a situation in the supermarket where "You can only
use one coupon", so people walk in and out of the store 6 times to buy
6 items. Procedure and rules are followed, and the end result is really
the same, but with 6 times the work.

In this case the committer should just make the change, re-upload the
patch, say 'committed with typo fixed', and commit.

Please comment,

Edward

Re: Turn around on patches that do not need full unit testing

Posted by Alan Gates <ga...@hortonworks.com>.
One approach I've seen other projects take is to have an "ant test-commit" target that users are responsible to run before committing.  This is a short (15 min or less) target that runs all true unit tests (tests that exercise just a class or two in isolation) and a couple of functional tests that exercise major functionality but not every last thing.  The full test suite can then be run nightly and any issues addressed.

Alan.

On Jun 11, 2012, at 6:17 AM, Edward Capriolo wrote:

> I agree. Having a short-test and a long-test might make more sense,
> i.e. the long-test would include the funky serdes and UDFs.
> 
> As for "Meanwhile, checking in without testing may introduce bugs that
> can break a production cluster, which is costly": the solution is not
> to run trunk. Run only releases.
> 
> All the tests are run by Jenkins post-commit, so we know when trunk is
> broken, and we should not cut a release if all the tests are not
> passing. Also, we should not knowingly break the build or leave it
> broken. That is, we should strive to have all tests passing on trunk
> at all times, but not committing a typo patch for fear that the
> "build might break" does not make much sense. We can easily revert
> things in such a case.
> 
> Edward


Re: Turn around on patches that do not need full unit testing

Posted by Edward Capriolo <ed...@gmail.com>.
I agree. Having a short-test and a long-test might make more sense,
i.e. the long-test would include the funky serdes and UDFs.

As for "Meanwhile, checking in without testing may introduce bugs that
can break a production cluster, which is costly": the solution is not
to run trunk. Run only releases.

All the tests are run by Jenkins post-commit, so we know when trunk is
broken, and we should not cut a release if all the tests are not
passing. Also, we should not knowingly break the build or leave it
broken. That is, we should strive to have all tests passing on trunk
at all times, but not committing a typo patch for fear that the
"build might break" does not make much sense. We can easily revert
things in such a case.

Edward

On Sun, Jun 10, 2012 at 11:14 PM, Gang Liu <ga...@fb.com> wrote:
> Yeah, it is frustrating to take a long time to turn around a tiny change. That is understood.
>
> Meanwhile, checking in without testing may introduce bugs that can break a production cluster, which is costly.
>
> I think the problem is not whether we should run the tests, but that running them takes a long time. If it took a reasonable time, like 30 minutes, there would be less pain.
>
> In summary, let us keep quality high by running the tests for every commit, and aim to make the unit tests fast.
>
> By the way, we can run the tests in parallel; the Hive wiki has details.
>
> Thanks
>
> Sent from my iPhone

Re: Turn around on patches that do not need full unit testing

Posted by Gang Liu <ga...@fb.com>.
Yeah, it is frustrating to take a long time to turn around a tiny change. That is understood.

Meanwhile, checking in without testing may introduce bugs that can break a production cluster, which is costly.

I think the problem is not whether we should run the tests, but that running them takes a long time. If it took a reasonable time, like 30 minutes, there would be less pain.

In summary, let us keep quality high by running the tests for every commit, and aim to make the unit tests fast.

By the way, we can run the tests in parallel; the Hive wiki has details.

Thanks

Sent from my iPhone
