Posted to dev@tomcat.apache.org by Mark Thomas <ma...@apache.org> on 2015/06/15 14:02:34 UTC

Multi-threaded unit tests

I have been experimenting with the free Azure credits that come with the
MSDN subscription Microsoft kindly offers to all Apache committers to
use for their ASF work.

I have been looking at options for making the unit tests run faster.

All the figures below are for running the trunk unit tests on a fully
updated Ubuntu 14.04 LTS instance.


Instance  Time (mm:ss)  Configuration
A2 Basic  233:53        tests on hdd, with code coverage, 1 thread
D2        120:57        tests on hdd, with code coverage, 1 thread
D2        119:53        tests on ssd, with code coverage, 1 thread
D2         32:16        tests on hdd, no code coverage,   2 threads
D2         23:24        tests on hdd, no code coverage,   4 threads

(Both A2 and D2 boxes have 2 cores. D2 has 60% faster processors.)

I'll be testing larger instances with more cores later.
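
As a rough cross-check of the figures above, the timings can be converted to seconds and compared. This is only a sketch: it assumes the times are minutes:seconds, and the helper names are made up for illustration.

```python
def to_seconds(mm_ss: str) -> int:
    """Convert a 'minutes:seconds' string such as '233:53' to seconds."""
    minutes, seconds = mm_ss.split(":")
    return int(minutes) * 60 + int(seconds)


def speedup(before: str, after: str) -> float:
    """Return how many times faster the 'after' run is than the 'before' run."""
    return to_seconds(before) / to_seconds(after)


# Figures copied from the table above.
a2_cov_1t = "233:53"
d2_cov_1t_hdd = "120:57"
d2_cov_1t_ssd = "119:53"
d2_nocov_2t = "32:16"
d2_nocov_4t = "23:24"

print(f"A2 -> D2 (coverage, 1 thread):   {speedup(a2_cov_1t, d2_cov_1t_hdd):.2f}x")
print(f"hdd -> ssd (coverage, 1 thread): {speedup(d2_cov_1t_hdd, d2_cov_1t_ssd):.2f}x")
print(f"2 -> 4 threads (no coverage):    {speedup(d2_nocov_2t, d2_nocov_4t):.2f}x")
```

On these figures, going from 2 to 4 threads on the 2-core D2 is roughly a 1.38x speed-up (about 27% less wall-clock time), while hdd vs ssd makes almost no difference.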

So far, I think it is safe to draw the following conclusions:
- code coverage is expensive
- code coverage (as currently configured) requires single thread
  execution (more on this below)
- 1 test thread per core definitely gives better performance
- 2 test threads per core gives even better performance

Where the limit is for threads per core is TBD.

I've already fixed the unit tests (I think) so parallel running is
possible. I'll be adding a threads option to build.xml shortly. It will
default to 1 and I'll add a comment to build.properties.default not to
increase it above 1 if code coverage is enabled (I might try and detect
and handle that case). Once I have data on threads vs cores I'll add
that too.
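
The "detect and handle" idea could look something like the sketch below. It only illustrates the intended rule (default of 1, and never more than 1 thread when coverage is on); the function and parameter names are hypothetical and this is not the actual build.xml logic.

```python
def effective_test_threads(requested: int, coverage_enabled: bool) -> int:
    """Return the number of test threads to actually use.

    Hypothetical guard: coverage runs are forced back to a single thread,
    and anything below 1 is treated as the default of 1.
    """
    if requested < 1:
        return 1
    if coverage_enabled and requested > 1:
        print(f"coverage enabled: ignoring threads={requested}, using 1")
        return 1
    return requested


print(effective_test_threads(4, coverage_enabled=False))
print(effective_test_threads(4, coverage_enabled=True))
```

Silently falling back to 1 thread (with a message) is one choice; failing the build instead would be the stricter alternative.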

The reason code coverage doesn't work with the junit threads option is
that cobertura serialises the coverage data between tests. If we
partitioned the tests (e.g. by name) and configured separate coverage
data files for each partition (merging them at the end) then cobertura
would be OK. Sensibly partitioning the tests is more effort than I have
time for at the moment so I am going with the simple option.
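
For what it is worth, a naive partition-by-name scheme might look like the sketch below. It is an assumption-laden illustration, not something in the build: the test class names, the data file naming and the choice of a CRC-based split are all made up, and it makes no attempt to balance run time between partitions (which is the "sensible" part that takes the effort).

```python
import zlib
from collections import defaultdict


def partition_tests(test_classes, partitions):
    """Assign each test class name to a partition, stable across runs."""
    buckets = defaultdict(list)
    for name in sorted(test_classes):
        # crc32 is deterministic, unlike Python's built-in hash() for str.
        index = zlib.crc32(name.encode("utf-8")) % partitions
        buckets[index].append(name)
    return {i: buckets.get(i, []) for i in range(partitions)}


def coverage_datafile(index):
    """Hypothetical per-partition coverage data file name."""
    return f"cobertura-{index}.ser"


# Made-up test class names, purely for illustration.
tests = ["org.example.TestAlpha", "org.example.TestBeta",
         "org.example.TestGamma", "org.example.TestDelta"]

for index, names in partition_tests(tests, 2).items():
    print(coverage_datafile(index), names)
```

Each partition would then run with its own data file, and the files would be merged once all partitions have finished.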

Mark

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: Multi-threaded unit tests

Posted by Mark Thomas <ma...@apache.org>.
On 16/06/2015 21:43, Christopher Schultz wrote:
> Mark,
> 
> On 6/16/15 4:13 PM, Mark Thomas wrote:
>> On 16/06/2015 20:39, Christopher Schultz wrote:
>>> Mark,
>>>
>>> On 6/15/15 8:02 AM, Mark Thomas wrote:
>>>> I have been experimenting with the free Azure credits that come with the
>>>> MSDN subscription Microsoft kindly offers to all Apache committers to
>>>> use for their ASF work.
>>>>
>>>> I have been looking at options for making the unit tests run faster.
>>>>
>>>> All the figures below are for running the trunk unit tests on a fully
>>>> updated Ubuntu 14.04 LTS instance.
>>>>
>>>>
>>>> A2 Basic 233:53 tests on hdd, with code coverage, 1 thread
>>>> D2       120:57 tests on hdd, with code coverage, 1 thread
>>>> D2       119:53 tests on ssd, with code coverage, 1 thread
>>>> D2        32:16 tests on hdd, no code coverage,   2 threads
>>>> D2        23:24 tests on hdd, no code coverage,   4 threads
>>>>
>>>> (Both A2 and D2 boxes have 2 cores. D2 have 60% faster processors).
>>>>
>>>> I'll be testing larger instance with more cores later.
>>>>
>>>> So far, I think it is safe to draw the following conclusions:
>>>> - code coverage is expensive
>>>> - code coverage (as currently configured) requires single thread
>>>>   execution (more on this below)
>>>> - 1 test thread per core definitely gives better performance
>>>> - 2 test threads per core gives even better performance
>>>
>>> Obviously, code coverage and CPU power (more likely access to the CPU,
>>> not the CPU speed itself) are bigger factors in the equation, here.
>>
>> Comparing A2 and D2 above, the only difference is CPU speed.
> 
> I don't know anything about Azure specifically, but I do know that in
> AWS the virtual machine classes include both differences in the CPU
> itself (the hardware) and also access to certain numbers of cores. So
> for example, they may have a 16-core box, but your VM will be limited to
> at most one of them at a time. I was wondering if Azure did the same
> kind of thing.

No idea. I'm using core in whatever way Azure means it. My assumption
(that seems to be backed up by the results) is that core == ability to
run a concurrent thread.

>>> Multi-threaded is nice, but it's marginal compared to the other factors
>>> (which are orders of magnitude at this point).
>>
>> Not when you increase the number of cores it isn't.
> 
> Okay, we'll see with more data. Your numbers below are encouraging.
> 
> When you say "core", do you mean CPU or CPU-thread? For example, most
> Intel processors these days have hyperthreading which means two threads
> per core. I have a quad-core laptop but I can have 8
> simultaneously-executing processes.
> 
> I'm curious as to why you are getting I/O timeouts when you use
> Nthreads > Ncores, but I suppose that depends upon the way you count.
> On my laptop, does that mean I shouldn't exceed 4 threads or 8 threads?

The context switching when you use more threads than your hardware can
handle concurrently can introduce enough of a delay that an I/O timeout
occurs.

>> I think that is more effort than it took to get the multi-threaded tests
>> working. You'll need to set up parallel junit tests in Ant, manage the
>> separate code coverage files and then merge the result.
> 
> Hmm. Merging the results would be ugly, especially if we have to figure
> out how two separate runs of Cobertura ran over the same piece of code
> (we might have some cross-over).
> 
> Or we could just have two separate reports ;)

There is a cobertura utility to do the merge for you.
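
Conceptually, cross-over between runs is not a problem if merging simply adds up the hit counts per line. The sketch below shows that idea only; it assumes a summing merge, uses made-up data, and is not Cobertura's actual data format or merge code.

```python
from collections import Counter


def merge_coverage(*runs):
    """Combine per-line hit counts from several runs by summing them."""
    merged = Counter()
    for run in runs:
        merged.update(run)  # Counter.update adds counts rather than replacing
    return dict(merged)


# Keys are (class name, line number); values are hit counts. Made-up data.
run_a = {("Foo", 10): 3, ("Foo", 11): 0}
run_b = {("Foo", 10): 1, ("Bar", 5): 2}

merged = merge_coverage(run_a, run_b)
print(merged)
```

A line hit in both runs just ends up with a larger count, and a line hit in neither stays at zero.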

Mark




Re: Multi-threaded unit tests

Posted by Christopher Schultz <ch...@christopherschultz.net>.
Mark,

On 6/16/15 4:13 PM, Mark Thomas wrote:
> On 16/06/2015 20:39, Christopher Schultz wrote:
>> Mark,
>>
>> On 6/15/15 8:02 AM, Mark Thomas wrote:
>>> I have been experimenting with the free Azure credits that come with the
>>> MSDN subscription Microsoft kindly offers to all Apache committers to
>>> use for their ASF work.
>>>
>>> I have been looking at options for making the unit tests run faster.
>>>
>>> All the figures below are for running the trunk unit tests on a fully
>>> updated Ubuntu 14.04 LTS instance.
>>>
>>>
>>> A2 Basic 233:53 tests on hdd, with code coverage, 1 thread
>>> D2       120:57 tests on hdd, with code coverage, 1 thread
>>> D2       119:53 tests on ssd, with code coverage, 1 thread
>>> D2        32:16 tests on hdd, no code coverage,   2 threads
>>> D2        23:24 tests on hdd, no code coverage,   4 threads
>>>
>>> (Both A2 and D2 boxes have 2 cores. D2 have 60% faster processors).
>>>
>>> I'll be testing larger instance with more cores later.
>>>
>>> So far, I think it is safe to draw the following conclusions:
>>> - code coverage is expensive
>>> - code coverage (as currently configured) requires single thread
>>>   execution (more on this below)
>>> - 1 test thread per core definitely gives better performance
>>> - 2 test threads per core gives even better performance
>>
>> Obviously, code coverage and CPU power (more likely access to the CPU,
>> not the CPU speed itself) are bigger factors in the equation, here.
> 
> Comparing A2 and D2 above, the only difference is CPU speed.

I don't know anything about Azure specifically, but I do know that in
AWS the virtual machine classes include both differences in the CPU
itself (the hardware) and also access to certain numbers of cores. So
for example, they may have a 16-core box, but your VM will be limited to
at most one of them at a time. I was wondering if Azure did the same
kind of thing.

>> Multi-threaded is nice, but it's marginal compared to the other factors
>> (which are orders of magnitude at this point).
> 
> Not when you increase the number of cores it isn't.

Okay, we'll see with more data. Your numbers below are encouraging.

When you say "core", do you mean CPU or CPU-thread? For example, most
Intel processors these days have hyperthreading which means two threads
per core. I have a quad-core laptop but I can have 8
simultaneously-executing processes.

I'm curious as to why you are getting I/O timeouts when you use
Nthreads > Ncores, but I suppose that depends upon the way you count. On
my laptop, does that mean I shouldn't exceed 4 threads or 8 threads?

>> One more data point would have been good to have:
>>
>> D2    ???:?? tests on hdd, no code coverage, 1 thread
> 
> Agreed. I'll set a test running and see where it ends up.
> 
> Some more figures I do have:
> 
> D8 09:53 tests on hdd, no code coverage, 8 threads (8 core box).
> 
> On my laptop (4 core) the time taken to run the unit tests dropped from
> ~60 mins to ~15 mins with 4 threads (pretty much linear).
> 
>>> Where the limit is for threads per core is TBD.
>>>
>>> I've already fixed the unit tests (I think) so parallel running is
>>> possible. I'll be adding a threads option to build.xml shortly. It will
>>> default to 1 and I'll add a comment to build.properties.default not to
>>> increase it above 1 if code coverage is enabled (I might try and detect
>>> and handle that case). Once I have data on threads vs cores I'll add
>>> that too.
>>>
>>> The reason code coverage doesn't work with the junit threads option is
>>> that cobertura serialises the coverage data between tests. If we
>>> partitioned the tests (e.g. by name) and configured separated coverage
>>> data files for each partition (merging them at the end) then cobertura
>>> would be OK. Sensibly partitioning the tests is more effort than I have
>>> time for at the moment so I am going with the simple option.
>>
>> If doubling the number of threads delivers a ~30% performance
>> improvement in the code coverage (just extrapolating the results for
>> merely running the tests over to code-coverage), then perhaps a
>> heavy-handed segmentation of the Cobertura tests into two
>> arbitrarily-selected sets of tests would be a good trial with not too
>> much effort to give it a try.
>>
>> What do you think?
> 
> I think that is more effort than it took to get the multi-threaded tests
> working. You'll need to set up parallel junit tests in Ant, manage the
> separate code coverage files and then merge the result.

Hmm. Merging the results would be ugly, especially if we have to figure
out how two separate runs of Cobertura ran over the same piece of code
(we might have some cross-over).

Or we could just have two separate reports ;)

-chris


Re: Multi-threaded unit tests

Posted by Mark Thomas <ma...@apache.org>.
On 16/06/2015 20:39, Christopher Schultz wrote:
> Mark,
> 
> On 6/15/15 8:02 AM, Mark Thomas wrote:
>> I have been experimenting with the free Azure credits that come with the
>> MSDN subscription Microsoft kindly offers to all Apache committers to
>> use for their ASF work.
>>
>> I have been looking at options for making the unit tests run faster.
>>
>> All the figures below are for running the trunk unit tests on a fully
>> updated Ubuntu 14.04 LTS instance.
>>
>>
>> A2 Basic 233:53 tests on hdd, with code coverage, 1 thread
>> D2       120:57 tests on hdd, with code coverage, 1 thread
>> D2       119:53 tests on ssd, with code coverage, 1 thread
>> D2        32:16 tests on hdd, no code coverage,   2 threads
>> D2        23:24 tests on hdd, no code coverage,   4 threads
>>
>> (Both A2 and D2 boxes have 2 cores. D2 have 60% faster processors).
>>
>> I'll be testing larger instance with more cores later.
>>
>> So far, I think it is safe to draw the following conclusions:
>> - code coverage is expensive
>> - code coverage (as currently configured) requires single thread
>>   execution (more on this below)
>> - 1 test thread per core definitely gives better performance
>> - 2 test threads per core gives even better performance
> 
> Obviously, code coverage and CPU power (more likely access to the CPU,
> not the CPU speed itself) are bigger factors in the equation, here.

Comparing A2 and D2 above, the only difference is CPU speed.

> Multi-threaded is nice, but it's marginal compared to the other factors
> (which are orders of magnitude at this point).

Not when you increase the number of cores it isn't.

> One more data point would have been good to have:
> 
> D2    ???:?? tests on hdd, no code coverage, 1 thread

Agreed. I'll set a test running and see where it ends up.

Some more figures I do have:

D8 09:53 tests on hdd, no code coverage, 8 threads (8 core box).

On my laptop (4 core) the time taken to run the unit tests dropped from
~60 mins to ~15 mins with 4 threads (pretty much linear).
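
To put a number on "pretty much linear": parallel efficiency is the speed-up divided by the number of threads, with 1.0 meaning perfectly linear scaling. A quick sketch using the approximate laptop figures above (the function name is just for illustration):

```python
def parallel_efficiency(serial_minutes, parallel_minutes, threads):
    """Speed-up divided by thread count; 1.0 means perfectly linear scaling."""
    speedup = serial_minutes / parallel_minutes
    return speedup / threads


# Approximate laptop figures quoted above: ~60 mins -> ~15 mins on 4 threads.
laptop = parallel_efficiency(60, 15, 4)
print(f"laptop efficiency: {laptop:.2f}")
```

Since both laptop timings are approximate, the result should be read as "close to 1" rather than as an exact measurement.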

>> Where the limit is for threads per core is TBD.
>>
>> I've already fixed the unit tests (I think) so parallel running is
>> possible. I'll be adding a threads option to build.xml shortly. It will
>> default to 1 and I'll add a comment to build.properties.default not to
>> increase it above 1 if code coverage is enabled (I might try and detect
>> and handle that case). Once I have data on threads vs cores I'll add
>> that too.
>>
>> The reason code coverage doesn't work with the junit threads option is
>> that cobertura serialises the coverage data between tests. If we
>> partitioned the tests (e.g. by name) and configured separated coverage
>> data files for each partition (merging them at the end) then cobertura
>> would be OK. Sensibly partitioning the tests is more effort than I have
>> time for at the moment so I am going with the simple option.
> 
> If doubling the number of threads delivers a ~30% performance
> improvement in the code coverage (just extrapolating the results for
> merely running the tests over to code-coverage), then perhaps a
> heavy-handed segmentation of the Cobertura tests into two
> arbitrarily-selected sets of tests would be a good trial with not too
> much effort to give it a try.
> 
> What do you think?

I think that is more effort than it took to get the multi-threaded tests
working. You'll need to set up parallel junit tests in Ant, manage the
separate code coverage files and then merge the result.

Mark




Re: Multi-threaded unit tests

Posted by Christopher Schultz <ch...@christopherschultz.net>.
Mark,

On 6/15/15 8:02 AM, Mark Thomas wrote:
> I have been experimenting with the free Azure credits that come with the
> MSDN subscription Microsoft kindly offers to all Apache committers to
> use for their ASF work.
> 
> I have been looking at options for making the unit tests run faster.
> 
> All the figures below are for running the trunk unit tests on a fully
> updated Ubuntu 14.04 LTS instance.
> 
> 
> A2 Basic 233:53 tests on hdd, with code coverage, 1 thread
> D2       120:57 tests on hdd, with code coverage, 1 thread
> D2       119:53 tests on ssd, with code coverage, 1 thread
> D2        32:16 tests on hdd, no code coverage,   2 threads
> D2        23:24 tests on hdd, no code coverage,   4 threads
> 
> (Both A2 and D2 boxes have 2 cores. D2 have 60% faster processors).
> 
> I'll be testing larger instance with more cores later.
> 
> So far, I think it is safe to draw the following conclusions:
> - code coverage is expensive
> - code coverage (as currently configured) requires single thread
>   execution (more on this below)
> - 1 test thread per core definitely gives better performance
> - 2 test threads per core gives even better performance

Obviously, code coverage and CPU power (more likely access to the CPU,
not the CPU speed itself) are bigger factors in the equation, here.
Multi-threaded is nice, but it's marginal compared to the other factors
(which are orders of magnitude at this point).

One more data point would have been good to have:

D2    ???:?? tests on hdd, no code coverage, 1 thread

> Where the limit is for threads per core is TBD.
> 
> I've already fixed the unit tests (I think) so parallel running is
> possible. I'll be adding a threads option to build.xml shortly. It will
> default to 1 and I'll add a comment to build.properties.default not to
> increase it above 1 if code coverage is enabled (I might try and detect
> and handle that case). Once I have data on threads vs cores I'll add
> that too.
> 
> The reason code coverage doesn't work with the junit threads option is
> that cobertura serialises the coverage data between tests. If we
> partitioned the tests (e.g. by name) and configured separated coverage
> data files for each partition (merging them at the end) then cobertura
> would be OK. Sensibly partitioning the tests is more effort than I have
> time for at the moment so I am going with the simple option.

If doubling the number of threads delivers a ~30% performance
improvement in the code coverage (just extrapolating the results for
merely running the tests over to code-coverage), then perhaps a
heavy-handed segmentation of the Cobertura tests into two
arbitrarily-selected sets of tests would make a good trial without
too much effort.

What do you think?

-chris


Re: Multi-threaded unit tests

Posted by Mark Thomas <ma...@apache.org>.
On 15/06/2015 13:02, Mark Thomas wrote:
> I have been experimenting with the free Azure credits that come with the
> MSDN subscription Microsoft kindly offers to all Apache committers to
> use for their ASF work.
> 
> I have been looking at options for making the unit tests run faster.
> 
> All the figures below are for running the trunk unit tests on a fully
> updated Ubuntu 14.04 LTS instance.
> 
> 
> A2 Basic 233:53 tests on hdd, with code coverage, 1 thread
> D2       120:57 tests on hdd, with code coverage, 1 thread
> D2       119:53 tests on ssd, with code coverage, 1 thread
> D2        32:16 tests on hdd, no code coverage,   2 threads
> D2        23:24 tests on hdd, no code coverage,   4 threads

With more than 2 threads per core there is a risk that tests experience
I/O timeouts.

I'm going to recommend 1 thread per core.
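
As a sketch of how that recommendation might be turned into a number on a given machine (the function is hypothetical; the upper bound of 2 per core reflects the timeout risk mentioned above):

```python
import os


def recommended_threads(cores=None, threads_per_core=1):
    """Suggest a test thread count: 1 per core by default, at most 2 per core."""
    if cores is None:
        cores = os.cpu_count() or 1  # cpu_count() may return None
    if not 1 <= threads_per_core <= 2:
        raise ValueError("threads_per_core should be 1 or 2")
    return cores * threads_per_core


print(recommended_threads(cores=2))                      # a 2-core D2 box
print(recommended_threads(cores=2, threads_per_core=2))  # the 4-thread run
print(recommended_threads())                             # this machine
```

One caveat: os.cpu_count() reports logical CPUs, so on a machine with hyperthreading the default already counts two hardware threads per physical core.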

The other factor here is that multi-threaded tests require Ant 1.9.5
onwards.

Mark


> 
> (Both A2 and D2 boxes have 2 cores. D2 have 60% faster processors).
> 
> I'll be testing larger instance with more cores later.
> 
> So far, I think it is safe to draw the following conclusions:
> - code coverage is expensive
> - code coverage (as currently configured) requires single thread
>   execution (more on this below)
> - 1 test thread per core definitely gives better performance
> - 2 test threads per core gives even better performance
> 
> Where the limit is for threads per core is TBD.
> 
> I've already fixed the unit tests (I think) so parallel running is
> possible. I'll be adding a threads option to build.xml shortly. It will
> default to 1 and I'll add a comment to build.properties.default not to
> increase it above 1 if code coverage is enabled (I might try and detect
> and handle that case). Once I have data on threads vs cores I'll add
> that too.
> 
> The reason code coverage doesn't work with the junit threads option is
> that cobertura serialises the coverage data between tests. If we
> partitioned the tests (e.g. by name) and configured separated coverage
> data files for each partition (merging them at the end) then cobertura
> would be OK. Sensibly partitioning the tests is more effort than I have
> time for at the moment so I am going with the simple option.
> 
> Mark
> 
> 

