Posted to dev@httpd.apache.org by Jim Van Fleet <jj...@yahoo.com> on 2011/01/27 18:21:08 UTC

Performance fix in event mpm

I have been developing an application using Apache 2.2 on Linux 2.6.  My test 
environment creates a very heavy workload and puts a strain on everything.

I would get good performance for a while, but as the load ramped up, performance 
would quickly get very bad.  Erratically, transactions would finish quickly or 
take a very long time -- tcpdump analysis showed milliseconds or seconds between 
responses. Also, the recv queue got very large.

I noticed that ap_queue_pop removes elements from the queue LIFO rather than 
FIFO.  I also noticed that apr_queue_pop uses a different technique which is not 
too expensive and is FIFO, so I changed ap_queue_push/ap_queue_pop to use that 
technique and the receive problems went away.

snippet from ap_queue_pop (push is similar with appropriate changes to the 
fd_queue_t struct)

    AP_DEBUG_ASSERT(!queue->terminated);
#if 1
    /* We never expect the queue to be full, so check unconditionally while debugging. */
    ap_assert(!ap_queue_full(queue));
#else
    AP_DEBUG_ASSERT(!ap_queue_full(queue));
#endif

#if 1
    /* New: store at the "in" index and advance it, wrapping at the queue bounds. */
    elem = &queue->data[queue->in];
    queue->in = (queue->in + 1) % queue->bounds;
#else
    /* Old: store at index nelts, the slot after the last element. */
    elem = &queue->data[queue->nelts];
#endif
    elem->sd = sd;
    elem->cs = cs;
    elem->p = p;
    queue->nelts++;

Please let me know if you think this change is appropriate and/or if you'd like 
more data.

Jim Van Fleet
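
For readers following along, the FIFO technique described in the message above can be sketched as a small ring buffer. This is only an illustration under stated assumptions: demo_queue_t and its functions are hypothetical stand-ins, not httpd's fd_queue_t or APR's apr_queue_t, and the sketch stores plain ints and does no locking. It keeps separate "in" and "out" indices so that elements leave in the order they arrived.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for a bounded queue. It is NOT
 * httpd's fd_queue_t: it holds ints and has no mutex or condition. */
typedef struct {
    int *data;            /* caller-provided storage */
    unsigned int bounds;  /* capacity of the storage */
    unsigned int nelts;   /* number of elements currently queued */
    unsigned int in;      /* next slot to write */
    unsigned int out;     /* next slot to read */
} demo_queue_t;

static void demo_queue_init(demo_queue_t *q, int *storage, unsigned int bounds)
{
    assert(bounds > 0);
    q->data = storage;
    q->bounds = bounds;
    q->nelts = 0;
    q->in = 0;
    q->out = 0;
}

/* Returns 0 on success, -1 if the queue is full. */
static int demo_queue_push(demo_queue_t *q, int value)
{
    if (q->nelts == q->bounds) {
        return -1;
    }
    q->data[q->in] = value;
    q->in = (q->in + 1) % q->bounds;   /* wrap the write index */
    q->nelts++;
    return 0;
}

/* Returns 0 on success, -1 if the queue is empty. The oldest element
 * is returned first, which is what makes the queue FIFO. */
static int demo_queue_pop(demo_queue_t *q, int *value)
{
    if (q->nelts == 0) {
        return -1;
    }
    *value = q->data[q->out];
    q->out = (q->out + 1) % q->bounds; /* wrap the read index */
    q->nelts--;
    return 0;
}
```

Pushing 10, 20, 30 into a queue of capacity 3 and then popping yields 10 first; a further push reuses slot 0 once the write index has wrapped.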



      

apr_queue_pop/push (Was: Re: Performance fix in event mpm)

Posted by Jim Jagielski <ji...@jaguNET.com>.
Looping in APR as well...

On Jan 27, 2011, at 1:43 PM, Jim Jagielski wrote:

> 
> On Jan 27, 2011, at 1:31 PM, Jim Jagielski wrote:
> 
>> 
>> On Jan 27, 2011, at 12:21 PM, Jim Van Fleet wrote:
>> 
>>> I noticed that ap_queue_pop removes elements from the queue LIFO rather than FIFO.  Also noticed that apr_queue_pop uses a different technique which is not too expensive and is fifo, so I changed ap_queue/pop/push to use that technique and the receive problems went away.
>> Hmmm.... Not sure why the fdqueue would be LIFO. But certainly
>> the above ain't right for pop! :)
> 
> OK, looking over the history, it looks like the Q was changed from
> FIFO to LIFO ~10years ago (worker)... The reasoning:
> 
>  This is a rather simple patch that may improve cache-hit performance
>  under some conditions by changing the queue of available worker threads
>  from FIFO to LIFO. It also adds a tiny reduction in the arithmetic that
>  happens in the critical section, which will definitely help if you have
>  a lame compiler.
> 
> Seems to me that changing back to FIFO would make sense, esp
> with trunk. We can profile the expense of the '% queue->bounds'
> but it seems to me that if it was really bad, we'd have seen it
> in apr and changed it there... after all, all we're doing
> with that is keeping it in bounds and a comparison and subtraction
> would do that just as well...

Doing some profiling, we can improve things more in apr_queue_pop/push...

    a++;
    if (a>=bounds)
      a -= bounds;

is about 2-4 times as fast as:

    a = (a+1)%bounds;

Seems like an EZ and obvious optimization to me ;)
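
A minimal sketch of the two wrap-around forms compared above, written as standalone helpers so they can be checked against each other. The function names are hypothetical and nothing here is taken from the APR sources. It makes no timing claim; it only illustrates that the two forms produce the same next index, assuming 0 <= a < bounds and bounds > 0.

```c
#include <assert.h>

/* Advance an index with the modulo form. */
static unsigned int demo_wrap_mod(unsigned int a, unsigned int bounds)
{
    return (a + 1) % bounds;
}

/* Advance an index with the compare-and-subtract form.
 * Only valid when a < bounds on entry, so a + 1 is at most bounds. */
static unsigned int demo_wrap_cmp(unsigned int a, unsigned int bounds)
{
    a++;
    if (a >= bounds)
        a -= bounds;
    return a;
}

/* Returns 1 if both forms agree for every valid index, 0 otherwise. */
static int demo_wraps_agree(unsigned int bounds)
{
    unsigned int a;

    assert(bounds > 0);
    for (a = 0; a < bounds; a++) {
        if (demo_wrap_mod(a, bounds) != demo_wrap_cmp(a, bounds))
            return 0;
    }
    return 1;
}
```

Any speed difference between the two depends on the compiler and CPU, so it is worth measuring on the target platform rather than assuming a fixed ratio.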

Re: Performance fix in event mpm

Posted by Jim Van Fleet <jj...@yahoo.com>.
I've run my stress tests with my version of your patch -- I have not run the 
patched branch.  I get super performance with the patch: about 3 times as many 
transactions per second and better response. I could go bigger, with lots of 
CPU left, but I am running out of memory. 

Time to look at memory usage - I have about 130K per connection, which seems 
much too big.

Jim





________________________________
From: Jim Jagielski <ji...@apache.org>
To: dev@httpd.apache.org
Sent: Thu, February 3, 2011 7:19:34 AM
Subject: Re: Performance fix in event mpm

I've run SMOKE tests and not seen any discernible diffs
in performance, but they have not been incredibly
stressful tests.

On Feb 2, 2011, at 7:02 PM, David Dabbs wrote:

> Hi.
> 
> Has anyone compared before/after performance when pounding a pre-patched
> httpd (with ab or other load generator) and with the fdqueue mods? 
> Or, for those more daring readers, observed improvements in a production
> environment?
> Before deploying a manually patched 2.2.x branch, we're probably going to
> run some sort of load test. 
> Having read the thread, I don't think we'd need to do anything other than
> throw a lot of load at it, right?
> 
> Thanks,
> 
> David
> 
> 
> -----Original Message-----
> From: Niklas Edmundsson [mailto:nikke@acc.umu.se] 
> Sent: Friday, January 28, 2011 8:58 AM
> To: dev@httpd.apache.org
> Subject: Re: Performance fix in event mpm
> 
> On Fri, 28 Jan 2011, Jim Jagielski wrote:
> 
>> I was going to submit it as a backport, yes.
> 
> I have a strong feeling that this can explain the weird performance 
> issues/behavior we've seen when hitting any bottleneck that results in 
> requests being queued up.
> 
> Thanks for finding/fixing this :)
> 
>> 
>> On Jan 27, 2011, at 9:08 PM, David Dabbs wrote:
>> 
>>> I see that the changes described below were applied to the trunk worker and
>>> event MPM code.
>>> Would you consider applying it to the 2.2x branch? I will do so myself and
>>> test in my env.
>>> 
>>> 
>>> Many thanks,
>>> 
>>> David Dabbs
>>> 
>>> 
>>> 
>>> -----Original Message-----
>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>> Sent: Thursday, January 27, 2011 12:43 PM
>>> To: dev@httpd.apache.org
>>> Subject: Re: Performance fix in event mpm
>>> 
>>> 
>>> On Jan 27, 2011, at 1:31 PM, Jim Jagielski wrote:
>>> 
>>>> 
>>>> On Jan 27, 2011, at 12:21 PM, Jim Van Fleet wrote:
>>>> 
>>>>> I have been developing an application using apache 2.2 on linux 2.6.  My
>>>>> test environment creates a very heavy workload and puts a strain on every
>>>>> thing.
>>>>> 
>>>>> I would get good performance for a while and as the load ramped up,
>>>>> performance would quickly get very bad.  Erratically, transactions would
>>>>> finish quickly or take a very long time -- tcpdump analysis showed
>>>>> millisecond or seconds between responses. Also, the recv queue got very
>>>>> large.
>>>>> 
>>>>> I noticed that ap_queue_pop removes elements from the queue LIFO rather
>>>>> than FIFO.  Also noticed that apr_queue_pop uses a different technique which
>>>>> is not too expensive and is fifo, so I changed ap_queue/pop/push to use that
>>>>> technique and the receive problems went away.
>>>>> 
>>>>> Please let me know if you think this change is appropriate and/or if
>>>>> you'd like more data
>>>>> 
>>>> 
>>>> Hmmm.... Not sure why the fdqueue would be LIFO. But certainly
>>>> the above ain't right for pop! :)
>>> 
>>> OK, looking over the history, it looks like the Q was changed from
>>> FIFO to LIFO ~10years ago (worker)... The reasoning:
>>> 
>>> This is a rather simple patch that may improve cache-hit performance
>>> under some conditions by changing the queue of available worker threads
>>> from FIFO to LIFO. It also adds a tiny reduction in the arithmetic that
>>> happens in the critical section, which will definitely help if you have
>>> a lame compiler.
>>> 
>>> Seems to me that changing back to FIFO would make sense, esp
>>> with trunk. We can profile the expense of the '% queue->bounds'
>>> but it seems to me that if it was really bad, we'd have seen it
>>> in apr and changed it there... after all, all we're doing
>>> with that is keeping it in bounds and a comparison and subtraction
>>> would do that just as well...
>>> 
>> 
> 
> 
> /Nikke
> -- 
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>  Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se      |    nikke@acc.umu.se
> ---------------------------------------------------------------------------
>  I am Bashir on Borg: I'd be hostile to if my poop was cubed!
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> 


      

Re: Performance fix in event mpm

Posted by Jim Jagielski <ji...@apache.org>.
I've run SMOKE tests and not seen any discernible diffs
in performance, but they have not been incredibly
stressful tests.

On Feb 2, 2011, at 7:02 PM, David Dabbs wrote:

> Hi.
> 
> Has anyone compared before/after performance when pounding a pre-patched
> httpd (with ab or other load generator) and with the fdqueue mods? 
> Or, for those more daring readers, observed improvements in a production
> environment?
> Before deploying a manually patched 2.2.x branch, we're probably going to
> run some sort of load test. 
> Having read the thread, I don't think we'd need to do anything other than
> throw a lot of load at it, right?
> 
> Thanks,
> 
> David
> 
> 
> -----Original Message-----
> From: Niklas Edmundsson [mailto:nikke@acc.umu.se] 
> Sent: Friday, January 28, 2011 8:58 AM
> To: dev@httpd.apache.org
> Subject: Re: Performance fix in event mpm
> 
> On Fri, 28 Jan 2011, Jim Jagielski wrote:
> 
>> I was going to submit it as a backport, yes.
> 
> I have a strong feeling that this can explain the weird performance 
> issues/behavior we've seen when hitting any bottleneck that results in 
> requests being queued up.
> 
> Thanks for finding/fixing this :)
> 
>> 
>> On Jan 27, 2011, at 9:08 PM, David Dabbs wrote:
>> 
>>> I see that the changes described below were applied to the trunk worker and
>>> event MPM code.
>>> Would you consider applying it to the 2.2x branch? I will do so myself and
>>> test in my env.
>>> 
>>> 
>>> Many thanks,
>>> 
>>> David Dabbs
>>> 
>>> 
>>> 
>>> -----Original Message-----
>>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>>> Sent: Thursday, January 27, 2011 12:43 PM
>>> To: dev@httpd.apache.org
>>> Subject: Re: Performance fix in event mpm
>>> 
>>> 
>>> On Jan 27, 2011, at 1:31 PM, Jim Jagielski wrote:
>>> 
>>>> 
>>>> On Jan 27, 2011, at 12:21 PM, Jim Van Fleet wrote:
>>>> 
>>>>> I have been developing an application using apache 2.2 on linux 2.6.  My
>>>>> test environment creates a very heavy workload and puts a strain on every
>>>>> thing.
>>>>> 
>>>>> I would get good performance for a while and as the load ramped up,
>>>>> performance would quickly get very bad.  Erratically, transactions would
>>>>> finish quickly or take a very long time -- tcpdump analysis showed
>>>>> millisecond or seconds between responses. Also, the recv queue got very
>>>>> large.
>>>>> 
>>>>> I noticed that ap_queue_pop removes elements from the queue LIFO rather
>>>>> than FIFO.  Also noticed that apr_queue_pop uses a different technique which
>>>>> is not too expensive and is fifo, so I changed ap_queue/pop/push to use that
>>>>> technique and the receive problems went away.
>>>>> 
>>>>> Please let me know if you think this change is appropriate and/or if
>>>>> you'd like more data
>>>>> 
>>>> 
>>>> Hmmm.... Not sure why the fdqueue would be LIFO. But certainly
>>>> the above ain't right for pop! :)
>>> 
>>> OK, looking over the history, it looks like the Q was changed from
>>> FIFO to LIFO ~10years ago (worker)... The reasoning:
>>> 
>>> This is a rather simple patch that may improve cache-hit performance
>>> under some conditions by changing the queue of available worker threads
>>> from FIFO to LIFO. It also adds a tiny reduction in the arithmetic that
>>> happens in the critical section, which will definitely help if you have
>>> a lame compiler.
>>> 
>>> Seems to me that changing back to FIFO would make sense, esp
>>> with trunk. We can profile the expense of the '% queue->bounds'
>>> but it seems to me that if it was really bad, we'd have seen it
>>> in apr and changed it there... after all, all we're doing
>>> with that is keeping it in bounds and a comparison and subtraction
>>> would do that just as well...
>>> 
>> 
> 
> 
> /Nikke
> -- 
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>  Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se      |     nikke@acc.umu.se
> ---------------------------------------------------------------------------
>  I am Bashir on Borg: I'd be hostile to if my poop was cubed!
> =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
> 


RE: Performance fix in event mpm

Posted by David Dabbs <dm...@gmail.com>.
Hi.

Has anyone compared before/after performance when pounding a pre-patched
httpd (with ab or other load generator) and with the fdqueue mods? 
Or, for those more daring readers, observed improvements in a production
environment?
Before deploying a manually patched 2.2.x branch, we're probably going to
run some sort of load test. 
Having read the thread, I don't think we'd need to do anything other than
throw a lot of load at it, right?

Thanks,

David


-----Original Message-----
From: Niklas Edmundsson [mailto:nikke@acc.umu.se] 
Sent: Friday, January 28, 2011 8:58 AM
To: dev@httpd.apache.org
Subject: Re: Performance fix in event mpm

On Fri, 28 Jan 2011, Jim Jagielski wrote:

> I was going to submit it as a backport, yes.

I have a strong feeling that this can explain the weird performance 
issues/behavior we've seen when hitting any bottleneck that results in 
requests being queued up.

Thanks for finding/fixing this :)

>
> On Jan 27, 2011, at 9:08 PM, David Dabbs wrote:
>
>> I see that the changes described below were applied to the trunk worker and
>> event MPM code.
>> Would you consider applying it to the 2.2x branch? I will do so myself and
>> test in my env.
>>
>>
>> Many thanks,
>>
>> David Dabbs
>>
>>
>>
>> -----Original Message-----
>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>> Sent: Thursday, January 27, 2011 12:43 PM
>> To: dev@httpd.apache.org
>> Subject: Re: Performance fix in event mpm
>>
>>
>> On Jan 27, 2011, at 1:31 PM, Jim Jagielski wrote:
>>
>>>
>>> On Jan 27, 2011, at 12:21 PM, Jim Van Fleet wrote:
>>>
>>>> I have been developing an application using apache 2.2 on linux 2.6.  My
>>>> test environment creates a very heavy workload and puts a strain on every
>>>> thing.
>>>>
>>>> I would get good performance for a while and as the load ramped up,
>>>> performance would quickly get very bad.  Erratically, transactions would
>>>> finish quickly or take a very long time -- tcpdump analysis showed
>>>> millisecond or seconds between responses. Also, the recv queue got very
>>>> large.
>>>>
>>>> I noticed that ap_queue_pop removes elements from the queue LIFO rather
>>>> than FIFO.  Also noticed that apr_queue_pop uses a different technique which
>>>> is not too expensive and is fifo, so I changed ap_queue/pop/push to use that
>>>> technique and the receive problems went away.
>>>>
>>>> Please let me know if you think this change is appropriate and/or if
>>>> you'd like more data
>>>>
>>>
>>> Hmmm.... Not sure why the fdqueue would be LIFO. But certainly
>>> the above ain't right for pop! :)
>>
>> OK, looking over the history, it looks like the Q was changed from
>> FIFO to LIFO ~10years ago (worker)... The reasoning:
>>
>>  This is a rather simple patch that may improve cache-hit performance
>>  under some conditions by changing the queue of available worker threads
>>  from FIFO to LIFO. It also adds a tiny reduction in the arithmetic that
>>  happens in the critical section, which will definitely help if you have
>>  a lame compiler.
>>
>> Seems to me that changing back to FIFO would make sense, esp
>> with trunk. We can profile the expense of the '% queue->bounds'
>> but it seems to me that if it was really bad, we'd have seen it
>> in apr and changed it there... after all, all we're doing
>> with that is keeping it in bounds and a comparison and subtraction
>> would do that just as well...
>>
>


/Nikke
-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se      |     nikke@acc.umu.se
---------------------------------------------------------------------------
  I am Bashir on Borg: I'd be hostile to if my poop was cubed!
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=


Re: Performance fix in event mpm

Posted by Niklas Edmundsson <ni...@acc.umu.se>.
On Fri, 28 Jan 2011, Jim Jagielski wrote:

> I was going to submit it as a backport, yes.

I have a strong feeling that this can explain the weird performance 
issues/behavior we've seen when hitting any bottleneck that results in 
requests being queued up.

Thanks for finding/fixing this :)

>
> On Jan 27, 2011, at 9:08 PM, David Dabbs wrote:
>
>> I see that the changes described below were applied to the trunk worker and
>> event MPM code.
>> Would you consider applying it to the 2.2x branch? I will do so myself and
>> test in my env.
>>
>>
>> Many thanks,
>>
>> David Dabbs
>>
>>
>>
>> -----Original Message-----
>> From: Jim Jagielski [mailto:jim@jaguNET.com]
>> Sent: Thursday, January 27, 2011 12:43 PM
>> To: dev@httpd.apache.org
>> Subject: Re: Performance fix in event mpm
>>
>>
>> On Jan 27, 2011, at 1:31 PM, Jim Jagielski wrote:
>>
>>>
>>> On Jan 27, 2011, at 12:21 PM, Jim Van Fleet wrote:
>>>
>>>> I have been developing an application using apache 2.2 on linux 2.6.  My
>> test environment creates a very heavy workload and puts a strain on every
>> thing.
>>>>
>>>> I would get good performance for a while and as the load ramped up,
>> performance would quickly get very bad.  Erratically, transactions would
>> finish quickly or take a very long time -- tcpdump analysis showed
>> millisecond or seconds between responses. Also, the recv queue got very
>> large.
>>>>
>>>> I noticed that ap_queue_pop removes elements from the queue LIFO rather
>> than FIFO.  Also noticed that apr_queue_pop uses a different technique which
>> is not too expensive and is fifo, so I changed ap_queue/pop/push to use that
>> technique and the receive problems went away.
>>>>
>>>> Please let me know if you think this change is appropriate and/or if
>> you'd like more data
>>>>
>>>
>>> Hmmm.... Not sure why the fdqueue would be LIFO. But certainly
>>> the above ain't right for pop! :)
>>
>> OK, looking over the history, it looks like the Q was changed from
>> FIFO to LIFO ~10years ago (worker)... The reasoning:
>>
>>  This is a rather simple patch that may improve cache-hit performance
>>  under some conditions by changing the queue of available worker threads
>>  from FIFO to LIFO. It also adds a tiny reduction in the arithmetic that
>>  happens in the critical section, which will definitely help if you have
>>  a lame compiler.
>>
>> Seems to me that changing back to FIFO would make sense, esp
>> with trunk. We can profile the expense of the '% queue->bounds'
>> but it seems to me that if it was really bad, we'd have seen it
>> in apr and changed it there... after all, all we're doing
>> with that is keeping it in bounds and a comparison and subtraction
>> would do that just as well...
>>
>


/Nikke
-- 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
  Niklas Edmundsson, Admin @ {acc,hpc2n}.umu.se      |     nikke@acc.umu.se
---------------------------------------------------------------------------
  I am Bashir on Borg: I'd be hostile to if my poop was cubed!
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
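
The remark above about odd behaviour once requests queue up can be illustrated with a toy model. This is a sketch under loud assumptions, not a measurement of httpd: the function name, the tick-based model, and the fixed rates (two arrivals and one removal per tick during a burst, then removals only until the queue drains) are all invented for illustration. It reports the longest time any item waited under FIFO and under LIFO removal.

```c
#include <assert.h>

#define DEMO_MAX_ITEMS 256

/* Toy model (hypothetical, not httpd code). During the first
 * burst_ticks ticks, two items arrive per tick and one is removed;
 * afterwards one item is removed per tick until the queue is empty.
 * Each item is stamped with its arrival tick. Returns the largest
 * wait, in ticks, seen by any removed item.
 * lifo != 0 removes the newest item; lifo == 0 removes the oldest. */
static int demo_max_wait(int lifo, int burst_ticks)
{
    int stamps[DEMO_MAX_ITEMS];
    int head = 0;   /* index of the oldest queued item */
    int tail = 0;   /* one past the newest queued item */
    int max_wait = 0;
    int tick;

    assert(burst_ticks > 0 && 2 * burst_ticks <= DEMO_MAX_ITEMS);

    for (tick = 0; tick < burst_ticks || head < tail; tick++) {
        int stamp, waited;

        if (tick < burst_ticks) {
            stamps[tail++] = tick;
            stamps[tail++] = tick;
        }
        /* One removal per tick, from the chosen end of the queue. */
        stamp = lifo ? stamps[--tail] : stamps[head++];
        waited = tick - stamp;
        if (waited > max_wait)
            max_wait = waited;
    }
    return max_wait;
}
```

In this model, with a burst of 8 ticks, FIFO removal gives a longest wait of 8 ticks, while LIFO removal serves new arrivals immediately during the burst but leaves the oldest item waiting 15 ticks, which is the kind of uneven latency a backlog can produce when the newest entry is always taken first.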

Re: Performance fix in event mpm

Posted by Jim Jagielski <ji...@apache.org>.
I was going to submit it as a backport, yes.

On Jan 27, 2011, at 9:08 PM, David Dabbs wrote:

> I see that the changes described below were applied to the trunk worker and
> event MPM code.
> Would you consider applying it to the 2.2x branch? I will do so myself and
> test in my env.
> 
> 
> Many thanks,
> 
> David Dabbs
> 
> 
> 
> -----Original Message-----
> From: Jim Jagielski [mailto:jim@jaguNET.com] 
> Sent: Thursday, January 27, 2011 12:43 PM
> To: dev@httpd.apache.org
> Subject: Re: Performance fix in event mpm
> 
> 
> On Jan 27, 2011, at 1:31 PM, Jim Jagielski wrote:
> 
>> 
>> On Jan 27, 2011, at 12:21 PM, Jim Van Fleet wrote:
>> 
>>> I have been developing an application using apache 2.2 on linux 2.6.  My
> test environment creates a very heavy workload and puts a strain on every
> thing.  
>>> 
>>> I would get good performance for a while and as the load ramped up,
> performance would quickly get very bad.  Erratically, transactions would
> finish quickly or take a very long time -- tcpdump analysis showed
> millisecond or seconds between responses. Also, the recv queue got very
> large.
>>> 
>>> I noticed that ap_queue_pop removes elements from the queue LIFO rather
> than FIFO.  Also noticed that apr_queue_pop uses a different technique which
> is not too expensive and is fifo, so I changed ap_queue/pop/push to use that
> technique and the receive problems went away.
>>> 
>>> Please let me know if you think this change is appropriate and/or if
> you'd like more data
>>> 
>> 
>> Hmmm.... Not sure why the fdqueue would be LIFO. But certainly
>> the above ain't right for pop! :)
> 
> OK, looking over the history, it looks like the Q was changed from
> FIFO to LIFO ~10years ago (worker)... The reasoning:
> 
>  This is a rather simple patch that may improve cache-hit performance
>  under some conditions by changing the queue of available worker threads
>  from FIFO to LIFO. It also adds a tiny reduction in the arithmetic that
>  happens in the critical section, which will definitely help if you have
>  a lame compiler.
> 
> Seems to me that changing back to FIFO would make sense, esp
> with trunk. We can profile the expense of the '% queue->bounds'
> but it seems to me that if it was really bad, we'd have seen it
> in apr and changed it there... after all, all we're doing
> with that is keeping it in bounds and a comparison and subtraction
> would do that just as well...
> 


RE: Performance fix in event mpm

Posted by David Dabbs <dm...@gmail.com>.
I see that the changes described below were applied to the trunk worker and
event MPM code.
Would you consider applying it to the 2.2x branch? I will do so myself and
test in my env.


Many thanks,

David Dabbs



-----Original Message-----
From: Jim Jagielski [mailto:jim@jaguNET.com] 
Sent: Thursday, January 27, 2011 12:43 PM
To: dev@httpd.apache.org
Subject: Re: Performance fix in event mpm


On Jan 27, 2011, at 1:31 PM, Jim Jagielski wrote:

> 
> On Jan 27, 2011, at 12:21 PM, Jim Van Fleet wrote:
> 
>> I have been developing an application using apache 2.2 on linux 2.6.  My
test environment creates a very heavy workload and puts a strain on every
thing.  
>> 
>> I would get good performance for a while and as the load ramped up,
performance would quickly get very bad.  Erratically, transactions would
finish quickly or take a very long time -- tcpdump analysis showed
millisecond or seconds between responses. Also, the recv queue got very
large.
>> 
>> I noticed that ap_queue_pop removes elements from the queue LIFO rather
than FIFO.  Also noticed that apr_queue_pop uses a different technique which
is not too expensive and is fifo, so I changed ap_queue/pop/push to use that
technique and the receive problems went away.
>> 
>> Please let me know if you think this change is appropriate and/or if
you'd like more data
>> 
> 
> Hmmm.... Not sure why the fdqueue would be LIFO. But certainly
> the above ain't right for pop! :)

OK, looking over the history, it looks like the Q was changed from
FIFO to LIFO ~10years ago (worker)... The reasoning:

  This is a rather simple patch that may improve cache-hit performance
  under some conditions by changing the queue of available worker threads
  from FIFO to LIFO. It also adds a tiny reduction in the arithmetic that
  happens in the critical section, which will definitely help if you have
  a lame compiler.

Seems to me that changing back to FIFO would make sense, esp
with trunk. We can profile the expense of the '% queue->bounds'
but it seems to me that if it was really bad, we'd have seen it
in apr and changed it there... after all, all we're doing
with that is keeping it in bounds and a comparison and subtraction
would do that just as well...


Re: Performance fix in event mpm

Posted by Jim Jagielski <ji...@jaguNET.com>.
On Jan 27, 2011, at 1:31 PM, Jim Jagielski wrote:

> 
> On Jan 27, 2011, at 12:21 PM, Jim Van Fleet wrote:
> 
>> I have been developing an application using apache 2.2 on linux 2.6.  My test environment creates a very heavy workload and puts a strain on every thing.  
>> 
>> I would get good performance for a while and as the load ramped up, performance would quickly get very bad.  Erratically, transactions would finish quickly or take a very long time -- tcpdump analysis showed millisecond or seconds between responses. Also, the recv queue got very large.
>> 
>> I noticed that ap_queue_pop removes elements from the queue LIFO rather than FIFO.  Also noticed that apr_queue_pop uses a different technique which is not too expensive and is fifo, so I changed ap_queue/pop/push to use that technique and the receive problems went away.
>> 
>> Please let me know if you think this change is appropriate and/or if you'd like more data
>> 
> 
> Hmmm.... Not sure why the fdqueue would be LIFO. But certainly
> the above ain't right for pop! :)

OK, looking over the history, it looks like the Q was changed from
FIFO to LIFO ~10years ago (worker)... The reasoning:

  This is a rather simple patch that may improve cache-hit performance
  under some conditions by changing the queue of available worker threads
  from FIFO to LIFO. It also adds a tiny reduction in the arithmetic that
  happens in the critical section, which will definitely help if you have
  a lame compiler.

Seems to me that changing back to FIFO would make sense, esp
with trunk. We can profile the expense of the '% queue->bounds'
but it seems to me that if it was really bad, we'd have seen it
in apr and changed it there... after all, all we're doing
with that is keeping it in bounds and a comparison and subtraction
would do that just as well...

Re: Performance fix in event mpm

Posted by Jim Jagielski <ji...@jaguNET.com>.
On Jan 27, 2011, at 12:21 PM, Jim Van Fleet wrote:

> I have been developing an application using apache 2.2 on linux 2.6.  My test environment creates a very heavy workload and puts a strain on every thing.  
> 
> I would get good performance for a while and as the load ramped up, performance would quickly get very bad.  Erratically, transactions would finish quickly or take a very long time -- tcpdump analysis showed millisecond or seconds between responses. Also, the recv queue got very large.
> 
> I noticed that ap_queue_pop removes elements from the queue LIFO rather than FIFO.  Also noticed that apr_queue_pop uses a different technique which is not too expensive and is fifo, so I changed ap_queue/pop/push to use that technique and the receive problems went away.
> 
> snippet from ap_queue_pop (push is similar with appropriate changes to the fd_queue_t struct)
> 
>     AP_DEBUG_ASSERT(!queue->terminated);
> #if 1
>     ap_assert(!ap_queue_full(queue));  /* we'd never expect the queue to be full, so for debug, we check */
> #else
>     AP_DEBUG_ASSERT(!ap_queue_full(queue));
> #endif
> 
> #if 1
>     elem = &queue->data[queue->in];
>     queue->in = (queue->in + 1) % queue->bounds;
> #else
>     elem = &queue->data[queue->nelts];
> #endif
>     elem->sd = sd;
>     elem->cs = cs;
>     elem->p = p;
>     queue->nelts++;
> 
> Please let me know if you think this change is appropriate and/or if you'd like more data
> 

Hmmm.... Not sure why the fdqueue would be LIFO. But certainly
the above ain't right for pop! :)