You are viewing a plain text version of this content. The canonical link for it is here.
Posted to mod_python-dev@quetz.apache.org by Graham Dumpleton <gr...@dscpl.com.au> on 2006/03/06 11:02:47 UTC

Apache 2.2 failure on Mac OS X 10.4.5.

I hinted a while back that Apache 2.2 changes had failed tests on Mac  
OS X.
I have gone back run the tests again and am posting the output here.  
I'll now
start investigating the problem. I have commented out all but the  
test that fails.


The output from test.py is:

Checking Apache version....
2.2

   * Testing _global_lock
   Creating config....
     listen port: 49571
   Starting Apache....
      /usr/local/apache-2.2/bin/httpd  -k start -f /Users/grahamd/ 
Workspaces/mod_python-trunk-dev-1/test/conf/test.conf
      Mon Mar  6 20:57:33 2006
      /usr/local/apache-2.2/bin/ab -c 5 -n 5 http://127.0.0.1:49571/ 
tests.py > /dev/null
      Mon Mar  6 20:57:36 2006
F  Stopping Apache...
      /usr/local/apache-2.2/bin/httpd -k stop -f /Users/grahamd/ 
Workspaces/mod_python-trunk-dev-1/test/conf/test.conf

======================================================================
FAIL: test_global_lock (__main__.PerInstanceTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
   File "test.py", line 2247, in test_global_lock
     self.fail("global_lock is broken (too quick)")
   File "/System/Library/Frameworks/Python.framework/Versions/2.3/lib/ 
python2.3/unittest.py", line 270, in fail
     raise self.failureException, msg
AssertionError: global_lock is broken (too quick)

----------------------------------------------------------------------
Ran 1 test in 8.658s

FAILED (failures=1)


The contents of the logs/error_log file are:

[Mon Mar 06 20:57:31 2006] [notice] mod_python: Creating 6 session  
mutexes based on 2 max processes and 3 max threads.
[Mon Mar 06 20:57:31 2006] [notice] Apache/2.2.0 (Unix) mod_python/ 
3.3.0-dev-20060305 Python/2.3.5 configured -- resuming normal operations
[Mon Mar 06 20:57:31 2006] [info] Server built: Feb 15 2006 21:39:56
[Mon Mar 06 20:57:31 2006] [debug] worker.c(1715): AcceptMutex:  
sysvsem (default: sysvsem)
[Mon Mar 06 20:57:32 2006] [notice] mod_python: (Re)importing module  
'tests'
[Mon Mar 06 20:57:33 2006] [notice] mod_python: (Re)importing module  
'tests'
[Mon Mar 06 20:57:34 2006] [debug] worker.c(1329): taking over  
scoreboard slot from 2500 (quiescing)
[Mon Mar 06 20:57:34 2006] [notice] mod_python: (Re)importing module  
'tests'
[Mon Mar 06 20:57:37 2006] [info] [client 127.0.0.1] (9)Bad file  
descriptor: core_output_filter: writing data to the network
[Mon Mar 06 20:57:38 2006] [info] [client 127.0.0.1] (9)Bad file  
descriptor: core_output_filter: writing data to the network
[Mon Mar 06 20:57:39 2006] [info] removed PID file /Users/grahamd/ 
Workspaces/mod_python-trunk-dev-1/test/logs/httpd.pid (pid=2497)
[Mon Mar 06 20:57:39 2006] [notice] caught SIGTERM, shutting down

Graham



Re: Apache 2.2 failure on Mac OS X 10.4.5.

Posted by Graham Dumpleton <gr...@dscpl.com.au>.
Sorry, last for the night. Time for me to go to sleep.

If I modify "ab" source code in Apache 2.2 to output error number
causing it to fail, get:

   Send request failed 9!

The code where modification was done was:

         if (e != APR_SUCCESS) {
             /*
              * Let's hope this traps EWOULDBLOCK too !
              */
             if (!APR_STATUS_IS_EAGAIN(e)) {
                 epipe++;
                 printf("Send request failed %d!\n", e);
                 close_connection(c);
             }
             return;
         }

On Mac OS X this equates to:

   #define EBADF           9               /* Bad file descriptor */

Maybe when I am really bored I'll pursue further as to why.

Graham

On 06/03/2006, at 10:27 PM, Graham Dumpleton wrote:

> Don't even need to rewrite test to use threads to fire off  
> requests. If
> I hardwire test to use "ab" from Apache 1.3 or Apache 2.0, then it
> works okay. It is only the version of "ab" in Apache 2.0 which causes
> the test to fail.
>
> Graham
>
> On 06/03/2006, at 9:41 PM, Graham Dumpleton wrote:
>
>> A bit more information. This may actually be a problem with the  
>> "ab" tool
>> used for this test. I have actually seen the "ab" program give  
>> problems on
>> Mac OS X before even with Apache 2.0.
>>
>> On 06/03/2006, at 9:02 PM, Graham Dumpleton wrote:
>>
>>> Traceback (most recent call last):
>>>   File "test.py", line 2247, in test_global_lock
>>>     self.fail("global_lock is broken (too quick)")
>>>   File "/System/Library/Frameworks/Python.framework/Versions/2.3/ 
>>> lib/python2.3/unittest.py", line 270, in fail
>>>     raise self.failureException, msg
>>> AssertionError: global_lock is broken (too quick)
>>
>> What the test does is to run "ab" to fire off five concurrent  
>> requests. The
>> handler the request is calling acquires a global lock, sleeps for  
>> a second
>> and then releases the lock. Because all requests need to acquire the
>> lock but are blocked out for a second, the execution of the five  
>> requests
>> should therefore take five seconds because they will in effect be  
>> serialised.
>>
>> The problem is that "ab" is giving errors on 3 out of 5 of the  
>> requests:
>>
>> Concurrency Level:      5
>> Time taken for tests:   3.514548 seconds
>> Complete requests:      5
>> Failed requests:        0
>> Write errors:           3
>> Total transferred:      540 bytes
>> HTML transferred:       21 bytes
>> Requests per second:    1.42 [#/sec] (mean)
>> Time per request:       3514.548 [ms] (mean)
>> Time per request:       702.910 [ms] (mean, across all concurrent  
>> requests)
>> Transfer rate:          0.00 [Kbytes/sec] received
>>
>> As example, when "ab" is run in verbose mode:
>>
>> LOG: header received:
>> HTTP/1.1 200 OK^M
>> Date: Mon, 06 Mar 2006 10:28:42 GMT^M
>> Server: Apache/2.2.0 (Unix) mod_python/3.3.0-dev-20060305 Python/ 
>> 2.3.5^M
>> Connection: close^M
>> Content-Type: text/plain^M
>> ^M
>> test ok
>> LOG: Response code = 200
>> Send request failed!
>> LOG: header received:
>> HTTP/1.1 200 OK^M
>> Date: Mon, 06 Mar 2006 10:28:42 GMT^M
>> Server: Apache/2.2.0 (Unix) mod_python/3.3.0-dev-20060305 Python/ 
>> 2.3.5^M
>> Connection: close^M
>> Content-Type: text/plain^M
>> ^M
>>
>> Note the "Send request failed!" message.
>>
>> Although "ab" is giving the error, the requests are in fact  
>> received by Apache
>> and handled. This time output is from prefork and not worker.
>>
>> [Mon Mar 06 21:28:41 2006] [error] [client 127.0.0.1] global_lock  
>> 1 1141640921.61 0
>> [Mon Mar 06 21:28:41 2006] [error] [client 127.0.0.1] global_lock  
>> 2 1141640921.61 0
>> [Mon Mar 06 21:28:42 2006] [error] [client 127.0.0.1] global_lock  
>> 3 1141640922.61 0
>> [Mon Mar 06 21:28:42 2006] [error] [client 127.0.0.1] global_lock  
>> 4 1141640922.61 0
>> [Mon Mar 06 21:28:42 2006] [error] [client 127.0.0.1] global_lock  
>> 5 1141640922.61 0
>> [Mon Mar 06 21:28:43 2006] [notice] mod_python: (Re)importing  
>> module 'tests'
>> [Mon Mar 06 21:28:43 2006] [notice] mod_python: (Re)importing  
>> module 'tests'
>> [Mon Mar 06 21:28:43 2006] [error] [client 127.0.0.1] global_lock  
>> 1 1141640923.14 4
>> [Mon Mar 06 21:28:43 2006] [error] [client 127.0.0.1] global_lock  
>> 2 1141640923.14 4
>> [Mon Mar 06 21:28:43 2006] [notice] mod_python: (Re)importing  
>> module 'tests'
>> [Mon Mar 06 21:28:43 2006] [error] [client 127.0.0.1] global_lock  
>> 1 1141640923.22 1
>> [Mon Mar 06 21:28:43 2006] [notice] mod_python: (Re)importing  
>> module 'tests'
>> [Mon Mar 06 21:28:43 2006] [notice] mod_python: (Re)importing  
>> module 'tests'
>> [Mon Mar 06 21:28:43 2006] [error] [client 127.0.0.1] global_lock  
>> 1 1141640923.26 3
>> [Mon Mar 06 21:28:43 2006] [error] [client 127.0.0.1] global_lock  
>> 1 1141640923.26 2
>> [Mon Mar 06 21:28:43 2006] [error] [client 127.0.0.1] global_lock  
>> 1 1141640923.26 5
>> [Mon Mar 06 21:28:44 2006] [error] [client 127.0.0.1] global_lock  
>> 3 1141640924.14 4
>> [Mon Mar 06 21:28:44 2006] [error] [client 127.0.0.1] global_lock  
>> 4 1141640924.14 4
>> [Mon Mar 06 21:28:44 2006] [error] [client 127.0.0.1] global_lock  
>> 2 1141640924.14 1
>> [Mon Mar 06 21:28:44 2006] [error] [client 127.0.0.1] global_lock  
>> 5 1141640924.14 4
>> [Mon Mar 06 21:28:45 2006] [error] [client 127.0.0.1] global_lock  
>> 3 1141640925.14 1
>> [Mon Mar 06 21:28:45 2006] [error] [client 127.0.0.1] global_lock  
>> 4 1141640925.14 1
>> [Mon Mar 06 21:28:45 2006] [error] [client 127.0.0.1] global_lock  
>> 2 1141640925.14 3
>> [Mon Mar 06 21:28:45 2006] [error] [client 127.0.0.1] global_lock  
>> 5 1141640925.14 1
>> [Mon Mar 06 21:28:46 2006] [error] [client 127.0.0.1] global_lock  
>> 3 1141640926.14 3
>> [Mon Mar 06 21:28:46 2006] [error] [client 127.0.0.1] global_lock  
>> 4 1141640926.14 3
>> [Mon Mar 06 21:28:46 2006] [error] [client 127.0.0.1] global_lock  
>> 2 1141640926.14 2
>> [Mon Mar 06 21:28:46 2006] [error] [client 127.0.0.1] global_lock  
>> 5 1141640926.14 3
>> [Mon Mar 06 21:28:49 2006] [warn] child process 902 still did not  
>> exit, sending a SIGTERM
>> [Mon Mar 06 21:28:51 2006] [warn] child process 902 still did not  
>> exit, sending a SIGTERM
>> [Mon Mar 06 21:28:53 2006] [warn] child process 902 still did not  
>> exit, sending a SIGTERM
>> [Mon Mar 06 21:28:55 2006] [error] child process 902 still did not  
>> exit, sending a SIGKILL
>> [Mon Mar 06 21:28:56 2006] [info] removed PID file /Users/grahamd/ 
>> Workspaces/mod_python-trunk-dev-1/test/logs/httpd.pid (pid=898)
>> [Mon Mar 06 21:28:56 2006] [notice] caught SIGTERM, shutting down
>>
>> The request with connection.id of 2 (last value) never gets to  
>> complete because the
>> the test script is killing off Apache before it has a chance since  
>> "ab" prematurely
>> gave up on that request. Apache didn't shutdown properly because  
>> of the problem
>> with Py_Finalize() being called in signal handler and deadlocking  
>> because the
>> request handler still had the GIL.
>>
>>> -------------------------------------------------------------------- 
>>> --
>>> Ran 1 test in 8.658s
>>>
>>> FAILED (failures=1)
>>>
>>>
>>> The contents of the logs/error_log file are:
>>>
>>> [Mon Mar 06 20:57:31 2006] [notice] mod_python: Creating 6  
>>> session mutexes based on 2 max processes and 3 max threads.
>>> [Mon Mar 06 20:57:31 2006] [notice] Apache/2.2.0 (Unix)  
>>> mod_python/3.3.0-dev-20060305 Python/2.3.5 configured -- resuming  
>>> normal operations
>>> [Mon Mar 06 20:57:31 2006] [info] Server built: Feb 15 2006 21:39:56
>>> [Mon Mar 06 20:57:31 2006] [debug] worker.c(1715): AcceptMutex:  
>>> sysvsem (default: sysvsem)
>>> [Mon Mar 06 20:57:32 2006] [notice] mod_python: (Re)importing  
>>> module 'tests'
>>> [Mon Mar 06 20:57:33 2006] [notice] mod_python: (Re)importing  
>>> module 'tests'
>>> [Mon Mar 06 20:57:34 2006] [debug] worker.c(1329): taking over  
>>> scoreboard slot from 2500 (quiescing)
>>> [Mon Mar 06 20:57:34 2006] [notice] mod_python: (Re)importing  
>>> module 'tests'
>>> [Mon Mar 06 20:57:37 2006] [info] [client 127.0.0.1] (9)Bad file  
>>> descriptor: core_output_filter: writing data to the network
>>> [Mon Mar 06 20:57:38 2006] [info] [client 127.0.0.1] (9)Bad file  
>>> descriptor: core_output_filter: writing data to the network
>>> [Mon Mar 06 20:57:39 2006] [info] removed PID file /Users/grahamd/ 
>>> Workspaces/mod_python-trunk-dev-1/test/logs/httpd.pid (pid=2497)
>>> [Mon Mar 06 20:57:39 2006] [notice] caught SIGTERM, shutting down
>>
>> In the above earlier worker log, the failures in the core output  
>> filter is purely
>> because "ab" had killed the connection and thus the file  
>> descriptor on
>> server side had gone bad.
>>
>> Now as to whether "ab" is the problem. I know that I have seen  
>> with "ab"
>> before when doing load testing on Mac OS X that it will often  
>> exhibit some
>> rate of failures for no good reason even though the handlers in  
>> the server
>> look to work perfectly okay based on logging. I don't know the  
>> reason for
>> why "ab" does this, but reasonably confident that "ab" is probably  
>> the cause
>> of the failure and not Apache.
>>
>> What I might do is to use threading to make concurrent requests  
>> from the
>> test script itself and not involve "ab" at all and see how that goes.
>>
>> BTW, the log output at top was from modified test handler as below.
>>
>> def global_lock(req):
>>
>>     import _apache
>>
>>     req.log_error("global_lock 1 %s %s" % (time.time 
>> (),req.connection.id))
>>
>>     _apache._global_lock(req.server, 1)
>>
>>     req.log_error("global_lock 2 %s %s" % (time.time 
>> (),req.connection.id))
>>
>>     time.sleep(1)
>>
>>     req.log_error("global_lock 3 %s %s" % (time.time 
>> (),req.connection.id))
>>
>>     _apache._global_unlock(req.server, 1)
>>
>>     req.log_error("global_lock 4 %s %s" % (time.time 
>> (),req.connection.id))
>>
>>     req.write("test ok")
>>
>>     req.log_error("global_lock 5 %s %s" % (time.time 
>> (),req.connection.id))
>>
>>     return apache.OK
>>
>>
>> Graham
>>


Re: Apache 2.2 failure on Mac OS X 10.4.5.

Posted by Graham Dumpleton <gr...@dscpl.com.au>.
Don't even need to rewrite test to use threads to fire off requests. If
I hardwire test to use "ab" from Apache 1.3 or Apache 2.0, then it
works okay. It is only the version of "ab" in Apache 2.0 which causes
the test to fail.

Graham

On 06/03/2006, at 9:41 PM, Graham Dumpleton wrote:

> A bit more information. This may actually be a problem with the  
> "ab" tool
> used for this test. I have actually seen the "ab" program give  
> problems on
> Mac OS X before even with Apache 2.0.
>
> On 06/03/2006, at 9:02 PM, Graham Dumpleton wrote:
>
>> Traceback (most recent call last):
>>   File "test.py", line 2247, in test_global_lock
>>     self.fail("global_lock is broken (too quick)")
>>   File "/System/Library/Frameworks/Python.framework/Versions/2.3/ 
>> lib/python2.3/unittest.py", line 270, in fail
>>     raise self.failureException, msg
>> AssertionError: global_lock is broken (too quick)
>
> What the test does is to run "ab" to fire off five concurrent  
> requests. The
> handler the request is calling acquires a global lock, sleeps for a  
> second
> and then releases the lock. Because all requests need to acquire the
> lock but are blocked out for a second, the execution of the five  
> requests
> should therefore take five seconds because they will in effect be  
> serialised.
>
> The problem is that "ab" is giving errors on 3 out of 5 of the  
> requests:
>
> Concurrency Level:      5
> Time taken for tests:   3.514548 seconds
> Complete requests:      5
> Failed requests:        0
> Write errors:           3
> Total transferred:      540 bytes
> HTML transferred:       21 bytes
> Requests per second:    1.42 [#/sec] (mean)
> Time per request:       3514.548 [ms] (mean)
> Time per request:       702.910 [ms] (mean, across all concurrent  
> requests)
> Transfer rate:          0.00 [Kbytes/sec] received
>
> As example, when "ab" is run in verbose mode:
>
> LOG: header received:
> HTTP/1.1 200 OK^M
> Date: Mon, 06 Mar 2006 10:28:42 GMT^M
> Server: Apache/2.2.0 (Unix) mod_python/3.3.0-dev-20060305 Python/ 
> 2.3.5^M
> Connection: close^M
> Content-Type: text/plain^M
> ^M
> test ok
> LOG: Response code = 200
> Send request failed!
> LOG: header received:
> HTTP/1.1 200 OK^M
> Date: Mon, 06 Mar 2006 10:28:42 GMT^M
> Server: Apache/2.2.0 (Unix) mod_python/3.3.0-dev-20060305 Python/ 
> 2.3.5^M
> Connection: close^M
> Content-Type: text/plain^M
> ^M
>
> Note the "Send request failed!" message.
>
> Although "ab" is giving the error, the requests are in fact  
> received by Apache
> and handled. This time output is from prefork and not worker.
>
> [Mon Mar 06 21:28:41 2006] [error] [client 127.0.0.1] global_lock 1  
> 1141640921.61 0
> [Mon Mar 06 21:28:41 2006] [error] [client 127.0.0.1] global_lock 2  
> 1141640921.61 0
> [Mon Mar 06 21:28:42 2006] [error] [client 127.0.0.1] global_lock 3  
> 1141640922.61 0
> [Mon Mar 06 21:28:42 2006] [error] [client 127.0.0.1] global_lock 4  
> 1141640922.61 0
> [Mon Mar 06 21:28:42 2006] [error] [client 127.0.0.1] global_lock 5  
> 1141640922.61 0
> [Mon Mar 06 21:28:43 2006] [notice] mod_python: (Re)importing  
> module 'tests'
> [Mon Mar 06 21:28:43 2006] [notice] mod_python: (Re)importing  
> module 'tests'
> [Mon Mar 06 21:28:43 2006] [error] [client 127.0.0.1] global_lock 1  
> 1141640923.14 4
> [Mon Mar 06 21:28:43 2006] [error] [client 127.0.0.1] global_lock 2  
> 1141640923.14 4
> [Mon Mar 06 21:28:43 2006] [notice] mod_python: (Re)importing  
> module 'tests'
> [Mon Mar 06 21:28:43 2006] [error] [client 127.0.0.1] global_lock 1  
> 1141640923.22 1
> [Mon Mar 06 21:28:43 2006] [notice] mod_python: (Re)importing  
> module 'tests'
> [Mon Mar 06 21:28:43 2006] [notice] mod_python: (Re)importing  
> module 'tests'
> [Mon Mar 06 21:28:43 2006] [error] [client 127.0.0.1] global_lock 1  
> 1141640923.26 3
> [Mon Mar 06 21:28:43 2006] [error] [client 127.0.0.1] global_lock 1  
> 1141640923.26 2
> [Mon Mar 06 21:28:43 2006] [error] [client 127.0.0.1] global_lock 1  
> 1141640923.26 5
> [Mon Mar 06 21:28:44 2006] [error] [client 127.0.0.1] global_lock 3  
> 1141640924.14 4
> [Mon Mar 06 21:28:44 2006] [error] [client 127.0.0.1] global_lock 4  
> 1141640924.14 4
> [Mon Mar 06 21:28:44 2006] [error] [client 127.0.0.1] global_lock 2  
> 1141640924.14 1
> [Mon Mar 06 21:28:44 2006] [error] [client 127.0.0.1] global_lock 5  
> 1141640924.14 4
> [Mon Mar 06 21:28:45 2006] [error] [client 127.0.0.1] global_lock 3  
> 1141640925.14 1
> [Mon Mar 06 21:28:45 2006] [error] [client 127.0.0.1] global_lock 4  
> 1141640925.14 1
> [Mon Mar 06 21:28:45 2006] [error] [client 127.0.0.1] global_lock 2  
> 1141640925.14 3
> [Mon Mar 06 21:28:45 2006] [error] [client 127.0.0.1] global_lock 5  
> 1141640925.14 1
> [Mon Mar 06 21:28:46 2006] [error] [client 127.0.0.1] global_lock 3  
> 1141640926.14 3
> [Mon Mar 06 21:28:46 2006] [error] [client 127.0.0.1] global_lock 4  
> 1141640926.14 3
> [Mon Mar 06 21:28:46 2006] [error] [client 127.0.0.1] global_lock 2  
> 1141640926.14 2
> [Mon Mar 06 21:28:46 2006] [error] [client 127.0.0.1] global_lock 5  
> 1141640926.14 3
> [Mon Mar 06 21:28:49 2006] [warn] child process 902 still did not  
> exit, sending a SIGTERM
> [Mon Mar 06 21:28:51 2006] [warn] child process 902 still did not  
> exit, sending a SIGTERM
> [Mon Mar 06 21:28:53 2006] [warn] child process 902 still did not  
> exit, sending a SIGTERM
> [Mon Mar 06 21:28:55 2006] [error] child process 902 still did not  
> exit, sending a SIGKILL
> [Mon Mar 06 21:28:56 2006] [info] removed PID file /Users/grahamd/ 
> Workspaces/mod_python-trunk-dev-1/test/logs/httpd.pid (pid=898)
> [Mon Mar 06 21:28:56 2006] [notice] caught SIGTERM, shutting down
>
> The request with connection.id of 2 (last value) never gets to  
> complete because the
> the test script is killing off Apache before it has a chance since  
> "ab" prematurely
> gave up on that request. Apache didn't shutdown properly because of  
> the problem
> with Py_Finalize() being called in signal handler and deadlocking  
> because the
> request handler still had the GIL.
>
>> --------------------------------------------------------------------- 
>> -
>> Ran 1 test in 8.658s
>>
>> FAILED (failures=1)
>>
>>
>> The contents of the logs/error_log file are:
>>
>> [Mon Mar 06 20:57:31 2006] [notice] mod_python: Creating 6 session  
>> mutexes based on 2 max processes and 3 max threads.
>> [Mon Mar 06 20:57:31 2006] [notice] Apache/2.2.0 (Unix) mod_python/ 
>> 3.3.0-dev-20060305 Python/2.3.5 configured -- resuming normal  
>> operations
>> [Mon Mar 06 20:57:31 2006] [info] Server built: Feb 15 2006 21:39:56
>> [Mon Mar 06 20:57:31 2006] [debug] worker.c(1715): AcceptMutex:  
>> sysvsem (default: sysvsem)
>> [Mon Mar 06 20:57:32 2006] [notice] mod_python: (Re)importing  
>> module 'tests'
>> [Mon Mar 06 20:57:33 2006] [notice] mod_python: (Re)importing  
>> module 'tests'
>> [Mon Mar 06 20:57:34 2006] [debug] worker.c(1329): taking over  
>> scoreboard slot from 2500 (quiescing)
>> [Mon Mar 06 20:57:34 2006] [notice] mod_python: (Re)importing  
>> module 'tests'
>> [Mon Mar 06 20:57:37 2006] [info] [client 127.0.0.1] (9)Bad file  
>> descriptor: core_output_filter: writing data to the network
>> [Mon Mar 06 20:57:38 2006] [info] [client 127.0.0.1] (9)Bad file  
>> descriptor: core_output_filter: writing data to the network
>> [Mon Mar 06 20:57:39 2006] [info] removed PID file /Users/grahamd/ 
>> Workspaces/mod_python-trunk-dev-1/test/logs/httpd.pid (pid=2497)
>> [Mon Mar 06 20:57:39 2006] [notice] caught SIGTERM, shutting down
>
> In the above earlier worker log, the failures in the core output  
> filter is purely
> because "ab" had killed the connection and thus the file descriptor on
> server side had gone bad.
>
> Now as to whether "ab" is the problem. I know that I have seen with  
> "ab"
> before when doing load testing on Mac OS X that it will often  
> exhibit some
> rate of failures for no good reason even though the handlers in the  
> server
> look to work perfectly okay based on logging. I don't know the  
> reason for
> why "ab" does this, but reasonably confident that "ab" is probably  
> the cause
> of the failure and not Apache.
>
> What I might do is to use threading to make concurrent requests  
> from the
> test script itself and not involve "ab" at all and see how that goes.
>
> BTW, the log output at top was from modified test handler as below.
>
> def global_lock(req):
>
>     import _apache
>
>     req.log_error("global_lock 1 %s %s" % (time.time 
> (),req.connection.id))
>
>     _apache._global_lock(req.server, 1)
>
>     req.log_error("global_lock 2 %s %s" % (time.time 
> (),req.connection.id))
>
>     time.sleep(1)
>
>     req.log_error("global_lock 3 %s %s" % (time.time 
> (),req.connection.id))
>
>     _apache._global_unlock(req.server, 1)
>
>     req.log_error("global_lock 4 %s %s" % (time.time 
> (),req.connection.id))
>
>     req.write("test ok")
>
>     req.log_error("global_lock 5 %s %s" % (time.time 
> (),req.connection.id))
>
>     return apache.OK
>
>
> Graham
>


Re: Apache 2.2 failure on Mac OS X 10.4.5.

Posted by Graham Dumpleton <gr...@dscpl.com.au>.
A bit more information. This may actually be a problem with the "ab"  
tool
used for this test. I have actually seen the "ab" program give  
problems on
Mac OS X before even with Apache 2.0.

On 06/03/2006, at 9:02 PM, Graham Dumpleton wrote:

> Traceback (most recent call last):
>   File "test.py", line 2247, in test_global_lock
>     self.fail("global_lock is broken (too quick)")
>   File "/System/Library/Frameworks/Python.framework/Versions/2.3/ 
> lib/python2.3/unittest.py", line 270, in fail
>     raise self.failureException, msg
> AssertionError: global_lock is broken (too quick)

What the test does is to run "ab" to fire off five concurrent  
requests. The
handler the request is calling acquires a global lock, sleeps for a  
second
and then releases the lock. Because all requests need to acquire the
lock but are blocked out for a second, the execution of the five  
requests
should therefore take five seconds because they will in effect be  
serialised.

The problem is that "ab" is giving errors on 3 out of 5 of the requests:

Concurrency Level:      5
Time taken for tests:   3.514548 seconds
Complete requests:      5
Failed requests:        0
Write errors:           3
Total transferred:      540 bytes
HTML transferred:       21 bytes
Requests per second:    1.42 [#/sec] (mean)
Time per request:       3514.548 [ms] (mean)
Time per request:       702.910 [ms] (mean, across all concurrent  
requests)
Transfer rate:          0.00 [Kbytes/sec] received

As example, when "ab" is run in verbose mode:

LOG: header received:
HTTP/1.1 200 OK^M
Date: Mon, 06 Mar 2006 10:28:42 GMT^M
Server: Apache/2.2.0 (Unix) mod_python/3.3.0-dev-20060305 Python/2.3.5^M
Connection: close^M
Content-Type: text/plain^M
^M
test ok
LOG: Response code = 200
Send request failed!
LOG: header received:
HTTP/1.1 200 OK^M
Date: Mon, 06 Mar 2006 10:28:42 GMT^M
Server: Apache/2.2.0 (Unix) mod_python/3.3.0-dev-20060305 Python/2.3.5^M
Connection: close^M
Content-Type: text/plain^M
^M

Note the "Send request failed!" message.

Although "ab" is giving the error, the requests are in fact received  
by Apache
and handled. This time output is from prefork and not worker.

[Mon Mar 06 21:28:41 2006] [error] [client 127.0.0.1] global_lock 1  
1141640921.61 0
[Mon Mar 06 21:28:41 2006] [error] [client 127.0.0.1] global_lock 2  
1141640921.61 0
[Mon Mar 06 21:28:42 2006] [error] [client 127.0.0.1] global_lock 3  
1141640922.61 0
[Mon Mar 06 21:28:42 2006] [error] [client 127.0.0.1] global_lock 4  
1141640922.61 0
[Mon Mar 06 21:28:42 2006] [error] [client 127.0.0.1] global_lock 5  
1141640922.61 0
[Mon Mar 06 21:28:43 2006] [notice] mod_python: (Re)importing module  
'tests'
[Mon Mar 06 21:28:43 2006] [notice] mod_python: (Re)importing module  
'tests'
[Mon Mar 06 21:28:43 2006] [error] [client 127.0.0.1] global_lock 1  
1141640923.14 4
[Mon Mar 06 21:28:43 2006] [error] [client 127.0.0.1] global_lock 2  
1141640923.14 4
[Mon Mar 06 21:28:43 2006] [notice] mod_python: (Re)importing module  
'tests'
[Mon Mar 06 21:28:43 2006] [error] [client 127.0.0.1] global_lock 1  
1141640923.22 1
[Mon Mar 06 21:28:43 2006] [notice] mod_python: (Re)importing module  
'tests'
[Mon Mar 06 21:28:43 2006] [notice] mod_python: (Re)importing module  
'tests'
[Mon Mar 06 21:28:43 2006] [error] [client 127.0.0.1] global_lock 1  
1141640923.26 3
[Mon Mar 06 21:28:43 2006] [error] [client 127.0.0.1] global_lock 1  
1141640923.26 2
[Mon Mar 06 21:28:43 2006] [error] [client 127.0.0.1] global_lock 1  
1141640923.26 5
[Mon Mar 06 21:28:44 2006] [error] [client 127.0.0.1] global_lock 3  
1141640924.14 4
[Mon Mar 06 21:28:44 2006] [error] [client 127.0.0.1] global_lock 4  
1141640924.14 4
[Mon Mar 06 21:28:44 2006] [error] [client 127.0.0.1] global_lock 2  
1141640924.14 1
[Mon Mar 06 21:28:44 2006] [error] [client 127.0.0.1] global_lock 5  
1141640924.14 4
[Mon Mar 06 21:28:45 2006] [error] [client 127.0.0.1] global_lock 3  
1141640925.14 1
[Mon Mar 06 21:28:45 2006] [error] [client 127.0.0.1] global_lock 4  
1141640925.14 1
[Mon Mar 06 21:28:45 2006] [error] [client 127.0.0.1] global_lock 2  
1141640925.14 3
[Mon Mar 06 21:28:45 2006] [error] [client 127.0.0.1] global_lock 5  
1141640925.14 1
[Mon Mar 06 21:28:46 2006] [error] [client 127.0.0.1] global_lock 3  
1141640926.14 3
[Mon Mar 06 21:28:46 2006] [error] [client 127.0.0.1] global_lock 4  
1141640926.14 3
[Mon Mar 06 21:28:46 2006] [error] [client 127.0.0.1] global_lock 2  
1141640926.14 2
[Mon Mar 06 21:28:46 2006] [error] [client 127.0.0.1] global_lock 5  
1141640926.14 3
[Mon Mar 06 21:28:49 2006] [warn] child process 902 still did not  
exit, sending a SIGTERM
[Mon Mar 06 21:28:51 2006] [warn] child process 902 still did not  
exit, sending a SIGTERM
[Mon Mar 06 21:28:53 2006] [warn] child process 902 still did not  
exit, sending a SIGTERM
[Mon Mar 06 21:28:55 2006] [error] child process 902 still did not  
exit, sending a SIGKILL
[Mon Mar 06 21:28:56 2006] [info] removed PID file /Users/grahamd/ 
Workspaces/mod_python-trunk-dev-1/test/logs/httpd.pid (pid=898)
[Mon Mar 06 21:28:56 2006] [notice] caught SIGTERM, shutting down

The request with connection.id of 2 (last value) never gets to  
complete because the
the test script is killing off Apache before it has a chance since  
"ab" prematurely
gave up on that request. Apache didn't shutdown properly because of  
the problem
with Py_Finalize() being called in signal handler and deadlocking  
because the
request handler still had the GIL.

> ----------------------------------------------------------------------
> Ran 1 test in 8.658s
>
> FAILED (failures=1)
>
>
> The contents of the logs/error_log file are:
>
> [Mon Mar 06 20:57:31 2006] [notice] mod_python: Creating 6 session  
> mutexes based on 2 max processes and 3 max threads.
> [Mon Mar 06 20:57:31 2006] [notice] Apache/2.2.0 (Unix) mod_python/ 
> 3.3.0-dev-20060305 Python/2.3.5 configured -- resuming normal  
> operations
> [Mon Mar 06 20:57:31 2006] [info] Server built: Feb 15 2006 21:39:56
> [Mon Mar 06 20:57:31 2006] [debug] worker.c(1715): AcceptMutex:  
> sysvsem (default: sysvsem)
> [Mon Mar 06 20:57:32 2006] [notice] mod_python: (Re)importing  
> module 'tests'
> [Mon Mar 06 20:57:33 2006] [notice] mod_python: (Re)importing  
> module 'tests'
> [Mon Mar 06 20:57:34 2006] [debug] worker.c(1329): taking over  
> scoreboard slot from 2500 (quiescing)
> [Mon Mar 06 20:57:34 2006] [notice] mod_python: (Re)importing  
> module 'tests'
> [Mon Mar 06 20:57:37 2006] [info] [client 127.0.0.1] (9)Bad file  
> descriptor: core_output_filter: writing data to the network
> [Mon Mar 06 20:57:38 2006] [info] [client 127.0.0.1] (9)Bad file  
> descriptor: core_output_filter: writing data to the network
> [Mon Mar 06 20:57:39 2006] [info] removed PID file /Users/grahamd/ 
> Workspaces/mod_python-trunk-dev-1/test/logs/httpd.pid (pid=2497)
> [Mon Mar 06 20:57:39 2006] [notice] caught SIGTERM, shutting down

In the above earlier worker log, the failures in the core output  
filter is purely
because "ab" had killed the connection and thus the file descriptor on
server side had gone bad.

Now as to whether "ab" is the problem. I know that I have seen with "ab"
before when doing load testing on Mac OS X that it will often exhibit  
some
rate of failures for no good reason even though the handlers in the  
server
look to work perfectly okay based on logging. I don't know the reason  
for
why "ab" does this, but reasonably confident that "ab" is probably  
the cause
of the failure and not Apache.

What I might do is to use threading to make concurrent requests from the
test script itself and not involve "ab" at all and see how that goes.

BTW, the log output at top was from modified test handler as below.

def global_lock(req):

     import _apache

     req.log_error("global_lock 1 %s %s" % (time.time 
(),req.connection.id))

     _apache._global_lock(req.server, 1)

     req.log_error("global_lock 2 %s %s" % (time.time 
(),req.connection.id))

     time.sleep(1)

     req.log_error("global_lock 3 %s %s" % (time.time 
(),req.connection.id))

     _apache._global_unlock(req.server, 1)

     req.log_error("global_lock 4 %s %s" % (time.time 
(),req.connection.id))

     req.write("test ok")

     req.log_error("global_lock 5 %s %s" % (time.time 
(),req.connection.id))

     return apache.OK


Graham