You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@qpid.apache.org by Alan Conway <ac...@redhat.com> on 2007/09/18 23:43:07 UTC

Re: test_flush intermittent failure

There seems to be an intermittent failure in the python test_flush test.
I haven't managed to reproduce it with tracing on. Any thoughts where to
look? Possibly I've messed up something with my latest refactoring but
not sure what. 

Python says: 
======================================================================
ERROR: test_flush (tests_0-10.execution.ExecutionTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/aconway/svn/fixme/python/tests_0-10/execution.py", line
28, in test_flush
  File "/home/aconway/svn/fixme/python/qpid/peer.py", line 353, in
<lambda>
  File "/home/aconway/svn/fixme/python/qpid/peer.py", line 271, in
invoke
  File "/home/aconway/svn/fixme/python/qpid/peer.py", line 308, in
invoke_method
  File "/home/aconway/svn/fixme/python/qpid/peer.py", line 217, in write
Closed: Timed-out waiting for completion

======================================================================
ERROR: test_flush (tests_0-10.execution.ExecutionTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/aconway/svn/qpid/python/qpid/testlib.py", line 228, in
tearDown
    self.client.channel(0).connection_close(reply_code=200)
  File "/home/aconway/svn/fixme/python/qpid/peer.py", line 353, in
<lambda>
  File "/home/aconway/svn/fixme/python/qpid/peer.py", line 271, in
invoke
  File "/home/aconway/svn/fixme/python/qpid/peer.py", line 346, in
invoke_method
Closed


The qpidd log says:

======================================================================
ERROR: test_flush (tests_0-10.execution.ExecutionTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/aconway/svn/fixme/python/tests_0-10/execution.py", line
28, in test_flush
  File "/home/aconway/svn/fixme/python/qpid/peer.py", line 353, in
<lambda>
  File "/home/aconway/svn/fixme/python/qpid/peer.py", line 271, in
invoke
  File "/home/aconway/svn/fixme/python/qpid/peer.py", line 308, in
invoke_method
  File "/home/aconway/svn/fixme/python/qpid/peer.py", line 217, in write
Closed: Timed-out waiting for completion

======================================================================
ERROR: test_flush (tests_0-10.execution.ExecutionTests)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/aconway/svn/qpid/python/qpid/testlib.py", line 228, in
tearDown
    self.client.channel(0).connection_close(reply_code=200)
  File "/home/aconway/svn/fixme/python/qpid/peer.py", line 353, in
<lambda>
  File "/home/aconway/svn/fixme/python/qpid/peer.py", line 271, in
invoke
  File "/home/aconway/svn/fixme/python/qpid/peer.py", line 346, in
invoke_method
Closed




Re: test_flush intermittent failure

Posted by Alan Conway <ac...@redhat.com>.
On Wed, 2007-09-19 at 11:29 +0100, Gordon Sim wrote:
> Alan Conway wrote:
> On another note there are actually two flushes sent in this test per 
> publish. As the channel is in synchronous mode the peer sends one itself 
> (should be a synch now, that wasn't implemented when this was started). 
> The explicit flush is no no longer necessary. However thats not the 
> cause of the failure I don't think.

Aha, those are the extra flushes I was seeing. Well that means I don't
have a clue what's going on...

> I can't actually reproduce the issue... If you have a broker log I can 
> have a look at it.

I get it semi-reliably on amd-32:/home/aconway/svn/qpid with:

while make check TESTS="start_broker python_tests stop_broker"  PYTHON_TESTS=tests_0-10.execution QPID_TRACE=1; do true; done
Perpahs

Success and fail logs attached. I modified execution.py to put the loop
index in the routing key for a more readable log, just comitted that
modification now.

Thanks,
Alan.

Re: test_flush intermittent failure

Posted by Gordon Sim <gs...@redhat.com>.
Alan Conway wrote:
> In all the cases where I see the failure below, the broker log shows the
> broker receiving an ExecutionFlushBody *before* the corresponding
> BasicPublishBody.

There is a loop in this test; could the apparently preceding flush be 
from the earlier loop?

On another note there are actually two flushes sent in this test per 
publish. As the channel is in synchronous mode the peer sends one itself 
(should be a synch now, that wasn't implemented when this was started). 
The explicit flush is no no longer necessary. However thats not the 
cause of the failure I don't think.

> Is there any way the python client could be re-ordering these methods?
> 
> Is there any way the brokers AsynchIOAcceptor could be delivering these
> methods out of order?

I think this is unlikely. Its more likely to be a subtle race in the 
broker or client that is somehow occasionally getting the command 
completion confused.

I can't actually reproduce the issue... If you have a broker log I can 
have a look at it.

Re: test_flush intermittent failure

Posted by Alan Conway <ac...@redhat.com>.
In all the cases where I see the failure below, the broker log shows the
broker receiving an ExecutionFlushBody *before* the corresponding
BasicPublishBody.

Is there any way the python client could be re-ordering these methods?

Is there any way the brokers AsynchIOAcceptor could be delivering these
methods out of order?

I'll get back into it tomorrow.

On Tue, 2007-09-18 at 17:43 -0400, Alan Conway wrote:
> There seems to be an intermittent failure in the python test_flush test.
> I haven't managed to reproduce it with tracing on. Any thoughts where to
> look? Possibly I've messed up something with my latest refactoring but
> not sure what. 
> 
> Python says: 
> ======================================================================
> ERROR: test_flush (tests_0-10.execution.ExecutionTests)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/home/aconway/svn/fixme/python/tests_0-10/execution.py", line
> 28, in test_flush
>   File "/home/aconway/svn/fixme/python/qpid/peer.py", line 353, in
> <lambda>
>   File "/home/aconway/svn/fixme/python/qpid/peer.py", line 271, in
> invoke
>   File "/home/aconway/svn/fixme/python/qpid/peer.py", line 308, in
> invoke_method
>   File "/home/aconway/svn/fixme/python/qpid/peer.py", line 217, in write
> Closed: Timed-out waiting for completion
> 
> ======================================================================
> ERROR: test_flush (tests_0-10.execution.ExecutionTests)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/home/aconway/svn/qpid/python/qpid/testlib.py", line 228, in
> tearDown
>     self.client.channel(0).connection_close(reply_code=200)
>   File "/home/aconway/svn/fixme/python/qpid/peer.py", line 353, in
> <lambda>
>   File "/home/aconway/svn/fixme/python/qpid/peer.py", line 271, in
> invoke
>   File "/home/aconway/svn/fixme/python/qpid/peer.py", line 346, in
> invoke_method
> Closed
> 
> 
> The qpidd log says:
> 
> ======================================================================
> ERROR: test_flush (tests_0-10.execution.ExecutionTests)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/home/aconway/svn/fixme/python/tests_0-10/execution.py", line
> 28, in test_flush
>   File "/home/aconway/svn/fixme/python/qpid/peer.py", line 353, in
> <lambda>
>   File "/home/aconway/svn/fixme/python/qpid/peer.py", line 271, in
> invoke
>   File "/home/aconway/svn/fixme/python/qpid/peer.py", line 308, in
> invoke_method
>   File "/home/aconway/svn/fixme/python/qpid/peer.py", line 217, in write
> Closed: Timed-out waiting for completion
> 
> ======================================================================
> ERROR: test_flush (tests_0-10.execution.ExecutionTests)
> ----------------------------------------------------------------------
> Traceback (most recent call last):
>   File "/home/aconway/svn/qpid/python/qpid/testlib.py", line 228, in
> tearDown
>     self.client.channel(0).connection_close(reply_code=200)
>   File "/home/aconway/svn/fixme/python/qpid/peer.py", line 353, in
> <lambda>
>   File "/home/aconway/svn/fixme/python/qpid/peer.py", line 271, in
> invoke
>   File "/home/aconway/svn/fixme/python/qpid/peer.py", line 346, in
> invoke_method
> Closed
> 
>