Posted to dev@qpid.apache.org by "Aidan Skinner (JIRA)" <qp...@incubator.apache.org> on 2008/10/23 14:34:44 UTC
[jira] Created: (QPID-1391) Reliability tests fail, broker is
unable to process connections
Reliability tests fail, broker is unable to process connections
---------------------------------------------------------------
Key: QPID-1391
URL: https://issues.apache.org/jira/browse/QPID-1391
Project: Qpid
Issue Type: Bug
Components: Java Broker
Affects Versions: M4
Reporter: Aidan Skinner
Assignee: Aidan Skinner
Fix For: M4
The reliability tests eventually cause the broker to lock up. It's still up, but all threads are waiting on either a BDB lock or a senderLock. This is bad.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
[jira] Reopened: (QPID-1391) Reliability tests fail, broker is
unable to process connections
Posted by "Aidan Skinner (JIRA)" <qp...@incubator.apache.org>.
[ https://issues.apache.org/jira/browse/QPID-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aidan Skinner reopened QPID-1391:
---------------------------------
*sigh*
[jira] Resolved: (QPID-1391) Reliability tests fail, broker is
unable to process connections
Posted by "Aidan Skinner (JIRA)" <qp...@incubator.apache.org>.
[ https://issues.apache.org/jira/browse/QPID-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aidan Skinner resolved QPID-1391.
---------------------------------
Resolution: Fixed
This is a problem with the BDB message store I was using. I've filed (and fixed) it at https://jira.jboss.org/jira/browse/RHM-7
[jira] Updated: (QPID-1391) Reliability tests fail, broker is
unable to process connections
Posted by "Aidan Skinner (JIRA)" <qp...@incubator.apache.org>.
[ https://issues.apache.org/jira/browse/QPID-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aidan Skinner updated QPID-1391:
--------------------------------
Attachment: stack.txt
Stack dump from the broker which is up but refusing to start new protocol sessions.
[jira] Resolved: (QPID-1391) Reliability tests fail, broker is
unable to process connections
Posted by "Aidan Skinner (JIRA)" <qp...@incubator.apache.org>.
[ https://issues.apache.org/jira/browse/QPID-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aidan Skinner resolved QPID-1391.
---------------------------------
Resolution: Invalid
This appears to have been an environmental issue with the machine in question.
[jira] Commented: (QPID-1391) Reliability tests fail, broker is
unable to process connections
Posted by "Aidan Skinner (JIRA)" <qp...@incubator.apache.org>.
[ https://issues.apache.org/jira/browse/QPID-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648136#action_12648136 ]
Aidan Skinner commented on QPID-1391:
-------------------------------------
There are a ton of Connection objects that are still open, even though the associated socket has gone away.
[jira] Commented: (QPID-1391) Reliability tests fail, broker is
unable to process connections
Posted by "Martin Ritchie (JIRA)" <qp...@incubator.apache.org>.
[ https://issues.apache.org/jira/browse/QPID-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12648614#action_12648614 ]
Martin Ritchie commented on QPID-1391:
--------------------------------------
Having taken a look at the stack trace, the problem seems to be coming from the BerkeleyDB MessageStore module.
Thread 'pool-1-thread-13' is blocked trying to add a new message to the channel's Unacknowledged Map.
The thread is blocked waiting for a lock held by thread 'pool-1-thread-27', which is acknowledging a message and has locked the map.
Thread 27 is currently waiting in the BDBStore code for the commit to complete. The question is why it is taking so long; in fact, it is not completing at all. Hours have passed and the code is still sitting at wait();
When (if) this waiting thread returns, that will release the locks for all the currently waiting threads.
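The blocking chain described above can be reproduced in miniature. This is only a sketch under stated assumptions, not the broker's code: the class LockChainSketch, the mapLock object, and the latch standing in for the store commit are all hypothetical names invented for illustration. One thread takes the map lock and then waits for a "commit" while still holding it; a second thread that only wants to add to the map ends up BLOCKED until the commit finishes.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

public class LockChainSketch {
    // Hypothetical stand-in for the lock guarding the unacknowledged map.
    private final Object mapLock = new Object();
    // Hypothetical stand-in for "the store commit has completed".
    private final CountDownLatch commitDone = new CountDownLatch(1);
    // Signals that the acknowledging thread now owns mapLock.
    private final CountDownLatch ackerHoldsLock = new CountDownLatch(1);

    // Plays the role of thread 27: holds the map lock while waiting on the commit.
    void acknowledge() {
        synchronized (mapLock) {
            ackerHoldsLock.countDown();
            try {
                commitDone.await(); // mapLock stays held for the whole wait
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    // Plays the role of thread 13: only needs the map lock briefly.
    void addUnacknowledged() {
        synchronized (mapLock) {
            // a real implementation would record the message here
        }
    }

    // Observes the adder's thread state while the commit is still outstanding.
    public static Thread.State adderStateWhileCommitPending() {
        LockChainSketch sketch = new LockChainSketch();
        Thread acker = new Thread(sketch::acknowledge, "acker");
        Thread adder = new Thread(sketch::addUnacknowledged, "adder");
        try {
            acker.start();
            sketch.ackerHoldsLock.await();   // acker owns mapLock from here on
            adder.start();
            long deadline = System.nanoTime() + TimeUnit.SECONDS.toNanos(5);
            while (adder.getState() != Thread.State.BLOCKED
                    && System.nanoTime() < deadline) {
                Thread.sleep(10);
            }
            Thread.State observed = adder.getState();
            sketch.commitDone.countDown();   // let the "commit" finish so both threads exit
            acker.join();
            adder.join();
            return observed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("adder state while commit pending: "
                + adderStateWhileCommitPending());
    }
}
```

If the latch were never counted down, as with the commit that never completes in the stack dump, the adder would stay blocked indefinitely, and so would every other thread that needs the same lock.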
So points for further discussion:
1) [Slightly off Apache Qpid] BDBMessageStore L1804 synchronizes on 'this'. IMO this is a poor design, as you cannot tell whether the BDB code is also going to lock on that object.
2) UnacknowledgedMessageMapImpl L:141 acknowledgeMessage: This synchronizes around the whole acknowledge method per TransactionalContext. This seems unnecessary, as we pass in the UMMI (this) to the method, which then uses the visitors to safely access the map in the NonTransactionalContext, and the LocalTransactionalContext does not actually update the map, so it should not need to lock at all.
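To illustrate point 1, here is a minimal sketch of the difference between synchronizing on 'this' and synchronizing on a private lock object. The classes ExposedStore and EncapsulatedStore are hypothetical and are not the BDBMessageStore code; they only show which monitor is held inside the critical section. When the instance's own monitor is used, any other code that has a reference to the store can contend for the same lock; with a private lock object, it cannot.

```java
public class PrivateLockSketch {

    // Hypothetical store that locks on itself, as in synchronized (this).
    static class ExposedStore {
        private int commits;

        boolean commit() {
            synchronized (this) {
                commits++;
                // True here: the instance monitor is the lock, so any caller
                // that synchronizes on this store competes with commit().
                return Thread.holdsLock(this);
            }
        }
    }

    // Hypothetical store that locks on an object nobody else can reach.
    static class EncapsulatedStore {
        private final Object commitLock = new Object();
        private int commits;

        boolean commit() {
            synchronized (commitLock) {
                commits++;
                // False here: the instance monitor is left alone, so outside
                // code locking on the store cannot stall commit().
                return Thread.holdsLock(this);
            }
        }
    }

    public static boolean exposedHoldsInstanceMonitor() {
        return new ExposedStore().commit();
    }

    public static boolean encapsulatedHoldsInstanceMonitor() {
        return new EncapsulatedStore().commit();
    }

    public static void main(String[] args) {
        System.out.println("exposed holds instance monitor: "
                + exposedHoldsInstanceMonitor());
        System.out.println("encapsulated holds instance monitor: "
                + encapsulatedHoldsInstanceMonitor());
    }
}
```

A private lock does not by itself fix a commit that never completes, but it does make it possible to reason about who can take the lock by reading one class.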