Posted to dev@qpid.apache.org by "Gordon Sim (JIRA)" <qp...@incubator.apache.org> on 2009/09/08 18:41:57 UTC

[jira] Created: (QPID-2086) intermittent federated cluster hangs since r810591

intermittent federated cluster hangs since r810591
--------------------------------------------------

                 Key: QPID-2086
                 URL: https://issues.apache.org/jira/browse/QPID-2086
             Project: Qpid
          Issue Type: Bug
          Components: C++ Broker, C++ Client
            Reporter: Gordon Sim


On revision 810590 I can run federated_cluster_test_with_node_failure in a loop for over 100 iterations without any hangs.

However on revision 810591 the same loop hangs quite easily (generally within 10 runs or so). 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
Apache Qpid - AMQP Messaging Implementation
Project:      http://qpid.apache.org
Use/Interact: mailto:dev-subscribe@qpid.apache.org


[jira] Assigned: (QPID-2086) intermittent federated cluster hangs since r810591

Posted by "Andrew Stitcher (JIRA)" <qp...@incubator.apache.org>.
     [ https://issues.apache.org/jira/browse/QPID-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Andrew Stitcher reassigned QPID-2086:
-------------------------------------

    Assignee: Alan Conway  (was: Andrew Stitcher)

> intermittent federated cluster hangs since r810591
> --------------------------------------------------
>
>                 Key: QPID-2086
>                 URL: https://issues.apache.org/jira/browse/QPID-2086
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Broker, C++ Client
>            Reporter: Gordon Sim
>            Assignee: Alan Conway
>            Priority: Blocker
>
> On revision 810590 I can run federated_cluster_test_with_node_failure in a loop for over 100 iterations without any hangs.
> However on revision 810591 the same loop hangs quite easily (generally within 10 runs or so). 



[jira] Commented: (QPID-2086) intermittent federated cluster hangs since r810591

Posted by "Andrew Stitcher (JIRA)" <qp...@incubator.apache.org>.
    [ https://issues.apache.org/jira/browse/QPID-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12752674#action_12752674 ] 

Andrew Stitcher commented on QPID-2086:
---------------------------------------

I can't reproduce this running federated_cluster_test_with_node_failure in a loop for about 30 mins on a 2 CPU fairly slow box.

This is after r812590, which fixes a (probably unrelated) bug in AsynchIO.cpp.

> intermittent federated cluster hangs since r810591
> --------------------------------------------------
>
>                 Key: QPID-2086
>                 URL: https://issues.apache.org/jira/browse/QPID-2086
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Broker, C++ Client
>            Reporter: Gordon Sim
>            Assignee: Andrew Stitcher
>            Priority: Blocker
>
> On revision 810590 I can run federated_cluster_test_with_node_failure in a loop for over 100 iterations without any hangs.
> However on revision 810591 the same loop hangs quite easily (generally within 10 runs or so). 



[jira] Assigned: (QPID-2086) intermittent federated cluster hangs since r810591

Posted by "Gordon Sim (JIRA)" <qp...@incubator.apache.org>.
     [ https://issues.apache.org/jira/browse/QPID-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gordon Sim reassigned QPID-2086:
--------------------------------

    Assignee: Andrew Stitcher

> intermittent federated cluster hangs since r810591
> --------------------------------------------------
>
>                 Key: QPID-2086
>                 URL: https://issues.apache.org/jira/browse/QPID-2086
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Broker, C++ Client
>            Reporter: Gordon Sim
>            Assignee: Andrew Stitcher
>
> On revision 810590 I can run federated_cluster_test_with_node_failure in a loop for over 100 iterations without any hangs.
> However on revision 810591 the same loop hangs quite easily (generally within 10 runs or so). 



[jira] Resolved: (QPID-2086) intermittent federated cluster hangs since r810591

Posted by "Alan Conway (JIRA)" <qp...@incubator.apache.org>.
     [ https://issues.apache.org/jira/browse/QPID-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Conway resolved QPID-2086.
-------------------------------

    Resolution: Fixed

This is a problem with the read-credit code: AsynchIOHandler::readbuff is occasionally called with readCredit == 0. This fix changes readbuff to do nothing if readCredit == 0. There is still an issue somewhere else, however, since readbuff shouldn't be called with readCredit == 0 in the first place.

Committed r820717
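
For readers unfamiliar with read credit, the guard described above can be sketched roughly as follows. This is a minimal illustration, not the code committed in r820717: the class CreditedReader, its members, and the use of a plain counter are all assumptions made for the example, and the real handler's interface is not shown in this thread.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for an I/O handler that tracks read credit.
// Names and structure are assumptions for illustration only.
class CreditedReader {
  public:
    explicit CreditedReader(int initialCredit) : readCredit(initialCredit) {
        assert(initialCredit >= 0);
    }

    // Grant additional read credit (for example, after a frame has been processed).
    void giveReadCredit(int credit) {
        assert(credit >= 0);
        readCredit += credit;
    }

    // Consume one buffer if credit is available.
    // Returns false and does nothing when readCredit == 0, which is the
    // behaviour the fix describes.
    bool readbuff(const std::string& data) {
        if (readCredit == 0) return false;
        --readCredit;
        received.push_back(data);
        return true;
    }

    int credit() const { return readCredit; }
    std::size_t buffersRead() const { return received.size(); }

  private:
    int readCredit;
    std::vector<std::string> received;
};
```

With one unit of credit, the first readbuff call in this sketch consumes the buffer and a second call is ignored until giveReadCredit is called again. As the comment above notes, such a guard only masks the symptom; whatever calls readbuff without credit is still at fault.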

> intermittent federated cluster hangs since r810591
> --------------------------------------------------
>
>                 Key: QPID-2086
>                 URL: https://issues.apache.org/jira/browse/QPID-2086
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Broker, C++ Client
>            Reporter: Gordon Sim
>            Assignee: Alan Conway
>            Priority: Blocker
>
> On revision 810590 I can run federated_cluster_test_with_node_failure in a loop for over 100 iterations without any hangs.
> However on revision 810591 the same loop hangs quite easily (generally within 10 runs or so). 



[jira] Updated: (QPID-2086) intermittent federated cluster hangs since r810591

Posted by "Gordon Sim (JIRA)" <qp...@incubator.apache.org>.
     [ https://issues.apache.org/jira/browse/QPID-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gordon Sim updated QPID-2086:
-----------------------------

    Priority: Blocker  (was: Major)

> intermittent federated cluster hangs since r810591
> --------------------------------------------------
>
>                 Key: QPID-2086
>                 URL: https://issues.apache.org/jira/browse/QPID-2086
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Broker, C++ Client
>            Reporter: Gordon Sim
>            Assignee: Andrew Stitcher
>            Priority: Blocker
>
> On revision 810590 I can run federated_cluster_test_with_node_failure in a loop for over 100 iterations without any hangs.
> However on revision 810591 the same loop hangs quite easily (generally within 10 runs or so). 



[jira] Resolved: (QPID-2086) intermittent federated cluster hangs since r810591

Posted by "Alan Conway (JIRA)" <qp...@incubator.apache.org>.
     [ https://issues.apache.org/jira/browse/QPID-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alan Conway resolved QPID-2086.
-------------------------------

    Resolution: Fixed

Fixed in r813100.

cluster::Connection did not give read credit if there was an exception processing a frame.
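
One common way to avoid this class of bug is to return credit from a scope guard, so that it is given back whether or not frame processing throws. The sketch below illustrates that idea only; it is not the change made in r813100, and CreditPool, CreditReturn, and processFrame are hypothetical names invented for the example.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>

// Hypothetical credit counter standing in for the connection's read credit.
struct CreditPool {
    int credit = 0;
    void give(int n) {
        assert(n > 0);
        credit += n;
    }
};

// Scope guard: gives one unit of credit back when it goes out of scope,
// including during stack unwinding after an exception.
class CreditReturn {
  public:
    explicit CreditReturn(CreditPool& p) : pool(p) {}
    ~CreditReturn() { pool.give(1); }
    CreditReturn(const CreditReturn&) = delete;
    CreditReturn& operator=(const CreditReturn&) = delete;

  private:
    CreditPool& pool;
};

// Process one frame with the supplied handler.
// Returns true on success, false if the handler threw a std::exception.
// In both cases one unit of credit is returned to the pool.
bool processFrame(CreditPool& pool, const std::function<void()>& handler) {
    try {
        CreditReturn guard(pool);
        handler();
        return true;
    } catch (const std::exception&) {
        return false;
    }
}
```

In this sketch, a handler that throws still leaves the pool one unit of credit richer, so the reader is not starved after a bad frame.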



> intermittent federated cluster hangs since r810591
> --------------------------------------------------
>
>                 Key: QPID-2086
>                 URL: https://issues.apache.org/jira/browse/QPID-2086
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Broker, C++ Client
>            Reporter: Gordon Sim
>            Assignee: Alan Conway
>            Priority: Blocker
>
> On revision 810590 I can run federated_cluster_test_with_node_failure in a loop for over 100 iterations without any hangs.
> However on revision 810591 the same loop hangs quite easily (generally within 10 runs or so). 



[jira] Reopened: (QPID-2086) intermittent federated cluster hangs since r810591

Posted by "Gordon Sim (JIRA)" <qp...@incubator.apache.org>.
     [ https://issues.apache.org/jira/browse/QPID-2086?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gordon Sim reopened QPID-2086:
------------------------------


The exact same symptoms are still observable on r814077, so I suspect there is more to this.

> intermittent federated cluster hangs since r810591
> --------------------------------------------------
>
>                 Key: QPID-2086
>                 URL: https://issues.apache.org/jira/browse/QPID-2086
>             Project: Qpid
>          Issue Type: Bug
>          Components: C++ Broker, C++ Client
>            Reporter: Gordon Sim
>            Assignee: Alan Conway
>            Priority: Blocker
>
> On revision 810590 I can run federated_cluster_test_with_node_failure in a loop for over 100 iterations without any hangs.
> However on revision 810591 the same loop hangs quite easily (generally within 10 runs or so). 
