Posted to dev@geode.apache.org by nabarunnag <gi...@git.apache.org> on 2017/08/22 21:57:13 UTC

[GitHub] geode pull request #732: GEODE-3276: Managing race conditions while the send...

GitHub user nabarunnag opened a pull request:

    https://github.com/apache/geode/pull/732

    GEODE-3276: Managing race conditions while the senders are stopped

    	* When a connection is initialized, a readAckThread may still be alive from a previous incarnation.
    	* This ack thread is blocked on a socket read with no timeout, because nothing was dispatched.
    	* While it is blocked on the read, it holds the connection lifecycle read lock.
    	* Connection initialization needs the connection lifecycle write lock to start the connection, but the read lock is held by the ack thread.
    	* This results in a deadlock and, eventually, a hang.
    	* Another problem is that the isStopped flag for the event processor is set before the dispatcher and ack thread are actually shut down.
    	* A gateway proxy stomper thread can run between these two steps, after the flag is set but before the dispatcher and ack thread are shut down.
    	* The stomper thread checks the isStopped flag, sees that it is true, and proceeds to destroy the connection pool while the dispatcher and ack thread are still running.
    	* This results in an out of heap memory exception, because the ack thread is still reading from the socket when the connection pool is destroyed.
    	* To solve this, the stomper thread checks whether the event processor and dispatcher exist; if they do, the input streams are closed before the connection pool is destroyed.
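
The locking problem described in the list above can be reproduced in isolation. The following is a minimal sketch, not Geode code: the class name AckReaderLockDemo, the lock name lifecycleLock, and the loopback socket setup are all assumptions made for illustration. It shows a reader thread that holds a read lock while blocked on a socket read with no timeout, a writer that cannot obtain the write lock in the meantime, and how closing the input stream from another thread unblocks the reader so the write lock becomes available.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical demo class; none of these names come from Geode.
public class AckReaderLockDemo {

  /**
   * Returns true if the write lock was unavailable while the reader was
   * blocked and became available after the reader's stream was closed.
   */
  public static boolean closeStreamThenAcquireWriteLock() throws Exception {
    ReentrantReadWriteLock lifecycleLock = new ReentrantReadWriteLock();
    try (ServerSocket server = new ServerSocket(0, 1, InetAddress.getLoopbackAddress());
        Socket client = new Socket(server.getInetAddress(), server.getLocalPort());
        Socket accepted = server.accept()) {
      InputStream in = client.getInputStream();
      CountDownLatch readerHoldsLock = new CountDownLatch(1);

      Thread ackReader = new Thread(() -> {
        lifecycleLock.readLock().lock();
        try {
          readerHoldsLock.countDown();
          in.read(); // blocks: the peer never sends and no read timeout is set
        } catch (IOException expected) {
          // Closing the stream from another thread ends the blocked read here.
        } finally {
          lifecycleLock.readLock().unlock();
        }
      }, "ack-reader");
      ackReader.setDaemon(true);
      ackReader.start();
      readerHoldsLock.await();

      // While the reader holds the read lock, the write lock cannot be taken.
      if (lifecycleLock.writeLock().tryLock(200, TimeUnit.MILLISECONDS)) {
        lifecycleLock.writeLock().unlock();
        return false;
      }

      // Closing the socket's input stream closes the socket and unblocks the read.
      in.close();

      boolean acquired = lifecycleLock.writeLock().tryLock(5, TimeUnit.SECONDS);
      if (acquired) {
        lifecycleLock.writeLock().unlock();
      }
      return acquired;
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println(closeStreamThenAcquireWriteLock());
  }
}
```

The sketch uses tryLock with a timeout only so that the demonstration cannot hang; an unconditional lock() call on the write lock in the same position would block for as long as the reader stays in the read.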
    
    Thank you for submitting a contribution to Apache Geode.
    
    In order to streamline the review of the contribution we ask you
    to ensure the following steps have been taken:
    
    ### For all changes:
    - [ ] Is there a JIRA ticket associated with this PR? Is it referenced in the commit message?
    
    - [ ] Has your PR been rebased against the latest commit within the target branch (typically `develop`)?
    
    - [ ] Is your initial contribution a single, squashed commit?
    
    - [ ] Does `gradlew build` run cleanly?
    
    - [ ] Have you written or updated unit tests to verify your changes?
    
    - [ ] If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under [ASF 2.0](http://www.apache.org/legal/resolved.html#category-a)?
    
    ### Note:
    Please ensure that once the PR is submitted, you check travis-ci for build issues and
    submit an update to your PR as soon as possible. If you need help, please send an
    email to dev@geode.apache.org.


You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/nabarunnag/incubator-geode feature/GEODE-3276

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/geode/pull/732.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #732
    
----
commit 9618f69d2620f5a3d3a6a8576631906a61b512f9
Author: nabarun <nn...@pivotal.io>
Date:   2017-08-17T17:51:15Z

    GEODE-3276: Managing race conditions while the senders are stopped
    

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] geode pull request #732: GEODE-3276: Managing race conditions while the send...

Posted by jhuynh1 <gi...@git.apache.org>.
GitHub user jhuynh1 commented on a diff in the pull request:

    https://github.com/apache/geode/pull/732#discussion_r134622619
  
    --- Diff: geode-wan/src/main/java/org/apache/geode/internal/cache/wan/parallel/ParallelGatewaySenderImpl.java ---
    @@ -107,6 +107,9 @@ public void stop() {
           if (ev != null && !ev.isStopped()) {
             ev.stopProcessing();
           }
    +      if (ev != null && ev.getDispatcher() != null) {
    --- End diff --
    
    Is it possible to pull this check and shutdown into a method? It looks like it's used a few times throughout the code.
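
One possible shape for such a helper is sketched below. The quoted diff is truncated, so the body of the guarded block is not known; the interfaces EventProcessor and Dispatcher and the method names closeAckReaderStreams and closeAckReaderIfPresent are hypothetical stand-ins, not Geode's actual types or API. The sketch only illustrates moving the null checks and the shutdown call into one place.

```java
// Hypothetical sketch; these types stand in for Geode's real classes.
public class SenderShutdownSketch {

  /** Assumed dispatcher abstraction with a single shutdown hook. */
  public interface Dispatcher {
    void closeAckReaderStreams();
  }

  /** Assumed event processor abstraction that may or may not have a dispatcher. */
  public interface EventProcessor {
    Dispatcher getDispatcher();
  }

  /**
   * Closes the ack reader streams if both the event processor and its
   * dispatcher exist. Returns true if the shutdown hook was invoked.
   */
  public static boolean closeAckReaderIfPresent(EventProcessor ev) {
    if (ev == null) {
      return false;
    }
    Dispatcher dispatcher = ev.getDispatcher();
    if (dispatcher == null) {
      return false;
    }
    dispatcher.closeAckReaderStreams();
    return true;
  }
}
```

Each call site would then reduce to a single call such as closeAckReaderIfPresent(ev) placed after the existing stopProcessing() check.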



[GitHub] geode pull request #732: GEODE-3276: Managing race conditions while the send...

Posted by asfgit <gi...@git.apache.org>.
GitHub user asfgit closed the pull request at:

    https://github.com/apache/geode/pull/732

