Posted to dev@qpid.apache.org by "Ken Giusti (Jira)" <ji...@apache.org> on 2021/09/08 14:48:00 UTC

[jira] [Created] (DISPATCH-2249) Refactor router shutdown to eliminate leaks and races

Ken Giusti created DISPATCH-2249:
------------------------------------

             Summary: Refactor router shutdown to eliminate leaks and races
                 Key: DISPATCH-2249
                 URL: https://issues.apache.org/jira/browse/DISPATCH-2249
             Project: Qpid Dispatch
          Issue Type: Task
          Components: Router Node
    Affects Versions: 1.17.0
            Reporter: Ken Giusti
            Assignee: Michael Goulish
             Fix For: Backlog


This work item refactors the router shutdown sequence.

On shutdown, the router currently terminates every thread except the main thread mid-process.  All state is effectively frozen at the point where the shutdown signal is received.  The main thread then attempts to clean up that state before exiting.

This approach is error-prone and results in memory leaks (see open JIRAs).  In addition, it requires a bespoke cleanup handler that essentially duplicates run-time cleanup code (e.g. link close handling, connection close handling, etc.).

It would be better to implement a controlled shutdown that leverages the "normal" connection close/management delete code that exists in the router.

For example, the new shutdown process could go something like this:

Add two new attributes to the "router" management entity, adminStatus and operStatus:

adminStatus values: ["up", "down"]

operStatus values: ["active", "quiescing", "shutdown"]
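As a rough illustration of the two attributes, the sketch below models them as Python enums and allows operStatus to move only forward (active, then quiescing, then shutdown).  The class and function names are hypothetical and not part of the router's management schema, and the forward-only rule is an assumption that this proposal does not state explicitly.

```python
from enum import Enum


class AdminStatus(Enum):
    """Requested state, set by an operator or by a signal handler."""
    UP = "up"
    DOWN = "down"


class OperStatus(Enum):
    """Actual state, reported by the router."""
    ACTIVE = "active"
    QUIESCING = "quiescing"
    SHUTDOWN = "shutdown"


# Assumption: operStatus only ever moves forward, one step at a time.
_NEXT_OPER_STATUS = {
    OperStatus.ACTIVE: OperStatus.QUIESCING,
    OperStatus.QUIESCING: OperStatus.SHUTDOWN,
}


def advance_oper_status(current: OperStatus) -> OperStatus:
    """Return the status that follows `current`; raise if already shut down."""
    try:
        return _NEXT_OPER_STATUS[current]
    except KeyError:
        raise ValueError(f"no transition out of {current.value!r}") from None
```

Keeping the requested state and the reported state separate would let a management client set adminStatus to "down" and then poll operStatus to see how far the shutdown has progressed.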

adminStatus defaults to "up".  When it is set to "down" (either via management or by a signal such as SIGTERM, SIGQUIT or SIGINT), the router initiates the shutdown process:
 # Set operStatus to "quiescing".
 # Close & delete all listeners and connectors.  This will prevent new connections from being established.
 # Initiate close of all active connections.
 # Wait for all connections to complete close and delete.
 # Join() all I/O threads but the main thread (this leaves the main thread and the core thread).
 # Issue a new action to the core thread to cause it to clean up its resources and exit.
 # Join() the core thread.
 # Clean up any remaining server state then exit the main thread.
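The ordering of the steps above can be sketched with a toy model.  This is only an illustration under stated assumptions: ToyRouter, its queues, the "shutdown" core action and the made-up listener and connection names are all hypothetical, and none of it is Qpid Dispatch code.  Connections are closed by the worker threads through the same path they would use at run time, which is the point of the proposal.  Setting operStatus to "shutdown" in the last step is also an assumption; the list above does not say when that value is reached.

```python
import queue
import threading


class ToyRouter:
    """Hypothetical stand-in for the router; none of this is Qpid Dispatch code."""

    def __init__(self, io_threads=2):
        self.admin_status = "up"           # default from the proposal
        self.oper_status = "active"
        self.listeners = ["listener-0"]    # made-up names
        self.connectors = ["connector-0"]
        self.connections = {"conn-0", "conn-1", "conn-2"}
        self.core_state = {"routes": 3}    # placeholder for core-owned resources
        self._cond = threading.Condition()
        self._io_work = queue.Queue()
        self._core_actions = queue.Queue()
        # daemon=True only so that an aborted demo cannot hang the interpreter.
        self._io_threads = [
            threading.Thread(target=self._io_loop, daemon=True)
            for _ in range(io_threads)
        ]
        self._core_thread = threading.Thread(target=self._core_loop, daemon=True)
        for thread in self._io_threads + [self._core_thread]:
            thread.start()

    def _io_loop(self):
        # Each work item names a connection to close; None tells the thread to exit.
        while (conn := self._io_work.get()) is not None:
            with self._cond:
                self.connections.discard(conn)
                self._cond.notify_all()

    def _core_loop(self):
        # The core thread runs until it receives the (hypothetical) shutdown action.
        while self._core_actions.get() != "shutdown":
            pass
        self.core_state.clear()

    def set_admin_status(self, value):
        if value not in ("up", "down"):
            raise ValueError(value)
        self.admin_status = value
        if value == "down" and self.oper_status == "active":
            self._shutdown()

    def _shutdown(self):
        self.oper_status = "quiescing"             # step 1
        self.listeners.clear()                     # step 2: no new connections
        self.connectors.clear()
        with self._cond:
            pending = list(self.connections)
        for conn in pending:                       # step 3: initiate close
            self._io_work.put(conn)
        with self._cond:                           # step 4: wait for all closes
            self._cond.wait_for(lambda: not self.connections)
        for _ in self._io_threads:                 # step 5: stop and join I/O threads
            self._io_work.put(None)
        for thread in self._io_threads:
            thread.join()
        self._core_actions.put("shutdown")         # step 6: core cleans up and exits
        self._core_thread.join()                   # step 7
        self.oper_status = "shutdown"              # step 8 (assumed final status)


router = ToyRouter()
router.set_admin_status("down")
print(router.oper_status)
```

Because every thread is joined before the remaining state is released, the final cleanup runs with no other thread touching shared state, which is what removes the races described earlier.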

This is just an example of a possible shutdown sequence.  A proper design document should be proposed for review as a first step.

 

 


