Posted to issues@aurora.apache.org by "Bill Farner (JIRA)" <ji...@apache.org> on 2014/05/24 07:30:02 UTC

[jira] [Resolved] (AURORA-100) Thrift connection appears to keep the scheduler from shutting down

     [ https://issues.apache.org/jira/browse/AURORA-100?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Bill Farner resolved AURORA-100.
--------------------------------

    Resolution: Won't Fix

We're now using HTTP for transport, so this is no longer an issue.

> Thrift connection appears to keep the scheduler from shutting down
> ------------------------------------------------------------------
>
>                 Key: AURORA-100
>                 URL: https://issues.apache.org/jira/browse/AURORA-100
>             Project: Aurora
>          Issue Type: Bug
>          Components: Scheduler
>            Reporter: Bill Farner
>            Priority: Minor
>              Labels: newbie
>
> This originally cropped up when we were using thrift 0.5.0, so the code sample below may be stale.
> Looking at the TThreadPoolServer source, the behavior here makes sense. We use the default ExecutorService created by TThreadPoolServer, which uses non-daemon threads, and any live non-daemon thread keeps the JVM from exiting. Here's TThreadPoolServer's serve loop:
> {code}
>   public void serve() {
>     try {
>       serverTransport_.listen();
>     } catch (TTransportException ttx) {
>       LOGGER.error("Error occurred during listening.", ttx);
>       return;
>     }
>     // Run the preServe event
>     if (eventHandler_ != null) {
>       eventHandler_.preServe();
>     }
>     stopped_ = false;
>     setServing(true);
>     while (!stopped_) {
>       int failureCount = 0;
>       try {
>         TTransport client = serverTransport_.accept();
>         WorkerProcess wp = new WorkerProcess(client);
>         executorService_.execute(wp);
>       } catch (TTransportException ttx) {
>         if (!stopped_) {
>           ++failureCount;
>           LOGGER.warn("Transport error occurred during acceptance of message.", ttx);
>         }
>       }
>     }
>     executorService_.shutdown();
>     // Loop until awaitTermination finally does return without an interrupted
>     // exception. If we don't do this, then we'll shut down prematurely. We want
>     // to let the executorService clear its task queue, closing client sockets
>     // appropriately.
>     long timeoutMS = stopTimeoutUnit.toMillis(stopTimeoutVal);
>     long now = System.currentTimeMillis();
>     while (timeoutMS >= 0) {
>       try {
>         executorService_.awaitTermination(timeoutMS, TimeUnit.MILLISECONDS);
>         break;
>       } catch (InterruptedException ix) {
>         long newnow = System.currentTimeMillis();
>         timeoutMS -= (newnow - now);
>         now = newnow;
>       }
>     }
>     setServing(false);
>   }
> {code}
> The important bit, near the end, is that they never invoke executorService_.shutdownNow(), which would attempt to terminate active connections by interrupting the worker threads.
> This is likely a deliberate design choice. Thrift 0.6.0+ allows callers to provide their own ExecutorService, which would give us more control here.
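
For illustration, here is a minimal sketch of the two levers mentioned above: an ExecutorService built on daemon threads, and shutdown() versus shutdownNow() when a worker is still busy. It uses only java.util.concurrent and does not touch Thrift. The class and method names (ShutdownSketch, daemonThreadFactory, stopPool) are hypothetical, and the sleeping task is only a stand-in for a worker blocked on a client connection. How such a pool is handed to TThreadPoolServer depends on the Thrift version and is not shown.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class ShutdownSketch {

  /** Hypothetical helper: creates daemon threads, which do not keep the JVM alive. */
  static ThreadFactory daemonThreadFactory(String namePrefix) {
    AtomicInteger counter = new AtomicInteger();
    return runnable -> {
      Thread t = new Thread(runnable, namePrefix + "-" + counter.incrementAndGet());
      t.setDaemon(true);
      return t;
    };
  }

  /**
   * Starts one long-running worker, stops the pool, and reports whether the
   * pool terminated within 500 ms.
   */
  static boolean stopPool(boolean useShutdownNow) {
    ExecutorService pool = Executors.newCachedThreadPool(daemonThreadFactory("worker"));
    CountDownLatch started = new CountDownLatch(1);
    pool.execute(() -> {
      started.countDown();
      try {
        // Stand-in for a worker that is busy with an open client connection.
        Thread.sleep(Long.MAX_VALUE);
      } catch (InterruptedException e) {
        // Interrupted by shutdownNow(): restore the flag and let the task end.
        Thread.currentThread().interrupt();
      }
    });
    try {
      started.await();
      if (useShutdownNow) {
        pool.shutdownNow(); // interrupts running workers
      } else {
        pool.shutdown();    // stops accepting new tasks, leaves running workers alone
      }
      return pool.awaitTermination(500, TimeUnit.MILLISECONDS);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("shutdown() terminated: " + stopPool(false));
    System.out.println("shutdownNow() terminated: " + stopPool(true));
  }
}
```

With shutdown() the busy worker keeps running and the pool does not terminate in time; with shutdownNow() the worker is interrupted and exits. Because the threads are daemons, the JVM can still exit in the first case. One caveat: Thread.sleep responds to interruption, but a blocking read on a java.net.Socket generally does not, so a real server would likely also need to close the client sockets for shutdownNow() to have the same effect.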



--
This message was sent by Atlassian JIRA
(v6.2#6252)