Posted to issues@aurora.apache.org by "Stephan Erb (JIRA)" <ji...@apache.org> on 2015/11/02 13:57:27 UTC

[jira] [Created] (AURORA-1533) Transient connection errors can leave client in irrecoverable state

Stephan Erb created AURORA-1533:
-----------------------------------

             Summary: Transient connection errors can leave client in irrecoverable state
                 Key: AURORA-1533
                 URL: https://issues.apache.org/jira/browse/AURORA-1533
             Project: Aurora
          Issue Type: Bug
            Reporter: Stephan Erb
            Priority: Minor


During a cluster update, some of our schedulers returned an unknown error to connecting clients ([relevant code|https://github.com/apache/aurora/blob/b712d577364f6b1613b54ba696bac4ddc255ae58/src/main/python/apache/aurora/client/api/scheduler_client.py#L268]). Long-running clients failed to recover from these errors because the client code assumed the connection had already been established. Subsequent scheduling calls therefore failed with the following exception:

{code}
File  "venv/local/lib/python2.7/site-packages/apache/aurora/client/api/__init__.py"  in query_no_configs
  140.       raise self.ThriftInternalError(e.args[0])

Exception Type: ThriftInternalError
Exception Value: Error during thrift call getTasksWithoutConfigs to testcluster: 'NoneType' object has no attribute 'getTasksWithoutConfigs'
{code}

Background: We use the Python client to dispatch calls to Aurora from within a long-running web service. The connection is kept open for as long as the web service is running.
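
The failure mode described above can be illustrated with a minimal, self-contained sketch. This is not the actual Aurora client code: Proxy, SchedulerStub, UnknownError, make_flaky_connect and demo are all hypothetical names, and the "attempted" flag is only an assumption about how a client could come to believe it is connected after a failed attempt. The recover switch shows one possible remedy, namely forgetting the failed attempt so that the next call reconnects.

```python
class SchedulerStub(object):
    """Hypothetical stand-in for a connected Thrift client."""

    def getTasksWithoutConfigs(self):
        return ["task-1"]


class UnknownError(Exception):
    """Hypothetical error raised when connecting fails unexpectedly."""


class Proxy(object):
    """Hypothetical long-lived proxy that connects lazily and caches the client."""

    def __init__(self, connect, recover):
        self._connect = connect    # callable returning a client; may raise
        self._recover = recover    # True: a failed attempt is forgotten
        self._client = None
        self._attempted = False

    def _ensure_client(self):
        if self._attempted:
            return  # assumes an earlier attempt left a usable client behind
        self._attempted = True
        try:
            self._client = self._connect()
        except Exception as e:
            if self._recover:
                self._attempted = False  # let the next call reconnect
            raise UnknownError("Unknown error talking to scheduler: %s" % e)

    def call(self, method_name):
        self._ensure_client()
        return getattr(self._client, method_name)()


def make_flaky_connect(failures):
    """Return a connect callable that fails `failures` times, then succeeds."""
    remaining = [failures]

    def connect():
        if remaining[0] > 0:
            remaining[0] -= 1
            raise IOError("scheduler unavailable")
        return SchedulerStub()

    return connect


def demo(recover):
    """Issue two calls through one proxy whose first connection attempt fails."""
    proxy = Proxy(make_flaky_connect(1), recover)
    results = []
    for _ in range(2):
        try:
            results.append(proxy.call("getTasksWithoutConfigs"))
        except Exception as e:
            results.append("%s: %s" % (type(e).__name__, e))
    return results


if __name__ == "__main__":
    print(demo(recover=False))
    print(demo(recover=True))
```

With recover=False, the second call never reconnects and fails with the same 'NoneType' attribute error as in the traceback above. With recover=True, the second call reconnects and succeeds.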



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)