Posted to common-dev@hadoop.apache.org by "Alex Ivanov (JIRA)" <ji...@apache.org> on 2016/09/25 08:56:20 UTC
[jira] [Created] (HADOOP-13652) ZKDelegationTokenSecretManager
doesn't seem to honor ZK connection/session timeouts
Alex Ivanov created HADOOP-13652:
------------------------------------
Summary: ZKDelegationTokenSecretManager doesn't seem to honor ZK connection/session timeouts
Key: HADOOP-13652
URL: https://issues.apache.org/jira/browse/HADOOP-13652
Project: Hadoop Common
Issue Type: Bug
Components: kms
Reporter: Alex Ivanov
Looking at some of the errors I've seen from KMS due to ZooKeeper connection issues, it doesn't look like the following timeouts are being picked up:
{code}
package org.apache.hadoop.security.token.delegation;
public abstract class ZKDelegationTokenSecretManager<TokenIdent extends AbstractDelegationTokenIdentifier>
    extends AbstractDelegationTokenSecretManager<TokenIdent> {
  public static final int ZK_DTSM_ZK_SESSION_TIMEOUT_DEFAULT = 10000;
  public static final int ZK_DTSM_ZK_CONNECTION_TIMEOUT_DEFAULT = 10000;
  ...
}
{code}
Instead, the connection and session timeouts in effect are 15 and 60 seconds respectively, which are the Curator defaults:
{code}
package org.apache.curator.framework;
public class CuratorFrameworkFactory
{
  private static final int DEFAULT_SESSION_TIMEOUT_MS = Integer.getInteger("curator-default-session-timeout", 60 * 1000);
  private static final int DEFAULT_CONNECTION_TIMEOUT_MS = Integer.getInteger("curator-default-connection-timeout", 15 * 1000);
  ...
}
{code}
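The Curator defaults quoted above are resolved through Integer.getInteger, which reads a JVM system property and falls back to the given literal only when the property is unset or is not a valid integer. A minimal sketch of that lookup, using only the JDK; the class name CuratorDefaultTimeoutSketch and the method resolve are hypothetical and not part of Curator:

```java
public class CuratorDefaultTimeoutSketch {
    // Property names and fallback values copied from the CuratorFrameworkFactory excerpt.
    static final String SESSION_PROP = "curator-default-session-timeout";
    static final String CONNECTION_PROP = "curator-default-connection-timeout";

    // Same lookup the excerpt performs: the system property wins when it parses
    // as an integer, otherwise the fallback is returned.
    static int resolve(String property, int fallbackMs) {
        return Integer.getInteger(property, fallbackMs);
    }

    public static void main(String[] args) {
        System.out.println("session ms    = " + resolve(SESSION_PROP, 60 * 1000));
        System.out.println("connection ms = " + resolve(CONNECTION_PROP, 15 * 1000));
    }
}
```

If the Curator defaults really are what KMS ends up using, starting the JVM with -Dcurator-default-connection-timeout=10000 would presumably change them, provided the property is set before CuratorFrameworkFactory is loaded (the values are kept in static final fields). This is untested against KMS.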
It looks like DelegationTokenAuthenticationFilter sets the Curator client on ZKDelegationTokenSecretManager (taken from a servlet context attribute), and that may be causing the issue:
{code}
package org.apache.hadoop.security.token.delegation.web;
public class DelegationTokenAuthenticationFilter
    extends AuthenticationFilter {
  protected void initializeAuthHandler(String authHandlerClassName,
      FilterConfig filterConfig) throws ServletException {
    ZKDelegationTokenSecretManager.setCurator((CuratorFramework)
        filterConfig.getServletContext().getAttribute(ZKSignerSecretProvider.
            ZOOKEEPER_SIGNER_SECRET_PROVIDER_CURATOR_CLIENT_ATTRIBUTE));
    super.initializeAuthHandler(authHandlerClassName, filterConfig);
    ZKDelegationTokenSecretManager.setCurator(null);
  }
}
{code}
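To make the suspicion concrete, here is a self-contained model of how an injected client could bypass configured timeouts. Everything in it is hypothetical: Client and SecretManager are stand-ins rather than the Curator or Hadoop classes, and holding the injected client in a thread-local is an assumption. It only illustrates the suspected pattern, where a manager that finds a pre-built client uses it as-is and never applies its own timeout settings.

```java
public class InjectedClientSketch {
    /** Hypothetical stand-in for a Curator client; it only records its timeouts. */
    static final class Client {
        final int sessionTimeoutMs;
        final int connectionTimeoutMs;

        Client(int sessionTimeoutMs, int connectionTimeoutMs) {
            this.sessionTimeoutMs = sessionTimeoutMs;
            this.connectionTimeoutMs = connectionTimeoutMs;
        }
    }

    /** Hypothetical stand-in for the secret manager. */
    static final class SecretManager {
        // Assumption: the injected client is held per thread.
        private static final ThreadLocal<Client> INJECTED = new ThreadLocal<>();

        static void setClient(Client client) {
            INJECTED.set(client);
        }

        final Client client;

        SecretManager(int configuredSessionMs, int configuredConnectionMs) {
            Client injected = INJECTED.get();
            // An injected client is used unchanged, so the configured timeouts
            // only matter when nothing was injected.
            this.client = (injected != null)
                ? injected
                : new Client(configuredSessionMs, configuredConnectionMs);
        }
    }

    /** Mirrors the set / initialize / clear sequence in the filter excerpt. */
    static Client initWithInjection(Client shared) {
        SecretManager.setClient(shared);
        try {
            return new SecretManager(10000, 10000).client;
        } finally {
            SecretManager.setClient(null);
        }
    }

    public static void main(String[] args) {
        Client shared = new Client(60 * 1000, 15 * 1000); // built elsewhere with other timeouts
        Client used = initWithInjection(shared);
        System.out.println("connection timeout in use: " + used.connectionTimeoutMs);
    }
}
```

In this model the manager asks for 10000 ms but ends up with the shared client's timeouts, which would match the symptoms if the real code behaves similarly.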
Example errors:
{code}
2016-09-25 01:46:33,053 ERROR ConnectionState - Connection timed out for connection string (host1, host2, host3) and timeout (15000) / elapsed (15001)
2016-09-25 01:46:33,053 ERROR ConnectionState - Connection timed out for connection string (host1, host2, host3) and timeout (15000) / elapsed (15001)
2016-09-25 01:46:34,028 ERROR ConnectionState - Connection timed out for connection string (host1, host2, host3) and timeout (15000) / elapsed (15976)
2016-09-25 01:46:34,053 ERROR ConnectionState - Connection timed out for connection string (host1, host2, host3) and timeout (15000) / elapsed (16001)
2016-09-25 01:46:37,053 ERROR ConnectionState - Connection timed out for connection string (host1, host2, host3) and timeout (15000) / elapsed (19001)
2016-09-25 01:46:40,053 ERROR ConnectionState - Connection timed out for connection string (host1, host2, host3) and timeout (15000) / elapsed (22001)
2016-09-25 01:46:49,055 ERROR ConnectionState - Connection timed out for connection string (host1, host2, host3) and timeout (15000) / elapsed (31003)
2016-09-25 01:46:52,029 ERROR ConnectionState - Connection timed out for connection string (host1, host2, host3) and timeout (15000) / elapsed (33977)
2016-09-25 01:47:05,344 ERROR ConnectionState - Connection timed out for connection string (host1, host2, host3) and timeout (15000) / elapsed (47292)
2016-09-25 01:47:09,345 ERROR ConnectionState - Connection timed out for connection string (host1, host2, host3) and timeout (15000) / elapsed (51292)
2016-09-25 01:47:24,346 WARN ConnectionState - Connection attempt unsuccessful after 66294 (greater than max timeout of 60000). Resetting connection and trying again with a new connection.
2016-09-25 01:47:43,740 ERROR ConnectionState - Connection timed out for connection string (host1, host2, host3) and timeout (15000) / elapsed (15001)
2016-09-25 01:47:43,740 ERROR ConnectionState - Connection timed out for connection string (host1, host2, host3) and timeout (15000) / elapsed (15001)
{code}
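The timeout in effect can be read straight out of these log lines. A small sketch, assuming the "timeout (N) / elapsed (M)" wording shown above stays stable; the class name ConnectionTimeoutLogSketch and the method parse are made up for illustration:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ConnectionTimeoutLogSketch {
    // Matches the "timeout (15000) / elapsed (15001)" tail of the ERROR lines.
    private static final Pattern TIMED_OUT =
        Pattern.compile("timeout \\((\\d+)\\) / elapsed \\((\\d+)\\)");

    /** Returns {timeoutMs, elapsedMs}, or null when the line has no such tail. */
    static long[] parse(String line) {
        Matcher m = TIMED_OUT.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new long[] { Long.parseLong(m.group(1)), Long.parseLong(m.group(2)) };
    }

    public static void main(String[] args) {
        String line = "2016-09-25 01:46:33,053 ERROR ConnectionState - Connection timed out "
            + "for connection string (host1, host2, host3) and timeout (15000) / elapsed (15001)";
        long[] parsed = parse(line);
        System.out.println("timeout=" + parsed[0] + " ms, elapsed=" + parsed[1] + " ms");
    }
}
```

Every ERROR line above reports a timeout of 15000 ms, and the WARN line mentions a max timeout of 60000 ms. Both figures equal the Curator defaults rather than the 10000 ms defaults in ZKDelegationTokenSecretManager.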
There are also some connection issues between KMS and ZooKeeper. They are sporadic, which is why I'm still trying to pinpoint them, but essentially KMS can get into a perpetual connect/disconnect cycle, from which it eventually recovers on its own or after a restart. I'm mentioning this in case it is related to this JIRA.