Posted to common-issues@hadoop.apache.org by "Masatake Iwasaki (Jira)" <ji...@apache.org> on 2019/12/19 02:26:00 UTC

[jira] [Comment Edited] (HADOOP-16763) Make Curator 4 run in soft-compatibility mode with ZooKeeper 3.4

    [ https://issues.apache.org/jira/browse/HADOOP-16763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16999663#comment-16999663 ] 

Masatake Iwasaki edited comment on HADOOP-16763 at 12/19/19 2:25 AM:
---------------------------------------------------------------------

[~elgoiri], I should have set {{yarn.resourcemanager.ha.curator-leader-elector.enabled}} to {{true}} to reproduce the issue. With zookeeper-3.5.6.jar on the RM classpath, I got the error below:
{noformat}
2019-12-19 02:23:25,754 ERROR org.apache.curator.framework.recipes.leader.LeaderLatch: getChildren() failed. rc = -6
2019-12-19 02:23:25,853 INFO org.apache.curator.framework.state.ConnectionStateManager: State change: SUSPENDED
2019-12-19 02:23:25,901 INFO org.apache.hadoop.yarn.factories.impl.pb.RpcServerFactoryPBImpl: Adding protocol org.apache.hadoop.yarn.server.api.ResourceManagerAdministrationProtocolPB to the server
2019-12-19 02:23:25,915 INFO org.apache.hadoop.ipc.Server: IPC Server Responder: starting
2019-12-19 02:23:25,916 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 8033: starting
2019-12-19 02:23:26,269 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server 98e7b66e95e3/172.18.0.11:2181. Will not attempt to authenticate using SASL (unknown error)
2019-12-19 02:23:26,270 INFO org.apache.zookeeper.ClientCnxn: Socket connection established, initiating session, client: /172.18.0.11:57042, server: 98e7b66e95e3/172.18.0.11:2181
2019-12-19 02:23:26,282 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server 98e7b66e95e3/172.18.0.11:2181, sessionid = 0x2000dc764480006, negotiated timeout = 10000
2019-12-19 02:23:26,282 INFO org.apache.curator.framework.state.ConnectionStateManager: State change: RECONNECTED
2019-12-19 02:23:26,287 WARN org.apache.zookeeper.ClientCnxn: Session 0x2000dc764480006 for server 98e7b66e95e3/172.18.0.11:2181, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Xid out of order. Got Xid 5 with err -6 expected Xid 4 for a packet with details: clientPath:/zookeeper/config serverPath:/zookeeper/config finished:false header:: 4,4  replyHeader:: 0,0,-4  request:: '/zookeeper/config,T  response::  
        at org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:907)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:101)
        at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:363)
        at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1223)
2019-12-19 02:23:26,391 INFO org.apache.curator.framework.state.ConnectionStateM
{noformat}
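
The numeric codes in the log above ("rc = -6", "err -6", and the -4 in the reply header) are ZooKeeper KeeperException codes. A minimal sketch for decoding them, assuming a POSIX shell; the function name {{zk_rc_name}} is hypothetical and only a few common codes are listed. Code -6 is UNIMPLEMENTED, which would be consistent with a 3.5 client sending a request that the 3.4 server does not implement.

```shell
# Hypothetical helper: translate a ZooKeeper return code, as printed in
# "rc = -6" or "err -6", into its KeeperException.Code name.
# Only a handful of common codes are covered; anything else is reported
# as UNKNOWN rather than guessed.
zk_rc_name() {
    case "$1" in
        0)    echo OK ;;
        -1)   echo SYSTEMERROR ;;
        -4)   echo CONNECTIONLOSS ;;
        -6)   echo UNIMPLEMENTED ;;
        -7)   echo OPERATIONTIMEOUT ;;
        -101) echo NONODE ;;
        -110) echo NODEEXISTS ;;
        -112) echo SESSIONEXPIRED ;;
        *)    echo "UNKNOWN($1)" ;;
    esac
}
```

For example, {{zk_rc_name -6}} prints UNIMPLEMENTED and {{zk_rc_name -4}} prints CONNECTIONLOSS.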

After I replaced the ZooKeeper (client) jar on the classpath with zookeeper-3.4.14.jar, it worked.
{noformat}
$ docker exec hadoop01 /hadoop/bin/hadoop classpath --glob | sed -z -e 's/:/\n/g' | grep zookeeper
/zookeeper/zookeeper-3.4.14.jar

$ docker exec hadoop01 /hadoop/bin/yarn rmadmin -getServiceState rm1
2019-12-19 01:52:27,597 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
active
{noformat}

When both zookeeper-3.4.14.jar and zookeeper-3.5.6.jar were on the classpath, I got the error above.
{noformat}
$ docker exec hadoop01 cat /hadoop/etc/hadoop/hadoop-env.sh | grep '^export.*CLASSPATH'
export HADOOP_CLASSPATH="/zookeeper/zookeeper-3.4.14.jar"
export HADOOP_USER_CLASSPATH_FIRST="yes"

$ docker exec hadoop01 /hadoop/bin/hadoop classpath --glob | sed -z -e 's/:/\n/g' | grep zookeeper
/zookeeper/zookeeper-3.4.14.jar
/hadoop/share/hadoop/common/lib/zookeeper-3.5.6.jar
/hadoop/share/hadoop/common/lib/zookeeper-jute-3.5.6.jar
/hadoop/share/hadoop/hdfs/lib/zookeeper-3.5.6.jar
/hadoop/share/hadoop/hdfs/lib/zookeeper-jute-3.5.6.jar
{noformat}
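
The mixed-classpath situation shown above can be detected mechanically. The following is a minimal sketch, assuming a POSIX shell; the function names are hypothetical and not part of Hadoop. It splits a classpath string with {{tr}} (portable, unlike the GNU-only {{sed -z}}), keeps only the ZooKeeper client jar (not zookeeper-jute), and warns when more than one version is present.

```shell
# Hypothetical helpers, not part of Hadoop.

# Print the ZooKeeper client jars found in a ':'-separated classpath,
# in classpath order. Companion jars such as zookeeper-jute-*.jar are
# skipped because the name must continue with a digit.
list_zk_jars() {
    printf '%s\n' "$1" | tr ':' '\n' | grep -E '(^|/)zookeeper-[0-9][^/]*\.jar$'
}

# Reduce each jar path to its version string and de-duplicate.
zk_versions() {
    list_zk_jars "$1" | sed -E 's|^(.*/)?zookeeper-([0-9][^/]*)\.jar$|\2|' | sort -u
}

# Report whether the classpath carries zero, one, or several versions.
# Returns 0 for exactly one version, 1 for mixed versions, 2 for none.
check_zk_classpath() {
    # Word splitting is intended: one positional parameter per version.
    set -- $(zk_versions "$1")
    case $# in
        0) echo "no ZooKeeper client jar found"; return 2 ;;
        1) echo "OK: ZooKeeper client $1" ;;
        *) echo "WARNING: mixed ZooKeeper client versions: $*"; return 1 ;;
    esac
}
```

It could be run against the container from this comment as {{check_zk_classpath "$(docker exec hadoop01 /hadoop/bin/hadoop classpath --glob)"}}. With the classpath listed above it would report both 3.4.14 and 3.5.6.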



was (Author: iwasakims):
[~elgoiri], I should have set {{yarn.resourcemanager.ha.curator-leader-elector.enabled}} to {{true}} to reproduce the issue. With zookeeper-3.5.6.jar on the RM classpath, I got the error below:
{noformat}
2019-12-19 01:45:37,149 ERROR org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Error starting ResourceManager
java.lang.NoSuchMethodError: org.apache.zookeeper.server.quorum.flexible.QuorumMaj.<init>(Ljava/util/Map;)V
        at org.apache.curator.framework.imps.EnsembleTracker.<init>(EnsembleTracker.java:57)
        at org.apache.curator.framework.imps.CuratorFrameworkImpl.<init>(CuratorFrameworkImpl.java:159)
        at org.apache.curator.framework.CuratorFrameworkFactory$Builder.build(CuratorFrameworkFactory.java:165)
        at org.apache.hadoop.util.curator.ZKCuratorManager.start(ZKCuratorManager.java:154)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createAndStartZKManager(ResourceManager.java:419)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.createEmbeddedElector(ResourceManager.java:385)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.serviceInit(ResourceManager.java:333)
        at org.apache.hadoop.service.AbstractService.init(AbstractService.java:164)
        at org.apache.hadoop.yarn.server.resourcemanager.ResourceManager.main(ResourceManager.java:1576)
{noformat}

After I replaced the ZooKeeper (client) jar on the classpath with zookeeper-3.4.14.jar, it worked.
{noformat}
$ docker exec hadoop01 /hadoop/bin/hadoop classpath --glob | sed -z -e 's/:/\n/g' | grep zookeeper
/zookeeper/zookeeper-3.4.14.jar

$ docker exec hadoop01 /hadoop/bin/yarn rmadmin -getServiceState rm1
2019-12-19 01:52:27,597 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
active
{noformat}

When both zookeeper-3.4.14.jar and zookeeper-3.5.6.jar were on the classpath, I got the error above.
{noformat}
$ docker exec hadoop01 cat /hadoop/etc/hadoop/hadoop-env.sh | grep '^export.*CLASSPATH'
export HADOOP_CLASSPATH="/zookeeper/zookeeper-3.4.14.jar"
export HADOOP_USER_CLASSPATH_FIRST="yes"

$ docker exec hadoop01 /hadoop/bin/hadoop classpath --glob | sed -z -e 's/:/\n/g' | grep zookeeper
/zookeeper/zookeeper-3.4.14.jar
/hadoop/share/hadoop/common/lib/zookeeper-3.5.6.jar
/hadoop/share/hadoop/common/lib/zookeeper-jute-3.5.6.jar
/hadoop/share/hadoop/hdfs/lib/zookeeper-3.5.6.jar
/hadoop/share/hadoop/hdfs/lib/zookeeper-jute-3.5.6.jar
{noformat}


> Make Curator 4 run in soft-compatibility mode with ZooKeeper 3.4
> ----------------------------------------------------------------
>
>                 Key: HADOOP-16763
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16763
>             Project: Hadoop Common
>          Issue Type: Improvement
>            Reporter: Íñigo Goiri
>            Priority: Major
>
> HADOOP-16579 changed Curator to 4.2 and ZooKeeper to 3.5.
> This change relates to the client libraries used by the components.
> However, the ensemble in most deployments is 3.4 (the default in Ubuntu, for example).
> To support such deployments, Curator has a soft-compatibility mode described in http://curator.apache.org/zk-compatibility.html
> We should enable this soft-compatibility mode.


