Posted to dev@ambari.apache.org by "Zack Marsh (JIRA)" <ji...@apache.org> on 2015/05/06 22:25:00 UTC

[jira] [Created] (AMBARI-10977) HDFS Rebalance failed with IllegalArgumentException: Does not contain a valid host:port authority

Zack Marsh created AMBARI-10977:
-----------------------------------

             Summary: HDFS Rebalance failed with IllegalArgumentException: Does not contain a valid host:port authority
                 Key: AMBARI-10977
                 URL: https://issues.apache.org/jira/browse/AMBARI-10977
             Project: Ambari
          Issue Type: Bug
         Environment: ambari-2.1.0-376, hdp-2.3.0.0-1880, sles11sp3

            Reporter: Zack Marsh


The HDFS Rebalance is failing with the following error messages:


stderr:
{code}
2015-05-06 12:31:46,656 - Error while executing command 'rebalancehdfs':
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 214, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 243, in rebalancehdfs
    logoutput = False,
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 269, in action_run
    raise ex
Fail: Execution of 'ambari-sudo.sh su hdfs -l -s /bin/bash -c 'export  PATH='"'"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/sbin:/sbin:/usr/lib/ambari-server/*:/sbin:/usr/sbin:/usr/local/sbin:/root/bin:/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/X11R6/bin:/usr/games:/opt/teradata/bynet/bin:/usr/lib/mit/bin:/usr/lib/mit/sbin:.:/opt/teradata/sm3g/bin:/opt/dell/srvadmin/bin:/opt/dell/srvadmin/sbin:/opt/teradata/dswap/sbin:/usr/tdbms/bin:/opt/teradata/gsctools/bin:/opt/teradata/vmf/bin:/opt/teradata/bin:/var/lib/ambari-agent:/var/lib/ambari-agent:/usr/hdp/current/hadoop-client/bin'"'"' ; hdfs --config /usr/hdp/current/hadoop-client/conf balancer -threshold 10'' returned 255. May 6, 2015 12:31:46 PM  Balancing took 888.0 milliseconds
15/05/06 12:31:46 ERROR balancer.Balancer: Exiting balancer due an exception
java.lang.IllegalArgumentException: Does not contain a valid host:port authority: jolokia1.labs.teradata.com
	at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:213)
	at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
	at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153)
	at org.apache.hadoop.hdfs.DFSUtil.getNameServiceUris(DFSUtil.java:1037)
	at org.apache.hadoop.hdfs.DFSUtil.getNsServiceRpcUris(DFSUtil.java:978)
	at org.apache.hadoop.hdfs.server.balancer.Balancer$Cli.run(Balancer.java:682)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.hadoop.hdfs.server.balancer.Balancer.main(Balancer.java:794)
{code}

stdout:
{code}
Starting balancer with threshold = 10
Executing command ambari-sudo.sh su hdfs -l -s /bin/bash -c 'export  PATH='"'"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/sbin:/sbin:/usr/lib/ambari-server/*:/sbin:/usr/sbin:/usr/local/sbin:/root/bin:/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/X11R6/bin:/usr/games:/opt/teradata/bynet/bin:/usr/lib/mit/bin:/usr/lib/mit/sbin:.:/opt/teradata/sm3g/bin:/opt/dell/srvadmin/bin:/opt/dell/srvadmin/sbin:/opt/teradata/dswap/sbin:/usr/tdbms/bin:/opt/teradata/gsctools/bin:/opt/teradata/vmf/bin:/opt/teradata/bin:/var/lib/ambari-agent:/var/lib/ambari-agent:/usr/hdp/current/hadoop-client/bin'"'"' ; hdfs --config /usr/hdp/current/hadoop-client/conf balancer -threshold 10'
2015-05-06 12:31:43,096 - Execute['ambari-sudo.sh su hdfs -l -s /bin/bash -c 'export  PATH='"'"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/sbin:/sbin:/usr/lib/ambari-server/*:/sbin:/usr/sbin:/usr/local/sbin:/root/bin:/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/X11R6/bin:/usr/games:/opt/teradata/bynet/bin:/usr/lib/mit/bin:/usr/lib/mit/sbin:.:/opt/teradata/sm3g/bin:/opt/dell/srvadmin/bin:/opt/dell/srvadmin/sbin:/opt/teradata/dswap/sbin:/usr/tdbms/bin:/opt/teradata/gsctools/bin:/opt/teradata/vmf/bin:/opt/teradata/bin:/var/lib/ambari-agent:/var/lib/ambari-agent:/usr/hdp/current/hadoop-client/bin'"'"' ; hdfs --config /usr/hdp/current/hadoop-client/conf balancer -threshold 10''] {'logoutput': False, 'on_new_line': handle_new_line}
[balancer] May 6, 2015 12:31:46 PM [balancer]  [balancer] Balancing took 888.0 milliseconds[balancer] 
[balancer] 15/05/06 12:31:46 ERROR balancer.Balancer: Exiting balancer due an exception
java.lang.IllegalArgumentException: Does not contain a valid host:port authority: jolokia1.labs.teradata.com
	at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:213)
	at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
	at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153)
	at org.apache.hadoop.hdfs.DFSUtil.getNameServiceUris(DFSUtil.java:1037)
	at org.apache.hadoop.hdfs.DFSUtil[balancer] .getNsServiceRpcUris(DFSUtil.java:978)
	at org.apache.hadoop.hdfs.server.balancer.Balancer$Cli.run(Balancer.java:682)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.hadoop.hdfs.server.balancer.Balancer.main(Balancer.java:794)
2015-05-06 12:31:46,656 - Error while executing command 'rebalancehdfs':
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 214, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 243, in rebalancehdfs
    logoutput = False,
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 269, in action_run
    raise ex
Fail: Execution of 'ambari-sudo.sh su hdfs -l -s /bin/bash -c 'export  PATH='"'"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/sbin:/sbin:/usr/lib/ambari-server/*:/sbin:/usr/sbin:/usr/local/sbin:/root/bin:/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/X11R6/bin:/usr/games:/opt/teradata/bynet/bin:/usr/lib/mit/bin:/usr/lib/mit/sbin:.:/opt/teradata/sm3g/bin:/opt/dell/srvadmin/bin:/opt/dell/srvadmin/sbin:/opt/teradata/dswap/sbin:/usr/tdbms/bin:/opt/teradata/gsctools/bin:/opt/teradata/vmf/bin:/opt/teradata/bin:/var/lib/ambari-agent:/var/lib/ambari-agent:/usr/hdp/current/hadoop-client/bin'"'"' ; hdfs --config /usr/hdp/current/hadoop-client/conf balancer -threshold 10'' returned 255. May 6, 2015 12:31:46 PM  Balancing took 888.0 milliseconds
15/05/06 12:31:46 ERROR balancer.Balancer: Exiting balancer due an exception
java.lang.IllegalArgumentException: Does not contain a valid host:port authority: jolokia1.labs.teradata.com
	at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:213)
	at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:164)
	at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:153)
	at org.apache.hadoop.hdfs.DFSUtil.getNameServiceUris(DFSUtil.java:1037)
	at org.apache.hadoop.hdfs.DFSUtil.getNsServiceRpcUris(DFSUtil.java:978)
	at org.apache.hadoop.hdfs.server.balancer.Balancer$Cli.run(Balancer.java:682)
	at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
	at org.apache.hadoop.hdfs.server.balancer.Balancer.main(Balancer.java:794)
{code}

The NameNode RPC address properties are set as follows:
{code}
dfs.namenode.rpc-address = <NN1 FQDN>
dfs.namenode.rpc-address.<CLUSTER-NAME>.nn1 = <NN1 FQDN>:8020
dfs.namenode.rpc-address.<CLUSTER-NAME>.nn2 = <NN2 FQDN>:8020
{code}
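
For illustration only, the following Python sketch shows the kind of check that the first value above would fail: it has a host but no port, while the two suffixed values carry ":8020". This is a simplified approximation and not Hadoop's actual NetUtils.createSocketAddr logic (it ignores URI schemes and IPv6 literals, for example). The function name, the host names and the "mycluster" nameservice are hypothetical placeholders.

```python
def has_host_port_authority(value):
    """Return True if value looks like 'host:port' with a port in 0..65535.

    Simplified approximation for illustration; not Hadoop's real parser.
    """
    # Split on the last colon so the host part may itself contain dots.
    host, sep, port = value.rpartition(":")
    if not sep or not host:
        return False  # no colon at all, or nothing before it
    if not port.isdecimal():
        return False  # port missing or not a number
    return 0 <= int(port) <= 65535


# Hypothetical values mirroring the shape of the properties in this report.
properties = {
    "dfs.namenode.rpc-address": "nn1.example.com",
    "dfs.namenode.rpc-address.mycluster.nn1": "nn1.example.com:8020",
    "dfs.namenode.rpc-address.mycluster.nn2": "nn2.example.com:8020",
}

# Collect the property names whose value lacks a usable host:port pair.
missing_port = [name for name, value in properties.items()
                if not has_host_port_authority(value)]
print(missing_port)
```

Under these assumptions, only the unsuffixed dfs.namenode.rpc-address entry is flagged, which matches the property the exception message points at.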

Setting the unsuffixed dfs.namenode.rpc-address property to the active NameNode at port 8020 allows the rebalance to succeed.

However, if this property is set to the standby NameNode at port 8020, the rebalance fails with the following error:

{code}
2015-05-06 12:45:59,673 - Error while executing command 'rebalancehdfs':
Traceback (most recent call last):
  File "/usr/lib/python2.6/site-packages/resource_management/libraries/script/script.py", line 214, in execute
    method(env)
  File "/var/lib/ambari-agent/cache/common-services/HDFS/2.1.0.2.0/package/scripts/namenode.py", line 243, in rebalancehdfs
    logoutput = False,
  File "/usr/lib/python2.6/site-packages/resource_management/core/base.py", line 148, in __init__
    self.env.run()
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 152, in run
    self.run_action(resource, action)
  File "/usr/lib/python2.6/site-packages/resource_management/core/environment.py", line 118, in run_action
    provider_action()
  File "/usr/lib/python2.6/site-packages/resource_management/core/providers/system.py", line 269, in action_run
    raise ex
Fail: Execution of 'ambari-sudo.sh su hdfs -l -s /bin/bash -c 'export  PATH='"'"'/usr/sbin:/sbin:/usr/lib/ambari-server/*:/usr/sbin:/sbin:/usr/lib/ambari-server/*:/sbin:/usr/sbin:/usr/local/sbin:/root/bin:/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/X11R6/bin:/usr/games:/opt/teradata/bynet/bin:/usr/lib/mit/bin:/usr/lib/mit/sbin:.:/opt/teradata/sm3g/bin:/opt/dell/srvadmin/bin:/opt/dell/srvadmin/sbin:/opt/teradata/dswap/sbin:/usr/tdbms/bin:/opt/teradata/gsctools/bin:/opt/teradata/vmf/bin:/opt/teradata/bin:/var/lib/ambari-agent:/var/lib/ambari-agent:/usr/hdp/current/hadoop-client/bin'"'"' ; hdfs --config /usr/hdp/current/hadoop-client/conf balancer -threshold 10'' returned 252. 15/05/06 12:45:56 INFO balancer.Balancer: Using a threshold of 10.0
15/05/06 12:45:56 INFO balancer.Balancer: namenodes  = [hdfs://jolokia1.labs.teradata.com:8020, hdfs://JOLOKIA]
15/05/06 12:45:56 INFO balancer.Balancer: parameters = Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
Time Stamp               Iteration#  Bytes Already Moved  Bytes Left To Move  Bytes Being Moved
org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.ipc.StandbyException): Operation category READ is not supported in state standby
	at org.apache.hadoop.hdfs.server.namenode.ha.StandbyState.checkOperation(StandbyState.java:87)
	at org.apache.hadoop.hdfs.server.namenode.NameNode$NameNodeHAContext.checkOperation(NameNode.java:1785)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.checkOperation(FSNamesystem.java:1301)
	at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getServerDefaults(FSNamesystem.java:1613)
	at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.getServerDefaults(NameNodeRpcServer.java:593)
	at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.getServerDefaults(ClientNamenodeProtocolServerSideTranslatorPB.java:383)
	at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
	at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
	at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:969)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2049)
	at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2045)
	at java.security.AccessController.doPrivileged(Native Method)
	at javax.security.auth.Subject.doAs(Subject.java:422)
	at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1657)
	at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2043)
.  Exiting ...
May 6, 2015 12:45:59 PM  Balancing took 3.127 seconds
{code}


