You are viewing a plain text version of this content. The canonical link for it is here.
Posted to common-user@hadoop.apache.org by Benjamin Bowman <bb...@gmail.com> on 2014/05/30 16:30:27 UTC

HDFS Balancer short iterations

All,

Setup:  Hadoop 2.4.  3 servers.  1 master, 2 slaves.  Master is operating
as a datanode as well.

I am having a problem with the HDFS balancer utility not moving the amount
of data it says it is going to per iteration.  I have modified the
dfs.balance.bandwidthPerSec flag and the problem persists.  It is not an
issue with speed exactly.  When it is moving blocks - it moves them
quickly.  However, after about 10 blocks into the iteration, it stops
moving blocks and moves onto the next iteration - even though it said it
was going to move 10GB in the current iteration.  The "Bytes Already Moved"
field does not increase either.  And since it is moving such a trivial
amount of data (maybe 50 MB per iteration) the "Bytes Left To Move" field
does not go down.  In fact it increases since the system is loading data.

Below is the command and output from from first 2 iterations (debug output
enabled).  You will see only a small amount of data is moved per iteration.
 How can I fix this?  Is there a configuration option I am missing here?

Thanks,
Benjamin Bowman



[root@master_server ~]# /opt/hadoop/bin/hdfs balancer -threshold 5
14/05/30 14:08:48 DEBUG util.Shell: setsid exited with exit code 0
14/05/30 14:08:48 INFO balancer.Balancer: Using a threshold of 5.0
14/05/30 14:08:48 INFO balancer.Balancer: namenodes =
[hdfs://master_server:8020]
14/05/30 14:08:48 INFO balancer.Balancer: p         =
Balancer.Parameters[BalancingPolicy.Node, threshold=5.0]
Time Stamp               Iteration#  Bytes Already Moved  Bytes Left To
Move  Bytes Being Moved
14/05/30 14:08:48 DEBUG lib.MutableMetricsFactory: field
org.apache.hadoop.metrics2.lib.MutableRate
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess
with annotation
@org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, about=,
value=[Rate of successful kerberos logins and latency (milliseconds)],
always=false, type=DEFAULT, sampleName=Ops)
14/05/30 14:08:48 DEBUG lib.MutableMetricsFactory: field
org.apache.hadoop.metrics2.lib.MutableRate
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure
with annotation
@org.apache.hadoop.metrics2.annotation.Metric(valueName=Time, about=,
value=[Rate of failed kerberos logins and latency (milliseconds)],
always=false, type=DEFAULT, sampleName=Ops)
14/05/30 14:08:48 DEBUG lib.MutableMetricsFactory: field
org.apache.hadoop.metrics2.lib.MutableRate
org.apache.hadoop.security.UserGroupInformation$UgiMetrics.getGroups with
annotation @org.apache.hadoop.metrics2.annotation.Metric(valueName=Time,
about=, value=[GetGroups], always=false, type=DEFAULT, sampleName=Ops)
14/05/30 14:08:48 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and group
related metrics
14/05/30 14:08:49 DEBUG security.Groups:  Creating new Groups object
14/05/30 14:08:49 DEBUG util.NativeCodeLoader: Trying to load the
custom-built native-hadoop library...
14/05/30 14:08:49 DEBUG util.NativeCodeLoader: Loaded the native-hadoop
library
14/05/30 14:08:49 DEBUG security.JniBasedUnixGroupsMapping: Using
JniBasedUnixGroupsMapping for Group resolution
14/05/30 14:08:49 DEBUG security.JniBasedUnixGroupsMappingWithFallback:
Group mapping impl=org.apache.hadoop.security.JniBasedUnixGroupsMapping
14/05/30 14:08:49 DEBUG security.Groups: Group mapping
impl=org.apache.hadoop.security.JniBasedUnixGroupsMappingWithFallback;
cacheTimeout=300000; warningDeltaMs=5000
14/05/30 14:08:49 DEBUG security.UserGroupInformation: hadoop login
14/05/30 14:08:49 DEBUG security.UserGroupInformation: hadoop login commit
14/05/30 14:08:49 DEBUG security.UserGroupInformation: using local
user:UnixPrincipal: root
14/05/30 14:08:49 DEBUG security.UserGroupInformation: UGI loginUser:root
(auth:SIMPLE)
14/05/30 14:08:49 DEBUG ipc.Server: rpcKind=RPC_PROTOCOL_BUFFER,
rpcRequestWrapperClass=class
org.apache.hadoop.ipc.ProtobufRpcEngine$RpcRequestWrapper,
rpcInvoker=org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker@298488ef
14/05/30 14:08:49 DEBUG ipc.Client: getting client out of cache:
org.apache.hadoop.ipc.Client@69264eab
14/05/30 14:08:49 DEBUG retry.RetryUtils: multipleLinearRandomRetry = null
14/05/30 14:08:49 DEBUG ipc.Client: getting client out of cache:
org.apache.hadoop.ipc.Client@69264eab
14/05/30 14:08:49 DEBUG hdfs.BlockReaderLocal:
dfs.client.use.legacy.blockreader.local = false
14/05/30 14:08:49 DEBUG hdfs.BlockReaderLocal: dfs.client.read.shortcircuit
= true
14/05/30 14:08:49 DEBUG hdfs.BlockReaderLocal:
dfs.client.domain.socket.data.traffic = false
14/05/30 14:08:49 DEBUG hdfs.BlockReaderLocal: dfs.domain.socket.path =
/var/lib/hadoop-hdfs/dn_socket
14/05/30 14:08:49 DEBUG retry.RetryUtils: multipleLinearRandomRetry = null
14/05/30 14:08:49 DEBUG ipc.Client: getting client out of cache:
org.apache.hadoop.ipc.Client@69264eab
14/05/30 14:08:49 DEBUG unix.DomainSocketWatcher:
org.apache.hadoop.net.unix.DomainSocketWatcher$1@154e4e31: starting with
interruptCheckPeriodMs = 60000
14/05/30 14:08:49 DEBUG hdfs.BlockReaderLocal: The short-circuit local
reads feature is enabled.
14/05/30 14:08:49 DEBUG ipc.Client: The ping interval is 60000 ms.
14/05/30 14:08:49 DEBUG ipc.Client: Connecting to
master_server/master_IP:8020
14/05/30 14:08:49 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root: starting, having connections 1
14/05/30 14:08:49 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root sending #0
14/05/30 14:08:49 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root got value #0
14/05/30 14:08:49 DEBUG ipc.ProtobufRpcEngine: Call: versionRequest took
61ms
14/05/30 14:08:49 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root sending #1
14/05/30 14:08:49 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root got value #1
14/05/30 14:08:49 DEBUG ipc.ProtobufRpcEngine: Call: getBlockKeys took 1ms
14/05/30 14:08:49 DEBUG ipc.Client: The ping interval is 60000 ms.
14/05/30 14:08:49 DEBUG ipc.Client: Connecting to
master_server/master_IP:8020
14/05/30 14:08:49 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root sending #2
14/05/30 14:08:49 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root: starting, having connections 2
14/05/30 14:08:49 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root got value #2
14/05/30 14:08:49 DEBUG ipc.ProtobufRpcEngine: Call: getServerDefaults took
3ms
14/05/30 14:08:49 DEBUG hdfs.DFSClient: /system/balancer.id:
masked=rw-r--r--
14/05/30 14:08:49 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root sending #3
14/05/30 14:08:49 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root got value #3
14/05/30 14:08:49 DEBUG ipc.ProtobufRpcEngine: Call: create took 3ms
14/05/30 14:08:49 DEBUG hdfs.DFSClient: computePacketChunkSize: src=/system/
balancer.id, chunkSize=516, chunksPerPacket=127, packetSize=65532
14/05/30 14:08:49 DEBUG hdfs.LeaseRenewer: Lease renewer daemon for
[DFSClient_NONMAPREDUCE_1578292367_1] with renew id 1 started
14/05/30 14:08:49 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root sending #4
14/05/30 14:08:49 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root got value #4
14/05/30 14:08:49 DEBUG ipc.ProtobufRpcEngine: Call: getDatanodeReport took
2ms
14/05/30 14:08:49 INFO net.NetworkTopology: Adding a new node:
/rack-11/slave2_IP:50010
14/05/30 14:08:49 DEBUG net.NetworkTopology: NetworkTopology became:
Number of racks: 1
Expected number of leaves:1
/rack-11/slave2_IP:50010

14/05/30 14:08:49 INFO net.NetworkTopology: Adding a new node:
/rack-11/slave1_IP:50010
14/05/30 14:08:49 DEBUG net.NetworkTopology: NetworkTopology became:
Number of racks: 1
Expected number of leaves:2
/rack-11/slave2_IP:50010
/rack-11/slave1_IP:50010

14/05/30 14:08:49 INFO net.NetworkTopology: Adding a new node:
/rack-11/master_IP:50010
14/05/30 14:08:49 DEBUG net.NetworkTopology: NetworkTopology became:
Number of racks: 1
Expected number of leaves:3
/rack-11/slave2_IP:50010
/rack-11/slave1_IP:50010
/rack-11/master_IP:50010

14/05/30 14:08:49 INFO balancer.Balancer: 1 over-utilized:
[Source[slave1_IP:50010, utilization=9.776028568910695]]
14/05/30 14:08:49 INFO balancer.Balancer: 0 underutilized: []
14/05/30 14:08:49 INFO balancer.Balancer: Need to move 4.12 GB to make the
cluster balanced.
14/05/30 14:08:49 INFO balancer.Balancer: Decided to move 10 GB bytes from
slave1_IP:50010 to slave2_IP:50010
14/05/30 14:08:49 INFO balancer.Balancer: Will move 10 GB in this iteration
May 30, 2014 2:08:49 PM           0                  0 B             4.12
GB              10 GB
14/05/30 14:08:49 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root sending #5
14/05/30 14:08:49 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root got value #5
14/05/30 14:08:49 DEBUG ipc.ProtobufRpcEngine: Call: getBlocks took 23ms
14/05/30 14:08:50 DEBUG balancer.Balancer: Decided to move
blk_1073883436_142612 with size=1210409 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:08:50 DEBUG balancer.Balancer: Decided to move
blk_1073814562_73738 with size=428885 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:08:50 DEBUG balancer.Balancer: Start moving
blk_1073883436_142612 with size=1210409 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:08:50 DEBUG balancer.Balancer: Decided to move
blk_1073745873_5049 with size=182741 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:08:50 DEBUG balancer.Balancer: Start moving
blk_1073814562_73738 with size=428885 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:08:50 DEBUG balancer.Balancer: Decided to move
blk_1073814747_73923 with size=122828 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:08:50 DEBUG balancer.Balancer: Start moving blk_1073745873_5049
with size=182741 from slave1_IP:50010 to slave2_IP:50010 through
slave1_IP:50010
14/05/30 14:08:50 DEBUG balancer.Balancer: Decided to move
blk_1073883600_142776 with size=1974217 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:08:50 DEBUG balancer.Balancer: Start moving
blk_1073814747_73923 with size=122828 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:08:50 DEBUG balancer.Balancer: Start moving
blk_1073883600_142776 with size=1974217 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:08:51 INFO balancer.Balancer: Successfully moved
blk_1073883436_142612 with size=1210409 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:08:51 INFO balancer.Balancer: Successfully moved
blk_1073745873_5049 with size=182741 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:08:51 DEBUG balancer.Balancer: Decided to move
blk_1073883605_142781 with size=1960335 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:08:51 DEBUG balancer.Balancer: Decided to move
blk_1073814750_73926 with size=999 from slave1_IP:50010 to slave2_IP:50010
through slave1_IP:50010
14/05/30 14:08:51 DEBUG balancer.Balancer: Start moving
blk_1073883605_142781 with size=1960335 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:08:51 DEBUG balancer.Balancer: Start moving
blk_1073814750_73926 with size=999 from slave1_IP:50010 to slave2_IP:50010
through slave1_IP:50010
14/05/30 14:08:51 INFO balancer.Balancer: Successfully moved
blk_1073814750_73926 with size=999 from slave1_IP:50010 to slave2_IP:50010
through slave1_IP:50010
14/05/30 14:08:51 DEBUG balancer.Balancer: Decided to move
blk_1073814749_73925 with size=1366 from slave1_IP:50010 to slave2_IP:50010
through slave1_IP:50010
14/05/30 14:08:51 INFO balancer.Balancer: Successfully moved
blk_1073814747_73923 with size=122828 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:08:51 DEBUG balancer.Balancer: Decided to move
blk_1073814748_73924 with size=897 from slave1_IP:50010 to slave2_IP:50010
through slave1_IP:50010
14/05/30 14:08:51 DEBUG balancer.Balancer: Start moving
blk_1073814749_73925 with size=1366 from slave1_IP:50010 to slave2_IP:50010
through slave1_IP:50010
14/05/30 14:08:51 DEBUG balancer.Balancer: Start moving
blk_1073814748_73924 with size=897 from slave1_IP:50010 to slave2_IP:50010
through slave1_IP:50010
14/05/30 14:08:52 INFO balancer.Balancer: Successfully moved
blk_1073814749_73925 with size=1366 from slave1_IP:50010 to slave2_IP:50010
through slave1_IP:50010
14/05/30 14:08:52 DEBUG balancer.Balancer: Decided to move
blk_1073883585_142761 with size=1978004 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:08:52 INFO balancer.Balancer: Successfully moved
blk_1073814748_73924 with size=897 from slave1_IP:50010 to slave2_IP:50010
through slave1_IP:50010
14/05/30 14:08:52 DEBUG balancer.Balancer: Decided to move
blk_1073745871_5047 with size=30883 from slave1_IP:50010 to slave2_IP:50010
through slave1_IP:50010
14/05/30 14:08:52 INFO balancer.Balancer: Successfully moved
blk_1073814562_73738 with size=428885 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:08:52 DEBUG balancer.Balancer: Start moving
blk_1073883585_142761 with size=1978004 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:08:52 DEBUG balancer.Balancer: Decided to move
blk_1073883591_142767 with size=1972397 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:08:52 DEBUG balancer.Balancer: Start moving
blk_1073883591_142767 with size=1972397 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:08:52 DEBUG balancer.Balancer: Start moving blk_1073745871_5047
with size=30883 from slave1_IP:50010 to slave2_IP:50010 through
slave1_IP:50010
14/05/30 14:08:52 INFO balancer.Balancer: Successfully moved
blk_1073745871_5047 with size=30883 from slave1_IP:50010 to slave2_IP:50010
through slave1_IP:50010
14/05/30 14:08:57 INFO balancer.Balancer: Successfully moved
blk_1073883605_142781 with size=1960335 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:08:58 INFO balancer.Balancer: Successfully moved
blk_1073883600_142776 with size=1974217 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:08:59 INFO balancer.Balancer: Successfully moved
blk_1073883585_142761 with size=1978004 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:08:59 INFO balancer.Balancer: Successfully moved
blk_1073883591_142767 with size=1972397 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:08:59 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root: closed
14/05/30 14:08:59 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root: stopped, remaining connections 1
14/05/30 14:08:59 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root: closed
14/05/30 14:08:59 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root: stopped, remaining connections 0
14/05/30 14:09:19 DEBUG ipc.Client: The ping interval is 60000 ms.
14/05/30 14:09:19 DEBUG ipc.Client: Connecting to
master_server/master_IP:8020
14/05/30 14:09:19 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root: starting, having connections 1
14/05/30 14:09:19 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root sending #6
14/05/30 14:09:19 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root got value #6
14/05/30 14:09:19 DEBUG ipc.ProtobufRpcEngine: Call: renewLease took 4ms
14/05/30 14:09:19 DEBUG hdfs.LeaseRenewer: Lease renewed for client
DFSClient_NONMAPREDUCE_1578292367_1
14/05/30 14:09:19 DEBUG hdfs.LeaseRenewer: Lease renewer daemon for
[DFSClient_NONMAPREDUCE_1578292367_1] with renew id 1 executed
14/05/30 14:09:28 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root sending #7
14/05/30 14:09:28 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root got value #7
14/05/30 14:09:28 DEBUG ipc.ProtobufRpcEngine: Call: getDatanodeReport took
2ms
14/05/30 14:09:28 INFO net.NetworkTopology: Adding a new node:
/rack-11/master_IP:50010
14/05/30 14:09:28 DEBUG net.NetworkTopology: NetworkTopology became:
Number of racks: 1
Expected number of leaves:1
/rack-11/master_IP:50010

14/05/30 14:09:28 INFO net.NetworkTopology: Adding a new node:
/rack-11/slave1_IP:50010
14/05/30 14:09:28 DEBUG net.NetworkTopology: NetworkTopology became:
Number of racks: 1
Expected number of leaves:2
/rack-11/master_IP:50010
/rack-11/slave1_IP:50010

14/05/30 14:09:28 INFO net.NetworkTopology: Adding a new node:
/rack-11/slave2_IP:50010
14/05/30 14:09:28 DEBUG net.NetworkTopology: NetworkTopology became:
Number of racks: 1
Expected number of leaves:3
/rack-11/master_IP:50010
/rack-11/slave1_IP:50010
/rack-11/slave2_IP:50010

14/05/30 14:09:28 INFO balancer.Balancer: 1 over-utilized:
[Source[slave1_IP:50010, utilization=9.688501189056069]]
14/05/30 14:09:28 INFO balancer.Balancer: 0 underutilized: []
14/05/30 14:09:28 INFO balancer.Balancer: Need to move 3.93 GB to make the
cluster balanced.
14/05/30 14:09:28 INFO balancer.Balancer: Decided to move 10 GB bytes from
slave1_IP:50010 to slave2_IP:50010
14/05/30 14:09:28 INFO balancer.Balancer: Will move 10 GB in this iteration
May 30, 2014 2:09:28 PM           1                  0 B             3.93
GB              10 GB
14/05/30 14:09:28 DEBUG ipc.Client: The ping interval is 60000 ms.
14/05/30 14:09:28 DEBUG ipc.Client: Connecting to
master_server/master_IP:8020
14/05/30 14:09:28 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root: starting, having connections 2
14/05/30 14:09:28 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root sending #8
14/05/30 14:09:28 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root got value #8
14/05/30 14:09:28 DEBUG ipc.ProtobufRpcEngine: Call: getBlocks took 20ms
14/05/30 14:09:28 DEBUG balancer.Balancer: Decided to move
blk_1073874225_133401 with size=104628 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:09:28 DEBUG balancer.Balancer: Decided to move
blk_1073750846_10022 with size=1261380 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:09:28 DEBUG balancer.Balancer: Start moving
blk_1073874225_133401 with size=104628 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:09:28 DEBUG balancer.Balancer: Decided to move
blk_1073874226_133402 with size=123851 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:09:28 DEBUG balancer.Balancer: Start moving
blk_1073750846_10022 with size=1261380 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:09:28 DEBUG balancer.Balancer: Decided to move
blk_1073874227_133403 with size=118403 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:09:28 DEBUG balancer.Balancer: Start moving
blk_1073874226_133402 with size=123851 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:09:28 DEBUG balancer.Balancer: Decided to move
blk_1073820127_79303 with size=182087 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:09:28 DEBUG balancer.Balancer: Start moving
blk_1073874227_133403 with size=118403 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:09:28 DEBUG balancer.Balancer: Start moving
blk_1073820127_79303 with size=182087 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:09:28 INFO balancer.Balancer: Successfully moved
blk_1073874226_133402 with size=123851 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:09:28 INFO balancer.Balancer: Successfully moved
blk_1073874227_133403 with size=118403 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:09:28 DEBUG balancer.Balancer: Decided to move
blk_1073820126_79302 with size=42102 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:09:28 DEBUG balancer.Balancer: Decided to move
blk_1073820124_79300 with size=1460116 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:09:28 INFO balancer.Balancer: Successfully moved
blk_1073820127_79303 with size=182087 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:09:28 DEBUG balancer.Balancer: Start moving
blk_1073820126_79302 with size=42102 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:09:28 DEBUG balancer.Balancer: Start moving
blk_1073820124_79300 with size=1460116 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:09:28 INFO balancer.Balancer: Successfully moved
blk_1073874225_133401 with size=104628 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:09:28 DEBUG balancer.Balancer: Decided to move
blk_1073820118_79294 with size=1453105 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:09:28 INFO balancer.Balancer: Successfully moved
blk_1073820126_79302 with size=42102 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:09:28 DEBUG balancer.Balancer: Start moving
blk_1073820118_79294 with size=1453105 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:09:28 DEBUG balancer.Balancer: Decided to move
blk_1073820106_79282 with size=1445486 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:09:28 DEBUG balancer.Balancer: Decided to move
blk_1073874116_133292 with size=1381854 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:09:28 DEBUG balancer.Balancer: Start moving
blk_1073820106_79282 with size=1445486 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:09:28 DEBUG balancer.Balancer: Start moving
blk_1073874116_133292 with size=1381854 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:09:33 INFO balancer.Balancer: Successfully moved
blk_1073750846_10022 with size=1261380 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:09:35 INFO balancer.Balancer: Successfully moved
blk_1073820124_79300 with size=1460116 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:09:35 INFO balancer.Balancer: Successfully moved
blk_1073820118_79294 with size=1453105 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:09:35 INFO balancer.Balancer: Successfully moved
blk_1073874116_133292 with size=1381854 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:09:35 INFO balancer.Balancer: Successfully moved
blk_1073820106_79282 with size=1445486 from slave1_IP:50010 to
slave2_IP:50010 through slave1_IP:50010
14/05/30 14:09:38 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root: closed
14/05/30 14:09:38 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root: stopped, remaining connections 1
14/05/30 14:09:38 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root: closed
14/05/30 14:09:38 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root: stopped, remaining connections 0
14/05/30 14:09:49 DEBUG ipc.Client: The ping interval is 60000 ms.
14/05/30 14:09:49 DEBUG ipc.Client: Connecting to
master_server/master_IP:8020
14/05/30 14:09:49 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root sending #9
14/05/30 14:09:49 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root: starting, having connections 1
14/05/30 14:09:49 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root got value #9
14/05/30 14:09:49 DEBUG ipc.ProtobufRpcEngine: Call: renewLease took 3ms
14/05/30 14:09:49 DEBUG hdfs.LeaseRenewer: Lease renewed for client
DFSClient_NONMAPREDUCE_1578292367_1
14/05/30 14:09:49 DEBUG hdfs.LeaseRenewer: Lease renewer daemon for
[DFSClient_NONMAPREDUCE_1578292367_1] with renew id 1 executed
14/05/30 14:09:59 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root: closed
14/05/30 14:09:59 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root: stopped, remaining connections 0
14/05/30 14:10:07 DEBUG ipc.Client: The ping interval is 60000 ms.
14/05/30 14:10:07 DEBUG ipc.Client: Connecting to
master_server/master_IP:8020
14/05/30 14:10:07 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root: starting, having connections 1
14/05/30 14:10:07 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root sending #10
14/05/30 14:10:07 DEBUG ipc.Client: IPC Client (1763929301) connection to
master_server/master_IP:8020 from root got value #10
14/05/30 14:10:07 DEBUG ipc.ProtobufRpcEngine: Call: getDatanodeReport took
4ms
14/05/30 14:10:07 INFO net.NetworkTopology: Adding a new node:
/rack-11/slave2_IP:50010
14/05/30 14:10:07 DEBUG net.NetworkTopology: NetworkTopology became:
Number of racks: 1
Expected number of leaves:1
/rack-11/slave2_IP:50010

14/05/30 14:10:07 INFO net.NetworkTopology: Adding a new node:
/rack-11/slave1_IP:50010
14/05/30 14:10:07 DEBUG net.NetworkTopology: NetworkTopology became:
Number of racks: 1
Expected number of leaves:2
/rack-11/slave2_IP:50010
/rack-11/slave1_IP:50010

14/05/30 14:10:07 INFO net.NetworkTopology: Adding a new node:
/rack-11/master_IP:50010
14/05/30 14:10:07 DEBUG net.NetworkTopology: NetworkTopology became:
Number of racks: 1
Expected number of leaves:3
/rack-11/slave2_IP:50010
/rack-11/slave1_IP:50010
/rack-11/master_IP:50010

14/05/30 14:10:07 INFO balancer.Balancer: 1 over-utilized:
[Source[slave1_IP:50010, utilization=9.78960407336004]]
14/05/30 14:10:07 INFO balancer.Balancer: 0 underutilized: []
14/05/30 14:10:07 INFO balancer.Balancer: Need to move 4.17 GB to make the
cluster balanced.
14/05/30 14:10:07 INFO balancer.Balancer: Decided to move 10 GB bytes from
slave1_IP:50010 to slave2_IP:50010
14/05/30 14:10:07 INFO balancer.Balancer: Will move 10 GB in this iteration
May 30, 2014 2:10:07 PM           2                  0 B             4.17
GB              10 GB

.... and so on.