Posted to dev@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2009/11/12 06:17:39 UTC

[jira] Created: (HBASE-1973) Second RS goes down because "Bad connect ack with firstBadLink as..."

Second RS goes down because "Bad connect ack with firstBadLink as..."
---------------------------------------------------------------------

                 Key: HBASE-1973
                 URL: https://issues.apache.org/jira/browse/HBASE-1973
             Project: Hadoop HBase
          Issue Type: Bug
            Reporter: stack


Placeholder for an issue where a second RS goes down when I kill another RS and its DN.  I asked over on hdfs-user why we're stuck on .140 (the killed DN's address): why can't the client move on to another DN when creating a block?

{code}
2009-11-12 05:09:38,624 [regionserver/208.76.44.141:60020.logRoller] INFO org.apache.hadoop.hbase.regionserver.wal.HLog: Roll /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002405346, entries=52747, calcsize=63757973, filesize=58588977. New hlog /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
2009-11-12 05:09:38,628 [regionserver/208.76.44.141:60020.logRoller] DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Found 0 hlogs to remove  out of total 1; oldest outstanding seqnum is 1 from region .META.,,1
2009-11-12 05:09:38,628 [IPC Server handler 16 on 60020] INFO org.apache.hadoop.hbase.regionserver.wal.HLog: edit=0, write=TestTable/TestTable,,1258002466308/612228
2009-11-12 05:09:38,630 [regionserver/208.76.44.141:60020.logSyncer] INFO org.apache.hadoop.hbase.regionserver.wal.HLog: sync
2009-11-12 05:09:38,646 [Thread-61] INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
2009-11-12 05:09:38,646 [Thread-61] INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_1786802275426334496_1076
2009-11-12 05:09:44,662 [Thread-61] INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
2009-11-12 05:09:44,662 [Thread-61] INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_5838454320666652488_1076
2009-11-12 05:09:46,226 [pool-1-thread-1] DEBUG org.apache.hadoop.hbase.io.hfile.LruBlockCache: Cache Stats: Sizes: Total=5.5656967MB (5836056), Free=672.77185MB (705452392), Max=678.3375MB (711288448), Counts: Blocks=0, Access=0, Hit=0, Miss=0, Evictions=0, Evicted=0, Ratios: Hit Ratio=NaN%, Miss Ratio=NaN%, Evicted/Run=NaN
2009-11-12 05:09:50,683 [Thread-61] INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
2009-11-12 05:09:50,684 [Thread-61] INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_8615264034947520229_1076
2009-11-12 05:09:56,691 [Thread-61] INFO org.apache.hadoop.hdfs.DFSClient: Exception in createBlockOutputStream java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
2009-11-12 05:09:56,692 [Thread-61] INFO org.apache.hadoop.hdfs.DFSClient: Abandoning block blk_-5632723632070749366_1077
2009-11-12 05:10:02,696 [Thread-61] WARN org.apache.hadoop.hdfs.DFSClient: DataStreamer Exception: java.io.IOException: Unable to create new block.
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSClient.java:3100)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2681)

2009-11-12 05:10:02,697 [Thread-61] WARN org.apache.hadoop.hdfs.DFSClient: Could not get block locations. Source file "/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617" - Aborting...
2009-11-12 05:10:02,699 [regionserver/208.76.44.141:60020.logSyncer] FATAL org.apache.hadoop.hbase.regionserver.wal.HLog: Could not append. Requesting close of hlog
java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.createBlockOutputStream(DFSClient.java:3160)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSClient.java:3080)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2681)
2009-11-12 05:10:02,701 [regionserver/208.76.44.141:60020.logSyncer] ERROR org.apache.hadoop.hbase.regionserver.wal.HLog: Error while syncing, requesting close of hlog 
java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.createBlockOutputStream(DFSClient.java:3160)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSClient.java:3080)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:2681)
2009-11-12 05:10:02,703 [IPC Server handler 20 on 60020] FATAL org.apache.hadoop.hbase.regionserver.wal.HLog: Could not append. Requesting close of hlog
java.io.IOException: IOException flush:java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.hflush(DFSClient.java:3527)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.sync(DFSClient.java:3473)
        at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:97)
        at org.apache.hadoop.hbase.regionserver.wal.HLog.hflush(HLog.java:829)
        at org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.run(HLog.java:751)
2009-11-12 05:10:02,703 [regionserver/208.76.44.141:60020.logSyncer] INFO org.apache.hadoop.hbase.regionserver.wal.HLog: regionserver/208.76.44.141:60020.logSyncer exiting
2009-11-12 05:10:02,706 [IPC Server handler 10 on 60020] FATAL org.apache.hadoop.hbase.regionserver.wal.HLog: Could not append. Requesting close of hlog
java.io.IOException: IOException flush:java.io.IOException: Bad connect ack with firstBadLink as 208.76.44.140:51010
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.hflush(DFSClient.java:3527)
        at org.apache.hadoop.hdfs.DFSClient$DFSOutputStream.sync(DFSClient.java:3473)
        at org.apache.hadoop.fs.FSDataOutputStream.sync(FSDataOutputStream.java:97)
        at org.apache.hadoop.hbase.regionserver.wal.HLog.hflush(HLog.java:829)
        at org.apache.hadoop.hbase.regionserver.wal.HLog$LogSyncer.run(HLog.java:751)
{code}
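
What the log shows is the DFSClient's block-write retry loop: fail the pipeline handshake against .140, abandon the block, ask the NameNode for a fresh one, and give up once the retries are spent.  The client never tells the NameNode which datanode failed, so every new pipeline can include the dead node again.  Below is a minimal, self-contained sketch of that loop; the names are illustrative, not the real DFSClient internals, and the retry cap stands in for dfs.client.block.write.retries (default 3), which matches the four abandoned blocks above.

{code}
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class RetryWithoutExclusionSketch {

  // Stand-in for the NameNode allocator: nothing tells it .140 is dead,
  // so every attempt can get a pipeline containing the dead node.
  static List<String> allocateBlock() {
    return Arrays.asList("208.76.44.141:51010",
                         "208.76.44.139:51010",
                         "208.76.44.140:51010"); // the killed DN
  }

  // Stand-in for createBlockOutputStream(): the dead node never acks.
  static void connectPipeline(List<String> pipeline) throws IOException {
    for (String dn : pipeline) {
      if (dn.startsWith("208.76.44.140")) {
        throw new IOException("Bad connect ack with firstBadLink as " + dn);
      }
    }
  }

  public static void main(String[] args) throws InterruptedException {
    final int maxRetries = 3; // stands in for dfs.client.block.write.retries
    for (int attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        connectPipeline(allocateBlock());
        System.out.println("Pipeline established");
        return;
      } catch (IOException e) {
        // Matches the log: abandon the block, pause, retry -- doomed,
        // because the next pipeline will contain .140 too.
        System.out.println("Abandoning block: " + e.getMessage());
        Thread.sleep(100);
      }
    }
    System.out.println("Unable to create new block."); // what took the RS down
  }
}
{code}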

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (HBASE-1973) Second RS goes down because "Bad connect ack with firstBadLink as..."

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-1973?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-1973.
--------------------------

    Resolution: Fixed

Applying HDFS-630 fixes this issue.  Resolving.
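
Roughly, HDFS-630 has the client hand the datanodes it failed against back to the NameNode when it asks for the next block, so the allocator can pick a healthy replacement instead of re-offering the dead node.  A hedged sketch of that idea follows; the names and the allocateBlock signature are illustrative, not the patched ClientProtocol API.

{code}
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RetryWithExclusionSketch {

  // Exclusion-aware stand-in for the NameNode allocator.
  static List<String> allocateBlock(List<String> excludedNodes) {
    List<String> pipeline = new ArrayList<String>(Arrays.asList(
        "208.76.44.141:51010", "208.76.44.139:51010", "208.76.44.140:51010"));
    pipeline.removeAll(excludedNodes);
    return pipeline;
  }

  static void connectPipeline(List<String> pipeline) throws IOException {
    for (String dn : pipeline) {
      if (dn.startsWith("208.76.44.140")) { // the dead DN
        throw new IOException("Bad connect ack with firstBadLink as " + dn);
      }
    }
  }

  public static void main(String[] args) {
    List<String> excludedNodes = new ArrayList<String>();
    for (int attempt = 0; attempt < 4; attempt++) {
      try {
        connectPipeline(allocateBlock(excludedNodes));
        System.out.println("Pipeline established, excluded " + excludedNodes);
        return; // now succeeds on the second attempt
      } catch (IOException e) {
        // Remember the firstBadLink so the allocator skips it next time.
        String msg = e.getMessage();
        excludedNodes.add(msg.substring(msg.lastIndexOf(' ') + 1));
      }
    }
    System.out.println("Unable to create new block.");
  }
}
{code}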

[jira] Commented: (HBASE-1973) Second RS goes down because "Bad connect ack with firstBadLink as..."

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-1973?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12776855#action_12776855 ] 

stack commented on HBASE-1973:
------------------------------

The weird thing is that, looking at the NN, the create eventually seems to succeed after we've given up on it:

{code}
[stack@aa0-000-12 logs]$ grep '/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617' hadoop-stack-namenode-aa0-000-12.u.powerset.com.log 
2009-11-12 05:09:38,618 DEBUG org.apache.hadoop.hdfs.StateChange: *DIR* NameNode.create: file /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617 for DFSClient_276778250 at 208.76.44.141
2009-11-12 05:09:38,618 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.startFile: src=/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617, holder=DFSClient_276778250, clientMachine=208.76.44.141, createParent=true, replication=3, overwrite=true, append=false
2009-11-12 05:09:38,619 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* FSDirectory.addFile: /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617 is added to the file system
2009-11-12 05:09:38,619 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.startFile: add /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617 to namespace for DFSClient_276778250
2009-11-12 05:09:38,620 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=stack,powerset,engineering  ip=/208.76.44.141       cmd=create      src=/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617   dst=null        perm=stack:supergroup:rw-r--r--
2009-11-12 05:09:38,630 DEBUG org.apache.hadoop.hdfs.StateChange: *BLOCK* NameNode.addBlock: file /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617 for DFSClient_276778250
2009-11-12 05:09:38,630 DEBUG org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.getAdditionalBlock: file /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617 for DFSClient_276778250
2009-11-12 05:09:38,631 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* FSDirectory.addFile: /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617 with blk_1786802275426334496_1076 block is added to the in-memory file system
2009-11-12 05:09:38,631 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.allocateBlock: /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617. blk_1786802275426334496_1076{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[208.76.44.141:51010|RBW], ReplicaUnderConstruction[208.76.44.139:51010|RBW], ReplicaUnderConstruction[208.76.44.140:51010|RBW]]}
2009-11-12 05:09:38,646 DEBUG org.apache.hadoop.hdfs.StateChange: *BLOCK* NameNode.abandonBlock: blk_1786802275426334496_1076 of file /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
2009-11-12 05:09:38,646 DEBUG org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.abandonBlock: blk_1786802275426334496_1076of file /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
2009-11-12 05:09:38,646 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* FSDirectory.addFile: /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617 with blk_1786802275426334496_1076 block is added to the file system
2009-11-12 05:09:44,655 DEBUG org.apache.hadoop.hdfs.StateChange: *BLOCK* NameNode.addBlock: file /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617 for DFSClient_276778250
2009-11-12 05:09:44,655 DEBUG org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.getAdditionalBlock: file /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617 for DFSClient_276778250
2009-11-12 05:09:44,656 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* FSDirectory.addFile: /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617 with blk_5838454320666652488_1076 block is added to the in-memory file system
2009-11-12 05:09:44,656 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.allocateBlock: /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617. blk_5838454320666652488_1076{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[208.76.44.141:51010|RBW], ReplicaUnderConstruction[208.76.44.140:51010|RBW], ReplicaUnderConstruction[208.76.44.139:51010|RBW]]}
2009-11-12 05:09:44,662 DEBUG org.apache.hadoop.hdfs.StateChange: *BLOCK* NameNode.abandonBlock: blk_5838454320666652488_1076 of file /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
2009-11-12 05:09:44,662 DEBUG org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.abandonBlock: blk_5838454320666652488_1076of file /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
2009-11-12 05:09:44,662 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* FSDirectory.addFile: /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617 with blk_5838454320666652488_1076 block is added to the file system
2009-11-12 05:09:50,675 DEBUG org.apache.hadoop.hdfs.StateChange: *BLOCK* NameNode.addBlock: file /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617 for DFSClient_276778250
2009-11-12 05:09:50,675 DEBUG org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.getAdditionalBlock: file /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617 for DFSClient_276778250
2009-11-12 05:09:50,676 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* FSDirectory.addFile: /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617 with blk_8615264034947520229_1076 block is added to the in-memory file system
2009-11-12 05:09:50,676 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.allocateBlock: /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617. blk_8615264034947520229_1076{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[208.76.44.141:51010|RBW], ReplicaUnderConstruction[208.76.44.139:51010|RBW], ReplicaUnderConstruction[208.76.44.140:51010|RBW]]}
2009-11-12 05:09:50,683 DEBUG org.apache.hadoop.hdfs.StateChange: *BLOCK* NameNode.abandonBlock: blk_8615264034947520229_1076 of file /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
2009-11-12 05:09:50,683 DEBUG org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.abandonBlock: blk_8615264034947520229_1076of file /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
2009-11-12 05:09:50,683 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* FSDirectory.addFile: /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617 with blk_8615264034947520229_1076 block is added to the file system
2009-11-12 05:09:56,685 DEBUG org.apache.hadoop.hdfs.StateChange: *BLOCK* NameNode.addBlock: file /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617 for DFSClient_276778250
2009-11-12 05:09:56,685 DEBUG org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.getAdditionalBlock: file /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617 for DFSClient_276778250
2009-11-12 05:09:56,686 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* FSDirectory.addFile: /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617 with blk_-5632723632070749366_1077 block is added to the in-memory file system
2009-11-12 05:09:56,686 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.allocateBlock: /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617. blk_-5632723632070749366_1077{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[ReplicaUnderConstruction[208.76.44.141:51010|RBW], ReplicaUnderConstruction[208.76.44.140:51010|RBW], ReplicaUnderConstruction[208.76.44.139:51010|RBW]]}
2009-11-12 05:09:56,691 DEBUG org.apache.hadoop.hdfs.StateChange: *BLOCK* NameNode.abandonBlock: blk_-5632723632070749366_1077 of file /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
2009-11-12 05:09:56,691 DEBUG org.apache.hadoop.hdfs.StateChange: BLOCK* NameSystem.abandonBlock: blk_-5632723632070749366_1077of file /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
2009-11-12 05:09:56,691 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* FSDirectory.addFile: /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617 with blk_-5632723632070749366_1077 block is added to the file system

{code}

It's failing above while opening the file.

Below is the open by the master doing its split:

{code}

2009-11-12 05:10:10,924 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=stack,powerset,engineering  ip=/208.76.44.139       cmd=open        src=/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617   dst=null        perm=null
2009-11-12 05:10:10,925 DEBUG org.apache.hadoop.hdfs.StateChange: *DIR* Namenode.delete: src=/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617, recursive=true
2009-11-12 05:10:10,926 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* NameSystem.delete: /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
2009-11-12 05:10:10,926 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* FSDirectory.delete: /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
2009-11-12 05:10:10,926 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* FSDirectory.unprotectedDelete: /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617 is removed
2009-11-12 05:10:10,926 DEBUG org.apache.hadoop.hdfs.server.namenode.LeaseManager: LeaseManager.findLease: prefix=/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617
2009-11-12 05:10:10,926 DEBUG org.apache.hadoop.hdfs.server.namenode.LeaseManager: LeaseManager.removeLeaseWithPrefixPath: entry=/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617=[Lease.  Holder: DFSClient_276778250, pendingcreates: 1]
2009-11-12 05:10:10,926 DEBUG org.apache.hadoop.hdfs.StateChange: DIR* Namesystem.delete: /hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617 is removed
2009-11-12 05:10:10,927 INFO org.apache.hadoop.hdfs.server.namenode.FSNamesystem.audit: ugi=stack,powerset,engineering  ip=/208.76.44.139       cmd=delete      src=/hbase/.logs/aa0-000-14.u.powerset.com,60020,1258002404935/hlog.dat.1258002578617   dst=null        perm=null

{code}

The above open on the master actually throws an EOFException: the file length is seen as zero.  We don't call available() on the backing stream to get the length HDFS sees.
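
A sketch of the workaround being hinted at; the probeLength helper is hypothetical, not HBase code.  When the NameNode still reports length zero for a WAL whose last block is under construction, available() on the opened stream reflects the bytes the datanodes will actually serve, which is a better answer than FileStatus.getLen().

{code}
import java.io.IOException;

import org.apache.hadoop.fs.FSDataInputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class WalLengthProbe {

  // Hypothetical helper: best-effort length of a possibly-unclosed WAL.
  static long probeLength(FileSystem fs, Path wal) throws IOException {
    long namenodeLength = fs.getFileStatus(wal).getLen();
    if (namenodeLength > 0) {
      return namenodeLength; // easy case: the file was closed cleanly
    }
    // NameNode says zero: ask the backing stream what is actually readable.
    FSDataInputStream in = fs.open(wal);
    try {
      return in.available();
    } finally {
      in.close();
    }
  }
}
{code}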
