You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@hbase.apache.org by "stack (JIRA)" <ji...@apache.org> on 2008/10/15 22:58:46 UTC

[jira] Created: (HBASE-930) RegionServer stuck: HLog: Could not append. Requesting close of log java.io.IOException: Could not get block locations. Aborting...

RegionServer stuck: HLog: Could not append. Requesting close of log java.io.IOException: Could not get block locations. Aborting...
-----------------------------------------------------------------------------------------------------------------------------------

                 Key: HBASE-930
                 URL: https://issues.apache.org/jira/browse/HBASE-930
             Project: Hadoop HBase
          Issue Type: Bug
            Reporter: stack


HDFS went wonky.  Manifest itself in regionserver with below: i.e. first can't replicate and then the exceptions every time we append and then every time we try to rotate the log file.   Exception is always: "IOException: Could not get block locations. Aborting..."  Meantime the HDFS is fixed but for whatever reason the HRS just keeps on with the below failings; as though the error is stuck in the DFSClient.

{code}
...
2008-10-15 02:54:59,867 DEBUG org.apache.hadoop.hbase.regionserver.HStore: Completed compaction of 1399814750/alternate_url store size is 2.9m
2008-10-15 02:54:59,868 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on region enwiki_old,7qE8TUI5v-NIZu_9oZhaJF==,1221856888271 in 38sec
2008-10-15 02:59:08,507 INFO org.apache.hadoop.dfs.DFSClient: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /hbase/aa0-000-8.u.powerset.com/log_208.76.45.180_1223616861290_60020/hlog.dat.1224038569764 could only be re
plicated to 0 nodes, instead of 1
        at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1117)
        at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
        at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)

        at org.apache.hadoop.ipc.Client.call(Client.java:715)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
        at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
        at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2440)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2323)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1800(DFSClient.java:1735)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1912)

...
2008-10-15 02:59:11,357 WARN org.apache.hadoop.dfs.DFSClient: NotReplicatedYetException sleeping /hbase/aa0-000-8.u.powerset.com/log_208.76.45.180_1223616861290_60020/hlog.dat.1224038569764 retries left 1
2008-10-15 02:59:14,566 WARN org.apache.hadoop.dfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /hbase/aa0-000-8.u.powerset.com/log_208.76.45.180_1223616861290_60020/hlog.dat.122403
8569764 could only be replicated to 0 nodes, instead of 1
        at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1117)
        at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
        at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)

        at org.apache.hadoop.ipc.Client.call(Client.java:715)
        at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
        at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
        at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
        at java.lang.reflect.Method.invoke(Unknown Source)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
        at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
        at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2440)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2323)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1800(DFSClient.java:1735)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1912)

2008-10-15 02:59:14,566 WARN org.apache.hadoop.dfs.DFSClient: Error Recovery for block null bad datanode[0]
2008-10-15 02:59:14,566 FATAL org.apache.hadoop.hbase.regionserver.HLog: Could not append. Requesting close of log
java.io.IOException: Could not get block locations. Aborting...
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2143)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1735)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1889)
2008-10-15 02:59:14,567 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 81
2008-10-15 02:59:14,567 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@3e4afee5, row => Dnsw4RukHJ5-0FQddY35mF==, {column => misc:upload_time, value => '...', column => page:mime, value => '...', c
olumn => page:content, value => '...', column => page:url, value => '...'}, -1) from 208.76.44.183:59031: error: java.io.IOException: Could not get block locations. Aborting...
java.io.IOException: Could not get block locations. Aborting...
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2143)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1735)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1889)
2008-10-15 02:59:14,588 ERROR org.apache.hadoop.hbase.regionserver.LogRoller: Log rolling failed
java.io.IOException: Could not get block locations. Aborting...
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2143)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1735)
        at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1889)
...
{code}

For now adding abort of regionserver if we can't close the log roller file.  If we can't close, then we're losing edits.  If this behavior causes us close to much, then need to dig in more.

Happened on pset cluster running 0.18.1 hadoop and 0.18.0 hbase.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Updated: (HBASE-930) RegionServer stuck: HLog: Could not append. Requesting close of log java.io.IOException: Could not get block locations. Aborting...

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-930:
------------------------

    Attachment: 930.patch

Patch that shutsdown regionserver when we fail close of write-ahead-log.

> RegionServer stuck: HLog: Could not append. Requesting close of log java.io.IOException: Could not get block locations. Aborting...
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-930
>                 URL: https://issues.apache.org/jira/browse/HBASE-930
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>         Attachments: 930.patch
>
>
> HDFS went wonky.  Manifest itself in regionserver with below: i.e. first can't replicate and then the exceptions every time we append and then every time we try to rotate the log file.   Exception is always: "IOException: Could not get block locations. Aborting..."  Meantime the HDFS is fixed but for whatever reason the HRS just keeps on with the below failings; as though the error is stuck in the DFSClient.
> {code}
> ...
> 2008-10-15 02:54:59,867 DEBUG org.apache.hadoop.hbase.regionserver.HStore: Completed compaction of 1399814750/alternate_url store size is 2.9m
> 2008-10-15 02:54:59,868 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on region enwiki_old,7qE8TUI5v-NIZu_9oZhaJF==,1221856888271 in 38sec
> 2008-10-15 02:59:08,507 INFO org.apache.hadoop.dfs.DFSClient: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /hbase/aa0-000-8.u.powerset.com/log_208.76.45.180_1223616861290_60020/hlog.dat.1224038569764 could only be re
> plicated to 0 nodes, instead of 1
>         at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1117)
>         at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
>         at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>         at java.lang.reflect.Method.invoke(Unknown Source)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)
>         at org.apache.hadoop.ipc.Client.call(Client.java:715)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>         at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
>         at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>         at java.lang.reflect.Method.invoke(Unknown Source)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>         at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2440)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2323)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1800(DFSClient.java:1735)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1912)
> ...
> 2008-10-15 02:59:11,357 WARN org.apache.hadoop.dfs.DFSClient: NotReplicatedYetException sleeping /hbase/aa0-000-8.u.powerset.com/log_208.76.45.180_1223616861290_60020/hlog.dat.1224038569764 retries left 1
> 2008-10-15 02:59:14,566 WARN org.apache.hadoop.dfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /hbase/aa0-000-8.u.powerset.com/log_208.76.45.180_1223616861290_60020/hlog.dat.122403
> 8569764 could only be replicated to 0 nodes, instead of 1
>         at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1117)
>         at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
>         at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>         at java.lang.reflect.Method.invoke(Unknown Source)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)
>         at org.apache.hadoop.ipc.Client.call(Client.java:715)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>         at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
>         at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>         at java.lang.reflect.Method.invoke(Unknown Source)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>         at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2440)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2323)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1800(DFSClient.java:1735)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1912)
> 2008-10-15 02:59:14,566 WARN org.apache.hadoop.dfs.DFSClient: Error Recovery for block null bad datanode[0]
> 2008-10-15 02:59:14,566 FATAL org.apache.hadoop.hbase.regionserver.HLog: Could not append. Requesting close of log
> java.io.IOException: Could not get block locations. Aborting...
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2143)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1735)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1889)
> 2008-10-15 02:59:14,567 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 81
> 2008-10-15 02:59:14,567 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@3e4afee5, row => Dnsw4RukHJ5-0FQddY35mF==, {column => misc:upload_time, value => '...', column => page:mime, value => '...', c
> olumn => page:content, value => '...', column => page:url, value => '...'}, -1) from 208.76.44.183:59031: error: java.io.IOException: Could not get block locations. Aborting...
> java.io.IOException: Could not get block locations. Aborting...
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2143)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1735)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1889)
> 2008-10-15 02:59:14,588 ERROR org.apache.hadoop.hbase.regionserver.LogRoller: Log rolling failed
> java.io.IOException: Could not get block locations. Aborting...
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2143)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1735)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1889)
> ...
> {code}
> For now adding abort of regionserver if we can't close the log roller file.  If we can't close, then we're losing edits.  If this behavior causes us close to much, then need to dig in more.
> Happened on pset cluster running 0.18.1 hadoop and 0.18.0 hbase.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Resolved: (HBASE-930) RegionServer stuck: HLog: Could not append. Requesting close of log java.io.IOException: Could not get block locations. Aborting...

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack resolved HBASE-930.
-------------------------

       Resolution: Fixed
    Fix Version/s: 0.19.0
                   0.18.1

Applied branch and trunk

> RegionServer stuck: HLog: Could not append. Requesting close of log java.io.IOException: Could not get block locations. Aborting...
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-930
>                 URL: https://issues.apache.org/jira/browse/HBASE-930
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>             Fix For: 0.18.1, 0.19.0
>
>         Attachments: 930.patch
>
>
> HDFS went wonky.  Manifest itself in regionserver with below: i.e. first can't replicate and then the exceptions every time we append and then every time we try to rotate the log file.   Exception is always: "IOException: Could not get block locations. Aborting..."  Meantime the HDFS is fixed but for whatever reason the HRS just keeps on with the below failings; as though the error is stuck in the DFSClient.
> {code}
> ...
> 2008-10-15 02:54:59,867 DEBUG org.apache.hadoop.hbase.regionserver.HStore: Completed compaction of 1399814750/alternate_url store size is 2.9m
> 2008-10-15 02:54:59,868 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on region enwiki_old,7qE8TUI5v-NIZu_9oZhaJF==,1221856888271 in 38sec
> 2008-10-15 02:59:08,507 INFO org.apache.hadoop.dfs.DFSClient: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /hbase/aa0-000-8.u.powerset.com/log_208.76.45.180_1223616861290_60020/hlog.dat.1224038569764 could only be re
> plicated to 0 nodes, instead of 1
>         at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1117)
>         at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
>         at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>         at java.lang.reflect.Method.invoke(Unknown Source)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)
>         at org.apache.hadoop.ipc.Client.call(Client.java:715)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>         at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
>         at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>         at java.lang.reflect.Method.invoke(Unknown Source)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>         at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2440)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2323)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1800(DFSClient.java:1735)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1912)
> ...
> 2008-10-15 02:59:11,357 WARN org.apache.hadoop.dfs.DFSClient: NotReplicatedYetException sleeping /hbase/aa0-000-8.u.powerset.com/log_208.76.45.180_1223616861290_60020/hlog.dat.1224038569764 retries left 1
> 2008-10-15 02:59:14,566 WARN org.apache.hadoop.dfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /hbase/aa0-000-8.u.powerset.com/log_208.76.45.180_1223616861290_60020/hlog.dat.122403
> 8569764 could only be replicated to 0 nodes, instead of 1
>         at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1117)
>         at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
>         at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>         at java.lang.reflect.Method.invoke(Unknown Source)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)
>         at org.apache.hadoop.ipc.Client.call(Client.java:715)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>         at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
>         at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>         at java.lang.reflect.Method.invoke(Unknown Source)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>         at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2440)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2323)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1800(DFSClient.java:1735)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1912)
> 2008-10-15 02:59:14,566 WARN org.apache.hadoop.dfs.DFSClient: Error Recovery for block null bad datanode[0]
> 2008-10-15 02:59:14,566 FATAL org.apache.hadoop.hbase.regionserver.HLog: Could not append. Requesting close of log
> java.io.IOException: Could not get block locations. Aborting...
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2143)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1735)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1889)
> 2008-10-15 02:59:14,567 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 81
> 2008-10-15 02:59:14,567 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@3e4afee5, row => Dnsw4RukHJ5-0FQddY35mF==, {column => misc:upload_time, value => '...', column => page:mime, value => '...', c
> olumn => page:content, value => '...', column => page:url, value => '...'}, -1) from 208.76.44.183:59031: error: java.io.IOException: Could not get block locations. Aborting...
> java.io.IOException: Could not get block locations. Aborting...
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2143)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1735)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1889)
> 2008-10-15 02:59:14,588 ERROR org.apache.hadoop.hbase.regionserver.LogRoller: Log rolling failed
> java.io.IOException: Could not get block locations. Aborting...
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2143)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1735)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1889)
> ...
> {code}
> For now adding abort of regionserver if we can't close the log roller file.  If we can't close, then we're losing edits.  If this behavior causes us close to much, then need to dig in more.
> Happened on pset cluster running 0.18.1 hadoop and 0.18.0 hbase.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Commented: (HBASE-930) RegionServer stuck: HLog: Could not append. Requesting close of log java.io.IOException: Could not get block locations. Aborting...

Posted by "stack (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-930?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12639971#action_12639971 ] 

stack commented on HBASE-930:
-----------------------------

The above logging goes on for hours until its noticed by admin.

> RegionServer stuck: HLog: Could not append. Requesting close of log java.io.IOException: Could not get block locations. Aborting...
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-930
>                 URL: https://issues.apache.org/jira/browse/HBASE-930
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>
> HDFS went wonky.  Manifest itself in regionserver with below: i.e. first can't replicate and then the exceptions every time we append and then every time we try to rotate the log file.   Exception is always: "IOException: Could not get block locations. Aborting..."  Meantime the HDFS is fixed but for whatever reason the HRS just keeps on with the below failings; as though the error is stuck in the DFSClient.
> {code}
> ...
> 2008-10-15 02:54:59,867 DEBUG org.apache.hadoop.hbase.regionserver.HStore: Completed compaction of 1399814750/alternate_url store size is 2.9m
> 2008-10-15 02:54:59,868 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on region enwiki_old,7qE8TUI5v-NIZu_9oZhaJF==,1221856888271 in 38sec
> 2008-10-15 02:59:08,507 INFO org.apache.hadoop.dfs.DFSClient: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /hbase/aa0-000-8.u.powerset.com/log_208.76.45.180_1223616861290_60020/hlog.dat.1224038569764 could only be re
> plicated to 0 nodes, instead of 1
>         at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1117)
>         at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
>         at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>         at java.lang.reflect.Method.invoke(Unknown Source)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)
>         at org.apache.hadoop.ipc.Client.call(Client.java:715)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>         at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
>         at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>         at java.lang.reflect.Method.invoke(Unknown Source)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>         at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2440)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2323)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1800(DFSClient.java:1735)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1912)
> ...
> 2008-10-15 02:59:11,357 WARN org.apache.hadoop.dfs.DFSClient: NotReplicatedYetException sleeping /hbase/aa0-000-8.u.powerset.com/log_208.76.45.180_1223616861290_60020/hlog.dat.1224038569764 retries left 1
> 2008-10-15 02:59:14,566 WARN org.apache.hadoop.dfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /hbase/aa0-000-8.u.powerset.com/log_208.76.45.180_1223616861290_60020/hlog.dat.122403
> 8569764 could only be replicated to 0 nodes, instead of 1
>         at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1117)
>         at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
>         at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>         at java.lang.reflect.Method.invoke(Unknown Source)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)
>         at org.apache.hadoop.ipc.Client.call(Client.java:715)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>         at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
>         at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>         at java.lang.reflect.Method.invoke(Unknown Source)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>         at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2440)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2323)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1800(DFSClient.java:1735)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1912)
> 2008-10-15 02:59:14,566 WARN org.apache.hadoop.dfs.DFSClient: Error Recovery for block null bad datanode[0]
> 2008-10-15 02:59:14,566 FATAL org.apache.hadoop.hbase.regionserver.HLog: Could not append. Requesting close of log
> java.io.IOException: Could not get block locations. Aborting...
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2143)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1735)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1889)
> 2008-10-15 02:59:14,567 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 81
> 2008-10-15 02:59:14,567 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@3e4afee5, row => Dnsw4RukHJ5-0FQddY35mF==, {column => misc:upload_time, value => '...', column => page:mime, value => '...', c
> olumn => page:content, value => '...', column => page:url, value => '...'}, -1) from 208.76.44.183:59031: error: java.io.IOException: Could not get block locations. Aborting...
> java.io.IOException: Could not get block locations. Aborting...
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2143)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1735)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1889)
> 2008-10-15 02:59:14,588 ERROR org.apache.hadoop.hbase.regionserver.LogRoller: Log rolling failed
> java.io.IOException: Could not get block locations. Aborting...
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2143)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1735)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1889)
> ...
> {code}
> For now adding abort of regionserver if we can't close the log roller file.  If we can't close, then we're losing edits.  If this behavior causes us close to much, then need to dig in more.
> Happened on pset cluster running 0.18.1 hadoop and 0.18.0 hbase.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


[jira] Assigned: (HBASE-930) RegionServer stuck: HLog: Could not append. Requesting close of log java.io.IOException: Could not get block locations. Aborting...

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-930?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack reassigned HBASE-930:
---------------------------

    Assignee: stack

> RegionServer stuck: HLog: Could not append. Requesting close of log java.io.IOException: Could not get block locations. Aborting...
> -----------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HBASE-930
>                 URL: https://issues.apache.org/jira/browse/HBASE-930
>             Project: Hadoop HBase
>          Issue Type: Bug
>            Reporter: stack
>            Assignee: stack
>             Fix For: 0.18.1, 0.19.0
>
>         Attachments: 930.patch
>
>
> HDFS went wonky.  Manifest itself in regionserver with below: i.e. first can't replicate and then the exceptions every time we append and then every time we try to rotate the log file.   Exception is always: "IOException: Could not get block locations. Aborting..."  Meantime the HDFS is fixed but for whatever reason the HRS just keeps on with the below failings; as though the error is stuck in the DFSClient.
> {code}
> ...
> 2008-10-15 02:54:59,867 DEBUG org.apache.hadoop.hbase.regionserver.HStore: Completed compaction of 1399814750/alternate_url store size is 2.9m
> 2008-10-15 02:54:59,868 INFO org.apache.hadoop.hbase.regionserver.HRegion: compaction completed on region enwiki_old,7qE8TUI5v-NIZu_9oZhaJF==,1221856888271 in 38sec
> 2008-10-15 02:59:08,507 INFO org.apache.hadoop.dfs.DFSClient: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /hbase/aa0-000-8.u.powerset.com/log_208.76.45.180_1223616861290_60020/hlog.dat.1224038569764 could only be re
> plicated to 0 nodes, instead of 1
>         at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1117)
>         at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
>         at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>         at java.lang.reflect.Method.invoke(Unknown Source)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)
>         at org.apache.hadoop.ipc.Client.call(Client.java:715)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>         at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
>         at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>         at java.lang.reflect.Method.invoke(Unknown Source)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>         at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2440)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2323)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1800(DFSClient.java:1735)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1912)
> ...
> 2008-10-15 02:59:11,357 WARN org.apache.hadoop.dfs.DFSClient: NotReplicatedYetException sleeping /hbase/aa0-000-8.u.powerset.com/log_208.76.45.180_1223616861290_60020/hlog.dat.1224038569764 retries left 1
> 2008-10-15 02:59:14,566 WARN org.apache.hadoop.dfs.DFSClient: DataStreamer Exception: org.apache.hadoop.ipc.RemoteException: java.io.IOException: File /hbase/aa0-000-8.u.powerset.com/log_208.76.45.180_1223616861290_60020/hlog.dat.122403
> 8569764 could only be replicated to 0 nodes, instead of 1
>         at org.apache.hadoop.dfs.FSNamesystem.getAdditionalBlock(FSNamesystem.java:1117)
>         at org.apache.hadoop.dfs.NameNode.addBlock(NameNode.java:330)
>         at sun.reflect.GeneratedMethodAccessor15.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>         at java.lang.reflect.Method.invoke(Unknown Source)
>         at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:452)
>         at org.apache.hadoop.ipc.Server$Handler.run(Server.java:888)
>         at org.apache.hadoop.ipc.Client.call(Client.java:715)
>         at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:216)
>         at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
>         at sun.reflect.GeneratedMethodAccessor9.invoke(Unknown Source)
>         at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>         at java.lang.reflect.Method.invoke(Unknown Source)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>         at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>         at org.apache.hadoop.dfs.$Proxy1.addBlock(Unknown Source)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.locateFollowingBlock(DFSClient.java:2440)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.nextBlockOutputStream(DFSClient.java:2323)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1800(DFSClient.java:1735)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1912)
> 2008-10-15 02:59:14,566 WARN org.apache.hadoop.dfs.DFSClient: Error Recovery for block null bad datanode[0]
> 2008-10-15 02:59:14,566 FATAL org.apache.hadoop.hbase.regionserver.HLog: Could not append. Requesting close of log
> java.io.IOException: Could not get block locations. Aborting...
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2143)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1735)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1889)
> 2008-10-15 02:59:14,567 INFO org.apache.hadoop.hbase.regionserver.LogRoller: Rolling hlog. Number of entries: 81
> 2008-10-15 02:59:14,567 INFO org.apache.hadoop.ipc.Server: IPC Server handler 8 on 60020, call batchUpdate([B@3e4afee5, row => Dnsw4RukHJ5-0FQddY35mF==, {column => misc:upload_time, value => '...', column => page:mime, value => '...', c
> olumn => page:content, value => '...', column => page:url, value => '...'}, -1) from 208.76.44.183:59031: error: java.io.IOException: Could not get block locations. Aborting...
> java.io.IOException: Could not get block locations. Aborting...
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2143)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1735)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1889)
> 2008-10-15 02:59:14,588 ERROR org.apache.hadoop.hbase.regionserver.LogRoller: Log rolling failed
> java.io.IOException: Could not get block locations. Aborting...
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.processDatanodeError(DFSClient.java:2143)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.access$1400(DFSClient.java:1735)
>         at org.apache.hadoop.dfs.DFSClient$DFSOutputStream$DataStreamer.run(DFSClient.java:1889)
> ...
> {code}
> For now adding abort of regionserver if we can't close the log roller file.  If we can't close, then we're losing edits.  If this behavior causes us close to much, then need to dig in more.
> Happened on pset cluster running 0.18.1 hadoop and 0.18.0 hbase.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.