Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2008/11/27 00:12:19 UTC

[Hadoop Wiki] Trivial Update of "Hbase/Troubleshooting" by stack

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Hadoop Wiki" for change notification.

The following page has been changed by stack:
http://wiki.apache.org/hadoop/Hbase/Troubleshooting

The comment on the change is:
Add some 0.18.x troubleshooting

------------------------------------------------------------------------------
   1. [#3 Problem: Replay of hlog required, forcing regionserver restart]
   2. [#4 Problem: Master initializes, but Region Servers do not]
   1. [#5 Problem: On migration, no files in root directory]
+  1. [#6 Problem: "xceiverCount 258 exceeds the limit of concurrent xcievers 256"]
+  1. [#7 Problem: "No live nodes contain current block"]
  
  [[Anchor(1)]]
  == Problem: Master initializes, but Region Servers do not ==
@@ -116, +118 @@

  === Resolution ===
   * Either reduce the load or set dfs.datanode.max.xcievers (hadoop-site.xml) to a larger value than the default (256). Note that in order to change the tunable, you need 0.17.2 or 0.18.0 (HADOOP-3859).
  
+ [[Anchor(6)]]
+ == Problem: "xceiverCount 258 exceeds the limit of concurrent xcievers 256" ==
+  * You see an exception with the above message in the logs (usually Hadoop 0.18.x).
+ === Causes ===
+  * An upper bound on concurrent datanode connections (xceiver threads) was added in Hadoop (HADOOP-3633/HADOOP-3859).
+ === Resolution ===
+  * Raise the maximum by setting '''dfs.datanode.max.xcievers''' (sic) in hadoop-site.xml.  See [http://mail-archives.apache.org/mod_mbox/hadoop-hbase-user/200810.mbox/%3C20126171.post@talk.nabble.com%3E message from jean-adrien] for some background.
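+ 
+ A minimal sketch of the change in hadoop-site.xml (the value 1023 is only illustrative; pick a bound that matches your cluster's load):
+ {{{
+ <property>
+   <name>dfs.datanode.max.xcievers</name>
+   <value>1023</value>
+   <description>Upper bound on the number of xceiver threads a
+   datanode will run to serve block reads and writes.  Note the
+   historical misspelling of "xceivers" in the property name.
+   </description>
+ </property>
+ }}}
+ Restart the datanodes after editing the file so the new limit takes effect.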
+ 
+ 
+ [[Anchor(7)]]
+ == Problem: "No live nodes contain current block" ==
+  * You see an exception with the above message in the logs (usually Hadoop 0.18.x).
+ === Causes ===
+  * The DFSClient marks slow datanodes as dead; eventually all replicas of a block are marked 'bad' (HADOOP-3831).
+ === Resolution ===
+  * Try setting '''dfs.datanode.socket.write.timeout''' to zero.  See the [http://mail-archives.apache.org/mod_mbox/hadoop-hbase-user/200810.mbox/%3C20126171.post@talk.nabble.com%3E thread from jean-adrien] for some background.
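+ 
+ For example, in hadoop-site.xml (a sketch; a value of 0 disables the datanode's socket write timeout entirely, so weigh this against the risk of hung connections):
+ {{{
+ <property>
+   <name>dfs.datanode.socket.write.timeout</name>
+   <value>0</value>
+   <description>0 disables the write timeout so that slow readers
+   are not dropped mid-transfer (workaround for HADOOP-3831).
+   </description>
+ </property>
+ }}}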
+