Posted to common-commits@hadoop.apache.org by Apache Wiki <wi...@apache.org> on 2007/09/25 00:46:05 UTC

[Lucene-hadoop Wiki] Update of "Hbase/HbaseArchitecture" by JimKellerman

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Lucene-hadoop Wiki" for change notification.

The following page has been changed by JimKellerman:
http://wiki.apache.org/lucene-hadoop/Hbase/HbaseArchitecture

The comment on the change is:
add issue: when is region server dead?

------------------------------------------------------------------------------
  
  The multi-machine components (the HMaster and the H!RegionServer) are actively being enhanced and debugged.
  
- Other related features and TODOs:
+ Issues and TODOs:
+  1. How do we know whether a region server is really dead, whether the network is partitioned, or whether the region server is merely late in reporting in or renewing its lease? If we decide that a region server is dead when it is not, it could still be performing updates on behalf of clients and adding to its log. The region server does not learn that the master has "delisted" it until it next reports in successfully; only at that point does it start flushing the cache, finishing the log, etc. In the meantime, the master may be pulling the rug out from under it by trying to split its log file (the most recent of which will appear to be zero length, because the file is visible but has no content until the region server closes it), and may already have reassigned the regions it was serving to another region server. At a minimum this will lose data; in the worst case it will corrupt the region. This issue is being addressed in [https://issues.apache.org/jira/browse/HADOOP-1937 HADOOP-1937].
   1. Vuk Ercegovac [[MailTo(vercego AT SPAMFREE us DOT ibm DOT com)]] of IBM Almaden Research pointed out that keeping HBase HRegion edit logs in HDFS is currently flawed.  HBase writes edits to logs and to a memcache.  The 'atomic' write to the log is meant to serve as insurance against abnormal !RegionServer exit: on startup, the log is rerun to reconstruct an HRegion's last wholesome state. But files in HDFS do not 'exist' until they are cleanly closed -- something that will not happen if !RegionServer exits without running its 'close'.
   1. The HMemcache lookup structure is relatively inefficient.
   1. Implement some kind of block caching in HRegion. Even when the DFS is not hitting the disk to fetch blocks, HRegion is still making IPC calls to the DFS (via !MapFile).
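
To illustrate the block-caching item above, here is a minimal sketch of one possible approach: a small least-recently-used (LRU) cache placed in front of whatever call fetches a block. This is not HBase code. The class LruBlockCache, the BlockLoader interface, the "fileName@blockOffset" key format and the capacity measured in blocks are all hypothetical names and choices made for illustration; a real implementation would have to decide what a "block" is for a !MapFile and how to invalidate entries when files are rewritten.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/**
 * Hypothetical LRU cache of file blocks, keyed by "fileName@blockOffset".
 * Illustrative only; none of these names exist in HBase.
 */
final class LruBlockCache {

    /** Fetches a block on a cache miss, e.g. by making the IPC call to the DFS. */
    interface BlockLoader {
        byte[] load(String fileName, long blockOffset);
    }

    private final BlockLoader loader;
    private final LinkedHashMap<String, byte[]> blocks;
    private long hits;
    private long misses;

    LruBlockCache(final int maxBlocks, BlockLoader loader) {
        this.loader = loader;
        // accessOrder = true keeps entries ordered from least to most recently used.
        this.blocks = new LinkedHashMap<String, byte[]>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, byte[]> eldest) {
                // Called after each put; evicts the least recently used block when over capacity.
                return size() > maxBlocks;
            }
        };
    }

    /** Returns the cached block, calling the loader only on a miss. */
    synchronized byte[] getBlock(String fileName, long blockOffset) {
        String key = fileName + "@" + blockOffset;
        byte[] block = blocks.get(key);
        if (block != null) {
            hits++;
            return block;
        }
        misses++;
        block = loader.load(fileName, blockOffset); // the expensive call we want to avoid repeating
        blocks.put(key, block);
        return block;
    }

    synchronized long hits() { return hits; }

    synchronized long misses() { return misses; }

    synchronized int size() { return blocks.size(); }
}
```

In this sketch a repeated read of the same block is served from memory, so only misses pay for the round trip to the DFS. Holding the lock while loading keeps the example short but serializes all readers; that is a simplification, not a recommendation.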