Posted to commits@hbase.apache.org by mb...@apache.org on 2012/09/19 03:52:11 UTC

svn commit: r1387430 - in /hbase/branches/0.89-fb/src: main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java

Author: mbautin
Date: Wed Sep 19 01:52:10 2012
New Revision: 1387430

URL: http://svn.apache.org/viewvc?rev=1387430&view=rev
Log:
[HBASE-6813] [0.89-fb] Optimise the time we hold the updateLock during Log Roll

Author: aaiyer

Summary:
Titan appserver sees a lot of timeouts/errors on cell 22 and cell 4.
Two things came out of the investigation: (a) most of the
operationTooSlow warnings were for multiPuts, and (b) these seem to be
correlated with log rolls.

It appears that when a log roll is in progress, the log roller grabs
the updateLock, and no put/multiPut can write to the WAL during that
time. A lot of I/O is done under the lock, some of which could be done
outside it.

Specifically:
   (i) Create the new log file before we grab the lock.
   (ii) Sync under the lock, but allow the close to happen later.
        Closing an HDFS file requires the NN to receive a blockReport
        from at least one of the DNs, and we do not want to block on
        that while holding the updatesLock.
   (iii) Write/sync a dummy entry to the new log before we try to
        "roll" it in. The first write involves a call to the NN for
        addBlock and sets up the appropriate HDFS pipeline, and the
        first sync must wait until the NN writes its state to disk
        (the modifications made by addBlock are kept only in memory).
        By writing a dummy entry up front, we ensure that the "real"
        appends that come later do not have to go through this costly
        process. A simplified sketch of the reordered flow follows.
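
To make the reordering concrete, here is a minimal sketch (simplified
stand-in types, not the actual HLog code; the real change is in the
diff below):

    import java.io.IOException;

    // Minimal sketch of the reordered roll. Writer is a simplified
    // stand-in for the HLog writer interface used by the patch.
    class LogRollSketch {
      interface Writer {
        void append(String entry) throws IOException;
        void sync() throws IOException;   // may wait on NN/pipeline work
        void close() throws IOException;  // may wait on a DN blockReport
      }

      private final Object updateLock = new Object();
      private Writer writer;  // the current WAL writer

      LogRollSketch(Writer initial) { this.writer = initial; }

      void rollWriter(Writer newWriter) throws IOException {
        // (i)/(iii): the new file was already created, and a dummy
        // entry is appended and synced before we take the lock, so the
        // costly addBlock/pipeline setup is paid up front.
        newWriter.append("dummy");
        newWriter.sync();

        Writer oldWriter;
        synchronized (updateLock) {
          // (ii): under the lock we only sync the old writer and swap
          // pointers, so appends block for milliseconds, not seconds.
          writer.sync();
          oldWriter = writer;
          writer = newWriter;
        }

        // (ii): close the old writer after releasing the lock; close()
        // can block until the NN receives a blockReport from a DN.
        oldWriter.close();
      }
    }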

Test Plan:
Essentially a performance optimisation. Will run all MR unit tests -- for safety.
Will check the performance after deployment to the DL.

On the dev cluster (msgstore209), this looks really promising:
about two orders of magnitude improvement in the time updates are blocked.

On msgstore269.sn4:
2012-09-17 11:37:33,924 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 502 ms.
2012-09-17 11:37:50,141 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 415 ms.
2012-09-17 11:38:05,498 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 510 ms.
2012-09-17 11:38:20,087 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 560 ms.
2012-09-17 11:38:33,739 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 521 ms.
2012-09-17 11:38:47,537 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 585 ms.
2012-09-17 11:39:05,388 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 425 ms.
2012-09-17 11:39:20,890 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 653 ms.
2012-09-17 11:39:37,152 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 442 ms.
2012-09-17 12:02:55,975 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 132 ms.
2012-09-17 12:03:50,573 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 664 ms.
2012-09-17 12:04:06,833 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 582 ms.
2012-09-17 12:04:25,330 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 695 ms.
2012-09-17 12:10:15,282 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 161 ms.
2012-09-17 12:10:29,485 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 497 ms.
2012-09-17 12:10:45,153 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 564 ms.
2012-09-17 12:11:02,016 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 460 ms.
2012-09-17 12:18:06,894 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 142 ms.
2012-09-17 12:19:01,013 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 618 ms.
2012-09-17 12:19:15,486 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 582 ms.
2012-09-17 12:19:29,246 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 495 ms.
2012-09-17 12:22:00,707 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 294 ms.
2012-09-17 12:24:17,927 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 141 ms.
2012-09-17 12:25:11,850 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 777 ms.
2012-09-17 12:25:26,177 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 687 ms.
2012-09-17 12:25:38,496 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 645 ms.
2012-09-17 12:25:56,679 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 435 ms.
2012-09-17 12:28:27,963 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:29:25,607 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:29:42,078 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 16 ms.
2012-09-17 12:29:56,325 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 17 ms.
2012-09-17 12:33:55,944 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:34:13,230 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 22 ms.
2012-09-17 12:34:34,981 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 1 ms.
2012-09-17 12:34:51,631 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:35:08,717 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 6 ms.
2012-09-17 12:35:21,406 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 5 ms.
2012-09-17 12:35:34,232 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:35:49,090 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:36:03,452 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:36:17,868 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:36:33,197 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 6 ms.
2012-09-17 12:36:51,197 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 390 ms.
2012-09-17 12:37:03,352 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 5 ms.
2012-09-17 12:37:16,447 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:37:31,997 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.

On msgstore289.snc4:

2012-09-17 11:38:21,966 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 412 ms.
2012-09-17 11:38:42,371 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 581 ms.
2012-09-17 11:38:57,865 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 560 ms.
2012-09-17 11:39:11,573 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 501 ms.
2012-09-17 11:39:25,504 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 487 ms.
2012-09-17 11:39:42,216 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 448 ms.
2012-09-17 12:02:55,938 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 133 ms.
2012-09-17 12:03:50,524 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 622 ms.
2012-09-17 12:04:04,616 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 566 ms.
2012-09-17 12:04:16,900 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 523 ms.
2012-09-17 12:04:34,805 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 419 ms.
2012-09-17 12:10:29,172 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 432 ms.
2012-09-17 12:10:45,001 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 444 ms.
2012-09-17 12:10:59,241 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 443 ms.
2012-09-17 12:18:07,314 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 132 ms.
2012-09-17 12:19:03,560 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 531 ms.
2012-09-17 12:19:20,536 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 499 ms.
2012-09-17 12:19:37,737 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 450 ms.
2012-09-17 12:22:08,662 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 329 ms.
2012-09-17 12:24:17,868 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 140 ms.
2012-09-17 12:25:11,547 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 517 ms.
2012-09-17 12:25:28,635 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 395 ms.
2012-09-17 12:25:46,021 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 540 ms.
2012-09-17 12:28:27,954 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:29:23,291 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:29:38,049 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 5 ms.
2012-09-17 12:29:49,822 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 6 ms.
2012-09-17 12:33:51,669 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:34:06,871 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:34:29,074 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:34:42,253 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 7 ms.
2012-09-17 12:34:55,319 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:35:11,577 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:35:27,700 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:35:39,159 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:35:56,510 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:36:09,439 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:36:25,609 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:36:38,043 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:36:52,594 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:37:11,480 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:37:26,547 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:37:42,288 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 1 ms.

On msgstore249:

2012-09-17 11:38:20,865 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 395 ms.
2012-09-17 11:38:33,491 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 184 ms.
2012-09-17 11:38:49,294 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 100 ms.
2012-09-17 11:39:02,444 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 498 ms.
2012-09-17 11:39:18,797 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 533 ms.
2012-09-17 11:39:33,345 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 473 ms.
2012-09-17 12:02:55,924 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 138 ms.
2012-09-17 12:03:52,781 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 675 ms.
2012-09-17 12:04:09,109 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 535 ms.
2012-09-17 12:04:24,060 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 510 ms.
2012-09-17 12:10:18,085 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 318 ms.
2012-09-17 12:10:31,945 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 422 ms.
2012-09-17 12:10:46,144 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 595 ms.
2012-09-17 12:11:02,067 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 588 ms.
2012-09-17 12:18:06,716 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 141 ms.
2012-09-17 12:19:00,312 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 411 ms.
2012-09-17 12:19:17,512 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 390 ms.
2012-09-17 12:19:38,872 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 716 ms.
2012-09-17 12:22:05,979 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 397 ms.
2012-09-17 12:24:17,039 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 141 ms.
2012-09-17 12:25:13,046 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 507 ms.
2012-09-17 12:25:29,480 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 428 ms.
2012-09-17 12:25:44,468 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: Blocked updates for 488 ms.
2012-09-17 12:28:26,604 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:29:20,893 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:29:38,118 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:29:55,748 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:33:51,870 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:34:10,603 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:34:33,645 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:34:49,720 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 7 ms.
2012-09-17 12:35:05,421 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:35:23,116 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:35:41,707 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:35:55,909 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 14 ms.
2012-09-17 12:36:10,984 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:36:26,346 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:36:44,326 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 1 ms.
2012-09-17 12:36:59,502 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:37:15,892 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:37:28,174 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.
2012-09-17 12:37:44,640 DEBUG org.apache.hadoop.hbase.regionserver.wal.HLog: (optimised) Blocked updates for 0 ms.

Reviewers: kannan, hkuang, liyintang

Reviewed By: liyintang

CC: hbase-eng@, nspiegelberg, mycnyc, amirshim, kranganathan, liyintang

Differential Revision: https://phabricator.fb.com/D575297

Task ID: 1734024

Modified:
    hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
    hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
    hbase/branches/0.89-fb/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java

Modified: hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java
URL: http://svn.apache.org/viewvc/hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java?rev=1387430&r1=1387429&r2=1387430&view=diff
==============================================================================
--- hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java (original)
+++ hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLog.java Wed Sep 19 01:52:10 2012
@@ -216,7 +216,7 @@ public class HLog implements Syncable {
   private final ConcurrentSkipListMap<byte [], Long> lastSeqWritten =
     new ConcurrentSkipListMap<byte [], Long>(Bytes.BYTES_COMPARATOR);
 
-  private boolean closed = false;
+  private volatile boolean closed = false;
 
   private final AtomicLong logSeqNum = new AtomicLong(0);
 
@@ -263,6 +263,7 @@ public class HLog implements Syncable {
   private static final Pattern pattern = Pattern.compile(".*\\.\\d*");
 
   private static final FileStatus[] NO_FILES = new FileStatus[0];
+  protected static final byte[] DUMMY = Bytes.toBytes("");
 
   static byte [] COMPLETE_CACHE_FLUSH;
   static {
@@ -499,69 +500,99 @@ public class HLog implements Syncable {
    */
   public byte [][] rollWriter() throws IOException {
     // Return if nothing to flush.
-    if (this.writer != null && this.numEntries.get() <= 0) {
+    if (this.closed || this.writer != null && this.numEntries.get() <= 0) {
       return null;
     }
     byte [][] regionsToFlush = null;
     this.cacheFlushLock.lock();
+    long t0 = 0;
+    long t1 = 0;
     try {
+      // Creating the new file can be done earlier.
+      long newFileNum = System.currentTimeMillis();
+      Path newPath = computeFilename(newFileNum);
+      Writer newWriter;
+      try {
+        newWriter = createWriter(fs, newPath, HBaseConfiguration.create(conf));
+      } catch (IOException ioe) {
+        // If we fail to create a new writer, let us clean up the file.
+        fs.delete(newPath, false);
+        throw ioe;
+      }
+      int newFileReplication = fs.getFileStatus(newPath).getReplication();
+      OutputStream newOutStream = null;
+      if (newWriter instanceof SequenceFileLogWriter) {
+        newOutStream = ((SequenceFileLogWriter) newWriter)
+            .getDFSCOutputStream();
+        // append a dummy entry and sync. So we perform the costly
+        // allocateBlock and sync before we get the lock to roll writers.
+        WALEdit edit = new WALEdit();
+        HLogKey key = makeKey(DUMMY /* regionName */, DUMMY /* tableName */,
+            0, System.currentTimeMillis());
+        newWriter.append(new HLog.Entry(key, edit));
+        syncWriter(newWriter);
+      }
+
+      Writer oldWriter;
+      long oldFileLogSeqNum, oldFileNum;
+      int oldNumEntries;
       synchronized (updateLock) {
+        t0 = EnvironmentEdgeManager.currentTimeMillis();
         if (closed) {
           return regionsToFlush;
         }
+
         this.logRollPending = true;
         // Clean up current writer.
-        Path oldFile = cleanupCurrentWriter(this.filenum);
-        this.filenum = System.currentTimeMillis();
-        Path newPath = computeFilename();
-        this.writer = createWriter(fs, newPath, HBaseConfiguration.create(conf));
-        this.initialReplication = fs.getFileStatus(newPath).getReplication();
-
-        // Can we get at the dfsclient outputstream?  If an instance of
-        // SFLW, it'll have done the necessary reflection to get at the
-        // protected field name.
-        this.hdfs_out = null;
-        if (this.writer instanceof SequenceFileLogWriter) {
-          this.hdfs_out =
-            ((SequenceFileLogWriter)this.writer).getDFSCOutputStream();
-        }
-        // If there was an IOException before this point then either the old log
-        // has not been closed or the new writer has not been created.
-        // LogRoller will keep retyring. Nothing can be appended to the logs
+        syncWriter(this.writer);
+
+        // save the old information required for close()
+        oldFileNum = this.filenum;
+        oldWriter = this.writer;
+        oldNumEntries = this.numEntries.get();
+
+        this.filenum = newFileNum;
+        this.writer = newWriter;
+        this.initialReplication = newFileReplication;
+        this.hdfs_out = newOutStream;
+        this.numEntries.set(0);
+
+        // If there was an IOException before this point while the old file is being
+        // synced; LogRoller will keep retrying. Nothing can be appended to the logs
         // when logRollPending is true
         this.logRollPending = false;
+        oldFileLogSeqNum = this.logSeqNum.get();
+        t1 = EnvironmentEdgeManager.currentTimeMillis();
+      }
 
-        LOG.info((oldFile != null?
-            "Roll " + FSUtils.getPath(oldFile) + ", entries=" +
-            this.numEntries.get() +
-            ", filesize=" +
-            this.fs.getFileStatus(oldFile).getLen() + ". ": "") +
-          "New hlog " + FSUtils.getPath(newPath));
-        // Tell our listeners that a new log was created
-        if (!this.actionListeners.isEmpty()) {
-          for (LogActionsListener list : this.actionListeners) {
-            list.logRolled(newPath);
-          }
-        }
-        // Can we delete any of the old log files?
-        if (this.outputfiles.size() > 0) {
-          if (this.lastSeqWritten.size() <= 0) {
-            LOG.debug("Last sequence written is empty. Deleting all old hlogs");
-            // If so, then no new writes have come in since all regions were
-            // flushed (and removed from the lastSeqWritten map). Means can
-            // remove all but currently open log file.
-            TreeSet<Long> tempSet = new TreeSet<Long>(outputfiles.keySet());
-            for (Long seqNum : tempSet) {
-              archiveLogFile(outputfiles.get(seqNum), seqNum);
-              outputfiles.remove(seqNum);
-            }
-            assert outputfiles.size() == 0 :
-              "Someone added new log files? How?";
-          } else {
-            regionsToFlush = cleanOldLogs();
+      Path oldFile = closeWriter(oldWriter, oldFileNum, oldFileLogSeqNum);
+      LOG.info((oldFile != null ? "Roll " + FSUtils.getPath(oldFile)
+          + ", entries=" + oldNumEntries + ", filesize="
+          + this.fs.getFileStatus(oldFile).getLen() + ". " : "")
+          + "New hlog " + FSUtils.getPath(newPath)
+          + " Held updateLock for " + (t1 -t0) + " ms.");
+      // Tell our listeners that a new log was created
+      if (!this.actionListeners.isEmpty()) {
+        for (LogActionsListener list : this.actionListeners) {
+          list.logRolled(newPath);
+        }
+      }
+      // Can we delete any of the old log files?
+      if (this.outputfiles.size() > 0) {
+        if (this.lastSeqWritten.size() <= 0) {
+          LOG.debug("Last sequence written is empty. Deleting all old hlogs");
+          // If so, then no new writes have come in since all regions were
+          // flushed (and removed from the lastSeqWritten map). Means can
+          // remove all but currently open log file.
+          TreeSet<Long> tempSet = new TreeSet<Long>(outputfiles.keySet());
+          for (Long seqNum : tempSet) {
+            archiveLogFile(outputfiles.get(seqNum), seqNum);
+            outputfiles.remove(seqNum);
           }
+          assert outputfiles.size() == 0 : "Someone added new log files? How?";
+        } else {
+          regionsToFlush = cleanOldLogs();
         }
-        this.numEntries.set(0);
       }
     } finally {
       this.cacheFlushLock.unlock();
@@ -724,23 +755,46 @@ public class HLog implements Syncable {
    * @return Path to current writer or null if none.
    * @throws IOException
    */
-  Path cleanupCurrentWriter(final long currentfilenum)
-  throws IOException {
-    Path oldFile = null;
+  Path cleanupCurrentWriter(final long currentfilenum) throws IOException {
+    Path hlog = null;
     if (this.writer != null) {
-      try {
-        if (!writerCloseSyncDone) {
-          this.writer.sync();
-        }
-      } catch (IOException ioe) {
-        syncFailureAbortStrategy.abort("log sync failed when trying to close " + this.writer, ioe);
+      if (!writerCloseSyncDone) {
+        syncWriter(this.writer);
       }
       // Close the current writer
       writerCloseSyncDone = true;
+      hlog = closeWriter(this.writer, currentfilenum, this.logSeqNum.get());
+      this.writer = null;
+    }
+    return hlog;
+  }
+
+  /*
+   * Sync the edits to the datanodes.
+   * If this is called on this.writer, it should be holding the updates lock.
+   */
+  void syncWriter(Writer writer) throws IOException {
+    if (writer != null) {
       try {
-        this.writer.close();
+        writer.sync();
       } catch (IOException ioe) {
-        Path fname = computeFilename();
+        syncFailureAbortStrategy.abort("log sync failed when trying to close "
+            + writer, ioe);
+      }
+    }
+  }
+
+  /*
+   * Closes the given writer. Updates the output files using the specified logSeqNum
+   * @return The path to the closed file.
+   */
+  Path closeWriter(Writer writer, final long filenum, long logSeqNum) throws IOException {
+    Path oldFile = null;
+    if (writer != null) {
+      try {
+        writer.close();
+      } catch (IOException ioe) {
+        Path fname = computeFilename(filenum);
         if (!tryRecoverFileLease(fs, fname, conf)) {
           IOException ioe2 =
               new IOException("lease recovery pending for " + fname, ioe);
@@ -748,10 +802,9 @@ public class HLog implements Syncable {
         }
       }
       writerCloseSyncDone = false;
-      this.writer = null;
-      if (currentfilenum >= 0) {
-        oldFile = computeFilename();
-        this.outputfiles.put(Long.valueOf(this.logSeqNum.get()), oldFile);
+      if (filenum >= 0) {
+        oldFile = computeFilename(filenum);
+        this.outputfiles.put(Long.valueOf(logSeqNum), oldFile);
       }
     }
     return oldFile;
@@ -771,11 +824,20 @@ public class HLog implements Syncable {
   }
 
   /**
+   * This is a convenience method that is used by tests to get the current
+   * filename without knowing the file-number.
+   * @return Path
+   */
+  protected Path computeFilename() {
+    return computeFilename(this.filenum);
+  }
+
+  /**
    * This is a convenience method that computes a new filename with a given
    * file-number.
    * @return Path
    */
-  protected Path computeFilename() {
+  protected Path computeFilename(long filenum) {
     if (filenum < 0) {
       throw new RuntimeException("hlog file number can't be < 0");
     }
@@ -1905,6 +1967,9 @@ public class HLog implements Syncable {
         long threadTime = System.currentTimeMillis();
         try {
           int editsCount = 0;
+          if (Arrays.equals(region, DUMMY)) { // ignore dummy edits
+            return null;
+          }
           WriterAndPath wap = logWriters.get(region);
           for (Entry logEntry: entries) {
             checkForShutdown(shutdownStatus);

Modified: hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java
URL: http://svn.apache.org/viewvc/hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java?rev=1387430&r1=1387429&r2=1387430&view=diff
==============================================================================
--- hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java (original)
+++ hbase/branches/0.89-fb/src/main/java/org/apache/hadoop/hbase/regionserver/wal/HLogSplitter.java Wed Sep 19 01:52:10 2012
@@ -194,6 +194,7 @@ public class HLogSplitter {
         "into a temporary staging area.");
 
     Object BAD_WRITER = new Object();
+    logWriters.put(HLog.DUMMY, BAD_WRITER);
 
     boolean progress_failed = false;
 

Modified: hbase/branches/0.89-fb/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java
URL: http://svn.apache.org/viewvc/hbase/branches/0.89-fb/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java?rev=1387430&r1=1387429&r2=1387430&view=diff
==============================================================================
--- hbase/branches/0.89-fb/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java (original)
+++ hbase/branches/0.89-fb/src/test/java/org/apache/hadoop/hbase/regionserver/wal/TestHLog.java Wed Sep 19 01:52:10 2012
@@ -24,6 +24,7 @@ import static org.junit.Assert.assertTru
 
 import java.io.IOException;
 import java.lang.reflect.Method;
+import java.util.Arrays;
 import java.util.HashMap;
 import java.util.List;
 import java.util.Map;
@@ -217,7 +218,7 @@ public class TestHLog  {
     int count = 0;
     HLog.Entry entry = new HLog.Entry();
     while ((entry = reader.next(entry)) != null) count++;
-    assertEquals(total, count);
+    assertEquals(total + 1, count); // one extra, for the initial Dummy entry.
     reader.close();
     // Add test that checks to see that an open of a Reader works on a file
     // that has had a sync done on it.
@@ -236,7 +237,7 @@ public class TestHLog  {
     reader = HLog.getReader(fs, walPath, conf);
     count = 0;
     while((entry = reader.next(entry)) != null) count++;
-    assertEquals(total * 2, count);
+    assertEquals(total * 2 + 1, count); // one extra, for the initial Dummy entry
     // Now do a test that ensures stuff works when we go over block boundary,
     // especially that we return good length on file.
     final byte [] value = new byte[1025 * 1024];  // Make a 1M value.
@@ -250,14 +251,14 @@ public class TestHLog  {
     reader = HLog.getReader(fs, walPath, conf);
     count = 0;
     while((entry = reader.next(entry)) != null) count++;
-    assertEquals(total * 3, count);
+    assertEquals(total * 3 + 1, count); // one extra, for the initial Dummy entry
     reader.close();
     // Close it and ensure that closed, Reader gets right length also.
     wal.close();
     reader = HLog.getReader(fs, walPath, conf);
     count = 0;
     while((entry = reader.next(entry)) != null) count++;
-    assertEquals(total * 3, count);
+    assertEquals(total * 3 + 1, count); // one extra, for the initial Dummy entry
     reader.close();
   }
 
@@ -421,6 +422,9 @@ public class TestHLog  {
     HLogKey key = HLog.newKey(conf);
     WALEdit val = new WALEdit();
     while (reader.next(key, val)) {
+      if (Arrays.equals(key.getRegionName(), HLog.DUMMY)) {
+        continue;
+      }
       count++;
       assertTrue("Should be one KeyValue per WALEdit",
                  val.getKeyValues().size() == 1);
@@ -463,17 +467,21 @@ public class TestHLog  {
       reader = HLog.getReader(fs, filename, conf);
       // Above we added all columns on a single row so we only read one
       // entry in the below... thats why we have '1'.
-      for (int i = 0; i < 1; i++) {
+      for (int count = 0; count < 1; ) {
         HLog.Entry entry = reader.next(null);
         if (entry == null) break;
         HLogKey key = entry.getKey();
         WALEdit val = entry.getEdit();
+        if (Arrays.equals(key.getRegionName(), HLog.DUMMY)) {
+          continue;
+        }
         assertTrue(Bytes.equals(regionName, key.getRegionName()));
         assertTrue(Bytes.equals(tableName, key.getTablename()));
         KeyValue kv = val.getKeyValues().get(0);
         assertTrue(Bytes.equals(row, kv.getRow()));
-        assertEquals((byte)(i + '0'), kv.getValue()[0]);
+        assertEquals((byte)(count + '0'), kv.getValue()[0]);
         System.out.println(key + " " + val);
+        count++;
       }
       HLog.Entry entry = null;
       while ((entry = reader.next(null)) != null) {
@@ -529,6 +537,7 @@ public class TestHLog  {
       log = null;
       // Now open a reader on the log and assert append worked.
       reader = HLog.getReader(fs, filename, conf);
+      reader.next(); // skip the dummy entry.
       HLog.Entry entry = reader.next();
       assertEquals(COL_COUNT, entry.getEdit().size());
       int idx = 0;