Posted to issues@hbase.apache.org by "Jonathan Gray (JIRA)" <ji...@apache.org> on 2010/11/29 18:05:13 UTC

[jira] Created: (HBASE-3281) During log replay on region open, hbase.hstore.report.interval.edits setting may be adding non-negligible overhead to log replay

During log replay on region open, hbase.hstore.report.interval.edits setting may be adding non-negligible overhead to log replay
--------------------------------------------------------------------------------------------------------------------------------

                 Key: HBASE-3281
                 URL: https://issues.apache.org/jira/browse/HBASE-3281
             Project: HBase
          Issue Type: Improvement
    Affects Versions: 0.90.0
            Reporter: Jonathan Gray


On a cluster here, I see a log replay on a region taking about 28 seconds.  It replays approximately 750,000 edits.  Since this can run for a while, we have a progress reporter that checks in periodically during the replay.  By default we have:

{noformat}
      int interval = this.conf.getInt("hbase.hstore.report.interval.edits", 2000);
{noformat}

This led to about 300 ZK node re-transitions (from OPENING to OPENING) in about 30 seconds.  I haven't measured how long the operation takes in ZK, but it's certainly several milliseconds.

Seems like we could be adding a significant amount of overhead here (5ms * 300 = 1.5 seconds, about 5% of the 28-second replay).  But I think some of these could take >5ms, so we could be adding 10% or more.
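
The estimate above can be reproduced with a few lines of Java. This is only a sketch of the arithmetic: the class and method names are made up for illustration, and the 5ms per-transition cost is the unmeasured guess from this report, not a measured value.

```java
// Back-of-the-envelope overhead estimate using the figures quoted in this report.
// ReplayOverheadEstimate is a hypothetical name; nothing here is HBase API.
final class ReplayOverheadEstimate {

    /** Upper bound on check-ins when reporting once every intervalEdits edits. */
    static long reports(long edits, int intervalEdits) {
        return edits / intervalEdits;
    }

    /** Fraction of the total replay time spent on check-ins. */
    static double overheadFraction(long reports, double millisPerReport, double replayMillis) {
        return (reports * millisPerReport) / replayMillis;
    }
}
```

With 750,000 edits and the default interval of 2000, reports(750000, 2000) gives 375, the same order as the roughly 300 re-transitions observed.  overheadFraction(300, 5.0, 28000.0) comes to a little over 5%, and doubling the assumed per-transition cost to 10ms pushes it past 10%.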

One way to address this would be to do it based on size rather than number of entries (this region only had increments, so lots of small edits).  Another way would be to do it based on time instead of entries (check in every 5 seconds, for example).
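
As a rough illustration of the time-based option, a replay loop could ask a small throttle object whether enough time has passed before checking in.  This is a minimal sketch, not a patch: TimeBasedReporter and its methods are hypothetical names, and the clock is injected only so the behaviour can be checked without waiting.

```java
import java.util.function.LongSupplier;

// Hypothetical helper: allows at most one check-in per period, regardless of
// how many edits were applied in between. Not part of HBase.
final class TimeBasedReporter {
    private final long periodMillis;
    private final LongSupplier clock;   // e.g. System::currentTimeMillis
    private long lastReport;

    TimeBasedReporter(long periodMillis, LongSupplier clock) {
        this.periodMillis = periodMillis;
        this.clock = clock;
        this.lastReport = clock.getAsLong();
    }

    /** Call after each applied edit; returns true when a check-in is due. */
    boolean shouldReport() {
        long now = clock.getAsLong();
        if (now - lastReport >= periodMillis) {
            lastReport = now;
            return true;
        }
        return false;
    }
}
```

Constructed as new TimeBasedReporter(5000, System::currentTimeMillis), a 28-second replay would check in about 5 times instead of about 300, however small the individual edits are.  A size-based variant would have the same shape, accumulating bytes applied instead of comparing timestamps.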
