Posted to issues@hbase.apache.org by "Liyin Tang (JIRA)" <ji...@apache.org> on 2012/10/10 01:58:02 UTC

[jira] [Created] (HBASE-6968) Several HBase write perf improvement

Liyin Tang created HBASE-6968:
---------------------------------

             Summary: Several HBase write perf improvement
                 Key: HBASE-6968
                 URL: https://issues.apache.org/jira/browse/HBASE-6968
             Project: HBase
          Issue Type: Improvement
            Reporter: Liyin Tang


There are two improvements in this jira:
1) Change 2 hotspot synchronized functions to use the double-checked locking pattern, which removes the synchronization overhead in the normal case.

2) Avoid creating an HBaseConfiguration object for each HLog. Every time an HBaseConfiguration object is created, it parses the XML configuration files from disk, which is not a cheap operation.
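
A minimal sketch of the idea behind item 2, under stated assumptions: CostlyConfig, LogWriter and ConfigReuseDemo are hypothetical stand-ins (not HBase classes), and the "expensive load" is simulated with a counter instead of real file parsing. It only illustrates that sharing one already-loaded configuration avoids paying the load cost once per writer.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical stand-in for a configuration object that is costly to build. */
final class CostlyConfig {
  /** Counts how many times the simulated expensive load ran. */
  static final AtomicInteger LOADS = new AtomicInteger();
  private final Map<String, String> values = new HashMap<>();

  /** Simulates parsing configuration files; a real load would read and parse files. */
  static CostlyConfig load() {
    LOADS.incrementAndGet();
    CostlyConfig c = new CostlyConfig();
    c.values.put("example.key", "example-value");
    return c;
  }

  String get(String key) {
    return values.get(key);
  }
}

/** Hypothetical log writer that only reads settings from the configuration it is given. */
final class LogWriter {
  private final CostlyConfig conf;

  LogWriter(CostlyConfig conf) {
    this.conf = conf;
  }

  String setting(String key) {
    return conf.get(key);
  }
}

final class ConfigReuseDemo {
  /** Builds n writers, loading a fresh configuration for each one. */
  static int loadsWhenRecreating(int n) {
    CostlyConfig.LOADS.set(0);
    for (int i = 0; i < n; i++) {
      new LogWriter(CostlyConfig.load());
    }
    return CostlyConfig.LOADS.get();
  }

  /** Builds n writers that all share one configuration loaded up front. */
  static int loadsWhenSharing(int n) {
    CostlyConfig.LOADS.set(0);
    CostlyConfig shared = CostlyConfig.load();
    for (int i = 0; i < n; i++) {
      new LogWriter(shared);
    }
    return CostlyConfig.LOADS.get();
  }

  public static void main(String[] args) {
    System.out.println("recreating: " + loadsWhenRecreating(100) + " loads");
    System.out.println("sharing:    " + loadsWhenSharing(100) + " loads");
  }
}
```

Sharing is only equivalent when the consumer does not modify the configuration it receives; if it did, every other holder of the shared object would see the change.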

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

[jira] [Commented] (HBASE-6968) Several HBase write perf improvement

Posted by "liang xie (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13482173#comment-13482173 ] 

liang xie commented on HBASE-6968:
----------------------------------

I looked through the trunk code and no change is needed there, so let's set the affected versions for this issue to 0.90/0.92/0.94 only, right?
                
> Several HBase write perf improvement
> ------------------------------------
>
>                 Key: HBASE-6968
>                 URL: https://issues.apache.org/jira/browse/HBASE-6968
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: Liyin Tang
>
> Here are 2 HBase write performance improvements found recently:
> 1) Avoid creating an HBaseConfiguration object for each HLog. Every time an HBaseConfiguration object is created, it parses the XML configuration files from disk, which is not a cheap operation.
> In HLog.java:
> orig:
> {code:title=HLog.java}
>   newWriter = createWriter(fs, newPath, HBaseConfiguration.create(conf));
> {code}
> new:
> {code}
>   newWriter = createWriter(fs, newPath, conf);
> {code}
> 2) Change 2 hotspot synchronized functions to use the double-checked locking pattern, which removes the synchronization overhead in the normal case.
> orig:
> {code:title=HBaseRpcMetrics.java}
>   public synchronized void inc(String name, int amt) {	
>     MetricsTimeVaryingRate m = get(name);	
>     if (m == null) {	
>       m = create(name);	
>     }	
>     m.inc(amt);	
>   }
> {code}
> new:
> {code}
>   public void inc(String name, int amt) {	
>     MetricsTimeVaryingRate m = get(name);	
>     if (m == null) {	
>       synchronized (this) {	
>         if ((m = get(name)) == null) {	
>           m = create(name);	
>         }	
>       }	
>     }	
>     m.inc(amt);	
>   }
> {code}
> =====================
> orig:
> {code:title=MemStoreFlusher.java}
>   public synchronized void reclaimMemStoreMemory() {	
>     if (this.server.getGlobalMemstoreSize().get() >= globalMemStoreLimit) {	
>       flushSomeRegions();	
>     }
>   }	
> {code}
> new:
> {code}
>   public void reclaimMemStoreMemory() {	
>     if (this.server.getGlobalMemstoreSize().get() >= globalMemStoreLimit) {	
>       flushSomeRegions();	
>     }
>   }	
>   private synchronized void flushSomeRegions() {	
>     if (this.server.getGlobalMemstoreSize().get() < globalMemStoreLimit) {	
>       return; // double check the global memstore size inside of the synchronized block.	
>     }	
>  ...   
>  }
> {code}
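
The double-checked lookup in the HBaseRpcMetrics change quoted above can be sketched in isolation as follows. This is an illustration under assumptions, not the HBase code: CounterRegistry and CounterRegistryDemo are hypothetical names, and the sketch assumes the backing map is a ConcurrentHashMap so that the unlocked first read is safe. Whether the real get(name) has that property is not stated in this thread.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical registry of named counters; not the HBaseRpcMetrics class itself. */
final class CounterRegistry {
  // Assumption: a concurrent map, which makes the unlocked get() on the fast path safe.
  private final ConcurrentMap<String, AtomicLong> counters = new ConcurrentHashMap<>();

  /** Adds amt to the named counter, creating the counter on first use. */
  void inc(String name, long amt) {
    AtomicLong c = counters.get(name); // first check, no lock
    if (c == null) {
      synchronized (this) {
        c = counters.get(name); // second check, under the lock
        if (c == null) {
          c = new AtomicLong();
          counters.put(name, c);
        }
      }
    }
    c.addAndGet(amt);
  }

  long value(String name) {
    AtomicLong c = counters.get(name);
    return c == null ? 0L : c.get();
  }

  int size() {
    return counters.size();
  }
}

final class CounterRegistryDemo {
  /** Runs `threads` threads that each call inc(name, 1) `perThread` times. */
  static long hammer(CounterRegistry r, String name, int threads, int perThread) {
    Thread[] ts = new Thread[threads];
    for (int i = 0; i < threads; i++) {
      ts[i] = new Thread(() -> {
        for (int j = 0; j < perThread; j++) {
          r.inc(name, 1);
        }
      });
      ts[i].start();
    }
    try {
      for (Thread t : ts) {
        t.join();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return r.value(name);
  }

  public static void main(String[] args) {
    System.out.println(hammer(new CounterRegistry(), "put", 8, 10_000));
  }
}
```

With a concurrent map, the same effect can usually be had without any explicit lock by calling computeIfAbsent; the explicit second check is kept here only to mirror the shape of the change under discussion.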


[jira] [Updated] (HBASE-6968) Several HBase write perf improvement

Posted by "Liyin Tang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liyin Tang updated HBASE-6968:
------------------------------

    Description: 
Here are 2 HBase write performance improvements that were found recently.

1) Avoid creating an HBaseConfiguration object for each HLog. Every time an HBaseConfiguration object is created, it parses the XML configuration files from disk, which is not a cheap operation.
In HLog.java:
orig:
{code:title=HLog.java}
  newWriter = createWriter(fs, newPath, HBaseConfiguration.create(conf));
{code}
new:
{code}
  newWriter = createWriter(fs, newPath, conf);
{code}


2) Change 2 hotspot synchronized functions to use the double-checked locking pattern, which removes the synchronization overhead in the normal case.
orig:
{code:title=HBaseRpcMetrics.java}
  public synchronized void inc(String name, int amt) {	
    MetricsTimeVaryingRate m = get(name);	
    if (m == null) {	
      m = create(name);	
    }	
    m.inc(amt);	
  }
{code}

new:
{code}
  public void inc(String name, int amt) {	
    MetricsTimeVaryingRate m = get(name);	
    if (m == null) {	
      synchronized (this) {	
        if ((m = get(name)) == null) {	
          m = create(name);	
        }	
      }	
    }	
    m.inc(amt);	
  }
{code}
=====================
orig:
{code:title=MemStoreFlusher.java}
  public synchronized void reclaimMemStoreMemory() {	
    if (this.server.getGlobalMemstoreSize().get() >= globalMemStoreLimit) {	
      flushSomeRegions();	
    }
  }	
{code}
new:
{code}
  public void reclaimMemStoreMemory() {	
    if (this.server.getGlobalMemstoreSize().get() >= globalMemStoreLimit) {	
      flushSomeRegions();	
    }
  }	
  private synchronized void flushSomeRegions() {	
    if (this.server.getGlobalMemstoreSize().get() < globalMemStoreLimit) {	
      return; // double check the global memstore size inside of the synchronized block.	
    }	
 ...   
 }
{code}



  was:
There are two improvements in this jira:
1) Change 2 hotspot synchronized functions to use the double-checked locking pattern, which removes the synchronization overhead in the normal case.

2) Avoid creating an HBaseConfiguration object for each HLog. Every time an HBaseConfiguration object is created, it parses the XML configuration files from disk, which is not a cheap operation.

    

[jira] [Updated] (HBASE-6968) Several HBase write perf improvement

Posted by "stack (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

stack updated HBASE-6968:
-------------------------

    Affects Version/s: 0.90.6
                       0.92.2
                       0.94.2
    

[jira] [Commented] (HBASE-6968) Several HBase write perf improvement

Posted by "ramkrishna.s.vasudevan (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13472978#comment-13472978 ] 

ramkrishna.s.vasudevan commented on HBASE-6968:
-----------------------------------------------

Nice one.
                

[jira] [Updated] (HBASE-6968) Several HBase write perf improvement

Posted by "Liyin Tang (JIRA)" <ji...@apache.org>.
     [ https://issues.apache.org/jira/browse/HBASE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Liyin Tang updated HBASE-6968:
------------------------------

    Description: 
Here are 2 HBase write performance improvements found recently:

1) Avoid creating an HBaseConfiguration object for each HLog. Every time an HBaseConfiguration object is created, it parses the XML configuration files from disk, which is not a cheap operation.
In HLog.java:
orig:
{code:title=HLog.java}
  newWriter = createWriter(fs, newPath, HBaseConfiguration.create(conf));
{code}
new:
{code}
  newWriter = createWriter(fs, newPath, conf);
{code}


2) Change 2 hotspot synchronized functions to use the double-checked locking pattern, which removes the synchronization overhead in the normal case.
orig:
{code:title=HBaseRpcMetrics.java}
  public synchronized void inc(String name, int amt) {	
    MetricsTimeVaryingRate m = get(name);	
    if (m == null) {	
      m = create(name);	
    }	
    m.inc(amt);	
  }
{code}

new:
{code}
  public void inc(String name, int amt) {	
    MetricsTimeVaryingRate m = get(name);	
    if (m == null) {	
      synchronized (this) {	
        if ((m = get(name)) == null) {	
          m = create(name);	
        }	
      }	
    }	
    m.inc(amt);	
  }
{code}
=====================
orig:
{code:title=MemStoreFlusher.java}
  public synchronized void reclaimMemStoreMemory() {	
    if (this.server.getGlobalMemstoreSize().get() >= globalMemStoreLimit) {	
      flushSomeRegions();	
    }
  }	
{code}
new:
{code}
  public void reclaimMemStoreMemory() {	
    if (this.server.getGlobalMemstoreSize().get() >= globalMemStoreLimit) {	
      flushSomeRegions();	
    }
  }	
  private synchronized void flushSomeRegions() {	
    if (this.server.getGlobalMemstoreSize().get() < globalMemStoreLimit) {	
      return; // double check the global memstore size inside of the synchronized block.	
    }	
 ...   
 }
{code}
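
The MemStoreFlusher change above moves the lock from the caller to the slow path and re-checks the size once the lock is held. A self-contained sketch of that shape, with hypothetical names (FlusherSketch, add, reclaim, flushSome) and a stand-in "flush" that simply resets a counter, might look like this:

```java
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical flusher; names and numbers are illustrative, not HBase's MemStoreFlusher. */
final class FlusherSketch {
  private final AtomicLong globalSize = new AtomicLong();
  private final long limit;
  private int flushes = 0; // guarded by "this"

  FlusherSketch(long limit) {
    this.limit = limit;
  }

  void add(long bytes) {
    globalSize.addAndGet(bytes);
  }

  /** Fast path: an atomic read only; the lock is taken only when over the limit. */
  void reclaim() {
    if (globalSize.get() >= limit) {
      flushSome();
    }
  }

  /** Slow path: re-check under the lock, since another thread may have flushed already. */
  private synchronized void flushSome() {
    if (globalSize.get() < limit) {
      return;
    }
    globalSize.set(0); // stand-in for actually flushing regions
    flushes++;
  }

  synchronized int flushes() {
    return flushes;
  }

  long size() {
    return globalSize.get();
  }

  public static void main(String[] args) {
    FlusherSketch f = new FlusherSketch(100);
    f.add(150);
    f.reclaim();
    System.out.println("flushes=" + f.flushes() + " size=" + f.size());
  }
}
```

The unlocked read works here because the size is held in an atomic variable; the second check inside the synchronized method keeps threads that queued up behind the lock from flushing again after the first one has already brought the size back under the limit.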



    

[jira] [Commented] (HBASE-6968) Several HBase write perf improvement

Posted by "Lars Hofhansl (JIRA)" <ji...@apache.org>.
    [ https://issues.apache.org/jira/browse/HBASE-6968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13473027#comment-13473027 ] 

Lars Hofhansl commented on HBASE-6968:
--------------------------------------

Looks like the first two changes are (in one form or the other) in 0.94+ already.

The 0.94 MemStoreFlusher.reclaimMemStoreMemory looks bad, though. We are taking a lock and are awaiting a condition *inside* a synchronized method... It seems we can remove the synchronized there. The code inside the method does not need it and there are no other synchronized methods (or synchronized(this) blocks) in the class (which is also why it is not noticeably bad, because nothing is locked out).
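
To illustrate the concern about awaiting a condition inside a synchronized method, here is a small sketch with hypothetical names (SpaceGate, awaitSpace, release); it is not the 0.94 code. Condition.await releases only the lock the condition was created from, so if awaitSpace were also declared synchronized, the object's monitor would stay held for the whole wait. The explicit lock on its own is enough.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical gate: callers wait until space is released. Not HBase code. */
final class SpaceGate {
  private final ReentrantLock lock = new ReentrantLock();
  private final Condition released = lock.newCondition();
  private boolean full = true; // guarded by lock

  /**
   * Waits up to timeoutMs for space. Deliberately NOT synchronized: awaitNanos()
   * releases only `lock`, so an outer monitor would stay held while waiting.
   * Returns true if space became available before the timeout.
   */
  boolean awaitSpace(long timeoutMs) throws InterruptedException {
    long nanos = TimeUnit.MILLISECONDS.toNanos(timeoutMs);
    lock.lock();
    try {
      while (full) {
        if (nanos <= 0L) {
          return false;
        }
        nanos = released.awaitNanos(nanos);
      }
      return true;
    } finally {
      lock.unlock();
    }
  }

  /** Marks space as available and wakes all waiters. */
  void release() {
    lock.lock();
    try {
      full = false;
      released.signalAll();
    } finally {
      lock.unlock();
    }
  }
}

final class SpaceGateDemo {
  /** One thread releases the gate while the caller waits for it. */
  static boolean waiterSeesRelease() {
    SpaceGate gate = new SpaceGate();
    Thread releaser = new Thread(gate::release);
    releaser.start();
    try {
      boolean ok = gate.awaitSpace(5_000);
      releaser.join();
      return ok;
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  /** Nobody releases the gate, so the wait should time out. */
  static boolean timesOutWhenFull() {
    try {
      return !new SpaceGate().awaitSpace(20);
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      return false;
    }
  }

  public static void main(String[] args) {
    System.out.println("released: " + waiterSeesRelease());
    System.out.println("timed out: " + timesOutWhenFull());
  }
}
```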

                