Posted to hdfs-dev@hadoop.apache.org by "Liang Xie (JIRA)" <ji...@apache.org> on 2014/07/29 14:51:39 UTC

[jira] [Created] (HDFS-6766) optimize ack notify mechanism to avoid thundering herd issue

Liang Xie created HDFS-6766:
-------------------------------

             Summary: optimize ack notify mechanism to avoid thundering herd issue
                 Key: HDFS-6766
                 URL: https://issues.apache.org/jira/browse/HDFS-6766
             Project: Hadoop HDFS
          Issue Type: Improvement
          Components: hdfs-client
    Affects Versions: 3.0.0
            Reporter: Liang Xie
            Assignee: Liang Xie


Currently, DFSOutputStream uses wait/notifyAll to coordinate ack receiving, ack waiting, and so on.
Say there are 5 threads (t1, t2, t3, t4, t5) waiting for ack seq nos. 1, 2, 3, 4, 5 respectively. Once ack no. 1 arrives, notifyAll is called and all five threads wake up, but t2/t3/t4/t5 can do nothing except wait again.
We can rewrite it with the Condition class: with a fair (FIFO) policy we can notify just t1, which saves a number of context switches.
It is possible for more than one thread to be waiting on the same ack seq no. (e.g. when no more data is written between two flush operations). When that happens we need to notify all of those threads, so I introduced a set to remember such seq nos.
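
To illustrate the idea, here is a minimal sketch, not the actual patch and not DFSOutputStream code. The class AckNotifier and its methods waitForAck and ackReceived are hypothetical names. It also swaps in a different bookkeeping structure: instead of a set of shared seq nos, it keeps a sorted map from seq no to a Condition, so threads waiting on the same seq no share one Condition and are woken together. Acks are assumed to be cumulative (an ack for seq no N satisfies every waiter with seq no <= N).

```java
import java.util.NavigableMap;
import java.util.TreeMap;
import java.util.concurrent.locks.Condition;
import java.util.concurrent.locks.ReentrantLock;

/** Hypothetical sketch: wake only the waiters whose seq no has been acked. */
class AckNotifier {
    // Fair lock, so contending threads acquire it in roughly FIFO order.
    private final ReentrantLock lock = new ReentrantLock(true);
    // One Condition per awaited seq no; waiters on the same seq no share it.
    private final TreeMap<Long, Condition> waiters = new TreeMap<>();
    private long lastAckedSeqno = -1;

    /** Blocks until an ack with a seq no >= seqno has been received. */
    void waitForAck(long seqno) throws InterruptedException {
        lock.lock();
        try {
            // Loop guards against spurious wakeups.
            while (lastAckedSeqno < seqno) {
                Condition c = waiters.computeIfAbsent(seqno, k -> lock.newCondition());
                c.await();
            }
        } finally {
            lock.unlock();
        }
    }

    /**
     * Records an ack and signals only the Conditions whose seq no is now
     * covered. Returns how many distinct seq nos were signaled.
     */
    int ackReceived(long seqno) {
        lock.lock();
        try {
            if (seqno > lastAckedSeqno) {
                lastAckedSeqno = seqno;
            }
            // View of all entries with key <= lastAckedSeqno.
            NavigableMap<Long, Condition> ready = waiters.headMap(lastAckedSeqno, true);
            for (Condition c : ready.values()) {
                c.signalAll(); // wakes every thread sharing this seq no
            }
            int signaled = ready.size();
            ready.clear(); // also removes the entries from the backing map
            return signaled;
        } finally {
            lock.unlock();
        }
    }
}
```

In this sketch, when ack no. 1 arrives only the Condition registered for seq no 1 is signaled; threads waiting for 2 through 5 stay parked instead of waking up just to wait again.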
In a simple HBase YCSB test, the number of context switches per second dropped by about 15% and sys CPU% dropped by about 6%. (My HBase has the new write model patch; I think the benefit would be higher without it.)



--
This message was sent by Atlassian JIRA
(v6.2#6252)