Posted to commits@dubbo.apache.org by gi...@apache.org on 2020/03/27 03:43:54 UTC

[dubbo-website] branch asf-site updated: Automated deployment: Fri Mar 27 03:43:43 UTC 2020 da657288a0b017692e5c698f4f34e60802cce7dd

This is an automated email from the ASF dual-hosted git repository.

github-bot pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/dubbo-website.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new 8b1cfeb  Automated deployment: Fri Mar 27 03:43:43 UTC 2020 da657288a0b017692e5c698f4f34e60802cce7dd
8b1cfeb is described below

commit 8b1cfebdce367c0389722bcf9decfac8bed292a6
Author: chickenlj <ch...@users.noreply.github.com>
AuthorDate: Fri Mar 27 03:43:43 2020 +0000

    Automated deployment: Fri Mar 27 03:43:43 UTC 2020 da657288a0b017692e5c698f4f34e60802cce7dd
---
 .../blog/dubbo-consistent-hash-implementation.html | 123 ++++++++++++++++++++-
 .../blog/dubbo-consistent-hash-implementation.json |   2 +-
 2 files changed, 123 insertions(+), 2 deletions(-)

diff --git a/zh-cn/blog/dubbo-consistent-hash-implementation.html b/zh-cn/blog/dubbo-consistent-hash-implementation.html
index be85f06..f69808d 100644
--- a/zh-cn/blog/dubbo-consistent-hash-implementation.html
+++ b/zh-cn/blog/dubbo-consistent-hash-implementation.html
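@@ -61,7 +61,7 @@
 <blockquote>
 <p>Service reference: <a href="http://dubbo.apache.org/zh-cn/docs/source_code_guide/refer-service.html%E3%80%82">http://dubbo.apache.org/zh-cn/docs/source_code_guide/refer-service.html。</a></p>
 </blockquote>
-<p>Once the interface proxy class has been generated and wired up, a service call basically follows this flow: proxy -&gt; MockClusterInvoker -&gt; cluster strategy (e.g. FailoverClusterInvoker) -&gt; determine the target remote Invoker according to the selected load-balancing strategy.</p>
+<p>Once the interface proxy class has been generated and wired up, a service call basically follows this flow: proxy -&gt; MockClusterInvoker -&gt; cluster strategy (e.g. FailoverClusterInvoker) -&gt; initialize the load-balancing strategy -&gt; determine the Invoker according to the selected load-balancing strategy.</p>
 <p><strong>The load-balancing strategy is initialized</strong> in the initLoadBalance method of AbstractClusterInvoker:</p>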
 <pre><code class="language-java"><span class="hljs-function"><span class="hljs-keyword">protected</span> LoadBalance <span class="hljs-title">initLoadBalance</span><span class="hljs-params">(List&lt;Invoker&lt;T&gt;&gt; invokers, Invocation invocation)</span> </span>{
     <span class="hljs-keyword">if</span> (CollectionUtils.isNotEmpty(invokers)) {
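@@ -76,6 +76,127 @@
 <p>1. Read the value of the LOADBALANCE_KEY attribute configured for the invoked method. The actual value of the LOADBALANCE_KEY constant is loadbalance, which is exactly the attribute we configure;</p>
 <p>2. Use the SPI mechanism to initialize and load the load-balancing strategy that this value names.</p>
 <p>Every load-balancing strategy implements the LoadBalance interface. In all cluster strategies, the select method of AbstractClusterInvoker is eventually called, and AbstractClusterInvoker, in doSelect, <strong>calls LoadBalance's select method, which is where the load-balancing strategy starts to execute.</strong></p>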
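+<h3>3. Implementation of Dubbo's consistent-Hash load balancing</h3>
+<p>One point needs clarifying: by <strong>executing the load-balancing strategy</strong> I mean picking one of all the Providers as the remote call target for the current Consumer. In the code, a Provider is wrapped in an Invoker entity, so put directly, executing the load-balancing strategy means picking one Invoker from the Invoker list.</p>
+<p>So, compared with an ordinary consistent-Hash implementation, Dubbo's consistent-Hash algorithm can likewise be split into two steps:</p>
+<p><strong>1. Map the Providers into the Hash value range (what is actually mapped is the Invoker);</strong></p>
+<p><strong>2. Map the request, then find the first Invoker whose Hash value is greater than or equal to the request's Hash value.</strong></p>
+<h4><strong>a. Mapping the Invokers</strong></h4>
+<p>All load-balancing implementation classes in Dubbo extend AbstractLoadBalance, so a call to LoadBalance's select method actually runs the implementation in AbstractLoadBalance:</p>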
+<pre><code class="language-java"><span class="hljs-meta">@Override</span>
+<span class="hljs-keyword">public</span> &lt;T&gt; <span class="hljs-function">Invoker&lt;T&gt; <span class="hljs-title">select</span><span class="hljs-params">(List&lt;Invoker&lt;T&gt;&gt; invokers, URL url, Invocation invocation)</span> </span>{
+    <span class="hljs-keyword">if</span> (CollectionUtils.isEmpty(invokers)) {
+        <span class="hljs-keyword">return</span> <span class="hljs-keyword">null</span>;
+    }
+    <span class="hljs-keyword">if</span> (invokers.size() == <span class="hljs-number">1</span>) {
+        <span class="hljs-keyword">return</span> invokers.get(<span class="hljs-number">0</span>);
+    }
+    <span class="hljs-comment">// doSelect is where the concrete load-balancing algorithm's logic begins</span>
+    <span class="hljs-keyword">return</span> doSelect(invokers, url, invocation);
+}
+</code></pre>
+<p>As you can see, doSelect is called here. The concrete class implementing Dubbo's consistent Hash is named <strong>ConsistentHashLoadBalance</strong>; let's look at what its doSelect method does:</p>
+<pre><code class="language-java"><span class="hljs-meta">@Override</span>
+<span class="hljs-keyword">protected</span> &lt;T&gt; <span class="hljs-function">Invoker&lt;T&gt; <span class="hljs-title">doSelect</span><span class="hljs-params">(List&lt;Invoker&lt;T&gt;&gt; invokers, URL url, Invocation invocation)</span> </span>{
+    String methodName = RpcUtils.getMethodName(invocation);
+    <span class="hljs-comment">// key format: interfaceName.methodName</span>
+    String key = invokers.get(<span class="hljs-number">0</span>).getUrl().getServiceKey() + <span class="hljs-string">"."</span> + methodName;
+    <span class="hljs-comment">// identityHashCode is used to detect whether the invokers list has changed</span>
+    <span class="hljs-keyword">int</span> identityHashCode = System.identityHashCode(invokers);
+    ConsistentHashSelector&lt;T&gt; selector = (ConsistentHashSelector&lt;T&gt;) selectors.get(key);
+    <span class="hljs-keyword">if</span> (selector == <span class="hljs-keyword">null</span> || selector.identityHashCode != identityHashCode) {
+        <span class="hljs-comment">// If no selector exists for "interface.methodName", or the Invoker list has changed, initialize a new selector</span>
+        selectors.put(key, <span class="hljs-keyword">new</span> ConsistentHashSelector&lt;T&gt;(invokers, methodName, identityHashCode));
+        selector = (ConsistentHashSelector&lt;T&gt;) selectors.get(key);
+    }
+    <span class="hljs-keyword">return</span> selector.select(invocation);
+}
+</code></pre>
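+<p>There is a very important concept here: <strong>the selector</strong>. In Dubbo's consistent-Hash implementation it is the data structure that holds the entire mapping. Its main fields are:</p>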
+<pre><code class="language-java"><span class="hljs-comment">/**
+ * TreeMap storing the mapping from Hash values to nodes
+ */</span>
+<span class="hljs-keyword">private</span> <span class="hljs-keyword">final</span> TreeMap&lt;Long, Invoker&lt;T&gt;&gt; virtualInvokers;
+
+<span class="hljs-comment">/**
+ * Number of nodes (created per Invoker)
+ */</span>
+<span class="hljs-keyword">private</span> <span class="hljs-keyword">final</span> <span class="hljs-keyword">int</span> replicaNumber;
+
+<span class="hljs-comment">/**
+ * Hash code used to detect whether the Invoker list has changed
+ */</span>
+<span class="hljs-keyword">private</span> <span class="hljs-keyword">final</span> <span class="hljs-keyword">int</span> identityHashCode;
+
+<span class="hljs-comment">/**
+ * Indexes of the request arguments used for the Hash mapping
+ */</span>
+<span class="hljs-keyword">private</span> <span class="hljs-keyword">final</span> <span class="hljs-keyword">int</span>[] argumentIndex;
+</code></pre>
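+<p>When a ConsistentHashSelector object is created, it iterates over all Invoker objects, computes the md5 digest of each one's address (ip+port), and initializes the service node and all of its virtual nodes according to the configured node count replicaNumber:</p>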
+<pre><code class="language-java">ConsistentHashSelector(List&lt;Invoker&lt;T&gt;&gt; invokers, String methodName, <span class="hljs-keyword">int</span> identityHashCode) {
+    <span class="hljs-keyword">this</span>.virtualInvokers = <span class="hljs-keyword">new</span> TreeMap&lt;Long, Invoker&lt;T&gt;&gt;();
+    <span class="hljs-keyword">this</span>.identityHashCode = identityHashCode;
+    URL url = invokers.get(<span class="hljs-number">0</span>).getUrl();
+    <span class="hljs-comment">// read the configured node count</span>
+    <span class="hljs-keyword">this</span>.replicaNumber = url.getMethodParameter(methodName, HASH_NODES, <span class="hljs-number">160</span>);
+    <span class="hljs-comment">// read the configured indexes of the arguments used for the Hash mapping</span>
+    String[] index = COMMA_SPLIT_PATTERN.split(url.getMethodParameter(methodName, HASH_ARGUMENTS, <span class="hljs-string">"0"</span>));
+    argumentIndex = <span class="hljs-keyword">new</span> <span class="hljs-keyword">int</span>[index.length];
+    <span class="hljs-keyword">for</span> (<span class="hljs-keyword">int</span> i = <span class="hljs-number">0</span>; i &lt; index.length; i++) {
+        argumentIndex[i] = Integer.parseInt(index[i]);
+    }
+    <span class="hljs-comment">// iterate over all Invoker objects</span>
+    <span class="hljs-keyword">for</span> (Invoker&lt;T&gt; invoker : invokers) {
+        <span class="hljs-comment">// get the Provider's ip+port</span>
+        String address = invoker.getUrl().getAddress();
+        <span class="hljs-keyword">for</span> (<span class="hljs-keyword">int</span> i = <span class="hljs-number">0</span>; i &lt; replicaNumber / <span class="hljs-number">4</span>; i++) {
+            <span class="hljs-keyword">byte</span>[] digest = md5(address + i);
+            <span class="hljs-keyword">for</span> (<span class="hljs-keyword">int</span> h = <span class="hljs-number">0</span>; h &lt; <span class="hljs-number">4</span>; h++) {
+                <span class="hljs-keyword">long</span> m = hash(digest, h);
+                virtualInvokers.put(m, invoker);
+            }
+        }
+    }
+}
+</code></pre>
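+<p>Worth noting here: taking the default replicaNumber of 160 as an example, and supposing the Invoker currently being visited has the address 127.0.0.1:20880, the code obtains the md5 digests of “127.0.0.1:208800”, “127.0.0.1:208801”, ..., “127.0.0.1:2088039” in turn (40 digests, because the loop runs replicaNumber / 4 times), and after obtaining each digest it hashes that digest four more times at the bit level. One can roughly guess that the aim is to strengthen the scattering effect. (I hope someone can tell me the theoretical basis for it.)</p>
+<p>In the code, <strong>virtualInvokers.put(m, invoker)</strong> stores the mapping between the Hash value just computed and the Invoker.</p>
+<p>Put simply, this code creates replicaNumber nodes for every Invoker; each mapping from a Hash value to an Invoker represents one node, and these mappings are stored in the TreeMap.</p>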
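The node-building loop above can be tried outside Dubbo with a minimal sketch. Everything named here is hypothetical: the class VirtualNodeRingSketch and the method buildRing are made up for illustration, and plain address strings stand in for Invoker objects. The body of hash is an assumption, since the post does not show it: it reads 4 of the 16 md5 digest bytes as one unsigned 32-bit value. Under that assumption, the four inner iterations simply reuse one 16-byte digest for four ring positions, which may be part of the answer to the question raised above.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;
import java.util.TreeMap;

// Hypothetical standalone sketch: address strings stand in for Dubbo's Invoker objects.
class VirtualNodeRingSketch {

    // An MD5 digest is always 16 bytes long.
    static byte[] md5(String value) {
        try {
            return MessageDigest.getInstance("MD5").digest(value.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Assumption: slice `number` (0..3) selects bytes [4*number, 4*number+3] of the digest
    // and reads them as an unsigned 32-bit little-endian value in [0, 2^32-1].
    static long hash(byte[] digest, int number) {
        return (((long) (digest[3 + number * 4] & 0xFF) << 24)
                | ((long) (digest[2 + number * 4] & 0xFF) << 16)
                | ((long) (digest[1 + number * 4] & 0xFF) << 8)
                | (digest[number * 4] & 0xFF))
                & 0xFFFFFFFFL;
    }

    // Same loop shape as the constructor above: replicaNumber / 4 digests per address,
    // four ring positions per digest.
    static TreeMap<Long, String> buildRing(List<String> addresses, int replicaNumber) {
        TreeMap<Long, String> ring = new TreeMap<>();
        for (String address : addresses) {
            for (int i = 0; i < replicaNumber / 4; i++) {
                byte[] digest = md5(address + i);
                for (int h = 0; h < 4; h++) {
                    ring.put(hash(digest, h), address);
                }
            }
        }
        return ring;
    }

    public static void main(String[] args) {
        TreeMap<Long, String> ring =
                buildRing(Arrays.asList("127.0.0.1:20880", "127.0.0.1:20881"), 160);
        // At most 2 * 160 entries; fewer only if two positions happen to collide.
        System.out.println("ring entries: " + ring.size());
    }
}
```

Because the ring is a TreeMap keyed by position, two virtual nodes that land on the same position overwrite each other, so the entry count is an upper bound rather than an exact figure.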
+<h4><strong>b. Mapping the request</strong></h4>
+<p>Let's go back to the <strong>doSelect</strong> method of ConsistentHashLoadBalance. If no selector is found a new one is created; once a selector is available, its select method is called:</p>
+<pre><code class="language-java"><span class="hljs-function"><span class="hljs-keyword">public</span> Invoker&lt;T&gt; <span class="hljs-title">select</span><span class="hljs-params">(Invocation invocation)</span> </span>{
+    <span class="hljs-comment">// build the key from the invocation's [argument values]; by default the first argument is used for the hash</span>
+    String key = toKey(invocation.getArguments());
+    <span class="hljs-comment">//  get the md5 digest of the [argument values]</span>
+    <span class="hljs-keyword">byte</span>[] digest = md5(key);
+    <span class="hljs-keyword">return</span> selectForKey(hash(digest, <span class="hljs-number">0</span>));
+}
+
+<span class="hljs-comment">// fetch the arguments by index and concatenate them into one string</span>
+<span class="hljs-function"><span class="hljs-keyword">private</span> String <span class="hljs-title">toKey</span><span class="hljs-params">(Object[] args)</span> </span>{
+    StringBuilder buf = <span class="hljs-keyword">new</span> StringBuilder();
+    <span class="hljs-keyword">for</span> (<span class="hljs-keyword">int</span> i : argumentIndex) {
+        <span class="hljs-keyword">if</span> (i &gt;= <span class="hljs-number">0</span> &amp;&amp; i &lt; args.length) {
+            buf.append(args[i]);
+        }
+    }
+    <span class="hljs-keyword">return</span> buf.toString();
+}
+
+<span class="hljs-comment">// find the Invoker from the Hash derived from the argument string's md5 digest</span>
+<span class="hljs-function"><span class="hljs-keyword">private</span> Invoker&lt;T&gt; <span class="hljs-title">selectForKey</span><span class="hljs-params">(<span class="hljs-keyword">long</span> hash)</span> </span>{
+    Map.Entry&lt;Long, Invoker&lt;T&gt;&gt; entry = virtualInvokers.ceilingEntry(hash);
+    <span class="hljs-keyword">if</span> (entry == <span class="hljs-keyword">null</span>) {
+        entry = virtualInvokers.firstEntry();
+    }
+    <span class="hljs-keyword">return</span> entry.getValue();
+}
+</code></pre>
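+<p>argumentIndex is assigned when the Selector is initialized; it indicates which request arguments are used in the Hash mapping that picks the Invoker. For example, given a method methodA(Integer a, Integer b, Integer c), if argumentIndex is {0,2}, the Hash value is computed from the string formed by concatenating a and c.</p>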
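The methodA example can be checked with a small sketch. The class name ArgumentKeySketch is hypothetical, and toKey takes argumentIndex as a parameter here (in the selector it is a field) so that the snippet stands alone.

```java
// Hypothetical standalone copy of the toKey logic, with argumentIndex passed in explicitly.
class ArgumentKeySketch {

    // Concatenates the selected arguments, in index order, with no separator.
    static String toKey(Object[] args, int[] argumentIndex) {
        StringBuilder buf = new StringBuilder();
        for (int i : argumentIndex) {
            // Indexes outside the argument array are silently skipped.
            if (i >= 0 && i < args.length) {
                buf.append(args[i]);
            }
        }
        return buf.toString();
    }

    public static void main(String[] args) {
        // methodA(1, 2, 3) with argumentIndex {0, 2} uses a and c.
        System.out.println(toKey(new Object[] {1, 2, 3}, new int[] {0, 2})); // prints 13
    }
}
```

One consequence of concatenating without a separator is that different argument lists can share a key: (12, 3) and (1, 23) both produce "123" when both arguments are selected, so such requests are routed to the same Invoker.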
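+<p>We already know that virtualInvokers is a TreeMap, and TreeMap is backed by a red-black tree. TreeMap's ceilingEntry(hash) method <strong>returns the first entry whose key is greater than or equal to the given value</strong>. As you can see, this is exactly the same logic as in a typical consistent-Hash algorithm.</p>
+<p>The wrap-around logic is slightly different, though. With modulo arithmetic, a value past the maximum automatically wraps around and starts again from 0; here, when no entry has a key greater than or equal to the value passed to ceilingEntry(), virtualInvokers.ceilingEntry(hash) returns null, so virtualInvokers.firstEntry() is used to fetch the first entry of the whole TreeMap.</p>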
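The lookup and the wrap-around can be seen on a tiny ring. This is a sketch with made-up positions (100, 200, 300) and node names; RingLookupSketch is a hypothetical class, and selectForKey takes the map as a parameter so that it can run on its own.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical sketch of the ceilingEntry / firstEntry lookup on a hand-built ring.
class RingLookupSketch {

    static String selectForKey(TreeMap<Long, String> ring, long hash) {
        // Least entry whose key is greater than or equal to hash, or null if there is none.
        Map.Entry<Long, String> entry = ring.ceilingEntry(hash);
        if (entry == null) {
            // Past the largest position: wrap around to the smallest one.
            entry = ring.firstEntry();
        }
        return entry.getValue();
    }

    public static void main(String[] args) {
        TreeMap<Long, String> ring = new TreeMap<>();
        ring.put(100L, "A");
        ring.put(200L, "B");
        ring.put(300L, "C");
        System.out.println(selectForKey(ring, 150L)); // prints B
        System.out.println(selectForKey(ring, 200L)); // prints B (an equal key matches)
        System.out.println(selectForKey(ring, 301L)); // prints A (wrap-around)
    }
}
```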
+<p>Once selectForKey has returned an Invoker, the load-balancing strategy has finished executing. The rest of the call flow, such as obtaining the remote call client, is not covered here.</p>
 </section><footer class="footer-container"><div class="footer-body"><img src="/img/dubbo_gray.png"/><img class="apache" src="/img/apache_logo.png"/><div class="cols-container"><div class="col col-12"><h3></h3><p></p></div><div class="col col-4"><dl><dt>ASF</dt><dd><a href="http://www.apache.org" target="_self">Foundation</a></dd><dd><a href="http://www.apache.org/licenses/" target="_self">License</a></dd><dd><a href="http://www.apache.org/events/current-event" target="_self">Events</a></dd><dd><a href=" [...]
 	<script src="https://f.alicdn.com/react/15.4.1/react-with-addons.min.js"></script>
 	<script src="https://f.alicdn.com/react/15.4.1/react-dom.min.js"></script>
diff --git a/zh-cn/blog/dubbo-consistent-hash-implementation.json b/zh-cn/blog/dubbo-consistent-hash-implementation.json
index 4b99bfc..b4ac8ad 100644
--- a/zh-cn/blog/dubbo-consistent-hash-implementation.json
+++ b/zh-cn/blog/dubbo-consistent-hash-implementation.json
@@ -1,6 +1,6 @@
 {
   "filename": "dubbo-consistent-hash-implementation.md",
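-  "__html": "<p>It should be stressed that Dubbo's Hash mapping model differs somewhat from the <strong>ring-queue Hash mapping model</strong> described in most online material. For me, the ring-queue Hash mapping model was not enough to give me a thorough understanding of consistent Hash. Only after I understood Dubbo's consistent-Hash implementation did everything click.</p>\n<h3>1. The ring-queue Hash mapping model</h3>\n<p>This scheme is still based on modulo arithmetic. Taking values modulo 2^32, the Hash value range is [0, 2^32-1]. What remains to be done has two parts:</p>\n<h4><strong>a. Mapping the services</strong></h4>\n<p>Build a specific identifier (such as an md5 digest) from the service address (ip+port) according to some rule, then take that identifier modulo 2^32 to determine the service's position in the Hash value range. Suppose there are three services, Node1, Node2 and Node3; their mapping is as follows:</p>\n<p><img src=\"../../img/blog/consistenthash/consistent-hash-init-model.jpg\" alt=\"Init\">< [...]
+  "__html": "<p>It should be stressed that Dubbo's Hash mapping model differs somewhat from the <strong>ring-queue Hash mapping model</strong> described in most online material. For me, the ring-queue Hash mapping model was not enough to give me a thorough understanding of consistent Hash. Only after I understood Dubbo's consistent-Hash implementation did everything click.</p>\n<h3>1. The ring-queue Hash mapping model</h3>\n<p>This scheme is still based on modulo arithmetic. Taking values modulo 2^32, the Hash value range is [0, 2^32-1]. What remains to be done has two parts:</p>\n<h4><strong>a. Mapping the services</strong></h4>\n<p>Build a specific identifier (such as an md5 digest) from the service address (ip+port) according to some rule, then take that identifier modulo 2^32 to determine the service's position in the Hash value range. Suppose there are three services, Node1, Node2 and Node3; their mapping is as follows:</p>\n<p><img src=\"../../img/blog/consistenthash/consistent-hash-init-model.jpg\" alt=\"Init\">< [...]
   "link": "/zh-cn/blog/dubbo-consistent-hash-implementation.html",
   "meta": {
     "title": "An analysis of Dubbo's consistent-Hash load-balancing implementation",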