Posted to commits@beam.apache.org by dh...@apache.org on 2017/04/17 18:36:12 UTC

[2/2] beam-site git commit: Rebuild website after merge

Rebuild website after merge


Project: http://git-wip-us.apache.org/repos/asf/beam-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/beam-site/commit/33b13882
Tree: http://git-wip-us.apache.org/repos/asf/beam-site/tree/33b13882
Diff: http://git-wip-us.apache.org/repos/asf/beam-site/diff/33b13882

Branch: refs/heads/asf-site
Commit: 33b13882e0008ae0c4605d47b4e7a2786a3ea88a
Parents: 0910783
Author: Dan Halperin <dh...@google.com>
Authored: Mon Apr 17 11:36:05 2017 -0700
Committer: Dan Halperin <dh...@google.com>
Committed: Mon Apr 17 11:36:05 2017 -0700

----------------------------------------------------------------------
 .../documentation/programming-guide/index.html  | 28 ++++++++++++++++++++
 1 file changed, 28 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/beam-site/blob/33b13882/content/documentation/programming-guide/index.html
----------------------------------------------------------------------
diff --git a/content/documentation/programming-guide/index.html b/content/documentation/programming-guide/index.html
index 8e56108..d7b1253 100644
--- a/content/documentation/programming-guide/index.html
+++ b/content/documentation/programming-guide/index.html
@@ -651,6 +651,34 @@ tree, [2]
 
 <p>Thus, <code class="highlighter-rouge">GroupByKey</code> represents a transform from a multimap (multiple keys to individual values) to a uni-map (unique keys to collections of values).</p>
 
+<h5 id="joins-with-cogroupbykey"><strong>Joins with CoGroupByKey</strong></h5>
+
+<p><code class="highlighter-rouge">CoGroupByKey</code> joins two or more key/value <code class="highlighter-rouge">PCollection</code>s that have the same key type, and then emits a collection of <code class="highlighter-rouge">KV&lt;K, CoGbkResult&gt;</code> pairs. <a href="/documentation/pipelines/design-your-pipeline/#multiple-sources">Design Your Pipeline</a> shows an example pipeline that uses a join.</p>
+
+<p>Given the input collections below:</p>
+<div class="highlighter-rouge"><pre class="highlight"><code>// collection 1
+user1, address1
+user2, address2
+user3, address3
+
+// collection 2
+user1, order1
+user1, order2
+user2, order3
+guest, order4
+...
+</code></pre>
+</div>
+
+<p><code class="highlighter-rouge">CoGroupByKey</code> gathers the values that share a key across all input <code class="highlighter-rouge">PCollection</code>s and outputs one pair per unique key, consisting of that key and a <code class="highlighter-rouge">CoGbkResult</code> object that holds the values associated with the key, grouped by the input collection they came from. If you apply <code class="highlighter-rouge">CoGroupByKey</code> to the input collections above, the output collection looks like this:</p>
+<div class="highlighter-rouge"><pre class="highlight"><code>user1, [[address1], [order1, order2]]
+user2, [[address2], [order3]]
+user3, [[address3], []]
+guest, [[], [order4]]
+...
+</code></pre>
+</div>
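
The grouping behaviour described above can be modelled in a few lines of plain Python. This is only a sketch of the semantics, not the Beam API: the helper name `co_group_by_key` is hypothetical, the inputs are ordinary lists of 2-tuples rather than `PCollection`s, and only the rows shown in the example (not the elided ones) are used.

```python
from collections import defaultdict


def co_group_by_key(*collections):
    """Plain-Python model of a co-grouping join (hypothetical helper, not Beam).

    Each input is an iterable of (key, value) pairs. The result maps every key
    seen in any input to a list holding one list of values per input.
    """
    # One empty value list per input collection for each newly seen key.
    grouped = defaultdict(lambda: [[] for _ in collections])
    for index, pairs in enumerate(collections):
        for key, value in pairs:
            grouped[key][index].append(value)
    return dict(grouped)


# Only the rows listed in the example above; the elided rows are left out.
addresses = [("user1", "address1"), ("user2", "address2"), ("user3", "address3")]
orders = [("user1", "order1"), ("user1", "order2"),
          ("user2", "order3"), ("guest", "order4")]

result = co_group_by_key(addresses, orders)
for key, groups in result.items():
    print(key, groups)
```

A key that is missing from one input still appears in the result with an empty list in that position, which matches the `user3` and `guest` rows in the output above.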
+
 <blockquote>
   <p><strong>A Note on Key/Value Pairs:</strong> Beam represents key/value pairs slightly differently depending on the language and SDK you’re using. In the Beam SDK for Java, you represent a key/value pair with an object of type <code class="highlighter-rouge">KV&lt;K, V&gt;</code>. In Python, you represent key/value pairs with 2-tuples.</p>
 </blockquote>