Posted to commits@beam.apache.org by gi...@apache.org on 2021/12/07 06:04:17 UTC

[beam] branch asf-site updated: Publishing website 2021/12/07 06:03:36 at commit 6d63a70

This is an automated email from the ASF dual-hosted git repository.

git-site-role pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new a89b07d  Publishing website 2021/12/07 06:03:36 at commit 6d63a70
a89b07d is described below

commit a89b07df119671d4dcb216913f33723d82798d96
Author: jenkins <bu...@apache.org>
AuthorDate: Tue Dec 7 06:03:37 2021 +0000

    Publishing website 2021/12/07 06:03:36 at commit 6d63a70
---
 .../documentation/basics/index.html                |  71 +++++++++++---
 website/generated-content/documentation/index.xml  | 107 ++++++++++++++++++---
 .../documentation/programming-guide/index.html     |  11 ++-
 website/generated-content/sitemap.xml              |   2 +-
 4 files changed, 157 insertions(+), 34 deletions(-)

diff --git a/website/generated-content/documentation/basics/index.html b/website/generated-content/documentation/basics/index.html
index 89478b1..d74214d 100644
--- a/website/generated-content/documentation/basics/index.html
+++ b/website/generated-content/documentation/basics/index.html
@@ -18,7 +18,7 @@
 function addPlaceholder(){$('input:text').attr('placeholder',"What are you looking for?");}
 function endSearch(){var search=document.querySelector(".searchBar");search.classList.add("disappear");var icons=document.querySelector("#iconsBar");icons.classList.remove("disappear");}
 function blockScroll(){$("body").toggleClass("fixedPosition");}
-function openMenu(){addPlaceholder();blockScroll();}</script><div class="clearfix container-main-content"><div class="section-nav closed" data-offset-top=90 data-offset-bottom=500><span class="section-nav-back glyphicon glyphicon-menu-left"></span><nav><ul class=section-nav-list data-section-nav><li><span class=section-nav-list-main-title>Documentation</span></li><li><a href=/documentation>Using the Documentation</a></li><li class=section-nav-item--collapsible><span class=section-nav-lis [...]
+function openMenu(){addPlaceholder();blockScroll();}</script><div class="clearfix container-main-content"><div class="section-nav closed" data-offset-top=90 data-offset-bottom=500><span class="section-nav-back glyphicon glyphicon-menu-left"></span><nav><ul class=section-nav-list data-section-nav><li><span class=section-nav-list-main-title>Documentation</span></li><li><a href=/documentation>Using the Documentation</a></li><li class=section-nav-item--collapsible><span class=section-nav-lis [...]
 data-parallel processing pipelines. To get started with Beam, you&rsquo;ll need to
 understand an important set of core concepts:</p><ul><li><a href=#pipeline><em>Pipeline</em></a> - A pipeline is a user-constructed graph of
 transformations that defines the desired data processing operations.</li><li><a href=#pcollection><em>PCollection</em></a> - A <code>PCollection</code> is a data set or data
@@ -33,7 +33,13 @@ a <code>PCollection</code>. The schema for a <code>PCollection</code> defines el
 <code>PCollection</code> as an ordered list of named fields.</li><li><a href=/documentation/sdks/java/><em>SDK</em></a> - A language-specific library that lets
 pipeline authors build transforms, construct their pipelines, and submit
 them to a runner.</li><li><a href=#runner><em>Runner</em></a> - A runner runs a Beam pipeline using the capabilities of
-your chosen data processing engine.</li><li><a href=#trigger><em>Trigger</em></a> - A trigger determines when to aggregate the results of
+your chosen data processing engine.</li><li><a href=#window><em>Window</em></a> - A <code>PCollection</code> can be subdivided into windows based on
+the timestamps of the individual elements. Windows enable grouping operations
+over collections that grow over time by dividing the collection into windows
+of finite collections.</li><li><a href=#watermark><em>Watermark</em></a> - A watermark is a guess as to when all data in a
+certain window is expected to have arrived. This is needed because data isn’t
+always guaranteed to arrive in a pipeline in time order, or to always arrive
+at predictable intervals.</li><li><a href=#trigger><em>Trigger</em></a> - A trigger determines when to aggregate the results of
 each window.</li><li><a href=#state-and-timers><em>State and timers</em></a> - Per-key state and timer callbacks
 are lower level primitives that give you full control over aggregating input
 collections that grow over time.</li><li><a href=#splittable-dofn><em>Splittable DoFn</em></a> - Splittable DoFns let you process
@@ -95,18 +101,17 @@ responsible for providing initial timestamps. The runner must propagate and
 aggregate timestamps. If the timestamp is not important, such as with certain
 batch processing jobs where elements do not denote events, the timestamp will be
 the minimum representable timestamp, often referred to colloquially as &ldquo;negative
-infinity&rdquo;.</p><h4 id=watermarks>Watermarks</h4><p>Every <code>PCollection</code> must have a watermark that estimates how complete the
-<code>PCollection</code> is.</p><p>The watermark is a guess that &ldquo;we&rsquo;ll never see an element with an earlier
+infinity&rdquo;.</p><h4 id=watermarks>Watermarks</h4><p>Every <code>PCollection</code> must have a <a href=#watermark>watermark</a> that estimates how
+complete the <code>PCollection</code> is.</p><p>The watermark is a guess that &ldquo;we&rsquo;ll never see an element with an earlier
 timestamp&rdquo;. Data sources are responsible for producing a watermark. The runner
 must implement watermark propagation as PCollections are processed, merged, and
 partitioned.</p><p>The contents of a <code>PCollection</code> are complete when a watermark advances to
 &ldquo;infinity&rdquo;. In this manner, you can discover that an unbounded PCollection is
-finite.</p><h4 id=windowed-elements>Windowed elements</h4><p>Every element in a <code>PCollection</code> resides in a window. No element resides in
-multiple windows; two elements can be equal except for their window, but they
-are not the same.</p><p>When elements are read from the outside world, they arrive in the global window.
-When they are written to the outside world, they are effectively placed back
+finite.</p><h4 id=windowed-elements>Windowed elements</h4><p>Every element in a <code>PCollection</code> resides in a <a href=#window>window</a>. No element
+resides in multiple windows; two elements can be equal except for their window,
+but they are not the same.</p><p>When elements are written to the outside world, they are effectively placed back
 into the global window. Transforms that write data and don&rsquo;t take this
-perspective probably risks data loss.</p><p>A window has a maximum timestamp. When the watermark exceeds the maximum
+perspective risk data loss.</p><p>A window has a maximum timestamp. When the watermark exceeds the maximum
 timestamp plus the user-specified allowed lateness, the window is expired. All
 data related to an expired window might be discarded at any time.</p><h4 id=coder>Coder</h4><p>Every <code>PCollection</code> has a coder, which is a specification of the binary format
 of the elements.</p><p>In Beam, the user&rsquo;s pipeline can be written in a language other than the
@@ -150,8 +155,8 @@ the transform. For example, when using <code>ParDo</code>, user-defined code spe
 operation to apply to every element. For <code>Combine</code>, it specifies how values
 should be combined. By using <a href=/documentation/patterns/cross-language/>cross-language transforms</a>,
 a Beam pipeline can contain UDFs written in a different language, or even
-multiple languages in the same pipeline.</p><p>Beam has several varieties of UDFs:</p><ul><li><a href=/programming-guide/#pardo><em>DoFn</em></a> - per-element processing function (used
-in <code>ParDo</code>)</li><li><a href=/programming-guide/#setting-your-pcollections-windowing-function><em>WindowFn</em></a> -
+multiple languages in the same pipeline.</p><p>Beam has several varieties of UDFs:</p><ul><li><a href=/documentation/programming-guide/#pardo><em>DoFn</em></a> - per-element processing
+function (used in <code>ParDo</code>)</li><li><a href=/documentation/programming-guide/#setting-your-pcollections-windowing-function><em>WindowFn</em></a> -
 places elements in windows and merges windows (used in <code>Window</code> and
 <code>GroupByKey</code>)</li><li><a href=/documentation/programming-guide/#side-inputs><em>ViewFn</em></a> - adapts a
 materialized <code>PCollection</code> to a particular interface (used in side inputs)</li><li><a href=/documentation/programming-guide/#side-inputs-windowing><em>WindowMappingFn</em></a> -
@@ -167,7 +172,7 @@ without communicating or sharing state with any of the other copies. Each copy
 of your user code function might be retried or run multiple times, depending on
 the pipeline runner and the processing backend that you choose for your
 pipeline. Beam also supports stateful processing through the
-<a href=/blog/stateful-processing/>stateful processing API</a>.</p><p>For more information about user-defined functions, see the following pages:</p><ul><li><a href=/documentation/programming-guide/#requirements-for-writing-user-code-for-beam-transforms>Requirements for writing user code for Beam transforms</a></li><li><a href=/documentation/programming-guide/#pardo>Beam Programming Guide: ParDo</a></li><li><a href=/programming-guide/#setting-your-pcollections-windowing-function>Beam Pro [...]
+<a href=/blog/stateful-processing/>stateful processing API</a>.</p><p>For more information about user-defined functions, see the following pages:</p><ul><li><a href=/documentation/programming-guide/#requirements-for-writing-user-code-for-beam-transforms>Requirements for writing user code for Beam transforms</a></li><li><a href=/documentation/programming-guide/#pardo>Beam Programming Guide: ParDo</a></li><li><a href=/documentation/programming-guide/#setting-your-pcollections-windowing-fun [...]
 schema for a <code>PCollection</code> defines elements of that <code>PCollection</code> as an ordered
 list of named fields. Each field has a name, a type, and possibly a set of user
 options.</p><p>In many cases, the element type in a <code>PCollection</code> has a structure that can be
@@ -188,7 +193,45 @@ Flink runner translates a Beam pipeline into a Flink job. The Direct Runner runs
 pipelines locally so you can test, debug, and validate that your pipeline
 adheres to the Apache Beam model as closely as possible.</p><p>For an up-to-date list of Beam runners and which features of the Apache Beam
 model they support, see the runner
-<a href=/documentation/runners/capability-matrix/>capability matrix</a>.</p><p>For more information about runners, see the following pages:</p><ul><li><a href=/documentation/#choosing-a-runner>Choosing a Runner</a></li><li><a href=/documentation/runners/capability-matrix/>Beam Capability Matrix</a></li></ul><h3 id=trigger>Trigger</h3><p>When collecting and grouping data into windows, Beam uses <em>triggers</em> to
+<a href=/documentation/runners/capability-matrix/>capability matrix</a>.</p><p>For more information about runners, see the following pages:</p><ul><li><a href=/documentation/#choosing-a-runner>Choosing a Runner</a></li><li><a href=/documentation/runners/capability-matrix/>Beam Capability Matrix</a></li></ul><h3 id=window>Window</h3><p>Windowing subdivides a <code>PCollection</code> into <em>windows</em> according to the timestamps
+of its individual elements. Windows enable grouping operations over unbounded
+collections by dividing the collection into windows of finite collections.</p><p>A <em>windowing function</em> tells the runner how to assign elements to one or more
+initial windows, and how to merge windows of grouped elements. Each element in a
+<code>PCollection</code> can only be in one window, so if a windowing function specifies
+multiple windows for an element, the element is conceptually duplicated into
+each of the windows and each element is identical except for its window.</p><p>Transforms that aggregate multiple elements, such as <code>GroupByKey</code> and <code>Combine</code>,
+work implicitly on a per-window basis; they process each <code>PCollection</code> as a
+succession of multiple, finite windows, though the entire collection itself may
+be of unbounded size.</p><p>Beam provides several windowing functions:</p><ul><li><strong>Fixed time windows</strong> (also known as &ldquo;tumbling windows&rdquo;) represent a consistent
+duration, non-overlapping time interval in the data stream.</li><li><strong>Sliding time windows</strong> (also known as &ldquo;hopping windows&rdquo;) also represent time
+intervals in the data stream; however, sliding time windows can overlap.</li><li><strong>Per-session windows</strong> define windows that contain elements that are within a
+certain gap duration of another element.</li><li><strong>Single global window</strong>: by default, all data in a <code>PCollection</code> is assigned to
+the single global window, and late data is discarded.</li><li><strong>Calendar-based windows</strong> (not supported by the Beam SDK for Python)</li></ul><p>You can also define your own windowing function if you have more complex
+requirements.</p><p>For example, let&rsquo;s say we have a <code>PCollection</code> that uses fixed-time windowing,
+with windows that are five minutes long. For each window, Beam must collect all
+the data with an event time timestamp in the given window range (between 0:00
+and 4:59 in the first window, for instance). Data with timestamps outside that
+range (data from 5:00 or later) belongs to a different window.</p><p>Two concepts are closely related to windowing and covered in the following
+sections: <a href=#watermark>watermarks</a> and <a href=#trigger>triggers</a>.</p><p>For more information about windows, see the following page:</p><ul><li><a href=/documentation/programming-guide/#windowing>Beam Programming Guide: Windowing</a></li><li><a href=/documentation/programming-guide/#setting-your-pcollections-windowing-function>Beam Programming Guide: WindowFn</a></li></ul><h3 id=watermark>Watermark</h3><p>In any data processing system, there is a certain amount of lag between [...]
+a data event occurs (the “event time”, determined by the timestamp on the data
+element itself) and the time the actual data element gets processed at any stage
+in your pipeline (the “processing time”, determined by the clock on the system
+processing the element). In addition, data isn’t always guaranteed to arrive in
+a pipeline in time order, or to always arrive at predictable intervals. For
+example, you might have intermediate systems that don&rsquo;t preserve order, or you
+might have two servers that timestamp data but one has a better network
+connection.</p><p>To address this potential unpredictability, Beam tracks a <em>watermark</em>. A
+watermark is a guess as to when all data in a certain window is expected to have
+arrived in the pipeline. You can also think of this as “we’ll never see an
+element with an earlier timestamp”.</p><p>Data sources are responsible for producing a watermark, and every <code>PCollection</code>
+must have a watermark that estimates how complete the <code>PCollection</code> is. The
+contents of a <code>PCollection</code> are complete when a watermark advances to
+“infinity”. In this manner, you might discover that an unbounded <code>PCollection</code>
+is finite. After the watermark progresses past the end of a window, any further
+element that arrives with a timestamp in that window is considered <em>late data</em>.</p><p>Triggers are a related concept that allow you to modify and refine the windowing
+strategy for a <code>PCollection</code>. You can use triggers to decide when each
+individual window aggregates and reports its results, including how the window
+emits late elements.</p><p>For more information about watermarks, see the following page:</p><ul><li><a href=/documentation/programming-guide/#watermarks-and-late-data>Beam Programming Guide: Watermarks and late data</a></li></ul><h3 id=trigger>Trigger</h3><p>When collecting and grouping data into windows, Beam uses <em>triggers</em> to
 determine when to emit the aggregated results of each window (referred to as a
 <em>pane</em>). If you use Beam’s default windowing configuration and default trigger,
 Beam outputs the aggregated result when it estimates all data has arrived, and
@@ -259,7 +302,7 @@ checkpoint the sub-element and the runner repeats step 2.</li></ol><p>You can al
 processing. For example, if you write a splittable <code>DoFn</code> to watch a set of
 directories and output filenames as they arrive, you can split to subdivide the
 work of different directories. This allows the runner to split off a hot
-directory and give it additional resources.</p><p>For more information about Splittable <code>DoFn</code>, see the following pages:</p><ul><li><a href=/documentation/programming-guide/#splittable-dofns>Splittable DoFns</a></li><li><a href=/blog/splittable-do-fn-is-available/>Splittable DoFn in Apache Beam is Ready to Use</a></li></ul><div class=feedback><p class=update>Last updated on 2021/10/21</p><h3>Have you found everything you were looking for?</h3><p class=description>Was it all us [...]
+directory and give it additional resources.</p><p>For more information about Splittable <code>DoFn</code>, see the following pages:</p><ul><li><a href=/documentation/programming-guide/#splittable-dofns>Splittable DoFns</a></li><li><a href=/blog/splittable-do-fn-is-available/>Splittable DoFn in Apache Beam is Ready to Use</a></li></ul><div class=feedback><p class=update>Last updated on 2021/12/06</p><h3>Have you found everything you were looking for?</h3><p class=description>Was it all us [...]
 <a href=http://www.apache.org>The Apache Software Foundation</a>
 | <a href=/privacy_policy>Privacy Policy</a>
 | <a href=/feed.xml>RSS Feed</a><br><br>Apache Beam, Apache, Beam, the Beam logo, and the Apache feather logo are either registered trademarks or trademarks of The Apache Software Foundation. All other products or name brands are trademarks of their respective holders, including The Apache Software Foundation.</div></div></div></div></footer></body></html>
\ No newline at end of file
diff --git a/website/generated-content/documentation/index.xml b/website/generated-content/documentation/index.xml
index fc279a0..53c27a4 100644
--- a/website/generated-content/documentation/index.xml
+++ b/website/generated-content/documentation/index.xml
@@ -3205,6 +3205,14 @@ pipeline authors build transforms, construct their pipelines, and submit
 them to a runner.&lt;/li>
 &lt;li>&lt;a href="#runner">&lt;em>Runner&lt;/em>&lt;/a> - A runner runs a Beam pipeline using the capabilities of
 your chosen data processing engine.&lt;/li>
+&lt;li>&lt;a href="#window">&lt;em>Window&lt;/em>&lt;/a> - A &lt;code>PCollection&lt;/code> can be subdivided into windows based on
+the timestamps of the individual elements. Windows enable grouping operations
+over collections that grow over time by dividing the collection into windows
+of finite collections.&lt;/li>
+&lt;li>&lt;a href="#watermark">&lt;em>Watermark&lt;/em>&lt;/a> - A watermark is a guess as to when all data in a
+certain window is expected to have arrived. This is needed because data isn’t
+always guaranteed to arrive in a pipeline in time order, or to always arrive
+at predictable intervals.&lt;/li>
 &lt;li>&lt;a href="#trigger">&lt;em>Trigger&lt;/em>&lt;/a> - A trigger determines when to aggregate the results of
 each window.&lt;/li>
 &lt;li>&lt;a href="#state-and-timers">&lt;em>State and timers&lt;/em>&lt;/a> - Per-key state and timer callbacks
@@ -3316,8 +3324,8 @@ batch processing jobs where elements do not denote events, the timestamp will be
 the minimum representable timestamp, often referred to colloquially as &amp;ldquo;negative
 infinity&amp;rdquo;.&lt;/p>
 &lt;h4 id="watermarks">Watermarks&lt;/h4>
-&lt;p>Every &lt;code>PCollection&lt;/code> must have a watermark that estimates how complete the
-&lt;code>PCollection&lt;/code> is.&lt;/p>
+&lt;p>Every &lt;code>PCollection&lt;/code> must have a &lt;a href="#watermark">watermark&lt;/a> that estimates how
+complete the &lt;code>PCollection&lt;/code> is.&lt;/p>
 &lt;p>The watermark is a guess that &amp;ldquo;we&amp;rsquo;ll never see an element with an earlier
 timestamp&amp;rdquo;. Data sources are responsible for producing a watermark. The runner
 must implement watermark propagation as PCollections are processed, merged, and
@@ -3326,13 +3334,12 @@ partitioned.&lt;/p>
 &amp;ldquo;infinity&amp;rdquo;. In this manner, you can discover that an unbounded PCollection is
 finite.&lt;/p>
 &lt;h4 id="windowed-elements">Windowed elements&lt;/h4>
-&lt;p>Every element in a &lt;code>PCollection&lt;/code> resides in a window. No element resides in
-multiple windows; two elements can be equal except for their window, but they
-are not the same.&lt;/p>
-&lt;p>When elements are read from the outside world, they arrive in the global window.
-When they are written to the outside world, they are effectively placed back
+&lt;p>Every element in a &lt;code>PCollection&lt;/code> resides in a &lt;a href="#window">window&lt;/a>. No element
+resides in multiple windows; two elements can be equal except for their window,
+but they are not the same.&lt;/p>
+&lt;p>When elements are written to the outside world, they are effectively placed back
 into the global window. Transforms that write data and don&amp;rsquo;t take this
-perspective probably risks data loss.&lt;/p>
+perspective risk data loss.&lt;/p>
 &lt;p>A window has a maximum timestamp. When the watermark exceeds the maximum
 timestamp plus the user-specified allowed lateness, the window is expired. All
 data related to an expired window might be discarded at any time.&lt;/p>
@@ -3410,9 +3417,9 @@ a Beam pipeline can contain UDFs written in a different language, or even
 multiple languages in the same pipeline.&lt;/p>
 &lt;p>Beam has several varieties of UDFs:&lt;/p>
 &lt;ul>
-&lt;li>&lt;a href="/programming-guide/#pardo">&lt;em>DoFn&lt;/em>&lt;/a> - per-element processing function (used
-in &lt;code>ParDo&lt;/code>)&lt;/li>
-&lt;li>&lt;a href="/programming-guide/#setting-your-pcollections-windowing-function">&lt;em>WindowFn&lt;/em>&lt;/a> -
+&lt;li>&lt;a href="/documentation/programming-guide/#pardo">&lt;em>DoFn&lt;/em>&lt;/a> - per-element processing
+function (used in &lt;code>ParDo&lt;/code>)&lt;/li>
+&lt;li>&lt;a href="/documentation/programming-guide/#setting-your-pcollections-windowing-function">&lt;em>WindowFn&lt;/em>&lt;/a> -
 places elements in windows and merges windows (used in &lt;code>Window&lt;/code> and
 &lt;code>GroupByKey&lt;/code>)&lt;/li>
 &lt;li>&lt;a href="/documentation/programming-guide/#side-inputs">&lt;em>ViewFn&lt;/em>&lt;/a> - adapts a
@@ -3439,7 +3446,7 @@ pipeline. Beam also supports stateful processing through the
 &lt;ul>
 &lt;li>&lt;a href="/documentation/programming-guide/#requirements-for-writing-user-code-for-beam-transforms">Requirements for writing user code for Beam transforms&lt;/a>&lt;/li>
 &lt;li>&lt;a href="/documentation/programming-guide/#pardo">Beam Programming Guide: ParDo&lt;/a>&lt;/li>
-&lt;li>&lt;a href="/programming-guide/#setting-your-pcollections-windowing-function">Beam Programming Guide: WindowFn&lt;/a>&lt;/li>
+&lt;li>&lt;a href="/documentation/programming-guide/#setting-your-pcollections-windowing-function">Beam Programming Guide: WindowFn&lt;/a>&lt;/li>
 &lt;li>&lt;a href="/documentation/programming-guide/#combine">Beam Programming Guide: CombineFn&lt;/a>&lt;/li>
 &lt;li>&lt;a href="/documentation/programming-guide/#data-encoding-and-type-safety">Beam Programming Guide: Coder&lt;/a>&lt;/li>
 &lt;li>&lt;a href="/documentation/programming-guide/#side-inputs">Beam Programming Guide: Side inputs&lt;/a>&lt;/li>
@@ -3482,6 +3489,73 @@ model they support, see the runner
 &lt;li>&lt;a href="/documentation/#choosing-a-runner">Choosing a Runner&lt;/a>&lt;/li>
 &lt;li>&lt;a href="/documentation/runners/capability-matrix/">Beam Capability Matrix&lt;/a>&lt;/li>
 &lt;/ul>
+&lt;h3 id="window">Window&lt;/h3>
+&lt;p>Windowing subdivides a &lt;code>PCollection&lt;/code> into &lt;em>windows&lt;/em> according to the timestamps
+of its individual elements. Windows enable grouping operations over unbounded
+collections by dividing the collection into windows of finite collections.&lt;/p>
+&lt;p>A &lt;em>windowing function&lt;/em> tells the runner how to assign elements to one or more
+initial windows, and how to merge windows of grouped elements. Each element in a
+&lt;code>PCollection&lt;/code> can only be in one window, so if a windowing function specifies
+multiple windows for an element, the element is conceptually duplicated into
+each of the windows and each element is identical except for its window.&lt;/p>
+&lt;p>Transforms that aggregate multiple elements, such as &lt;code>GroupByKey&lt;/code> and &lt;code>Combine&lt;/code>,
+work implicitly on a per-window basis; they process each &lt;code>PCollection&lt;/code> as a
+succession of multiple, finite windows, though the entire collection itself may
+be of unbounded size.&lt;/p>
+&lt;p>Beam provides several windowing functions:&lt;/p>
+&lt;ul>
+&lt;li>&lt;strong>Fixed time windows&lt;/strong> (also known as &amp;ldquo;tumbling windows&amp;rdquo;) represent a consistent
+duration, non-overlapping time interval in the data stream.&lt;/li>
+&lt;li>&lt;strong>Sliding time windows&lt;/strong> (also known as &amp;ldquo;hopping windows&amp;rdquo;) also represent time
+intervals in the data stream; however, sliding time windows can overlap.&lt;/li>
+&lt;li>&lt;strong>Per-session windows&lt;/strong> define windows that contain elements that are within a
+certain gap duration of another element.&lt;/li>
+&lt;li>&lt;strong>Single global window&lt;/strong>: by default, all data in a &lt;code>PCollection&lt;/code> is assigned to
+the single global window, and late data is discarded.&lt;/li>
+&lt;li>&lt;strong>Calendar-based windows&lt;/strong> (not supported by the Beam SDK for Python)&lt;/li>
+&lt;/ul>
+&lt;p>You can also define your own windowing function if you have more complex
+requirements.&lt;/p>
+&lt;p>For example, let&amp;rsquo;s say we have a &lt;code>PCollection&lt;/code> that uses fixed-time windowing,
+with windows that are five minutes long. For each window, Beam must collect all
+the data with an event time timestamp in the given window range (between 0:00
+and 4:59 in the first window, for instance). Data with timestamps outside that
+range (data from 5:00 or later) belongs to a different window.&lt;/p>
+&lt;p>Two concepts are closely related to windowing and covered in the following
+sections: &lt;a href="#watermark">watermarks&lt;/a> and &lt;a href="#trigger">triggers&lt;/a>.&lt;/p>
+&lt;p>For more information about windows, see the following page:&lt;/p>
+&lt;ul>
+&lt;li>&lt;a href="/documentation/programming-guide/#windowing">Beam Programming Guide: Windowing&lt;/a>&lt;/li>
+&lt;li>&lt;a href="/documentation/programming-guide/#setting-your-pcollections-windowing-function">Beam Programming Guide: WindowFn&lt;/a>&lt;/li>
+&lt;/ul>
+&lt;h3 id="watermark">Watermark&lt;/h3>
+&lt;p>In any data processing system, there is a certain amount of lag between the time
+a data event occurs (the “event time”, determined by the timestamp on the data
+element itself) and the time the actual data element gets processed at any stage
+in your pipeline (the “processing time”, determined by the clock on the system
+processing the element). In addition, data isn’t always guaranteed to arrive in
+a pipeline in time order, or to always arrive at predictable intervals. For
+example, you might have intermediate systems that don&amp;rsquo;t preserve order, or you
+might have two servers that timestamp data but one has a better network
+connection.&lt;/p>
+&lt;p>To address this potential unpredictability, Beam tracks a &lt;em>watermark&lt;/em>. A
+watermark is a guess as to when all data in a certain window is expected to have
+arrived in the pipeline. You can also think of this as “we’ll never see an
+element with an earlier timestamp”.&lt;/p>
+&lt;p>Data sources are responsible for producing a watermark, and every &lt;code>PCollection&lt;/code>
+must have a watermark that estimates how complete the &lt;code>PCollection&lt;/code> is. The
+contents of a &lt;code>PCollection&lt;/code> are complete when a watermark advances to
+“infinity”. In this manner, you might discover that an unbounded &lt;code>PCollection&lt;/code>
+is finite. After the watermark progresses past the end of a window, any further
+element that arrives with a timestamp in that window is considered &lt;em>late data&lt;/em>.&lt;/p>
+&lt;p>Triggers are a related concept that allow you to modify and refine the windowing
+strategy for a &lt;code>PCollection&lt;/code>. You can use triggers to decide when each
+individual window aggregates and reports its results, including how the window
+emits late elements.&lt;/p>
+&lt;p>For more information about watermarks, see the following page:&lt;/p>
+&lt;ul>
+&lt;li>&lt;a href="/documentation/programming-guide/#watermarks-and-late-data">Beam Programming Guide: Watermarks and late data&lt;/a>&lt;/li>
+&lt;/ul>
 &lt;h3 id="trigger">Trigger&lt;/h3>
 &lt;p>When collecting and grouping data into windows, Beam uses &lt;em>triggers&lt;/em> to
 determine when to emit the aggregated results of each window (referred to as a
@@ -8893,9 +8967,12 @@ window.&lt;/p>
 &lt;/ul>
 &lt;p>You can also define your own &lt;code>WindowFn&lt;/code> if you have a more complex need.&lt;/p>
 &lt;p>Note that each element can logically belong to more than one window, depending
-on the windowing function you use. Sliding time windowing, for example, creates
-overlapping windows wherein a single element can be assigned to multiple
-windows.&lt;/p>
+on the windowing function you use. Sliding time windowing, for example, can
+create overlapping windows wherein a single element can be assigned to multiple
+windows. However, each element in a &lt;code>PCollection&lt;/code> can only be in one window, so
+if an element is assigned to multiple windows, the element is conceptually
+duplicated into each of the windows and each element is identical except for its
+window.&lt;/p>
 &lt;h4 id="fixed-time-windows">8.2.1. Fixed time windows&lt;/h4>
 &lt;p>The simplest form of windowing is using &lt;strong>fixed time windows&lt;/strong>: given a
 timestamped &lt;code>PCollection&lt;/code> which might be continuously updating, each window
diff --git a/website/generated-content/documentation/programming-guide/index.html b/website/generated-content/documentation/programming-guide/index.html
index 1ef8286..82acce2 100644
--- a/website/generated-content/documentation/programming-guide/index.html
+++ b/website/generated-content/documentation/programming-guide/index.html
@@ -2558,9 +2558,12 @@ for that <code>PCollection</code>. The <code>GroupByKey</code> transform groups
 subsequent <code>ParDo</code> transform gets applied multiple times per key, once for each
 window.</p><h3 id=provided-windowing-functions>8.2. Provided windowing functions</h3><p>You can define different kinds of windows to divide the elements of your
 <code>PCollection</code>. Beam provides several windowing functions, including:</p><ul><li>Fixed Time Windows</li><li>Sliding Time Windows</li><li>Per-Session Windows</li><li>Single Global Window</li><li>Calendar-based Windows (not supported by the Beam SDK for Python or Go)</li></ul><p>You can also define your own <code>WindowFn</code> if you have a more complex need.</p><p>Note that each element can logically belong to more than one window, depending
-on the windowing function you use. Sliding time windowing, for example, creates
-overlapping windows wherein a single element can be assigned to multiple
-windows.</p><h4 id=fixed-time-windows>8.2.1. Fixed time windows</h4><p>The simplest form of windowing is using <strong>fixed time windows</strong>: given a
+on the windowing function you use. Sliding time windowing, for example, can
+create overlapping windows wherein a single element can be assigned to multiple
+windows. However, each element in a <code>PCollection</code> can only be in one window, so
+if an element is assigned to multiple windows, the element is conceptually
+duplicated into each of those windows, and each copy is identical except for its
+window.</p><h4 id=fixed-time-windows>8.2.1. Fixed time windows</h4><p>The simplest form of windowing is using <strong>fixed time windows</strong>: given a
 timestamped <code>PCollection</code> which might be continuously updating, each window
 might capture (for example) all elements with timestamps that fall into a 30
 second interval.</p><p>A fixed time window represents a consistent duration, non overlapping time
@@ -4307,7 +4310,7 @@ expansionAddr := &#34;localhost:8097&#34;
 outT := beam.UnnamedOutput(typex.New(reflectx.String))
 res := beam.CrossLanguage(s, urn, payload, expansionAddr, beam.UnnamedInput(inputPCol), outT)
    </code></pre></div></div></li><li><p>After the job has been submitted to the Beam runner, shutdown the expansion service by
-terminating the expansion service process.</p></li></ol><h3 id=x-lang-transform-runner-support>13.3. Runner Support</h3><p>Currently, portable runners such as Flink, Spark, and the Direct runner can be used with multi-language pipelines.</p><p>Google Cloud Dataflow supports multi-language pipelines through the Dataflow Runner v2 backend architecture.</p><div class=feedback><p class=update>Last updated on 2021/11/18</p><h3>Have you found everything you were looking for?</h3><p class=descr [...]
+terminating the expansion service process.</p></li></ol><h3 id=x-lang-transform-runner-support>13.3. Runner Support</h3><p>Currently, portable runners such as Flink, Spark, and the Direct runner can be used with multi-language pipelines.</p><p>Google Cloud Dataflow supports multi-language pipelines through the Dataflow Runner v2 backend architecture.</p><div class=feedback><p class=update>Last updated on 2021/12/06</p><h3>Have you found everything you were looking for?</h3><p class=descr [...]
 <a href=http://www.apache.org>The Apache Software Foundation</a>
 | <a href=/privacy_policy>Privacy Policy</a>
 | <a href=/feed.xml>RSS Feed</a><br><br>Apache Beam, Apache, Beam, the Beam logo, and the Apache feather logo are either registered trademarks or trademarks of The Apache Software Foundation. All other products or name brands are trademarks of their respective holders, including The Apache Software Foundation.</div></div></div></div></footer></body></html>
\ No newline at end of file
diff --git a/website/generated-content/sitemap.xml b/website/generated-content/sitemap.xml
index 78bbfff..5047cde 100644
--- a/website/generated-content/sitemap.xml
+++ b/website/generated-content/sitemap.xml
@@ -1 +1 @@
-<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/blog/beam-2.34.0/</loc><lastmod>2021-11-11T11:07:06-08:00</lastmod></url><url><loc>/categories/blog/</loc><lastmod>2021-11-11T11:07:06-08:00</lastmod></url><url><loc>/blog/</loc><lastmod>2021-11-11T11:07:06-08:00</lastmod></url><url><loc>/categories/</loc><lastmod>2021-12-01T21:32:04+03:00</lastmod></url><url><loc>/blog/g [...]
\ No newline at end of file
+<?xml version="1.0" encoding="utf-8" standalone="yes"?><urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9" xmlns:xhtml="http://www.w3.org/1999/xhtml"><url><loc>/blog/beam-2.34.0/</loc><lastmod>2021-11-11T11:07:06-08:00</lastmod></url><url><loc>/categories/blog/</loc><lastmod>2021-11-11T11:07:06-08:00</lastmod></url><url><loc>/blog/</loc><lastmod>2021-11-11T11:07:06-08:00</lastmod></url><url><loc>/categories/</loc><lastmod>2021-12-01T21:32:04+03:00</lastmod></url><url><loc>/blog/g [...]
\ No newline at end of file
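
The windowing paragraph changed in the hunks above says that an element assigned to several sliding windows is conceptually duplicated, once per window. A minimal sketch of that idea in plain Python follows. It does not use the Beam API; the helper names `assign_sliding_windows` and `explode`, the half-open `[start, end)` interval convention, and the choice to start windows at multiples of the period are all assumptions made for illustration.

```python
def assign_sliding_windows(timestamp, size, period):
    """Return every [start, end) window that contains `timestamp`.

    Hypothetical helper: windows have length `size` and are assumed to
    start at multiples of `period`.
    """
    # Latest window start that is not after the timestamp.
    start = timestamp - (timestamp % period)
    windows = []
    # Walk backwards while the window still covers the timestamp.
    while start > timestamp - size:
        windows.append((start, start + size))
        start -= period
    return sorted(windows)


def explode(element, timestamp, size, period):
    """Duplicate `element` once per window, so each copy has exactly one window."""
    return [(element, window)
            for window in assign_sliding_windows(timestamp, size, period)]


if __name__ == "__main__":
    # An element at t=65 with 60-second windows sliding every 30 seconds.
    for copy in explode("a", 65, 60, 30):
        print(copy)
```

When `size` equals `period`, the windows no longer overlap and each element gets exactly one copy, which corresponds to the fixed time windows case.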