Posted to github@beam.apache.org by "alxp1982 (via GitHub)" <gi...@apache.org> on 2023/02/14 09:46:09 UTC

[GitHub] [beam] alxp1982 commented on a diff in pull request #25258: [Tour of Beam] Learning content for "Windowing" module

alxp1982 commented on code in PR #25258:
URL: https://github.com/apache/beam/pull/25258#discussion_r1105462640


##########
learning/tour-of-beam/learning-content/windowing/windowing-concept/description.md:
##########
@@ -0,0 +1,57 @@
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+### Windowing
+
+Windowing subdivides a `PCollection` according to the timestamps of its individual elements. Transforms that aggregate multiple elements, such as GroupByKey and Combine, work implicitly on a per-window basis — they process each PCollection as a succession of multiple, finite windows, though the entire collection itself may be of unbounded size.
+
+Some Beam transforms, such as `GroupByKey` and `Combine`, group multiple elements by a common key. Ordinarily, that grouping operation groups all the elements that have the same key within the entire data set. With an unbounded data set, it is impossible to collect all the elements, since new elements are constantly being added and may be infinitely many (e.g. streaming data). If you are working with unbounded PCollections, windowing is especially useful.
+
+
+`Fixed time windows` are useful for performing time-based aggregations, such as counting the number of elements that arrived during each hour of the day. It allows you to group elements of a data set into fixed-length, non-overlapping time intervals, which can be useful for a variety of use cases.
+For example, imagine you have a stream of data that is recording the number of website visitors every second, and you want to know the total number of visitors for each hour of the day. Using fixed-time windows, you can group the data into hour-long windows and then perform a sum aggregation on each window to get the total number of visitors for each hour.
+
+Additionally, fixed time window can also be useful when dealing with data that arrives out-of-order, or when dealing with late data. By specifying a fixed window duration, you can ensure that all elements that belong to a particular window are processed together, regardless of when they arrived.
+
+In summary, fixed time windows are useful for performing time-based aggregations and for handling out-of-order or late data.
+
+
+`Sliding time windows` are similar to fixed time windows, but they have the added ability to move or slide over the data stream, allowing them to overlap with each other.

Review Comment:
   `Sliding time windows` are similar to fixed time windows, but they have the added ability to move or slide over the data stream, allowing them to overlap.



##########
learning/tour-of-beam/learning-content/windowing/windowing-concept/description.md:
##########
@@ -0,0 +1,57 @@
+### Windowing
+
+Windowing subdivides a `PCollection` according to the timestamps of its individual elements. Transforms that aggregate multiple elements, such as GroupByKey and Combine, work implicitly on a per-window basis — they process each PCollection as a succession of multiple, finite windows, though the entire collection itself may be of unbounded size.
+
+Some Beam transforms, such as `GroupByKey` and `Combine`, group multiple elements by a common key. Ordinarily, that grouping operation groups all the elements that have the same key within the entire data set. With an unbounded data set, it is impossible to collect all the elements, since new elements are constantly being added and may be infinitely many (e.g. streaming data). If you are working with unbounded PCollections, windowing is especially useful.
+
+
+`Fixed time windows` are useful for performing time-based aggregations, such as counting the number of elements that arrived during each hour of the day. It allows you to group elements of a data set into fixed-length, non-overlapping time intervals, which can be useful for a variety of use cases.

Review Comment:
   `Fixed time windows` are useful for performing time-based aggregations, such as counting the number of elements whose timestamps fall within each hour of the day. They allow you to group elements of a data set into **fixed-length, non-overlapping** time intervals, which can be useful for a variety of use cases.
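
To make the hour-long grouping concrete, here is a minimal sketch in plain Go that does not use the Beam SDK. It assumes timestamps are whole seconds and that windows are aligned to zero; `fixedWindowStart` and `sumPerWindow` are hypothetical helper names, not Beam APIs.

```go
package main

import "fmt"

// fixedWindowStart returns the start of the fixed window that contains ts.
// Timestamps and sizes are plain seconds, and windows are aligned to zero;
// both choices are assumptions of this sketch.
func fixedWindowStart(ts, size int64) int64 {
	r := ts % size
	if r < 0 {
		r += size // keep the remainder non-negative for timestamps before zero
	}
	return ts - r
}

// sumPerWindow adds up per-second visitor counts by the window they fall into.
func sumPerWindow(counts map[int64]int, size int64) map[int64]int {
	totals := make(map[int64]int)
	for ts, c := range counts {
		totals[fixedWindowStart(ts, size)] += c
	}
	return totals
}

func main() {
	// Hypothetical readings: timestamp in seconds -> visitors seen in that second.
	counts := map[int64]int{0: 2, 1800: 3, 3599: 1, 3600: 5}
	fmt.Println(sumPerWindow(counts, 3600)) // prints map[0:6 3600:5]
}
```

Each window is the half-open interval from its start up to, but not including, start plus size, so the reading at 3599 belongs to the first hour and the reading at 3600 to the second.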



##########
learning/tour-of-beam/learning-content/windowing/windowing-concept/description.md:
##########
@@ -0,0 +1,57 @@
+### Windowing
+
+Windowing subdivides a `PCollection` according to the timestamps of its individual elements. Transforms that aggregate multiple elements, such as GroupByKey and Combine, work implicitly on a per-window basis — they process each PCollection as a succession of multiple, finite windows, though the entire collection itself may be of unbounded size.
+
+Some Beam transforms, such as `GroupByKey` and `Combine`, group multiple elements by a common key. Ordinarily, that grouping operation groups all the elements that have the same key within the entire data set. With an unbounded data set, it is impossible to collect all the elements, since new elements are constantly being added and may be infinitely many (e.g. streaming data). If you are working with unbounded PCollections, windowing is especially useful.
+
+
+`Fixed time windows` are useful for performing time-based aggregations, such as counting the number of elements that arrived during each hour of the day. It allows you to group elements of a data set into fixed-length, non-overlapping time intervals, which can be useful for a variety of use cases.
+For example, imagine you have a stream of data that is recording the number of website visitors every second, and you want to know the total number of visitors for each hour of the day. Using fixed-time windows, you can group the data into hour-long windows and then perform a sum aggregation on each window to get the total number of visitors for each hour.
+
+Additionally, fixed time window can also be useful when dealing with data that arrives out-of-order, or when dealing with late data. By specifying a fixed window duration, you can ensure that all elements that belong to a particular window are processed together, regardless of when they arrived.
+
+In summary, fixed time windows are useful for performing time-based aggregations and for handling out-of-order or late data.
+
+
+`Sliding time windows` are similar to fixed time windows, but they have the added ability to move or slide over the data stream, allowing them to overlap with each other.
+
+One of the main use cases for sliding time windows is to compute running aggregates. For example, if you want to compute a running average of the past 60 seconds’ worth of data updated every 30 seconds, you can use sliding time windows. This is done by defining a window duration of 60 seconds and a sliding interval of 30 seconds. With this configuration, you will have windows that slide every 30 seconds, each one covering a 60-second interval.
+
+Another use case for sliding time windows is to perform anomaly detection. By computing the running aggregates over a sliding window, you can detect patterns that deviate significantly from the historical data.
+
+Sliding time windows also allows to look at data in a more dynamic way. This is useful when you have a high-frequency data stream and you want to look at the most recent data.

Review Comment:
   Sliding time windows also allow looking at data more dynamically. This is useful when you have a high-frequency data stream and want to look at the most recent data.
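
The 60-second duration and 30-second period from the quoted paragraph can be sketched in plain Go, without the Beam SDK. The sketch assumes integer-second timestamps and windows aligned to zero; `slidingWindowStarts` is a hypothetical helper name, not a Beam API.

```go
package main

import "fmt"

// slidingWindowStarts returns the start of every sliding window that contains ts.
// Each window covers [start, start+size) and a new window begins every period
// seconds. Alignment to zero is an assumption of this sketch.
func slidingWindowStarts(ts, size, period int64) []int64 {
	r := ts % period
	if r < 0 {
		r += period
	}
	last := ts - r // the most recent window start at or before ts

	var starts []int64
	// Walk backwards one period at a time while the window still covers ts.
	for s := last; s > ts-size; s -= period {
		starts = append(starts, s)
	}
	return starts
}

func main() {
	// An element at second 45 belongs to [30, 90) and [0, 60).
	fmt.Println(slidingWindowStarts(45, 60, 30)) // prints [30 0]
}
```

With a 60-second size and a 30-second period every element lands in two windows, which is why an aggregate over sliding windows behaves like a running value refreshed every 30 seconds.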



##########
learning/tour-of-beam/learning-content/windowing/windowing-concept/description.md:
##########
@@ -0,0 +1,57 @@
+### Windowing
+
+Windowing subdivides a `PCollection` according to the timestamps of its individual elements. Transforms that aggregate multiple elements, such as GroupByKey and Combine, work implicitly on a per-window basis — they process each PCollection as a succession of multiple, finite windows, though the entire collection itself may be of unbounded size.
+
+Some Beam transforms, such as `GroupByKey` and `Combine`, group multiple elements by a common key. Ordinarily, that grouping operation groups all the elements that have the same key within the entire data set. With an unbounded data set, it is impossible to collect all the elements, since new elements are constantly being added and may be infinitely many (e.g. streaming data). If you are working with unbounded PCollections, windowing is especially useful.
+
+
+`Fixed time windows` are useful for performing time-based aggregations, such as counting the number of elements that arrived during each hour of the day. It allows you to group elements of a data set into fixed-length, non-overlapping time intervals, which can be useful for a variety of use cases.
+For example, imagine you have a stream of data that is recording the number of website visitors every second, and you want to know the total number of visitors for each hour of the day. Using fixed-time windows, you can group the data into hour-long windows and then perform a sum aggregation on each window to get the total number of visitors for each hour.
+
+Additionally, fixed time window can also be useful when dealing with data that arrives out-of-order, or when dealing with late data. By specifying a fixed window duration, you can ensure that all elements that belong to a particular window are processed together, regardless of when they arrived.
+
+In summary, fixed time windows are useful for performing time-based aggregations and for handling out-of-order or late data.
+
+
+`Sliding time windows` are similar to fixed time windows, but they have the added ability to move or slide over the data stream, allowing them to overlap with each other.
+
+One of the main use cases for sliding time windows is to compute running aggregates. For example, if you want to compute a running average of the past 60 seconds’ worth of data updated every 30 seconds, you can use sliding time windows. This is done by defining a window duration of 60 seconds and a sliding interval of 30 seconds. With this configuration, you will have windows that slide every 30 seconds, each one covering a 60-second interval.
+
+Another use case for sliding time windows is to perform anomaly detection. By computing the running aggregates over a sliding window, you can detect patterns that deviate significantly from the historical data.
+
+Sliding time windows also allows to look at data in a more dynamic way. This is useful when you have a high-frequency data stream and you want to look at the most recent data.
+
+In summary, Sliding time windows are useful for performing running aggregations, anomaly detection and looking at data in a more dynamic way.
+
+
+`Session windows` are a type of windowing that groups data elements based on periods of inactivity or "gaps" in the data stream. They are useful when you want to group data elements that are related to a specific event or activity together.
+
+One of the main use cases for session windows is to group together data elements that are related to a user's session on a website or application. By using session windows with a relatively short gap duration, you can ensure that all the events related to a user's session are grouped together. This allows you to compute session-level metrics, such as the number of pages viewed per session, the duration of a session, or the number of events per session.
+
+Another use case for session windows is to group together data elements that are related to a specific device's usage. For example, if you are collecting sensor data, you can use session windows to group together data elements that are collected while the device is in use. This allows you to compute device-level metrics, such as the number of sensor readings per device, the duration of device usage, or the number of events per device.

Review Comment:
   Another use case for session windows is to **group data elements related to a specific device's usage**. For example, if you are collecting sensor data, you can use session windows to group data elements collected while the device is in use. This allows you to compute device-level metrics, such as the number of sensor readings per device, the duration of device usage, or the number of events per device.
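
A minimal sketch of gap-based grouping, in plain Go without the Beam SDK, may help here. It handles the events of a single key (one user or one device), uses integer-second timestamps, and starts a new session once the distance to the previous event is at least the gap; the `session` type and `sessions` function are hypothetical names introduced only for this illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// session is a half-open interval [Start, End) plus the number of events in it.
type session struct {
	Start, End int64
	Count      int
}

// sessions groups the timestamps of one key into sessions. Every event opens
// the interval [t, t+gap); an event that falls inside the current session
// extends it, otherwise a new session begins.
func sessions(timestamps []int64, gap int64) []session {
	ts := append([]int64(nil), timestamps...) // copy so the caller's slice is not reordered
	sort.Slice(ts, func(i, j int) bool { return ts[i] < ts[j] })

	var out []session
	for _, t := range ts {
		if n := len(out); n > 0 && t < out[n-1].End {
			out[n-1].End = t + gap
			out[n-1].Count++
			continue
		}
		out = append(out, session{Start: t, End: t + gap, Count: 1})
	}
	return out
}

func main() {
	// Hypothetical sensor readings from one device, with a 10-second gap.
	fmt.Println(sessions([]int64{0, 5, 8, 40, 42}, 10)) // prints [{0 18 3} {40 52 2}]
}
```

The per-session `Count` corresponds to the "number of events per session" metric mentioned above, and the length of a session in seconds can be read from `End - Start`, keeping in mind that `End` includes the trailing gap.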



##########
learning/tour-of-beam/learning-content/windowing/windowing-concept/description.md:
##########
@@ -0,0 +1,57 @@
+### Windowing
+
+Windowing subdivides a `PCollection` according to the timestamps of its individual elements. Transforms that aggregate multiple elements, such as GroupByKey and Combine, work implicitly on a per-window basis — they process each PCollection as a succession of multiple, finite windows, though the entire collection itself may be of unbounded size.
+
+Some Beam transforms, such as `GroupByKey` and `Combine`, group multiple elements by a common key. Ordinarily, that grouping operation groups all the elements that have the same key within the entire data set. With an unbounded data set, it is impossible to collect all the elements, since new elements are constantly being added and may be infinitely many (e.g. streaming data). If you are working with unbounded PCollections, windowing is especially useful.
+
+
+`Fixed time windows` are useful for performing time-based aggregations, such as counting the number of elements that arrived during each hour of the day. It allows you to group elements of a data set into fixed-length, non-overlapping time intervals, which can be useful for a variety of use cases.
+For example, imagine you have a stream of data that is recording the number of website visitors every second, and you want to know the total number of visitors for each hour of the day. Using fixed-time windows, you can group the data into hour-long windows and then perform a sum aggregation on each window to get the total number of visitors for each hour.
+
+Additionally, fixed time window can also be useful when dealing with data that arrives out-of-order, or when dealing with late data. By specifying a fixed window duration, you can ensure that all elements that belong to a particular window are processed together, regardless of when they arrived.
+
+In summary, fixed time windows are useful for performing time-based aggregations and for handling out-of-order or late data.
+
+
+`Sliding time windows` are similar to fixed time windows, but they have the added ability to move or slide over the data stream, allowing them to overlap with each other.
+
+One of the main use cases for sliding time windows is to compute running aggregates. For example, if you want to compute a running average of the past 60 seconds’ worth of data updated every 30 seconds, you can use sliding time windows. This is done by defining a window duration of 60 seconds and a sliding interval of 30 seconds. With this configuration, you will have windows that slide every 30 seconds, each one covering a 60-second interval.
+
+Another use case for sliding time windows is to perform anomaly detection. By computing the running aggregates over a sliding window, you can detect patterns that deviate significantly from the historical data.
+
+Sliding time windows also allows to look at data in a more dynamic way. This is useful when you have a high-frequency data stream and you want to look at the most recent data.
+
+In summary, Sliding time windows are useful for performing running aggregations, anomaly detection and looking at data in a more dynamic way.
+
+
+`Session windows` are a type of windowing that groups data elements based on periods of inactivity or "gaps" in the data stream. They are useful when you want to group data elements that are related to a specific event or activity together.
+
+One of the main use cases for session windows is to group together data elements that are related to a user's session on a website or application. By using session windows with a relatively short gap duration, you can ensure that all the events related to a user's session are grouped together. This allows you to compute session-level metrics, such as the number of pages viewed per session, the duration of a session, or the number of events per session.
+
+Another use case for session windows is to group together data elements that are related to a specific device's usage. For example, if you are collecting sensor data, you can use session windows to group together data elements that are collected while the device is in use. This allows you to compute device-level metrics, such as the number of sensor readings per device, the duration of device usage, or the number of events per device.
+
+In summary, session windows are useful for grouping data elements that are related to specific events or activities, such as user sessions or device usage. This allows you to compute event- or device-level metrics.
+
+
+A `single global window` is a type of windowing that treats all data elements as belonging to the same window. This means that all elements in the data stream are processed together and no windowing is applied.
+
+The main use case for a single global window is when you want to process all the data elements in your data stream as a whole, without breaking them up into smaller windows. This can be useful in situations where you don't need to compute window-level metrics, such as running averages or counts, but instead want to process the entire data stream as a single unit.

Review Comment:
   The primary use case for a single global window is when you want to process all the data elements in your data stream **as a whole** without breaking them up into smaller windows. For example, this can be useful when you don't need to compute window-level metrics, such as running averages or counts, but instead, you want to process the entire data stream as a single unit.



##########
learning/tour-of-beam/learning-content/windowing/windowing-concept/description.md:
##########
@@ -0,0 +1,57 @@
+### Windowing
+
+Windowing subdivides a `PCollection` according to the timestamps of its individual elements. Transforms that aggregate multiple elements, such as GroupByKey and Combine, work implicitly on a per-window basis — they process each PCollection as a succession of multiple, finite windows, though the entire collection itself may be of unbounded size.
+
+Some Beam transforms, such as `GroupByKey` and `Combine`, group multiple elements by a common key. Ordinarily, that grouping operation groups all the elements that have the same key within the entire data set. With an unbounded data set, it is impossible to collect all the elements, since new elements are constantly being added and may be infinitely many (e.g. streaming data). If you are working with unbounded PCollections, windowing is especially useful.
+
+
+`Fixed time windows` are useful for performing time-based aggregations, such as counting the number of elements that arrived during each hour of the day. It allows you to group elements of a data set into fixed-length, non-overlapping time intervals, which can be useful for a variety of use cases.
+For example, imagine you have a stream of data that is recording the number of website visitors every second, and you want to know the total number of visitors for each hour of the day. Using fixed-time windows, you can group the data into hour-long windows and then perform a sum aggregation on each window to get the total number of visitors for each hour.
+
+Additionally, fixed time window can also be useful when dealing with data that arrives out-of-order, or when dealing with late data. By specifying a fixed window duration, you can ensure that all elements that belong to a particular window are processed together, regardless of when they arrived.
+
+In summary, fixed time windows are useful for performing time-based aggregations and for handling out-of-order or late data.
+
+
+`Sliding time windows` are similar to fixed time windows, but they have the added ability to move or slide over the data stream, allowing them to overlap with each other.
+
+One of the main use cases for sliding time windows is to compute running aggregates. For example, if you want to compute a running average of the past 60 seconds’ worth of data updated every 30 seconds, you can use sliding time windows. This is done by defining a window duration of 60 seconds and a sliding interval of 30 seconds. With this configuration, you will have windows that slide every 30 seconds, each one covering a 60-second interval.
+
+Another use case for sliding time windows is to perform anomaly detection. By computing the running aggregates over a sliding window, you can detect patterns that deviate significantly from the historical data.
+
+Sliding time windows also allows to look at data in a more dynamic way. This is useful when you have a high-frequency data stream and you want to look at the most recent data.
+
+In summary, Sliding time windows are useful for performing running aggregations, anomaly detection and looking at data in a more dynamic way.
+
+
+`Session windows` are a type of windowing that groups data elements based on periods of inactivity or "gaps" in the data stream. They are useful when you want to group data elements that are related to a specific event or activity together.
+
+One of the main use cases for session windows is to group together data elements that are related to a user's session on a website or application. By using session windows with a relatively short gap duration, you can ensure that all the events related to a user's session are grouped together. This allows you to compute session-level metrics, such as the number of pages viewed per session, the duration of a session, or the number of events per session.
+
+Another use case for session windows is to group together data elements that are related to a specific device's usage. For example, if you are collecting sensor data, you can use session windows to group together data elements that are collected while the device is in use. This allows you to compute device-level metrics, such as the number of sensor readings per device, the duration of device usage, or the number of events per device.
+
+In summary, session windows are useful for grouping data elements that are related to specific events or activities, such as user sessions or device usage. This allows you to compute event- or device-level metrics.
+
+
+A `single global window` is a type of windowing that treats all data elements as belonging to the same window. This means that all elements in the data stream are processed together and no windowing is applied.
+
+The main use case for a single global window is when you want to process all the data elements in your data stream as a whole, without breaking them up into smaller windows. This can be useful in situations where you don't need to compute window-level metrics, such as running averages or counts, but instead want to process the entire data stream as a single unit.
+
+For example, if you are using a data pipeline to filter out invalid data elements and then store the remaining data in a database, you might use a single global window to process all the data elements together, without breaking them up into smaller windows.
+
+Another use case is when your data streams are already time-stamped and you want to process events in the order they arrive, so you don't want to group them based on time windows.
+
+In summary, a single global window is useful when you want to process all the data elements in your data stream as a whole, without breaking them up into smaller windows. It can be useful for situations where you don't need to compute window-level metrics, or for processing events in the order they arrive.

Review Comment:
   In summary, a single global window is useful when you want to process all the data elements in your data stream as a whole without breaking them up into smaller windows. In addition, it can be helpful **when you don't need to compute window-level metrics or when you want to process events in the order they arrive**.



##########
learning/tour-of-beam/learning-content/windowing/windowing-concept/description.md:
##########
@@ -0,0 +1,57 @@
+### Windowing
+
+Windowing subdivides a `PCollection` according to the timestamps of its individual elements. Transforms that aggregate multiple elements, such as GroupByKey and Combine, work implicitly on a per-window basis — they process each PCollection as a succession of multiple, finite windows, though the entire collection itself may be of unbounded size.
+
+Some Beam transforms, such as `GroupByKey` and `Combine`, group multiple elements by a common key. Ordinarily, that grouping operation groups all the elements that have the same key within the entire data set. With an unbounded data set, it is impossible to collect all the elements, since new elements are constantly being added and may be infinitely many (e.g. streaming data). If you are working with unbounded PCollections, windowing is especially useful.
+
+
+`Fixed time windows` are useful for performing time-based aggregations, such as counting the number of elements that arrived during each hour of the day. It allows you to group elements of a data set into fixed-length, non-overlapping time intervals, which can be useful for a variety of use cases.
+For example, imagine you have a stream of data that is recording the number of website visitors every second, and you want to know the total number of visitors for each hour of the day. Using fixed-time windows, you can group the data into hour-long windows and then perform a sum aggregation on each window to get the total number of visitors for each hour.
+
+Additionally, fixed time window can also be useful when dealing with data that arrives out-of-order, or when dealing with late data. By specifying a fixed window duration, you can ensure that all elements that belong to a particular window are processed together, regardless of when they arrived.
+
+In summary, fixed time windows are useful for performing time-based aggregations and for handling out-of-order or late data.
+
+
+`Sliding time windows` are similar to fixed time windows, but they have the added ability to move or slide over the data stream, allowing them to overlap with each other.
+
+One of the main use cases for sliding time windows is to compute running aggregates. For example, if you want to compute a running average of the past 60 seconds’ worth of data updated every 30 seconds, you can use sliding time windows. This is done by defining a window duration of 60 seconds and a sliding interval of 30 seconds. With this configuration, you will have windows that slide every 30 seconds, each one covering a 60-second interval.
+
+Another use case for sliding time windows is to perform anomaly detection. By computing the running aggregates over a sliding window, you can detect patterns that deviate significantly from the historical data.
+
+Sliding time windows also allows to look at data in a more dynamic way. This is useful when you have a high-frequency data stream and you want to look at the most recent data.
+
+In summary, Sliding time windows are useful for performing running aggregations, anomaly detection and looking at data in a more dynamic way.
+
+
+`Session windows` are a type of windowing that groups data elements based on periods of inactivity or "gaps" in the data stream. They are useful when you want to group data elements that are related to a specific event or activity together.
+
+One of the main use cases for session windows is to group together data elements that are related to a user's session on a website or application. By using session windows with a relatively short gap duration, you can ensure that all the events related to a user's session are grouped together. This allows you to compute session-level metrics, such as the number of pages viewed per session, the duration of a session, or the number of events per session.
+
+Another use case for session windows is to group together data elements that are related to a specific device's usage. For example, if you are collecting sensor data, you can use session windows to group together data elements that are collected while the device is in use. This allows you to compute device-level metrics, such as the number of sensor readings per device, the duration of device usage, or the number of events per device.
+
+In summary, session windows are useful for grouping data elements that are related to specific events or activities, such as user sessions or device usage. This allows you to compute event- or device-level metrics.

Review Comment:
   In summary, session windows help **group data elements related to specific events or activities, such as user sessions or device usage**. This allows you to compute event- or device-level metrics.



##########
learning/tour-of-beam/learning-content/windowing/fixed-time-window/description.md:
##########
@@ -0,0 +1,68 @@
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+### Fixed time windows
+
+The simplest form of windowing is using fixed time windows: given a timestamped `PCollection` which might be continuously updating, each window might capture (for example) all elements with timestamps that fall into a 30-second interval.
+
+A fixed time window represents a consistent duration, non overlapping time interval in the data stream. Consider windows with a 30-second duration: all the elements in your unbounded PCollection with timestamp values from 0:00:00 up to (but not including) 0:00:30 belong to the first window, elements with timestamp values from 0:00:30 up to (but not including) 0:01:00 belong to the second window, and so on.
+
+{{if (eq .Sdk "go")}}
+```
+fixedWindowedItems := beam.WindowInto(s,
+	window.NewFixedWindows(30*time.Second),
+	items)
+```
+{{end}}
+
+{{if (eq .Sdk "java")}}
+```
+PCollection<String> items = ...;
+    PCollection<String> fixedWindowedItems = items.apply(
+        Window.<String>into(FixedWindows.of(Duration.standardSeconds(30))));
+```
+{{end}}
+
+{{if (eq .Sdk "python")}}
+```

Review Comment:
   Same as above



##########
learning/tour-of-beam/learning-content/windowing/windowing-concept/description.md:
##########
@@ -0,0 +1,57 @@
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+### Windowing
+
+Windowing subdivides a `PCollection` according to the timestamps of its individual elements. Transforms that aggregate multiple elements, such as GroupByKey and Combine, work implicitly on a per-window basis — they process each PCollection as a succession of multiple, finite windows, though the entire collection itself may be of unbounded size.
+
+Some Beam transforms, such as `GroupByKey` and `Combine`, group multiple elements by a common key. Ordinarily, that grouping operation groups all the elements that have the same key within the entire data set. With an unbounded data set, it is impossible to collect all the elements, since new elements are constantly being added and may be infinitely many (e.g. streaming data). If you are working with unbounded PCollections, windowing is especially useful.
+
+
+`Fixed time windows` are useful for performing time-based aggregations, such as counting the number of elements that arrived during each hour of the day. It allows you to group elements of a data set into fixed-length, non-overlapping time intervals, which can be useful for a variety of use cases.
+For example, imagine you have a stream of data that is recording the number of website visitors every second, and you want to know the total number of visitors for each hour of the day. Using fixed-time windows, you can group the data into hour-long windows and then perform a sum aggregation on each window to get the total number of visitors for each hour.
+
+Additionally, fixed time window can also be useful when dealing with data that arrives out-of-order, or when dealing with late data. By specifying a fixed window duration, you can ensure that all elements that belong to a particular window are processed together, regardless of when they arrived.

Review Comment:
   Additionally, a fixed time window can be helpful when dealing with data that arrives out of order or late. By specifying a fixed window duration, you can ensure that all elements that belong to a particular window are processed together, regardless of when they arrived.

   To summarize, fixed time windows help perform **time-based aggregations** or handle **out-of-order or late data**.
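
To illustrate the point about out-of-order data, here is a minimal plain-Python sketch of the arithmetic behind fixed-window assignment. It is not Beam code: the helper names `fixed_window_start` and `sum_per_window` and the sample events are made up for illustration, and the sketch ignores watermarks, triggers, and allowed lateness, which in a real pipeline decide when a window's result is emitted.

```python
from collections import defaultdict


def fixed_window_start(timestamp, size):
    """Start of the [start, start + size) fixed window containing `timestamp`."""
    return timestamp - (timestamp % size)


def sum_per_window(events, size):
    """Sum values per fixed window; `events` is an iterable of (timestamp, value)."""
    totals = defaultdict(int)
    for timestamp, value in events:
        totals[fixed_window_start(timestamp, size)] += value
    return dict(totals)


# Hypothetical (timestamp_in_seconds, visitor_count) pairs, listed in arrival
# order, which differs from timestamp order.
events = [(10, 1), (3700, 2), (20, 5), (3599, 1)]
print(sum_per_window(events, 3600))
```

Because the window is derived from the element's timestamp alone, sorting the events before calling `sum_per_window` gives the same totals.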



##########
learning/tour-of-beam/learning-content/windowing/windowing-concept/description.md:
##########
@@ -0,0 +1,57 @@
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+### Windowing
+
+Windowing subdivides a `PCollection` according to the timestamps of its individual elements. Transforms that aggregate multiple elements, such as GroupByKey and Combine, work implicitly on a per-window basis — they process each PCollection as a succession of multiple, finite windows, though the entire collection itself may be of unbounded size.
+
+Some Beam transforms, such as `GroupByKey` and `Combine`, group multiple elements by a common key. Ordinarily, that grouping operation groups all the elements that have the same key within the entire data set. With an unbounded data set, it is impossible to collect all the elements, since new elements are constantly being added and may be infinitely many (e.g. streaming data). If you are working with unbounded PCollections, windowing is especially useful.
+
+
+`Fixed time windows` are useful for performing time-based aggregations, such as counting the number of elements that arrived during each hour of the day. It allows you to group elements of a data set into fixed-length, non-overlapping time intervals, which can be useful for a variety of use cases.
+For example, imagine you have a stream of data that is recording the number of website visitors every second, and you want to know the total number of visitors for each hour of the day. Using fixed-time windows, you can group the data into hour-long windows and then perform a sum aggregation on each window to get the total number of visitors for each hour.
+
+Additionally, fixed time window can also be useful when dealing with data that arrives out-of-order, or when dealing with late data. By specifying a fixed window duration, you can ensure that all elements that belong to a particular window are processed together, regardless of when they arrived.
+
+In summary, fixed time windows are useful for performing time-based aggregations and for handling out-of-order or late data.
+
+
+`Sliding time windows` are similar to fixed time windows, but they have the added ability to move or slide over the data stream, allowing them to overlap with each other.
+
+One of the main use cases for sliding time windows is to compute running aggregates. For example, if you want to compute a running average of the past 60 seconds’ worth of data updated every 30 seconds, you can use sliding time windows. This is done by defining a window duration of 60 seconds and a sliding interval of 30 seconds. With this configuration, you will have windows that slide every 30 seconds, each one covering a 60-second interval.

Review Comment:
   One of the primary use cases for sliding time windows is to **compute running aggregates**. For example, if you want to calculate a running average of the past 60 seconds’ worth of data updated every 30 seconds, you can use sliding time windows. You can do this by defining a window duration of 60 seconds and a sliding interval of 30 seconds. With this configuration, you will have windows that slide every 30 seconds, each covering a 60-second interval.
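
A small worked example could make the 60-second/30-second configuration concrete. The plain-Python sketch below is not Beam code, and `sliding_windows` is a hypothetical helper; it only lists which `[start, end)` windows contain a given timestamp, assuming window starts are aligned to multiples of the sliding interval.

```python
def sliding_windows(timestamp, size, period):
    """Return every [start, end) window of length `size` that contains `timestamp`.

    Window starts are assumed to be aligned to multiples of `period`.
    """
    windows = []
    # Latest window start that is not after the timestamp.
    start = timestamp - (timestamp % period)
    # A window contains the timestamp while start > timestamp - size.
    while start > timestamp - size:
        windows.append((start, start + size))
        start -= period
    return sorted(windows)


# Window duration of 60 seconds, sliding interval of 30 seconds.
print(sliding_windows(45, size=60, period=30))
```

With these settings every element lands in two overlapping windows, which is why an aggregate computed per window behaves like a running aggregate refreshed every 30 seconds.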



##########
learning/tour-of-beam/learning-content/windowing/global-window/description.md:
##########
@@ -0,0 +1,59 @@
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+### The single global window
+
+By default, all data in a `PCollection` is assigned to the single global window, and late data is discarded. If your data set is of a fixed size, you can use the global window default for your `PCollection`.
+
+You can use the single global window if you are working with an unbounded data set (e.g. from a streaming data source) but use caution when applying aggregating transforms such as `GroupByKey` and `Combine`. The single global window with a default trigger generally requires the entire data set to be available before processing, which is not possible with continuously updating data. To perform aggregations on an unbounded `PCollection` that uses global windowing, you should specify a non-default trigger for that `PCollection`.
+
+If your `PCollection` is bounded (the size is fixed), you can assign all the elements to a single global window. The following example code shows how to set a single global window for a `PCollection`:
+
+{{if (eq .Sdk "go")}}
+```

Review Comment:
   Please add a short description of how to create a single global window in this particular SDK.



##########
learning/tour-of-beam/learning-content/windowing/global-window/description.md:
##########
@@ -0,0 +1,59 @@
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+### The single global window
+
+By default, all data in a `PCollection` is assigned to the single global window, and late data is discarded. If your data set is of a fixed size, you can use the global window default for your `PCollection`.
+
+You can use the single global window if you are working with an unbounded data set (e.g. from a streaming data source) but use caution when applying aggregating transforms such as `GroupByKey` and `Combine`. The single global window with a default trigger generally requires the entire data set to be available before processing, which is not possible with continuously updating data. To perform aggregations on an unbounded `PCollection` that uses global windowing, you should specify a non-default trigger for that `PCollection`.
+
+If your `PCollection` is bounded (the size is fixed), you can assign all the elements to a single global window. The following example code shows how to set a single global window for a `PCollection`:
+
+{{if (eq .Sdk "go")}}
+```
+globalWindowedItems := beam.WindowInto(s,
+	window.NewGlobalWindows(),
+	items)
+```
+{{end}}
+
+{{if (eq .Sdk "java")}}
+```
+PCollection<String> items = ...;
+PCollection<String> batchItems = items.apply(
+  Window.<String>into(new GlobalWindows()));
+```
+{{end}}
+
+{{if (eq .Sdk "python")}}
+```
+from apache_beam import window
+global_windowed_items = (
+    items | 'window' >> beam.WindowInto(window.GlobalWindows()))
+```
+{{end}}
+
+### Playground exercise
+

Review Comment:
   Please add a description of the runnable example.



##########
learning/tour-of-beam/learning-content/windowing/windowing-concept/description.md:
##########
@@ -0,0 +1,57 @@
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+### Windowing
+
+Windowing subdivides a `PCollection` according to the timestamps of its individual elements. Transforms that aggregate multiple elements, such as GroupByKey and Combine, work implicitly on a per-window basis — they process each PCollection as a succession of multiple, finite windows, though the entire collection itself may be of unbounded size.
+
+Some Beam transforms, such as `GroupByKey` and `Combine`, group multiple elements by a common key. Ordinarily, that grouping operation groups all the elements that have the same key within the entire data set. With an unbounded data set, it is impossible to collect all the elements, since new elements are constantly being added and may be infinitely many (e.g. streaming data). If you are working with unbounded PCollections, windowing is especially useful.
+
+
+`Fixed time windows` are useful for performing time-based aggregations, such as counting the number of elements that arrived during each hour of the day. It allows you to group elements of a data set into fixed-length, non-overlapping time intervals, which can be useful for a variety of use cases.
+For example, imagine you have a stream of data that is recording the number of website visitors every second, and you want to know the total number of visitors for each hour of the day. Using fixed-time windows, you can group the data into hour-long windows and then perform a sum aggregation on each window to get the total number of visitors for each hour.
+
+Additionally, fixed time window can also be useful when dealing with data that arrives out-of-order, or when dealing with late data. By specifying a fixed window duration, you can ensure that all elements that belong to a particular window are processed together, regardless of when they arrived.
+
+In summary, fixed time windows are useful for performing time-based aggregations and for handling out-of-order or late data.
+
+
+`Sliding time windows` are similar to fixed time windows, but they have the added ability to move or slide over the data stream, allowing them to overlap with each other.
+
+One of the main use cases for sliding time windows is to compute running aggregates. For example, if you want to compute a running average of the past 60 seconds’ worth of data updated every 30 seconds, you can use sliding time windows. This is done by defining a window duration of 60 seconds and a sliding interval of 30 seconds. With this configuration, you will have windows that slide every 30 seconds, each one covering a 60-second interval.
+
+Another use case for sliding time windows is to perform anomaly detection. By computing the running aggregates over a sliding window, you can detect patterns that deviate significantly from the historical data.

Review Comment:
   Another use case for sliding time windows is to perform **anomaly detection**. By computing the running aggregates over a sliding window, you can detect patterns that deviate significantly from the historical data.



##########
learning/tour-of-beam/learning-content/windowing/windowing-concept/description.md:
##########
@@ -0,0 +1,57 @@
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+### Windowing
+
+Windowing subdivides a `PCollection` according to the timestamps of its individual elements. Transforms that aggregate multiple elements, such as GroupByKey and Combine, work implicitly on a per-window basis — they process each PCollection as a succession of multiple, finite windows, though the entire collection itself may be of unbounded size.
+
+Some Beam transforms, such as `GroupByKey` and `Combine`, group multiple elements by a common key. Ordinarily, that grouping operation groups all the elements that have the same key within the entire data set. With an unbounded data set, it is impossible to collect all the elements, since new elements are constantly being added and may be infinitely many (e.g. streaming data). If you are working with unbounded PCollections, windowing is especially useful.
+
+
+`Fixed time windows` are useful for performing time-based aggregations, such as counting the number of elements that arrived during each hour of the day. It allows you to group elements of a data set into fixed-length, non-overlapping time intervals, which can be useful for a variety of use cases.
+For example, imagine you have a stream of data that is recording the number of website visitors every second, and you want to know the total number of visitors for each hour of the day. Using fixed-time windows, you can group the data into hour-long windows and then perform a sum aggregation on each window to get the total number of visitors for each hour.
+
+Additionally, fixed time window can also be useful when dealing with data that arrives out-of-order, or when dealing with late data. By specifying a fixed window duration, you can ensure that all elements that belong to a particular window are processed together, regardless of when they arrived.
+
+In summary, fixed time windows are useful for performing time-based aggregations and for handling out-of-order or late data.
+
+
+`Sliding time windows` are similar to fixed time windows, but they have the added ability to move or slide over the data stream, allowing them to overlap with each other.
+
+One of the main use cases for sliding time windows is to compute running aggregates. For example, if you want to compute a running average of the past 60 seconds’ worth of data updated every 30 seconds, you can use sliding time windows. This is done by defining a window duration of 60 seconds and a sliding interval of 30 seconds. With this configuration, you will have windows that slide every 30 seconds, each one covering a 60-second interval.
+
+Another use case for sliding time windows is to perform anomaly detection. By computing the running aggregates over a sliding window, you can detect patterns that deviate significantly from the historical data.
+
+Sliding time windows also allows to look at data in a more dynamic way. This is useful when you have a high-frequency data stream and you want to look at the most recent data.
+
+In summary, Sliding time windows are useful for performing running aggregations, anomaly detection and looking at data in a more dynamic way.
+
+
+`Session windows` are a type of windowing that groups data elements based on periods of inactivity or "gaps" in the data stream. They are useful when you want to group data elements that are related to a specific event or activity together.
+
+One of the main use cases for session windows is to group together data elements that are related to a user's session on a website or application. By using session windows with a relatively short gap duration, you can ensure that all the events related to a user's session are grouped together. This allows you to compute session-level metrics, such as the number of pages viewed per session, the duration of a session, or the number of events per session.

Review Comment:
   One of the primary use cases for session windows is to **group together data elements related to a user's session on a website or application**. For example, you can use session windows with a relatively short gap duration to ensure that all the events related to a user's session are grouped together. This allows you to compute session-level metrics, such as the number of pages viewed per session, the duration of a session, or the number of events per session.
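
The gap-based grouping could also be shown with a short example. The sketch below is plain Python rather than Beam code; `sessionize` is a hypothetical helper that handles the events of a single user (one key), assuming that an event joins the current session when it arrives less than `gap` seconds after the previous event.

```python
def sessionize(timestamps, gap):
    """Group one user's event timestamps into sessions separated by `gap` or more."""
    sessions = []
    for t in sorted(timestamps):
        # Extend the current session if the pause since its last event is short enough.
        if sessions and t - sessions[-1][-1] < gap:
            sessions[-1].append(t)
        else:
            sessions.append([t])
    return sessions


# Hypothetical page-view timestamps (seconds) for one user, with a 10-minute gap.
page_views = [2000, 0, 250, 100, 2100]
for session in sessionize(page_views, gap=600):
    print(len(session), "pages,", session[-1] - session[0], "seconds")
```

Session-level metrics such as pages per session or session duration then fall out of each group directly.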



##########
learning/tour-of-beam/learning-content/windowing/windowing-concept/description.md:
##########
@@ -0,0 +1,57 @@
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+### Windowing
+
+Windowing subdivides a `PCollection` according to the timestamps of its individual elements. Transforms that aggregate multiple elements, such as GroupByKey and Combine, work implicitly on a per-window basis — they process each PCollection as a succession of multiple, finite windows, though the entire collection itself may be of unbounded size.
+
+Some Beam transforms, such as `GroupByKey` and `Combine`, group multiple elements by a common key. Ordinarily, that grouping operation groups all the elements that have the same key within the entire data set. With an unbounded data set, it is impossible to collect all the elements, since new elements are constantly being added and may be infinitely many (e.g. streaming data). If you are working with unbounded PCollections, windowing is especially useful.
+
+
+`Fixed time windows` are useful for performing time-based aggregations, such as counting the number of elements that arrived during each hour of the day. It allows you to group elements of a data set into fixed-length, non-overlapping time intervals, which can be useful for a variety of use cases.
+For example, imagine you have a stream of data that is recording the number of website visitors every second, and you want to know the total number of visitors for each hour of the day. Using fixed-time windows, you can group the data into hour-long windows and then perform a sum aggregation on each window to get the total number of visitors for each hour.
+
+Additionally, fixed time window can also be useful when dealing with data that arrives out-of-order, or when dealing with late data. By specifying a fixed window duration, you can ensure that all elements that belong to a particular window are processed together, regardless of when they arrived.
+
+In summary, fixed time windows are useful for performing time-based aggregations and for handling out-of-order or late data.
+
+
+`Sliding time windows` are similar to fixed time windows, but they have the added ability to move or slide over the data stream, allowing them to overlap with each other.
+
+One of the main use cases for sliding time windows is to compute running aggregates. For example, if you want to compute a running average of the past 60 seconds’ worth of data updated every 30 seconds, you can use sliding time windows. This is done by defining a window duration of 60 seconds and a sliding interval of 30 seconds. With this configuration, you will have windows that slide every 30 seconds, each one covering a 60-second interval.
+
+Another use case for sliding time windows is to perform anomaly detection. By computing the running aggregates over a sliding window, you can detect patterns that deviate significantly from the historical data.
+
+Sliding time windows also allows to look at data in a more dynamic way. This is useful when you have a high-frequency data stream and you want to look at the most recent data.
+
+In summary, Sliding time windows are useful for performing running aggregations, anomaly detection and looking at data in a more dynamic way.

Review Comment:
   In summary, sliding time windows help with **running aggregations**, **anomaly detection**, and **looking at data more dynamically**.



##########
learning/tour-of-beam/learning-content/windowing/fixed-time-window/description.md:
##########
@@ -0,0 +1,68 @@
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+### Fixed time windows
+
+The simplest form of windowing is using fixed time windows: given a timestamped `PCollection` which might be continuously updating, each window might capture (for example) all elements with timestamps that fall into a 30-second interval.
+
+A fixed time window represents a consistent duration, non overlapping time interval in the data stream. Consider windows with a 30-second duration: all the elements in your unbounded PCollection with timestamp values from 0:00:00 up to (but not including) 0:00:30 belong to the first window, elements with timestamp values from 0:00:30 up to (but not including) 0:01:00 belong to the second window, and so on.
+
+{{if (eq .Sdk "go")}}
+```
+fixedWindowedItems := beam.WindowInto(s,
+	window.NewFixedWindows(30*time.Second),
+	items)
+```
+{{end}}
+
+{{if (eq .Sdk "java")}}
+```

Review Comment:
   Same as above



##########
learning/tour-of-beam/learning-content/windowing/global-window/description.md:
##########
@@ -0,0 +1,59 @@
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+### The single global window
+
+By default, all data in a `PCollection` is assigned to the single global window, and late data is discarded. If your data set is of a fixed size, you can use the global window default for your `PCollection`.
+
+You can use the single global window if you are working with an unbounded data set (e.g. from a streaming data source) but use caution when applying aggregating transforms such as `GroupByKey` and `Combine`. The single global window with a default trigger generally requires the entire data set to be available before processing, which is not possible with continuously updating data. To perform aggregations on an unbounded `PCollection` that uses global windowing, you should specify a non-default trigger for that `PCollection`.
+
+If your `PCollection` is bounded (the size is fixed), you can assign all the elements to a single global window. The following example code shows how to set a single global window for a `PCollection`:
+
+{{if (eq .Sdk "go")}}
+```
+globalWindowedItems := beam.WindowInto(s,
+	window.NewGlobalWindows(),
+	items)
+```
+{{end}}
+
+{{if (eq .Sdk "java")}}
+```
+PCollection<String> items = ...;
+PCollection<String> batchItems = items.apply(
+  Window.<String>into(new GlobalWindows()));
+```
+{{end}}
+
+{{if (eq .Sdk "python")}}
+```
+from apache_beam import window
+global_windowed_items = (
+    items | 'window' >> beam.WindowInto(window.GlobalWindows()))
+```
+{{end}}
+
+### Playground exercise
+
+`CombineFn` : This function allows you to perform operations such as counting, summing, or finding the minimum or maximum element within a global window.
+
+`GroupByKey` : This function groups elements by a key, and allows you to apply a beam.CombineFn to each group of elements within a global window.
+
+`Map` : This function allows you to apply a user-defined function to each element within a global window.
+
+`Filter` : This function allows you to filter elements based on a user-defined condition, within a global window.
+
+`FlatMap` : This function allows you to apply a user-defined function to each element within a global window and output zero or more elements.
+
+These functions can be easily composed together to create complex data processing pipelines. Additionally, it's also possible to create your own custom functions to perform specific operations within a global window.

Review Comment:
   How do we challenge users here? It is great to describe the different methods that can be used, but the exercise also needs to pose a challenge.



##########
learning/tour-of-beam/learning-content/windowing/windowing-concept/description.md:
##########
@@ -0,0 +1,57 @@
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+### Windowing
+
+Windowing subdivides a `PCollection` according to the timestamps of its individual elements. Transforms that aggregate multiple elements, such as GroupByKey and Combine, work implicitly on a per-window basis — they process each PCollection as a succession of multiple, finite windows, though the entire collection itself may be of unbounded size.
+
+Some Beam transforms, such as `GroupByKey` and `Combine`, group multiple elements by a common key. Ordinarily, that grouping operation groups all the elements that have the same key within the entire data set. With an unbounded data set, it is impossible to collect all the elements, since new elements are constantly being added and may be infinitely many (e.g. streaming data). If you are working with unbounded PCollections, windowing is especially useful.
+
+
+`Fixed time windows` are useful for performing time-based aggregations, such as counting the number of elements that arrived during each hour of the day. It allows you to group elements of a data set into fixed-length, non-overlapping time intervals, which can be useful for a variety of use cases.
+For example, imagine you have a stream of data that is recording the number of website visitors every second, and you want to know the total number of visitors for each hour of the day. Using fixed-time windows, you can group the data into hour-long windows and then perform a sum aggregation on each window to get the total number of visitors for each hour.
+
+Additionally, fixed time windows can be useful when dealing with data that arrives out of order or late. Because each element is assigned to a window by its timestamp rather than by its arrival time, elements that belong to the same window are grouped together even if they arrive out of order (subject to how the pipeline is configured to handle late data).
+
+In summary, fixed time windows are useful for performing time-based aggregations and for handling out-of-order or late data.
+
+
+`Sliding time windows` are similar to fixed time windows, but they have the added ability to move or slide over the data stream, allowing them to overlap with each other.
+
+One of the main use cases for sliding time windows is to compute running aggregates. For example, if you want to compute a running average of the past 60 seconds’ worth of data, updated every 30 seconds, you can use sliding time windows. This is done by defining a window duration of 60 seconds and a sliding interval of 30 seconds. With this configuration, a new window starts every 30 seconds, each one covering a 60-second interval, so every element belongs to two windows.
+
+Another use case for sliding time windows is to perform anomaly detection. By computing the running aggregates over a sliding window, you can detect patterns that deviate significantly from the historical data.
+
+Sliding time windows also allow you to look at data in a more dynamic way. This is useful when you have a high-frequency data stream and you want to look at the most recent data.
+
+In summary, sliding time windows are useful for performing running aggregations, anomaly detection, and looking at data in a more dynamic way.
+
+
+`Session windows` are a type of windowing that groups data elements based on periods of inactivity or "gaps" in the data stream. They are useful when you want to group data elements that are related to a specific event or activity together.
+
+One of the main use cases for session windows is to group together data elements that are related to a user's session on a website or application. By using session windows with a relatively short gap duration, you can ensure that all the events related to a user's session are grouped together. This allows you to compute session-level metrics, such as the number of pages viewed per session, the duration of a session, or the number of events per session.
+
+Another use case for session windows is to group together data elements that are related to a specific device's usage. For example, if you are collecting sensor data, you can use session windows to group together data elements that are collected while the device is in use. This allows you to compute device-level metrics, such as the number of sensor readings per device, the duration of device usage, or the number of events per device.
+
+In summary, session windows are useful for grouping data elements that are related to specific events or activities, such as user sessions or device usage. This allows you to compute event- or device-level metrics.
+
+
+A `single global window` is a type of windowing that treats all data elements as belonging to the same window. This means that all elements in the data stream are processed together and no windowing is applied.
+
+The main use case for a single global window is when you want to process all the data elements in your data stream as a whole, without breaking them up into smaller windows. This can be useful in situations where you don't need to compute window-level metrics, such as running averages or counts, but instead want to process the entire data stream as a single unit.
+
+For example, if you are using a data pipeline to filter out invalid data elements and then store the remaining data in a database, you might use a single global window to process all the data elements together, without breaking them up into smaller windows.

Review Comment:
   For example, if you use a data pipeline to filter out invalid data elements and then store the remaining data in a database, you might use a single global window to process all the data elements together without breaking them up into smaller windows.
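
To make the sliding and session window descriptions in the hunk above more concrete, here is a minimal plain-Python sketch of the assignment rules. It does not use the Beam API; the function names `sliding_windows` and `session_windows` and the 10-second gap are assumptions chosen for illustration, while the 60-second size and 30-second period come from the text under review.

```python
from typing import List, Tuple


def sliding_windows(ts: int, size: int = 60, period: int = 30) -> List[Tuple[int, int]]:
    """Return every [start, end) window of length `size`, starting on a
    multiple of `period`, that contains the timestamp `ts` (in seconds)."""
    start = ts - ts % period  # latest window start at or before ts
    windows = []
    while start > ts - size:
        windows.append((start, start + size))
        start -= period
    return sorted(windows)


def session_windows(timestamps: List[int], gap: int = 10) -> List[Tuple[int, int]]:
    """Merge timestamps into sessions: a new session starts whenever an
    element is at least `gap` seconds after the previous one."""
    sessions: List[Tuple[int, int]] = []
    for ts in sorted(timestamps):
        if sessions and ts < sessions[-1][1]:
            # Element falls inside the current session; extend its end.
            sessions[-1] = (sessions[-1][0], max(sessions[-1][1], ts + gap))
        else:
            sessions.append((ts, ts + gap))
    return sessions
```

With these defaults, a timestamp of 45 seconds lands in two overlapping sliding windows, and a burst of events followed by a pause yields two separate sessions.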



##########
learning/tour-of-beam/learning-content/windowing/fixed-time-window/description.md:
##########
@@ -0,0 +1,68 @@
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+### Fixed time windows
+
+The simplest form of windowing is using fixed time windows: given a timestamped `PCollection` which might be continuously updating, each window might capture (for example) all elements with timestamps that fall into a 30-second interval.
+
+A fixed time window represents a consistent-duration, non-overlapping time interval in the data stream. Consider windows with a 30-second duration: all the elements in your unbounded PCollection with timestamp values from 0:00:00 up to (but not including) 0:00:30 belong to the first window, elements with timestamp values from 0:00:30 up to (but not including) 0:01:00 belong to the second window, and so on.
+
+{{if (eq .Sdk "go")}}
+```

Review Comment:
   Please add a description of how a fixed-time window can be created in Go.



##########
learning/tour-of-beam/learning-content/windowing/fixed-time-window/description.md:
##########
@@ -0,0 +1,68 @@
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+### Fixed time windows
+
+The simplest form of windowing is using fixed time windows: given a timestamped `PCollection` which might be continuously updating, each window might capture (for example) all elements with timestamps that fall into a 30-second interval.
+
+A fixed time window represents a consistent-duration, non-overlapping time interval in the data stream. Consider windows with a 30-second duration: all the elements in your unbounded PCollection with timestamp values from 0:00:00 up to (but not including) 0:00:30 belong to the first window, elements with timestamp values from 0:00:30 up to (but not including) 0:01:00 belong to the second window, and so on.
+
+{{if (eq .Sdk "go")}}
+```
+fixedWindowedItems := beam.WindowInto(s,
+	window.NewFixedWindows(30*time.Second),
+	items)
+```
+{{end}}
+
+{{if (eq .Sdk "java")}}
+```
+PCollection<String> items = ...;
+PCollection<String> fixedWindowedItems = items.apply(
+    Window.<String>into(FixedWindows.of(Duration.standardSeconds(30))));
+```
+{{end}}
+
+{{if (eq .Sdk "python")}}
+```
+import apache_beam as beam
+from apache_beam import window
+# Assign each element to a 30-second fixed window.
+fixed_windowed_items = (
+    items | 'window' >> beam.WindowInto(window.FixedWindows(30)))
+```
+{{end}}
+
+### Playground exercise 
+

Review Comment:
   Please add the description of what a runnable example does out of the box. Such as:
   
   In the playground window, you can try an example of how to create a fixed-time window and print elements in it. 
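
As a rough idea of what such a runnable example could demonstrate, the following plain-Python sketch groups timestamped elements into 30-second fixed windows and prints each window's contents. It is not Beam code and not the playground example itself; the helper names `fixed_window` and `group_by_fixed_window` and the sample events are assumptions for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def fixed_window(ts: int, size: int = 30) -> Tuple[int, int]:
    """Return the [start, end) fixed window that contains timestamp `ts`."""
    start = ts - ts % size
    return (start, start + size)


def group_by_fixed_window(
    elements: Iterable[Tuple[str, int]], size: int = 30
) -> Dict[Tuple[int, int], List[str]]:
    """Group (value, timestamp) pairs by the fixed window of each timestamp."""
    grouped = defaultdict(list)
    for value, ts in elements:
        grouped[fixed_window(ts, size)].append(value)
    return dict(grouped)


if __name__ == "__main__":
    # Hypothetical (value, timestamp-in-seconds) pairs.
    events = [("a", 0), ("b", 29), ("c", 30), ("d", 61)]
    for window, values in sorted(group_by_fixed_window(events).items()):
        print(window, values)
```

Note that the element with timestamp 30 falls into the second window, matching the "up to (but not including)" boundary described in the text.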



##########
learning/tour-of-beam/learning-content/windowing/global-window/description.md:
##########
@@ -0,0 +1,59 @@
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+### The single global window
+
+By default, all data in a `PCollection` is assigned to the single global window, and late data is discarded. If your data set is of a fixed size, you can use the global window default for your `PCollection`.
+
+You can use the single global window if you are working with an unbounded data set (e.g. from a streaming data source), but use caution when applying aggregating transforms such as `GroupByKey` and `Combine`. The single global window with a default trigger generally requires the entire data set to be available before processing, which is not possible with continuously updating data. To perform aggregations on an unbounded `PCollection` that uses global windowing, you should specify a non-default trigger for that `PCollection`.
+
+If your `PCollection` is bounded (the size is fixed), you can assign all the elements to a single global window. The following example code shows how to set a single global window for a `PCollection`:
+
+{{if (eq .Sdk "go")}}
+```
+globalWindowedItems := beam.WindowInto(s,
+	window.NewGlobalWindows(),
+	items)
+```
+{{end}}
+
+{{if (eq .Sdk "java")}}
+```

Review Comment:
   Please add a short description of how a single global window could be created in this particular SDK. 



##########
learning/tour-of-beam/learning-content/windowing/fixed-time-window/description.md:
##########
@@ -0,0 +1,68 @@
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+### Fixed time windows
+
+The simplest form of windowing is using fixed time windows: given a timestamped `PCollection` which might be continuously updating, each window might capture (for example) all elements with timestamps that fall into a 30-second interval.
+
+A fixed time window represents a consistent-duration, non-overlapping time interval in the data stream. Consider windows with a 30-second duration: all the elements in your unbounded PCollection with timestamp values from 0:00:00 up to (but not including) 0:00:30 belong to the first window, elements with timestamp values from 0:00:30 up to (but not including) 0:01:00 belong to the second window, and so on.
+
+{{if (eq .Sdk "go")}}
+```
+fixedWindowedItems := beam.WindowInto(s,
+	window.NewFixedWindows(30*time.Second),
+	items)
+```
+{{end}}
+
+{{if (eq .Sdk "java")}}
+```
+PCollection<String> items = ...;
+PCollection<String> fixedWindowedItems = items.apply(
+    Window.<String>into(FixedWindows.of(Duration.standardSeconds(30))));
+```
+{{end}}
+
+{{if (eq .Sdk "python")}}
+```
+import apache_beam as beam
+from apache_beam import window
+# Assign each element to a 30-second fixed window.
+fixed_windowed_items = (
+    items | 'window' >> beam.WindowInto(window.FixedWindows(30)))
+```
+{{end}}
+
+### Playground exercise 
+
+You can start displaying elements from the beginning but also from the end:

Review Comment:
   Not sure what the user is expected to do. 



##########
learning/tour-of-beam/learning-content/windowing/global-window/description.md:
##########
@@ -0,0 +1,59 @@
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+### The single global window
+
+By default, all data in a `PCollection` is assigned to the single global window, and late data is discarded. If your data set is of a fixed size, you can use the global window default for your `PCollection`.
+
+You can use the single global window if you are working with an unbounded data set (e.g. from a streaming data source), but use caution when applying aggregating transforms such as `GroupByKey` and `Combine`. The single global window with a default trigger generally requires the entire data set to be available before processing, which is not possible with continuously updating data. To perform aggregations on an unbounded `PCollection` that uses global windowing, you should specify a non-default trigger for that `PCollection`.
+
+If your `PCollection` is bounded (the size is fixed), you can assign all the elements to a single global window. The following example code shows how to set a single global window for a `PCollection`:
+
+{{if (eq .Sdk "go")}}
+```
+globalWindowedItems := beam.WindowInto(s,
+	window.NewGlobalWindows(),
+	items)
+```
+{{end}}
+
+{{if (eq .Sdk "java")}}
+```
+PCollection<String> items = ...;
+PCollection<String> batchItems = items.apply(
+  Window.<String>into(new GlobalWindows()));
+```
+{{end}}
+
+{{if (eq .Sdk "python")}}
+```

Review Comment:
   Please add a short description of how a single global window could be created in this particular SDK. 



##########
learning/tour-of-beam/learning-content/windowing/adding-timestamp/description.md:
##########
@@ -0,0 +1,60 @@
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+### Adding timestamps to a PCollection’s elements
+
+An unbounded source provides a timestamp for each element. Depending on your unbounded source, you may need to configure how the timestamp is extracted from the raw data stream.
+
+However, bounded sources (such as a file from TextIO) do not provide timestamps. If you need timestamps, you must add them to your PCollection’s elements.
+
+You can assign new timestamps to the elements of a PCollection by applying a ParDo transform that outputs new elements with timestamps that you set.
+
+An example might be if your pipeline reads log records from an input file, and each log record includes a timestamp field; since your pipeline reads the records in from a file, the file source doesn’t assign timestamps automatically. You can parse the timestamp field from each record and use a ParDo transform with a DoFn to attach the timestamps to each element in your `PCollection`.

Review Comment:
   An example might be if your pipeline reads log records from an input file, and each log record includes a timestamp field; since your pipeline reads the records from a file, the file source doesn’t assign timestamps automatically. Instead, you can parse the timestamp field from each record and use a ParDo transform with a DoFn to attach the timestamps to each element in your `PCollection`.
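
The parsing step described here could look roughly like the plain-Python sketch below. The log format (`<ISO-8601 UTC time> <message>`) and the function name `extract_timestamp` are assumptions for illustration; in a Beam Python `DoFn`, the resulting pair would typically be emitted as `beam.window.TimestampedValue(record, epoch_seconds)` rather than returned as a tuple.

```python
from datetime import datetime, timezone
from typing import Tuple


def extract_timestamp(record: str) -> Tuple[str, float]:
    """Parse the leading timestamp field of a log line in the hypothetical
    form '<ISO-8601 UTC time> <message>' and return (record, epoch_seconds)."""
    time_field, _, _message = record.partition(" ")
    parsed = datetime.strptime(time_field, "%Y-%m-%dT%H:%M:%SZ")
    # Treat the parsed time as UTC before converting to epoch seconds.
    epoch_seconds = parsed.replace(tzinfo=timezone.utc).timestamp()
    return (record, epoch_seconds)
```

Keeping the original record intact and attaching the timestamp alongside it mirrors how the element value stays unchanged while only its event time is set.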



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
