You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@beam.apache.org by al...@apache.org on 2019/06/25 18:17:56 UTC
[beam] branch master updated: [BEAM-7475] update wordcount example
(#8803)
This is an automated email from the ASF dual-hosted git repository.
altay pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git
The following commit(s) were added to refs/heads/master by this push:
new 8745d6e [BEAM-7475] update wordcount example (#8803)
8745d6e is described below
commit 8745d6eb943b1d5c9d28e2c4c7b917b842122ffd
Author: Rakesh Kumar <ra...@lyft.com>
AuthorDate: Tue Jun 25 11:17:43 2019 -0700
[BEAM-7475] update wordcount example (#8803)
* BEAM-7475: Updated wordcount example with code snippet
---
website/src/get-started/wordcount-example.md | 29 ++++++++++++++++++++++------
1 file changed, 23 insertions(+), 6 deletions(-)
diff --git a/website/src/get-started/wordcount-example.md b/website/src/get-started/wordcount-example.md
index 910a626..fba69b3 100644
--- a/website/src/get-started/wordcount-example.md
+++ b/website/src/get-started/wordcount-example.md
@@ -1235,7 +1235,15 @@ public static void main(String[] args) throws IOException {
```
```py
-# This feature is not yet available in the Beam SDK for Python.
+def main(arvg=None):
+ parser = argparse.ArgumentParser()
+ parser.add_argument('--input-file',
+ dest='input_file',
+ default='/Users/home/words-example.txt')
+ known_args, pipeline_args = parser.parse_known_args(argv)
+ pipeline_options = PipelineOptions(pipeline_args)
+ p = beam.Pipeline(options=pipeline_options)
+ lines = p | 'read' >> ReadFromText(known_args.input_file)
```
```go
@@ -1267,7 +1275,7 @@ each element in the `PCollection`.
```
```py
-# This feature is not yet available in the Beam SDK for Python.
+beam.Map(AddTimestampFn(timestamp_seconds))
```
```go
@@ -1308,7 +1316,16 @@ static class AddTimestampFn extends DoFn<String, String> {
```
```py
-# This feature is not yet available in the Beam SDK for Python.
+class AddTimestampFn(beam.DoFn):
+
+ def __init__(self, min_timestamp, max_timestamp):
+ self.min_timestamp = min_timestamp
+ self.max_timestamp = max_timestamp
+
+ def process(self, element):
+ return window.TimestampedValue(
+ element,
+ random.randint(self.min_timestamp, self.max_timestamp))
```
```go
@@ -1344,7 +1361,7 @@ PCollection<String> windowedWords = input
```
```py
-# This feature is not yet available in the Beam SDK for Python.
+windowed_words = input | beam.WindowInto(window.FixedWindows(60 * window_size_minutes))
```
```go
@@ -1362,7 +1379,7 @@ PCollection<KV<String, Long>> wordCounts = windowedWords.apply(new WordCount.Cou
```
```py
-# This feature is not yet available in the Beam SDK for Python.
+word_counts = windowed_words | CountWords()
```
```go
@@ -1499,4 +1516,4 @@ using [`beam.io.WriteStringsToPubSub`](https://beam.apache.org/releases/pydoc/{{
* Dive in to some of our favorite [Videos and Podcasts]({{ site.baseurl }}/documentation/resources/videos-and-podcasts).
* Join the Beam [users@]({{ site.baseurl }}/community/contact-us) mailing list.
-Please don't hesitate to [reach out]({{ site.baseurl }}/community/contact-us) if you encounter any issues!
\ No newline at end of file
+Please don't hesitate to [reach out]({{ site.baseurl }}/community/contact-us) if you encounter any issues!