Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2021/06/29 06:57:18 UTC

[GitHub] [beam] youngoli commented on a change in pull request #15057: [BEAM-12513] Update initial sections of BPG for Go

youngoli commented on a change in pull request #15057:
URL: https://github.com/apache/beam/pull/15057#discussion_r660326406



##########
File path: sdks/go/examples/snippets/04transforms.go
##########
@@ -0,0 +1,289 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//    http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package snippets
+
+import (
+	"fmt"
+	"math"
+	"reflect"
+	"sort"
+	"strings"
+
+	"github.com/apache/beam/sdks/go/pkg/beam"
+	"github.com/apache/beam/sdks/go/pkg/beam/transforms/stats"
+)
+
+// [START model_pardo_pardo]
+
+// ComputeWordLengthFn is the DoFn to perform on each element in the input PCollection.
+type ComputeWordLengthFn struct{}
+
+// ProcessElement is the method to execute for each element.
+func (fn *ComputeWordLengthFn) ProcessElement(word string, emit func(int)) {

Review comment:
       As the first example of a ParDo, would it be better to go with returning outputs instead of using an emit function? I think there's a section somewhere in the programming guide for outputting multiple elements per input where we can introduce emit functions.
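
For illustration, a minimal sketch of the two function shapes being compared. It assumes plain Go functions only: the Beam registration and `beam.ParDo` wiring are left out, and the names `computeWordLength` and `computeWordLengthEmit` are hypothetical, not taken from the PR.

```go
package main

import "fmt"

// computeWordLength returns its single output directly.
// Hypothetical name; Beam registration and ParDo wiring are omitted.
// Note that len counts bytes, not runes.
func computeWordLength(word string) int {
	return len(word)
}

// computeWordLengthEmit produces the same output through an emit callback,
// the shape that also allows zero or many outputs per input.
func computeWordLengthEmit(word string, emit func(int)) {
	emit(len(word))
}

func main() {
	words := []string{"to", "be", "question"}

	for _, w := range words {
		// Collect whatever the emit-style function outputs for this word.
		var emitted []int
		computeWordLengthEmit(w, func(n int) { emitted = append(emitted, n) })

		fmt.Println(w, computeWordLength(w), emitted)
	}
}
```

For a one-output-per-input case like this, the returning form has less to explain, which is the reason for suggesting it as the first ParDo example.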

##########
File path: website/www/site/content/en/documentation/programming-guide.md
##########
@@ -499,16 +531,34 @@ the transform itself as an argument, and the operation returns the output
 [Output PCollection] = [Input PCollection] | [Transform]
 {{< /highlight >}}
 
+{{< highlight go >}}
+[Output PCollection] := beam.ParDo(s, [Transform], [Input PCollection])
+{{< /highlight >}}
+
+{{< paragraph class="language-java language-py" >}}
 Because Beam uses a generic `apply` method for `PCollection`, you can both chain
 transforms sequentially and also apply transforms that contain other transforms
 nested within (called [composite transforms](#composite-transforms) in the Beam
 SDKs).
+{{< /paragraph >}}
+
+{{< paragraph class="language-go" >}}
+Because Go doesn't support function overloading, it's recommended to

Review comment:
       We can probably remove `"Because Go doesn't support function overloading..."` and leave just the recommendation. Users don't really need to understand the technical details of why we took this approach, just what to do.

##########
File path: website/www/site/content/en/documentation/programming-guide.md
##########
@@ -716,11 +818,45 @@ static class ComputeWordLengthFn extends DoFn<String, Integer> {
 {{< code_sample "sdks/python/apache_beam/examples/snippets/snippets_test.py" model_pardo_pardo >}}
 {{< /highlight >}}
 
-{{< paragraph class="language-java" >}}
+{{< highlight go >}}
+{{< code_sample "sdks/go/examples/snippets/04transforms.go" model_pardo_pardo >}}
+{{< /highlight >}}
+
+{{< paragraph class="language-go" >}}
+Simple DoFns can also be written as functions.
+{{< /paragraph >}}
+
+{{< highlight go >}}
+func ComputeWordLengthFn(word string, emit func(int)) { ... }
+
+func init() {
+	beam.RegisterFunction(ComputeWordLengthFn)
+}
+{{< /highlight >}}
+
+<span class="language-go" >
+
+> **Note:** Wether using a structural `DoFn` type or a functional `DoFn`, they should be registered with
+> beam in an `init` block. Otherwise they may not execute on distributed runners.
+
+</span>
+
+<span class="language-java">
+
 > **Note:** If the elements in your input `PCollection` are key/value pairs, you
 > can access the key or value by using `element.getKey()` or
 > `element.getValue()`, respectively.
-{{< /paragraph >}}
+
+</span>
+
+<span class="language-go">
+
+> **Note:** If the elements in your input `PCollection` are key/value pairs, your
+> process element method must have two parameters, for each of the key and value,
+> respectively. Similarly, key/value pairs are also output as spearate

Review comment:
       ```suggestion
   > respectively. Similarly, key/value pairs are also output as separate
   ```

##########
File path: sdks/go/examples/snippets/01_03intro.go
##########
@@ -0,0 +1,93 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//    http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package snippets
+
+import (
+	"flag"
+
+	"github.com/apache/beam/sdks/go/pkg/beam"
+	"github.com/apache/beam/sdks/go/pkg/beam/io/textio"
+)
+
+// PipelineConstruction contains snippets for the initial sections of
+// the Beam Programming Guide, from initializing to submitting a
+// pipeline.
+func PipelineConstruction() {
+	// [START pipeline_options]
+	// If beamx or Go flags are used, flags must be parsed first,
+	// before beam.Init() is called.
+	flag.Parse()
+	// [END pipeline_options]
+
+	// [START pipelines_constructing_creating]
+	// beam.Init() is an initialization hook that must be called near
+	// the beginging of main().
+	beam.Init()
+
+	// Create the Pipeline object and root scope.
+	pipeline, scope := beam.NewPipelineWithRoot()
+	// [END pipelines_constructing_creating]
+
+	// [START pipelines_constructing_reading]
+	lines := textio.Read(scope, "gs://some/inputData.txt")
+	// [END pipelines_constructing_reading]
+
+	_ = []interface{}{pipeline, scope, lines}
+}
+
+// Create demonstrates using beam.CreateList.
+func Create() {
+	// [START model_pcollection]
+	lines := []string{
+		"To be, or not to be: that is the question: ",
+		"Whether 'tis nobler in the mind to suffer ",
+		"The slings and arrows of outrageous fortune, ",
+		"Or to take arms against a sea of troubles, ",
+	}
+
+	// Create the Pipeline object and root scope.
+	p, s := beam.NewPipelineWithRoot()

Review comment:
       It might be relevant to point out that using `s` as a variable name for the scope is the convention in the Go SDK. This is what the user will see if they read any other Go SDK examples or code. I'm not sure whether the value of clarity outweighs sticking to convention; I just wanted to mention it. What do you think?



