Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2020/05/27 03:45:09 UTC

[GitHub] [beam] damondouglas commented on a change in pull request #11803: [BEAM-9679] Add a CoGroupByKey lesson to the Core Transforms section

damondouglas commented on a change in pull request #11803:
URL: https://github.com/apache/beam/pull/11803#discussion_r430474979



##########
File path: learning/katas/go/Core Transforms/CoGroupByKey/CoGroupByKey/pkg/task/task.go
##########
@@ -0,0 +1,52 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//    http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package task
+
+import (
+	"fmt"
+	"github.com/apache/beam/sdks/go/pkg/beam"
+)
+
+func ApplyTransform(s beam.Scope, fruits beam.PCollection, countries beam.PCollection) beam.PCollection {
+	fruitsKV := beam.ParDo(s, func(e string) (string, string) {
+		return string(e[0]), e
+	}, fruits)
+
+	countriesKV := beam.ParDo(s, func(e string) (string, string) {
+		return string(e[0]), e
+	}, countries)
+
+	grouped := beam.CoGroupByKey(s, fruitsKV, countriesKV)
+	return beam.ParDo(s, func(key string, f func(*string) bool, c func(*string) bool, emit func(string)) {

Review comment:
       I agree with you and realize that it makes sense for code readability.  I created [BEAM-10091](https://issues.apache.org/jira/browse/BEAM-10091) for when we have completed the series of Go SDK katas.  I'd like to perform an overall cleanup for naming consistency and code readability.

   
   For now, I will adjust the current CoGroupByKey lesson task for code readability.  Thank you, Henry.

##########
File path: learning/katas/go/Core Transforms/CoGroupByKey/CoGroupByKey/task.md
##########
@@ -0,0 +1,104 @@
+<!--
+    Licensed to the Apache Software Foundation (ASF) under one
+    or more contributor license agreements.  See the NOTICE file
+    distributed with this work for additional information
+    regarding copyright ownership.  The ASF licenses this file
+    to you under the Apache License, Version 2.0 (the
+    "License"); you may not use this file except in compliance
+    with the License.  You may obtain a copy of the License at
+
+      http://www.apache.org/licenses/LICENSE-2.0
+
+    Unless required by applicable law or agreed to in writing,
+    software distributed under the License is distributed on an
+    "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+    KIND, either express or implied.  See the License for the
+    specific language governing permissions and limitations
+    under the License.
+-->
+
+# CoGroupByKey
+
+CoGroupByKey performs a relational join of two or more key/value PCollections that have the same 
+key type.
+
+**Kata:** Implement a [beam.CoGroupByKey](https://godoc.org/github.com/apache/beam/sdks/go/pkg/beam#CoGroupByKey) 
+transform that joins words by the first alphabetical letter, and then produces the string representation of the 
+WordsAlphabet model.
+
+<div class="hint">
+    Refer to
+    <a href="https://godoc.org/github.com/apache/beam/sdks/go/pkg/beam#CoGroupByKey">beam.CoGroupByKey</a>
+    to solve this problem.
+</div>
+
+<div class="hint">
+  Refer to the Beam Programming Guide
+  <a href="https://beam.apache.org/documentation/programming-guide/#cogroupbykey">
+    "CoGroupByKey"</a> section for more information.
+</div>
+
+<div class="hint">
+  Think of this problem in three stages.  First, create key/value PCollections (KVs) for fruits and
+  countries, pairing the first character with the word.  Next, apply CoGroupByKey to the KVs.  Finally,
+  apply a ParDo to the result.
+</div>
+
+<div class="hint">
+  In the last lesson we learned how to make key/value PCollections called KV.  Now we need to
+  make two: one from fruits and one from countries.
+  
+  To return as a KV, you can return two values from your DoFn. The first return value represents the Key, and 
+  the second return value represents the Value.  An example is shown below.
+  
+```
+func doFn(element string) (string, string) {
+    key := string(element[0])
+    value := element
+    return key, value
+}
+``` 
+</div>
+
+<div class="hint">
+  In the last lesson we learned that 
+  <a href="https://godoc.org/github.com/apache/beam/sdks/go/pkg/beam#GroupByKey">
+  beam.GroupByKey</a> takes a single KV.
+  <a href="https://godoc.org/github.com/apache/beam/sdks/go/pkg/beam#CoGroupByKey">beam.CoGroupByKey</a>
+  takes more than one KV.
+</div>
+
+<div class="hint">
+  Our final step in this problem requires a
+  <a href="https://godoc.org/github.com/apache/beam/sdks/go/pkg/beam#ParDo">beam.ParDo</a>
+  with a DoFn that's different from what we've seen in previous lessons.  In the previous step we should
+  have a PCollection acquired from CoGroupByKey.  A ParDo for that PCollection expects a DoFn that looks
+  like the following. 
+  
+  ```
+  func doFn(key string, aKV func(*string) bool, anotherKV func(*string) bool, emit func(string)){

Review comment:
       I couldn't find any and solved this by analyzing the error output.  @lostluck, whenever you have the chance, could you let us know if this is correct?

Review comment:
       Yes, I believe this is correct.
