Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2020/08/10 07:08:45 UTC

[GitHub] [flink] shuiqiangchen commented on a change in pull request #13097: [FLINK-18864][python] Support key_by() operation for Python DataStrea…

shuiqiangchen commented on a change in pull request #13097:
URL: https://github.com/apache/flink/pull/13097#discussion_r467721304



##########
File path: flink-python/pyflink/datastream/tests/test_data_stream.py
##########
@@ -140,15 +135,68 @@ def flat_map(value):
 
         flat_mapped_stream = ds.flat_map(flat_map, type_info=Types.ROW([Types.STRING(),
                                                                         Types.INT()]))
-        collect_util = DataStreamCollectUtil()
-        collect_util.collect(flat_mapped_stream)
+        self.collect_util.collect(flat_mapped_stream)
         self.env.execute('flat_map_test')
-        results = collect_util.results()
+        results = self.collect_util.results()
         expected = ['a,0', 'bdc,2', 'deeefg,4']
         results.sort()
         expected.sort()
         self.assertEqual(expected, results)
 
+    def test_key_by(self):
+        element_collection = [('a', 0), ('b', 0), ('c', 1), ('d', 1), ('e', 2)]
+        self.env.set_parallelism(1)
+        ds = self.env.from_collection(element_collection,
+                                      type_info=Types.ROW([Types.STRING(), Types.INT()]))
+
+        class AssertKeyMapFunction(MapFunction):
+            def __init__(self):
+                self.pre = None
+
+            def map(self, value):
+                if value[0] == 'b':
+                    assert self.pre == 'a'
+                if value[0] == 'd':
+                    assert self.pre == 'c'
+                self.pre = value[0]
+                return value
+
+        mapped_stream = ds.key_by(MyKeySelector()).map(AssertKeyMapFunction())
+        self.collect_util.collect(mapped_stream)
+        self.env.execute('key_by_test')
+        results = self.collect_util.results()
+        expected = ["<Row('a', 0)>", "<Row('b', 0)>", "<Row('c', 1)>", "<Row('d', 1)>",
+                    "<Row('e', 2)>"]
+        results.sort()
+        expected.sort()
+        self.assertEqual(expected, results)
+
+    def test_key_by_map(self):

Review comment:
       The two tests are almost the same, but the second one makes sure that the original DataStream is left unchanged after the key_by() operation, which it checks with a two-way map operation.
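
       The property the second test is after can be sketched in plain Python, without PyFlink, so that it runs on its own. Everything here is a hypothetical stand-in and not the PyFlink API: `ToyStream` and its `key_by`, `map` and `collect` methods are invented for illustration. The only assumption is that key_by() returns a new stream and leaves the stream it was called on untouched, so the original can still be mapped afterwards.

```python
from typing import Any, Callable, List


class ToyStream:
    """Hypothetical stand-in for a DataStream; NOT the PyFlink API."""

    def __init__(self, records: List[Any]):
        # Copy the input so later changes to the caller's list cannot leak in.
        self._records = list(records)

    def key_by(self, key_selector: Callable[[Any], Any]) -> "ToyStream":
        # Return a NEW stream grouped by key. sorted() is stable, so records
        # that share a key keep their original relative order.
        return ToyStream(sorted(self._records, key=key_selector))

    def map(self, func: Callable[[Any], Any]) -> "ToyStream":
        # Return a NEW stream; self._records is never modified.
        return ToyStream([func(r) for r in self._records])

    def collect(self) -> List[Any]:
        return list(self._records)


elements = [('c', 1), ('a', 0), ('e', 2), ('b', 0), ('d', 1)]
ds = ToyStream(elements)

# One map runs on the keyed stream, the other on the original stream.
keyed_result = ds.key_by(lambda r: r[1]).map(lambda r: (r[0].upper(), r[1])).collect()
plain_result = ds.map(lambda r: r).collect()

# The original stream still yields its records in their original order.
assert plain_result == elements
```

       In this sketch the keyed branch sees records grouped by the second field, while the map on `ds` still sees the input order, which is the kind of check the second test is meant to provide.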



