Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2020/11/30 11:13:02 UTC

[GitHub] [beam] tszerszen opened a new pull request #13436: [BEAM-11075] Go SDK SideInput load tests

tszerszen opened a new pull request #13436:
URL: https://github.com/apache/beam/pull/13436


   This PR adds SideInput load tests for Go SDK.
   
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] lostluck commented on a change in pull request #13436: [BEAM-11075] Go SDK SideInput load tests

Posted by GitBox <gi...@apache.org>.
lostluck commented on a change in pull request #13436:
URL: https://github.com/apache/beam/pull/13436#discussion_r538663625



##########
File path: sdks/go/test/load/sideinput/sideinput.go
##########
@@ -0,0 +1,100 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//    http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package main
+
+import (
+	"context"
+	"flag"
+	"reflect"
+
+	"github.com/apache/beam/sdks/go/pkg/beam"
+	"github.com/apache/beam/sdks/go/pkg/beam/io/synthetic"
+	"github.com/apache/beam/sdks/go/pkg/beam/log"
+	"github.com/apache/beam/sdks/go/pkg/beam/x/beamx"
+	"github.com/apache/beam/sdks/go/test/load"
+)
+
+func init() {
+	beam.RegisterDoFn(reflect.TypeOf((*doFn)(nil)))
+}
+
+var (
+	accessPercentage = flag.Int(
+		"access_percentage",
+		100,
+		"Specifies the percentage of elements in the side input to be accessed.")
+	syntheticSourceConfig = flag.String(
+		"input_options",
+		"",
+		"A JSON object that describes the configuration for synthetic source")
+)
+
+func parseSyntheticConfig() synthetic.SourceConfig {
+	if *syntheticSourceConfig == "" {
+		panic("--input_options not provided")
+	} else {
+		encoded := []byte(*syntheticSourceConfig)
+		return synthetic.DefaultSourceConfig().BuildFromJSON(encoded)
+	}
+}
+
+type doFn struct {
+	elementsToAccess int

Review comment:
       This isn't the main problem we're debugging, but DoFn fields *must* be exported to be serialized properly.
   
   E.g. `ElementsToAccess` instead.
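
A minimal sketch of why the field name matters, using only the standard `encoding/json` package rather than Beam itself. It assumes the SDK's DoFn serialization follows the same exported-fields-only rule, which is what the comment above implies; the struct names `unexportedFn` and `exportedFn` and the helper `roundTrip` are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// unexportedFn mirrors the reviewed struct: the field name starts with a
// lowercase letter, so reflection-based encoders cannot see it.
type unexportedFn struct {
	elementsToAccess int
}

// exportedFn uses the field name suggested in the review.
type exportedFn struct {
	ElementsToAccess int
}

// roundTrip encodes v as JSON, decodes the result into out, and returns
// the encoded form so the caller can inspect what went over the wire.
func roundTrip(v, out interface{}) string {
	data, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	if err := json.Unmarshal(data, out); err != nil {
		panic(err)
	}
	return string(data)
}

func main() {
	var u unexportedFn
	// The unexported field is silently dropped and comes back as zero.
	fmt.Println(roundTrip(unexportedFn{elementsToAccess: 5}, &u), u.elementsToAccess)

	var e exportedFn
	// The exported field survives the round trip.
	fmt.Println(roundTrip(exportedFn{ElementsToAccess: 5}, &e), e.ElementsToAccess)
}
```

The first value is lost without any error being reported, which is why a mistake like this tends to surface only as wrong behavior on the worker.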







[GitHub] [beam] kamilwu commented on a change in pull request #13436: [BEAM-11075] Go SDK SideInput load tests

Posted by GitBox <gi...@apache.org>.
kamilwu commented on a change in pull request #13436:
URL: https://github.com/apache/beam/pull/13436#discussion_r532582517



##########
File path: .test-infra/jenkins/job_LoadTests_SideInput_Flink_Go.groovy
##########
@@ -0,0 +1,99 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import CommonTestProperties
+import CommonJobProperties as commonJobProperties
+import LoadTestsBuilder as loadTestsBuilder
+import PhraseTriggeringPostCommitBuilder
+import InfluxDBCredentialsHelper
+
+def now = new Date().format("MMddHHmmss", TimeZone.getTimeZone('UTC'))
+
+def batchScenarios = {
+  [
+    [
+      title          : 'SideInput Go Load test: 10gb-1kb-10workers-1window-first-iterable',
+      test           : 'sideinput',
+      runner         : CommonTestProperties.Runner.FLINK,
+      pipelineOptions: [
+        job_name           : "load-tests-go-flink-batch-sideinput-1-${now}",
+        influx_namespace   : 'flink',
+        influx_measurement : 'go_batch_sideinput_1',
+        input_options      : '\'{' +
+        '"num_records": 10000000,' +
+        '"key_size": 100,' +
+        '"value_size": 900}\'',
+        parallelism        : 10,
+        endpoint           : 'localhost:8099',
+        environment_type   : 'DOCKER',
+        environment_config : "${DOCKER_CONTAINER_REGISTRY}/beam_go_sdk:latest",

Review comment:
       This test is exactly the same as the one below. Did you miss `access_elements`?







[GitHub] [beam] kamilwu commented on a change in pull request #13436: [BEAM-11075] Go SDK SideInput load tests

Posted by GitBox <gi...@apache.org>.
kamilwu commented on a change in pull request #13436:
URL: https://github.com/apache/beam/pull/13436#discussion_r532587315



##########
File path: .test-infra/jenkins/job_LoadTests_SideInput_Flink_Go.groovy
##########
@@ -0,0 +1,99 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import CommonTestProperties
+import CommonJobProperties as commonJobProperties
+import LoadTestsBuilder as loadTestsBuilder
+import PhraseTriggeringPostCommitBuilder
+import InfluxDBCredentialsHelper
+
+def now = new Date().format("MMddHHmmss", TimeZone.getTimeZone('UTC'))
+
+def batchScenarios = {
+  [
+    [
+      title          : 'SideInput Go Load test: 10gb-1kb-10workers-1window-first-iterable',
+      test           : 'sideinput',
+      runner         : CommonTestProperties.Runner.FLINK,
+      pipelineOptions: [
+        job_name           : "load-tests-go-flink-batch-sideinput-1-${now}",
+        influx_namespace   : 'flink',
+        influx_measurement : 'go_batch_sideinput_1',

Review comment:
       You _cannot_ use `1` here. This number is an ID that uniquely identifies the test. For side input tests:
   * `1` corresponds to the `1gb-1kb-10workers-1window-1key-percent-dict` test
   * `2` corresponds to the `1gb-1kb-10workers-1window-99key-percent-dict` test
   and so on and so forth. You can see a full set of side input tests here: https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_LoadTests_SideInput_Python.groovy
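
A small sketch of the ID convention described above. The table lists only the two IDs named in this comment; the full set lives in the linked Python job file. The helper `measurementFor` is hypothetical, and the `go_batch_sideinput_<id>` pattern is taken from the diff under review.

```go
package main

import "fmt"

// sideInputScenarios holds only the two scenario IDs quoted in the review
// comment; it is deliberately incomplete.
var sideInputScenarios = map[int]string{
	1: "1gb-1kb-10workers-1window-1key-percent-dict",
	2: "1gb-1kb-10workers-1window-99key-percent-dict",
}

// measurementFor returns the measurement name and scenario for a known ID,
// and an error for an ID that is missing from the table.
func measurementFor(id int) (name, scenario string, err error) {
	scenario, ok := sideInputScenarios[id]
	if !ok {
		return "", "", fmt.Errorf("unknown side input scenario ID %d", id)
	}
	return fmt.Sprintf("go_batch_sideinput_%d", id), scenario, nil
}

func main() {
	name, scenario, err := measurementFor(1)
	fmt.Println(name, scenario, err)
}
```

Under this convention, reusing `1` for the `first-iterable` scenario would label its results as the `1key-percent-dict` test.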







[GitHub] [beam] tszerszen commented on a change in pull request #13436: [BEAM-11075] Go SDK SideInput load tests

Posted by GitBox <gi...@apache.org>.
tszerszen commented on a change in pull request #13436:
URL: https://github.com/apache/beam/pull/13436#discussion_r532593345



##########
File path: .test-infra/jenkins/job_LoadTests_SideInput_Flink_Go.groovy
##########
@@ -0,0 +1,99 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import CommonTestProperties
+import CommonJobProperties as commonJobProperties
+import LoadTestsBuilder as loadTestsBuilder
+import PhraseTriggeringPostCommitBuilder
+import InfluxDBCredentialsHelper
+
+def now = new Date().format("MMddHHmmss", TimeZone.getTimeZone('UTC'))
+
+def batchScenarios = {
+  [
+    [
+      title          : 'SideInput Go Load test: 10gb-1kb-10workers-1window-first-iterable',
+      test           : 'sideinput',
+      runner         : CommonTestProperties.Runner.FLINK,
+      pipelineOptions: [
+        job_name           : "load-tests-go-flink-batch-sideinput-1-${now}",
+        influx_namespace   : 'flink',
+        influx_measurement : 'go_batch_sideinput_1',
+        input_options      : '\'{' +
+        '"num_records": 10000000,' +
+        '"key_size": 100,' +
+        '"value_size": 900}\'',
+        parallelism        : 10,
+        endpoint           : 'localhost:8099',
+        environment_type   : 'DOCKER',
+        environment_config : "${DOCKER_CONTAINER_REGISTRY}/beam_go_sdk:latest",

Review comment:
       @kamilwu Sorry, you're right: the default `access_percentage` is 100. The first test should therefore set `access_percentage` to 1, and the second should leave it unset so that it gets the default of 100. Fixed it.
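
A sketch of the default-value behavior discussed above, using the standard `flag` package with a private `FlagSet` so it can be run in isolation. The flag name, default, and help text come from the diff; `parseAccessPercentage` and `elementsToAccess` are hypothetical helpers, and the percentage-to-count formula is an assumption rather than the load test's actual logic.

```go
package main

import (
	"flag"
	"fmt"
)

// parseAccessPercentage parses args with its own FlagSet so the sketch does
// not touch the process-wide flag.CommandLine.
func parseAccessPercentage(args []string) (int, error) {
	fs := flag.NewFlagSet("sideinput", flag.ContinueOnError)
	pct := fs.Int("access_percentage", 100,
		"Specifies the percentage of elements in the side input to be accessed.")
	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	return *pct, nil
}

// elementsToAccess converts a percentage into an element count.
// Assumption: simple integer arithmetic, for illustration only.
func elementsToAccess(numRecords, pct int) int {
	return numRecords * pct / 100
}

func main() {
	// First scenario: the option is set explicitly.
	first, _ := parseAccessPercentage([]string{"--access_percentage=1"})
	// Second scenario: the option is omitted, so the default applies.
	second, _ := parseAccessPercentage(nil)

	fmt.Println(first, elementsToAccess(10000000, first))
	fmt.Println(second, elementsToAccess(10000000, second))
}
```

Two scenarios that both omit the option would therefore run with identical settings, which is the duplication the reviewer pointed out.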







[GitHub] [beam] tszerszen commented on pull request #13436: [BEAM-11075] Go SDK SideInput load tests

Posted by GitBox <gi...@apache.org>.
tszerszen commented on pull request #13436:
URL: https://github.com/apache/beam/pull/13436#issuecomment-735724215


   R: @kamilwu ?





[GitHub] [beam] tszerszen commented on a change in pull request #13436: [BEAM-11075] Go SDK SideInput load tests

Posted by GitBox <gi...@apache.org>.
tszerszen commented on a change in pull request #13436:
URL: https://github.com/apache/beam/pull/13436#discussion_r532590052



##########
File path: .test-infra/jenkins/job_LoadTests_SideInput_Flink_Go.groovy
##########
@@ -0,0 +1,99 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import CommonTestProperties
+import CommonJobProperties as commonJobProperties
+import LoadTestsBuilder as loadTestsBuilder
+import PhraseTriggeringPostCommitBuilder
+import InfluxDBCredentialsHelper
+
+def now = new Date().format("MMddHHmmss", TimeZone.getTimeZone('UTC'))
+
+def batchScenarios = {
+  [
+    [
+      title          : 'SideInput Go Load test: 10gb-1kb-10workers-1window-first-iterable',
+      test           : 'sideinput',
+      runner         : CommonTestProperties.Runner.FLINK,
+      pipelineOptions: [
+        job_name           : "load-tests-go-flink-batch-sideinput-1-${now}",
+        influx_namespace   : 'flink',
+        influx_measurement : 'go_batch_sideinput_1',

Review comment:
       Thank you, you're right. Just fixed it.







[GitHub] [beam] kamilwu merged pull request #13436: [BEAM-11075] Go SDK SideInput load tests

Posted by GitBox <gi...@apache.org>.
kamilwu merged pull request #13436:
URL: https://github.com/apache/beam/pull/13436


   





[GitHub] [beam] kamilwu commented on a change in pull request #13436: [BEAM-11075] Go SDK SideInput load tests

Posted by GitBox <gi...@apache.org>.
kamilwu commented on a change in pull request #13436:
URL: https://github.com/apache/beam/pull/13436#discussion_r532535997



##########
File path: .test-infra/jenkins/job_LoadTests_SideInput_Flink_Go.groovy
##########
@@ -0,0 +1,101 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import CommonTestProperties
+import CommonJobProperties as commonJobProperties
+import LoadTestsBuilder as loadTestsBuilder
+import PhraseTriggeringPostCommitBuilder
+import InfluxDBCredentialsHelper
+
+def now = new Date().format("MMddHHmmss", TimeZone.getTimeZone('UTC'))
+
+def fromTemplate = { mode, name, id, testSpecificOptions ->
+  [
+    title          : "SideInput Go Load test: ${name}",
+    test           : 'sideinput',
+    runner         : CommonTestProperties.Runner.FLINK,
+    pipelineOptions: [
+      job_name           : "load-tests-go-flink-${mode}-sideinput-${id}-${now}",
+      influx_namespace   : 'flink',
+      influx_measurement : "go_${mode}_sideinput_${id}_${now}",

Review comment:
       Why `${now}`? We would quickly end up with tons of different measurements in the database, each containing only one element.
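
A sketch of the concern raised above: a measurement name that embeds the run timestamp yields a new measurement on every run, while a stable name collects all runs in one series. The helper names, the scenario ID `3`, and the timestamps are hypothetical; the name patterns follow the template in the diff.

```go
package main

import "fmt"

// stableMeasurement builds a name that is identical on every run.
func stableMeasurement(mode string, id int) string {
	return fmt.Sprintf("go_%s_sideinput_%d", mode, id)
}

// timestampedMeasurement reproduces the reviewed template, which appends
// the run timestamp to the name.
func timestampedMeasurement(mode string, id int, now string) string {
	return fmt.Sprintf("go_%s_sideinput_%d_%s", mode, id, now)
}

// countMeasurements returns how many distinct measurement names appear.
func countMeasurements(names []string) int {
	seen := make(map[string]struct{})
	for _, n := range names {
		seen[n] = struct{}{}
	}
	return len(seen)
}

func main() {
	// Hypothetical MMddHHmmss stamps for three separate runs.
	runs := []string{"1130111302", "1201111302", "1202111302"}

	var stable, stamped []string
	for _, now := range runs {
		stable = append(stable, stableMeasurement("batch", 3))
		stamped = append(stamped, timestampedMeasurement("batch", 3, now))
	}
	// Three runs: one stable measurement versus three timestamped ones.
	fmt.Println(countMeasurements(stable), countMeasurements(stamped))
}
```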







[GitHub] [beam] tszerszen commented on a change in pull request #13436: [BEAM-11075] Go SDK SideInput load tests

Posted by GitBox <gi...@apache.org>.
tszerszen commented on a change in pull request #13436:
URL: https://github.com/apache/beam/pull/13436#discussion_r532564312



##########
File path: sdks/go/test/load/sideinput/sideinput.go
##########
@@ -0,0 +1,82 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//    http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package main
+
+import (
+	"context"
+	"flag"
+
+	"github.com/apache/beam/sdks/go/pkg/beam"
+	"github.com/apache/beam/sdks/go/pkg/beam/io/synthetic"
+	"github.com/apache/beam/sdks/go/pkg/beam/log"
+	"github.com/apache/beam/sdks/go/pkg/beam/x/beamx"
+	"github.com/apache/beam/sdks/go/test/load"
+)
+
+var (
+	accessPercentage = flag.Int(
+		"access_percentage",
+		100,
+		"Specifies the percentage of elements in the side input to be accessed.")
+	syntheticSourceConfig = flag.String(
+		"input_options",
+		"{"+

Review comment:
       I removed the default value for the `syntheticSourceConfig` flag.







[GitHub] [beam] tszerszen commented on pull request #13436: [BEAM-11075] Go SDK SideInput load tests

Posted by GitBox <gi...@apache.org>.
tszerszen commented on pull request #13436:
URL: https://github.com/apache/beam/pull/13436#issuecomment-735725960


   Run Spotless PreCommit





[GitHub] [beam] tszerszen commented on a change in pull request #13436: [BEAM-11075] Go SDK SideInput load tests

Posted by GitBox <gi...@apache.org>.
tszerszen commented on a change in pull request #13436:
URL: https://github.com/apache/beam/pull/13436#discussion_r542785895



##########
File path: .test-infra/jenkins/job_LoadTests_SideInput_Go.groovy
##########
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import CommonJobProperties as commonJobProperties
+import CommonTestProperties
+import LoadTestsBuilder as loadTestsBuilder
+import PhraseTriggeringPostCommitBuilder
+import InfluxDBCredentialsHelper
+
+String now = new Date().format('MMddHHmmss', TimeZone.getTimeZone('UTC'))
+
+def batchScenarios = {
+  [
+    [
+      title          : 'SideInput Go Load test: 400mb-1kb-10workers-1window-first-iterable',

Review comment:
       Thank you, I changed everything to 400mb-*.








[GitHub] [beam] tszerszen commented on a change in pull request #13436: [BEAM-11075] Go SDK SideInput load tests

Posted by GitBox <gi...@apache.org>.
tszerszen commented on a change in pull request #13436:
URL: https://github.com/apache/beam/pull/13436#discussion_r532563358



##########
File path: .test-infra/jenkins/job_LoadTests_SideInput_Flink_Go.groovy
##########
@@ -0,0 +1,101 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import CommonTestProperties
+import CommonJobProperties as commonJobProperties
+import LoadTestsBuilder as loadTestsBuilder
+import PhraseTriggeringPostCommitBuilder
+import InfluxDBCredentialsHelper
+
+def now = new Date().format("MMddHHmmss", TimeZone.getTimeZone('UTC'))
+
+def fromTemplate = { mode, name, id, testSpecificOptions ->
+  [
+    title          : "SideInput Go Load test: ${name}",
+    test           : 'sideinput',
+    runner         : CommonTestProperties.Runner.FLINK,
+    pipelineOptions: [
+      job_name           : "load-tests-go-flink-${mode}-sideinput-${id}-${now}",
+      influx_namespace   : 'flink',
+      influx_measurement : "go_${mode}_sideinput_${id}_${now}",

Review comment:
       Thank you. In the last commit I removed `${now}` from `influx_measurement`.







[GitHub] [beam] lostluck edited a comment on pull request #13436: [BEAM-11075] Go SDK SideInput load tests

Posted by GitBox <gi...@apache.org>.
lostluck edited a comment on pull request #13436:
URL: https://github.com/apache/beam/pull/13436#issuecomment-740863368


   WRT the investigation we've been doing around a Flink failure for large settings:
   In my own investigation into the problem on the Google-internal runner, the "big" configuration ends up being too large for protocol buffer serialization: it produces a single ~10GB StateResponse, which causes the issue. Protos have a hard cap of 2GB in serialized size.
   
   From the #beam-go Slack discussion and other research, Java and Python also fail with large configurations, because paging through large side inputs is not yet implemented anywhere.
   
   `--input_options='{"num_records": 2000000, "key_size":100, "value_size":900}' --access_percentage=1`
   and
   `--input_options='{"num_records": 10000000, "key_size":100, "value_size":180}' --access_percentage=1`
   work by cutting the total data size down to 1/5th.
   
   This will affect all portable runners, but none of the legacy ones, because those don't pass data around through the protos. A JIRA is being filed about it.





[GitHub] [beam] lostluck commented on pull request #13436: [BEAM-11075] Go SDK SideInput load tests

Posted by GitBox <gi...@apache.org>.
lostluck commented on pull request #13436:
URL: https://github.com/apache/beam/pull/13436#issuecomment-740863368


   WRT the investigation we've been doing around a Flink failure for large settings:
   In my own investigation into the problem on the Google-internal runner, the "big" configuration ends up being too large for protocol buffer serialization: it produces a single ~10GB StateResponse, which causes the issue. Protos have a hard cap of 2GB in serialized size.
   
   From the #beam-go Slack discussion, Python also fails with large configurations.
   
   `--input_options='{"num_records": 2000000, "key_size":100, "value_size":900}' --access_percentage=1`
   and
   `--input_options='{"num_records": 10000000, "key_size":100, "value_size":180}' --access_percentage=1`
   work by cutting the total data size down to 1/5th.
   
   This will affect all portable runners, but none of the legacy ones, because those don't pass data around through the protos. A JIRA is being filed about it.





[GitHub] [beam] tszerszen commented on pull request #13436: [BEAM-11075] Go SDK SideInput load tests

Posted by GitBox <gi...@apache.org>.
tszerszen commented on pull request #13436:
URL: https://github.com/apache/beam/pull/13436#issuecomment-735764724


   Run Spotless PreCommit





[GitHub] [beam] kamilwu commented on a change in pull request #13436: [BEAM-11075] Go SDK SideInput load tests

Posted by GitBox <gi...@apache.org>.
kamilwu commented on a change in pull request #13436:
URL: https://github.com/apache/beam/pull/13436#discussion_r532582517



##########
File path: .test-infra/jenkins/job_LoadTests_SideInput_Flink_Go.groovy
##########
@@ -0,0 +1,99 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import CommonTestProperties
+import CommonJobProperties as commonJobProperties
+import LoadTestsBuilder as loadTestsBuilder
+import PhraseTriggeringPostCommitBuilder
+import InfluxDBCredentialsHelper
+
+def now = new Date().format("MMddHHmmss", TimeZone.getTimeZone('UTC'))
+
+def batchScenarios = {
+  [
+    [
+      title          : 'SideInput Go Load test: 10gb-1kb-10workers-1window-first-iterable',
+      test           : 'sideinput',
+      runner         : CommonTestProperties.Runner.FLINK,
+      pipelineOptions: [
+        job_name           : "load-tests-go-flink-batch-sideinput-1-${now}",
+        influx_namespace   : 'flink',
+        influx_measurement : 'go_batch_sideinput_1',
+        input_options      : '\'{' +
+        '"num_records": 10000000,' +
+        '"key_size": 100,' +
+        '"value_size": 900}\'',
+        parallelism        : 10,
+        endpoint           : 'localhost:8099',
+        environment_type   : 'DOCKER',
+        environment_config : "${DOCKER_CONTAINER_REGISTRY}/beam_go_sdk:latest",

Review comment:
       This test is exactly the same as the one below. Did you miss `access_elements`? 
   EDIT: `access_percentage`







[GitHub] [beam] kamilwu commented on a change in pull request #13436: [BEAM-11075] Go SDK SideInput load tests

Posted by GitBox <gi...@apache.org>.
kamilwu commented on a change in pull request #13436:
URL: https://github.com/apache/beam/pull/13436#discussion_r532536926



##########
File path: sdks/go/test/load/sideinput/sideinput.go
##########
@@ -0,0 +1,82 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//    http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package main
+
+import (
+	"context"
+	"flag"
+
+	"github.com/apache/beam/sdks/go/pkg/beam"
+	"github.com/apache/beam/sdks/go/pkg/beam/io/synthetic"
+	"github.com/apache/beam/sdks/go/pkg/beam/log"
+	"github.com/apache/beam/sdks/go/pkg/beam/x/beamx"
+	"github.com/apache/beam/sdks/go/test/load"
+)
+
+var (
+	accessPercentage = flag.Int(
+		"access_percentage",
+		100,
+		"Specifies the percentage of elements in the side input to be accessed.")
+	syntheticSourceConfig = flag.String(
+		"input_options",
+		"{"+

Review comment:
       Please do not provide any default values for this parameter, as it should be required. 
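
One way to make the flag required is sketched below, using only the standard `flag` package: the default is empty, and an empty value after parsing is reported as an error. The helper `parseInputOptions` and the flag set name are hypothetical; the PR itself may handle this differently.

```go
package main

import (
	"errors"
	"flag"
	"fmt"
)

// parseInputOptions is a hypothetical helper that treats --input_options as
// required: the flag has an empty default, and an empty value is an error.
func parseInputOptions(args []string) (string, error) {
	fs := flag.NewFlagSet("sideinput", flag.ContinueOnError)
	inputOptions := fs.String("input_options", "",
		"A JSON object that describes the configuration for synthetic source")
	if err := fs.Parse(args); err != nil {
		return "", err
	}
	if *inputOptions == "" {
		return "", errors.New("--input_options not provided")
	}
	return *inputOptions, nil
}

func main() {
	// No arguments: the required flag is missing, so an error is returned.
	_, err := parseInputOptions(nil)
	fmt.Println("without flag:", err)

	// With the flag set, the raw JSON string is returned for later parsing.
	opts, err := parseInputOptions([]string{`--input_options={"num_records": 300}`})
	fmt.Println("with flag:", opts, err)
}
```

Returning an error instead of silently falling back to a small default keeps a misconfigured load test from reporting misleadingly fast results.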







[GitHub] [beam] lostluck edited a comment on pull request #13436: [BEAM-11075] Go SDK SideInput load tests

Posted by GitBox <gi...@apache.org>.
lostluck edited a comment on pull request #13436:
URL: https://github.com/apache/beam/pull/13436#issuecomment-740863368


   WRT the investigation we've been doing around a Flink failure for large settings:
   In my own investigation into the problem on the Google-internal runner, the "big" configuration ends up being too large for protocol buffer serialization: it produces a single ~10GB StateResponse, which causes the issue. Protos have a hard cap of 2GB in serialized size.
   
   From the #beam-go Slack discussion and other research, Java and Python also fail with large configurations, because paging through large side inputs is not yet implemented anywhere.
   
   `--input_options='{"num_records": 2000000, "key_size":100, "value_size":900}' --access_percentage=1`
   and
   `--input_options='{"num_records": 10000000, "key_size":100, "value_size":90}' --access_percentage=1`
   work by cutting the total data size down to 1/5th.
   
   This will affect all portable runners, but none of the legacy ones, because those don't pass data around through the protos. A JIRA is being filed about it.
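
The arithmetic behind these numbers can be checked with a small Go sketch. It estimates the raw payload as records times (key size plus value size) and compares it with the 2GB cap. The exact limit of 2^31 - 1 bytes and the helper names `totalBytes` and `fitsInOneProto` are assumptions made for this illustration, and encoding overhead is ignored, so the real serialized size would be somewhat larger.

```go
package main

import "fmt"

// protoLimitBytes models the 2GB serialized-size cap mentioned above as
// 2^31 - 1 bytes (an assumption about the exact figure).
const protoLimitBytes int64 = 1<<31 - 1

// totalBytes estimates the raw payload as records * (key + value) bytes.
// It ignores encoding overhead, so the real serialized size is larger.
func totalBytes(numRecords, keySize, valueSize int64) int64 {
	return numRecords * (keySize + valueSize)
}

// fitsInOneProto reports whether the estimated payload stays under the cap.
func fitsInOneProto(numRecords, keySize, valueSize int64) bool {
	return totalBytes(numRecords, keySize, valueSize) <= protoLimitBytes
}

func main() {
	configs := [][3]int64{
		{10000000, 100, 900}, // original "10gb" configuration
		{2000000, 100, 900},  // one fifth of the records
		{10000000, 100, 90},  // same record count, smaller values
	}
	for _, c := range configs {
		fmt.Printf("records=%d key=%d value=%d total=%d fits=%t\n",
			c[0], c[1], c[2],
			totalBytes(c[0], c[1], c[2]),
			fitsInOneProto(c[0], c[1], c[2]))
	}
}
```

Under this estimate the original configuration is about 10 GB and exceeds the cap, while the two reduced configurations come to 2.0 GB and 1.9 GB respectively, which is just under it.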





[GitHub] [beam] tszerszen commented on a change in pull request #13436: [BEAM-11075] Go SDK SideInput load tests

Posted by GitBox <gi...@apache.org>.
tszerszen commented on a change in pull request #13436:
URL: https://github.com/apache/beam/pull/13436#discussion_r532564262



##########
File path: sdks/go/test/load/sideinput/sideinput.go
##########
@@ -0,0 +1,82 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//    http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package main
+
+import (
+	"context"
+	"flag"
+
+	"github.com/apache/beam/sdks/go/pkg/beam"
+	"github.com/apache/beam/sdks/go/pkg/beam/io/synthetic"
+	"github.com/apache/beam/sdks/go/pkg/beam/log"
+	"github.com/apache/beam/sdks/go/pkg/beam/x/beamx"
+	"github.com/apache/beam/sdks/go/test/load"
+)
+
+var (
+	accessPercentage = flag.Int(
+		"access_percentage",
+		100,
+		"Specifies the percentage of elements in the side input to be accessed.")
+	syntheticSourceConfig = flag.String(
+		"input_options",
+		"{"+
+			"\"num_records\": 300, "+
+			"\"key_size\": 5, "+
+			"\"value_size\": 15}",
+		"A JSON object that describes the configuration for synthetic source")
+)
+
+func parseSyntheticConfig() synthetic.SourceConfig {
+	if *syntheticSourceConfig == "" {
+		panic("--input_options not provided")
+	} else {
+		encoded := []byte(*syntheticSourceConfig)
+		return synthetic.DefaultSourceConfig().BuildFromJSON(encoded)
+	}
+}
+
+func main() {
+	flag.Parse()
+	beam.Init()
+	ctx := context.Background()
+	p, s := beam.NewPipelineWithRoot()
+
+	syntheticConfig := parseSyntheticConfig()
+	elementsToAccess := syntheticConfig.NumElements * *accessPercentage / 100
+
+	src := synthetic.SourceSingle(s, syntheticConfig)
+	src = beam.ParDo(s, &load.RuntimeMonitor{}, src)
+
+	src = beam.ParDo(s, func(_ []byte, values func(*[]byte, *[]byte) bool, emit func([]byte, []byte)) {
+		var key []byte
+		var value []byte
+		i := 0
+		for values(&key, &value) {
+			if i == elementsToAccess {
+				break
+			}
+			emit(key, value)
+			i++
+		}
+	}, beam.Impulse(s), beam.SideInput{Input: src})
+
+	beam.ParDo(s, &load.RuntimeMonitor{}, src)
+
+	if err := beamx.Run(ctx, p); err != nil {

Review comment:
       OK, it now runs with `beamx.RunWithMetrics`.







[GitHub] [beam] kamilwu commented on a change in pull request #13436: [BEAM-11075] Go SDK SideInput load tests

Posted by GitBox <gi...@apache.org>.
kamilwu commented on a change in pull request #13436:
URL: https://github.com/apache/beam/pull/13436#discussion_r532540690



##########
File path: sdks/go/test/load/sideinput/sideinput.go
##########
@@ -0,0 +1,82 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//    http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package main
+
+import (
+	"context"
+	"flag"
+
+	"github.com/apache/beam/sdks/go/pkg/beam"
+	"github.com/apache/beam/sdks/go/pkg/beam/io/synthetic"
+	"github.com/apache/beam/sdks/go/pkg/beam/log"
+	"github.com/apache/beam/sdks/go/pkg/beam/x/beamx"
+	"github.com/apache/beam/sdks/go/test/load"
+)
+
+var (
+	accessPercentage = flag.Int(
+		"access_percentage",
+		100,
+		"Specifies the percentage of elements in the side input to be accessed.")
+	syntheticSourceConfig = flag.String(
+		"input_options",
+		"{"+
+			"\"num_records\": 300, "+
+			"\"key_size\": 5, "+
+			"\"value_size\": 15}",
+		"A JSON object that describes the configuration for synthetic source")
+)
+
+func parseSyntheticConfig() synthetic.SourceConfig {
+	if *syntheticSourceConfig == "" {
+		panic("--input_options not provided")
+	} else {
+		encoded := []byte(*syntheticSourceConfig)
+		return synthetic.DefaultSourceConfig().BuildFromJSON(encoded)
+	}
+}
+
+func main() {
+	flag.Parse()
+	beam.Init()
+	ctx := context.Background()
+	p, s := beam.NewPipelineWithRoot()
+
+	syntheticConfig := parseSyntheticConfig()
+	elementsToAccess := syntheticConfig.NumElements * *accessPercentage / 100
+
+	src := synthetic.SourceSingle(s, syntheticConfig)
+	src = beam.ParDo(s, &load.RuntimeMonitor{}, src)
+
+	src = beam.ParDo(s, func(_ []byte, values func(*[]byte, *[]byte) bool, emit func([]byte, []byte)) {
+		var key []byte
+		var value []byte
+		i := 0
+		for values(&key, &value) {
+			if i == elementsToAccess {
+				break
+			}
+			emit(key, value)
+			i++
+		}
+	}, beam.Impulse(s), beam.SideInput{Input: src})
+
+	beam.ParDo(s, &load.RuntimeMonitor{}, src)
+
+	if err := beamx.Run(ctx, p); err != nil {

Review comment:
       `beamx.Run` -> `beamx.RunWithMetrics`. Please also add the code for sending results to the database. 







[GitHub] [beam] kamilwu commented on a change in pull request #13436: [BEAM-11075] Go SDK SideInput load tests

Posted by GitBox <gi...@apache.org>.
kamilwu commented on a change in pull request #13436:
URL: https://github.com/apache/beam/pull/13436#discussion_r542489194



##########
File path: .test-infra/metrics/grafana/dashboards/perftests_metrics/SideInput_Load_Tests.json
##########
@@ -21,7 +21,7 @@
   "links": [],
   "panels": [
     {
-      "content": "The following options are common to all tests:\n* key size: 100B\n* value size: 900B\n* number of workers: 10\n* size of the window (if fixed windows are used): 1 second\n\nAdditional common options for Dataflow:\n* experiments: use_runner_v2\n* autoscaling_algorithm: NONE\n\n\n[Jenkins job definition (Python, Dataflow)](https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_LoadTests_SideInput_Python.groovy)\n\n",
+      "content": "The following options are common to all tests:\n* key size: 100B\n* value size: 900B\n* number of workers: 10\n* size of the window (if fixed windows are used): 1 second\n\nAdditional common options for Dataflow:\n* experiments: use_runner_v2\n* autoscaling_algorithm: NONE\n\n\n[Jenkins job definition (Python, Dataflow)](https://github.com/apache/beam/blob/master/.test-infra/jenkins/job_LoadTests_SideInput_Python.groovy)\n\nUntil the issue [BEAM-11427](https://issues.apache.org/jira/browse/BEAM-11427) in Go SDK is resolved, sideinput iteration test have 400MB, instead of 10GB.",

Review comment:
       The comment is OK, but you should also specify the additional options for Flink. And while there is a link to the Python Jenkins job, you could also add links to your newly defined jobs.

##########
File path: .test-infra/jenkins/job_LoadTests_SideInput_Go.groovy
##########
@@ -0,0 +1,94 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import CommonJobProperties as commonJobProperties
+import CommonTestProperties
+import LoadTestsBuilder as loadTestsBuilder
+import PhraseTriggeringPostCommitBuilder
+import InfluxDBCredentialsHelper
+
+String now = new Date().format('MMddHHmmss', TimeZone.getTimeZone('UTC'))
+
+def batchScenarios = {
+  [
+    [
+      title          : 'SideInput Go Load test: 400mb-1kb-10workers-1window-first-iterable',

Review comment:
       In the Flink job, the title begins with `10gb`, while here it begins with `400mb`. Please choose one of these and stick to it.







[GitHub] [beam] lostluck commented on a change in pull request #13436: [BEAM-11075] Go SDK SideInput load tests

Posted by GitBox <gi...@apache.org>.
lostluck commented on a change in pull request #13436:
URL: https://github.com/apache/beam/pull/13436#discussion_r538663625



##########
File path: sdks/go/test/load/sideinput/sideinput.go
##########
@@ -0,0 +1,100 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//    http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package main
+
+import (
+	"context"
+	"flag"
+	"reflect"
+
+	"github.com/apache/beam/sdks/go/pkg/beam"
+	"github.com/apache/beam/sdks/go/pkg/beam/io/synthetic"
+	"github.com/apache/beam/sdks/go/pkg/beam/log"
+	"github.com/apache/beam/sdks/go/pkg/beam/x/beamx"
+	"github.com/apache/beam/sdks/go/test/load"
+)
+
+func init() {
+	beam.RegisterDoFn(reflect.TypeOf((*doFn)(nil)))
+}
+
+var (
+	accessPercentage = flag.Int(
+		"access_percentage",
+		100,
+		"Specifies the percentage of elements in the side input to be accessed.")
+	syntheticSourceConfig = flag.String(
+		"input_options",
+		"",
+		"A JSON object that describes the configuration for synthetic source")
+)
+
+func parseSyntheticConfig() synthetic.SourceConfig {
+	if *syntheticSourceConfig == "" {
+		panic("--input_options not provided")
+	} else {
+		encoded := []byte(*syntheticSourceConfig)
+		return synthetic.DefaultSourceConfig().BuildFromJSON(encoded)
+	}
+}
+
+type doFn struct {
+	elementsToAccess int

Review comment:
       This isn't the main problem we're debugging, but structural DoFn fields *must* be exported to get serialized properly.
   
   E.g. `ElementsToAccess` instead.
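
The effect can be reproduced outside Beam with a short Go sketch. It assumes the struct is serialized by a reflection-based encoder such as `encoding/json`, which only sees exported fields; whether the SDK uses exactly this encoder is an assumption here. The types `unexportedFn` and `exportedFn` and the helper `roundTrip` are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// unexportedFn mimics a DoFn whose configuration lives in an unexported
// field. Reflection-based encoders cannot see this field.
type unexportedFn struct {
	elementsToAccess int
}

// exportedFn mimics the corrected DoFn with an exported field.
type exportedFn struct {
	ElementsToAccess int
}

// roundTrip encodes in to JSON and decodes the result into out, standing in
// for the serialize-on-submit, deserialize-on-worker cycle.
func roundTrip(in, out interface{}) error {
	data, err := json.Marshal(in)
	if err != nil {
		return err
	}
	return json.Unmarshal(data, out)
}

func main() {
	var lost unexportedFn
	if err := roundTrip(&unexportedFn{elementsToAccess: 100}, &lost); err != nil {
		panic(err)
	}

	var kept exportedFn
	if err := roundTrip(&exportedFn{ElementsToAccess: 100}, &kept); err != nil {
		panic(err)
	}

	// The unexported value is silently reset to zero; the exported one survives.
	fmt.Println(lost.elementsToAccess, kept.ElementsToAccess)
}
```

In the load test this would mean the worker sees a limit of zero elements, so the loop would stop before emitting anything, without any error being raised.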







[GitHub] [beam] tszerszen commented on a change in pull request #13436: [BEAM-11075] Go SDK SideInput load tests

Posted by GitBox <gi...@apache.org>.
tszerszen commented on a change in pull request #13436:
URL: https://github.com/apache/beam/pull/13436#discussion_r532589094



##########
File path: .test-infra/jenkins/job_LoadTests_SideInput_Flink_Go.groovy
##########
@@ -0,0 +1,99 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+import CommonTestProperties
+import CommonJobProperties as commonJobProperties
+import LoadTestsBuilder as loadTestsBuilder
+import PhraseTriggeringPostCommitBuilder
+import InfluxDBCredentialsHelper
+
+def now = new Date().format("MMddHHmmss", TimeZone.getTimeZone('UTC'))
+
+def batchScenarios = {
+  [
+    [
+      title          : 'SideInput Go Load test: 10gb-1kb-10workers-1window-first-iterable',
+      test           : 'sideinput',
+      runner         : CommonTestProperties.Runner.FLINK,
+      pipelineOptions: [
+        job_name           : "load-tests-go-flink-batch-sideinput-1-${now}",
+        influx_namespace   : 'flink',
+        influx_measurement : 'go_batch_sideinput_1',
+        input_options      : '\'{' +
+        '"num_records": 10000000,' +
+        '"key_size": 100,' +
+        '"value_size": 900}\'',
+        parallelism        : 10,
+        endpoint           : 'localhost:8099',
+        environment_type   : 'DOCKER',
+        environment_config : "${DOCKER_CONTAINER_REGISTRY}/beam_go_sdk:latest",

Review comment:
       @kamilwu if you look at `job_LoadTests_SideInput_Python.groovy`, there are two iterable tests. They are identical except for the name. Here is the relevant fragment:
   ```groovy
       [
         name: '10gb-1kb-10workers-1window-first-iterable',
         testSpecificOptions: [
           input_options    : '\'{' +
           '"num_records": 10000000,' +
           '"key_size": 100,' +
           '"value_size": 900}\'',
           side_input_type  : 'iter',
           access_percentage: 1,
         ]
       ],
       [
         name: '10gb-1kb-10workers-1window-iterable',
         testSpecificOptions: [
           input_options    : '\'{' +
           '"num_records": 10000000,' +
           '"key_size": 100,' +
           '"value_size": 900}\'',
           side_input_type  : 'iter',
         ]
       ],
   ```
   The default `access_percentage` is 1 and the second test case does not define it, so both cases run with the same value.
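
   To illustrate the point about defaults, here is a minimal Go sketch, not code from this PR. It assumes a default of 1 purely to mirror the Python fragment above (the Go flag in this PR's diff declares its own default), and `resolveAccessPercentage` and the flag set name are hypothetical. It shows that omitting a flag and passing it explicitly at its default value yield the same effective setting:

```go
package main

import (
	"flag"
	"fmt"
)

// resolveAccessPercentage parses args with a fresh FlagSet and returns the
// effective access_percentage. defaultValue is a parameter because the real
// default is an assumption in this sketch.
func resolveAccessPercentage(args []string, defaultValue int) (int, error) {
	fs := flag.NewFlagSet("sideinput-sketch", flag.ContinueOnError)
	pct := fs.Int("access_percentage", defaultValue,
		"percentage of elements in the side input to be accessed")
	if err := fs.Parse(args); err != nil {
		return 0, err
	}
	return *pct, nil
}

func main() {
	const assumedDefault = 1 // assumption: mirrors the value quoted in the comment

	// First test case: the option is spelled out.
	explicit, err := resolveAccessPercentage([]string{"--access_percentage=1"}, assumedDefault)
	if err != nil {
		panic(err)
	}

	// Second test case: the option is omitted, so the default applies.
	implicit, err := resolveAccessPercentage(nil, assumedDefault)
	if err != nil {
		panic(err)
	}

	fmt.Println(explicit == implicit)
}
```

   Under that assumption the two configurations differ only in their names, which is why only one of them was ported here.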




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] tszerszen commented on a change in pull request #13436: [BEAM-11075] Go SDK SideInput load tests

Posted by GitBox <gi...@apache.org>.
tszerszen commented on a change in pull request #13436:
URL: https://github.com/apache/beam/pull/13436#discussion_r538876419



##########
File path: sdks/go/test/load/sideinput/sideinput.go
##########
@@ -0,0 +1,100 @@
+// Licensed to the Apache Software Foundation (ASF) under one or more
+// contributor license agreements.  See the NOTICE file distributed with
+// this work for additional information regarding copyright ownership.
+// The ASF licenses this file to You under the Apache License, Version 2.0
+// (the "License"); you may not use this file except in compliance with
+// the License.  You may obtain a copy of the License at
+//
+//    http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing, software
+// distributed under the License is distributed on an "AS IS" BASIS,
+// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+// See the License for the specific language governing permissions and
+// limitations under the License.
+
+package main
+
+import (
+	"context"
+	"flag"
+	"reflect"
+
+	"github.com/apache/beam/sdks/go/pkg/beam"
+	"github.com/apache/beam/sdks/go/pkg/beam/io/synthetic"
+	"github.com/apache/beam/sdks/go/pkg/beam/log"
+	"github.com/apache/beam/sdks/go/pkg/beam/x/beamx"
+	"github.com/apache/beam/sdks/go/test/load"
+)
+
+func init() {
+	beam.RegisterDoFn(reflect.TypeOf((*doFn)(nil)))
+}
+
+var (
+	accessPercentage = flag.Int(
+		"access_percentage",
+		100,
+		"Specifies the percentage of elements in the side input to be accessed.")
+	syntheticSourceConfig = flag.String(
+		"input_options",
+		"",
+		"A JSON object that describes the configuration for synthetic source")
+)
+
+func parseSyntheticConfig() synthetic.SourceConfig {
+	if *syntheticSourceConfig == "" {
+		panic("--input_options not provided")
+	} else {
+		encoded := []byte(*syntheticSourceConfig)
+		return synthetic.DefaultSourceConfig().BuildFromJSON(encoded)
+	}
+}
+
+type doFn struct {
+	elementsToAccess int

Review comment:
       Thank you, I changed the name to start with a capital letter.
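
       A short sketch of why the capital letter can matter, assuming the DoFn's state is serialized with a JSON-style encoder that only sees exported fields. That assumption is mine and is not stated in this thread. The types `unexportedFn` and `exportedFn` and the helper `roundTrip` are hypothetical. With `encoding/json`, an unexported field is silently dropped on a round trip, while an exported one survives:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// unexportedFn mirrors the field name before the rename.
type unexportedFn struct {
	elementsToAccess int
}

// exportedFn mirrors the field name after the rename.
type exportedFn struct {
	ElementsToAccess int
}

// roundTrip marshals in to JSON and unmarshals the result into out.
func roundTrip(in, out interface{}) error {
	data, err := json.Marshal(in)
	if err != nil {
		return err
	}
	return json.Unmarshal(data, out)
}

func main() {
	var before unexportedFn
	if err := roundTrip(unexportedFn{elementsToAccess: 42}, &before); err != nil {
		panic(err)
	}

	var after exportedFn
	if err := roundTrip(exportedFn{ElementsToAccess: 42}, &after); err != nil {
		panic(err)
	}

	// encoding/json ignores unexported fields, so the first value is lost.
	fmt.Println(before.elementsToAccess, after.ElementsToAccess) // prints "0 42"
}
```

       If the field were lost this way, a worker would see zero elements to access regardless of the configured percentage.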







[GitHub] [beam] tszerszen commented on pull request #13436: [BEAM-11075] Go SDK SideInput load tests

Posted by GitBox <gi...@apache.org>.
tszerszen commented on pull request #13436:
URL: https://github.com/apache/beam/pull/13436#issuecomment-744522191


   R: @kamilwu ?

