You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2020/04/28 17:46:19 UTC

[GitHub] [beam] mxm opened a new pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

mxm opened a new pull request #11558:
URL: https://github.com/apache/beam/pull/11558


   This contains two new benchmarks which both execute in streaming mode:
   
   (1) A benchmark which uses timers and state
   (2) A benchmark which additionally checkpoints and reports checkpointing times
   
   Post-Commit Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   
   Lang | SDK | Apex | Dataflow | Flink | Gearpump | Samza | Spark
   --- | --- | --- | --- | --- | --- | --- | ---
   Go | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/)
   Java | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Apex/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Gearpump/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/)
   Python | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python2/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/) | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python2_PVR_Flink_Cron/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PostCommit_Python35_VR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python35_VR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/)
   XLang | --- | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/) | --- | --- | [![Build Status](https://builds.apache.org/job/beam_PostCommit_XVR_Spark/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PostCommit_XVR_Spark/lastCompletedBuild/)
   
   Pre-Commit Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   
   --- |Java | Python | Go | Website
   --- | --- | --- | --- | ---
   Non-portable | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/)<br>[![Build Status](https://builds.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/) | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/) 
   Portable | --- | [![Build Status](https://builds.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/badge/icon)](https://builds.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/) | --- | ---
   
   See [.test-infra/jenkins/README](https://github.com/apache/beam/blob/master/.test-infra/jenkins/README.md) for trigger phrase, status and link of all Jenkins jobs.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-625156669


   Run PythonLint PreCommit


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on a change in pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on a change in pull request #11558:
URL: https://github.com/apache/beam/pull/11558#discussion_r420655324



##########
File path: runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/DoFnOperator.java
##########
@@ -455,6 +461,18 @@ public void open() throws Exception {
     if (!options.getDisableMetrics()) {
       flinkMetricContainer = new FlinkMetricContainer(getRuntimeContext());
       doFnRunner = new DoFnRunnerWithMetricsUpdate<>(stepName, doFnRunner, flinkMetricContainer);
+      String checkpointMetricNamespace =

Review comment:
       I thought about that. The checkpointing metrics are kept at the JobManager. They are not accessible by the operator. So in order not to change the logic of the load tests which relies on the metrics accumulated during the job, returned in the PipelineResult, I opted to go this route. 
   
   I think we may have to change the Beam service to allow querying job metrics during job runtime. At the moment, we can only retrieve them at the end of the job.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm removed a comment on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm removed a comment on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-624846518


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-625329671


   Run Python2_PVR_Flink PreCommit


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-623474052






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-621417532


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-622950785


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on a change in pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on a change in pull request #11558:
URL: https://github.com/apache/beam/pull/11558#discussion_r420932855



##########
File path: runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/DoFnOperator.java
##########
@@ -455,6 +461,18 @@ public void open() throws Exception {
     if (!options.getDisableMetrics()) {
       flinkMetricContainer = new FlinkMetricContainer(getRuntimeContext());
       doFnRunner = new DoFnRunnerWithMetricsUpdate<>(stepName, doFnRunner, flinkMetricContainer);
+      String checkpointMetricNamespace =

Review comment:
       Yes, I agree that we want this to be more generic. I didn't want to mention the Flink metric reporter but I think that would introduce another level of complexity. Not sure if that is the right tool for the job. I'd rather query metrics through the job server and the rest api.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-622976868


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-624264625


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-620757652


   Run Seed Job


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-621135496


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-621864314


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-625278998


   > lint error:
   > 
   > ```
   > 04:58:35 apache_beam/testing/load_tests/pardo_test.py:177: error: Argument 2 to "CombiningValueStateSpec" has incompatible type overloaded function; expected "Optional[Coder]"  [arg-type]
   > 04:58:36 Found 1 error in 1 file (checked 682 source files)
   > 04:58:37 error: mypy exited with status 1
   > ```
   
   Thanks for noting, I almost overlooked the error in the logs and thought this was flaky.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on a change in pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on a change in pull request #11558:
URL: https://github.com/apache/beam/pull/11558#discussion_r420653268



##########
File path: runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkDebugPipelineOptions.java
##########
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.flink;
+
+import org.apache.beam.sdk.options.Default;
+import org.apache.beam.sdk.options.Description;
+import org.apache.beam.sdk.options.PipelineOptions;
+
+/** Debug options which shouldn't normally be used. */
+public interface FlinkDebugPipelineOptions extends PipelineOptions {
+
+  @Description(
+      "If not null, reports the checkpoint duration of each ParDo stage in the provided metric namespace.")
+  String getReportCheckpointDuration();
+
+  void setReportCheckpointDuration(String metricNamespace);
+
+  @Description(
+      "Shuts down sources which have been idle for the configured time of milliseconds. Once a source has been "
+          + "shut down, chekpointing is not possible anymore. Shutting down the sources eventually leads to pipeline "
+          + "shutdown once all input has been processed.")
+  @Default.Long(0)
+  Long getShutdownSourcesAfterIdleMs();

Review comment:
       That's a good suggestion. I think we can consolidate the two. The drawback is that we would break the existing parameter but if we also change the default that wouldn't be a problem.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-623577185


   Run SeedJob


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm removed a comment on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm removed a comment on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-623493427


   Run SeedJob


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] tweise commented on a change in pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
tweise commented on a change in pull request #11558:
URL: https://github.com/apache/beam/pull/11558#discussion_r421498202



##########
File path: runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/io/UnboundedSourceWrapper.java
##########
@@ -91,11 +91,8 @@
    */
   private final List<? extends UnboundedSource<OutputT, CheckpointMarkT>> splitSources;
 
-  /**
-   * Shuts down the source if the final watermark is read. Note: This prevents further checkpoints
-   * of the streaming application.
-   */
-  private final boolean shutdownOnFinalWatermark;
+  /** The idle time before we the source shuts down. */

Review comment:
       ```suggestion
     /** The idle time before the source shuts down. */
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-624763021


   Run Seed Job


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-621277972


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-622058386


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-624567037


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-620757409


   Run SeedJob


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-623577363


   Run Seed Job


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-623397187


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-621942631


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-624726246


   Run Seed Job


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] tweise commented on a change in pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
tweise commented on a change in pull request #11558:
URL: https://github.com/apache/beam/pull/11558#discussion_r420914083



##########
File path: runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkDebugPipelineOptions.java
##########
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.flink;
+
+import org.apache.beam.sdk.options.Default;
+import org.apache.beam.sdk.options.Description;
+import org.apache.beam.sdk.options.PipelineOptions;
+
+/** Debug options which shouldn't normally be used. */
+public interface FlinkDebugPipelineOptions extends PipelineOptions {

Review comment:
       Agreed. I would prefer we keep all the Flink options in a single interface though. The lone option here strictly isn't a "debug option" anyways.

##########
File path: .test-infra/jenkins/job_LoadTests_ParDo_Flink_Python.groovy
##########
@@ -142,9 +249,22 @@ PhraseTriggeringPostCommitBuilder.postCommitJob(
   'Load Tests Python ParDo Flink Batch suite',
   this
 ) {
-  loadTest(delegate, CommonTestProperties.TriggeringContext.PR)
+  loadBatchTests(delegate, CommonTestProperties.TriggeringContext.PR)
+}
+
+PhraseTriggeringPostCommitBuilder.postCommitJob(
+    'beam_LoadTests_Python_ParDo_Flink_Streaming',

Review comment:
       Makes sense; misunderstanding on my part.

##########
File path: runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkPipelineOptions.java
##########
@@ -126,6 +126,16 @@
 
   void setFailOnCheckpointingErrors(Boolean failOnCheckpointingErrors);
 
+  @Description(

Review comment:
       Add link to the Flink checkpointing issue here? 

##########
File path: runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/DoFnOperator.java
##########
@@ -455,6 +461,18 @@ public void open() throws Exception {
     if (!options.getDisableMetrics()) {
       flinkMetricContainer = new FlinkMetricContainer(getRuntimeContext());
       doFnRunner = new DoFnRunnerWithMetricsUpdate<>(stepName, doFnRunner, flinkMetricContainer);
+      String checkpointMetricNamespace =

Review comment:
       It would be good to expose/utilize the Flink metrics as part of the load test suite. This could, for example, be accomplished with a custom Flink metric reporter. It can be addressed as follow-up though.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on a change in pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on a change in pull request #11558:
URL: https://github.com/apache/beam/pull/11558#discussion_r420653425



##########
File path: runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkDebugPipelineOptions.java
##########
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.flink;
+
+import org.apache.beam.sdk.options.Default;
+import org.apache.beam.sdk.options.Description;
+import org.apache.beam.sdk.options.PipelineOptions;
+
+/** Debug options which shouldn't normally be used. */
+public interface FlinkDebugPipelineOptions extends PipelineOptions {

Review comment:
       That's how pipeline options work in Beam. Take a look at what else FlinkPipelineOptions extends. 
   
   At first, I used a separate interface but having multiple interfaces and casting between them (via `.as(...)`) is not nice either and we need to be careful not to pass the wrong interface.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-622025400


   Run Seed Job


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-625090743


   We can actually see here that there is a fair amount of skew in the checkpoint duration (ms):
   ![image](https://user-images.githubusercontent.com/837221/81268035-b01d2800-9047-11ea-9108-51b368db4796.png)
   
   With the latest run I've also seen a big skew: 
   ![image](https://user-images.githubusercontent.com/837221/81268161-e0fd5d00-9047-11ea-81de-17ad272c0c9f.png)
   
   I'm curious to see whether things improve after merging #11590.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-622093567


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm removed a comment on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm removed a comment on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-624264392


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-625208279


   @tweise I've addressed your comments and have squashed the commits.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-620846828


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-623467868


   Run SeedJob


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-622006413


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-621260368


   Run Seed Job


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-624229343


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-622401634


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-624227730


   Run Seed Job


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-621890326


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-624264392


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-624557578


   Run Seed Job


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm edited a comment on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm edited a comment on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-622125124


   The benchmark and checkpoint data is now published to BigQuery:
   
   ```
   [
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.525003 UTC",
       "metric": "checkpointing_19measure_time:_start.none/beam:env:docker:v1:0_max_checkpoint_duration",
       "value": "97.0"
     },
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.525035 UTC",
       "metric": "checkpointing_41loadgenerator/reshuffle2/removerandomkeys.none/beam:env:docker:v1:0_max_checkpoint_duration",
       "value": "99.0"
     },
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.525003 UTC",
       "metric": "checkpointing_19measure_time:_start.none/beam:env:docker:v1:0_count_checkpoint_duration",
       "value": "120.0"
     },
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.525035 UTC",
       "metric": "checkpointing_41loadgenerator/reshuffle2/removerandomkeys.none/beam:env:docker:v1:0_min_checkpoint_duration",
       "value": "10.0"
     },
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.525020 UTC",
       "metric": "checkpointing_40loadgenerator/reshuffle/removerandomkeys.none/beam:env:docker:v1:0_max_checkpoint_duration",
       "value": "116.0"
     },
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.524947 UTC",
       "metric": "checkpointing_21loadgenerator/impulse.none/beam:env:docker:v1:0_min_checkpoint_duration",
       "value": "14.0"
     },
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.524947 UTC",
       "metric": "checkpointing_21loadgenerator/impulse.none/beam:env:docker:v1:0_sum_checkpoint_duration",
       "value": "4402.0"
     },
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.525035 UTC",
       "metric": "checkpointing_41loadgenerator/reshuffle2/removerandomkeys.none/beam:env:docker:v1:0_sum_checkpoint_duration",
       "value": "2214.0"
     },
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.525035 UTC",
       "metric": "checkpointing_41loadgenerator/reshuffle2/removerandomkeys.none/beam:env:docker:v1:0_count_checkpoint_duration",
       "value": "120.0"
     },
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.524947 UTC",
       "metric": "checkpointing_21loadgenerator/impulse.none/beam:env:docker:v1:0_count_checkpoint_duration",
       "value": "120.0"
     },
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.525020 UTC",
       "metric": "checkpointing_40loadgenerator/reshuffle/removerandomkeys.none/beam:env:docker:v1:0_sum_checkpoint_duration",
       "value": "2392.0"
     },
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.525020 UTC",
       "metric": "checkpointing_40loadgenerator/reshuffle/removerandomkeys.none/beam:env:docker:v1:0_count_checkpoint_duration",
       "value": "120.0"
     },
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.524947 UTC",
       "metric": "checkpointing_21loadgenerator/impulse.none/beam:env:docker:v1:0_max_checkpoint_duration",
       "value": "1015.0"
     },
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.525003 UTC",
       "metric": "checkpointing_19measure_time:_start.none/beam:env:docker:v1:0_sum_checkpoint_duration",
       "value": "2098.0"
     },
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.525003 UTC",
       "metric": "checkpointing_19measure_time:_start.none/beam:env:docker:v1:0_min_checkpoint_duration",
       "value": "9.0"
     },
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.525020 UTC",
       "metric": "checkpointing_40loadgenerator/reshuffle/removerandomkeys.none/beam:env:docker:v1:0_min_checkpoint_duration",
       "value": "12.0"
     }
   ]
   ```
   
   We can visualize the data using Beam's dashboard: https://apache-beam-testing.appspot.com/


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on a change in pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on a change in pull request #11558:
URL: https://github.com/apache/beam/pull/11558#discussion_r420897845



##########
File path: runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkDebugPipelineOptions.java
##########
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.flink;
+
+import org.apache.beam.sdk.options.Default;
+import org.apache.beam.sdk.options.Description;
+import org.apache.beam.sdk.options.PipelineOptions;
+
+/** Debug options which shouldn't normally be used. */
+public interface FlinkDebugPipelineOptions extends PipelineOptions {
+
+  @Description(
+      "If not null, reports the checkpoint duration of each ParDo stage in the provided metric namespace.")
+  String getReportCheckpointDuration();
+
+  void setReportCheckpointDuration(String metricNamespace);
+
+  @Description(
+      "Shuts down sources which have been idle for the configured time of milliseconds. Once a source has been "
+          + "shut down, chekpointing is not possible anymore. Shutting down the sources eventually leads to pipeline "
+          + "shutdown once all input has been processed.")
+  @Default.Long(0)
+  Long getShutdownSourcesAfterIdleMs();

Review comment:
       PTAL @tweise I've unified the flags and made it auto-set the flag when checkpointing is enabled.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on a change in pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on a change in pull request #11558:
URL: https://github.com/apache/beam/pull/11558#discussion_r420934296



##########
File path: runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkDebugPipelineOptions.java
##########
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.flink;
+
+import org.apache.beam.sdk.options.Default;
+import org.apache.beam.sdk.options.Description;
+import org.apache.beam.sdk.options.PipelineOptions;
+
+/** Debug options which shouldn't normally be used. */
+public interface FlinkDebugPipelineOptions extends PipelineOptions {

Review comment:
       Ok, I suppose we can move the option to FlinkPipelineOptions.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm removed a comment on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm removed a comment on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-624229343


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] tweise commented on a change in pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
tweise commented on a change in pull request #11558:
URL: https://github.com/apache/beam/pull/11558#discussion_r420389628



##########
File path: .test-infra/jenkins/job_LoadTests_ParDo_Flink_Python.groovy
##########
@@ -142,9 +249,22 @@ PhraseTriggeringPostCommitBuilder.postCommitJob(
   'Load Tests Python ParDo Flink Batch suite',
   this
 ) {
-  loadTest(delegate, CommonTestProperties.TriggeringContext.PR)
+  loadBatchTests(delegate, CommonTestProperties.TriggeringContext.PR)
+}
+
+PhraseTriggeringPostCommitBuilder.postCommitJob(
+    'beam_LoadTests_Python_ParDo_Flink_Streaming',

Review comment:
       Why multiple trigger phrases?

##########
File path: runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkDebugPipelineOptions.java
##########
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.flink;
+
+import org.apache.beam.sdk.options.Default;
+import org.apache.beam.sdk.options.Description;
+import org.apache.beam.sdk.options.PipelineOptions;
+
+/** Debug options which shouldn't normally be used. */
+public interface FlinkDebugPipelineOptions extends PipelineOptions {
+
+  @Description(
+      "If not null, reports the checkpoint duration of each ParDo stage in the provided metric namespace.")
+  String getReportCheckpointDuration();
+
+  void setReportCheckpointDuration(String metricNamespace);
+
+  @Description(
+      "Shuts down sources which have been idle for the configured time of milliseconds. Once a source has been "
+          + "shut down, chekpointing is not possible anymore. Shutting down the sources eventually leads to pipeline "
+          + "shutdown once all input has been processed.")
+  @Default.Long(0)
+  Long getShutdownSourcesAfterIdleMs();

Review comment:
       Unless I misread, this parameter is directly tied to `!shutdownSourcesOnFinalWatermark`? How about consolidating the two? Just a single parameter shutdownSourcesAfterIdleMs should suffice:
   
   0 - immediate shutdown, which should be default, unless checkpointing is enabled
   value > 0 - wait, potentially forever
   
   There was a question on the ML recently about shutdownSourcesOnFinalWatermark and if that should not be default. I think it should be (unless checkpointing was enabled), in which case we can never shutdown. So there should be almost no situation where this parameter needs to be set, except in a special case like this. 
    

##########
File path: runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkDebugPipelineOptions.java
##########
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.flink;
+
+import org.apache.beam.sdk.options.Default;
+import org.apache.beam.sdk.options.Description;
+import org.apache.beam.sdk.options.PipelineOptions;
+
+/** Debug options which shouldn't normally be used. */
+public interface FlinkDebugPipelineOptions extends PipelineOptions {
+
+  @Description(
+      "If not null, reports the checkpoint duration of each ParDo stage in the provided metric namespace.")
+  String getReportCheckpointDuration();
+
+  void setReportCheckpointDuration(String metricNamespace);
+
+  @Description(
+      "Shuts down sources which have been idle for the configured time of milliseconds. Once a source has been "
+          + "shut down, chekpointing is not possible anymore. Shutting down the sources eventually leads to pipeline "

Review comment:
       ```suggestion
             + "shut down, checkpointing is not possible anymore. Shutting down the sources eventually leads to pipeline "
   ```

##########
File path: runners/flink/src/main/java/org/apache/beam/runners/flink/FlinkDebugPipelineOptions.java
##########
@@ -0,0 +1,41 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.beam.runners.flink;
+
+import org.apache.beam.sdk.options.Default;
+import org.apache.beam.sdk.options.Description;
+import org.apache.beam.sdk.options.PipelineOptions;
+
+/** Debug options which shouldn't normally be used. */
+public interface FlinkDebugPipelineOptions extends PipelineOptions {

Review comment:
       Not sure I like the idea of having these as a separate interface that `FlinkPipelineOptions` extends (!). 

##########
File path: runners/flink/src/main/java/org/apache/beam/runners/flink/translation/wrappers/streaming/DoFnOperator.java
##########
@@ -455,6 +461,18 @@ public void open() throws Exception {
     if (!options.getDisableMetrics()) {
       flinkMetricContainer = new FlinkMetricContainer(getRuntimeContext());
       doFnRunner = new DoFnRunnerWithMetricsUpdate<>(stepName, doFnRunner, flinkMetricContainer);
+      String checkpointMetricNamespace =

Review comment:
       Wouldn't it be sufficient to rely on Flink 's task-level checkpoint metrics for this? 

##########
File path: sdks/python/apache_beam/testing/load_tests/pardo_test.py
##########
@@ -125,6 +155,70 @@ def process(self, element):
         'Measure time: End' >> beam.ParDo(MeasureTime(self.metrics_namespace)))
 
 
+class StatefulLoadGenerator(beam.PTransform):

Review comment:
       Nice to see that it is finally possible to achieve this with timers! Should `GenerateLoad` be reusable?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-622480730


   Run Seed Job


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-620839972


   Run Website_Stage_GCS PreCommit


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-621187482


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-620837611


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-621201705


   Run Seed Job


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-625144322


   The test results after merging look similar than before. 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm removed a comment on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm removed a comment on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-620757409


   Run SeedJob


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-621244886


   Run Seed Job


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-621919891


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-622481501


   Run Seed Job


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-624853271


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-621732767


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-624184021


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-620847201


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-621997989


   Run Seed Job


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-623467734


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm removed a comment on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm removed a comment on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-623577185






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm edited a comment on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm edited a comment on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-625278998


   > lint error:
   > 
   > ```
   > 04:58:35 apache_beam/testing/load_tests/pardo_test.py:177: error: Argument 2 to "CombiningValueStateSpec" has incompatible type overloaded function; expected "Optional[Coder]"  [arg-type]
   > 04:58:36 Found 1 error in 1 file (checked 682 source files)
   > 04:58:37 error: mypy exited with status 1
   > ```
   
   Thanks for noting, I almost overlooked the error in the logs and thought it was flaky.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-623493427


   Run SeedJob


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-621296826


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on a change in pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on a change in pull request #11558:
URL: https://github.com/apache/beam/pull/11558#discussion_r420657092



##########
File path: sdks/python/apache_beam/testing/load_tests/pardo_test.py
##########
@@ -125,6 +155,70 @@ def process(self, element):
         'Measure time: End' >> beam.ParDo(MeasureTime(self.metrics_namespace)))
 
 
+class StatefulLoadGenerator(beam.PTransform):

Review comment:
       Potentially, yes. The implementation should be changed to use SDF eventually but that is not properly supported yet for Python.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-621225757


   Run Seed Job


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-621901389


   Run Seed Job


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-625087923


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm removed a comment on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm removed a comment on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-620839972


   Run Website_Stage_GCS PreCommit


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-624846518


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-622489303


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-622401502


   I've created a new dashboard here includes data from the batch tests and the two new streaming tests: https://apache-beam-testing.appspot.com/explore?dashboard=5751884853805056
   
   ![Untitled](https://user-images.githubusercontent.com/837221/80810980-46bf9400-8bc5-11ea-99e4-bb042d23a74d.png)
   
   The streaming tests have data for the runtime and for the checkpoint duration (min/max/average).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm removed a comment on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm removed a comment on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-620847201


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-622081012


   Run Seed Job


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-622442663


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm removed a comment on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm removed a comment on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-622976868


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-624054294


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-622026762


   Run Python Load Tests ParDo Flink Streaming


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on a change in pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on a change in pull request #11558:
URL: https://github.com/apache/beam/pull/11558#discussion_r420653166



##########
File path: .test-infra/jenkins/job_LoadTests_ParDo_Flink_Python.groovy
##########
@@ -142,9 +249,22 @@ PhraseTriggeringPostCommitBuilder.postCommitJob(
   'Load Tests Python ParDo Flink Batch suite',
   this
 ) {
-  loadTest(delegate, CommonTestProperties.TriggeringContext.PR)
+  loadBatchTests(delegate, CommonTestProperties.TriggeringContext.PR)
+}
+
+PhraseTriggeringPostCommitBuilder.postCommitJob(
+    'beam_LoadTests_Python_ParDo_Flink_Streaming',

Review comment:
       There is one for batch and one for streaming. I think that makes sense, a lot of times we want to see the impact on either of those. 
   
   If you are referring to the args, this is the signature of the `postCommitjob`:
   
   ```
     static void postCommitJob(nameBase,
                               triggerPhrase,
                               githubUiHint,
                               scope,
                               jobDefinition = {}) {
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-622125124


   The benchmark and checkpoint data is now published to BigQuery:
   
   ```
   [
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.525003 UTC",
       "metric": "checkpointing_19measure_time:_start.none/beam:env:docker:v1:0_max_checkpoint_duration",
       "value": "97.0"
     },
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.525035 UTC",
       "metric": "checkpointing_41loadgenerator/reshuffle2/removerandomkeys.none/beam:env:docker:v1:0_max_checkpoint_duration",
       "value": "99.0"
     },
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.525003 UTC",
       "metric": "checkpointing_19measure_time:_start.none/beam:env:docker:v1:0_count_checkpoint_duration",
       "value": "120.0"
     },
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.525035 UTC",
       "metric": "checkpointing_41loadgenerator/reshuffle2/removerandomkeys.none/beam:env:docker:v1:0_min_checkpoint_duration",
       "value": "10.0"
     },
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.525020 UTC",
       "metric": "checkpointing_40loadgenerator/reshuffle/removerandomkeys.none/beam:env:docker:v1:0_max_checkpoint_duration",
       "value": "116.0"
     },
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.524947 UTC",
       "metric": "checkpointing_21loadgenerator/impulse.none/beam:env:docker:v1:0_min_checkpoint_duration",
       "value": "14.0"
     },
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.524947 UTC",
       "metric": "checkpointing_21loadgenerator/impulse.none/beam:env:docker:v1:0_sum_checkpoint_duration",
       "value": "4402.0"
     },
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.525035 UTC",
       "metric": "checkpointing_41loadgenerator/reshuffle2/removerandomkeys.none/beam:env:docker:v1:0_sum_checkpoint_duration",
       "value": "2214.0"
     },
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.525035 UTC",
       "metric": "checkpointing_41loadgenerator/reshuffle2/removerandomkeys.none/beam:env:docker:v1:0_count_checkpoint_duration",
       "value": "120.0"
     },
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.524947 UTC",
       "metric": "checkpointing_21loadgenerator/impulse.none/beam:env:docker:v1:0_count_checkpoint_duration",
       "value": "120.0"
     },
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.525020 UTC",
       "metric": "checkpointing_40loadgenerator/reshuffle/removerandomkeys.none/beam:env:docker:v1:0_sum_checkpoint_duration",
       "value": "2392.0"
     },
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.525020 UTC",
       "metric": "checkpointing_40loadgenerator/reshuffle/removerandomkeys.none/beam:env:docker:v1:0_count_checkpoint_duration",
       "value": "120.0"
     },
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.524947 UTC",
       "metric": "checkpointing_21loadgenerator/impulse.none/beam:env:docker:v1:0_max_checkpoint_duration",
       "value": "1015.0"
     },
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.525003 UTC",
       "metric": "checkpointing_19measure_time:_start.none/beam:env:docker:v1:0_sum_checkpoint_duration",
       "value": "2098.0"
     },
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.525003 UTC",
       "metric": "checkpointing_19measure_time:_start.none/beam:env:docker:v1:0_min_checkpoint_duration",
       "value": "9.0"
     },
     {
       "test_id": "7c8aa91a0fb449218f3c1d5fe836300b",
       "timestamp": "2020-04-30 20:55:21.525020 UTC",
       "metric": "checkpointing_40loadgenerator/reshuffle/removerandomkeys.none/beam:env:docker:v1:0_min_checkpoint_duration",
       "value": "12.0"
     }
   ]```
   
   We can visualize the data using Beam's dashboard: https://apache-beam-testing.appspot.com/


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-623494085


   Run SeedJob


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] mxm commented on pull request #11558: [BEAM-8742] Add stateful and timely processing benchmarks

Posted by GitBox <gi...@apache.org>.
mxm commented on pull request #11558:
URL: https://github.com/apache/beam/pull/11558#issuecomment-624180353


   Run Seed Job


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org