You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2021/02/04 01:20:46 UTC

[GitHub] [beam] boyuanzz opened a new pull request #13893: [WIP] Using LoadingCache instead of Map to cache BundleProcessor

boyuanzz opened a new pull request #13893:
URL: https://github.com/apache/beam/pull/13893


   **Please** add a meaningful description for your change here
   
   ------------------------
   
   Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
   
    - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`).
    - [ ] Format the pull request title like `[BEAM-XXX] Fixes bug in ApproximateQuantiles`, where you replace `BEAM-XXX` with the appropriate JIRA issue, if applicable. This will automatically link the pull request to the issue.
    - [ ] Update `CHANGES.md` with noteworthy changes.
    - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   Post-Commit Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   
   Lang | SDK | Dataflow | Flink | Samza | Spark | Twister2
   --- | --- | --- | --- | --- | --- | ---
   Go | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Flink/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Go_VR_Spark/lastCompletedBuild/) | ---
   Java | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_VR_Dataflow_V2/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_VR_Dataflow_V2/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Dataflow_Java11/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/badge/icon)](https://ci-beam
 .apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Flink_Java11/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Batch/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Flink_Streaming/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Samza/lastCompletedBuild/) | [![Build Status](https://ci-beam.a
 pache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_PVR_Spark_Batch/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_SparkStructuredStreaming/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Twister2/lastCompletedBuild/)
   Python | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python36/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python37/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python38/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python38/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Py_VR_Dataflow/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Py_VR_Dataflow_V2/lastCompletedBuild/)<br>[![Build Status](https://ci-beam
 .apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Py_ValCont/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_Python_PVR_Flink_Cron/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python_VR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python_VR_Flink/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_Python_VR_Spark/lastCompletedBuild/) | ---
   XLang | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_XVR_Direct/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_XVR_Direct/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_XVR_Dataflow/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_XVR_Dataflow/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_XVR_Flink/lastCompletedBuild/) | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PostCommit_XVR_Spark/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PostCommit_XVR_Spark/lastCompletedBuild/) | ---
   
   Pre-Commit Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   
   --- |Java | Python | Go | Website | Whitespace | Typescript
   --- | --- | --- | --- | --- | --- | ---
   Non-portable | [![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_Java_Cron/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_Python_Cron/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_PythonLint_Cron/lastCompletedBuild/)<br>[![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_PythonDocker_Cron/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_PythonDocker_Cron/lastCompletedBuild/) <br>[![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_PythonDocs_Cron/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_PythonDocs_Cron/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/be
 am_PreCommit_Go_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_Go_Cron/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_Website_Cron/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_Whitespace_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_Whitespace_Cron/lastCompletedBuild/) | [![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_Typescript_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_Typescript_Cron/lastCompletedBuild/)
   Portable | --- | [![Build Status](https://ci-beam.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/badge/icon)](https://ci-beam.apache.org/job/beam_PreCommit_Portable_Python_Cron/lastCompletedBuild/) | --- | --- | --- | ---
   
   See [.test-infra/jenkins/README](https://github.com/apache/beam/blob/master/.test-infra/jenkins/README.md) for trigger phrase, status and link of all Jenkins jobs.
   
   
   GitHub Actions Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   [![Build python source distribution and wheels](https://github.com/apache/beam/workflows/Build%20python%20source%20distribution%20and%20wheels/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Build+python+source+distribution+and+wheels%22+branch%3Amaster+event%3Aschedule)
   [![Python tests](https://github.com/apache/beam/workflows/Python%20tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Python+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Java tests](https://github.com/apache/beam/workflows/Java%20Tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Java+Tests%22+branch%3Amaster+event%3Aschedule)
   
   See [CI.md](https://github.com/apache/beam/blob/master/CI.md) for more information about GitHub Actions CI.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] boyuanzz commented on pull request #13893: [BEAM-11752] Using LoadingCache instead of Map to cache BundleProcessor

Posted by GitBox <gi...@apache.org>.
boyuanzz commented on pull request #13893:
URL: https://github.com/apache/beam/pull/13893#issuecomment-773655688






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] boyuanzz commented on pull request #13893: [BEAM-11752] Using LoadingCache instead of Map to cache BundleProcessor

Posted by GitBox <gi...@apache.org>.
boyuanzz commented on pull request #13893:
URL: https://github.com/apache/beam/pull/13893#issuecomment-773693047


   Run Java_Examples_Dataflow PreCommit


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] boyuanzz commented on a change in pull request #13893: [BEAM-11752] Using LoadingCache instead of Map to cache BundleProcessor

Posted by GitBox <gi...@apache.org>.
boyuanzz commented on a change in pull request #13893:
URL: https://github.com/apache/beam/pull/13893#discussion_r572283447



##########
File path: sdks/java/harness/src/main/java/org/apache/beam/fn/harness/control/ProcessBundleHandler.java
##########
@@ -705,6 +718,17 @@ void reset() throws Exception {
         resetFunction.run();
       }
     }
+
+    void shutdown() {
+      for (ThrowingRunnable tearDownFunction : getTearDownFunctions()) {
+        LOG.debug("Tearing down function {}", tearDownFunction);
+        try {
+          tearDownFunction.run();
+        } catch (Exception e) {
+          LOG.error("Failed to call teardown function: {}", e);

Review comment:
       It seems like dofn lifecycle section is about still non-portable execution. We should add(or update it to) portable execution. I'll add some javadoc to DoFn first and we can update dofn lifecycle together.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] boyuanzz merged pull request #13893: [BEAM-11752] Using LoadingCache instead of Map to cache BundleProcessor

Posted by GitBox <gi...@apache.org>.
boyuanzz merged pull request #13893:
URL: https://github.com/apache/beam/pull/13893


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] boyuanzz commented on a change in pull request #13893: [BEAM-11752] Using LoadingCache instead of Map to cache BundleProcessor

Posted by GitBox <gi...@apache.org>.
boyuanzz commented on a change in pull request #13893:
URL: https://github.com/apache/beam/pull/13893#discussion_r571300396



##########
File path: sdks/java/harness/src/main/java/org/apache/beam/fn/harness/control/ProcessBundleHandler.java
##########
@@ -705,6 +718,17 @@ void reset() throws Exception {
         resetFunction.run();
       }
     }
+
+    void shutdown() {
+      for (ThrowingRunnable tearDownFunction : getTearDownFunctions()) {
+        LOG.debug("Tearing down function {}", tearDownFunction);
+        try {
+          tearDownFunction.run();
+        } catch (Exception e) {
+          LOG.error("Failed to call teardown function: {}", e);

Review comment:
       We can elaborate more on the error message, like saying `There is an exception when calling teardown, but this exception will not fail the execution.`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] boyuanzz edited a comment on pull request #13893: [BEAM-11752] Using LoadingCache instead of Map to cache BundleProcessor

Posted by GitBox <gi...@apache.org>.
boyuanzz edited a comment on pull request #13893:
URL: https://github.com/apache/beam/pull/13893#issuecomment-773655688


   r: @kennknowles 
   cc: @robertwb @lukecwik 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] boyuanzz commented on a change in pull request #13893: [BEAM-11752] Using LoadingCache instead of Map to cache BundleProcessor

Posted by GitBox <gi...@apache.org>.
boyuanzz commented on a change in pull request #13893:
URL: https://github.com/apache/beam/pull/13893#discussion_r570637825



##########
File path: sdks/java/harness/src/main/java/org/apache/beam/fn/harness/control/ProcessBundleHandler.java
##########
@@ -705,6 +718,17 @@ void reset() throws Exception {
         resetFunction.run();
       }
     }
+
+    void shutdown() {
+      for (ThrowingRunnable tearDownFunction : getTearDownFunctions()) {
+        LOG.debug("Tearing down function {}", tearDownFunction);
+        try {
+          tearDownFunction.run();
+        } catch (Exception e) {
+          LOG.error("Failed to call teardown function: {}", e);

Review comment:
       Note that within this change, exceptions from `shutdown` will not fail the bundle. Alternatively an error will be shown from sdk harness log.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] boyuanzz edited a comment on pull request #13893: [BEAM-11752] Using LoadingCache instead of Map to cache BundleProcessor

Posted by GitBox <gi...@apache.org>.
boyuanzz edited a comment on pull request #13893:
URL: https://github.com/apache/beam/pull/13893#issuecomment-773655688


   r: @kennknowles 
   cc: @robertwb @lukecwik 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles commented on a change in pull request #13893: [BEAM-11752] Using LoadingCache instead of Map to cache BundleProcessor

Posted by GitBox <gi...@apache.org>.
kennknowles commented on a change in pull request #13893:
URL: https://github.com/apache/beam/pull/13893#discussion_r572303946



##########
File path: sdks/java/harness/src/main/java/org/apache/beam/fn/harness/control/ProcessBundleHandler.java
##########
@@ -705,6 +718,17 @@ void reset() throws Exception {
         resetFunction.run();
       }
     }
+
+    void shutdown() {
+      for (ThrowingRunnable tearDownFunction : getTearDownFunctions()) {
+        LOG.debug("Tearing down function {}", tearDownFunction);
+        try {
+          tearDownFunction.run();
+        } catch (Exception e) {
+          LOG.error("Failed to call teardown function: {}", e);

Review comment:
       But that is actually a tagent. The important thing is that users can understand how their `DoFn` will be invoked without reasoning about the portability protocols. Like "during a failure, `@Teardown` will be called if it is possible, but not always". We need a spec about what happens when any of the methods fail. I think like "Since `@TearDown` executes independent of any data processing, if `@TearDown` fails, the pipeline is permitted to continue but resources will not be collected."




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] boyuanzz commented on a change in pull request #13893: [BEAM-11752] Using LoadingCache instead of Map to cache BundleProcessor

Posted by GitBox <gi...@apache.org>.
boyuanzz commented on a change in pull request #13893:
URL: https://github.com/apache/beam/pull/13893#discussion_r572298177



##########
File path: sdks/java/harness/src/main/java/org/apache/beam/fn/harness/control/ProcessBundleHandler.java
##########
@@ -705,6 +718,17 @@ void reset() throws Exception {
         resetFunction.run();
       }
     }
+
+    void shutdown() {
+      for (ThrowingRunnable tearDownFunction : getTearDownFunctions()) {
+        LOG.debug("Tearing down function {}", tearDownFunction);
+        try {
+          tearDownFunction.run();
+        } catch (Exception e) {
+          LOG.error("Failed to call teardown function: {}", e);

Review comment:
       > For example some runners may choose to execute some steps directly even if they support portability.
   I don't think that's the case. Would you like to share some examples in your mind?

##########
File path: sdks/java/harness/src/main/java/org/apache/beam/fn/harness/control/ProcessBundleHandler.java
##########
@@ -705,6 +718,17 @@ void reset() throws Exception {
         resetFunction.run();
       }
     }
+
+    void shutdown() {
+      for (ThrowingRunnable tearDownFunction : getTearDownFunctions()) {
+        LOG.debug("Tearing down function {}", tearDownFunction);
+        try {
+          tearDownFunction.run();
+        } catch (Exception e) {
+          LOG.error("Failed to call teardown function: {}", e);

Review comment:
       > For example some runners may choose to execute some steps directly even if they support portability.
   
   I don't think that's the case. Would you like to share some examples in your mind?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles commented on a change in pull request #13893: [BEAM-11752] Using LoadingCache instead of Map to cache BundleProcessor

Posted by GitBox <gi...@apache.org>.
kennknowles commented on a change in pull request #13893:
URL: https://github.com/apache/beam/pull/13893#discussion_r572304888



##########
File path: sdks/java/harness/src/main/java/org/apache/beam/fn/harness/control/ProcessBundleHandler.java
##########
@@ -705,6 +718,17 @@ void reset() throws Exception {
         resetFunction.run();
       }
     }
+
+    void shutdown() {
+      for (ThrowingRunnable tearDownFunction : getTearDownFunctions()) {
+        LOG.debug("Tearing down function {}", tearDownFunction);
+        try {
+          tearDownFunction.run();
+        } catch (Exception e) {
+          LOG.error("Failed to call teardown function: {}", e);

Review comment:
       The Fn API is really an implementation detail of this higher-level contract about how we call user functions.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles commented on a change in pull request #13893: [BEAM-11752] Using LoadingCache instead of Map to cache BundleProcessor

Posted by GitBox <gi...@apache.org>.
kennknowles commented on a change in pull request #13893:
URL: https://github.com/apache/beam/pull/13893#discussion_r572296662



##########
File path: sdks/java/harness/src/main/java/org/apache/beam/fn/harness/control/ProcessBundleHandler.java
##########
@@ -705,6 +718,17 @@ void reset() throws Exception {
         resetFunction.run();
       }
     }
+
+    void shutdown() {
+      for (ThrowingRunnable tearDownFunction : getTearDownFunctions()) {
+        LOG.debug("Tearing down function {}", tearDownFunction);
+        try {
+          tearDownFunction.run();
+        } catch (Exception e) {
+          LOG.error("Failed to call teardown function: {}", e);

Review comment:
       The document is for a user, who should not have to think about portable execution. The contract for which methods are called should be independent of portable execution. For example some runners may choose to execute some steps directly even if they support portability.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles commented on a change in pull request #13893: [BEAM-11752] Using LoadingCache instead of Map to cache BundleProcessor

Posted by GitBox <gi...@apache.org>.
kennknowles commented on a change in pull request #13893:
URL: https://github.com/apache/beam/pull/13893#discussion_r571262843



##########
File path: sdks/java/harness/src/main/java/org/apache/beam/fn/harness/control/ProcessBundleHandler.java
##########
@@ -705,6 +718,17 @@ void reset() throws Exception {
         resetFunction.run();
       }
     }
+
+    void shutdown() {
+      for (ThrowingRunnable tearDownFunction : getTearDownFunctions()) {
+        LOG.debug("Tearing down function {}", tearDownFunction);
+        try {
+          tearDownFunction.run();
+        } catch (Exception e) {
+          LOG.error("Failed to call teardown function: {}", e);

Review comment:
       Interesting. I did not find good documentation on the expectation about whether a bundle should fail if `TearDown` fails. The most detailed docs are the [javadoc](https://beam.apache.org/releases/javadoc/2.27.0/org/apache/beam/sdk/transforms/DoFn.Teardown.html) and they do not describe this. That seems like we need to describe the expectation.
   
   Note that I think this will call http://www.slf4j.org/apidocs/org/slf4j/Logger.html#error(java.lang.String,java.lang.Throwable) because it is the most precise overloaded method.
   
   I think that is good because it will give a stack trace.
   
   But the `{}` will not be replaced with anything.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] boyuanzz commented on a change in pull request #13893: [BEAM-11752] Using LoadingCache instead of Map to cache BundleProcessor

Posted by GitBox <gi...@apache.org>.
boyuanzz commented on a change in pull request #13893:
URL: https://github.com/apache/beam/pull/13893#discussion_r570637825



##########
File path: sdks/java/harness/src/main/java/org/apache/beam/fn/harness/control/ProcessBundleHandler.java
##########
@@ -705,6 +718,17 @@ void reset() throws Exception {
         resetFunction.run();
       }
     }
+
+    void shutdown() {
+      for (ThrowingRunnable tearDownFunction : getTearDownFunctions()) {
+        LOG.debug("Tearing down function {}", tearDownFunction);
+        try {
+          tearDownFunction.run();
+        } catch (Exception e) {
+          LOG.error("Failed to call teardown function: {}", e);

Review comment:
       Note that within this change, exceptions from `shutdown` will not fail the bundle. Alternatively an error will be shown from sdk harness log.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles commented on pull request #13893: [BEAM-11752] Using LoadingCache instead of Map to cache BundleProcessor

Posted by GitBox <gi...@apache.org>.
kennknowles commented on pull request #13893:
URL: https://github.com/apache/beam/pull/13893#issuecomment-774302971


   I guess GitHub thinks I **really** *really* **_really_** approve.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] boyuanzz commented on a change in pull request #13893: [BEAM-11752] Using LoadingCache instead of Map to cache BundleProcessor

Posted by GitBox <gi...@apache.org>.
boyuanzz commented on a change in pull request #13893:
URL: https://github.com/apache/beam/pull/13893#discussion_r572300944



##########
File path: sdks/java/harness/src/main/java/org/apache/beam/fn/harness/control/ProcessBundleHandler.java
##########
@@ -705,6 +718,17 @@ void reset() throws Exception {
         resetFunction.run();
       }
     }
+
+    void shutdown() {
+      for (ThrowingRunnable tearDownFunction : getTearDownFunctions()) {
+        LOG.debug("Tearing down function {}", tearDownFunction);
+        try {
+          tearDownFunction.run();
+        } catch (Exception e) {
+          LOG.error("Failed to call teardown function: {}", e);

Review comment:
       All these steps are defined inside DoFn, right? I would image that sdk harness should be responsible for managing them. Based on the contract, the runner should talk to sdk harness via fnapi. Otherwise, that seems wrong to me.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles commented on a change in pull request #13893: [BEAM-11752] Using LoadingCache instead of Map to cache BundleProcessor

Posted by GitBox <gi...@apache.org>.
kennknowles commented on a change in pull request #13893:
URL: https://github.com/apache/beam/pull/13893#discussion_r572299540



##########
File path: sdks/java/harness/src/main/java/org/apache/beam/fn/harness/control/ProcessBundleHandler.java
##########
@@ -705,6 +718,17 @@ void reset() throws Exception {
         resetFunction.run();
       }
     }
+
+    void shutdown() {
+      for (ThrowingRunnable tearDownFunction : getTearDownFunctions()) {
+        LOG.debug("Tearing down function {}", tearDownFunction);
+        try {
+          tearDownFunction.run();
+        } catch (Exception e) {
+          LOG.error("Failed to call teardown function: {}", e);

Review comment:
       I do not mean that it occurs today. I mean that it is permitted.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] boyuanzz commented on pull request #13893: [BEAM-11752] Using LoadingCache instead of Map to cache BundleProcessor

Posted by GitBox <gi...@apache.org>.
boyuanzz commented on pull request #13893:
URL: https://github.com/apache/beam/pull/13893#issuecomment-773655688


   r: @kennknowles 
   cc: @robertwb 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



[GitHub] [beam] kennknowles commented on a change in pull request #13893: [BEAM-11752] Using LoadingCache instead of Map to cache BundleProcessor

Posted by GitBox <gi...@apache.org>.
kennknowles commented on a change in pull request #13893:
URL: https://github.com/apache/beam/pull/13893#discussion_r571338016



##########
File path: sdks/java/harness/src/main/java/org/apache/beam/fn/harness/control/ProcessBundleHandler.java
##########
@@ -705,6 +718,17 @@ void reset() throws Exception {
         resetFunction.run();
       }
     }
+
+    void shutdown() {
+      for (ThrowingRunnable tearDownFunction : getTearDownFunctions()) {
+        LOG.debug("Tearing down function {}", tearDownFunction);
+        try {
+          tearDownFunction.run();
+        } catch (Exception e) {
+          LOG.error("Failed to call teardown function: {}", e);

Review comment:
       I was thinking somewhere like https://beam.apache.org/documentation/programming-guide/#dofn




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org