You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/07/29 11:02:29 UTC

[GitHub] [beam] mosche opened a new pull request, #22525: Fixes #21378: Fix deleteTimer in InMemoryTimerInternals and enable VR tests for GroupIntoBatches.

mosche opened a new pull request, #22525:
URL: https://github.com/apache/beam/pull/22525

   Fix deleteTimer in InMemoryTimerInternals and enable VR tests for GroupIntoBatches.
   
   Make sure GroupIntoBatches / onWindowExpiration isn't used in Spark runner to prevent data loss, see #22524.
   
   
   
   ------------------------
   
   Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
   
    - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`).
    - [ ] Mention the appropriate issue in your description (for example: `addresses #123`), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment `fixes #<ISSUE NUMBER>` instead.
    - [ ] Update `CHANGES.md` with noteworthy changes.
    - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   To check the build health, please visit [https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md](https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md)
   
   GitHub Actions Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   [![Build python source distribution and wheels](https://github.com/apache/beam/workflows/Build%20python%20source%20distribution%20and%20wheels/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Build+python+source+distribution+and+wheels%22+branch%3Amaster+event%3Aschedule)
   [![Python tests](https://github.com/apache/beam/workflows/Python%20tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Python+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Java tests](https://github.com/apache/beam/workflows/Java%20Tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Java+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Go tests](https://github.com/apache/beam/workflows/Go%20tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Go+tests%22+branch%3Amaster+event%3Aschedule)
   
   See [CI.md](https://github.com/apache/beam/blob/master/CI.md) for more information about GitHub Actions CI.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] mosche commented on a diff in pull request #22525: Fix deleteTimer in InMemoryTimerInternals and enable VR tests for GroupIntoBatches.

Posted by GitBox <gi...@apache.org>.
mosche commented on code in PR #22525:
URL: https://github.com/apache/beam/pull/22525#discussion_r936296176


##########
runners/core-java/src/main/java/org/apache/beam/runners/core/InMemoryTimerInternals.java:
##########
@@ -155,10 +156,18 @@ public void setTimer(TimerData timerData) {
   @Override
   public void deleteTimer(
       StateNamespace namespace, String timerId, String timerFamilyId, TimeDomain timeDomain) {
-    throw new UnsupportedOperationException("Canceling a timer by ID is not yet supported.");
+    TimerData removedTimer = existingTimers.remove(namespace, timerId + '+' + timerFamilyId);
+    if (removedTimer != null) {
+      Preconditions.checkState(
+          removedTimer.getDomain().equals(timeDomain),
+          "%s doesn't match time domain %s of timer",
+          timeDomain,
+          removedTimer.getDomain());

Review Comment:
   I did that initially, but removed the additional lookup again again as a timer is uniquely identified by `namespace` + `timerId`. I added this as sanity check here to detect any broken state, but as far as I understand this should never be the case. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] mosche commented on a diff in pull request #22525: Fix deleteTimer in InMemoryTimerInternals and enable VR tests for GroupIntoBatches.

Posted by GitBox <gi...@apache.org>.
mosche commented on code in PR #22525:
URL: https://github.com/apache/beam/pull/22525#discussion_r936292439


##########
runners/core-java/src/main/java/org/apache/beam/runners/core/InMemoryTimerInternals.java:
##########
@@ -155,10 +156,18 @@ public void setTimer(TimerData timerData) {
   @Override
   public void deleteTimer(
       StateNamespace namespace, String timerId, String timerFamilyId, TimeDomain timeDomain) {
-    throw new UnsupportedOperationException("Canceling a timer by ID is not yet supported.");
+    TimerData removedTimer = existingTimers.remove(namespace, timerId + '+' + timerFamilyId);
+    if (removedTimer != null) {
+      Preconditions.checkState(
+          removedTimer.getDomain().equals(timeDomain),
+          "%s doesn't match time domain %s of timer",
+          timeDomain,
+          removedTimer.getDomain());
+      timersForDomain(timeDomain).remove(removedTimer);
+    }
   }
 
-  /** @deprecated use {@link #deleteTimer(StateNamespace, String, TimeDomain)}. */
+  /** @deprecated use {@link #deleteTimer(StateNamespace, String, String, TimeDomain)}. */

Review Comment:
   I considered doing that, though it would mean an additional lookup the get the `TimeDomain` first. That didn't seem worth it ... But no real preference from my side.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] TheNeuralBit commented on a diff in pull request #22525: Fix deleteTimer in InMemoryTimerInternals and enable VR tests for GroupIntoBatches.

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on code in PR #22525:
URL: https://github.com/apache/beam/pull/22525#discussion_r936020620


##########
runners/core-java/src/main/java/org/apache/beam/runners/core/InMemoryTimerInternals.java:
##########
@@ -155,10 +156,18 @@ public void setTimer(TimerData timerData) {
   @Override
   public void deleteTimer(
       StateNamespace namespace, String timerId, String timerFamilyId, TimeDomain timeDomain) {
-    throw new UnsupportedOperationException("Canceling a timer by ID is not yet supported.");
+    TimerData removedTimer = existingTimers.remove(namespace, timerId + '+' + timerFamilyId);
+    if (removedTimer != null) {
+      Preconditions.checkState(
+          removedTimer.getDomain().equals(timeDomain),
+          "%s doesn't match time domain %s of timer",
+          timeDomain,
+          removedTimer.getDomain());

Review Comment:
   I don't really have context here, but I wonder if we should do this check first, and unconditionally, so that we don't end up with a partially removed timer?



##########
runners/core-java/src/main/java/org/apache/beam/runners/core/InMemoryTimerInternals.java:
##########
@@ -155,10 +156,18 @@ public void setTimer(TimerData timerData) {
   @Override
   public void deleteTimer(
       StateNamespace namespace, String timerId, String timerFamilyId, TimeDomain timeDomain) {
-    throw new UnsupportedOperationException("Canceling a timer by ID is not yet supported.");
+    TimerData removedTimer = existingTimers.remove(namespace, timerId + '+' + timerFamilyId);
+    if (removedTimer != null) {
+      Preconditions.checkState(
+          removedTimer.getDomain().equals(timeDomain),
+          "%s doesn't match time domain %s of timer",
+          timeDomain,
+          removedTimer.getDomain());
+      timersForDomain(timeDomain).remove(removedTimer);
+    }
   }
 
-  /** @deprecated use {@link #deleteTimer(StateNamespace, String, TimeDomain)}. */
+  /** @deprecated use {@link #deleteTimer(StateNamespace, String, String, TimeDomain)}. */

Review Comment:
   Maybe this should defer to the `deleteTimer(StateNamespace, String, String, TimeDomain)` implementation?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] TheNeuralBit commented on pull request #22525: Fix deleteTimer in InMemoryTimerInternals and enable VR tests for GroupIntoBatches.

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on PR #22525:
URL: https://github.com/apache/beam/pull/22525#issuecomment-1204542950

   Thanks for the responses!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] github-actions[bot] commented on pull request #22525: Fixes #21378: Fix deleteTimer in InMemoryTimerInternals and enable VR tests for GroupIntoBatches.

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #22525:
URL: https://github.com/apache/beam/pull/22525#issuecomment-1199174950

   Assigning reviewers. If you would like to opt out of this review, comment `assign to next reviewer`:
   
   R: @kileys for label java.
   
   Available commands:
   - `stop reviewer notifications` - opt out of the automated review tooling
   - `remind me after tests pass` - tag the comment author after tests pass
   - `waiting on author` - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)
   
   The PR bot will only process comments in the main thread (not review comments).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] github-actions[bot] commented on pull request #22525: Fixes #21378: Fix deleteTimer in InMemoryTimerInternals and enable VR tests for GroupIntoBatches.

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #22525:
URL: https://github.com/apache/beam/pull/22525#issuecomment-1202097678

   Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] mosche commented on pull request #22525: Fixes #21378: Fix deleteTimer in InMemoryTimerInternals and enable VR tests for GroupIntoBatches.

Posted by GitBox <gi...@apache.org>.
mosche commented on PR #22525:
URL: https://github.com/apache/beam/pull/22525#issuecomment-1199222822

   Run Dataflow ValidatesRunner


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] mosche commented on pull request #22525: Fixes #21378: Fix deleteTimer in InMemoryTimerInternals and enable VR tests for GroupIntoBatches.

Posted by GitBox <gi...@apache.org>.
mosche commented on PR #22525:
URL: https://github.com/apache/beam/pull/22525#issuecomment-1202096587

   R: @TheNeuralBit 
   R: @reuvenlax 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] mosche commented on pull request #22525: Fix deleteTimer in InMemoryTimerInternals and enable VR tests for GroupIntoBatches.

Posted by GitBox <gi...@apache.org>.
mosche commented on PR #22525:
URL: https://github.com/apache/beam/pull/22525#issuecomment-1203546034

   > Is it expected that they're not running for [Spark](https://ci-beam.apache.org/job/beam_PostCommit_Java_ValidatesRunner_Spark_PR/479/testReport/org.apache.beam.sdk.transforms/)?
   
   Yes, I had to tag the tests with `UsesOnWindowExpiration` to make sure they are not run for Spark. I'm also making sure the Spark runner fails at translation time if `onWindowExpiration` is used. Otherwise it would silently lead to wrong results...


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] TheNeuralBit merged pull request #22525: Fix deleteTimer in InMemoryTimerInternals and enable VR tests for GroupIntoBatches.

Posted by GitBox <gi...@apache.org>.
TheNeuralBit merged PR #22525:
URL: https://github.com/apache/beam/pull/22525


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] mosche commented on pull request #22525: Fixes #21378: Fix deleteTimer in InMemoryTimerInternals and enable VR tests for GroupIntoBatches.

Posted by GitBox <gi...@apache.org>.
mosche commented on PR #22525:
URL: https://github.com/apache/beam/pull/22525#issuecomment-1199187738

   Run Spark ValidatesRunner


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] mosche commented on pull request #22525: Fixes #21378: Fix deleteTimer in InMemoryTimerInternals and enable VR tests for GroupIntoBatches.

Posted by GitBox <gi...@apache.org>.
mosche commented on PR #22525:
URL: https://github.com/apache/beam/pull/22525#issuecomment-1199251532

   Run Flink ValidatesRunner


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] mosche commented on pull request #22525: Fixes #21378: Fix deleteTimer in InMemoryTimerInternals and enable VR tests for GroupIntoBatches.

Posted by GitBox <gi...@apache.org>.
mosche commented on PR #22525:
URL: https://github.com/apache/beam/pull/22525#issuecomment-1199186636

   Run Flink ValidatesRunner


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] mosche commented on pull request #22525: Fixes #21378: Fix deleteTimer in InMemoryTimerInternals and enable VR tests for GroupIntoBatches.

Posted by GitBox <gi...@apache.org>.
mosche commented on PR #22525:
URL: https://github.com/apache/beam/pull/22525#issuecomment-1199222474

   Run Flink ValidatesRunner


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] mosche commented on pull request #22525: Fixes #21378: Fix deleteTimer in InMemoryTimerInternals and enable VR tests for GroupIntoBatches.

Posted by GitBox <gi...@apache.org>.
mosche commented on PR #22525:
URL: https://github.com/apache/beam/pull/22525#issuecomment-1199188186

   Run Dataflow ValidatesRunner


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] mosche commented on pull request #22525: Fixes #21378: Fix deleteTimer in InMemoryTimerInternals and enable VR tests for GroupIntoBatches.

Posted by GitBox <gi...@apache.org>.
mosche commented on PR #22525:
URL: https://github.com/apache/beam/pull/22525#issuecomment-1199222639

   Run Spark ValidatesRunner


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] mosche commented on pull request #22525: Fix deleteTimer in InMemoryTimerInternals and enable VR tests for GroupIntoBatches.

Posted by GitBox <gi...@apache.org>.
mosche commented on PR #22525:
URL: https://github.com/apache/beam/pull/22525#issuecomment-1203554631

   Thanks for the jumping on this so quickly @TheNeuralBit 🙇‍♂️ I hope you had a great time off :)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org