Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/01 13:46:04 UTC

[GitHub] [beam] jrmccluskey commented on a diff in pull request #17782: [BEAM-14536] Handle 0.0 splits in offsetrange restriction

jrmccluskey commented on code in PR #17782:
URL: https://github.com/apache/beam/pull/17782#discussion_r886825913


##########
sdks/go/pkg/beam/io/rtrackers/offsetrange/offsetrange.go:
##########
@@ -208,9 +210,11 @@ func (tracker *Tracker) GetProgress() (done, remaining float64) {
 	return
 }
 
-// IsDone returns true if the most recent claimed element is past the end of the restriction.
+// IsDone returns true if the most recent claimed element is past the end of the restriction
+// or if the restriction represents no work to be done (aka the start of the restriction is
+// greater than or equal to the end).
 func (tracker *Tracker) IsDone() bool {
-	return tracker.err == nil && tracker.claimed >= tracker.rest.End
+	return tracker.err == nil && (tracker.claimed >= tracker.rest.End || tracker.rest.Start >= tracker.rest.End)

Review Comment:
   Wouldn't the "done" condition on `claimed` be `tracker.claimed >= (tracker.rest.End - 1)`, since the range is half-open, [start, end)?



##########
sdks/go/pkg/beam/core/runtime/exec/sdf.go:
##########
@@ -678,10 +678,6 @@ func (n *ProcessSizedElementsAndRestrictions) Checkpoint() ([]*FullValue, error)
 		return nil, addContext(err)
 	}
 
-	if !n.rt.IsDone() {
-		return nil, addContext(errors.Errorf("Primary restriction %#v is not done. Check that the RTracker's TrySplit() at fraction 0.0 returns a completed primary restriction", n.rt))
-	}

Review Comment:
   I don't love removing this check, since I think it has value in preventing a misconfigured RTracker from losing data. I also think the new behavior works with this check if the `IsDone()` function for the offsetrange tracker is tweaked as I suggested.


