You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by "lostluck (via GitHub)" <gi...@apache.org> on 2023/04/20 18:11:53 UTC

[GitHub] [beam] lostluck opened a new pull request, #26374: [#26373] Fix some dataraces in the Go SDK

lostluck opened a new pull request, #26374:
URL: https://github.com/apache/beam/pull/26374

   Internal Google scans found some races when running tests in parallel, and some minor races during execution.
   
   * Fixes race in hooks test.
   * Fixes race in statemgr test.
   * Fixes race against status on exec/plan.
   * Fixes race in schema coder reconciliation when constructing bundles. (RW lock)
   
   Fixes #26373.
   
   ------------------------
   
   Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
   
    - [ ] Mention the appropriate issue in your description (for example: `addresses #123`), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment `fixes #<ISSUE NUMBER>` instead.
    - [ ] Update `CHANGES.md` with noteworthy changes.
    - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/get-started-contributing/#make-the-reviewers-job-easier).
   
   To check the build health, please visit [https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md](https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md)
   
   GitHub Actions Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   [![Build python source distribution and wheels](https://github.com/apache/beam/workflows/Build%20python%20source%20distribution%20and%20wheels/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Build+python+source+distribution+and+wheels%22+branch%3Amaster+event%3Aschedule)
   [![Python tests](https://github.com/apache/beam/workflows/Python%20tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Python+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Java tests](https://github.com/apache/beam/workflows/Java%20Tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Java+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Go tests](https://github.com/apache/beam/workflows/Go%20tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Go+tests%22+branch%3Amaster+event%3Aschedule)
   
   See [CI.md](https://github.com/apache/beam/blob/master/CI.md) for more information about GitHub Actions CI.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] lostluck commented on pull request #26374: [#26373] Fix some dataraces in the Go SDK

Posted by "lostluck (via GitHub)" <gi...@apache.org>.
lostluck commented on PR #26374:
URL: https://github.com/apache/beam/pull/26374#issuecomment-1520376203

   Thanks for the review!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] miracvbasaran commented on a diff in pull request #26374: [#26373] Fix some dataraces in the Go SDK

Posted by "miracvbasaran (via GitHub)" <gi...@apache.org>.
miracvbasaran commented on code in PR #26374:
URL: https://github.com/apache/beam/pull/26374#discussion_r1175277489


##########
sdks/go/pkg/beam/core/runtime/exec/plan.go:
##########
@@ -100,29 +109,29 @@ func (p *Plan) SourcePTransformID() string {
 // are brought up on the first execution. If a bundle fails, the plan cannot
 // be reused for further bundles. Does not panic. Blocking.
 func (p *Plan) Execute(ctx context.Context, id string, manager DataContext) error {
-	if p.status == Initializing {
+	if p.getStatus() == Initializing {

Review Comment:
   Can `status` not change during the course of "Execute"?



##########
sdks/go/pkg/beam/core/runtime/exec/plan.go:
##########
@@ -38,7 +39,7 @@ type Plan struct {
 	bf          *bundleFinalizer
 	checkpoints []*Checkpoint
 
-	status Status
+	status Status // Uses atomic getter and setter to avoid dataraces on Splits.

Review Comment:
   Q: What is a "Split" exactly?



##########
sdks/go/pkg/beam/core/runtime/exec/plan.go:
##########
@@ -189,10 +197,11 @@ func (p *Plan) GetExpirationTime() time.Time {
 
 // Down takes the plan and associated units down. Does not panic.
 func (p *Plan) Down(ctx context.Context) error {
-	if p.status == Down {
+	// Technically racy, but only one thread calls this method on the plan.

Review Comment:
   Are we confident that this will also be the case in the future?
   
   If not, maybe we can use a mutex instead?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] lostluck commented on pull request #26374: [#26373] Fix some dataraces in the Go SDK

Posted by "lostluck (via GitHub)" <gi...@apache.org>.
lostluck commented on PR #26374:
URL: https://github.com/apache/beam/pull/26374#issuecomment-1516750051

   cc: @miracvbasaran (though feel free to review as well, to get the internal tests fixed sooner)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] github-actions[bot] commented on pull request #26374: [#26373] Fix some dataraces in the Go SDK

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #26374:
URL: https://github.com/apache/beam/pull/26374#issuecomment-1516773958

   Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment `assign set of reviewers`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] lostluck commented on a diff in pull request #26374: [#26373] Fix some dataraces in the Go SDK

Posted by "lostluck (via GitHub)" <gi...@apache.org>.
lostluck commented on code in PR #26374:
URL: https://github.com/apache/beam/pull/26374#discussion_r1175444534


##########
sdks/go/pkg/beam/core/runtime/exec/plan.go:
##########
@@ -100,29 +109,29 @@ func (p *Plan) SourcePTransformID() string {
 // are brought up on the first execution. If a bundle fails, the plan cannot
 // be reused for further bundles. Does not panic. Blocking.
 func (p *Plan) Execute(ctx context.Context, id string, manager DataContext) error {
-	if p.status == Initializing {
+	if p.getStatus() == Initializing {

Review Comment:
   You'll note that we change the status throughout Execute. But this is the only goroutine that should be making those changes. 
   
   The Split and Progress calls only read the status state, hence the data race on read they experienced.
   
   The goal isn't to make bundle execution state fully threadsafe, because that would be incredibly heavyweight on a per element basis. Only the parts that need to be accessed asynchronously need to be threadsafe, and status was missed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] lostluck commented on a diff in pull request #26374: [#26373] Fix some dataraces in the Go SDK

Posted by "lostluck (via GitHub)" <gi...@apache.org>.
lostluck commented on code in PR #26374:
URL: https://github.com/apache/beam/pull/26374#discussion_r1175435220


##########
sdks/go/pkg/beam/core/runtime/exec/plan.go:
##########
@@ -38,7 +39,7 @@ type Plan struct {
 	bf          *bundleFinalizer
 	checkpoints []*Checkpoint
 
-	status Status
+	status Status // Uses atomic getter and setter to avoid dataraces on Splits.

Review Comment:
   Splits are how Beam (and thus Dataflow) allow for Liquid Sharding. That is, if one bundle is taking a long time, the runner can request the SDK to give up some fraction of it's remaining work so it can be processed in parallel. (This would either be via channel or sub-element splitting with SDFs, but that's a different topic.)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] lostluck commented on a diff in pull request #26374: [#26373] Fix some dataraces in the Go SDK

Posted by "lostluck (via GitHub)" <gi...@apache.org>.
lostluck commented on code in PR #26374:
URL: https://github.com/apache/beam/pull/26374#discussion_r1175438851


##########
sdks/go/pkg/beam/core/runtime/exec/plan.go:
##########
@@ -189,10 +197,11 @@ func (p *Plan) GetExpirationTime() time.Time {
 
 // Down takes the plan and associated units down. Does not panic.
 func (p *Plan) Down(ctx context.Context) error {
-	if p.status == Down {
+	// Technically racy, but only one thread calls this method on the plan.

Review Comment:
   Yes. Only one goroutine manages the bundle lifecycle.
   
   By design, all Plans and execution state is per processing bundle, which will only happen in a single goroutine. Recall it's not supported for any of the DoFn structures to be called from other goroutines (like the iterators and emitters). 
   
   The exception here is the Split and Progress requests, which do interact with the state in another goroutine. This care didn't happen with the Status field, hence the race.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] lostluck merged pull request #26374: [#26373] Fix some dataraces in the Go SDK

Posted by "lostluck (via GitHub)" <gi...@apache.org>.
lostluck merged PR #26374:
URL: https://github.com/apache/beam/pull/26374


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] github-actions[bot] commented on pull request #26374: [#26373] Fix some dataraces in the Go SDK

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #26374:
URL: https://github.com/apache/beam/pull/26374#issuecomment-1516948548

   Assigning reviewers. If you would like to opt out of this review, comment `assign to next reviewer`:
   
   R: @riteshghorse for label go.
   
   Available commands:
   - `stop reviewer notifications` - opt out of the automated review tooling
   - `remind me after tests pass` - tag the comment author after tests pass
   - `waiting on author` - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)
   
   The PR bot will only process comments in the main thread (not review comments).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org