You are viewing a plain text version of this content. The canonical link for it is here.
Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/09 19:57:52 UTC

[GitHub] [beam] riteshghorse opened a new pull request, #21776: [Go SDK] Add more info to Worker Status API

riteshghorse opened a new pull request, #21776:
URL: https://github.com/apache/beam/pull/21776

   Adds Memory Usage information, Cache Stats, and Active Process Bundle States to worker status API added in #16957
   
   Example Output (on Dataflow):
   ```
   ===========Memory Usage============ 
   Total Alloc: 15634896 bytes Sys: 15811592 bytes Mallocs: 107161 Frees: 80900 HeapAlloc: 5142088 bytes 
   ============Active Process Bundle States============ 
   Bundle ID: process_bundle-3665700425366926206-0 Bundle State: metrics.BundleState{pid:"CountWords/wordcount.extractFn-ptransform-40", currentState:1} Bundle Execution States: ID: CountWords/stats.Count/stats.keyedCountFn-ptransform-41 Execution States: [4]metrics.ExecutionState{metrics.ExecutionState{State:0, IsProcessing:false, TotalTime:0}, metrics.ExecutionState{State:0, IsProcessing:false, TotalTime:0}, metrics.ExecutionState{State:0, IsProcessing:false, TotalTime:0}, metrics.ExecutionState{State:0, IsProcessing:false, TotalTime:0}},ID: CountWords/wordcount.extractFn-ptransform-40 Execution States: [4]metrics.ExecutionState{metrics.ExecutionState{State:0, IsProcessing:false, TotalTime:0}, metrics.ExecutionState{State:0, IsProcessing:false, TotalTime:597600000000}, metrics.ExecutionState{State:0, IsProcessing:false, TotalTime:0}, metrics.ExecutionState{State:0, IsProcessing:false, TotalTime:597600000000}},ID: beam.createFn-ptransform-39 Execution States: [4]metrics.ExecutionSta
 te{metrics.ExecutionState{State:0, IsProcessing:false, TotalTime:0}, metrics.ExecutionState{State:0, IsProcessing:false, TotalTime:0}, metrics.ExecutionState{State:0, IsProcessing:false, TotalTime:0}, metrics.ExecutionState{State:0, IsProcessing:false, TotalTime:0}},ID: CountWords/stats.Count/stats.SumPerKey/CombinePerKey/CoGBK+CountWords/stats.Count/stats.SumPerKey/CombinePerKey/stats.sumIntFn/Partial-ptransform-42 Execution States: [4]metrics.ExecutionState{metrics.ExecutionState{State:0, IsProcessing:false, TotalTime:200000000}, metrics.ExecutionState{State:0, IsProcessing:false, TotalTime:0}, metrics.ExecutionState{State:0, IsProcessing:false, TotalTime:0}, metrics.ExecutionState{State:0, IsProcessing:false, TotalTime:200000000}}, 
   ============Cache Stats============ 
   Cache: {0 0 0 0 0} 
   ============Goroutine Dump============ 
   goroutine 14 [running]: 
   ```
   
   See the bottom of https://docs.google.com/document/d/1dMTD5_sKdzLcnoe0ZsQU5Wf9q11uliyYgFnnOZQDzuI/ for more details.
   
   ------------------------
   
   Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
   
    - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`).
    - [ ] Mention the appropriate issue in your description (for example: "addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment "fixes #<ISSUE NUMBER>" instead.
    - [ ] Update `CHANGES.md` with noteworthy changes.
    - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   To check the build health, please visit [https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md](https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md)
   
   GitHub Actions Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   [![Build python source distribution and wheels](https://github.com/apache/beam/workflows/Build%20python%20source%20distribution%20and%20wheels/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Build+python+source+distribution+and+wheels%22+branch%3Amaster+event%3Aschedule)
   [![Python tests](https://github.com/apache/beam/workflows/Python%20tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Python+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Java tests](https://github.com/apache/beam/workflows/Java%20Tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Java+Tests%22+branch%3Amaster+event%3Aschedule)
   
   See [CI.md](https://github.com/apache/beam/blob/master/CI.md) for more information about GitHub Actions CI.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] asf-ci commented on pull request #21776: [Go SDK] Add more info to Worker Status API

Posted by GitBox <gi...@apache.org>.
asf-ci commented on PR #21776:
URL: https://github.com/apache/beam/pull/21776#issuecomment-1151561651

   Can one of the admins verify this patch?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] github-actions[bot] commented on pull request #21776: [Go SDK] Add more info to Worker Status API

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #21776:
URL: https://github.com/apache/beam/pull/21776#issuecomment-1151647009

   Assigning reviewers. If you would like to opt out of this review, comment `assign to next reviewer`:
   
   R: @jrmccluskey for label go.
   
   Available commands:
   - `stop reviewer notifications` - opt out of the automated review tooling
   - `remind me after tests pass` - tag the comment author after tests pass
   - `waiting on author` - shift the attention set back to the author (any comment or push by the author will return the attention set to the reviewers)
   
   The PR bot will only process comments in the main thread (not review comments).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] lostluck commented on a diff in pull request #21776: [Go SDK] Add more info to Worker Status API

Posted by GitBox <gi...@apache.org>.
lostluck commented on code in PR #21776:
URL: https://github.com/apache/beam/pull/21776#discussion_r894032839


##########
sdks/go/pkg/beam/core/runtime/harness/harness.go:
##########
@@ -140,6 +118,19 @@ func Main(ctx context.Context, loggingEndpoint, controlEndpoint string, options
 		state:                &StateChannelManager{},
 		cache:                &sideCache,
 	}
+
+	// if the runner supports worker status api then expose SDK harness status
+	if statusEndpoint != "" {
+		statusHandler, err := newWorkerStatusHandler(ctx, statusEndpoint, ctrl.metStore, ctrl.cache)

Review Comment:
   The MetStore requires being locked by ctrl.mu, otherwise we run the risk of a race condition. I'd add a method to the control type that produces the string, all while locked. The we can call it on demand to avoid this issue.
   
   Note that the statecache doesn't have this issue since it manages it's own locking anyway.
   
   ----
   
   We likely want to make a much better "current bundles being processed" string from this information, instead of the full state dump. It's not particularly readable, there's not enough newlines to break things up, and similar.
   
   Good enough for a v0 pass, but we can improve this, either now or later.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] lostluck commented on pull request #21776: [Go SDK] Add more info to Worker Status API

Posted by GitBox <gi...@apache.org>.
lostluck commented on PR #21776:
URL: https://github.com/apache/beam/pull/21776#issuecomment-1151667737

   Run Go Postcommit


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] riteshghorse commented on pull request #21776: [Go SDK] Add more info to Worker Status API

Posted by GitBox <gi...@apache.org>.
riteshghorse commented on PR #21776:
URL: https://github.com/apache/beam/pull/21776#issuecomment-1152624100

   Made the changes, PTAL


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] riteshghorse commented on a diff in pull request #21776: [Go SDK] Add more info to Worker Status API

Posted by GitBox <gi...@apache.org>.
riteshghorse commented on code in PR #21776:
URL: https://github.com/apache/beam/pull/21776#discussion_r894726809


##########
sdks/go/pkg/beam/core/runtime/harness/harness.go:
##########
@@ -140,6 +118,19 @@ func Main(ctx context.Context, loggingEndpoint, controlEndpoint string, options
 		state:                &StateChannelManager{},
 		cache:                &sideCache,
 	}
+
+	// if the runner supports worker status api then expose SDK harness status
+	if statusEndpoint != "" {
+		statusHandler, err := newWorkerStatusHandler(ctx, statusEndpoint, ctrl.metStore, ctrl.cache)

Review Comment:
   I've modified the struct to contain a func implemented by ctrl for getting active bundle process states. I thought of passing the ctrl itself to the `newStatusHandler` but that would be an overhead since we just care about 1/2 fields of all the available ones.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] lostluck merged pull request #21776: [Go SDK] Add more info to Worker Status API

Posted by GitBox <gi...@apache.org>.
lostluck merged PR #21776:
URL: https://github.com/apache/beam/pull/21776


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] riteshghorse commented on pull request #21776: [Go SDK] Add more info to Worker Status API

Posted by GitBox <gi...@apache.org>.
riteshghorse commented on PR #21776:
URL: https://github.com/apache/beam/pull/21776#issuecomment-1151581166

   Run GoPortable PostCommit 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] github-actions[bot] commented on pull request #21776: [Go SDK] Add more info to Worker Status API

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #21776:
URL: https://github.com/apache/beam/pull/21776#issuecomment-1151594672

   Checks are failing. Will not request review until checks are succeeding. If you'd like to override that behavior, comment `assign set of reviewers`


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] codecov[bot] commented on pull request #21776: [Go SDK] Add more info to Worker Status API

Posted by GitBox <gi...@apache.org>.
codecov[bot] commented on PR #21776:
URL: https://github.com/apache/beam/pull/21776#issuecomment-1151567485

   # [Codecov](https://codecov.io/gh/apache/beam/pull/21776?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#21776](https://codecov.io/gh/apache/beam/pull/21776?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (1801d67) into [master](https://codecov.io/gh/apache/beam/commit/67533d17fd70c0c8994a3eb758b175dddfaea83b?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (67533d1) will **decrease** coverage by `0.01%`.
   > The diff coverage is `41.86%`.
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #21776      +/-   ##
   ==========================================
   - Coverage   74.03%   74.02%   -0.02%     
   ==========================================
     Files         698      698              
     Lines       92192    92229      +37     
   ==========================================
   + Hits        68255    68271      +16     
   - Misses      22686    22706      +20     
   - Partials     1251     1252       +1     
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | go | `50.94% <41.86%> (-0.02%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/21776?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [sdks/go/pkg/beam/core/metrics/store.go](https://codecov.io/gh/apache/beam/pull/21776/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL21ldHJpY3Mvc3RvcmUuZ28=) | `14.70% <0.00%> (-0.92%)` | :arrow_down: |
   | [sdks/go/pkg/beam/core/runtime/harness/harness.go](https://codecov.io/gh/apache/beam/pull/21776/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvaGFybmVzcy9oYXJuZXNzLmdv) | `10.50% <0.00%> (+0.04%)` | :arrow_up: |
   | [...beam/core/runtime/harness/statecache/statecache.go](https://codecov.io/gh/apache/beam/pull/21776/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvaGFybmVzcy9zdGF0ZWNhY2hlL3N0YXRlY2FjaGUuZ28=) | `75.69% <0.00%> (-1.07%)` | :arrow_down: |
   | [.../go/pkg/beam/core/runtime/harness/worker\_status.go](https://codecov.io/gh/apache/beam/pull/21776/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9jb3JlL3J1bnRpbWUvaGFybmVzcy93b3JrZXJfc3RhdHVzLmdv) | `56.92% <66.66%> (+5.76%)` | :arrow_up: |
   | [...ks/go/pkg/beam/runners/dataflow/dataflowlib/job.go](https://codecov.io/gh/apache/beam/pull/21776/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9ydW5uZXJzL2RhdGFmbG93L2RhdGFmbG93bGliL2pvYi5nbw==) | `21.55% <0.00%> (ø)` | |
   | [...o/pkg/beam/runners/dataflow/dataflowlib/execute.go](https://codecov.io/gh/apache/beam/pull/21776/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9ydW5uZXJzL2RhdGFmbG93L2RhdGFmbG93bGliL2V4ZWN1dGUuZ28=) | `0.00% <0.00%> (ø)` | |
   | [sdks/go/pkg/beam/runners/dataflow/dataflow.go](https://codecov.io/gh/apache/beam/pull/21776/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9nby9wa2cvYmVhbS9ydW5uZXJzL2RhdGFmbG93L2RhdGFmbG93Lmdv) | `58.65% <0.00%> (+0.23%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/21776?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/21776?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [67533d1...1801d67](https://codecov.io/gh/apache/beam/pull/21776?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] asf-ci commented on pull request #21776: [Go SDK] Add more info to Worker Status API

Posted by GitBox <gi...@apache.org>.
asf-ci commented on PR #21776:
URL: https://github.com/apache/beam/pull/21776#issuecomment-1151561652

   Can one of the admins verify this patch?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] lostluck commented on a diff in pull request #21776: [Go SDK] Add more info to Worker Status API

Posted by GitBox <gi...@apache.org>.
lostluck commented on code in PR #21776:
URL: https://github.com/apache/beam/pull/21776#discussion_r894022076


##########
sdks/go/pkg/beam/core/runtime/harness/worker_status.go:
##########
@@ -65,20 +70,50 @@ func (w *workerStatusHandler) start(ctx context.Context) error {
 	return nil
 }
 
+func memoryUsage() string {
+	m := runtime.MemStats{}
+	runtime.ReadMemStats(&m)
+	return fmt.Sprintf("\n Total Alloc: %v bytes \n Sys: %v bytes \n Mallocs: %v\n Frees: %v\n HeapAlloc: %v bytes", m.TotalAlloc, m.Sys, m.Mallocs, m.Frees, m.HeapAlloc)
+}
+
+func (w *workerStatusHandler) activeProcessBundleStates() string {
+	var states string
+	for bundleID, store := range w.metStore {
+		execStates := ""
+		for bundleID, state := range store.StateRegistry() {
+			execStates += fmt.Sprintf("ID: %v Execution States: %#v,", bundleID, *state)
+
+		}
+		states += fmt.Sprintf("\nBundle ID: %v\nBundle State: %#v\nBundle Execution States: %v\n", bundleID, *store.BundleState(), execStates)

Review Comment:
   We could also add the "build info" which will include a pile of useful information for debugging. It should go *after* the goroutine traces.
   
   https://pkg.go.dev/runtime/debug#ReadBuildInfo



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] lostluck commented on a diff in pull request #21776: [Go SDK] Add more info to Worker Status API

Posted by GitBox <gi...@apache.org>.
lostluck commented on code in PR #21776:
URL: https://github.com/apache/beam/pull/21776#discussion_r893952091


##########
sdks/go/pkg/beam/core/runtime/harness/worker_status.go:
##########
@@ -65,20 +70,50 @@ func (w *workerStatusHandler) start(ctx context.Context) error {
 	return nil
 }
 
+func memoryUsage() string {
+	m := runtime.MemStats{}
+	runtime.ReadMemStats(&m)
+	return fmt.Sprintf("\n Total Alloc: %v bytes \n Sys: %v bytes \n Mallocs: %v\n Frees: %v\n HeapAlloc: %v bytes", m.TotalAlloc, m.Sys, m.Mallocs, m.Frees, m.HeapAlloc)
+}
+
+func (w *workerStatusHandler) activeProcessBundleStates() string {
+	var states string
+	for bundleID, store := range w.metStore {
+		execStates := ""
+		for bundleID, state := range store.StateRegistry() {
+			execStates += fmt.Sprintf("ID: %v Execution States: %#v,", bundleID, *state)
+
+		}
+		states += fmt.Sprintf("\nBundle ID: %v\nBundle State: %#v\nBundle Execution States: %v\n", bundleID, *store.BundleState(), execStates)
+	}
+	return states
+}
+
+func (w *workerStatusHandler) cacheStats() string {
+	return fmt.Sprintf("Cache:\n%v", w.cache.CacheMetrics())

Review Comment:
   ```suggestion
   	return fmt.Sprintf("State Cache:\n%+v", w.cache.CacheMetrics())
   ```
   
   It should be clear what cache this is taking about, and {0 0 0 0 0} isn't useful to anyway, so we use `%+v` to have the fields printed out as well, which makes the numbers useful.
   
   https://go.dev/play/p/q9db2d3a5Lv



##########
sdks/go/pkg/beam/core/runtime/harness/worker_status.go:
##########
@@ -65,20 +70,50 @@ func (w *workerStatusHandler) start(ctx context.Context) error {
 	return nil
 }
 
+func memoryUsage() string {
+	m := runtime.MemStats{}
+	runtime.ReadMemStats(&m)
+	return fmt.Sprintf("\n Total Alloc: %v bytes \n Sys: %v bytes \n Mallocs: %v\n Frees: %v\n HeapAlloc: %v bytes", m.TotalAlloc, m.Sys, m.Mallocs, m.Frees, m.HeapAlloc)
+}
+
+func (w *workerStatusHandler) activeProcessBundleStates() string {
+	var states string
+	for bundleID, store := range w.metStore {
+		execStates := ""
+		for bundleID, state := range store.StateRegistry() {
+			execStates += fmt.Sprintf("ID: %v Execution States: %#v,", bundleID, *state)
+
+		}
+		states += fmt.Sprintf("\nBundle ID: %v\nBundle State: %#v\nBundle Execution States: %v\n", bundleID, *store.BundleState(), execStates)
+	}
+	return states
+}
+
+func (w *workerStatusHandler) cacheStats() string {
+	return fmt.Sprintf("Cache:\n%v", w.cache.CacheMetrics())
+}
+
+func goroutineDump() string {
+	buf := make([]byte, 1<<16)
+	runtime.Stack(buf, true)

Review Comment:
   Instead of the raw runtime version, use the pprof version:
   
   `import "runtime/pprof"`
   ```
   profile := pprof.Lookup("goroutine")
   if profile != nil {
     // Use debug=1 to get the human readable consolidated goroutine output.
     profile.Write(b, 1)  
   }
   ```
   
   This makes the output easier to read as some repeated goroutines will instead be consolidated, and the duplication indicated with a count.
   
   (Note that a `*strings.Builder` implements `io.Writer` and can be passed right in. (the method is on the pointer)



##########
sdks/go/pkg/beam/core/runtime/harness/harness.go:
##########
@@ -37,26 +38,15 @@ import (
 	"google.golang.org/protobuf/types/known/durationpb"
 )
 
-// StatusAddress is a type of status endpoint address as an optional argument to harness.Main().
-type StatusAddress string
-
 // TODO(herohde) 2/8/2017: for now, assume we stage a full binary (not a plugin).
 
 // Main is the main entrypoint for the Go harness. It runs at "runtime" -- not
 // "pipeline-construction time" -- on each worker. It is a FnAPI client and
 // ultimately responsible for correctly executing user code.

Review Comment:
   Please add documentation around "expected" Environment variables. TBH, I don't mind the options approach to pass into main, but the  fetching form the Env vars would then happen in the init.go.



##########
sdks/go/pkg/beam/core/runtime/harness/worker_status.go:
##########
@@ -65,20 +70,50 @@ func (w *workerStatusHandler) start(ctx context.Context) error {
 	return nil
 }
 
+func memoryUsage() string {
+	m := runtime.MemStats{}
+	runtime.ReadMemStats(&m)
+	return fmt.Sprintf("\n Total Alloc: %v bytes \n Sys: %v bytes \n Mallocs: %v\n Frees: %v\n HeapAlloc: %v bytes", m.TotalAlloc, m.Sys, m.Mallocs, m.Frees, m.HeapAlloc)
+}
+
+func (w *workerStatusHandler) activeProcessBundleStates() string {
+	var states string
+	for bundleID, store := range w.metStore {
+		execStates := ""
+		for bundleID, state := range store.StateRegistry() {
+			execStates += fmt.Sprintf("ID: %v Execution States: %#v,", bundleID, *state)
+
+		}
+		states += fmt.Sprintf("\nBundle ID: %v\nBundle State: %#v\nBundle Execution States: %v\n", bundleID, *store.BundleState(), execStates)
+	}
+	return states
+}
+
+func (w *workerStatusHandler) cacheStats() string {
+	return fmt.Sprintf("Cache:\n%v", w.cache.CacheMetrics())
+}
+
+func goroutineDump() string {
+	buf := make([]byte, 1<<16)
+	runtime.Stack(buf, true)
+	return string(buf)
+}
+
 // reader reads the WorkerStatusRequest from the stream and sends a processed WorkerStatusResponse to
 // a response channel.
 func (w *workerStatusHandler) reader(ctx context.Context, stub fnpb.BeamFnWorkerStatus_WorkerStatusClient) {
 	defer w.wg.Done()
-	buf := make([]byte, 1<<16)
+
 	for w.isAlive() {
 		req, err := stub.Recv()
 		if err != nil && err != io.EOF {
 			log.Debugf(ctx, "exiting workerStatusHandler.Reader(): %v", err)
 			return
 		}
 		log.Debugf(ctx, "RECV-status: %v", req.GetId())
-		runtime.Stack(buf, true)
-		response := &fnpb.WorkerStatusResponse{Id: req.GetId(), StatusInfo: string(buf)}
+
+		statusInfo := fmt.Sprintf("\n============Memory Usage============\n%s\n============Active Process Bundle States============\n%s\n============Cache Stats============\n%s\n============Goroutine Dump============\n%s\n", memoryUsage(), w.activeProcessBundleStates(), w.cacheStats(), goroutineDump())

Review Comment:
   Use a [strings.Builder](https://pkg.go.dev/strings#Builder) instead of trying to get everything into a single print output. It will be easier to read. I'd recommend simply passing the builder to each of the helper methods, and have each of the helper methods contribute their well formed output. Then these are still easy to unit test.



##########
sdks/go/test/integration/wordcount/wordcount.go:
##########
@@ -18,15 +18,13 @@ package wordcount
 
 import (
 	"context"
-	"regexp"

Review Comment:
   We can remove this change. At best, we mostly want to remove the blank line between "strings" and "fmt".



##########
sdks/go/pkg/beam/core/runtime/harness/worker_status.go:
##########
@@ -65,20 +70,50 @@ func (w *workerStatusHandler) start(ctx context.Context) error {
 	return nil
 }
 
+func memoryUsage() string {
+	m := runtime.MemStats{}
+	runtime.ReadMemStats(&m)
+	return fmt.Sprintf("\n Total Alloc: %v bytes \n Sys: %v bytes \n Mallocs: %v\n Frees: %v\n HeapAlloc: %v bytes", m.TotalAlloc, m.Sys, m.Mallocs, m.Frees, m.HeapAlloc)

Review Comment:
   Aside from the previous string builder, we get *so much* information in MemStats, we should at least try to match what Java's putting out.
   
   https://pkg.go.dev/runtime#MemStats  -> Everything we get.
   Java's print out: https://github.com/apache/beam/blob/ffef8de04a93435e69faf3bf65efe11852cbd8dc/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/status/MemoryMonitor.java#L622 
   
   We can do the same level of readability improvement that Java is doing, though I don't think we can do "thrashing". We can show the GC target size, and `GCCPUFraction` gives us how much CPU time has been spent in GC in a way we can use as the percentage. Not sure about "thrashing" and "pushback", but this gets us close to the way there.
   
   Java's for reference:
   https://github.com/apache/beam/blob/ffef8de04a93435e69faf3bf65efe11852cbd8dc/sdks/java/harness/src/main/java/org/apache/beam/fn/harness/status/MemoryMonitor.java#L411
   
   `NextGC` is useful since it sets the target of the next GC.
   
   `Mallocs` & `Frees` aren't super useful outside of a full heap dump (we care about memory, not counts). `HeapInuse` is more useful, as is `StackInuse` and `StackSys` (We'll need both for a complete picture, as Go tries to do as much as possible on the stack, instead of the heap, unlike Java which is All Heap).



##########
sdks/go/pkg/beam/core/runtime/harness/worker_status.go:
##########
@@ -65,20 +70,50 @@ func (w *workerStatusHandler) start(ctx context.Context) error {
 	return nil
 }
 
+func memoryUsage() string {
+	m := runtime.MemStats{}
+	runtime.ReadMemStats(&m)
+	return fmt.Sprintf("\n Total Alloc: %v bytes \n Sys: %v bytes \n Mallocs: %v\n Frees: %v\n HeapAlloc: %v bytes", m.TotalAlloc, m.Sys, m.Mallocs, m.Frees, m.HeapAlloc)
+}
+
+func (w *workerStatusHandler) activeProcessBundleStates() string {
+	var states string
+	for bundleID, store := range w.metStore {
+		execStates := ""
+		for bundleID, state := range store.StateRegistry() {
+			execStates += fmt.Sprintf("ID: %v Execution States: %#v,", bundleID, *state)
+
+		}
+		states += fmt.Sprintf("\nBundle ID: %v\nBundle State: %#v\nBundle Execution States: %v\n", bundleID, *store.BundleState(), execStates)

Review Comment:
   Aside from making it use a strings.Builder, the only concern is to validate that we are being thread safe doing. We likely can improve the printout to be much clearer though.



##########
sdks/go/pkg/beam/core/runtime/harness/statecache/statecache.go:
##########
@@ -272,3 +272,8 @@ func (c *SideInputCache) evictElement(ctx context.Context) {
 		}
 	}
 }
+
+// CacheMetrics returns the cache metrics for current side input cache.
+func (c *SideInputCache) CacheMetrics() CacheMetrics {
+	return c.metrics

Review Comment:
   This method needs to be under the lock's critical section.
   
   Add
   
   ```
   	c.mu.Lock()
   	defer c.mu.Unlock()
   ```
   
   The return will provide a copy, but the copy could happen then things are mutating, causing a race condition error.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


[GitHub] [beam] riteshghorse commented on pull request #21776: [Go SDK] Add more info to Worker Status API

Posted by GitBox <gi...@apache.org>.
riteshghorse commented on PR #21776:
URL: https://github.com/apache/beam/pull/21776#issuecomment-1155297790

   Run Go PostCommit


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org