You are viewing a plain text version of this content. The canonical link for it is here.
Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2022/12/20 15:36:31 UTC
[GitHub] [flink-kubernetes-operator] mxm opened a new pull request, #492: Start stabilization period only after job goes into RUNNING
mxm opened a new pull request, #492:
URL: https://github.com/apache/flink-kubernetes-operator/pull/492
We were tracking the stabilization period from the job start time which can greatly vary from the time the job goes into RUNNING state. This can result in no stabilization period at all leading to poor scaling decisions.
Thus, we change the logic to use the update time in combination with a RUNNING job status.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [flink-kubernetes-operator] mxm commented on pull request #492: [FLINK-30463] Start stabilization period only after job goes into RUNNING
Posted by GitBox <gi...@apache.org>.
mxm commented on PR #492:
URL: https://github.com/apache/flink-kubernetes-operator/pull/492#issuecomment-1361331765
Do you think there are any blockers merging this?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [flink-kubernetes-operator] gyfora commented on a diff in pull request #492: [FLINK-30463] Start stabilization period only after job goes into RUNNING
Posted by GitBox <gi...@apache.org>.
gyfora commented on code in PR #492:
URL: https://github.com/apache/flink-kubernetes-operator/pull/492#discussion_r1053501498
##########
flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/autoscaler/ScalingExecutor.java:
##########
@@ -121,13 +122,22 @@ public boolean scaleResource(
private boolean stabilizationPeriodPassed(
AbstractFlinkResource<?, ?> resource, Configuration conf) {
- var now = clock.instant();
+ var jobStatus = resource.getStatus().getJobStatus();
+
+ if (!JobStatus.RUNNING.name().equals(jobStatus.getState())) {
Review Comment:
This might be a redundant check because the `JobAutoscaler` class already runs this check before calling any collection, evaluation etc.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [flink-kubernetes-operator] mxm commented on a diff in pull request #492: [FLINK-30463] Start stabilization period only after job goes into RUNNING
Posted by GitBox <gi...@apache.org>.
mxm commented on code in PR #492:
URL: https://github.com/apache/flink-kubernetes-operator/pull/492#discussion_r1054407083
##########
flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/autoscaler/ScalingExecutor.java:
##########
@@ -121,13 +122,22 @@ public boolean scaleResource(
private boolean stabilizationPeriodPassed(
AbstractFlinkResource<?, ?> resource, Configuration conf) {
- var now = clock.instant();
+ var jobStatus = resource.getStatus().getJobStatus();
+
+ if (!JobStatus.RUNNING.name().equals(jobStatus.getState())) {
Review Comment:
I noticed while implementing that there are various invariants enforced in different layers. This per se is not a problem if they weren't leaky, i.e. we wouldn't pass the same data structures with different invariants through the layers. This becomes apparent in unit tests where implied invariants have to be modeled. One way to solve this would be to pass a different data structure, e.g. not pass JobStatus to lower levels but only the update time / running time.
I think the check here doesn't hurt as it asserts an important invariant.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [flink-kubernetes-operator] mxm merged pull request #492: [FLINK-30463] Start stabilization period only after job goes into RUNNING
Posted by GitBox <gi...@apache.org>.
mxm merged PR #492:
URL: https://github.com/apache/flink-kubernetes-operator/pull/492
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [flink-kubernetes-operator] gyfora commented on pull request #492: [FLINK-30463] Start stabilization period only after job goes into RUNNING
Posted by GitBox <gi...@apache.org>.
gyfora commented on PR #492:
URL: https://github.com/apache/flink-kubernetes-operator/pull/492#issuecomment-1361343304
Looks good +1
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [flink-kubernetes-operator] mxm commented on pull request #492: [FLINK-30463] Start stabilization period only after job goes into RUNNING
Posted by GitBox <gi...@apache.org>.
mxm commented on PR #492:
URL: https://github.com/apache/flink-kubernetes-operator/pull/492#issuecomment-1361921179
Thanks for the review!
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org
[GitHub] [flink-kubernetes-operator] mxm commented on pull request #492: [FLINK-30463] Start stabilization period only after job goes into RUNNING
Posted by GitBox <gi...@apache.org>.
mxm commented on PR #492:
URL: https://github.com/apache/flink-kubernetes-operator/pull/492#issuecomment-1359586067
Note: I need to tag the commit with the correct JIRA on merge.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: issues-unsubscribe@flink.apache.org
For queries about this service, please contact Infrastructure at:
users@infra.apache.org