Posted to issues@flink.apache.org by GitBox <gi...@apache.org> on 2022/04/14 09:32:33 UTC

[GitHub] [flink-kubernetes-operator] wangyang0918 commented on a diff in pull request #165: [FLINK-26140] Support rollback strategies

wangyang0918 commented on code in PR #165:
URL: https://github.com/apache/flink-kubernetes-operator/pull/165#discussion_r850181529


##########
docs/content/docs/custom-resource/reference.md:
##########
@@ -226,9 +226,32 @@ This page serves as a full reference for FlinkDeployment custom resource definit
 
 | Parameter | Type | Docs |
 | ----------| ---- | ---- |
-| success | boolean | True if last reconciliation step was successful. |
-| error | java.lang.String | If success == false, error information about the reconciliation failure. |
+
+### State

Review Comment:
   I think you mean ReconciliationStatus here. Right?



##########
flink-kubernetes-operator/src/main/java/org/apache/flink/kubernetes/operator/reconciler/deployment/ApplicationReconciler.java:
##########
@@ -102,36 +111,84 @@ public void reconcile(FlinkDeployment flinkApp, Context context, Configuration e
             }
             if (currentJobState == JobState.SUSPENDED && desiredJobState == JobState.RUNNING) {
                 if (upgradeMode == UpgradeMode.STATELESS) {
-                    deployFlinkJob(flinkApp, effectiveConfig, Optional.empty());
-                } else if (upgradeMode == UpgradeMode.LAST_STATE
-                        || upgradeMode == UpgradeMode.SAVEPOINT) {
-                    restoreFromLastSavepoint(flinkApp, effectiveConfig);
+                    deployFlinkJob(currentJobSpec, status, effectiveConfig, Optional.empty());
+                } else {
+                    restoreFromLastSavepoint(currentJobSpec, status, effectiveConfig);
                 }
                 stateAfterReconcile = JobState.RUNNING;
             }
-            IngressUtils.updateIngressRules(flinkApp, effectiveConfig, kubernetesClient);
+            IngressUtils.updateIngressRules(
+                    deployMeta, currentDeploySpec, effectiveConfig, kubernetesClient);
             ReconciliationUtils.updateForSpecReconciliationSuccess(flinkApp, stateAfterReconcile);
-        } else if (SavepointUtils.shouldTriggerSavepoint(flinkApp) && isJobRunning(flinkApp)) {
+        } else if (ReconciliationUtils.shouldRollBack(reconciliationStatus, effectiveConfig)) {
+            rollbackApplication(flinkApp);
+        } else if (SavepointUtils.shouldTriggerSavepoint(currentJobSpec, status)
+                && isJobRunning(status)) {
             triggerSavepoint(flinkApp, effectiveConfig);
             ReconciliationUtils.updateSavepointReconciliationSuccess(flinkApp);
+        } else {
+            LOG.info("Deployment is fully reconciled, nothing to do.");
         }
     }
 
+    private void rollbackApplication(FlinkDeployment flinkApp) throws Exception {
+        ReconciliationStatus reconciliationStatus = flinkApp.getStatus().getReconciliationStatus();
+
+        if (reconciliationStatus.getState() != ReconciliationStatus.State.ROLLING_BACK) {
+            LOG.warn("Preparing to roll back to last stable spec.");
+            if (flinkApp.getStatus().getError() == null) {
+                flinkApp.getStatus()
+                        .setError(
+                                "Deployment is not ready within the configured timeout, rolling-back.");
+            }
+            reconciliationStatus.setState(ReconciliationStatus.State.ROLLING_BACK);
+            return;
+        }
+
+        LOG.warn("Executing roll-back operation");
+
+        FlinkDeploymentSpec rollbackSpec = reconciliationStatus.deserializeLastStableSpec();
+        Configuration rollbackConfig =
+                FlinkUtils.getEffectiveConfig(flinkApp.getMetadata(), rollbackSpec, defaultConfig);
+
+        UpgradeMode upgradeMode = flinkApp.getSpec().getJob().getUpgradeMode();
+
+        suspendJob(

Review Comment:
   I am wondering whether we could simply roll back via `flinkApp.setSpec(rollbackSpec)` and reuse the current upgrade logic in `#reconcile`.
   
   I am not sure whether this is the intended behavior. It seems that we do not roll back the `spec`, but just redeploy the application. Once the rollback has finished, `lastStableSpec` will differ from `lastReconciledSpec`.
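
A minimal sketch of the alternative raised in this comment, under loud assumptions: `RollbackSketch`, `Deployment`, `Status`, and the simplified `reconcile` are hypothetical stand-ins, not the operator's real classes, and specs are modeled as plain strings. It only illustrates that if the rollback overwrites the spec and then goes through the normal upgrade path, `lastReconciledSpec` ends up equal to `lastStableSpec`.

```java
import java.util.Objects;

public class RollbackSketch {

    /** Hypothetical stand-in for the status; specs are modeled as plain strings. */
    public static class Status {
        public String lastReconciledSpec;
        public String lastStableSpec;
    }

    /** Hypothetical stand-in for FlinkDeployment. */
    public static class Deployment {
        public String spec;
        public final Status status = new Status();

        public Deployment(String spec) {
            this.spec = spec;
        }
    }

    /**
     * Simplified stand-in for the normal upgrade path: act only when the
     * desired spec differs from the last reconciled one.
     * Returns true if an upgrade was performed.
     */
    public static boolean reconcile(Deployment app) {
        if (Objects.equals(app.spec, app.status.lastReconciledSpec)) {
            return false; // already reconciled, nothing to do
        }
        // A real reconciler would suspend and redeploy the job here.
        app.status.lastReconciledSpec = app.spec;
        return true;
    }

    /** Rollback expressed as "restore the stable spec, then reconcile as usual". */
    public static boolean rollbackViaSpec(Deployment app) {
        app.spec = app.status.lastStableSpec;
        return reconcile(app);
    }

    public static void main(String[] args) {
        Deployment app = new Deployment("image:v1");
        reconcile(app);
        // Assume v1 became healthy and was recorded as stable.
        app.status.lastStableSpec = app.status.lastReconciledSpec;

        app.spec = "image:v2"; // upgrade that never becomes ready
        reconcile(app);

        rollbackViaSpec(app);
        System.out.println("spec=" + app.spec
                + " reconciled=" + app.status.lastReconciledSpec
                + " stable=" + app.status.lastStableSpec);
    }
}
```

Whether the operator should overwrite the user-submitted spec at all is the open design question here; the sketch does not settle it, it only shows the resulting status fields under this approach.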


