Posted to commits@flink.apache.org by GitBox <gi...@apache.org> on 2022/03/22 10:11:45 UTC

[GitHub] [flink-kubernetes-operator] Aitozi commented on pull request #91: [FLINK-26554] Upgrade Operator SDK to avoid cleanup race condition

Aitozi commented on pull request #91:
URL: https://github.com/apache/flink-kubernetes-operator/pull/91#issuecomment-1074981205


   > @gyfora It seems that the e2e tests are not stable after #84.
   > 
   > ```
   > Run ls e2e-tests/test_*.sh | while read script_test;do \
   > Running e2e-tests/test_kubernetes_application_ha.sh
   > persistentvolumeclaim/flink-example-statemachine created
   > Error from server (InternalError): error when creating "e2e-tests/data/cr.yaml": Internal error occurred: failed calling webhook "vflinkdeployments.flink.apache.org": failed to call webhook: Post "https://flink-operator-webhook-service.default.svc:443/validate?timeout=10s": dial tcp 10.106.63.26:443: connect: connection refused
   > Command: kubectl apply -f e2e-tests/data/cr.yaml failed. Retrying...
   > flinkdeployment.flink.apache.org/flink-example-statemachine created
   > persistentvolumeclaim/flink-example-statemachine unchanged
   > Error from server (NotFound): deployments.apps "flink-example-statemachine" not found
   > Command: kubectl get deploy/flink-example-statemachine failed. Retrying...
   > NAME                         READY   UP-TO-DATE   AVAILABLE   AGE
   > flink-example-statemachine   0/1     1            0           1s
   > deployment.apps/flink-example-statemachine condition met
   > Waiting for jobmanager pod flink-example-statemachine-7fcf55c88b-h5r7r ready.
   > pod/flink-example-statemachine-7fcf55c88b-h5r7r condition met
   > Waiting for log "Rest endpoint listening at"...
   > Log "Rest endpoint listening at" shows up.
   > Waiting for log "Completed checkpoint [0-9]+ for job"...
   > Log "Completed checkpoint [0-9]+ for job" shows up.
   > Successfully verified that flinkdep/flink-example-statemachine.status.jobManagerDeploymentStatus is in READY state.
   > Successfully verified that flinkdep/flink-example-statemachine.status.jobStatus.state is in RUNNING state.
   > Kill the flink-example-statemachine-7fcf55c88b-h5r7r
   > Defaulted container "flink-main-container" out of: flink-main-container, artifacts-fetcher (init)
   > Waiting for log "Restoring job 00000000000000000000000000000000 from Checkpoint"...
   > Log "Restoring job 00000000000000000000000000000000 from Checkpoint" shows up.
   > Waiting for log "Completed checkpoint [0-9]+ for job"...
   > Log "Completed checkpoint [0-9]+ for job" shows up.
   > Status verification for flinkdep/flink-example-statemachine.status.jobManagerDeploymentStatus failed. It is DEPLOYED_NOT_READY instead of READY.
   > Debugging failed e2e test:
   > ```
   
   > Status verification for flinkdep/flink-example-statemachine.status.jobManagerDeploymentStatus failed. It is DEPLOYED_NOT_READY instead of READY.
   
   I also hit this same error today on my private CI. After I re-ran the job, it passed.
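
If the failure is a timing issue (an assumption; the passing re-run suggests it but does not prove it), the status check could poll until a deadline instead of reading the field once. A minimal sketch follows. The helper name `wait_for_value` and the timeout values are hypothetical and are not taken from the e2e scripts in this repository.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command repeatedly until its output equals the
# expected value or the timeout (in seconds) expires.
# Usage: wait_for_value EXPECTED TIMEOUT_SECONDS COMMAND [ARGS...]
wait_for_value() {
  local expected="$1" timeout="$2"
  shift 2
  local deadline=$(( $(date +%s) + timeout ))
  local actual=""
  while [ "$(date +%s)" -le "$deadline" ]; do
    # A failing command (for example, resource not found yet) counts as "not ready".
    actual="$("$@" 2>/dev/null)" || actual=""
    if [ "$actual" = "$expected" ]; then
      echo "Successfully verified that the value is $expected."
      return 0
    fi
    sleep 1
  done
  echo "Status verification failed. It is '$actual' instead of '$expected'." >&2
  return 1
}

# Example against a live cluster (commented out so the sketch runs without one):
# wait_for_value READY 120 \
#   kubectl get flinkdep/flink-example-statemachine \
#   -o jsonpath='{.status.jobManagerDeploymentStatus}'
```

Polling only hides the flakiness if the status eventually converges to READY. If it stays at DEPLOYED_NOT_READY, the helper still fails and reports the last value it saw.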

