Posted to dev@submarine.apache.org by "cdmikechen (Jira)" <ji...@apache.org> on 2023/04/01 10:12:00 UTC
[jira] [Updated] (SUBMARINE-1378) The current state of the experiment should be further refined
[ https://issues.apache.org/jira/browse/SUBMARINE-1378?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
cdmikechen updated SUBMARINE-1378:
----------------------------------
Issue Type: Improvement (was: Bug)
> The current state of the experiment should be further refined
> -------------------------------------------------------------
>
> Key: SUBMARINE-1378
> URL: https://issues.apache.org/jira/browse/SUBMARINE-1378
> Project: Apache Submarine
> Issue Type: Improvement
> Components: experiment
> Reporter: cdmikechen
> Priority: Major
>
> In some failure cases (e.g. the image cannot be pulled), Submarine does not pick up the actual task status, so the experiment is always shown as running.
> For example, when an image cannot be pulled, the actual job status is as follows.
> {code}
> status:
>   conditions:
>     - lastProbeTime: '2023-04-01T03:50:53Z'
>       reason: PodInitializing
>       type: Waiting
>     - lastProbeTime: '2023-04-01T03:50:39Z'
>       message: >-
>         rpc error: code = Unknown desc = error pulling image configuration: Get
>         "https://production.cloudflare.docker.com/registry-v2/docker/registry/v2/blobs/sha256/5c/5ccab874feb97b32099f72978f97c8e7d129fbe7577464ad49b43f58f693ca90/data?verify=1680324025-7lKdJkTa1waOdofNoPtnsjwv%2FIQ%3D":
>         EOF
>       reason: ErrImagePull
>       type: Waiting
>     - lastProbeTime: '2023-04-01T03:49:58Z'
>       message: >-
>         Back-off pulling image
>         "apache/submarine:jupyter-notebook-0.8.0-SNAPSHOT"
>       reason: ImagePullBackOff
>       type: Waiting
>     - lastProbeTime: '2023-04-01T03:49:57Z'
>       message: >-
>         rpc error: code = Unknown desc = Error response from daemon: Head
>         "https://registry-1.docker.io/v2/apache/submarine/manifests/jupyter-notebook-0.8.0-SNAPSHOT":
>         Get
>         "https://auth.docker.io/token?scope=repository%3Aapache%2Fsubmarine%3Apull&service=registry.docker.io":
>         EOF
>       reason: ErrImagePull
>       type: Waiting
>     - lastProbeTime: '2023-04-01T03:49:54Z'
>       reason: PodInitializing
>       type: Waiting
>   containerState:
>     waiting:
>       reason: PodInitializing
>   readyReplicas: 0
> {code}
> Therefore, we should refine the experiment status so that cases like this are no longer reported as running.
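
One way to picture the refinement asked for above is a small mapping from the reported conditions to a more specific state. The following is only a minimal sketch, not Submarine's implementation: the function name `refine_status`, the state names (`ImagePullFailed`, `Waiting`, `Unknown`) and the set of reasons treated as image pull failures are all hypothetical, and the sample input is a trimmed subset of the status shown in the description.

```python
from typing import Any, Dict, Optional, Tuple

# Assumption: these waiting reasons are treated as image pull failures.
# The set is illustrative, not something Submarine defines.
IMAGE_PULL_REASONS = {"ErrImagePull", "ImagePullBackOff"}


def _latest(conditions):
    # lastProbeTime values share one ISO-8601 UTC format, so comparing
    # them as strings orders them chronologically.
    return max(conditions, key=lambda c: c.get("lastProbeTime", ""))


def refine_status(status: Dict[str, Any]) -> Tuple[str, Optional[str]]:
    """Map a status block to a (state, detail) pair (hypothetical states)."""
    # At least one ready replica: the workload really is running.
    if status.get("readyReplicas", 0) > 0:
        return ("Running", None)

    conditions = status.get("conditions") or []

    # Any image pull failure is surfaced, even if a newer condition
    # (e.g. PodInitializing during a retry) would otherwise hide it.
    pull_errors = [c for c in conditions if c.get("reason") in IMAGE_PULL_REASONS]
    if pull_errors:
        return ("ImagePullFailed", _latest(pull_errors).get("message"))

    # Otherwise report the most recent waiting reason.
    waiting = [c for c in conditions if c.get("type") == "Waiting"]
    if waiting:
        return ("Waiting", _latest(waiting).get("reason"))

    return ("Unknown", None)


# Trimmed subset of the status from the description.
sample = {
    "conditions": [
        {"lastProbeTime": "2023-04-01T03:50:53Z",
         "reason": "PodInitializing", "type": "Waiting"},
        {"lastProbeTime": "2023-04-01T03:49:58Z",
         "message": 'Back-off pulling image "apache/submarine:jupyter-notebook-0.8.0-SNAPSHOT"',
         "reason": "ImagePullBackOff", "type": "Waiting"},
    ],
    "containerState": {"waiting": {"reason": "PodInitializing"}},
    "readyReplicas": 0,
}

print(refine_status(sample))
```

In this sketch a pull failure takes precedence over a newer `PodInitializing` condition, since the newest entry in the example only reflects a retry. Whether a retrying pull should count as failed or as still pending is a design decision left open here.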
--
This message was sent by Atlassian Jira
(v8.20.10#820010)