Posted to yarn-issues@hadoop.apache.org by "Chandni Singh (JIRA)" <ji...@apache.org> on 2018/08/01 22:33:00 UTC

[jira] [Created] (YARN-8611) With restart policy set to ON_FAILURE, the service state doesn't reach STABLE state

Chandni Singh created YARN-8611:
-----------------------------------

             Summary: With restart policy set to ON_FAILURE, the service state doesn't reach STABLE state
                 Key: YARN-8611
                 URL: https://issues.apache.org/jira/browse/YARN-8611
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Chandni Singh


 - Launched a Docker-based sleeper service with {{restart_policy = ON_FAILURE}}.
 - Some containers fail, but eventually both component instances reach the {{READY}} state.
 - However, the service state remains {{STARTED}}.

Below is the service status JSON:
{code:java}
{
    "components": [
        {
            "artifact": {
                "id": "hadoop/centos:6",
                "type": "DOCKER"
            },
            "configuration": {
                "env": {
                    "YARN_CONTAINER_RUNTIME_DOCKER_DELAYED_REMOVAL": "true",
                    "YARN_CONTAINER_RUNTIME_DOCKER_RUN_OVERRIDE_DISABLE": "true"
                },
                "files": [],
                "properties": {
                    "docker.network": "host"
                }
            },
            "containers": [
                {
                    "bare_host": "{host1}",
                    "component_instance_name": "ping-1",
                    "hostname": "ping-1.s.hbase.ycluster",
                    "id": "container_e02_1533070786532_0005_01_000003",
                    "ip": "172.26.111.21",
                    "launch_time": 1533159861113,
                    "state": "READY"
                },
                {
                    "bare_host": "{host2}",
                    "component_instance_name": "ping-0",
                    "hostname": "ping-0.s.hbase.ycluster",
                    "id": "container_e02_1533070786532_0005_01_000007",
                    "ip": "172.26.111.21",
                    "launch_time": 1533160113627,
                    "state": "READY"
                }
            ],
            "dependencies": [],
            "launch_command": "sleep 90000",
            "name": "ping",
            "number_of_containers": 2,
            "quicklinks": [],
            "resource": {
                "additional": {},
                "cpus": 1,
                "memory": "256"
            },
            "restart_policy": "ON_FAILURE",
            "run_privileged_container": false,
            "state": "STABLE"
        }
    ],
    "configuration": {
        "env": {},
        "files": [],
        "properties": {}
    },
    "id": "application_1533070786532_0005",
    "kerberos_principal": {
        "keytab": "...",
        "principal_name": "..."
    },
    "lifetime": -1,
    "name": "s",
    "quicklinks": {},
    "state": "STARTED",
    "version": "1"
}
{code}

The service state should become {{STABLE}} once all the component instances are {{READY}}.
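
The expected behavior can be written as a small check over the status JSON. This is only a sketch of the invariant described in this report, not the logic the YARN service framework actually uses: the function names {{expected_service_state}} and {{find_mismatch}} are hypothetical, and the sample is a trimmed copy of the status above.

```python
import json


def expected_service_state(status: dict) -> str:
    """State this report expects: STABLE once every component has
    at least number_of_containers instances in the READY state.
    (Assumption for illustration, not the framework's own rule.)"""
    for component in status.get("components", []):
        wanted = component.get("number_of_containers", 0)
        ready = sum(
            1
            for container in component.get("containers", [])
            if container.get("state") == "READY"
        )
        if ready < wanted:
            return "STARTED"
    return "STABLE"


def find_mismatch(status: dict):
    """Return (actual, expected) if the reported state differs, else None."""
    expected = expected_service_state(status)
    actual = status.get("state")
    return None if actual == expected else (actual, expected)


# Trimmed copy of the status from this report.
status = json.loads("""
{
    "components": [
        {
            "name": "ping",
            "number_of_containers": 2,
            "restart_policy": "ON_FAILURE",
            "state": "STABLE",
            "containers": [
                {"component_instance_name": "ping-1", "state": "READY"},
                {"component_instance_name": "ping-0", "state": "READY"}
            ]
        }
    ],
    "name": "s",
    "state": "STARTED"
}
""")

print(find_mismatch(status))  # ('STARTED', 'STABLE')
```

For the status in this report the check flags the service: both instances of {{ping}} are {{READY}} and the component is {{STABLE}}, yet the service itself is still {{STARTED}}.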


