Posted to dev@bigtop.apache.org by kwmonroe <gi...@git.apache.org> on 2016/11/17 00:17:07 UTC

[GitHub] bigtop pull request #162: BIGTOP-2570: ensure bigtop services are started

GitHub user kwmonroe opened a pull request:

    https://github.com/apache/bigtop/pull/162

    BIGTOP-2570: ensure bigtop services are started

    We need to make sure our bigtop services are actually started before setting the `.started` state.
    
    Without this fix, our NameNode (NN) will set `.started` even if `host.service_restart('hadoop-hdfs-namenode')` fails.  This is bad because our slave/datanode units will then attempt operations (like relating to the NN) that cannot succeed while the service is down.
    
    This doesn't help us know why the service failed to start, but it does prevent other charms from relating to an unready application.
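
A minimal sketch of the gating described above, not the charm's actual code. It assumes the restart call returns a truthy value on success, which is how the diff in this pull request uses `host.service_restart`. The function `start_service`, the `flags` set, and the `restart` and `log` parameters are hypothetical stand-ins so the example runs without charmhelpers or a Juju environment.

```python
from typing import Callable, Set


def start_service(
    service: str,
    started_flag: str,
    restart: Callable[[str], bool],
    flags: Set[str],
    log: Callable[[str], None] = print,
) -> bool:
    """Restart `service`; record `started_flag` only if the restart succeeded.

    `restart` is a hypothetical stand-in for a call such as
    host.service_restart, assumed to return True on success.
    `flags` is a hypothetical stand-in for the charm's reactive states.
    """
    if restart(service):
        # Only advertise readiness once the service is known to be up.
        flags.add(started_flag)
        return True
    # Make sure nothing downstream sees a stale "started" flag.
    log('{} failed to start'.format(service))
    flags.discard(started_flag)
    return False


if __name__ == '__main__':
    states = set()
    # Simulate a failing restart: the flag must not be set.
    start_service('hadoop-hdfs-namenode', 'namenode.started',
                  lambda name: False, states)
    print(sorted(states))
```

With a restart callable that returns False, the flag is never added, so anything keyed on that flag stays idle instead of relating to an unready service.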

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/juju-solutions/bigtop bug/BIGTOP-2570/tweak-service-started

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/bigtop/pull/162.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #162
    
----
commit 70f4eb2ae7b61da4014270016dbf854e76f44405
Author: Kevin W Monroe <ke...@canonical.com>
Date:   2016-11-17T00:08:27Z

    do not set NN and RM .started states unless we know the service is started

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

[GitHub] bigtop issue #162: BIGTOP-2570: ease hadoop charm debugging

Posted by kwmonroe <gi...@git.apache.org>.
GitHub user kwmonroe commented on the issue:

    https://github.com/apache/bigtop/pull/162
  
    Though this started with a systemctl service that failed to start, it has turned into a crusade to make charm actions/status/logging better so we can more easily debug problems like this in the future.
    
    In addition to handling `start_foo` better, I added more logging, corrected some bad status messages, and updated actions to better log their output.  I've updated the JIRA/PR title to reflect this extra scope.



[GitHub] bigtop issue #162: BIGTOP-2570: ease hadoop charm debugging

Posted by johnsca <gi...@git.apache.org>.
GitHub user johnsca commented on the issue:

    https://github.com/apache/bigtop/pull/162
  
    LGTM as well :+1: 



[GitHub] bigtop pull request #162: BIGTOP-2570: ease hadoop charm debugging

Posted by johnsca <gi...@git.apache.org>.
GitHub user johnsca commented on a diff in the pull request:

    https://github.com/apache/bigtop/pull/162#discussion_r88919639
  
    --- Diff: bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/reactive/resourcemanager.py ---
    @@ -131,16 +131,28 @@ def send_nn_spec(namenode):
     @when_not('apache-bigtop-resourcemanager.started')
     def start_resourcemanager(namenode):
         hookenv.status_set('maintenance', 'starting resourcemanager')
    -    # NB: service should be started by install, but this may be handy in case
    -    # we have something that removes the .started state in the future. Also
    -    # note we restart here in case we modify conf between install and now.
    -    host.service_restart('hadoop-yarn-resourcemanager')
    -    host.service_restart('hadoop-mapreduce-historyserver')
    -    for port in get_layer_opts().exposed_ports('resourcemanager'):
    -        hookenv.open_port(port)
    -    set_state('apache-bigtop-resourcemanager.started')
    -    hookenv.application_version_set(get_hadoop_version())
    -    hookenv.status_set('maintenance', 'resourcemanager started')
    +    # NB: service should be started by install, but we want to verify it is
    +    # running before we set the .started state and open ports. We always
    +    # restart here, which may seem heavy-handed. However, restart works
    +    # whether the service is currently started or stopped. It also ensures the
    +    # service is using the most current config.
    +    rm_started = host.service_restart('hadoop-yarn-resourcemanager')
    +    if rm_started:
    +        for port in get_layer_opts().exposed_ports('resourcemanager'):
    +            hookenv.open_port(port)
    +        set_state('apache-bigtop-resourcemanager.started')
    +        hookenv.status_set('maintenance', 'resourcemanager started')
    +        hookenv.application_version_set(get_hadoop_version())
    +    else:
    +        hookenv.log('YARN ResourceManager failed to start')
    +        hookenv.status_set('blocked', 'resourcemanager failed to start')
    +        remove_state('apache-bigtop-resourcemanager.started')
    --- End diff --
    
    No, this will prevent related services from trying to use the RM before it is started.



[GitHub] bigtop issue #162: BIGTOP-2570: ease hadoop charm debugging

Posted by ktsakalozos <gi...@git.apache.org>.
GitHub user ktsakalozos commented on the issue:

    https://github.com/apache/bigtop/pull/162
  
    LGTM +1



[GitHub] bigtop pull request #162: BIGTOP-2570: ease hadoop charm debugging

Posted by asfgit <gi...@git.apache.org>.
GitHub user asfgit closed the pull request at:

    https://github.com/apache/bigtop/pull/162



[GitHub] bigtop pull request #162: BIGTOP-2570: ease hadoop charm debugging

Posted by ktsakalozos <gi...@git.apache.org>.
GitHub user ktsakalozos commented on a diff in the pull request:

    https://github.com/apache/bigtop/pull/162#discussion_r88909017
  
    --- Diff: bigtop-packages/src/charm/hadoop/layer-hadoop-resourcemanager/reactive/resourcemanager.py ---
    @@ -131,16 +131,28 @@ def send_nn_spec(namenode):
     @when_not('apache-bigtop-resourcemanager.started')
     def start_resourcemanager(namenode):
         hookenv.status_set('maintenance', 'starting resourcemanager')
    -    # NB: service should be started by install, but this may be handy in case
    -    # we have something that removes the .started state in the future. Also
    -    # note we restart here in case we modify conf between install and now.
    -    host.service_restart('hadoop-yarn-resourcemanager')
    -    host.service_restart('hadoop-mapreduce-historyserver')
    -    for port in get_layer_opts().exposed_ports('resourcemanager'):
    -        hookenv.open_port(port)
    -    set_state('apache-bigtop-resourcemanager.started')
    -    hookenv.application_version_set(get_hadoop_version())
    -    hookenv.status_set('maintenance', 'resourcemanager started')
    +    # NB: service should be started by install, but we want to verify it is
    +    # running before we set the .started state and open ports. We always
    +    # restart here, which may seem heavy-handed. However, restart works
    +    # whether the service is currently started or stopped. It also ensures the
    +    # service is using the most current config.
    +    rm_started = host.service_restart('hadoop-yarn-resourcemanager')
    +    if rm_started:
    +        for port in get_layer_opts().exposed_ports('resourcemanager'):
    +            hookenv.open_port(port)
    +        set_state('apache-bigtop-resourcemanager.started')
    +        hookenv.status_set('maintenance', 'resourcemanager started')
    +        hookenv.application_version_set(get_hadoop_version())
    +    else:
    +        hookenv.log('YARN ResourceManager failed to start')
    +        hookenv.status_set('blocked', 'resourcemanager failed to start')
    +        remove_state('apache-bigtop-resourcemanager.started')
    --- End diff --
    
    I guess this is "just in case"

