Posted to dev@ambari.apache.org by "Jaimin D Jetly (JIRA)" <ji...@apache.org> on 2013/09/05 19:34:52 UTC

[jira] [Commented] (AMBARI-3112) Security wizard: disabling security does not return to initial condition after enabling security fails.

    [ https://issues.apache.org/jira/browse/AMBARI-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13759249#comment-13759249 ] 

Jaimin D Jetly commented on AMBARI-3112:
----------------------------------------

In a secure cluster, the DataNode runs as root, so its pid file is owned by root. Whenever we stop a Hadoop component, we explicitly delete its pid file. This issue arises because the server never sends a DataNode stop task to the agent while disabling security (the DataNode never started successfully, but the pid file had already been created by the root user). Then, while starting the DataNode in the disable-security wizard, the hdfs user tries to write to a pid file owned by root, and the DataNode start task fails with a permission error.

As part of the fix, we delete the DataNode pid file every time before starting the DataNode. As a sanity check, we do so only if the process id written in the pid file is not running.
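
For illustration, a minimal Python sketch of that sanity check is below; the pid file path and function name are hypothetical, not taken from the actual patch:
{code}
import errno
import os

# Hypothetical path for illustration; the real pid file location is
# configured via HADOOP_PID_DIR.
DATANODE_PID_FILE = "/var/run/hadoop/hdfs/hadoop-hdfs-datanode.pid"

def remove_stale_pid_file(pid_file):
    """Delete the pid file only if the recorded process is not running."""
    if not os.path.isfile(pid_file):
        return
    try:
        with open(pid_file) as f:
            pid = int(f.read().strip())
    except ValueError:
        # Corrupt or empty pid file: treat it as stale and remove it.
        os.remove(pid_file)
        return
    try:
        # Signal 0 performs the existence check without actually
        # sending a signal to the process.
        os.kill(pid, 0)
    except OSError as e:
        if e.errno == errno.ESRCH:
            # No such process: the pid file is stale. Remove it (the
            # agent runs as root, so root ownership is not an obstacle).
            os.remove(pid_file)
    # Otherwise the process is alive, so leave the pid file in place.

remove_stale_pid_file(DATANODE_PID_FILE)
{code}
Deleting the stale root-owned file up front lets the subsequent start, running as the hdfs user, create a fresh pid file that it owns.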
                
> Security wizard: disabling security does not return to initial condition after enabling security fails.
> -------------------------------------------------------------------------------------------------------
>
>                 Key: AMBARI-3112
>                 URL: https://issues.apache.org/jira/browse/AMBARI-3112
>             Project: Ambari
>          Issue Type: Bug
>          Components: agent
>    Affects Versions: 1.4.1
>            Reporter: Jaimin D Jetly
>            Assignee: Jaimin D Jetly
>              Labels: security
>             Fix For: 1.4.1
>
>         Attachments: AMBARI-3112.patch
>
>
> *Steps to reproduce:*
> 1. Enable security WITHOUT pre-configuring Kerberos on the cluster and observe failures at the "3. Start Services" step;
> 2. Disable security.
> In the end, the DataNode fails on ALL hosts and cannot be started.
> Attempting to start the DataNode manually also ends with an error:
> {code}err: /Stage[2]/Hdp-hadoop::Datanode/Hdp-hadoop::Service[datanode]/Hdp::Exec[su - hdfs -c  'export HADOOP_LIBEXEC_DIR=/usr/lib/hadoop/libexec && /usr/lib/hadoop/sbin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode']/Exec[su - hdfs -c  'export HADOOP_LIBEXEC_DIR=/usr/lib/hadoop/libexec && /usr/lib/hadoop/sbin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode']/returns: change from notrun to 0 failed: su - hdfs -c  'export HADOOP_LIBEXEC_DIR=/usr/lib/hadoop/libexec && /usr/lib/hadoop/sbin/hadoop-daemon.sh --config /etc/hadoop/conf start datanode' returned 1 instead of one of [0] at /var/lib/ambari-agent/puppet/modules/hdp/manifests/init.pp:479{code}

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira