Posted to common-issues@hadoop.apache.org by "Andras Bokor (JIRA)" <ji...@apache.org> on 2017/04/21 11:57:04 UTC

[jira] [Comment Edited] (HADOOP-13238) pid handling is failing on secure datanode

    [ https://issues.apache.org/jira/browse/HADOOP-13238?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15978504#comment-15978504 ] 

Andras Bokor edited comment on HADOOP-13238 at 4/21/17 11:56 AM:
-----------------------------------------------------------------

[~aw]

The root cause here is that JSVC deletes its own pid file, the one passed to it with the {{-pidfile}} option. So after the stop, {{cat}} fails.
Honestly, I feel HADOOP-12364 solves a bug in an external monitoring tool rather than in Hadoop. That is a pretty rare case (I cannot even imagine how it can happen), so I think it is enough here to check whether the pid file exists. If it does not, JSVC has already deleted the file, so we do not need to do the check and delete.
In addition, the error message shows up twice because both {{hadoop_stop_daemon}} and {{hadoop_stop_secure_daemon}} do the same check and delete the same pid file. The second one can be removed from the code.
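
A minimal sketch of the existence check described above, not the actual patch. The function name {{remove_pid_file_if_unchanged}}, its arguments and its return codes are assumptions made for illustration; only the warning text is modeled on the output quoted in the issue.

```shell
# Hypothetical helper (not a real Hadoop function): clean up a daemon's pid
# file after a stop, tolerating the case where jsvc already removed it.
#   $1 = path to the pid file
#   $2 = pid that was recorded before the stop
#   $3 = daemon name, used only in the warning message
remove_pid_file_if_unchanged() {
  pidfile=$1
  expected_pid=$2
  daemon_name=$3

  # jsvc deletes the file passed via -pidfile when it shuts down.
  # If the file is gone there is nothing to compare and nothing to delete.
  if [ ! -f "${pidfile}" ]; then
    return 0
  fi

  current_pid=$(cat "${pidfile}" 2>/dev/null)
  if [ "${current_pid}" = "${expected_pid}" ]; then
    # Same pid as before the stop: the file is ours, so remove it.
    rm -f "${pidfile}"
    return 0
  fi

  # Another process rewrote the file in the meantime: leave it alone.
  # Returning non-zero here is an assumption of this sketch.
  echo "WARNING: pid has changed for ${daemon_name}, skip deleting pid file" >&2
  return 1
}
```

With this shape, a call such as {{remove_pid_file_if_unchanged "${pidfile}" "${pid}" datanode}} prints nothing when the file is already missing, which is the case that currently produces the {{cat}} error.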

After my patch the test still passes. {{hadoop_stop_daemon.bats}} and {{hadoop_stop_secure_daemon.bats}} do the same test, so the first one seems unnecessary.
Also, I added a new test to prove that the pid file is deleted when everything goes well.
{code}abokor$ bats hadoop_stop_secure_daemon.bats
 ✓ hadoop_stop_secure_daemon_when_pid_file_changes
 ✓ hadoop_stop_secure_daemon_deletes_pid_file

2 tests, 0 failures{code}

Output after patch:
{code}root@abokor-practice-5:/grid/0# hadoop-3.0.0-alpha2/sbin/start-dfs.sh
Starting namenodes on [abokor-practice-2.openstacklocal]
Starting datanodes
Starting secondary namenodes [abokor-practice-5]
root@abokor-practice-5:/grid/0# hadoop-3.0.0-alpha2/sbin/stop-dfs.sh
Stopping namenodes on [abokor-practice-2.openstacklocal]
Stopping datanodes
Stopping secondary namenodes [abokor-practice-5]{code}


was (Author: boky01):
[~aw]

The root cause here is that JSVC deletes the pid file which was passed to it with the {{-pidfile}} option. So after the stop, {{cat}} fails.
Honestly, I feel HADOOP-12364 solves a bug in an external monitoring tool rather than in Hadoop. That is a pretty rare case (I cannot even imagine how it can happen), so I think it is enough here to check whether the pid file exists. If it does not, JSVC has already deleted the file, so we do not need to do the check and delete.
In addition, the error message shows up twice because both {{hadoop_stop_daemon.bats}} and {{hadoop_stop_secure_daemon.bats}} do the same check and delete the same pid file. The second one can be removed from the code.

After my patch the test still passes. {{hadoop_stop_daemon.bats}} and {{hadoop_stop_secure_daemon.bats}} do the same test, so the first seems unnecessary.
{code}abokor$ bats hadoop_stop_secure_daemon.bats
 ✓ hadoop_stop_secure_daemon

1 test, 0 failures{code}

> pid handling is failing on secure datanode
> ------------------------------------------
>
>                 Key: HADOOP-13238
>                 URL: https://issues.apache.org/jira/browse/HADOOP-13238
>             Project: Hadoop Common
>          Issue Type: Bug
>          Components: scripts, security
>            Reporter: Allen Wittenauer
>            Assignee: Andras Bokor
>
> {code}
> hdfs --daemon stop datanode
> cat: /home/hadoop/H/pids/hadoop-hdfs-root-datanode.pid: No such file or directory
> WARNING: pid has changed for datanode, skip deleting pid file
> cat: /home/hadoop/H/pids/hadoop-hdfs-root-datanode.pid: No such file or directory
> WARNING: daemon pid has changed for datanode, skip deleting daemon pid file
> {code}


