Posted to issues@ignite.apache.org by "Oleg Ignatenko (JIRA)" <ji...@apache.org> on 2018/09/13 16:48:00 UTC

[jira] [Comment Edited] (IGNITE-9585) Error message sometimes refers nonexisting log file when remote node fails to start

    [ https://issues.apache.org/jira/browse/IGNITE-9585?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16613614#comment-16613614 ] 

Oleg Ignatenko edited comment on IGNITE-9585 at 9/13/18 4:47 PM:
-----------------------------------------------------------------

(i) I considered several options for addressing this issue. The most convenient approach seems to be to create the log file just before calling the ignite script and fill it with a simple diagnostic message, like {{echo "Preparing to start remote node..." > logfile}} (a successful node launch would overwrite it, but that's okay). This would guarantee that the log file exists even if the node launch script breaks, which would make failures easier to investigate.

-----

(update) I just tested the above approach on my machine and it appears to work even better than I expected: not only is the log file created, it is also filled with details about what went wrong: {noformat}nohup: ignoring input
/home/gridgain/Desktop/test/ignite-master/modules/core/src/test/bin/start-nodes-custom.sh: 23:
 /home/gridgain/Desktop/test/ignite-master/modules/core/src/test/bin/start-nodes-custom.sh:
 /home/gridgain/Desktop/test/ignite-master/modules/core/src/test/bin/../../../../../bin/noignite.sh:
 not found{noformat}
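
The idea above can be sketched as a small shell helper. This is only an illustration under stated assumptions: {{start_with_log}}, its arguments and the temporary paths are hypothetical names and are not part of the Ignite scripts or of StartNodeCallableImpl. Unlike the {{echo ... > logfile}} variant above, the sketch appends the launch output to the log instead of overwriting it, so the diagnostic marker survives next to whatever the failing command prints.

```shell
#!/bin/sh
# Sketch only: start_with_log and its arguments are hypothetical names,
# not part of Ignite's scripts or of StartNodeCallableImpl.

# start_with_log SCRIPT LOG_FILE
# Creates LOG_FILE before launching SCRIPT, so the log exists even if the launch fails.
start_with_log() {
    script="$1"
    log_file="$2"

    # Create the log up front with a diagnostic marker.
    echo "Preparing to start remote node..." > "$log_file"

    # Append the launch output (stdout and stderr) to the same file.
    # The function returns the exit status of the launched command.
    "$script" >> "$log_file" 2>&1
}

# Usage: point at a script that does not exist, mirroring the noignite.sh experiment.
demo_dir=$(mktemp -d)
demo_log="$demo_dir/node.log"
demo_status=0
start_with_log "$demo_dir/noignite.sh" "$demo_log" || demo_status=$?

echo "exit status: $demo_status"
cat "$demo_log"
```

With this arrangement, an error message that points readers to the log always points to a file that exists, whether or not the launch script could be run.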


> Error message sometimes refers nonexisting log file when remote node fails to start
> -----------------------------------------------------------------------------------
>
>                 Key: IGNITE-9585
>                 URL: https://issues.apache.org/jira/browse/IGNITE-9585
>             Project: Ignite
>          Issue Type: Bug
>    Affects Versions: 2.6
>            Reporter: Oleg Ignatenko
>            Assignee: Oleg Ignatenko
>            Priority: Minor
>              Labels: MakeTeamcityGreenAgain
>             Fix For: 2.7
>
>
> TeamCity build logs sometimes refer to remote node log files that aren't present in the build artifacts ([example|https://ci.ignite.apache.org/viewLog.html?buildTypeId=IgniteTests24Java8_StartNodes&buildId=1849937&branch_IgniteTests24Java8_StartNodes=%3Cdefault%3E]).
> I managed to reproduce this on my machine (details below). The typical root cause appears to be the error message from [StartNodeCallableImpl|https://github.com/apache/ignite/blob/master/modules/ssh/src/main/java/org/apache/ignite/internal/util/nodestart/StartNodeCallableImpl.java], which refers readers to a file that doesn't exist (and was never created in the first place).
> {code:java}
>             return new ClusterStartNodeResultImpl(spec.host(), false, "Remote node could not start. " +
>                 "See log for details: " + scriptOutputPath);
> {code}
> This is quite painful when investigating node launch failures, because the misleading message makes one waste time on a problem that doesn't exist (it looks as if the log file was there but then disappeared for some mysterious reason).
> ----
> To reproduce the issue locally, first modify the file [start-nodes-custom.sh|https://github.com/apache/ignite/blob/master/modules/core/src/test/bin/start-nodes-custom.sh] by replacing {{ignite.sh}} with the name of a script that doesn't exist, e.g. {{noignite.sh}}. Then run the unit test [IgniteProjectionStartStopRestartSelfTest|https://github.com/apache/ignite/blob/master/modules/ssh/src/test/java/org/apache/ignite/internal/IgniteProjectionStartStopRestartSelfTest.java] and study its results and logs.
> You will find that {{testCustomScript}} fails, which is expected because the script to be executed has been renamed to one that doesn't exist. You will also find that the log file for the respective node hasn't been created, which is also expected because the shell command fails before creating it. At the same time, however, the test log refers to that file as if it existed.


