Posted to user@ant.apache.org by cprice <cp...@its.to> on 2007/02/07 23:37:29 UTC

Ant sshexec random failures

Hi there;

I run a deploy system using ant with sshexec and scp. My deploy consists of
sshexec calls to shut down running applications on remote servers, copy of
files via scp to remote servers, and then subsequent sshexec calls to start
up applications on remote servers. 

My problem is that my ant sshexec calls fail randomly with "Remote command
failed with exit status -1". I'll have 20 successful deploys in a row, a
failure, 3 more successes, 2 failures, etc. What seems to be happening is
that the failing sshexec tasks do not attempt to make a connection to the
remote system. I've established this by watching strace and lsof output of
the running ant process and noting the absence of a TCP socket connection to
the remote system. Also, the instant I get the remote command failed error,
I can connect via ssh by hand (CLI) to the remote system with no issues. I
have also run scripts that continually run remote commands against all
remote systems while running the ant deploy.

My deploy system is rhel4, OpenSSH_3.9p1, OpenSSL 0.9.7a Feb 19 2003.

Target systems are a mixture of rhel3 (OpenSSH_3.6.1p2, SSH protocols
1.5/2.0, OpenSSL 0x0090701f) and windows 2003 server running cygwin
(OpenSSH_4.3p2, OpenSSL 0.9.8b 04 May 2006).

I was originally seeing this behaviour under ant-1.6.5, so I upgraded to
ant-1.7.0, with no change. I also upgraded the jsch jar used by my ant-1.7.0
install to jsch-0.1.31.jar, based on some anecdotal evidence I saw in the
jsch changelog.

My ant install was running under jdk 1.4.2_11, which we upgraded to jdk
1.6.0, also with no change.

ANY help you can provide will be greatly appreciated.

Cheers,
Chris



-- 
View this message in context: http://www.nabble.com/Ant-sshexec-random-failuresrun-a-fai-tf3190085.html#a8855696
Sent from the Ant - Users mailing list archive at Nabble.com.


---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscribe@ant.apache.org
For additional commands, e-mail: user-help@ant.apache.org


Re: Ant sshexec random failures

Posted by pdmckenzie <pe...@netfocusconsulting.com>.
Replying for cprice - the patch below definitely seems to have made the
problem much less noticeable.  Before applying this patch, we had a shell
script that ssh'ed to a remote box continuously overnight, and it failed
around 7 to 8 times.  Now it doesn't fail at all.
Thanks!




Atsuhiko Yamanaka-2 wrote:
> 
> Hi,
> 
> 2007/6/1, ken77 <kg...@cmtek.com>:
>> I am having exactly the same problem with sshexec: the same script
>> sometimes works, and sometimes it gives me "Remote command failed with
>> exit status -1". I was wondering if you have found the reason for this
>> and a possible solution?
> 
> May I ask you to try the attached patch?  It is a patch for SVN HEAD.
> 
> diff -Naur ant/src/main/org/apache/tools/ant/taskdefs/optional/ssh/SSHExec.java ant.new/src/main/org/apache/tools/ant/taskdefs/optional/ssh/SSHExec.java
> --- ant/src/main/org/apache/tools/ant/taskdefs/optional/ssh/SSHExec.java 2007-06-01 10:29:07.000000000 +0000
> +++ ant.new/src/main/org/apache/tools/ant/taskdefs/optional/ssh/SSHExec.java 2007-06-01 15:54:56.000000000 +0000
> @@ -186,7 +186,7 @@
>              thread =
>                  new Thread() {
>                      public void run() {
> -                        while (!channel.isEOF()) {
> +                        while (!channel.isClosed()) {
>                              if (thread == null) {
>                                  return;
>                              }
> 
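
For readers wondering why a one-word change in the polling loop could
matter, here is a minimal sketch. It assumes that the server's exit-status
message can arrive after EOF but before the channel is closed; that ordering
is an assumption about the cause, not something confirmed in this thread.
FakeChannel is a hypothetical stand-in written for illustration and is not
the JSch API, although its three method names mirror the ones the patch
touches. The sketch also assumes that the exit status reads as -1 until one
has been received, which would match the error message reported above.

```java
import java.util.function.Predicate;

public class ExitStatusRaceDemo {

    /** Hypothetical stand-in for an SSH exec channel (NOT the real JSch class). */
    static final class FakeChannel {
        private boolean eof;
        private boolean closed;
        private int exitStatus = -1; // assumed: -1 means "no exit status received yet"

        boolean isEOF() { return eof; }
        boolean isClosed() { return closed; }
        int getExitStatus() { return exitStatus; }

        /** Delivers one server message per call, in the assumed order: EOF, exit-status, CLOSE. */
        void deliver(int step, int remoteExitCode) {
            switch (step) {
                case 0: eof = true; break;
                case 1: exitStatus = remoteExitCode; break;
                case 2: closed = true; break;
                default: break; // nothing more to deliver
            }
        }
    }

    /** Polls until the given condition holds, then reads the exit status. */
    static int runAndWait(Predicate<FakeChannel> done, int remoteExitCode) {
        FakeChannel channel = new FakeChannel();
        int step = 0;
        while (!done.test(channel)) {
            channel.deliver(step++, remoteExitCode);
        }
        return channel.getExitStatus();
    }

    public static void main(String[] args) {
        // Old loop condition: stop polling as soon as EOF has been seen.
        System.out.println("wait for EOF:   " + runAndWait(c -> c.isEOF(), 0));
        // Patched loop condition: keep polling until the channel is closed.
        System.out.println("wait for close: " + runAndWait(c -> c.isClosed(), 0));
    }
}
```

Under these assumptions, waiting only for EOF reads the status before it has
arrived and reports -1 even though the remote command exited with 0, while
waiting for the close reports 0. In a real run the message timing varies,
which would fit the intermittent failures described in this thread.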


Re: Ant sshexec random failures

Posted by Andrew Goktepe <an...@gmail.com>.
I posted bug 43092 for this issue:

http://issues.apache.org/bugzilla/show_bug.cgi?id=43092

-Andrew

On 7/10/07, Kevin Jackson <fo...@gmail.com> wrote:
>
> Hi,
>
> Has anyone posted a bug on BZ for this?
>
> If we have an open bug report and a patch & test for it, we can
> perform the fix and get it into the trunk of ant for 1.7.1
>
> Thanks,
> Kev
>

Re: Ant sshexec random failures

Posted by Kevin Jackson <fo...@gmail.com>.
Hi,

Has anyone posted a bug on BZ for this?

If we have an open bug report and a patch & test for it, we can
perform the fix and get it into the trunk of ant for 1.7.1

Thanks,
Kev


Re: Ant sshexec random failures

Posted by Andrew Goktepe <an...@gmail.com>.
Regarding my earlier post, the upgrade from JSch 0.1.31 to 0.1.33 did not
make any difference. It sounds like the Ant patch will help, though.

-Andrew

On 7/10/07, ruel loehr <rl...@pointserve.com> wrote:
>
>
> I am facing this problem also.  Has a bug been opened up for this?  I'm
> unable to find it in bugzilla....
>

Re: Ant sshexec random failures

Posted by ruel loehr <rl...@pointserve.com>.
I am facing this problem also.  Has a bug been opened up for this?  I'm
unable to find it in bugzilla....





Re: Ant sshexec random failures

Posted by Atsuhiko Yamanaka <at...@gmail.com>.
Hi,

2007/6/1, ken77 <kg...@cmtek.com>:
> I am having exactly the same problem with sshexec: the same script
> sometimes works, and sometimes it gives me "Remote command failed with
> exit status -1". I was wondering if you have found the reason for this
> and a possible solution?

May I ask you to try the attached patch?  It is a patch for SVN HEAD.

Re: Ant sshexec random failures

Posted by Andrew Goktepe <an...@gmail.com>.
Same problem for me too. We have been experiencing this intermittently for
the past few months. It started happening for me again yesterday, and I am
sure that the remote script is exiting with 0 status. This seems like a JSch
bug to me. I have been running JSch 0.1.31 but I just upgraded to the new
v0.1.33. I'll update this thread with the results.

-Andrew

On 5/31/07, ken77 <kg...@cmtek.com> wrote:
>
>
> Hi Chris...
>
> I am having exactly the same problem with sshexec: the same script
> sometimes works, and sometimes it gives me "Remote command failed with
> exit status -1". I was wondering if you have found the reason for this
> and a possible solution?
>
> This is driving me crazy, since I can't find the exact (valid) reason and
> so can't explain it to my manager.
>
> thanks in advance
>
> ken

Re: Ant sshexec random failures

Posted by ken77 <kg...@cmtek.com>.
Hi Chris...

I am having exactly the same problem with sshexec: the same script
sometimes works, and sometimes it gives me "Remote command failed with
exit status -1". I was wondering if you have found the reason for this
and a possible solution?

This is driving me crazy, since I can't find the exact (valid) reason and
so can't explain it to my manager.

thanks in advance

ken 





RE: Ant sshexec random failures

Posted by "Anderson, Rob (Global Trade)" <Ro...@nike.com>.


Chris, some more info would be helpful. Please post the relevant portion
of your build.xml and the verbose output from Ant. Be sure to remove any
sensitive data. You say there are two target systems... Does it only
fail when attempting to sshexec to a particular one?

-Rob Anderson

