Posted to builds@apache.org by Niklas Gustavsson <ni...@protocol7.com> on 2011/01/17 16:20:04 UTC

PreCommit-HDFS-Build

Hi

The following build keeps getting locked up in Hudson and requires
frequent killing. Could someone have a look at it or should we disable
it for now?

https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/

/niklas

Re: PreCommit-HDFS-Build

Posted by Giridharan Kesavan <gk...@yahoo-inc.com>.
I saw another stuck HDFS build today, and it is again stuck on the same JUnit test, TestLargeDirectoryDelete.

Just by killing this test, I could get the build job to carry on.

I see the same pattern in the previous build failure as well, so I suspect the cause is not the NFS mount.

-Giri

On Jan 17, 2011, at 9:16 AM, Nigel Daley wrote:

> Hudson does a terrible job of killing underlying processes when a build is aborted because someone killed it from the UI or it hit a timeout. For these hadoop builds, it usually means that 3 or 4 processes are left lying around that can and do interfere with subsequent jobs. It's not clear to me why they are hanging, but I suspect NFS issues on these hadoop slaves. We're going to disable NFS on a couple of them later this week and see if that helps.
>
> I try to monitor for this situation regularly and properly kill builds that seem hung. Since these are on the hadoop slaves, it doesn't impact other project builds.
>
> Cheers,
> Nige
>
>
> On Jan 17, 2011, at 7:20 AM, Niklas Gustavsson wrote:
>
> > Hi
> >
> > The following build keeps getting locked up in Hudson and requires
> > frequent killing. Could someone have a look at it or should we disable
> > it for now?
> >
> > https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/
> >
> > /niklas



Re: PreCommit-HDFS-Build

Posted by Nigel Daley <nd...@mac.com>.
Hudson does a terrible job of killing underlying processes when a build is aborted because someone killed it from the UI or it hit a timeout. For these hadoop builds, it usually means that 3 or 4 processes are left lying around that can and do interfere with subsequent jobs. It's not clear to me why they are hanging, but I suspect NFS issues on these hadoop slaves. We're going to disable NFS on a couple of them later this week and see if that helps.

I try to monitor for this situation regularly and properly kill builds that seem hung.  Since these are on the hadoop slaves, it doesn't impact other project builds.

Cheers,
Nige
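
The leftover-process problem described above can be illustrated with a small sketch. This is not how Hudson aborts builds and not something the hadoop jobs are known to use; it only shows one common POSIX technique, assuming the build command can be launched through a wrapper. The function name run_with_group_kill and its parameters are hypothetical. The idea is to start the command in its own process group and, on timeout, signal the whole group rather than only the direct child.

```python
import os
import signal
import subprocess


def run_with_group_kill(cmd, timeout_s):
    """Run cmd in its own process group; on timeout, kill the whole group.

    Hypothetical helper (POSIX only). Returns the command's exit code,
    or None if the timeout expired and the group had to be killed.
    """
    # start_new_session=True makes the child call setsid(), so the command
    # and everything it forks share a process group separate from ours.
    # The group id then equals the child's pid.
    proc = subprocess.Popen(cmd, start_new_session=True)
    try:
        return proc.wait(timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # Signal the group, not just the direct child, so forked test
        # processes and helper scripts do not survive the abort.
        os.killpg(proc.pid, signal.SIGKILL)
        proc.wait()  # reap the child so it does not linger as a zombie
        return None
```

A wrapper like this would be called with the real build command and a generous timeout, for instance run_with_group_kill(["sh", "-c", "sleep 30 & sleep 30"], 0.5) as a stand-in for a build that forks and then hangs. Processes that deliberately move themselves into another session would still escape, so this is a mitigation rather than a guarantee.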


On Jan 17, 2011, at 7:20 AM, Niklas Gustavsson wrote:

> Hi
> 
> The following build keeps getting locked up in Hudson and requires
> frequent killing. Could someone have a look at it or should we disable
> it for now?
> 
> https://hudson.apache.org/hudson/job/PreCommit-HDFS-Build/
> 
> /niklas