Posted to builds@apache.org by Jukka Zitting <ju...@gmail.com> on 2010/02/26 16:12:15 UTC

[hudson] Killed subversion-1.6.x-solaris build 88

Hi,

This build [1] was stuck since yesterday.

PS. The subversion-1.6.x-solaris job is also running on the Lucene
zone. Is this intentional?

[1] http://hudson.zones.apache.org/hudson/job/subversion-1.6.x-solaris/88/

BR,

Jukka Zitting

RE: [hudson] Killed subversion-1.6.x-solaris build 88

Posted by Uwe Schindler <uw...@thetaphi.de>.
Do we have any other hudson slave with solaris?

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Jukka Zitting [mailto:jukka.zitting@gmail.com]
> Sent: Friday, February 26, 2010 4:12 PM
> To: builds@apache.org
> Subject: [hudson] Killed subversion-1.6.x-solaris build 88
> 
> Hi,
> 
> This build [1] was stuck since yesterday.
> 
> PS. The subversion-1.6.x-solaris job is also running on the Lucene
> zone. Is this intentional?
> 
> [1] http://hudson.zones.apache.org/hudson/job/subversion-1.6.x-solaris/88/
> 
> BR,
> 
> Jukka Zitting


Re: [hudson] Killed subversion-1.6.x-solaris build 88

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Sun, Feb 28, 2010 at 11:18 AM, Gav... <ga...@16degrees.com.au> wrote:
>> Perhaps we should
>> reserve also hudson.zones only for tied jobs and push all untied
>> builds to the Ubuntu servers (plus perhaps an experimental on-demand
>> EC2 slave for which we may get some funding for next budget year).
>
> Isn't that defeating the purpose? Projects use Hudson.zones and
> Lucene.zones because they are testing on Solaris, or am I
> misunderstanding something?

Having tied jobs on hudson.zones would still be OK for builds that
need a Solaris environment, but I'd change the configuration so that
*untied* builds would no longer run on hudson.zones by default
(currently they run on hudson.zones, minerva or vesta, depending on
where an available executor is found).
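
As a rough illustration of the proposed policy (a sketch only, not Hudson's
actual scheduler or API): the node names come from this thread, while `Job`,
`eligible_nodes` and the `reserve_solaris` flag are hypothetical names made up
for the example.

```python
from dataclasses import dataclass
from typing import List, Optional

# Node names are taken from the thread; everything else here is hypothetical.
ALL_NODES = ["hudson.zones", "minerva", "vesta"]
RESERVED_FOR_TIED = {"hudson.zones"}


@dataclass
class Job:
    name: str
    tied_to: Optional[str] = None  # node the job is tied to, if any


def eligible_nodes(job: Job, reserve_solaris: bool) -> List[str]:
    """Return the nodes on which a job may be scheduled."""
    if job.tied_to is not None:
        # A tied job only ever runs on the node it is tied to.
        return [job.tied_to]
    if reserve_solaris:
        # Proposed behaviour: untied jobs skip the reserved Solaris zone.
        return [n for n in ALL_NODES if n not in RESERVED_FOR_TIED]
    # Current behaviour: untied jobs run wherever an executor is free.
    return list(ALL_NODES)
```

With `reserve_solaris=True`, an untied job can only land on minerva or vesta,
while a job tied to hudson.zones still runs there.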

BR,

Jukka Zitting

RE: [hudson] Killed subversion-1.6.x-solaris build 88

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi Norman,

Thanks for helping. Now lucene.zones is usable again:
  7:22pm  up 52 day(s), 16:57,  1 user,  load average: 1.13, 2.02, 1.86

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Norman Maurer [mailto:norman.maurer@googlemail.com]
> Sent: Monday, March 01, 2010 10:27 AM
> To: Uwe Schindler
> Cc: builds@apache.org; Apache Infrastructure
> Subject: Re: [hudson] Killed subversion-1.6.x-solaris build 88
> 
> Hi Uwe,
> 
> the lucene and hudson zones are not the same. They are not even on the
> same physical server. Anyway, the high load was caused by the lenya
> zone, which is located on the same server as the lucene zone.
> 
> Hope it will work out better now.
> 
> 
> Have fun,
> Norman
> 
> 
> 2010/3/1 Uwe Schindler <uw...@thetaphi.de>:
> >> Note that (to answer Uwe's question also) I created an Infra issue
> >> back in December for a new Solaris zone for Hudson and Buildbot.
> >>
> >> http://issues.apache.org/jira/browse/INFRA-2360
> >>
> >> I hope to get the new Hudson zone up and running in the next day or
> >> two, which should alleviate pressure from the other two.
> >
> > I am not sure if this would solve the load problems on lucene.zones:
> >
> > load averages: 50.76, 50.55, 47.90                                     08:21:02
> > 32 processes:  29 sleeping, 1 running, 2 on cpu
> > CPU states:  0.0% idle, 50.2% user, 49.8% kernel,  0.0% iowait,  0.0% swap
> >
> > From my Solaris knowledge, the load shown in the output of "top" is
> > not from this zone alone; it is the load of the whole physical machine
> > (and one running Java compilation process cannot create a load of 50).
> > Nothing more is running on that zone at the moment.
> >
> > So my question is rather: what else is running on the *physical*
> > machine that eats up all the CPU resources?
> >
> > Uwe
> >
> >


Re: [hudson] Killed subversion-1.6.x-solaris build 88

Posted by Norman Maurer <no...@googlemail.com>.
Hi Uwe,

the lucene and hudson zones are not the same. They are not even on the
same physical server. Anyway, the high load was caused by the lenya
zone, which is located on the same server as the lucene zone.

Hope it will work out better now.


Have fun,
Norman


2010/3/1 Uwe Schindler <uw...@thetaphi.de>:
>> Note that (to answer Uwe's question also) I created an Infra issue
>> back in December for a new Solaris zone for Hudson and Buildbot.
>>
>> http://issues.apache.org/jira/browse/INFRA-2360
>>
>> I hope to get the new Hudson zone up and running in the next day or
>> two, which should alleviate pressure from the other two.
>
> I am not sure if this would solve the load problems on lucene.zones:
>
> load averages: 50.76, 50.55, 47.90                                     08:21:02
> 32 processes:  29 sleeping, 1 running, 2 on cpu
> CPU states:  0.0% idle, 50.2% user, 49.8% kernel,  0.0% iowait,  0.0% swap
>
> From my Solaris knowledge, the load shown in the output of "top" is not from this zone alone; it is the load of the whole physical machine (and one running Java compilation process cannot create a load of 50). Nothing more is running on that zone at the moment.
>
> So my question is rather: what else is running on the *physical* machine that eats up all the CPU resources?
>
> Uwe
>
>

RE: [hudson] Killed subversion-1.6.x-solaris build 88

Posted by Uwe Schindler <uw...@thetaphi.de>.
> Note that (to answer Uwe's question also) I created an Infra issue back
> in December for a new Solaris zone for Hudson and Buildbot.
> 
> http://issues.apache.org/jira/browse/INFRA-2360
> 
> I hope to get the new Hudson zone up and running in the next day or
> two, which should alleviate pressure from the other two.

I am not sure if this would solve the load problems on lucene.zones:

load averages: 50.76, 50.55, 47.90                                     08:21:02
32 processes:  29 sleeping, 1 running, 2 on cpu
CPU states:  0.0% idle, 50.2% user, 49.8% kernel,  0.0% iowait,  0.0% swap

From my Solaris knowledge, the load shown in the output of "top" is not from this zone alone; it is the load of the whole physical machine (and one running Java compilation process cannot create a load of 50). Nothing more is running on that zone at the moment.

So my question is rather: what else is running on the *physical* machine that eats up all the CPU resources?
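
The argument above can be sketched as a small check. This is illustrative
only: `parse_top_header` and `unexplained_load` are hypothetical helper names,
and treating each running or on-cpu process as contributing at most about 1 to
the load average is a simplifying assumption.

```python
import re

# Header lines copied from the "top" output quoted in this message.
TOP_HEADER = """\
load averages: 50.76, 50.55, 47.90                                     08:21:02
32 processes:  29 sleeping, 1 running, 2 on cpu
"""


def parse_top_header(text):
    """Return (1-minute load average, processes running or on cpu)."""
    load = re.search(r"load averages:\s*([\d.]+)", text)
    if load is None:
        raise ValueError("no 'load averages:' line found")
    running = re.search(r"(\d+) running", text)
    on_cpu = re.search(r"(\d+) on cpu", text)
    # Count only the process states that were actually reported.
    active = sum(int(m.group(1)) for m in (running, on_cpu) if m)
    return float(load.group(1)), active


def unexplained_load(text):
    """Load that the locally visible active processes cannot account for."""
    load1, active = parse_top_header(text)
    return max(0.0, load1 - active)


if __name__ == "__main__":
    print(unexplained_load(TOP_HEADER))
```

For the output quoted above, the three active processes visible in the zone
leave a load of well over 40 unaccounted for, which is what points at the rest
of the physical machine.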

Uwe


RE: [hudson] Killed subversion-1.6.x-solaris build 88

Posted by "Gav..." <ga...@16degrees.com.au>.

> -----Original Message-----
> From: Jukka Zitting [mailto:jukka.zitting@gmail.com]
> Sent: Sunday, 28 February 2010 7:43 PM
> To: builds@apache.org
> Subject: Re: [hudson] Killed subversion-1.6.x-solaris build 88
> 
> Hi,
> 
> On Sat, Feb 27, 2010 at 1:15 AM, Uwe Schindler <uw...@thetaphi.de> wrote:
> > I killed the slave process from the zone's shell and restarted using
> > the GUI, but no progress.
> 
> I had to repeat the same thing a few times yesterday evening, but now
> the slave seems to be back in business.
> 
> > The machine with all zones has now a load of 46 (!).
> 
> I wonder if hudson.zones is on the same server.

Not that I can tell. Looking at the docs, Hudson is on Hyperion along with
James, perl, Servicemix, quetz and confluence-test.

confluence-test is not in use and due to be deleted.


> Perhaps we should
> reserve also hudson.zones only for tied jobs and push all untied
> builds to the Ubuntu servers (plus perhaps an experimental on-demand
> EC2 slave for which we may get some funding for next budget year).

Isn't that defeating the purpose? Projects use Hudson.zones and
Lucene.zones because they are testing on Solaris, or am I
misunderstanding something?

Note that (to answer Uwe's question also) I created an Infra issue back in
December for a new Solaris zone for Hudson and Buildbot.

http://issues.apache.org/jira/browse/INFRA-2360

I hope to get the new Hudson zone up and running in the next day or two,
which should alleviate pressure from the other two.

Gav...

> 
> BR,
> 
> Jukka Zitting



RE: [hudson] Killed subversion-1.6.x-solaris build 88

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi Jukka,

> > The machine with all zones has now a load of 46 (!).
> 
> I wonder if hudson.zones is on the same server. Perhaps we should
> reserve also hudson.zones only for tied jobs and push all untied
> builds to the Ubuntu servers (plus perhaps an experimental on-demand
> EC2 slave for which we may get some funding for next budget year).

As far as I know, hudson.zones and lucene.zones are on the same physical Solaris machine. But the load on this machine seems to be increasing all the time: before Christmas we constantly had a load of 16, one month ago about 30, and now 46. The Lucene/Solr builds are getting slower each day. One positive thing is that this slowdown caused some concurrency bugs in Lucene (thread starvation) to appear, and we were able to fix them.

There seem to be a lot of new zones, or some of the zones are going crazy. As a zone user I have no way to find out what jobs are running in other zones; I can only see the load. Maybe infra could look at the machine and give some hints about what is using this server so heavily now?

Uwe


Re: [hudson] Killed subversion-1.6.x-solaris build 88

Posted by Jukka Zitting <ju...@gmail.com>.
Hi,

On Sat, Feb 27, 2010 at 1:15 AM, Uwe Schindler <uw...@thetaphi.de> wrote:
> I killed the slave process from the zone's shell and restarted using the GUI, but no progress.

I had to repeat the same thing a few times yesterday evening, but now
the slave seems to be back in business.

> The machine with all zones has now a load of 46 (!).

I wonder if hudson.zones is on the same server. Perhaps we should
reserve also hudson.zones only for tied jobs and push all untied
builds to the Ubuntu servers (plus perhaps an experimental on-demand
EC2 slave for which we may get some funding for next budget year).

BR,

Jukka Zitting

RE: [hudson] Killed subversion-1.6.x-solaris build 88

Posted by Uwe Schindler <uw...@thetaphi.de>.
The whole lucene-zones slave does not seem to respond anymore. The build after subversion was nutch, which was also stuck; it did not respond to a cancel request. I killed the slave process from the zone's shell and restarted using the GUI, but no progress.

The machine with all zones has now a load of 46 (!).

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Jukka Zitting [mailto:jukka.zitting@gmail.com]
> Sent: Friday, February 26, 2010 4:12 PM
> To: builds@apache.org
> Subject: [hudson] Killed subversion-1.6.x-solaris build 88
> 
> Hi,
> 
> This build [1] was stuck since yesterday.
> 
> PS. The subversion-1.6.x-solaris job is also running on the Lucene
> zone. Is this intentional?
> 
> [1] http://hudson.zones.apache.org/hudson/job/subversion-1.6.x-solaris/88/
> 
> BR,
> 
> Jukka Zitting