Posted to dev@bigtop.apache.org by Julien Eid <je...@gmail.com> on 2014/07/08 22:04:10 UTC

CentOS build jenkins slave broken

http://bigtop01.cloudera.org:8080/job/Bigtop-trunk-Hadoop/label=centos6/667/console


The JRE keeps crashing for builds and if you go to
http://bigtop01.cloudera.org:8080/job/Bigtop-trunk-Hadoop/label=centos6/ws/
the Jenkins slave is having lots of trouble.

Can we see if a Jenkins slave restart fixes these issues and then kick off
a build?

Re: CentOS build jenkins slave broken

Posted by Mark Grover <ma...@apache.org>.
I am able to ssh now.

Not sure if it was the reboot or Cos' magical hands, but life is much better
now :-)


On Tue, Jul 8, 2014 at 9:30 PM, Roman Shaposhnik <ro...@shaposhnik.org>
wrote:

> On Tue, Jul 8, 2014 at 4:59 PM, Mark Grover <ma...@apache.org> wrote:
> > Thanks Roman.
> >
> > FWIW, I still can't ssh or ping ec2-54-221-150-123.compute-1.amazonaws.com,
> > all the other instances are just fine.
>
> Ping never worked, but ssh works just fine for me. Your key
> mgrover@mgrover-MBP.local
> seems to be in place. Not quite sure what else could be happening.
>
> > But, even barring that, all the few CentOS6 builds I checked were failing
> > with the same error:
>
> /tmp got clogged again. I think we just need to add rm -rf /var/tmp/* /tmp/* to
> our top level nightly job.
>
> Thanks,
> Roman.
>

Re: CentOS build jenkins slave broken

Posted by Roman Shaposhnik <ro...@shaposhnik.org>.
On Tue, Jul 8, 2014 at 4:59 PM, Mark Grover <ma...@apache.org> wrote:
> Thanks Roman.
>
> FWIW, I still can't ssh or ping ec2-54-221-150-123.compute-1.amazonaws.com,
> all the other instances are just fine.

Ping never worked, but ssh works just fine for me. Your key
mgrover@mgrover-MBP.local
seems to be in place. Not quite sure what else could be happening.

> But, even barring that, all the few CentOS6 builds I checked were failing
> with the same error:

/tmp got clogged again. I think we just need to add rm -rf /var/tmp/* /tmp/* to
our top level nightly job.

Thanks,
Roman.
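
A minimal sketch of what such a cleanup step could look like, not the job's actual configuration. Instead of the blanket rm -rf suggested above, it swaps in an age-based find so that files belonging to builds still running are less likely to be removed. The helper name clean_tmp_dir and the MAX_AGE_MINUTES variable are assumptions made for illustration, and the find options used (-mindepth, -maxdepth, -mmin) assume GNU find, as shipped on CentOS.

```shell
# Hypothetical pre-build cleanup helper for a nightly job.
# Removes top-level entries older than MAX_AGE_MINUTES from a directory,
# leaving the directory itself in place.
MAX_AGE_MINUTES="${MAX_AGE_MINUTES:-1440}"   # assumed default: one day

clean_tmp_dir() {
    dir="$1"
    # Nothing to do if the directory does not exist.
    [ -d "$dir" ] || return 0
    # -mindepth 1 protects the directory itself; -maxdepth 1 only inspects
    # top-level entries; -xdev stays on the same filesystem.
    find "$dir" -mindepth 1 -maxdepth 1 -xdev \
        -mmin +"$MAX_AGE_MINUTES" -exec rm -rf -- {} +
}

# Usage in the job's shell step (not executed here):
#   clean_tmp_dir /tmp
#   clean_tmp_dir /var/tmp
```

Checking free space first with df -P /tmp would show whether the cleanup is needed at all.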

Re: CentOS build jenkins slave broken

Posted by Mark Grover <ma...@apache.org>.
Thanks Roman.

FWIW, I still can't ssh or ping ec2-54-221-150-123.compute-1.amazonaws.com,
all the other instances are just fine.

ping ec2-54-221-150-123.compute-1.amazonaws.com
PING ec2-54-221-150-123.compute-1.amazonaws.com (54.221.150.123): 56 data bytes
Request timeout for icmp_seq 0
Request timeout for icmp_seq 1
Request timeout for icmp_seq 2

But, even barring that, all the few CentOS6 builds I checked were failing
with the same error:

19:40:24  [INFO] --- maven-source-plugin:2.2.1:jar-no-fork (attach-sources) @ hbase-common ---
20:51:30  #
20:51:30  # A fatal error has been detected by the Java Runtime Environment:
20:51:30  #
20:51:30  #  Internal Error (safepoint.cpp:309), pid=19146, tid=140178966472448
20:51:30  #  guarantee(PageArmed == 0) failed: invariant
20:51:30  #
20:51:30  # JRE version: 6.0_45-b06
20:51:30  # Java VM: Java HotSpot(TM) 64-Bit Server VM (20.45-b01 mixed mode linux-amd64 compressed oops)
20:51:30  # An error report file with more information is saved as:
20:51:30  # /mnt/jenkins/workspace/Bigtop-trunk-HBase/label/centos6/build/hbase/rpm/BUILD/hbase-0.98.2/hs_err_pid19146.log
20:51:30  FATAL: Unable to delete script file /tmp/hudson1935741715718466093.sh
20:51:30  hudson.util.IOException2: remote file operation failed: /tmp/hudson1935741715718466093.sh at hudson.remoting.Channel@20abe8b7:centos6-01
20:51:30      at hudson.FilePath.act(FilePath.java:784)
20:51:30      at hudson.FilePath.act(FilePath.java:770)
20:51:30      at hudson.FilePath.delete(FilePath.java:1075)
20:51:30      at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:92)
20:51:30      at hudson.tasks.CommandInterpreter.perform(CommandInterpreter.java:58)
20:51:30      at hudson.tasks.BuildStepMonitor$1.perform(BuildStepMonitor.java:19)
20:51:30      at hudson.model.AbstractBuild$AbstractRunner.perform(AbstractBuild.java:703)
20:51:30      at hudson.model.Build$RunnerImpl.build(Build.java:178)
20:51:30      at hudson.model.Build$RunnerImpl.doRun(Build.java:139)
20:51:30      at hudson.model.AbstractBuild$AbstractRunner.run(AbstractBuild.java:473)
20:51:30      at hudson.model.Run.run(Run.java:1410)
20:51:30      at hudson.matrix.MatrixRun.run(MatrixRun.java:146)
20:51:30      at hudson.model.ResourceController.execute(ResourceController.java:88)
20:51:30      at hudson.model.Executor.run(Executor.java:238)
20:51:30  Caused by: hudson.remoting.ChannelClosedException: channel is already closed
20:51:30      at hudson.remoting.Channel.send(Channel.java:499)



Thoughts?


On Tue, Jul 8, 2014 at 4:01 PM, Roman Shaposhnik <ro...@shaposhnik.org>
wrote:

> I can log in just fine to the slave and it seems to be operational now (the
> Fedora one seems to be borked, though :-().
>
> Having access to AWS web interface is a sore point -- none of us
> have it. All we've got is ACCESS KEY and SECRET KEY. Anything
> that works with that -- works, but that doesn't include AWS web interface.
>
> Folks at Cloudera (Andrew, Aida) probably know more.
>
> Thanks,
> Roman.
>
> On Tue, Jul 8, 2014 at 1:15 PM, Mark Grover <ma...@apache.org> wrote:
> > Hey Julien,
> > I tried SSH'ing (as root and ec2-user) and pinging to the box (
> > ec2-54-221-150-123.compute-1.amazonaws.com) but both seemed to fail.
> >
> > Something is listening on port 22 there since telnet passed.
> >
> > I don't have access to the webconsole to be able to do anything fancier but
> > I am open to suggestions? Can anyone with more privs help out here or
> > assign some AWS web interface creds to me?
> >
> > Thanks!
> > Mark
> >
> >
> > On Tue, Jul 8, 2014 at 1:04 PM, Julien Eid <je...@gmail.com> wrote:
> >
> >>
> >> http://bigtop01.cloudera.org:8080/job/Bigtop-trunk-Hadoop/label=centos6/667/console
> >>
> >>
> >> The JRE keeps crashing for builds and if you go to
> >> http://bigtop01.cloudera.org:8080/job/Bigtop-trunk-Hadoop/label=centos6/ws/
> >> the Jenkins slave is having lots of trouble.
> >>
> >> Can we see if a Jenkins slave restart fixes these issues and then kick off
> >> a build?
> >>
>

Re: CentOS build jenkins slave broken

Posted by Roman Shaposhnik <ro...@shaposhnik.org>.
I can log in just fine to the slave and it seems to be operational now (the
Fedora one seems to be borked, though :-().

Having access to AWS web interface is a sore point -- none of us
have it. All we've got is ACCESS KEY and SECRET KEY. Anything
that works with that -- works, but that doesn't include AWS web interface.
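
For illustration only, here is a sketch of the kind of operation that works with just an access key and secret key: rebooting an instance through the AWS CLI. It assumes the AWS CLI is installed. The helper name reboot_instance, the DRY_RUN switch, and the instance ID in the usage comment are hypothetical; the thread does not give the real instance ID. The us-east-1 default is inferred from the compute-1.amazonaws.com hostname.

```shell
# Hypothetical helper: reboot an EC2 instance via the API, no web console.
# With DRY_RUN=1 (the default here) it only prints the command it would run.
reboot_instance() {
    instance_id="$1"
    region="${2:-us-east-1}"
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "aws ec2 reboot-instances --region $region --instance-ids $instance_id"
    else
        aws ec2 reboot-instances --region "$region" --instance-ids "$instance_id"
    fi
}

# The AWS CLI reads credentials from the environment, for example:
#   export AWS_ACCESS_KEY_ID=<access key>
#   export AWS_SECRET_ACCESS_KEY=<secret key>
#   DRY_RUN=0 reboot_instance i-0123456789abcdef0   # placeholder instance ID
```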

Folks at Cloudera (Andrew, Aida) probably know more.

Thanks,
Roman.

On Tue, Jul 8, 2014 at 1:15 PM, Mark Grover <ma...@apache.org> wrote:
> Hey Julien,
> I tried SSH'ing (as root and ec2-user) and pinging to the box (
> ec2-54-221-150-123.compute-1.amazonaws.com) but both seemed to fail.
>
> Something is listening on port 22 there since telnet passed.
>
> I don't have access to the webconsole to be able to do anything fancier but
> I am open to suggestions? Can anyone with more privs help out here or
> assign some AWS web interface creds to me?
>
> Thanks!
> Mark
>
>
> On Tue, Jul 8, 2014 at 1:04 PM, Julien Eid <je...@gmail.com> wrote:
>
>>
>> http://bigtop01.cloudera.org:8080/job/Bigtop-trunk-Hadoop/label=centos6/667/console
>>
>>
>> The JRE keeps crashing for builds and if you go to
>> http://bigtop01.cloudera.org:8080/job/Bigtop-trunk-Hadoop/label=centos6/ws/
>> the Jenkins slave is having lots of trouble.
>>
>> Can we see if a Jenkins slave restart fixes these issues and then kick off
>> a build?
>>

Re: CentOS build jenkins slave broken

Posted by Mark Grover <ma...@apache.org>.
Hey Julien,
I tried SSH'ing (as root and ec2-user) and pinging the box
(ec2-54-221-150-123.compute-1.amazonaws.com), but both seemed to fail.

Something is listening on port 22 there since telnet passed.
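
A rough scripted equivalent of that telnet check, as a sketch: it opens a TCP connection using bash's /dev/tcp pseudo-device and reports success or failure through the exit status. The helper name port_open is an assumption made for illustration, and it relies on bash and the coreutils timeout command being available.

```shell
# Hypothetical helper: succeed if host:port accepts a TCP connection
# within timeout_s seconds (default 5), fail otherwise.
port_open() {
    local host="$1" port="$2" timeout_s="${3:-5}"
    # Inside the child bash, $0 is the host and $1 is the port.
    timeout "$timeout_s" bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Usage (not executed here):
#   port_open ec2-54-221-150-123.compute-1.amazonaws.com 22 && echo "port 22 open"
# If the port is open but login still fails, verbose ssh output can help:
#   ssh -vvv -o ConnectTimeout=10 ec2-user@ec2-54-221-150-123.compute-1.amazonaws.com
```

An open port only shows that something accepts connections; it does not show that sshd is healthy or that the key is accepted.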

I don't have access to the webconsole to be able to do anything fancier but
I am open to suggestions? Can anyone with more privs help out here or
assign some AWS web interface creds to me?

Thanks!
Mark


On Tue, Jul 8, 2014 at 1:04 PM, Julien Eid <je...@gmail.com> wrote:

>
> http://bigtop01.cloudera.org:8080/job/Bigtop-trunk-Hadoop/label=centos6/667/console
>
>
> The JRE keeps crashing for builds and if you go to
> http://bigtop01.cloudera.org:8080/job/Bigtop-trunk-Hadoop/label=centos6/ws/
> the Jenkins slave is having lots of trouble.
>
> Can we see if a Jenkins slave restart fixes these issues and then kick off
> a build?
>