Posted to builds@apache.org by Martin Desruisseaux <ma...@geomatys.fr> on 2013/01/22 01:21:25 UTC

Systematic JVM crash on Jenkins

Hello all

The JDK7 branch of the SIS project is suffering from a systematic JVM 
crash at building time on Jenkins. A look at the report file suggests 
that the crash occurs in the call to the 
java.util.zip.ZipFile.getEntry(String) method. We can work around the 
problem by disabling a custom Maven plugin. But a fix would of course be 
preferred. Should I post a bug report on http://bugreport.sun.com, or is 
there any other action that could be taken?

The Jenkins job suffering from this crash is 
https://builds.apache.org/job/sis-jdk7/. Below is the crash report:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0xfee952d6, pid=4889, tid=20
#
# JRE version: 7.0-b147
# Java VM: Java HotSpot(TM) Server VM (21.0-b17 mixed mode solaris-x86 )
# Problematic frame:
# C  [libc.so.1+0x252d6]  memcpy+0x166
#
# Core dump written. Default location: /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/sis-jdk7/core or core.4889
#
# An error report file with more information is saved as:
# /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/sis-jdk7/hs_err_pid4889.log
#
# If you would like to submit a bug report, please visit:
#http://bugreport.sun.com/bugreport/crash.jsp
# The crash happened outside the Java Virtual Machine in native code.
# See problematic frame for where to report the bug.
#


Thanks,

     Martin
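
For readers triaging a similar report, the "Problematic frame" line in the header above can be extracted programmatically. The following is only a minimal sketch: the class and method names (CrashHeader, problematicFrame) are hypothetical, and it assumes the header layout shown in this message, where the frame line directly follows "# Problematic frame:". A frame type of "C" denotes native code, which matches the report's statement that the crash happened outside the JVM.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class CrashHeader {
    // Matches a frame line such as "# C  [libc.so.1+0x252d6]  memcpy+0x166".
    // Groups: 1 = frame type, 2 = library, 3 = offset in library, 4 = symbol+offset.
    private static final Pattern FRAME = Pattern.compile(
            "^#\\s+(\\S)\\s+\\[([^+\\]]+)\\+(0x[0-9a-fA-F]+)\\]\\s+(\\S+)");

    /** Returns "library:symbol+offset" for the frame that follows "# Problematic frame:". */
    static Optional<String> problematicFrame(String report) {
        boolean frameExpected = false;
        for (String line : report.split("\\R")) {
            if (frameExpected) {
                Matcher m = FRAME.matcher(line);
                return m.find()
                        ? Optional.of(m.group(2) + ":" + m.group(4))
                        : Optional.empty();
            }
            frameExpected = line.startsWith("# Problematic frame:");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        String report = "# Problematic frame:\n# C  [libc.so.1+0x252d6]  memcpy+0x166\n";
        System.out.println(problematicFrame(report).orElse("no frame found"));
    }
}
```

Quoted copies of the report (lines prefixed with "> ") would need the prefix stripped before being passed to this method.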


Re: Systematic JVM crash on Jenkins

Posted by Martin Desruisseaux <ma...@geomatys.fr>.
On 21/01/13 20:30, Adam Estrada wrote:
> You are developing on Mac, right? I was recently hit with a JDK7 update on my Mac which might address this issue for Lion OSX? What are the odds that the Solaris machine got the same update? I'd say post the bug to Sun and see what happens and, for the time being, disable the plugin that is causing the crash. How much will be missing with this plugin disabled?

Yes, I'm on a Mac, but this crash doesn't occur on my machine. I don't know 
how it behaves on other machines. I can easily disable the plugin, which 
just packages the JAR files in a single directory (in order to allow 
the Class-Path entry in META-INF/MANIFEST.MF to work). But it may be 
worth waiting for a reply from either builds@apache.org or from Sun, in 
case someone wants to reproduce the problem...

     Martin


Re: Systematic JVM crash on Jenkins

Posted by Adam Estrada <es...@gmail.com>.
Hey Martin,

You are developing on Mac, right? I was recently hit with a JDK7 update on my Mac which might address this issue for Lion OSX? What are the odds that the Solaris machine got the same update? I'd say post the bug to Sun and see what happens and, for the time being, disable the plugin that is causing the crash. How much will be missing with this plugin disabled?

Adam

On Jan 21, 2013, at 7:21 PM, Martin Desruisseaux wrote:

> Hello all
> 
> The JDK7 branch of the SIS project is suffering from a systematic JVM crash at building time on Jenkins. A look at the report file suggests that the crash occurs in the call to the java.util.zip.ZipFile.getEntry(String) method. We can work around the problem by disabling a custom Maven plugin. But a fix would of course be preferred. Should I post a bug report on http://bugreport.sun.com, or is there any other action that could be taken?
> 
> The Jenkins job suffering from this crash is https://builds.apache.org/job/sis-jdk7/. Below is the crash report:
> 
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0xfee952d6, pid=4889, tid=20
> #
> # JRE version: 7.0-b147
> # Java VM: Java HotSpot(TM) Server VM (21.0-b17 mixed mode solaris-x86 )
> # Problematic frame:
> # C  [libc.so.1+0x252d6]  memcpy+0x166
> #
> # Core dump written. Default location: /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/sis-jdk7/core or core.4889
> #
> # An error report file with more information is saved as:
> # /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/sis-jdk7/hs_err_pid4889.log
> #
> # If you would like to submit a bug report, please visit:
> #http://bugreport.sun.com/bugreport/crash.jsp
> # The crash happened outside the Java Virtual Machine in native code.
> # See problematic frame for where to report the bug.
> #
> 
> 
> Thanks,
> 
>    Martin
> 


Re: Apache build infrastructure or Oracle JVM problem: crash in native JDK code

Posted by Daniel Shahaf <d....@daniel.shahaf.name>.
builds@ was the correct destination (rather than infra@)

Adam Estrada wrote on Mon, Feb 04, 2013 at 09:45:40 -0500:
> Thanks for keeping on this, Martin. I will copy infra@ to see if they have
> any insight in to this.

Re: Apache build infrastructure or Oracle JVM problem: crash in native JDK code

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
No problem buddy!

To register, simply send an email to builds-subscribe@apache.org and
follow the instructions from there.

Still waiting to see what builds@ people decided though.

Thanks,
Chris

On 2/4/13 8:50 AM, "Martin Desruisseaux" <ma...@geomatys.fr>
wrote:

>Hello Chris
>
>On 04/02/13 17:21, Mattmann, Chris A (388J) wrote:
>> There were some replies too - that centered around upgrading the JVM
>>on that machine to a newer version and seeing if that fixes the problem
>>as you suggest.
>
>Then this is my fault, I didn't see those replies. It may be because I
>failed to find where to subscribe to the build@apache.org mailing list...
>I will search for the link. Thanks for letting me know,
>
>     Martin
>
>


Re: Apache build infrastructure or Oracle JVM problem: crash in native JDK code

Posted by Martin Desruisseaux <ma...@geomatys.fr>.
Hello Chris

On 04/02/13 17:21, Mattmann, Chris A (388J) wrote:
> There were some replies too — that centered around  upgrading the JVM on that machine to a newer version and seeing if that fixes the problem as you suggest.

Then this is my fault, I didn't see those replies. It may be because I 
failed to find where to subscribe to the build@apache.org mailing list... 
I will search for the link. Thanks for letting me know,

     Martin



Re: Apache build infrastructure or Oracle JVM problem: crash in native JDK code

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Thanks for reporting this — I saw the email go through to builds@ Martin.

There were some replies too — that centered around  upgrading the JVM on that machine to a newer version and seeing if that fixes the problem as you suggest.

I dropped infra@ from the list since infra@ will just redirect us to builds@. builds@ peeps — any help here?

Cheers,
Chris

From: Adam Estrada <es...@gmail.com>
Date: Monday, February 4, 2013 6:45 AM
To: "dev@sis.apache.org" <de...@sis.apache.org>, "infra@apache.org" <in...@apache.org>
Subject: Re: Apache build infrastructure or Oracle JVM problem: crash in native JDK code

Thanks for keeping on this, Martin. I will copy infra@ to see if they have any insight in to this. Also, @mattmann may be able to lend a hand here too.

Adam


On Mon, Feb 4, 2013 at 9:23 AM, Martin Desruisseaux <ma...@geomatys.fr> wrote:
Hello build@apache.org (copy to Apache SIS)

I don't know if I'm writing to the appropriate email address (this is my third email to build@apache.org)... If I'm writing to the wrong email address, would it be possible to redirect me to a more appropriate contact point please? We have a problem with either the Apache build infrastructure or the Oracle Java Virtual Machine. This is not a problem in the Apache SIS project, because pure Java code should never crash the JVM. While we could work around the problem, JVM crashes are serious issues and I would feel safer if this issue could be addressed.

The JVM on the Jenkins server crashes (I mean crashes in native libraries, not a Java exception) in almost every build of the JDK7 branch of Apache SIS. We have no native code and we don't explicitly use any native libraries. According to the log attached to this email, the crash occurs in the C/C++ implementation of java.util.zip.ZipFile.getEntry(String). I would like to send a bug report to http://bugreport.sun.com/bugreport/crash.jsp. But before doing so, I noticed that the JDK 7 used by Jenkins is 18 months old (built on June 27, 2011). Is there any way we can test the build on a more recent JDK 7 version on the Apache Jenkins system before filing a bug report?

    Thanks and regards,

        Martin



Re: Apache build infrastructure or Oracle JVM problem: crash in native JDK code

Posted by Adam Estrada <es...@gmail.com>.
Thanks for keeping on this, Martin. I will copy infra@ to see if they have
any insight in to this. Also, @mattmann may be able to lend a hand here
too.

Adam


On Mon, Feb 4, 2013 at 9:23 AM, Martin Desruisseaux <
martin.desruisseaux@geomatys.fr> wrote:

> Hello build@apache.org (copy to Apache SIS)
>
> I don't know if I'm writing to the appropriate email address (this is my
> third email to build@apache.org)... If I'm writing to the wrong email
> address, would it be possible to redirect me to a more appropriate contact
> point please? We have a problem with either the Apache build infrastructure
> or the Oracle Java Virtual Machine. This is not a problem in the Apache SIS
> project, because pure Java code should never crash the JVM. While we could
> work around the problem, JVM crashes are serious issues and I would feel
> safer if this issue could be addressed.
>
> The JVM on the Jenkins server crashes (I mean crashes in native libraries,
> not a Java exception) in almost every build of the JDK7 branch of Apache
> SIS. We have no native code and we don't explicitly use any native
> libraries. According to the log attached to this email, the crash occurs in
> the C/C++ implementation of java.util.zip.ZipFile.getEntry(String). I
> would like to send a bug report to
> http://bugreport.sun.com/bugreport/crash.jsp. But
> before doing so, I noticed that the JDK 7 used by Jenkins is 18 months old
> (built on June 27, 2011). Is there any way we can test the build on a more
> recent JDK 7 version on the Apache Jenkins system before filing a bug
> report?
>
>     Thanks and regards,
>
>         Martin
>
>

Re: Apache build infrastructure or Oracle JVM problem: crash in native JDK code

Posted by sebb <se...@gmail.com>.
On 6 February 2013 01:49, Jesse Glick <ty...@gmail.com> wrote:
> On 02/05/2013 07:30 PM, sebb wrote:
>>
>> a pure Java application should not be able to cause a JVM crash
>
>
> Depends entirely on what you mean by “pure Java application”. When you have
> filesystem and Process access, you can do whatever the user shell account
> can do, which certainly includes triggering native errors.

> At any rate I personally agree that using java.util.zip in this way should
> not cause JVM crashes—but it does, and the JDK team has explicitly decided
> that this is “not a bug” and closed discussion.

Astonishing decision.
I hope people complain until they relent.
Seems very unwise to allow that behaviour - it undermines trust in the JVM.

> The workaround, or depending
> on your perspective the fix, is to ensure that you do not clobber an open
> ZIP file (which would also prevent a lock error on Windows).

I trust there aren't more such shortcuts in the JVM.
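
One common way to avoid rewriting an archive in place, which is the situation the workaround quoted above warns against, is to write the new content to a temporary file in the same directory and then move it over the old name. The sketch below is illustrative only: SafeJarWriter and replaceArchive are hypothetical names, not part of Maven or the SIS build. It rests on the assumption that a rename leaves the old file's content untouched for any process that already has it open, as on POSIX file systems. Whether this would have avoided the crash discussed in this thread is unverified, and on Windows replacing an open file may simply fail.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

public class SafeJarWriter {
    /**
     * Writes a single-entry archive to a temporary file next to the target,
     * then moves it over the target so the old file is never modified in place.
     */
    static void replaceArchive(Path target, String entryName, byte[] content) throws IOException {
        Path dir = target.toAbsolutePath().getParent();
        Path tmp = Files.createTempFile(dir, "archive", ".tmp");
        try {
            try (ZipOutputStream out = new ZipOutputStream(Files.newOutputStream(tmp))) {
                out.putNextEntry(new ZipEntry(entryName));
                out.write(content);
                out.closeEntry();
            }
            // With ATOMIC_MOVE, whether an existing target is replaced is
            // implementation specific; on POSIX systems it is a rename().
            Files.move(tmp, target, StandardCopyOption.ATOMIC_MOVE);
        } finally {
            // No-op after a successful move; cleans up if writing or moving failed.
            Files.deleteIfExists(tmp);
        }
    }
}
```

The temporary file is created in the target's own directory because an atomic move generally requires source and target to be on the same file system.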

Re: Apache build infrastructure or Oracle JVM problem: crash in native JDK code

Posted by Martin Desruisseaux <ma...@geomatys.fr>.
On 06/02/13 02:49, Jesse Glick wrote:
> On 02/05/2013 07:30 PM, sebb wrote:
>> a pure Java application should not be able to cause a JVM crash
> Depends entirely on what you mean by “pure Java application”. When you 
> have filesystem and Process access, you can do whatever the user shell 
> account can do, which certainly includes triggering native errors.

But at least in the Process case, the crash would occur outside the JVM, 
wouldn't it?

> At any rate I personally agree that using java.util.zip in this way 
> should not cause JVM crashes—but it does, and the JDK team has 
> explicitly decided that this is “not a bug” and closed discussion. The 
> workaround, or depending on your perspective the fix, is to ensure 
> that you do not clobber an open ZIP file (which would also prevent a 
> lock error on Windows).

In our case, we are not handling the ZIP files ourselves. The ZIP file is 
opened by the Java ClassLoader, and the JAR files were created by the 
standard Maven plugin. We are not doing parallel compilation. 
Consequently, it is not certain that the crash is caused by clobbering an 
open ZIP file. Furthermore, the approach that we wanted to apply has 
been used for more than 5 years on Windows, Solaris, Linux Gentoo, Linux 
Ubuntu, MacOS, and other Hudson/Jenkins servers without any JVM crash.

Anyway, if the JVM were "allowed" to crash when clobbering an open ZIP 
file, then any user or external application touching a ZIP file could 
crash a running JVM...

     Martin
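
The point about the Process case can be illustrated with a small sketch, under the assumption of a Unix-like system with a POSIX "sh" on the PATH. ChildCrash and runCrashingChild are hypothetical names. The child shell kills itself with signal 11 (SIGSEGV, the same signal as in the crash report), and the parent JVM merely observes a non-zero exit value instead of crashing.

```java
import java.io.IOException;

public class ChildCrash {
    /** Starts a shell that sends signal 11 (SIGSEGV) to itself; returns the child's exit value. */
    static int runCrashingChild() throws IOException, InterruptedException {
        Process child = new ProcessBuilder("sh", "-c", "kill -11 $$").start();
        return child.waitFor();
    }

    public static void main(String[] args) throws Exception {
        System.out.println("child exit value: " + runCrashingChild());
        System.out.println("parent JVM still running");
    }
}
```

The exact exit value reported for a signal-terminated child is platform specific, so the sketch only relies on it being non-zero.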


Re: Apache build infrastructure or Oracle JVM problem: crash in native JDK code

Posted by "Mattmann, Chris A (388J)" <ch...@jpl.nasa.gov>.
Thanks for tracking this down, guys.

Cheers,
Chris

On 2/28/13 1:48 AM, "Martin Desruisseaux"
<ma...@geomatys.fr> wrote:

>Hello all
>
>Just for the record: as predicted by Jesse Glick, upgrading to the latest
>JDK7 on Solaris didn't solve the JVM crash. So I tied the build of the
>JDK7 branch to the Ubuntu slaves as suggested by Gavin McDonald. After
>Apache SIS release, we will use the released Maven "sis-build-helper"
>plugin instead of the snapshot version (as suggested by Jesse) and try
>again to let Jenkins build on any slave.
>
>Thanks all for your help
>
>     Martin
>


Re: Apache build infrastructure or Oracle JVM problem: crash in native JDK code

Posted by Martin Desruisseaux <ma...@geomatys.fr>.
Hello all

Just for the record: as predicted by Jesse Glick, upgrading to the latest 
JDK7 on Solaris didn't solve the JVM crash. So I tied the build of the 
JDK7 branch to the Ubuntu slaves as suggested by Gavin McDonald. After 
Apache SIS release, we will use the released Maven "sis-build-helper" 
plugin instead of the snapshot version (as suggested by Jesse) and try 
again to let Jenkins build on any slave.

Thanks all for your help

     Martin


Re: Apache build infrastructure or Oracle JVM problem: crash in native JDK code

Posted by Martin Desruisseaux <ma...@geomatys.fr>.
Hello Gavin

On 22/02/13 06:40, Gavin McDonald wrote:
> Ok, upgraded. Now solaris1 is as follows:
>
> bash-3.00# /home/hudson/tools/java/latest1.7/bin/java -version
> java version "1.7.0_15"
> Java(TM) SE Runtime Environment (build 1.7.0_15-b03)
> Java HotSpot(TM) Server VM (build 23.7-b01, mixed mode)

Thank you very much for the upgrade! We will try it in a few hours. I 
will post for the record after one week if the problem appears solved. 
Otherwise we will try another slave as you suggested.

Thanks also for the clarification about the various Jenkins slaves and 
the different JDK versions on those machines.

     Martin


RE: Apache build infrastructure or Oracle JVM problem: crash in native JDK code

Posted by Gavin McDonald <ga...@16degrees.com.au>.

> -----Original Message-----
> From: Gavin McDonald [mailto:gavin@16degrees.com.au]
> Sent: Friday, 22 February 2013 1:45 PM
> To: builds@apache.org
> Subject: RE: Apache build infrastructure or Oracle JVM problem: crash in
> native JDK code
> 
> I think we need to clarify some things here:
> 
> 1.   'Jenkins Server' is generic, saying that the 'Jenkins Server' needs java 7
> updating means nothing.
> We have many Jenkins slaves: some Linux, some Solaris, some Windows,
> FreeBSD, etc.
> Each of them has individual java versions installed. There is the system
> default java and there are the many other different versions we install into
> /home/jenkins/tools/java/*
> 
> 2.  The 'Jenkins Server' on which your build has been running mostly is the
> 'Solaris1' slave machine.
> 
> This 'Solaris1' slave does have an out-dated java 7 and I'll update it shortly.

Ok, upgraded. Now solaris1 is as follows:

bash-3.00# /home/hudson/tools/java/latest1.7/bin/java -version
java version "1.7.0_15"
Java(TM) SE Runtime Environment (build 1.7.0_15-b03)
Java HotSpot(TM) Server VM (build 23.7-b01, mixed mode)

HTH

Gav...

> 
> currently:
> 
> bash-3.00$ /export/home/hudson/tools/java/latest1.7/bin/java -version
> java version "1.7.0"
> Java(TM) SE Runtime Environment (build 1.7.0-b147)
> Java HotSpot(TM) Server VM (build 21.0-b17, mixed mode)
> 
> 3. The many other 'jenkins slave' machines which you could have chosen to
> run your builds on instead all have much more up-to-date versions.
> 
> gmcdonald@juno:~$ /home/hudson/tools/java/latest1.7/bin/java -version
> java version "1.7.0_04"
> Java(TM) SE Runtime Environment (build 1.7.0_04-b20)
> Java HotSpot(TM) 64-Bit Server VM (build 23.0-b21, mixed mode)
> 
> etc..
> 
> Conclusion: Only one of our Jenkins slaves had a really old Java 7 and you
> could simply choose any other slave that has more modern versions already
> installed. (You can force this by tying to the 'ubuntu4' label for example)
> 
> I'll let you know when 'solaris1' is updated if you prefer to continue with that.
> 
> HTH
> 
> Gav...
> 
> 
> > -----Original Message-----
> > From: Martin Desruisseaux [mailto:martin.desruisseaux@geomatys.fr]
> > Sent: Thursday, 21 February 2013 3:53 AM
> > To: builds@apache.org
> > Subject: Re: Apache build infrastructure or Oracle JVM problem: crash
> > in native JDK code
> >
> > On 20/02/13 18:12, Jesse Glick wrote:
> > > I would merely claim that loading classes from a snapshot JAR is
> > > _prone_ to triggering crashes when other factors come into play
> > > which might be difficult to predict.
> > I agree. We will move to a non-snapshot version when we can. The only
> > issue is that we need to make a release before we can do that (unless
> > we choose to keep permanently one of the snapshots).
> >
> > >> This snapshot JAR file is used as a plugin for other modules well
> > >> after the build has been completed.
> > > Well after the build of that plugin module has been completed I
> > > suppose you mean. (According to your build log, the JAR is created
> > > earlier in the reactor build and then loaded at the moment of the
> > > crash.)
> > Yes you are right, thanks for clarifying.
> >
> > > At any rate, for many reasons it would be preferable to run on the
> > > latest available version of Java (7u15 as of this writing). I do not
> > > personally have admin permissions to help you there.
> > Anyway, many thanks for your efforts,
> >
> >      Martin
> 



RE: Apache build infrastructure or Oracle JVM problem: crash in native JDK code

Posted by Gavin McDonald <ga...@16degrees.com.au>.
I think we need to clarify some things here:

1.   'Jenkins Server' is generic; saying that the 'Jenkins Server' needs java 7 updating means nothing.
We have many Jenkins slaves: some Linux, some Solaris, some Windows, FreeBSD, etc.
Each of them has individual java versions installed. There is the system default java and there are 
the many other different versions we install into /home/jenkins/tools/java/*

2.  The 'Jenkins Server' on which your build has been running mostly is the 'Solaris1' slave machine.

This 'Solaris1' slave does have an out-dated java 7 and I'll update it shortly.

currently:

bash-3.00$ /export/home/hudson/tools/java/latest1.7/bin/java -version
java version "1.7.0"
Java(TM) SE Runtime Environment (build 1.7.0-b147)
Java HotSpot(TM) Server VM (build 21.0-b17, mixed mode)

3. The many other 'jenkins slave' machines which you could have chosen to run your builds on instead all
have much more up-to-date versions.

gmcdonald@juno:~$ /home/hudson/tools/java/latest1.7/bin/java -version
java version "1.7.0_04"
Java(TM) SE Runtime Environment (build 1.7.0_04-b20)
Java HotSpot(TM) 64-Bit Server VM (build 23.0-b21, mixed mode)

etc..

Conclusion: Only one of our Jenkins slaves had a really old Java 7 and you could simply choose any other slave 
that has more modern versions already installed. (You can force this by tying to the 'ubuntu4' label for example)

I'll let you know when 'solaris1' is updated if you prefer to continue with that.

HTH

Gav...
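
Since each slave carries its own Java installations, a build can record which JVM it actually ran on instead of relying on assumptions about "the Jenkins server". The following is a minimal sketch using only standard system properties. JvmInfo and describe are hypothetical names, and how such a class would be wired into a given build is left open.

```java
public class JvmInfo {
    /** Summarizes the running JVM and platform from standard system properties. */
    static String describe() {
        return "java.version=" + System.getProperty("java.version")
                + ", java.vm.name=" + System.getProperty("java.vm.name")
                + ", java.vm.version=" + System.getProperty("java.vm.version")
                + ", os=" + System.getProperty("os.name")
                + "/" + System.getProperty("os.arch");
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Printing this line at the start of a build makes it easy to correlate a crash with a specific slave and JDK build.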


> -----Original Message-----
> From: Martin Desruisseaux [mailto:martin.desruisseaux@geomatys.fr]
> Sent: Thursday, 21 February 2013 3:53 AM
> To: builds@apache.org
> Subject: Re: Apache build infrastructure or Oracle JVM problem: crash in
> native JDK code
> 
> On 20/02/13 18:12, Jesse Glick wrote:
> > I would merely claim that loading classes from a snapshot JAR is
> > _prone_ to triggering crashes when other factors come into play which
> > might be difficult to predict.
> I agree. We will move to a non-snapshot version when we can. The only issue
> is that we need to make a release before we can do that (unless we choose
> to keep permanently one of the snapshots).
> 
> >> This snapshot JAR file is used as a plugin for other modules well
> >> after the build has been completed.
> > Well after the build of that plugin module has been completed I
> > suppose you mean. (According to your build log, the JAR is created
> > earlier in the reactor build and then loaded at the moment of the
> > crash.)
> Yes you are right, thanks for clarifying.
> 
> > At any rate, for many reasons it would be preferable to run on the
> > latest available version of Java (7u15 as of this writing). I do not
> > personally have admin permissions to help you there.
> Anyway, many thanks for your efforts,
> 
>      Martin



Re: Apache build infrastructure or Oracle JVM problem: crash in native JDK code

Posted by Martin Desruisseaux <ma...@geomatys.fr>.
On 20/02/13 18:12, Jesse Glick wrote:
> I would merely claim that loading classes from a snapshot JAR is 
> _prone_ to triggering crashes when other factors come into play which 
> might be difficult to predict.
I agree. We will move to a non-snapshot version when we can. The only 
issue is that we need to make a release before we can do that (unless we 
choose to keep permanently one of the snapshots).

>> This snapshot JAR file is used as a plugin for other modules well 
>> after the build has been completed.
> Well after the build of that plugin module has been completed I 
> suppose you mean. (According to your build log, the JAR is created 
> earlier in the reactor build and then loaded at the moment of the crash.)
Yes you are right, thanks for clarifying.

> At any rate, for many reasons it would be preferable to run on the 
> latest available version of Java (7u15 as of this writing). I do not 
> personally have admin permissions to help you there.
Anyway, many thanks for your efforts,

     Martin


Re: Apache build infrastructure or Oracle JVM problem: crash in native JDK code

Posted by Jesse Glick <ty...@gmail.com>.
On 02/20/2013 11:46 AM, Martin Desruisseaux wrote:
> clobbering an open JAR file is a hypothesis that may explain the JVM crash, but this is unverified.

Indeed it is only a hypothesis, though one based on repeated encounters with similar crashes over the course of years of development that were generally traced back to 
this cause.

> nothing is touching this JAR file in parallel

Right, there is no known parallel build. There are other less likely explanations, such as asynchronous writes happening in response to garbage collection, etc. A real 
investigation would require tracking OS writes or something like that.

I would merely claim that loading classes from a snapshot JAR is _prone_ to triggering crashes when other factors come into play which might be difficult to predict. And 
it is certainly not out of the question that different JDK versions behave in subtly different ways that are enough to make your crash happen or not; this does not mean 
that the bug is only present in 1.7.0.

> This snapshot JAR file is used as a plugin for other modules well after the build has been completed.

Well after the build of that plugin module has been completed I suppose you mean. (According to your build log, the JAR is created earlier in the reactor build and then 
loaded at the moment of the crash.)

At any rate, for many reasons it would be preferable to run on the latest available version of Java (7u15 as of this writing). I do not personally have admin permissions 
to help you there.

Re: Apache build infrastructure or Oracle JVM problem: crash in native JDK code

Posted by Martin Desruisseaux <ma...@geomatys.fr>.
Hello Jesse

On 20/02/13 16:20, Jesse Glick wrote:
> On 02/06/2013 01:39 PM, Jesse Glick wrote:
>> I could file a JIRA ticket for it
> https://jira.codehaus.org/browse/MNG-5437

Thanks a lot for writing this JIRA task. However, it may be worth 
emphasizing that clobbering an open JAR file is a hypothesis that may 
explain the JVM crash, but this is unverified. There are some arguments 
that make me think that the problem may be somewhere else:

  * In our case, nothing is touching this JAR file in parallel (there is
    only one Jenkins job building this JAR, and we are not doing parallel
    builds).
  * This snapshot JAR file is used as a plugin for other modules well
    after the build has been completed.
  * The build using JDK6 on the same Jenkins server never crashed.
  * The build using JDK7 on another (non-Apache) Jenkins server never
    crashed.
  * The build using JDK7 on the Apache Jenkins server never crashed when
    the plugin was executed under the Maven "generate-resources" phase.
    Executing a plugin in the same JAR under the "package" phase causes
    the crash, despite the "package" phase being after the
    "generate-resources" phase.


Maybe the cause of the JVM crash is something clobbering the open JAR 
file, but I think that this is an uncertain hypothesis. I would very 
much like to try with an upgraded JDK7 on the Jenkins server first... 
Is there anything we can do to help with a JDK7 upgrade?

     Thanks,

         Martin


Re: Apache build infrastructure or Oracle JVM problem: crash in native JDK code

Posted by Jesse Glick <ty...@gmail.com>.
On 02/06/2013 01:39 PM, Jesse Glick wrote:
> I could file a JIRA ticket for it

https://jira.codehaus.org/browse/MNG-5437

Re: Apache build infrastructure or Oracle JVM problem: crash in native JDK code

Posted by Martin Desruisseaux <ma...@geomatys.fr>.
On 06/02/13 19:39, Jesse Glick wrote:
> On 02/06/2013 06:12 AM, Martin Desruisseaux wrote:
>> the approach that we wanted to apply has been used for more than 5 
>> years on Windows, Solaris, Linux Gentoo, Linux Ubuntu, MacOS, and 
>> other Hudson/Jenkins servers without any JVM crash
> Perhaps you were just lucky, or perhaps there is something in the 
> current Jenkins configuration that makes the crash more likely—for 
> example, if the job were marked as permitting multiple concurrent builds.
>
> I will reiterate that while upgrading the Java 7 build is a good idea, 
> it will probably not affect this bug.

Well, we also have a JDK6 branch (namely "sis-trunk") built on the same
Jenkins server with the same configuration, and the JDK6 JVM never
crashes. Admittedly the code - including the MOJO in question - is
slightly different because of JDK7 features not supported in JDK6, but
nevertheless I'm somewhat optimistic about the chances that upgrading
the JDK7 would help stabilize the build.

In the meantime, I tried to upgrade Maven from 3.0.3 to 3.0.4 and it
seems to have helped; the last 5 builds have been stable.

     Martin


Re: Apache build infrastructure or Oracle JVM problem: crash in native JDK code

Posted by Ted Dunning <te...@gmail.com>.
Thanks for this.


On Wed, Feb 6, 2013 at 10:39 AM, Jesse Glick <ty...@gmail.com> wrote:

> To Ted Dunning’s request for the bug number—searching on bugs.sun.com
> seems to be broken, and as I no longer work for Oracle I cannot use the
> internal search tool. Fortunately I managed to dig it up using other means:
> it is #4425695 [1].
>

Re: Apache build infrastructure or Oracle JVM problem: crash in native JDK code

Posted by Jesse Glick <ty...@gmail.com>.
On 02/06/2013 06:12 AM, Martin Desruisseaux wrote:
> the approach that we wanted to apply has been used for more than 5 years on Windows, Solaris, Linux Gentoo, Linux Ubuntu, MacOS, and other Hudson/Jenkins servers without any JVM crash

Perhaps you were just lucky, or perhaps there is something in the current Jenkins configuration that makes the crash more likely—for example, if the job were marked as 
permitting multiple concurrent builds.

I will reiterate that while upgrading the Java 7 build is a good idea, it will probably not affect this bug.

> if the JVM were "allowed" to crash when clobbering an open ZIP file, then any user or external application touching a ZIP file could crash a running JVM...

Yes, that is exactly what happens. The problem does not affect typical Java applications since ZIP files are normally just used for class loading and a normal application 
has a static code base that would not be overwritten while it is running; but it is a scourge of developer tools such as build systems and IDEs, as well as applications 
with module systems that permit dynamic reload.

To Ted Dunning’s request for the bug number—searching on bugs.sun.com seems to be broken, and as I no longer work for Oracle I cannot use the internal search tool. 
Fortunately I managed to dig it up using other means: it is #4425695 [1].

Passing -Dsun.zip.disableMemoryMapping=true to the JVM accessing the ZIP file (the Maven process in this case) may avoid the hard crash, as in NetBeans bug #190481 [2], 
though the ZIP would probably still be unreadable so you would just get other class loading errors.
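
As a small illustration of the flag mentioned above: JVM options for a Maven build are commonly supplied through the MAVEN_OPTS environment variable, and the sketch below only checks, from inside the running JVM, whether the property was actually passed. The class name ZipMappingFlag is hypothetical; the property name is the one quoted in the message. It does not verify that the JVM honours the flag.

```java
public class ZipMappingFlag {
    /** Returns true if -Dsun.zip.disableMemoryMapping=true was passed to this JVM. */
    public static boolean memoryMappingDisabled() {
        // Boolean.getBoolean reads the named system property and parses it as a boolean.
        return Boolean.getBoolean("sun.zip.disableMemoryMapping");
    }

    public static void main(String[] args) {
        System.out.println("sun.zip.disableMemoryMapping set: " + memoryMappingDisabled());
    }
}
```

Running this check early in a build (for example from a small test) shows whether the option reached the process that opens the JAR, rather than only a parent shell.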

While I would still recommend that the SIS project not run a SNAPSHOT plugin (e.g. from the reactor), it may be possible for Maven core to avoid the crash (and almost 
always avoid corruption & Windows lock errors too): rather than creating a class realm from ~/.m2/repository/**/*-SNAPSHOT.jar directly, which is inherently unsafe in a 
multiuser operating system, copy that JAR to /tmp (hoping no one tries to overwrite it during this brief interval), mark deleteOnExit, and open that temp file for the 
class loader instead. If there are any Maven committers listening who want to go with this idea (olamy?), great, otherwise I could file a JIRA ticket for it and maybe try 
a patch.
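
The copy-to-temp idea proposed above could look roughly like the following sketch. It is not Maven code: the class TempJarLoader and its method names are hypothetical, and it only shows the three steps named in the message (copy the JAR to a temporary file, mark it deleteOnExit, open the class loader on the copy).

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

public class TempJarLoader {
    /** Copies the given JAR to a private temporary file and returns the copy. */
    public static Path copyToTemp(Path jar) throws IOException {
        Path copy = Files.createTempFile("plugin-", ".jar");
        // createTempFile already created an empty file, so it must be replaced.
        Files.copy(jar, copy, StandardCopyOption.REPLACE_EXISTING);
        // Ask the JVM to remove the copy on normal shutdown.
        copy.toFile().deleteOnExit();
        return copy;
    }

    /** Opens a class loader on a private copy, so later writes to the original are not seen. */
    public static URLClassLoader openIsolated(Path jar, ClassLoader parent) throws IOException {
        Path copy = copyToTemp(jar);
        return new URLClassLoader(new URL[] { copy.toUri().toURL() }, parent);
    }
}
```

The copy itself is still exposed to a concurrent overwrite for the short time it takes, as the message notes; the gain is that the file held open afterwards is one that nothing else is expected to write to.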


[1] http://bugs.sun.com/view_bug.do?bug_id=4425695
[2] http://netbeans.org/bugzilla/show_bug.cgi?id=190481

Re: Apache build infrastructure or Oracle JVM problem: crash in native JDK code

Posted by Martin Desruisseaux <ma...@geomatys.fr>.
Hello all, and thanks for the reply.

Le 06/02/13 02:49, Jesse Glick a écrit :
> On 02/05/2013 07:30 PM, sebb wrote:
>> a pure Java application should not be able to cause a JVM crash
> Depends entirely on what you mean by “pure Java application”. When you 
> have filesystem and Process access, you can do whatever the user shell 
> account can do, which certainly includes triggering native errors.

But at least in the Process case, the crash would occur outside the JVM,
wouldn't it?


> At any rate I personally agree that using java.util.zip in this way 
> should not cause JVM crashes—but it does, and the JDK team has 
> explicitly decided that this is “not a bug” and closed discussion. The 
> workaround, or depending on your perspective the fix, is to ensure 
> that you do not clobber an open ZIP file (which would also prevent a 
> lock error on Windows).

In our case, we are not handling the ZIP files ourselves. The ZIP file is
opened by the Java ClassLoader, and the JAR file was created by the
standard Maven plugin. We are not doing parallel compilation.
Consequently, it is not certain that the crash is caused by clobbering an
open ZIP file. Furthermore, the approach that we wanted to apply has 
been used for more than 5 years on Windows, Solaris, Linux Gentoo, Linux 
Ubuntu, MacOS, and other Hudson/Jenkins servers without any JVM crash.

Anyway, if the JVM were "allowed" to crash when clobbering an open ZIP 
file, then any user or external application touching a ZIP file could 
crash a running JVM...

     Martin


Re: Apache build infrastructure or Oracle JVM problem: crash in native JDK code

Posted by Martin Desruisseaux <ma...@geomatys.fr>.
Le 06/02/13 06:44, Ted Dunning a écrit :
> This is still a very early version of java 7.  It would be good to update
> it.

That would be my preferred approach, if possible...

     Thanks,

         Martin


Re: Apache build infrastructure or Oracle JVM problem: crash in native JDK code

Posted by Ted Dunning <te...@gmail.com>.
Do you have a reference to the bug?

This is still a very early version of java 7.  It would be good to update
it.

On Tue, Feb 5, 2013 at 5:49 PM, Jesse Glick <ty...@gmail.com> wrote:

> On 02/05/2013 07:30 PM, sebb wrote:
>
>> a pure Java application should not be able to cause a JVM crash
>>
>
> Depends entirely on what you mean by “pure Java application”. When you
> have filesystem and Process access, you can do whatever the user shell
> account can do, which certainly includes triggering native errors.
>
> At any rate I personally agree that using java.util.zip in this way should
> not cause JVM crashes—but it does, and the JDK team has explicitly decided
> that this is “not a bug” and closed discussion. The workaround, or
> depending on your perspective the fix, is to ensure that you do not clobber
> an open ZIP file (which would also prevent a lock error on Windows).
>

Re: Apache build infrastructure or Oracle JVM problem: crash in native JDK code

Posted by sebb <se...@gmail.com>.
On 6 February 2013 01:49, Jesse Glick <ty...@gmail.com> wrote:
> On 02/05/2013 07:30 PM, sebb wrote:
>>
>> a pure Java application should not be able to cause a JVM crash
>
>
> Depends entirely on what you mean by “pure Java application”. When you have
> filesystem and Process access, you can do whatever the user shell account
> can do, which certainly includes triggering native errors.

> At any rate I personally agree that using java.util.zip in this way should
> not cause JVM crashes—but it does, and the JDK team has explicitly decided
> that this is “not a bug” and closed discussion.

Astonishing decision.
I hope people complain until they relent.
Seems very unwise to allow that behaviour - it undermines trust in the JVM.

> The workaround, or depending
> on your perspective the fix, is to ensure that you do not clobber an open
> ZIP file (which would also prevent a lock error on Windows).

I trust there aren't more such shortcuts in the JVM.

Re: Apache build infrastructure or Oracle JVM problem: crash in native JDK code

Posted by Jesse Glick <ty...@gmail.com>.
On 02/05/2013 07:30 PM, sebb wrote:
> a pure Java application should not be able to cause a JVM crash

Depends entirely on what you mean by “pure Java application”. When you have filesystem and Process access, you can do whatever the user shell account can do, which 
certainly includes triggering native errors.

At any rate I personally agree that using java.util.zip in this way should not cause JVM crashes—but it does, and the JDK team has explicitly decided that this is “not a 
bug” and closed discussion. The workaround, or depending on your perspective the fix, is to ensure that you do not clobber an open ZIP file (which would also prevent a 
lock error on Windows).

Re: Apache build infrastructure or Oracle JVM problem: crash in native JDK code

Posted by sebb <se...@gmail.com>.
On 5 February 2013 17:59, Jesse Glick <ty...@gmail.com> wrote:
> On 02/04/2013 09:23 AM, Martin Desruisseaux wrote:
>>
>> the crash occurs in the C/C++ implementation of
>> java.util.zip.ZipFile.getEntry(String)
>
>
> This is almost always a symptom of user error rather than a JDK bug; or at
> least any bug report to Oracle along these lines will be rejected. Something
> in your build is trying to modify a JAR file which is still held open by
> some other Java code, and the JRE’s libzip.so (*) is not written to handle
> changes “beneath its feet”—it just crashes rather than reporting an error to
> the code calling getEntry. (**)

Sorry, but a pure Java application should not be able to cause a JVM
crash. In this case,

#  SIGSEGV (0xb) at pc=0xfee952ce, pid=2229, tid=20

A pure Java app can of course cause exceptions.
But the JVM should not crash when running pure Java - if it does, the
JVM is faulty.

However, JNI apps can of course cause crashes such as SIGSEGV.

> You need to be careful to close any JAR files before they are potentially
> overwritten; JarFile.close in a finally-block is the usual idiom, or for
> indirect usage from URLClassLoader you must call its close method (added in
> Java 7).
>
> From the stack trace, it seems the caller is the Jenkins Maven runner (for
> native Maven jobs). The JAR it is loading would be that of a mojo. So my
> suspicion would be that your build is somehow running a mojo from some JAR
> in the workspace (or local repository) which has been modified between the
> time the JAR was opened and the time the crash occurs (the end of the mojo).
>
> Specifically your console [1] suggests that sis-utility is running a
> just-built (SNAPSHOT) version of sis-build-helper:collect-jars, which could
> be dangerous especially if there are concurrent builds using the same
> workspace. Consider publishing release versions of mojo projects rather than
> running snapshots, i.e. version them independently of the main code.
>
>
> (*) On Windows the problem cannot arise because of mandatory file locks: the
> attempt to change the JAR would fail with an IOException.
>
> (**) I have advocated that libzip be used only for the bootstrap class
> loader, with a pure Java (perhaps NIO-based) implementation of ZipFile for
> user code, which would improve robustness and diagnosis if not actually
> correct the error. Last I checked the JRE team declined to put in this
> engineering effort, though a proposal made through OpenJDK might succeed if
> someone wanted to do the work.
>
> [1] https://builds.apache.org/job/sis-jdk7/lastFailedBuild/console

Re: Apache build infrastructure or Oracle JVM problem: crash in native JDK code

Posted by Jesse Glick <ty...@gmail.com>.
On 02/04/2013 09:23 AM, Martin Desruisseaux wrote:
> the crash occurs in the C/C++ implementation of java.util.zip.ZipFile.getEntry(String)

This is almost always a symptom of user error rather than a JDK bug; or at least any bug report to Oracle along these lines will be rejected. Something in your build is 
trying to modify a JAR file which is still held open by some other Java code, and the JRE’s libzip.so (*) is not written to handle changes “beneath its feet”—it just 
crashes rather than reporting an error to the code calling getEntry. (**)

You need to be careful to close any JAR files before they are potentially overwritten; JarFile.close in a finally-block is the usual idiom, or for indirect usage from 
URLClassLoader you must call its close method (added in Java 7).
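
The two idioms mentioned above can be sketched as follows. The class CloseJars and its method names are hypothetical; JarFile.close() and URLClassLoader.close() (the latter added in Java 7) are the calls named in the message.

```java
import java.io.IOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.jar.JarFile;

public class CloseJars {
    /** Direct usage: read something from the JAR, then close it in a finally-block. */
    public static int countEntries(String path) throws IOException {
        JarFile jar = new JarFile(path);
        try {
            return jar.size();
        } finally {
            jar.close(); // release the file before anything may overwrite it
        }
    }

    /** Indirect usage: a URLClassLoader keeps its JARs open until close() is called. */
    public static boolean hasResource(URL jarUrl, String name) throws IOException {
        URLClassLoader loader = new URLClassLoader(new URL[] { jarUrl }, null);
        try {
            return loader.findResource(name) != null;
        } finally {
            loader.close(); // available since Java 7
        }
    }
}
```

On Java 7 and later both resources are also closeable with try-with-resources, which is equivalent to the finally-blocks shown here.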

 From the stack trace, it seems the caller is the Jenkins Maven runner (for native Maven jobs). The JAR it is loading would be that of a mojo. So my suspicion would be 
that your build is somehow running a mojo from some JAR in the workspace (or local repository) which has been modified between the time the JAR was opened and the time 
the crash occurs (the end of the mojo).
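
One way to test the suspicion above, without touching the JVM internals, is to record the size and modification time of the JAR when it is first used and compare them again later in the build. The following is only a diagnostic sketch; the class JarFingerprint is hypothetical and a rewrite that preserves both size and timestamp would go unnoticed.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

public class JarFingerprint {
    private final long size;
    private final long modifiedMillis;

    private JarFingerprint(long size, long modifiedMillis) {
        this.size = size;
        this.modifiedMillis = modifiedMillis;
    }

    /** Records the current size and last-modified time of the given file. */
    public static JarFingerprint of(Path jar) throws IOException {
        return new JarFingerprint(Files.size(jar),
                                  Files.getLastModifiedTime(jar).toMillis());
    }

    /** Returns true if the file still has the recorded size and timestamp. */
    public boolean matches(Path jar) throws IOException {
        JarFingerprint now = of(jar);
        return now.size == size && now.modifiedMillis == modifiedMillis;
    }
}
```

Taking a fingerprint before the mojo runs and checking it at the end would show whether the file on disk changed during that interval.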

Specifically your console [1] suggests that sis-utility is running a just-built (SNAPSHOT) version of sis-build-helper:collect-jars, which could be dangerous especially 
if there are concurrent builds using the same workspace. Consider publishing release versions of mojo projects rather than running snapshots, i.e. version them 
independently of the main code.


(*) On Windows the problem cannot arise because of mandatory file locks: the attempt to change the JAR would fail with an IOException.

(**) I have advocated that libzip be used only for the bootstrap class loader, with a pure Java (perhaps NIO-based) implementation of ZipFile for user code, which would 
improve robustness and diagnosis if not actually correct the error. Last I checked the JRE team declined to put in this engineering effort, though a proposal made through 
OpenJDK might succeed if someone wanted to do the work.
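
To illustrate footnote (**): user code that only needs one entry can already read it through java.util.zip.ZipInputStream, which scans the archive sequentially from an ordinary stream instead of going through ZipFile. As far as I know this avoids the memory-mapped access used by libzip, but that is an assumption and decompression still relies on the JRE's Inflater. The class StreamedZipReader is hypothetical and this is not the implementation proposed in the footnote.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

public class StreamedZipReader {
    /** Returns the bytes of the named entry, or null if the archive has no such entry. */
    public static byte[] readEntry(Path zip, String name) throws IOException {
        try (InputStream in = Files.newInputStream(zip);
             ZipInputStream zin = new ZipInputStream(in)) {
            ZipEntry entry;
            // Walk the entries in file order until the requested name is found.
            while ((entry = zin.getNextEntry()) != null) {
                if (entry.getName().equals(name)) {
                    ByteArrayOutputStream out = new ByteArrayOutputStream();
                    byte[] buffer = new byte[8192];
                    int n;
                    while ((n = zin.read(buffer)) != -1) {
                        out.write(buffer, 0, n);
                    }
                    return out.toByteArray();
                }
            }
            return null;
        }
    }
}
```

The trade-off is a linear scan per lookup, so this suits occasional reads rather than class loading. A file modified during the scan would be expected to surface as an IOException or ZipException rather than a native crash.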

[1] https://builds.apache.org/job/sis-jdk7/lastFailedBuild/console

Apache build infrastructure or Oracle JVM problem: crash in native JDK code

Posted by Martin Desruisseaux <ma...@geomatys.fr>.
Hello build@apache.org (copy to Apache SIS)

I don't know if I'm writing to the appropriate email address (this is my 
third email to build@apache.org)... If I'm writing to the wrong email 
address, would it be possible to redirect me to a more appropriate 
contact point please? We have a problem with either the Apache build 
infrastructure or the Oracle Java Virtual Machine. This is not a problem 
in the Apache SIS project, because pure Java code should never crash the 
JVM. While we could work around the problem, JVM crashes are serious 
issues and I would feel safer if this issue could be addressed.

The JVM on the Jenkins server crashes (I mean crashes in native
libraries, not a Java exception) in almost every build of the JDK7
branch of Apache SIS. We have no native code and we don't explicitly use
any native libraries. According to the log attached to this email, the
crash occurs in the C/C++ implementation of
java.util.zip.ZipFile.getEntry(String). I would like to send a bug
report to http://bugreport.sun.com/bugreport/crash.jsp. But before doing
so, I noticed that the JDK 7 used by Jenkins is 18 months old (built on
June 27, 2011). Is there any way we can test the build on a more recent
JDK 7 version on the Apache Jenkins system before filing a bug report?

     Thanks and regards,

         Martin


RE: Systematic JVM crash on Jenkins

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi again,

As described in my previous mail, I would open a JIRA issue to request an update of JDK 7 on the Solaris node! This bug can be caused by the wrong loop compilation of the Hotspot compiler (e.g. it produces an invalid pointer, which is then used by the bundled native zlib library or similar). If this issue is not fixed by the latest 7 update, I would open a bug report. Can you reproduce this also on another machine?

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Uwe Schindler [mailto:uwe@thetaphi.de]
> Sent: Tuesday, January 22, 2013 6:27 AM
> To: builds@apache.org
> Subject: RE: Systematic JVM crash on Jenkins
> 
> Hi,
> 
> Java 7.0-b147 is the original Java 7 GA release without any updates. You might
> remember the bug that the Apache Lucene developers found in this version [1],
> causing e.g. SIGSEGV on startup of Apache Solr or corrupting indexes. I would
> strongly recommend *not* using this version and updating to 7u1 [2] or later!
> Of course, without more details I cannot tell whether this is the same bug, but
> in general the GA release is unusable in production systems.
> 
> Uwe
> 
> [1] http://blog.thetaphi.de/2011/07/real-story-behind-java-7-ga-bugs.html
> [2] http://blog.thetaphi.de/2011/10/java-7-update-1-released-does-it-fix.html
> 
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
> 
> 
> > -----Original Message-----
> > From: Martin Desruisseaux [mailto:martin.desruisseaux@geomatys.fr]
> > Sent: Tuesday, January 22, 2013 1:21 AM
> > To: builds@apache.org
> > Cc: Apache SIS
> > Subject: Systematic JVM crash on Jenkins
> >
> > Hello all
> >
> > The JDK7 branch of the SIS project is suffering from a systematic JVM
> > crash at building time on Jenkins. A look at the report file suggests
> > that the crash occurs in the call to the
> > java.util.zip.ZipFile.getEntry(String) method. We can workaround the
> > problem by disabling a custom Maven plugin. But a fix would of course
> > be preferred. Should I post a bug report on http://bugreport.sun.com,
> > or is there any other action that could be taken?
> >
> > The Jenkins job suffering from this crash is
> > https://builds.apache.org/job/sis-jdk7/. Below is the crash report:
> >
> > #
> > # A fatal error has been detected by the Java Runtime Environment:
> > #
> > #  SIGSEGV (0xb) at pc=0xfee952d6, pid=4889, tid=20
> > #
> > # JRE version: 7.0-b147
> > # Java VM: Java HotSpot(TM) Server VM (21.0-b17 mixed mode solaris-x86 )
> > # Problematic frame:
> > # C  [libc.so.1+0x252d6]  memcpy+0x166
> > #
> > # Core dump written. Default location: /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/sis-jdk7/core or core.4889
> > #
> > # An error report file with more information is saved as:
> > # /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/sis-jdk7/hs_err_pid4889.log
> > #
> > # If you would like to submit a bug report, please visit:
> > #http://bugreport.sun.com/bugreport/crash.jsp
> > # The crash happened outside the Java Virtual Machine in native code.
> > # See problematic frame for where to report the bug.
> > #
> >
> >
> > Thanks,
> >
> >      Martin



RE: Systematic JVM crash on Jenkins

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi,

Java 7.0-b147 is the original Java 7 GA release without any updates. You might remember the bug that the Apache Lucene developers found in this version [1], causing e.g. SIGSEGV on startup of Apache Solr or corrupting indexes. I would strongly recommend *not* using this version and updating to 7u1 [2] or later! Of course, without more details I cannot tell whether this is the same bug, but in general the GA release is unusable in production systems.

Uwe

[1] http://blog.thetaphi.de/2011/07/real-story-behind-java-7-ga-bugs.html
[2] http://blog.thetaphi.de/2011/10/java-7-update-1-released-does-it-fix.html

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Martin Desruisseaux [mailto:martin.desruisseaux@geomatys.fr]
> Sent: Tuesday, January 22, 2013 1:21 AM
> To: builds@apache.org
> Cc: Apache SIS
> Subject: Systematic JVM crash on Jenkins
> 
> Hello all
> 
> The JDK7 branch of the SIS project is suffering from a systematic JVM
> crash at building time on Jenkins. A look at the report file suggests
> that the crash occurs in the call to the
> java.util.zip.ZipFile.getEntry(String) method. We can workaround the
> problem by disabling a custom Maven plugin. But a fix would of course be
> preferred. Should I post a bug report on http://bugreport.sun.com, or is
> there any other action that could be taken?
> 
> The Jenkins job suffering from this crash is
> https://builds.apache.org/job/sis-jdk7/. Below is the crash report:
> 
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0xfee952d6, pid=4889, tid=20
> #
> # JRE version: 7.0-b147
> # Java VM: Java HotSpot(TM) Server VM (21.0-b17 mixed mode solaris-x86 )
> # Problematic frame:
> # C  [libc.so.1+0x252d6]  memcpy+0x166
> #
> # Core dump written. Default location: /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/sis-jdk7/core or core.4889
> #
> # An error report file with more information is saved as:
> # /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/sis-jdk7/hs_err_pid4889.log
> #
> # If you would like to submit a bug report, please visit:
> #http://bugreport.sun.com/bugreport/crash.jsp
> # The crash happened outside the Java Virtual Machine in native code.
> # See problematic frame for where to report the bug.
> #
> 
> 
> Thanks,
> 
>      Martin



Fwd: Re: Systematic JVM crash on Jenkins

Posted by Martin Desruisseaux <ma...@geomatys.fr>.
Hello all

I'm forwarding this email for information purposes. I'm still trying to
resolve the JVM crash on Jenkins. If I cannot make any progress by next
week, I will disable the plugin which is causing the crash.

     Martin


Re: Systematic JVM crash on Jenkins

Posted by Olivier Lamy <ol...@apache.org>.
Please file a JIRA entry if you want an update of 1.7 on Solaris (I
will have a look tomorrow at upgrading that).

2013/1/30 Ted Dunning <te...@gmail.com>:
> This is an ancient JRE.  The current series is 1.7.0_11 while the JRE in
> the stack trace mentioned here is from the original release.  Many serious
> problems were noted in early Java 7 builds.
>
> I seriously doubt that making a bug report against this version of the JDK
> will be useful.
>
> On the other hand, updating or making a more recent version available could
> well resolve the issue.
>
> On Wed, Jan 30, 2013 at 12:47 PM, Olivier Lamy <ol...@apache.org> wrote:
>
>> when the core file here https://builds.apache.org/userContent/isis/
>> will be 554M transfer will be complete.
>>
>>
>> 2013/1/30 Martin Desruisseaux <ma...@geomatys.fr>:
>> > Hello build master
>> >
>> > We are still suffering from systematic JVM crashes when building Apache
>> SIS
>> > on Jenkins. I don't think that this is the fault of our application
>> since we
>> > have no native code, only pure Java. The crash seems to occur in
>> > java.util.zip.ZipFile code. According to the crash report, core dumps
>> should
>> > exist in the following location:
>> >
>> >
>> /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/sis-jdk7/core
>> > or core.2143
>> >
>> > Would it be possible to send me those files please, so I can write a bug
>> > report to http://bugreport.sun.com/bugreport/crash.jsp ? Or unless
>> there is
>> > another procedure that I should follow?
>> >
>> >     Thanks for your help,
>> >
>> >         Martin
>> >
>> >
>> >
>> > Le 22/01/13 01:21, Martin Desruisseaux a écrit :
>> >>
>> >> Hello all
>> >>
>> >> The JDK7 branch of the SIS project is suffering from a systematic JVM
>> >> crash at building time on Jenkins. A look at the report file suggests
>> that
>> >> the crash occurs in the call to the
>> java.util.zip.ZipFile.getEntry(String)
>> >> method. We can workaround the problem by disabling a custom Maven
>> plugin.
>> >> But a fix would of course be preferred. Should I post a bug report on
>> >> http://bugreport.sun.com, or is there any other action that could be
>> taken?
>> >>
>> >> The Jenkins job suffering from this crash is
>> >> https://builds.apache.org/job/sis-jdk7/. Below is the crash report:
>> >>
>> >> #
>> >> # A fatal error has been detected by the Java Runtime Environment:
>> >> #
>> >> #  SIGSEGV (0xb) at pc=0xfee952d6, pid=4889, tid=20
>> >> #
>> >> # JRE version: 7.0-b147
>> >> # Java VM: Java HotSpot(TM) Server VM (21.0-b17 mixed mode solaris-x86 )
>> >> # Problematic frame:
>> >> # C  [libc.so.1+0x252d6]  memcpy+0x166
>> >> #
>> >> # Core dump written. Default location:
>> >>
>> /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/sis-jdk7/core
>> >> or core.4889
>> >> #
>> >> # An error report file with more information is saved as:
>> >> #
>> >>
>> /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/sis-jdk7/hs_err_pid4889.log
>> >> #
>> >> # If you would like to submit a bug report, please visit:
>> >> #http://bugreport.sun.com/bugreport/crash.jsp
>> >> # The crash happened outside the Java Virtual Machine in native code.
>> >> # See problematic frame for where to report the bug.
>> >> #
>> >
>> >
>>
>>
>>
>> --
>> Olivier Lamy
>> Talend: http://coders.talend.com
>> http://twitter.com/olamy | http://linkedin.com/in/olamy
>>



-- 
Olivier Lamy
Talend: http://coders.talend.com
http://twitter.com/olamy | http://linkedin.com/in/olamy

Re: Systematic JVM crash on Jenkins

Posted by Ted Dunning <te...@gmail.com>.
This is an ancient JRE.  The current series is 1.7.0_11 while the JRE in
the stack trace mentioned here is from the original release.  Many serious
problems were noted in early Java 7 builds.

I seriously doubt that making a bug report against this version of the JDK
will be useful.

On the other hand, updating or making a more recent version available could
well resolve the issue.
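
To confirm which build a given Jenkins slave is really running, a small program along the following lines could be executed as a build step. This is only a sketch: the class name JvmVersionReport is made up for illustration, and java.runtime.version is not guaranteed to be defined on every JVM (it would simply print null there).

```java
// Minimal sketch: print the system properties that identify the running JVM.
// The class name is hypothetical; the property keys are the usual HotSpot ones.
final class JvmVersionReport {
    /** Property keys to report; "java.runtime.version" may be absent on some JVMs. */
    static final String[] KEYS = {
        "java.version", "java.runtime.version",
        "java.vm.name", "java.vm.version",
        "os.name", "os.arch"
    };

    /** Returns one "key = value" line per property in KEYS. */
    static String describe() {
        StringBuilder sb = new StringBuilder();
        for (String key : KEYS) {
            sb.append(key).append(" = ").append(System.getProperty(key)).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

Comparing that output with the "JRE version" and "Java VM" lines of the crash report would show whether the slave is still on the original release.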

On Wed, Jan 30, 2013 at 12:47 PM, Olivier Lamy <ol...@apache.org> wrote:

> when the core file here https://builds.apache.org/userContent/isis/
> will be 554M transfer will be complete.
>
>
> 2013/1/30 Martin Desruisseaux <ma...@geomatys.fr>:
> > Hello build master
> >
> > We are still suffering from systematic JVM crashes when building Apache SIS
> > on Jenkins. I don't think that this is the fault of our application since we
> > have no native code, only pure Java. The crash seems to occur in
> > java.util.zip.ZipFile code. According to the crash report, core dumps should
> > exist in the following location:
> >
> > /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/sis-jdk7/core
> > or core.2143
> >
> > Would it be possible to send me those files please, so I can write a bug
> > report to http://bugreport.sun.com/bugreport/crash.jsp ? Or unless there is
> > another procedure that I should follow?
> >
> >     Thanks for your help,
> >
> >         Martin
> >
> >
> >
> > Le 22/01/13 01:21, Martin Desruisseaux a écrit :
> >>
> >> Hello all
> >>
> >> The JDK7 branch of the SIS project is suffering from a systematic JVM
> >> crash at building time on Jenkins. A look at the report file suggests that
> >> the crash occurs in the call to the java.util.zip.ZipFile.getEntry(String)
> >> method. We can workaround the problem by disabling a custom Maven plugin.
> >> But a fix would of course be preferred. Should I post a bug report on
> >> http://bugreport.sun.com, or is there any other action that could be taken?
> >>
> >> The Jenkins job suffering from this crash is
> >> https://builds.apache.org/job/sis-jdk7/. Below is the crash report:
> >>
> >> #
> >> # A fatal error has been detected by the Java Runtime Environment:
> >> #
> >> #  SIGSEGV (0xb) at pc=0xfee952d6, pid=4889, tid=20
> >> #
> >> # JRE version: 7.0-b147
> >> # Java VM: Java HotSpot(TM) Server VM (21.0-b17 mixed mode solaris-x86 )
> >> # Problematic frame:
> >> # C  [libc.so.1+0x252d6]  memcpy+0x166
> >> #
> >> # Core dump written. Default location:
> >> /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/sis-jdk7/core
> >> or core.4889
> >> #
> >> # An error report file with more information is saved as:
> >> #
> >> /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/sis-jdk7/hs_err_pid4889.log
> >> #
> >> # If you would like to submit a bug report, please visit:
> >> #http://bugreport.sun.com/bugreport/crash.jsp
> >> # The crash happened outside the Java Virtual Machine in native code.
> >> # See problematic frame for where to report the bug.
> >> #
> >
> >
>
>
>
> --
> Olivier Lamy
> Talend: http://coders.talend.com
> http://twitter.com/olamy | http://linkedin.com/in/olamy
>

Re: Systematic JVM crash on Jenkins

Posted by Olivier Lamy <ol...@apache.org>.
The core file is being transferred to https://builds.apache.org/userContent/isis/.
The transfer will be complete when the file there reaches 554M.


2013/1/30 Martin Desruisseaux <ma...@geomatys.fr>:
> Hello build master
>
> We are still suffering from systematic JVM crashes when building Apache SIS
> on Jenkins. I don't think that this is the fault of our application since we
> have no native code, only pure Java. The crash seems to occur in
> java.util.zip.ZipFile code. According to the crash report, core dumps should
> exist in the following location:
>
> /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/sis-jdk7/core
> or core.2143
>
> Would it be possible to send me those files please, so I can write a bug
> report to http://bugreport.sun.com/bugreport/crash.jsp ? Or unless there is
> another procedure that I should follow?
>
>     Thanks for your help,
>
>         Martin
>
>
>
> Le 22/01/13 01:21, Martin Desruisseaux a écrit :
>>
>> Hello all
>>
>> The JDK7 branch of the SIS project is suffering from a systematic JVM
>> crash at building time on Jenkins. A look at the report file suggests that
>> the crash occurs in the call to the java.util.zip.ZipFile.getEntry(String)
>> method. We can workaround the problem by disabling a custom Maven plugin.
>> But a fix would of course be preferred. Should I post a bug report on
>> http://bugreport.sun.com, or is there any other action that could be taken?
>>
>> The Jenkins job suffering from this crash is
>> https://builds.apache.org/job/sis-jdk7/. Below is the crash report:
>>
>> #
>> # A fatal error has been detected by the Java Runtime Environment:
>> #
>> #  SIGSEGV (0xb) at pc=0xfee952d6, pid=4889, tid=20
>> #
>> # JRE version: 7.0-b147
>> # Java VM: Java HotSpot(TM) Server VM (21.0-b17 mixed mode solaris-x86 )
>> # Problematic frame:
>> # C  [libc.so.1+0x252d6]  memcpy+0x166
>> #
>> # Core dump written. Default location:
>> /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/sis-jdk7/core
>> or core.4889
>> #
>> # An error report file with more information is saved as:
>> #
>> /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/sis-jdk7/hs_err_pid4889.log
>> #
>> # If you would like to submit a bug report, please visit:
>> #http://bugreport.sun.com/bugreport/crash.jsp
>> # The crash happened outside the Java Virtual Machine in native code.
>> # See problematic frame for where to report the bug.
>> #
>
>



-- 
Olivier Lamy
Talend: http://coders.talend.com
http://twitter.com/olamy | http://linkedin.com/in/olamy

Re: Systematic JVM crash on Jenkins

Posted by Martin Desruisseaux <ma...@geomatys.fr>.
Hello build master

We are still suffering from systematic JVM crashes when building Apache 
SIS on Jenkins. I don't think that this is the fault of our application 
since we have no native code, only pure Java. The crash seems to occur 
in java.util.zip.ZipFile code. According to the crash report, core dumps 
should exist in the following location:

/zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/sis-jdk7/core 
or core.2143

Would it be possible to send me those files please, so I can write a bug 
report to http://bugreport.sun.com/bugreport/crash.jsp ? Or is there 
another procedure that I should follow?
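
One way to check whether the crash depends on our build at all would be to exercise the same call in isolation on the Jenkins slave. The following is a rough sketch under assumptions: the class name, the temporary file and the entry name are invented for the test, and it may well not trigger the crash, since we do not know which archive or JVM state was involved.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical standalone check, not part of the SIS build.
final class ZipGetEntryCheck {
    /** Writes a one-entry archive to a temporary file, then looks the entry up again. */
    static boolean entryFound(String entryName) {
        try {
            File archive = File.createTempFile("zip-check", ".zip");
            try {
                try (ZipOutputStream out = new ZipOutputStream(new FileOutputStream(archive))) {
                    out.putNextEntry(new ZipEntry(entryName));
                    out.write("placeholder".getBytes("UTF-8"));
                    out.closeEntry();
                }
                try (ZipFile zip = new ZipFile(archive)) {
                    // Same method that the crash report points at.
                    return zip.getEntry(entryName) != null;
                }
            } finally {
                archive.delete();
            }
        } catch (IOException e) {
            throw new IllegalStateException("Could not write or read the test archive", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("entry found: " + entryFound("META-INF/example.txt"));
    }
}
```

If this runs cleanly while the Maven build still crashes, the trigger is more likely tied to a specific JAR file or to how the plugin uses it.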

     Thanks for your help,

         Martin



Le 22/01/13 01:21, Martin Desruisseaux a écrit :
> Hello all
>
> The JDK7 branch of the SIS project is suffering from a systematic JVM 
> crash at building time on Jenkins. A look at the report file suggests 
> that the crash occurs in the call to the 
> java.util.zip.ZipFile.getEntry(String) method. We can workaround the 
> problem by disabling a custom Maven plugin. But a fix would of course 
> be preferred. Should I post a bug report on http://bugreport.sun.com, 
> or is there any other action that could be taken?
>
> The Jenkins job suffering from this crash is 
> https://builds.apache.org/job/sis-jdk7/. Below is the crash report:
>
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0xfee952d6, pid=4889, tid=20
> #
> # JRE version: 7.0-b147
> # Java VM: Java HotSpot(TM) Server VM (21.0-b17 mixed mode solaris-x86 )
> # Problematic frame:
> # C  [libc.so.1+0x252d6]  memcpy+0x166
> #
> # Core dump written. Default location: 
> /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/sis-jdk7/core 
> or core.4889
> #
> # An error report file with more information is saved as:
> # 
> /zonestorage/hudson_solaris/home/hudson/hudson-slave/workspace/sis-jdk7/hs_err_pid4889.log
> #
> # If you would like to submit a bug report, please visit:
> #http://bugreport.sun.com/bugreport/crash.jsp
> # The crash happened outside the Java Virtual Machine in native code.
> # See problematic frame for where to report the bug.
> #


Apache build infrastructure or Oracle JVM problem: crash in native JDK code

Posted by Martin Desruisseaux <ma...@geomatys.fr>.
Hello build@apache.org (copy to Apache SIS)

I don't know if I'm writing to the appropriate email address (this is my 
third email to build@apache.org)... If this is the wrong address, could 
you please redirect me to a more appropriate contact point? We have a 
problem with either the Apache build infrastructure or the Oracle Java 
Virtual Machine. This is not a problem in the Apache SIS project, because 
pure Java code should never crash the JVM. While we could work around the 
problem, JVM crashes are serious issues and I would feel safer if this 
one could be addressed.

The JVM on the Jenkins server crashes (I mean a crash in native 
libraries, not a Java exception) in almost every build of the JDK7 
branch of Apache SIS. We have no native code and we don't explicitly use 
any native libraries. According to the log attached to this email, the 
crash occurs in the C/C++ implementation of 
java.util.zip.ZipFile.getEntry(String). I would like to send a bug 
report to http://bugreport.sun.com/bugreport/crash.jsp. But before doing 
so, I noticed that the JDK 7 used by Jenkins is 18 months old (built on 
June 27, 2011). Is there any way we can test the build with a more recent 
JDK 7 version on the Apache Jenkins system before filing a bug report?
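
For anyone who wants to check the JRE build and the failing native frame in a crash log without reading it by hand, here is a minimal sketch. The class and method names are invented for illustration, and the patterns only assume the header layout seen in the report quoted in this thread; other HotSpot versions may format these lines differently.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: extracts two fields from the "#" header of a crash report.
final class CrashReportHeader {
    // Matches e.g. "# JRE version: 7.0-b147"
    private static final Pattern JRE =
            Pattern.compile("^#\\s*JRE version:\\s*(\\S+)", Pattern.MULTILINE);
    // Matches e.g. "# C  [libc.so.1+0x252d6]  memcpy+0x166"
    private static final Pattern FRAME =
            Pattern.compile("^#\\s*C\\s+\\[([^\\]]+)\\]\\s+(\\S+)", Pattern.MULTILINE);

    /** Returns the JRE version string, or null if the header line is missing. */
    static String jreVersion(String report) {
        Matcher m = JRE.matcher(report);
        return m.find() ? m.group(1) : null;
    }

    /** Returns "library+offset symbol+offset" for a native (C) frame, or null. */
    static String nativeFrame(String report) {
        Matcher m = FRAME.matcher(report);
        return m.find() ? m.group(1) + " " + m.group(2) : null;
    }

    public static void main(String[] args) {
        // Header lines copied from the crash report in this thread.
        String report =
              "# JRE version: 7.0-b147\n"
            + "# Java VM: Java HotSpot(TM) Server VM (21.0-b17 mixed mode solaris-x86 )\n"
            + "# Problematic frame:\n"
            + "# C  [libc.so.1+0x252d6]  memcpy+0x166\n";
        System.out.println("JRE:   " + jreVersion(report));
        System.out.println("Frame: " + nativeFrame(report));
    }
}
```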

     Thanks and regards,

         Martin