Posted to dev@lucene.apache.org by Uwe Schindler <uw...@thetaphi.de> on 2015/07/24 09:23:37 UTC

Solr 5.x tests sometimes fail with PermGen error

Hi,

(This is unrelated to my PermGen improvements to the Ant build yesterday.) This mail is about the test runners. I had to kill builds on MacOSX quite often because the test runner ran into a PermGen error. The problem: killing the Jenkins job was not enough, because the test runners did not even respond to SIGTERM. You had to kill them with kill -9, otherwise they never die. There is nothing the test runner can do, because it has died completely.
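
For illustration only, here is a minimal sketch of the escalation described above (a normal termination request first, then a forced kill) as a supervising process might do it. It is not what Jenkins or the test runner does. The class name ForcefulStop and the helper startChildJvm are hypothetical, the sketch assumes Java 8 or later on the supervising side (Process.destroyForcibly does not exist on JDK 1.7), and which signals are sent is platform-dependent.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.concurrent.TimeUnit;

public class ForcefulStop {
    /**
     * Asks the process to exit, waits up to graceMillis, then escalates to a
     * forced kill. Returns true if the process has exited afterwards.
     */
    public static boolean stop(Process p, long graceMillis) {
        try {
            // Normal termination request (typically SIGTERM on Unix-like systems).
            p.destroy();
            if (p.waitFor(graceMillis, TimeUnit.MILLISECONDS)) {
                return true;
            }
            // Forced termination (typically SIGKILL on Unix-like systems).
            p.destroyForcibly();
            return p.waitFor(graceMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return !p.isAlive();
        }
    }

    /** Demo helper (hypothetical): launches a short-lived child JVM from java.home. */
    public static Process startChildJvm() {
        String java = System.getProperty("java.home") + "/bin/java";
        try {
            return new ProcessBuilder(java, "-version").inheritIO().start();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Process child = startChildJvm();
        System.out.println("stopped=" + stop(child, 2000));
    }
}
```

The grace period matters here: a JVM that is stuck in a PermGen error would simply use up the whole wait before the forced kill takes over.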

The reason is that JDK 1.7 still uses PermGen and the tests seem to load too many classes or whatever, no idea:
http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-MacOSX/2492/
(search for permgen in the logs).

Please fix this before the release; this has happened quite often in the last 2 weeks!
Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de




---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


RE: Solr 5.x tests sometimes fail with PermGen error

Posted by Uwe Schindler <uw...@thetaphi.de>.
FYI: Windows does not fail, although it uses 2 JVMs to run tests, because it excludes HDFS/Hadoop.

Uwe





RE: Solr 5.x tests sometimes fail with PermGen error

Posted by Uwe Schindler <uw...@thetaphi.de>.
I reopened https://issues.apache.org/jira/browse/SOLR-5022.





RE: Solr 5.x tests sometimes fail with PermGen error

Posted by Uwe Schindler <uw...@thetaphi.de>.
Next failure:
http://jenkins.thetaphi.de/job/Lucene-Solr-5.x-MacOSX/2493/console

Had to kill manually. We should really fix that! No build with JDK 1.7 succeeds anymore! So something must have changed that uses tons of permgen!

Uwe





RE: Solr 5.x tests sometimes fail with PermGen error

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi,

> PermGen errors are like the JVM hanging completely -- there's not much the test
> runner can do (because everything is typically and effectively dead on the
> Java side of things).
> 
> It needs to be solved in the code; there are very likely multiple classloaders
> being created and something prevents them from being released.

SolrResourceLoader... Maybe one of the tests creates too many cores in parallel. As I said, this only started happening recently, and you can reproduce it. The reason it mainly happens on MacOSX is that this build runs on Policeman Jenkins with -Dtests.jvms=2, so each JVM has to run longer and fills PermGen with more classes.

So to debug, ideally run the tests with -Dtests.jvms=1; then it will almost certainly hit a PermGen OOM!
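
While reproducing with a single test JVM, it may help to watch how class counts and the class-metadata pool grow over time. The following is only a sketch using the standard java.lang.management beans; the class name ClassSpaceProbe is hypothetical, it is not part of the Lucene/Solr test framework, and the memory pool names it matches ("Perm", "Metaspace") are JVM-specific assumptions.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

public class ClassSpaceProbe {
    /** Returns a one-line summary of class-loading counters and class-metadata pools. */
    public static String snapshot() {
        ClassLoadingMXBean cl = ManagementFactory.getClassLoadingMXBean();
        StringBuilder sb = new StringBuilder();
        sb.append("loaded=").append(cl.getLoadedClassCount())
          .append(" totalLoaded=").append(cl.getTotalLoadedClassCount())
          .append(" unloaded=").append(cl.getUnloadedClassCount());
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            String name = pool.getName();
            // Assumption: HotSpot 7 pools contain "Perm", HotSpot 8+ uses "Metaspace".
            if (name.contains("Perm") || name.contains("Metaspace")) {
                MemoryUsage usage = pool.getUsage();
                if (usage != null) {
                    // getMax() is -1 when the pool has no defined maximum.
                    sb.append(" [").append(name)
                      .append(" used=").append(usage.getUsed())
                      .append(" max=").append(usage.getMax()).append(']');
                }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(snapshot());
    }
}
```

Logging snapshot() between test suites would show whether the loaded-class count keeps climbing while the unloaded count stays flat, which is what a classloader leak would look like.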



Re: Solr 5.x tests sometimes fail with PermGen error

Posted by Dawid Weiss <da...@gmail.com>.
PermGen errors are like the JVM hanging completely -- there's not much the test
runner can do (because everything is typically and effectively dead on
the Java side of things).

It needs to be solved in the code; there are very likely multiple
classloaders being created and something prevents them from being
released.

Dawid
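
To illustrate what "prevents them from being released" means, here is a small, hypothetical sketch (not taken from Solr) that checks whether a classloader becomes collectable once the code is done with it. The class LoaderLeakCheck and its static LEAK list are invented for the example; the list stands in for whatever really pins the loaders (a static cache, a running thread, a registered listener).

```java
import java.lang.ref.WeakReference;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.List;

public class LoaderLeakCheck {
    // Stand-in for the real culprit that keeps loaders strongly reachable.
    static final List<ClassLoader> LEAK = new ArrayList<ClassLoader>();

    /** Creates a throwaway classloader; optionally "leaks" it via the static list. */
    public static WeakReference<ClassLoader> newLoader(boolean leak) {
        ClassLoader loader = new URLClassLoader(new URL[0]);
        if (leak) {
            LEAK.add(loader);
        }
        return new WeakReference<ClassLoader>(loader);
    }

    /** Requests GC a few times; returns true once the loader has been collected. */
    public static boolean isReleased(WeakReference<ClassLoader> ref, int attempts) {
        for (int i = 0; i < attempts && ref.get() != null; i++) {
            System.gc(); // only a hint, so "false" alone is not proof of a leak
            try {
                Thread.sleep(50);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                break;
            }
        }
        return ref.get() == null;
    }

    public static void main(String[] args) {
        System.out.println("dropped loader released: " + isReleased(newLoader(false), 20));
        System.out.println("leaked loader released:  " + isReleased(newLoader(true), 20));
    }
}
```

A loader that stays reachable also keeps every class it defined alive, so on JDK 1.7 each leaked loader permanently occupies a slice of PermGen.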
