You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@lucene.apache.org by Uwe Schindler <uw...@thetaphi.de> on 2013/07/05 15:58:42 UTC

Several builds hanging pecause of permgen

Several Jenkins builds now hang because of permgen. The runner JVM is dead (can only be killed by -9), last example:

http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6360/console

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de




---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: Several builds hanging pecause of permgen

Posted by Shai Erera <se...@gmail.com>.
FWIW, we're testing indexing on hadoop too, using LocalJobRunner, no
MiniDFCluster (since we're not after testing HDFS, only our MapReduce
logic) and it works on Windows and Linux.

Not everything is testable that way though, for instance there were bugs
that were only exposed on a true HDFS (but not GPFS), but since we develop
both on Windows and Linux, we chose not to try force MiniDFSCluster to run
on Windows. I tried w/ the community, but it seemed they didn't care much
about it.

I don't know what Solr tests, and why it needs MiniDFSCluster, just
pointing out that you can test MR jobs on Windows too.

Shai


On Tue, Jul 9, 2013 at 10:27 AM, Uwe Schindler <uw...@thetaphi.de> wrote:

> Hi Mark,
>
> > > I am sure, it's horrible slowdoop!
> >
> > It doesn't bother me that you say slowdoop or say that you said that
> hadoop
> > was made by first year students on twitter - but I will point out it
> kind of
> > makes you look like a dick - a community of people work on that project
> that
> > is very similar to the one that works on Lucene/Solr. And I know you, so
> it
> > doesn't really affect my view of you, but it's kind of unbecoming for a
> PMC
> > chair to besmirch another Apache project by referring to it with a
> derogatory
> > nickname.... FWIW - no real skin off my nose - besmirch away if that's
> what
> > you enjoy.
>
> My problem is not with Hadoop, my problem is that this is part of main
> Solr module and is not in a contrib module (where it should live). It was
> committed without discussing about all the integration problems, like
> test-only JAR files in different versions like we currently have. I have
> nothing against Hadoop, but I feel like we should not use code that has a
> bad design and is not platform independent (I am talking about the test
> dependencies only!). Isn't it possible to test our code without using this
> MiniDFSCluster? This component also seems to be the cause of the permgen
> errors: We need to tune it or replace it by something that works correctly
> - also on Windows!
>
> I agree the "slow" nickname is bad - the parts of hadoop code I looked at
> should be updated, including the Cygwin requirement (MiniDFSCluster, UNIX
> Shell code). But the response time to new issues seems to be low (see the
> Locale problems where you submitted a 2-line patch), so I don't want to
> submit patches or issues at the moment.
>
> Uwe
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
>

RE: Several builds hanging pecause of permgen

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi Mark,

> > I am sure, it's horrible slowdoop!
> 
> It doesn't bother me that you say slowdoop or say that you said that hadoop
> was made by first year students on twitter - but I will point out it kind of
> makes you look like a dick - a community of people work on that project that
> is very similar to the one that works on Lucene/Solr. And I know you, so it
> doesn't really affect my view of you, but it's kind of unbecoming for a PMC
> chair to besmirch another Apache project by referring to it with a derogatory
> nickname.... FWIW - no real skin off my nose - besmirch away if that's what
> you enjoy.

My problem is not with Hadoop, my problem is that this is part of main Solr module and is not in a contrib module (where it should live). It was committed without discussing about all the integration problems, like test-only JAR files in different versions like we currently have. I have nothing against Hadoop, but I feel like we should not use code that has a bad design and is not platform independent (I am talking about the test dependencies only!). Isn't it possible to test our code without using this MiniDFSCluster? This component also seems to be the cause of the permgen errors: We need to tune it or replace it by something that works correctly - also on Windows!

I agree the "slow" nickname is bad - the parts of hadoop code I looked at should be updated, including the Cygwin requirement (MiniDFSCluster, UNIX Shell code). But the response time to new issues seems to be low (see the Locale problems where you submitted a 2-line patch), so I don't want to submit patches or issues at the moment.

Uwe


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: Several builds hanging pecause of permgen

Posted by Mark Miller <ma...@gmail.com>.
On Jul 6, 2013, at 4:16 PM, Uwe Schindler <uw...@thetaphi.de> wrote:

> I am sure, it's horrible slowdoop!

It doesn't bother me that you say slowdoop or say that you said that hadoop was made by first year students on twitter - but I will point out it kind of makes you look like a dick - a community of people work on that project that is very similar to the one that works on Lucene/Solr. And I know you, so it doesn't really affect my view of you, but it's kind of unbecoming for a PMC chair to besmirch another Apache project by referring to it with a derogatory nickname.... FWIW - no real skin off my nose - besmirch away if that's what you enjoy.

- Mark
---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


RE: Several builds hanging pecause of permgen

Posted by Uwe Schindler <uw...@thetaphi.de>.
I know how to reproduce easier:

It’s not the number of multiplier, it’s the number of parallel JVMs: If you run with many CPUs, the test runner runs tests in multiple JVMs, so every JVMs runs less tests. The Jenkins server on MacOSX runs with one JVM only (because the virtual box has only 2 virtual CPUs). So all tests have to share the permgen.

 

Something seems to eat all permgen, not by loading classes, but by interning strings. And that’s the issue here.

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

 <http://www.thetaphi.de/> http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

From: Uwe Schindler [mailto:uwe@thetaphi.de] 
Sent: Tuesday, July 09, 2013 11:48 AM
To: dev@lucene.apache.org
Subject: RE: Several builds hanging pecause of permgen

 

You need to add tests.multiplier=3 to hit the bug faster, otherwise it depends on test seed.

 

Uwe

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

 <http://www.thetaphi.de/> http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

From: Alan Woodward [mailto:alan@flax.co.uk] 
Sent: Tuesday, July 09, 2013 11:17 AM
To: dev@lucene.apache.org
Subject: Re: Several builds hanging pecause of permgen

 

SOLR-4948 changed the test's default CoreContainer from using a hardcoded JulWatcher to whatever would be picked up in the standard config, which will be log4j if that's what's configured for tests.  Could that be the cause?  When did the permgen crashes start?

 

FWIW, I'm not seeing any permgen errors locally (MacOS, Java 7).

 

Alan Woodward
www.flax.co.uk

 

On 9 Jul 2013, at 08:57, Uwe Schindler wrote:

 

I digged more:

In most heap dumps, the failing thread looks like that, so it looks like Log4J is the bad guy - I looked at the code, it does tricky stuff to get stack traces but I found no String.intern() anywhere.

My question: When did we change tests to use log4j? Was this before/after the hadoop change? Maybe it's not related to hadoop, and is more a bug in Log4J. I find similar stack traces all around the internet always linked to Log4J.

"qtp1522342989-5342" prio=5 tid=5342 RUNNABLE
          at java.lang.OutOfMemoryError.<init>(OutOfMemoryError.java:25)
          at java.lang.Throwable.getStackTraceElement(Native Method)
          at java.lang.Throwable.getOurStackTrace(Throwable.java:591)
          at java.lang.Throwable.getStackTrace(Throwable.java:582)
          at sun.reflect.GeneratedMethodAccessor2.invoke(<unknown string>)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          at java.lang.reflect.Method.invoke(Method.java:597)
          at org.apache.log4j.spi.LocationInfo.<init>(LocationInfo.java:139)
            Local Variable: org.apache.log4j.spi.LocationInfo#52
            Local Variable: java.lang.String#28067
            Local Variable: java.lang.Throwable#1
          at org.apache.log4j.spi.LoggingEvent.getLocationInformation(LoggingEvent.java:253)
          at org.apache.solr.util.SolrLogLayout._format(SolrLogLayout.java:122)
            Local Variable: java.lang.StringBuilder#32
            Local Variable: java.lang.String#26042
            Local Variable: org.apache.solr.util.SolrLogLayout#1
          at org.apache.solr.util.SolrLogLayout.format(SolrLogLayout.java:110)
          at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:310)
            Local Variable: org.apache.log4j.ConsoleAppender#1
          at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
          at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
          at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
            Local Variable: org.apache.log4j.helpers.AppenderAttachableImpl#1
          at org.apache.log4j.Category.callAppenders(Category.java:206)
            Local Variable: org.apache.log4j.spi.LoggingEvent#52
            Local Variable: org.apache.log4j.Logger#52
          at org.apache.log4j.Category.forcedLog(Category.java:391)
          at org.apache.log4j.Category.log(Category.java:856)
          at org.slf4j.impl.Log4jLoggerAdapter.info(Log4jLoggerAdapter.java:305)
          at org.apache.solr.core.SolrCore.execute(SolrCore.java:1909)
          at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659)
          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362)
            Local Variable: org.apache.solr.handler.StandardRequestHandler#4
            Local Variable: org.apache.solr.core.SolrCore#3
            Local Variable: java.lang.String#25968
            Local Variable: org.apache.solr.response.SolrQueryResponse#3
            Local Variable: org.eclipse.jetty.servlet.ServletHandler$CachedChain#15
            Local Variable: org.apache.solr.servlet.cache.Method#3
            Local Variable: org.apache.solr.servlet.SolrRequestParsers$1#3
            Local Variable: org.apache.solr.servlet.SolrDispatchFilter#2
            Local Variable: org.apache.solr.core.CoreContainer#3
            Local Variable: org.apache.solr.servlet.SolrRequestParsers#3
            Local Variable: org.apache.solr.core.SolrConfig#3
          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
          at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
          at org.apache.solr.client.solrj.embedded.JettySolrRunner$DebugFilter.doFilter(JettySolrRunner.java:136)
          at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
          at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
            Local Variable: org.eclipse.jetty.servlet.ServletHandler$CachedChain#13
          at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:229)
          at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
          at org.eclipse.jetty.server.handler.GzipHandler.handle(GzipHandler.java:301)
            Local Variable: org.eclipse.jetty.server.handler.GzipHandler#2
          at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1077)
          at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
            Local Variable: org.eclipse.jetty.servlet.ServletHolder#2
            Local Variable: org.eclipse.jetty.servlet.ServletHandler#2
          at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
            Local Variable: org.eclipse.jetty.server.session.SessionHandler#2
          at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
            Local Variable: java.lang.String#26053
            Local Variable: org.eclipse.jetty.servlet.ServletContextHandler#2
            Local Variable: org.eclipse.jetty.server.Request#4
            Local Variable: javax.servlet.DispatcherType#1
            Local Variable: org.eclipse.jetty.server.Response#4
          at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
          at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
          at org.eclipse.jetty.server.Server.handle(Server.java:368)
          at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
            Local Variable: org.eclipse.jetty.server.Server#2
            Local Variable: java.lang.String#25761
          at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
          at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
            Local Variable: org.eclipse.jetty.io.View#5
            Local Variable: org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler#4
          at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)
          at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
            Local Variable: org.eclipse.jetty.http.HttpParser#4
          at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
            Local Variable: org.eclipse.jetty.server.AsyncHttpConnection#4
          at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:628)
            Local Variable: org.eclipse.jetty.io.nio.SelectChannelEndPoint#4
          at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
            Local Variable: org.eclipse.jetty.io.nio.SelectChannelEndPoint$1#4
          at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
          at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
          at java.lang.Thread.run(Thread.java:662)

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de




-----Original Message-----

From: Uwe Schindler [mailto:uwe@thetaphi.de]

Sent: Tuesday, July 09, 2013 9:32 AM

To: dev@lucene.apache.org

Subject: RE: Several builds hanging pecause of permgen

 

I inspected the heap dumps already, 2 things are interesting:

 

- The heap dump file is quite small (only approx. 50 MB), so the permgen

issues start when not even -Xmx is used completely.

- There are lots of (250) reflection class loaders, but no leaks. All classloaders

exist one time, just the reflection-based sun.relect.DelegatingClassLoader

with the stubs for speeding up reflection shows up, but only with one

class/loader. Which is perfectly fine.

- ~2500 classes loaded, which is not much.

 

How do I get the number of interned Strings from jvisualvm?

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

 

-----Original Message-----

From: dawid.weiss@gmail.com [mailto:dawid.weiss@gmail.com] On Behalf

Of Dawid Weiss

Sent: Tuesday, July 09, 2013 9:13 AM

To: dev@lucene.apache.org

Subject: Re: Several builds hanging pecause of permgen

 

What's the core of those permgen leaks, guys? Is it the number of

classes (via different classloaders)? Number of interned strings?

Maybe something is going on that we should reflect on before we raise

the default permgen limit? Our typical permgen headaches were due to

repeated classloader leaks -- is Hadoop using something like that?

 

Dawid

 

On Tue, Jul 9, 2013 at 8:25 AM, Uwe Schindler <uw...@thetaphi.de> wrote:

Hi Mark,

 

The problem with raising permgen is:

- It's Hotspot specific only, so does not work with other JVMs

- Its no longer available in Java 8

 

I would really prefer to maybe tune the tests and maybe not create

so

many nodes in the cloud tests. It looks like the bug happens more

often with higher test multiplier (-Dtests.multiplier=3), so maybe we

can really tune that.

If we want to raise permgen, we have to do it in a similar way like

we do enable the heap dumps - with lots of <condition/> tasks in ANT...

:(

 

Uwe

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

 

-----Original Message-----

From: Mark Miller [mailto:markrmiller@gmail.com]

Sent: Tuesday, July 09, 2013 5:20 AM

To: dev@lucene.apache.org

Subject: Re: Several builds hanging pecause of permgen

 

Looks like we currently don't set the max perm gen for tests, so you

get the default - I think we want to change that regardless - we

don't want it to vary IMO - it should work like Xmx.

 

I think we should just set it to 128 mb, and these tests should have

plenty of room to run.

 

- Mark

 

On Jul 8, 2013, at 11:06 PM, Mark Miller <ma...@gmail.com>

wrote:

 

 

On Jul 8, 2013, at 2:26 PM, Uwe Schindler <uw...@thetaphi.de> wrote:

 

Next one:

http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6395/console

 

I will vote all releases of 4.4 with -1 until this is fixed!

 

You can't veto a release, so it's kind of a hollow threat ;)

 

It hangs on my local computer, too! Tests pass only ½ of the time,

the

remaining time it hangs with permgen errors.

 

Depending on the host OS and java version, I have had to raise the

max

perm gem a bit higher for heavy SolrCloud tests that also start up

hdfs. I think I remember raising it from 96 MB to 128 MB. It's simply

a test resource issue - those tests are very heavy, which is why most

are set to run nightly - I've seen the code run heavily without any

perm gen issues though - and on some of my machines, I don't have to

raise the perm gen at all.

 

- Mark

 

 

Uwe

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

From: dawid.weiss@gmail.com [mailto:dawid.weiss@gmail.com] On

Behalf

Of Dawid Weiss

Sent: Monday, July 08, 2013 2:16 PM

To: dev@lucene.apache.org

Subject: Re: Several builds hanging pecause of permgen

 

 

Not much I can do from my side about permgen errors. There is

really no

way to deal with these from within Java (the same process) -- you

cannot effectively handle anything because your own classes may not

load at all.

 

Dawid

 

On Mon, Jul 8, 2013 at 1:35 PM, Uwe Schindler <uw...@thetaphi.de>

wrote:

Another one, this time on OSX:

http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/617/

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

From: Robert Muir [mailto:rcmuir@gmail.com]

Sent: Sunday, July 07, 2013 11:15 PM

To: dev@lucene.apache.org

Subject: Re: Several builds hanging pecause of permgen

 

When there were leaks from static classes, we added a checker to

LuceneTestCase that looks for RAM > N and fails with debugging

information.

 

I wonder if some similar check is possible for this case (to make

it easier

than going thru heapdumps, and to find issues before crash-time)...

 

On Sun, Jul 7, 2013 at 4:10 PM, Uwe Schindler <uw...@thetaphi.de>

wrote:

Another one:

http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-

Linux/6385/testRepo

rt/

 

 

junit/junit.framework/TestSuite/org_apache_solr_request_SimpleFacetsT

est/

 

Had to be killed with kill -9

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

 

-----Original Message-----

From: Uwe Schindler [mailto:uwe@thetaphi.de]

Sent: Saturday, July 06, 2013 10:16 PM

To: dev@lucene.apache.org

Subject: RE: Several builds hanging pecause of permgen

 

Another one:

http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6375/console

 

I was only able to kill the JVM with kill -9 I am sure, it's horrible

slowdoop!

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

 

-----Original Message-----

From: Uwe Schindler [mailto:uwe@thetaphi.de]

Sent: Friday, July 05, 2013 3:59 PM

To: dev@lucene.apache.org

Subject: Several builds hanging pecause of permgen

 

Several Jenkins builds now hang because of permgen. The runner

JVM

is dead (can only be killed by -9), last example:

 

http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-

Linux/6360/console

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

 

 

 

-------------------------------------------------------------------

-- To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org

For

additional commands, e-mail: dev-help@lucene.apache.org

 

 

--------------------------------------------------------------------

- To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For

additional commands, e-mail: dev-help@lucene.apache.org

 

 

---------------------------------------------------------------------

To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For

additional commands, e-mail: dev-help@lucene.apache.org

 

 

 

 

 

---------------------------------------------------------------------

To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For

additional

commands, e-mail: dev-help@lucene.apache.org

 

 

---------------------------------------------------------------------

To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org

For additional commands, e-mail: dev-help@lucene.apache.org

 

 

---------------------------------------------------------------------

To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org

For additional commands, e-mail: dev-help@lucene.apache.org

 

 

---------------------------------------------------------------------

To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org

For additional commands, e-mail: dev-help@lucene.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

 


RE: Several builds hanging pecause of permgen

Posted by Uwe Schindler <uw...@thetaphi.de>.
You need to add tests.multiplier=3 to hit the bug faster, otherwise it depends on test seed.

 

Uwe

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

 <http://www.thetaphi.de/> http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

From: Alan Woodward [mailto:alan@flax.co.uk] 
Sent: Tuesday, July 09, 2013 11:17 AM
To: dev@lucene.apache.org
Subject: Re: Several builds hanging pecause of permgen

 

SOLR-4948 changed the test's default CoreContainer from using a hardcoded JulWatcher to whatever would be picked up in the standard config, which will be log4j if that's what's configured for tests.  Could that be the cause?  When did the permgen crashes start?

 

FWIW, I'm not seeing any permgen errors locally (MacOS, Java 7).

 

Alan Woodward
www.flax.co.uk

 

On 9 Jul 2013, at 08:57, Uwe Schindler wrote:





I digged more:

In most heap dumps, the failing thread looks like that, so it looks like Log4J is the bad guy - I looked at the code, it does tricky stuff to get stack traces but I found no String.intern() anywhere.

My question: When did we change tests to use log4j? Was this before/after the hadoop change? Maybe it's not related to hadoop, and is more a bug in Log4J. I find similar stack traces all around the internet always linked to Log4J.

"qtp1522342989-5342" prio=5 tid=5342 RUNNABLE
          at java.lang.OutOfMemoryError.<init>(OutOfMemoryError.java:25)
          at java.lang.Throwable.getStackTraceElement(Native Method)
          at java.lang.Throwable.getOurStackTrace(Throwable.java:591)
          at java.lang.Throwable.getStackTrace(Throwable.java:582)
          at sun.reflect.GeneratedMethodAccessor2.invoke(<unknown string>)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          at java.lang.reflect.Method.invoke(Method.java:597)
          at org.apache.log4j.spi.LocationInfo.<init>(LocationInfo.java:139)
            Local Variable: org.apache.log4j.spi.LocationInfo#52
            Local Variable: java.lang.String#28067
            Local Variable: java.lang.Throwable#1
          at org.apache.log4j.spi.LoggingEvent.getLocationInformation(LoggingEvent.java:253)
          at org.apache.solr.util.SolrLogLayout._format(SolrLogLayout.java:122)
            Local Variable: java.lang.StringBuilder#32
            Local Variable: java.lang.String#26042
            Local Variable: org.apache.solr.util.SolrLogLayout#1
          at org.apache.solr.util.SolrLogLayout.format(SolrLogLayout.java:110)
          at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:310)
            Local Variable: org.apache.log4j.ConsoleAppender#1
          at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
          at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
          at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
            Local Variable: org.apache.log4j.helpers.AppenderAttachableImpl#1
          at org.apache.log4j.Category.callAppenders(Category.java:206)
            Local Variable: org.apache.log4j.spi.LoggingEvent#52
            Local Variable: org.apache.log4j.Logger#52
          at org.apache.log4j.Category.forcedLog(Category.java:391)
          at org.apache.log4j.Category.log(Category.java:856)
          at org.slf4j.impl.Log4jLoggerAdapter.info(Log4jLoggerAdapter.java:305)
          at org.apache.solr.core.SolrCore.execute(SolrCore.java:1909)
          at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659)
          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362)
            Local Variable: org.apache.solr.handler.StandardRequestHandler#4
            Local Variable: org.apache.solr.core.SolrCore#3
            Local Variable: java.lang.String#25968
            Local Variable: org.apache.solr.response.SolrQueryResponse#3
            Local Variable: org.eclipse.jetty.servlet.ServletHandler$CachedChain#15
            Local Variable: org.apache.solr.servlet.cache.Method#3
            Local Variable: org.apache.solr.servlet.SolrRequestParsers$1#3
            Local Variable: org.apache.solr.servlet.SolrDispatchFilter#2
            Local Variable: org.apache.solr.core.CoreContainer#3
            Local Variable: org.apache.solr.servlet.SolrRequestParsers#3
            Local Variable: org.apache.solr.core.SolrConfig#3
          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
          at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
          at org.apache.solr.client.solrj.embedded.JettySolrRunner$DebugFilter.doFilter(JettySolrRunner.java:136)
          at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
          at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
            Local Variable: org.eclipse.jetty.servlet.ServletHandler$CachedChain#13
          at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:229)
          at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
          at org.eclipse.jetty.server.handler.GzipHandler.handle(GzipHandler.java:301)
            Local Variable: org.eclipse.jetty.server.handler.GzipHandler#2
          at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1077)
          at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
            Local Variable: org.eclipse.jetty.servlet.ServletHolder#2
            Local Variable: org.eclipse.jetty.servlet.ServletHandler#2
          at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
            Local Variable: org.eclipse.jetty.server.session.SessionHandler#2
          at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
            Local Variable: java.lang.String#26053
            Local Variable: org.eclipse.jetty.servlet.ServletContextHandler#2
            Local Variable: org.eclipse.jetty.server.Request#4
            Local Variable: javax.servlet.DispatcherType#1
            Local Variable: org.eclipse.jetty.server.Response#4
          at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
          at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
          at org.eclipse.jetty.server.Server.handle(Server.java:368)
          at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
            Local Variable: org.eclipse.jetty.server.Server#2
            Local Variable: java.lang.String#25761
          at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
          at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
            Local Variable: org.eclipse.jetty.io.View#5
            Local Variable: org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler#4
          at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)
          at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
            Local Variable: org.eclipse.jetty.http.HttpParser#4
          at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
            Local Variable: org.eclipse.jetty.server.AsyncHttpConnection#4
          at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:628)
            Local Variable: org.eclipse.jetty.io.nio.SelectChannelEndPoint#4
          at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
            Local Variable: org.eclipse.jetty.io.nio.SelectChannelEndPoint$1#4
          at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
          at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
          at java.lang.Thread.run(Thread.java:662)

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de





-----Original Message-----

From: Uwe Schindler [mailto:uwe@thetaphi.de]

Sent: Tuesday, July 09, 2013 9:32 AM

To: dev@lucene.apache.org

Subject: RE: Several builds hanging pecause of permgen

 

I inspected the heap dumps already, 2 things are interesting:

 

- The heap dump file is quite small (only approx. 50 MB), so the permgen

issues start when not even -Xmx is used completely.

- There are lots of (250) reflection class loaders, but no leaks. All classloaders

exist one time, just the reflection-based sun.relect.DelegatingClassLoader

with the stubs for speeding up reflection shows up, but only with one

class/loader. Which is perfectly fine.

- ~2500 classes loaded, which is not much.

 

How do I get the number of interned Strings from jvisualvm?

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

 

-----Original Message-----

From: dawid.weiss@gmail.com [mailto:dawid.weiss@gmail.com] On Behalf

Of Dawid Weiss

Sent: Tuesday, July 09, 2013 9:13 AM

To: dev@lucene.apache.org

Subject: Re: Several builds hanging pecause of permgen

 

What's the core of those permgen leaks, guys? Is it the number of

classes (via different classloaders)? Number of interned strings?

Maybe something is going on that we should reflect on before we raise

the default permgen limit? Our typical permgen headaches were due to

repeated classloader leaks -- is Hadoop using something like that?

 

Dawid

 

On Tue, Jul 9, 2013 at 8:25 AM, Uwe Schindler <uw...@thetaphi.de> wrote:

Hi Mark,

 

The problem with raising permgen is:

- It's Hotspot specific only, so does not work with other JVMs

- Its no longer available in Java 8

 

I would really prefer to maybe tune the tests and maybe not create

so

many nodes in the cloud tests. It looks like the bug happens more

often with higher test multiplier (-Dtests.multiplier=3), so maybe we

can really tune that.

If we want to raise permgen, we have to do it in a similar way like

we do enable the heap dumps - with lots of <condition/> tasks in ANT...

:(

 

Uwe

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

 

-----Original Message-----

From: Mark Miller [mailto:markrmiller@gmail.com]

Sent: Tuesday, July 09, 2013 5:20 AM

To: dev@lucene.apache.org

Subject: Re: Several builds hanging pecause of permgen

 

Looks like we currently don't set the max perm gen for tests, so you

get the default - I think we want to change that regardless - we

don't want it to vary IMO - it should work like Xmx.

 

I think we should just set it to 128 mb, and these tests should have

plenty of room to run.

 

- Mark

 

On Jul 8, 2013, at 11:06 PM, Mark Miller <ma...@gmail.com>

wrote:

 

 

On Jul 8, 2013, at 2:26 PM, Uwe Schindler <uw...@thetaphi.de> wrote:

 

Next one:

http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6395/console

 

I will vote all releases of 4.4 with -1 until this is fixed!

 

You can't veto a release, so it's kind of a hollow threat ;)

 

It hangs on my local computer, too! Tests pass only ½ of the time,

the

remaining time it hangs with permgen errors.

 

Depending on the host OS and java version, I have had to raise the

max

perm gem a bit higher for heavy SolrCloud tests that also start up

hdfs. I think I remember raising it from 96 MB to 128 MB. It's simply

a test resource issue - those tests are very heavy, which is why most

are set to run nightly - I've seen the code run heavily without any

perm gen issues though - and on some of my machines, I don't have to

raise the perm gen at all.

 

- Mark

 

 

Uwe

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

From: dawid.weiss@gmail.com [mailto:dawid.weiss@gmail.com] On

Behalf

Of Dawid Weiss

Sent: Monday, July 08, 2013 2:16 PM

To: dev@lucene.apache.org

Subject: Re: Several builds hanging pecause of permgen

 

 

Not much I can do from my side about permgen errors. There is

really no

way to deal with these from within Java (the same process) -- you

cannot effectively handle anything because your own classes may not

load at all.

 

Dawid

 

On Mon, Jul 8, 2013 at 1:35 PM, Uwe Schindler <uw...@thetaphi.de>

wrote:

Another one, this time on OSX:

http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/617/

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

From: Robert Muir [mailto:rcmuir@gmail.com]

Sent: Sunday, July 07, 2013 11:15 PM

To: dev@lucene.apache.org

Subject: Re: Several builds hanging pecause of permgen

 

When there were leaks from static classes, we added a checker to

LuceneTestCase that looks for RAM > N and fails with debugging

information.

 

I wonder if some similar check is possible for this case (to make

it easier

than going thru heapdumps, and to find issues before crash-time)...

 

On Sun, Jul 7, 2013 at 4:10 PM, Uwe Schindler <uw...@thetaphi.de>

wrote:

Another one:

http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-

Linux/6385/testRepo

rt/

 

 

junit/junit.framework/TestSuite/org_apache_solr_request_SimpleFacetsT

est/

 

Had to be killed with kill -9

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

 

-----Original Message-----

From: Uwe Schindler [mailto:uwe@thetaphi.de]

Sent: Saturday, July 06, 2013 10:16 PM

To: dev@lucene.apache.org

Subject: RE: Several builds hanging pecause of permgen

 

Another one:

http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6375/console

 

I was only able to kill the JVM with kill -9 I am sure, it's horrible

slowdoop!

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

 

-----Original Message-----

From: Uwe Schindler [mailto:uwe@thetaphi.de]

Sent: Friday, July 05, 2013 3:59 PM

To: dev@lucene.apache.org

Subject: Several builds hanging pecause of permgen

 

Several Jenkins builds now hang because of permgen. The runner

JVM

is dead (can only be killed by -9), last example:

 

http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-

Linux/6360/console

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

 

 

 

-------------------------------------------------------------------

-- To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org

For

additional commands, e-mail: dev-help@lucene.apache.org

 

 

--------------------------------------------------------------------

- To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For

additional commands, e-mail: dev-help@lucene.apache.org

 

 

---------------------------------------------------------------------

To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For

additional commands, e-mail: dev-help@lucene.apache.org

 

 

 

 

 

---------------------------------------------------------------------

To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For

additional

commands, e-mail: dev-help@lucene.apache.org

 

 

---------------------------------------------------------------------

To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org

For additional commands, e-mail: dev-help@lucene.apache.org

 

 

---------------------------------------------------------------------

To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org

For additional commands, e-mail: dev-help@lucene.apache.org

 

 

---------------------------------------------------------------------

To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org

For additional commands, e-mail: dev-help@lucene.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

 


RE: Several builds hanging pecause of permgen

Posted by Uwe Schindler <uw...@thetaphi.de>.
SOLR-4948 was committed after hadoop and to trunk first. But the permgen bugs started to appear on 4.x, too (before you committed).

 

I think that somehow Hadoop is causing this.

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

 <http://www.thetaphi.de/> http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

From: Alan Woodward [mailto:alan@flax.co.uk] 
Sent: Tuesday, July 09, 2013 11:17 AM
To: dev@lucene.apache.org
Subject: Re: Several builds hanging pecause of permgen

 

SOLR-4948 changed the test's default CoreContainer from using a hardcoded JulWatcher to whatever would be picked up in the standard config, which will be log4j if that's what's configured for tests.  Could that be the cause?  When did the permgen crashes start?

 

FWIW, I'm not seeing any permgen errors locally (MacOS, Java 7).

 

Alan Woodward
www.flax.co.uk

 

On 9 Jul 2013, at 08:57, Uwe Schindler wrote:





I digged more:

In most heap dumps, the failing thread looks like that, so it looks like Log4J is the bad guy - I looked at the code, it does tricky stuff to get stack traces but I found no String.intern() anywhere.

My question: When did we change tests to use log4j? Was this before/after the hadoop change? Maybe it's not related to hadoop, and is more a bug in Log4J. I find similar stack traces all around the internet always linked to Log4J.

"qtp1522342989-5342" prio=5 tid=5342 RUNNABLE
          at java.lang.OutOfMemoryError.<init>(OutOfMemoryError.java:25)
          at java.lang.Throwable.getStackTraceElement(Native Method)
          at java.lang.Throwable.getOurStackTrace(Throwable.java:591)
          at java.lang.Throwable.getStackTrace(Throwable.java:582)
          at sun.reflect.GeneratedMethodAccessor2.invoke(<unknown string>)
          at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
          at java.lang.reflect.Method.invoke(Method.java:597)
          at org.apache.log4j.spi.LocationInfo.<init>(LocationInfo.java:139)
            Local Variable: org.apache.log4j.spi.LocationInfo#52
            Local Variable: java.lang.String#28067
            Local Variable: java.lang.Throwable#1
          at org.apache.log4j.spi.LoggingEvent.getLocationInformation(LoggingEvent.java:253)
          at org.apache.solr.util.SolrLogLayout._format(SolrLogLayout.java:122)
            Local Variable: java.lang.StringBuilder#32
            Local Variable: java.lang.String#26042
            Local Variable: org.apache.solr.util.SolrLogLayout#1
          at org.apache.solr.util.SolrLogLayout.format(SolrLogLayout.java:110)
          at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:310)
            Local Variable: org.apache.log4j.ConsoleAppender#1
          at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
          at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
          at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
            Local Variable: org.apache.log4j.helpers.AppenderAttachableImpl#1
          at org.apache.log4j.Category.callAppenders(Category.java:206)
            Local Variable: org.apache.log4j.spi.LoggingEvent#52
            Local Variable: org.apache.log4j.Logger#52
          at org.apache.log4j.Category.forcedLog(Category.java:391)
          at org.apache.log4j.Category.log(Category.java:856)
          at org.slf4j.impl.Log4jLoggerAdapter.info(Log4jLoggerAdapter.java:305)
          at org.apache.solr.core.SolrCore.execute(SolrCore.java:1909)
          at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659)
          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362)
            Local Variable: org.apache.solr.handler.StandardRequestHandler#4
            Local Variable: org.apache.solr.core.SolrCore#3
            Local Variable: java.lang.String#25968
            Local Variable: org.apache.solr.response.SolrQueryResponse#3
            Local Variable: org.eclipse.jetty.servlet.ServletHandler$CachedChain#15
            Local Variable: org.apache.solr.servlet.cache.Method#3
            Local Variable: org.apache.solr.servlet.SolrRequestParsers$1#3
            Local Variable: org.apache.solr.servlet.SolrDispatchFilter#2
            Local Variable: org.apache.solr.core.CoreContainer#3
            Local Variable: org.apache.solr.servlet.SolrRequestParsers#3
            Local Variable: org.apache.solr.core.SolrConfig#3
          at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
          at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
          at org.apache.solr.client.solrj.embedded.JettySolrRunner$DebugFilter.doFilter(JettySolrRunner.java:136)
          at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
          at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
            Local Variable: org.eclipse.jetty.servlet.ServletHandler$CachedChain#13
          at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:229)
          at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
          at org.eclipse.jetty.server.handler.GzipHandler.handle(GzipHandler.java:301)
            Local Variable: org.eclipse.jetty.server.handler.GzipHandler#2
          at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1077)
          at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
            Local Variable: org.eclipse.jetty.servlet.ServletHolder#2
            Local Variable: org.eclipse.jetty.servlet.ServletHandler#2
          at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
            Local Variable: org.eclipse.jetty.server.session.SessionHandler#2
          at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
            Local Variable: java.lang.String#26053
            Local Variable: org.eclipse.jetty.servlet.ServletContextHandler#2
            Local Variable: org.eclipse.jetty.server.Request#4
            Local Variable: javax.servlet.DispatcherType#1
            Local Variable: org.eclipse.jetty.server.Response#4
          at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
          at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
          at org.eclipse.jetty.server.Server.handle(Server.java:368)
          at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
            Local Variable: org.eclipse.jetty.server.Server#2
            Local Variable: java.lang.String#25761
          at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
          at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
            Local Variable: org.eclipse.jetty.io.View#5
            Local Variable: org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler#4
          at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)
          at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
            Local Variable: org.eclipse.jetty.http.HttpParser#4
          at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
            Local Variable: org.eclipse.jetty.server.AsyncHttpConnection#4
          at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:628)
            Local Variable: org.eclipse.jetty.io.nio.SelectChannelEndPoint#4
          at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
            Local Variable: org.eclipse.jetty.io.nio.SelectChannelEndPoint$1#4
          at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
          at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
          at java.lang.Thread.run(Thread.java:662)

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de





-----Original Message-----

From: Uwe Schindler [mailto:uwe@thetaphi.de]

Sent: Tuesday, July 09, 2013 9:32 AM

To: dev@lucene.apache.org

Subject: RE: Several builds hanging pecause of permgen

 

I inspected the heap dumps already, 2 things are interesting:

 

- The heap dump file is quite small (only approx. 50 MB), so the permgen

issues start when not even -Xmx is used completely.

- There are lots of (250) reflection class loaders, but no leaks. All classloaders

exist one time, just the reflection-based sun.relect.DelegatingClassLoader

with the stubs for speeding up reflection shows up, but only with one

class/loader. Which is perfectly fine.

- ~2500 classes loaded, which is not much.

 

How do I get the number of interned Strings from jvisualvm?

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

 

-----Original Message-----

From: dawid.weiss@gmail.com [mailto:dawid.weiss@gmail.com] On Behalf

Of Dawid Weiss

Sent: Tuesday, July 09, 2013 9:13 AM

To: dev@lucene.apache.org

Subject: Re: Several builds hanging pecause of permgen

 

What's the core of those permgen leaks, guys? Is it the number of

classes (via different classloaders)? Number of interned strings?

Maybe something is going on that we should reflect on before we raise

the default permgen limit? Our typical permgen headaches were due to

repeated classloader leaks -- is Hadoop using something like that?

 

Dawid

 

On Tue, Jul 9, 2013 at 8:25 AM, Uwe Schindler <uw...@thetaphi.de> wrote:

Hi Mark,

 

The problem with raising permgen is:

- It's Hotspot specific only, so does not work with other JVMs

- Its no longer available in Java 8

 

I would really prefer to maybe tune the tests and maybe not create

so

many nodes in the cloud tests. It looks like the bug happens more

often with higher test multiplier (-Dtests.multiplier=3), so maybe we

can really tune that.

If we want to raise permgen, we have to do it in a similar way like

we do enable the heap dumps - with lots of <condition/> tasks in ANT...

:(

 

Uwe

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

 

-----Original Message-----

From: Mark Miller [mailto:markrmiller@gmail.com]

Sent: Tuesday, July 09, 2013 5:20 AM

To: dev@lucene.apache.org

Subject: Re: Several builds hanging pecause of permgen

 

Looks like we currently don't set the max perm gen for tests, so you

get the default - I think we want to change that regardless - we

don't want it to vary IMO - it should work like Xmx.

 

I think we should just set it to 128 mb, and these tests should have

plenty of room to run.

 

- Mark

 

On Jul 8, 2013, at 11:06 PM, Mark Miller <ma...@gmail.com>

wrote:

 

 

On Jul 8, 2013, at 2:26 PM, Uwe Schindler <uw...@thetaphi.de> wrote:

 

Next one:

http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6395/console

 

I will vote all releases of 4.4 with -1 until this is fixed!

 

You can't veto a release, so it's kind of a hollow threat ;)

 

It hangs on my local computer, too! Tests pass only ½ of the time,

the

remaining time it hangs with permgen errors.

 

Depending on the host OS and java version, I have had to raise the

max

perm gem a bit higher for heavy SolrCloud tests that also start up

hdfs. I think I remember raising it from 96 MB to 128 MB. It's simply

a test resource issue - those tests are very heavy, which is why most

are set to run nightly - I've seen the code run heavily without any

perm gen issues though - and on some of my machines, I don't have to

raise the perm gen at all.

 

- Mark

 

 

Uwe

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

From: dawid.weiss@gmail.com [mailto:dawid.weiss@gmail.com] On

Behalf

Of Dawid Weiss

Sent: Monday, July 08, 2013 2:16 PM

To: dev@lucene.apache.org

Subject: Re: Several builds hanging pecause of permgen

 

 

Not much I can do from my side about permgen errors. There is

really no

way to deal with these from within Java (the same process) -- you

cannot effectively handle anything because your own classes may not

load at all.

 

Dawid

 

On Mon, Jul 8, 2013 at 1:35 PM, Uwe Schindler <uw...@thetaphi.de>

wrote:

Another one, this time on OSX:

http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/617/

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

From: Robert Muir [mailto:rcmuir@gmail.com]

Sent: Sunday, July 07, 2013 11:15 PM

To: dev@lucene.apache.org

Subject: Re: Several builds hanging pecause of permgen

 

When there were leaks from static classes, we added a checker to

LuceneTestCase that looks for RAM > N and fails with debugging

information.

 

I wonder if some similar check is possible for this case (to make

it easier

than going thru heapdumps, and to find issues before crash-time)...

 

On Sun, Jul 7, 2013 at 4:10 PM, Uwe Schindler <uw...@thetaphi.de>

wrote:

Another one:

http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-

Linux/6385/testRepo

rt/

 

 

junit/junit.framework/TestSuite/org_apache_solr_request_SimpleFacetsT

est/

 

Had to be killed with kill -9

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

 

-----Original Message-----

From: Uwe Schindler [mailto:uwe@thetaphi.de]

Sent: Saturday, July 06, 2013 10:16 PM

To: dev@lucene.apache.org

Subject: RE: Several builds hanging pecause of permgen

 

Another one:

http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6375/console

 

I was only able to kill the JVM with kill -9 I am sure, it's horrible

slowdoop!

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

 

-----Original Message-----

From: Uwe Schindler [mailto:uwe@thetaphi.de]

Sent: Friday, July 05, 2013 3:59 PM

To: dev@lucene.apache.org

Subject: Several builds hanging pecause of permgen

 

Several Jenkins builds now hang because of permgen. The runner

JVM

is dead (can only be killed by -9), last example:

 

http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-

Linux/6360/console

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

 

 

 

-------------------------------------------------------------------

-- To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org

For

additional commands, e-mail: dev-help@lucene.apache.org

 

 

--------------------------------------------------------------------

- To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For

additional commands, e-mail: dev-help@lucene.apache.org

 

 

---------------------------------------------------------------------

To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For

additional commands, e-mail: dev-help@lucene.apache.org

 

 

 

 

 

---------------------------------------------------------------------

To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For

additional

commands, e-mail: dev-help@lucene.apache.org

 

 

---------------------------------------------------------------------

To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org

For additional commands, e-mail: dev-help@lucene.apache.org

 

 

---------------------------------------------------------------------

To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org

For additional commands, e-mail: dev-help@lucene.apache.org

 

 

---------------------------------------------------------------------

To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org

For additional commands, e-mail: dev-help@lucene.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

 


Re: Several builds hanging pecause of permgen

Posted by Alan Woodward <al...@flax.co.uk>.
SOLR-4948 changed the test's default CoreContainer from using a hardcoded JulWatcher to whatever would be picked up in the standard config, which will be log4j if that's what's configured for tests.  Could that be the cause?  When did the permgen crashes start?

FWIW, I'm not seeing any permgen errors locally (MacOS, Java 7).

Alan Woodward
www.flax.co.uk


On 9 Jul 2013, at 08:57, Uwe Schindler wrote:

> I digged more:
> 
> In most heap dumps, the failing thread looks like that, so it looks like Log4J is the bad guy - I looked at the code, it does tricky stuff to get stack traces but I found no String.intern() anywhere.
> 
> My question: When did we change tests to use log4j? Was this before/after the hadoop change? Maybe it's not related to hadoop, and is more a bug in Log4J. I find similar stack traces all around the internet always linked to Log4J.
> 
> "qtp1522342989-5342" prio=5 tid=5342 RUNNABLE
> 	at java.lang.OutOfMemoryError.<init>(OutOfMemoryError.java:25)
> 	at java.lang.Throwable.getStackTraceElement(Native Method)
> 	at java.lang.Throwable.getOurStackTrace(Throwable.java:591)
> 	at java.lang.Throwable.getStackTrace(Throwable.java:582)
> 	at sun.reflect.GeneratedMethodAccessor2.invoke(<unknown string>)
> 	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> 	at java.lang.reflect.Method.invoke(Method.java:597)
> 	at org.apache.log4j.spi.LocationInfo.<init>(LocationInfo.java:139)
> 	   Local Variable: org.apache.log4j.spi.LocationInfo#52
> 	   Local Variable: java.lang.String#28067
> 	   Local Variable: java.lang.Throwable#1
> 	at org.apache.log4j.spi.LoggingEvent.getLocationInformation(LoggingEvent.java:253)
> 	at org.apache.solr.util.SolrLogLayout._format(SolrLogLayout.java:122)
> 	   Local Variable: java.lang.StringBuilder#32
> 	   Local Variable: java.lang.String#26042
> 	   Local Variable: org.apache.solr.util.SolrLogLayout#1
> 	at org.apache.solr.util.SolrLogLayout.format(SolrLogLayout.java:110)
> 	at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:310)
> 	   Local Variable: org.apache.log4j.ConsoleAppender#1
> 	at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
> 	at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
> 	at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
> 	   Local Variable: org.apache.log4j.helpers.AppenderAttachableImpl#1
> 	at org.apache.log4j.Category.callAppenders(Category.java:206)
> 	   Local Variable: org.apache.log4j.spi.LoggingEvent#52
> 	   Local Variable: org.apache.log4j.Logger#52
> 	at org.apache.log4j.Category.forcedLog(Category.java:391)
> 	at org.apache.log4j.Category.log(Category.java:856)
> 	at org.slf4j.impl.Log4jLoggerAdapter.info(Log4jLoggerAdapter.java:305)
> 	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1909)
> 	at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659)
> 	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362)
> 	   Local Variable: org.apache.solr.handler.StandardRequestHandler#4
> 	   Local Variable: org.apache.solr.core.SolrCore#3
> 	   Local Variable: java.lang.String#25968
> 	   Local Variable: org.apache.solr.response.SolrQueryResponse#3
> 	   Local Variable: org.eclipse.jetty.servlet.ServletHandler$CachedChain#15
> 	   Local Variable: org.apache.solr.servlet.cache.Method#3
> 	   Local Variable: org.apache.solr.servlet.SolrRequestParsers$1#3
> 	   Local Variable: org.apache.solr.servlet.SolrDispatchFilter#2
> 	   Local Variable: org.apache.solr.core.CoreContainer#3
> 	   Local Variable: org.apache.solr.servlet.SolrRequestParsers#3
> 	   Local Variable: org.apache.solr.core.SolrConfig#3
> 	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
> 	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
> 	at org.apache.solr.client.solrj.embedded.JettySolrRunner$DebugFilter.doFilter(JettySolrRunner.java:136)
> 	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
> 	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
> 	   Local Variable: org.eclipse.jetty.servlet.ServletHandler$CachedChain#13
> 	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:229)
> 	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
> 	at org.eclipse.jetty.server.handler.GzipHandler.handle(GzipHandler.java:301)
> 	   Local Variable: org.eclipse.jetty.server.handler.GzipHandler#2
> 	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1077)
> 	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
> 	   Local Variable: org.eclipse.jetty.servlet.ServletHolder#2
> 	   Local Variable: org.eclipse.jetty.servlet.ServletHandler#2
> 	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
> 	   Local Variable: org.eclipse.jetty.server.session.SessionHandler#2
> 	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
> 	   Local Variable: java.lang.String#26053
> 	   Local Variable: org.eclipse.jetty.servlet.ServletContextHandler#2
> 	   Local Variable: org.eclipse.jetty.server.Request#4
> 	   Local Variable: javax.servlet.DispatcherType#1
> 	   Local Variable: org.eclipse.jetty.server.Response#4
> 	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
> 	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
> 	at org.eclipse.jetty.server.Server.handle(Server.java:368)
> 	at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
> 	   Local Variable: org.eclipse.jetty.server.Server#2
> 	   Local Variable: java.lang.String#25761
> 	at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
> 	at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
> 	   Local Variable: org.eclipse.jetty.io.View#5
> 	   Local Variable: org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler#4
> 	at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)
> 	at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
> 	   Local Variable: org.eclipse.jetty.http.HttpParser#4
> 	at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
> 	   Local Variable: org.eclipse.jetty.server.AsyncHttpConnection#4
> 	at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:628)
> 	   Local Variable: org.eclipse.jetty.io.nio.SelectChannelEndPoint#4
> 	at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
> 	   Local Variable: org.eclipse.jetty.io.nio.SelectChannelEndPoint$1#4
> 	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
> 	at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
> 	at java.lang.Thread.run(Thread.java:662)
> 
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
> 
> 
>> -----Original Message-----
>> From: Uwe Schindler [mailto:uwe@thetaphi.de]
>> Sent: Tuesday, July 09, 2013 9:32 AM
>> To: dev@lucene.apache.org
>> Subject: RE: Several builds hanging pecause of permgen
>> 
>> I inspected the heap dumps already, 2 things are interesting:
>> 
>> - The heap dump file is quite small (only approx. 50 MB), so the permgen
>> issues start when not even -Xmx is used completely.
>> - There are lots of (250) reflection class loaders, but no leaks. All classloaders
>> exist one time, just the reflection-based sun.relect.DelegatingClassLoader
>> with the stubs for speeding up reflection shows up, but only with one
>> class/loader. Which is perfectly fine.
>> - ~2500 classes loaded, which is not much.
>> 
>> How do I get the number of interned Strings from jvisualvm?
>> 
>> -----
>> Uwe Schindler
>> H.-H.-Meier-Allee 63, D-28213 Bremen
>> http://www.thetaphi.de
>> eMail: uwe@thetaphi.de
>> 
>> 
>>> -----Original Message-----
>>> From: dawid.weiss@gmail.com [mailto:dawid.weiss@gmail.com] On Behalf
>>> Of Dawid Weiss
>>> Sent: Tuesday, July 09, 2013 9:13 AM
>>> To: dev@lucene.apache.org
>>> Subject: Re: Several builds hanging pecause of permgen
>>> 
>>> What's the core of those permgen leaks, guys? Is it the number of
>>> classes (via different classloaders)? Number of interned strings?
>>> Maybe something is going on that we should reflect on before we raise
>>> the default permgen limit? Our typical permgen headaches were due to
>>> repeated classloader leaks -- is Hadoop using something like that?
>>> 
>>> Dawid
>>> 
>>> On Tue, Jul 9, 2013 at 8:25 AM, Uwe Schindler <uw...@thetaphi.de> wrote:
>>>> Hi Mark,
>>>> 
>>>> The problem with raising permgen is:
>>>> - It's Hotspot specific only, so does not work with other JVMs
>>>> - Its no longer available in Java 8
>>>> 
>>>> I would really prefer to maybe tune the tests and maybe not create
>>>> so
>>> many nodes in the cloud tests. It looks like the bug happens more
>>> often with higher test multiplier (-Dtests.multiplier=3), so maybe we
>>> can really tune that.
>>>> If we want to raise permgen, we have to do it in a similar way like
>>>> we do enable the heap dumps - with lots of <condition/> tasks in ANT...
>>>> :(
>>>> 
>>>> Uwe
>>>> 
>>>> -----
>>>> Uwe Schindler
>>>> H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
>>>> eMail: uwe@thetaphi.de
>>>> 
>>>> 
>>>>> -----Original Message-----
>>>>> From: Mark Miller [mailto:markrmiller@gmail.com]
>>>>> Sent: Tuesday, July 09, 2013 5:20 AM
>>>>> To: dev@lucene.apache.org
>>>>> Subject: Re: Several builds hanging pecause of permgen
>>>>> 
>>>>> Looks like we currently don't set the max perm gen for tests, so you
>>>>> get the default - I think we want to change that regardless - we
>>>>> don't want it to vary IMO - it should work like Xmx.
>>>>> 
>>>>> I think we should just set it to 128 mb, and these tests should have
>>>>> plenty of room to run.
>>>>> 
>>>>> - Mark
>>>>> 
>>>>> On Jul 8, 2013, at 11:06 PM, Mark Miller <ma...@gmail.com>
>> wrote:
>>>>> 
>>>>>> 
>>>>>> On Jul 8, 2013, at 2:26 PM, Uwe Schindler <uw...@thetaphi.de> wrote:
>>>>>> 
>>>>>>> Next one:
>>>>>>> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6395/console
>>>>>>> 
>>>>>>> I will vote all releases of 4.4 with -1 until this is fixed!
>>>>>> 
>>>>>> You can't veto a release, so it's kind of a hollow threat ;)
>>>>>> 
>>>>>>> It hangs on my local computer, too! Tests pass only ½ of the time,
>>>>>>> the
>>>>> remaining time it hangs with permgen errors.
>>>>>> 
>>>>>> Depending on the host OS and java version, I have had to raise the
>>>>>> max
>>>>> perm gem a bit higher for heavy SolrCloud tests that also start up
>>>>> hdfs. I think I remember raising it from 96 MB to 128 MB. It's simply
>>>>> a test resource issue - those tests are very heavy, which is why most
>>>>> are set to run nightly - I've seen the code run heavily without any
>>>>> perm gen issues though - and on some of my machines, I don't have to
>>> raise the perm gen at all.
>>>>>> 
>>>>>> - Mark
>>>>>> 
>>>>>>> 
>>>>>>> Uwe
>>>>>>> 
>>>>>>> -----
>>>>>>> Uwe Schindler
>>>>>>> H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
>>>>>>> eMail: uwe@thetaphi.de
>>>>>>> 
>>>>>>> From: dawid.weiss@gmail.com [mailto:dawid.weiss@gmail.com] On
>>>>> Behalf
>>>>>>> Of Dawid Weiss
>>>>>>> Sent: Monday, July 08, 2013 2:16 PM
>>>>>>> To: dev@lucene.apache.org
>>>>>>> Subject: Re: Several builds hanging pecause of permgen
>>>>>>> 
>>>>>>> 
>>>>>>> Not much I can do from my side about permgen errors. There is
>>>>>>> really no
>>>>> way to deal with these from within Java (the same process) -- you
>>>>> cannot effectively handle anything because your own classes may not
>>> load at all.
>>>>>>> 
>>>>>>> Dawid
>>>>>>> 
>>>>>>> On Mon, Jul 8, 2013 at 1:35 PM, Uwe Schindler <uw...@thetaphi.de>
>>>>> wrote:
>>>>>>> Another one, this time on OSX:
>>>>>>> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/617/
>>>>>>> 
>>>>>>> -----
>>>>>>> Uwe Schindler
>>>>>>> H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
>>>>>>> eMail: uwe@thetaphi.de
>>>>>>> 
>>>>>>> From: Robert Muir [mailto:rcmuir@gmail.com]
>>>>>>> Sent: Sunday, July 07, 2013 11:15 PM
>>>>>>> To: dev@lucene.apache.org
>>>>>>> Subject: Re: Several builds hanging pecause of permgen
>>>>>>> 
>>>>>>> When there were leaks from static classes, we added a checker to
>>>>> LuceneTestCase that looks for RAM > N and fails with debugging
>>> information.
>>>>>>> 
>>>>>>> I wonder if some similar check is possible for this case (to make
>>>>>>> it easier
>>>>> than going thru heapdumps, and to find issues before crash-time)...
>>>>>>> 
>>>>>>> On Sun, Jul 7, 2013 at 4:10 PM, Uwe Schindler <uw...@thetaphi.de>
>>> wrote:
>>>>>>> Another one:
>>>>>>> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-
>> Linux/6385/testRepo
>>>>>>> rt/
>>>>>>> 
>>>>> 
>>> junit/junit.framework/TestSuite/org_apache_solr_request_SimpleFacetsT
>>>>>>> est/
>>>>>>> 
>>>>>>> Had to be killed with kill -9
>>>>>>> 
>>>>>>> -----
>>>>>>> Uwe Schindler
>>>>>>> H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
>>>>>>> eMail: uwe@thetaphi.de
>>>>>>> 
>>>>>>> 
>>>>>>>> -----Original Message-----
>>>>>>>> From: Uwe Schindler [mailto:uwe@thetaphi.de]
>>>>>>>> Sent: Saturday, July 06, 2013 10:16 PM
>>>>>>>> To: dev@lucene.apache.org
>>>>>>>> Subject: RE: Several builds hanging pecause of permgen
>>>>>>>> 
>>>>>>>> Another one:
>>>>>>>> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6375/console
>>>>>>>> 
>>>>>>>> I was only able to kill the JVM with kill -9 I am sure, it's horrible
>>>>> slowdoop!
>>>>>>>> 
>>>>>>>> -----
>>>>>>>> Uwe Schindler
>>>>>>>> H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
>>>>>>>> eMail: uwe@thetaphi.de
>>>>>>>> 
>>>>>>>> 
>>>>>>>>> -----Original Message-----
>>>>>>>>> From: Uwe Schindler [mailto:uwe@thetaphi.de]
>>>>>>>>> Sent: Friday, July 05, 2013 3:59 PM
>>>>>>>>> To: dev@lucene.apache.org
>>>>>>>>> Subject: Several builds hanging pecause of permgen
>>>>>>>>> 
>>>>>>>>> Several Jenkins builds now hang because of permgen. The runner
>>> JVM
>>>>>>>>> is dead (can only be killed by -9), last example:
>>>>>>>>> 
>>>>>>>>> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-
>> Linux/6360/console
>>>>>>>>> 
>>>>>>>>> -----
>>>>>>>>> Uwe Schindler
>>>>>>>>> H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
>>>>>>>>> eMail: uwe@thetaphi.de
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> -------------------------------------------------------------------
>>>>>>>>> -- To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For
>>>>>>>>> additional commands, e-mail: dev-help@lucene.apache.org
>>>>>>>> 
>>>>>>>> 
>>>>>>>> --------------------------------------------------------------------
>>>>>>>> - To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
>>>>>>>> additional commands, e-mail: dev-help@lucene.apache.org
>>>>>>> 
>>>>>>> 
>>>>>>> ---------------------------------------------------------------------
>>>>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
>>>>>>> additional commands, e-mail: dev-help@lucene.apache.org
>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>>> 
>>>>> ---------------------------------------------------------------------
>>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
>>> additional
>>>>> commands, e-mail: dev-help@lucene.apache.org
>>>> 
>>>> 
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>>> For additional commands, e-mail: dev-help@lucene.apache.org
>>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>>> For additional commands, e-mail: dev-help@lucene.apache.org
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
> 


RE: Several builds hanging pecause of permgen

Posted by Uwe Schindler <uw...@thetaphi.de>.
I digged more:

In most heap dumps, the failing thread looks like that, so it looks like Log4J is the bad guy - I looked at the code, it does tricky stuff to get stack traces but I found no String.intern() anywhere.

My question: When did we change tests to use log4j? Was this before/after the hadoop change? Maybe it's not related to hadoop, and is more a bug in Log4J. I find similar stack traces all around the internet always linked to Log4J.
  
"qtp1522342989-5342" prio=5 tid=5342 RUNNABLE
	at java.lang.OutOfMemoryError.<init>(OutOfMemoryError.java:25)
	at java.lang.Throwable.getStackTraceElement(Native Method)
	at java.lang.Throwable.getOurStackTrace(Throwable.java:591)
	at java.lang.Throwable.getStackTrace(Throwable.java:582)
	at sun.reflect.GeneratedMethodAccessor2.invoke(<unknown string>)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
	at java.lang.reflect.Method.invoke(Method.java:597)
	at org.apache.log4j.spi.LocationInfo.<init>(LocationInfo.java:139)
	   Local Variable: org.apache.log4j.spi.LocationInfo#52
	   Local Variable: java.lang.String#28067
	   Local Variable: java.lang.Throwable#1
	at org.apache.log4j.spi.LoggingEvent.getLocationInformation(LoggingEvent.java:253)
	at org.apache.solr.util.SolrLogLayout._format(SolrLogLayout.java:122)
	   Local Variable: java.lang.StringBuilder#32
	   Local Variable: java.lang.String#26042
	   Local Variable: org.apache.solr.util.SolrLogLayout#1
	at org.apache.solr.util.SolrLogLayout.format(SolrLogLayout.java:110)
	at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:310)
	   Local Variable: org.apache.log4j.ConsoleAppender#1
	at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
	at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
	at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
	   Local Variable: org.apache.log4j.helpers.AppenderAttachableImpl#1
	at org.apache.log4j.Category.callAppenders(Category.java:206)
	   Local Variable: org.apache.log4j.spi.LoggingEvent#52
	   Local Variable: org.apache.log4j.Logger#52
	at org.apache.log4j.Category.forcedLog(Category.java:391)
	at org.apache.log4j.Category.log(Category.java:856)
	at org.slf4j.impl.Log4jLoggerAdapter.info(Log4jLoggerAdapter.java:305)
	at org.apache.solr.core.SolrCore.execute(SolrCore.java:1909)
	at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:659)
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:362)
	   Local Variable: org.apache.solr.handler.StandardRequestHandler#4
	   Local Variable: org.apache.solr.core.SolrCore#3
	   Local Variable: java.lang.String#25968
	   Local Variable: org.apache.solr.response.SolrQueryResponse#3
	   Local Variable: org.eclipse.jetty.servlet.ServletHandler$CachedChain#15
	   Local Variable: org.apache.solr.servlet.cache.Method#3
	   Local Variable: org.apache.solr.servlet.SolrRequestParsers$1#3
	   Local Variable: org.apache.solr.servlet.SolrDispatchFilter#2
	   Local Variable: org.apache.solr.core.CoreContainer#3
	   Local Variable: org.apache.solr.servlet.SolrRequestParsers#3
	   Local Variable: org.apache.solr.core.SolrConfig#3
	at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:158)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
	at org.apache.solr.client.solrj.embedded.JettySolrRunner$DebugFilter.doFilter(JettySolrRunner.java:136)
	at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1419)
	at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:455)
	   Local Variable: org.eclipse.jetty.servlet.ServletHandler$CachedChain#13
	at org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:229)
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
	at org.eclipse.jetty.server.handler.GzipHandler.handle(GzipHandler.java:301)
	   Local Variable: org.eclipse.jetty.server.handler.GzipHandler#2
	at org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1077)
	at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:384)
	   Local Variable: org.eclipse.jetty.servlet.ServletHolder#2
	   Local Variable: org.eclipse.jetty.servlet.ServletHandler#2
	at org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:193)
	   Local Variable: org.eclipse.jetty.server.session.SessionHandler#2
	at org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1009)
	   Local Variable: java.lang.String#26053
	   Local Variable: org.eclipse.jetty.servlet.ServletContextHandler#2
	   Local Variable: org.eclipse.jetty.server.Request#4
	   Local Variable: javax.servlet.DispatcherType#1
	   Local Variable: org.eclipse.jetty.server.Response#4
	at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:135)
	at org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:116)
	at org.eclipse.jetty.server.Server.handle(Server.java:368)
	at org.eclipse.jetty.server.AbstractHttpConnection.handleRequest(AbstractHttpConnection.java:489)
	   Local Variable: org.eclipse.jetty.server.Server#2
	   Local Variable: java.lang.String#25761
	at org.eclipse.jetty.server.AbstractHttpConnection.content(AbstractHttpConnection.java:953)
	at org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler.content(AbstractHttpConnection.java:1014)
	   Local Variable: org.eclipse.jetty.io.View#5
	   Local Variable: org.eclipse.jetty.server.AbstractHttpConnection$RequestHandler#4
	at org.eclipse.jetty.http.HttpParser.parseNext(HttpParser.java:861)
	at org.eclipse.jetty.http.HttpParser.parseAvailable(HttpParser.java:240)
	   Local Variable: org.eclipse.jetty.http.HttpParser#4
	at org.eclipse.jetty.server.AsyncHttpConnection.handle(AsyncHttpConnection.java:82)
	   Local Variable: org.eclipse.jetty.server.AsyncHttpConnection#4
	at org.eclipse.jetty.io.nio.SelectChannelEndPoint.handle(SelectChannelEndPoint.java:628)
	   Local Variable: org.eclipse.jetty.io.nio.SelectChannelEndPoint#4
	at org.eclipse.jetty.io.nio.SelectChannelEndPoint$1.run(SelectChannelEndPoint.java:52)
	   Local Variable: org.eclipse.jetty.io.nio.SelectChannelEndPoint$1#4
	at org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:608)
	at org.eclipse.jetty.util.thread.QueuedThreadPool$3.run(QueuedThreadPool.java:543)
	at java.lang.Thread.run(Thread.java:662)

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Uwe Schindler [mailto:uwe@thetaphi.de]
> Sent: Tuesday, July 09, 2013 9:32 AM
> To: dev@lucene.apache.org
> Subject: RE: Several builds hanging pecause of permgen
> 
> I inspected the heap dumps already, 2 things are interesting:
> 
> - The heap dump file is quite small (only approx. 50 MB), so the permgen
> issues start when not even -Xmx is used completely.
> - There are lots of (250) reflection class loaders, but no leaks. All classloaders
> exist one time, just the reflection-based sun.relect.DelegatingClassLoader
> with the stubs for speeding up reflection shows up, but only with one
> class/loader. Which is perfectly fine.
> - ~2500 classes loaded, which is not much.
> 
> How do I get the number of interned Strings from jvisualvm?
> 
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
> 
> 
> > -----Original Message-----
> > From: dawid.weiss@gmail.com [mailto:dawid.weiss@gmail.com] On Behalf
> > Of Dawid Weiss
> > Sent: Tuesday, July 09, 2013 9:13 AM
> > To: dev@lucene.apache.org
> > Subject: Re: Several builds hanging pecause of permgen
> >
> > What's the core of those permgen leaks, guys? Is it the number of
> > classes (via different classloaders)? Number of interned strings?
> > Maybe something is going on that we should reflect on before we raise
> > the default permgen limit? Our typical permgen headaches were due to
> > repeated classloader leaks -- is Hadoop using something like that?
> >
> > Dawid
> >
> > On Tue, Jul 9, 2013 at 8:25 AM, Uwe Schindler <uw...@thetaphi.de> wrote:
> > > Hi Mark,
> > >
> > > The problem with raising permgen is:
> > > - It's Hotspot specific only, so does not work with other JVMs
> > > - Its no longer available in Java 8
> > >
> > > I would really prefer to maybe tune the tests and maybe not create
> > > so
> > many nodes in the cloud tests. It looks like the bug happens more
> > often with higher test multiplier (-Dtests.multiplier=3), so maybe we
> > can really tune that.
> > > If we want to raise permgen, we have to do it in a similar way like
> > > we do enable the heap dumps - with lots of <condition/> tasks in ANT...
> > > :(
> > >
> > > Uwe
> > >
> > > -----
> > > Uwe Schindler
> > > H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
> > > eMail: uwe@thetaphi.de
> > >
> > >
> > >> -----Original Message-----
> > >> From: Mark Miller [mailto:markrmiller@gmail.com]
> > >> Sent: Tuesday, July 09, 2013 5:20 AM
> > >> To: dev@lucene.apache.org
> > >> Subject: Re: Several builds hanging pecause of permgen
> > >>
> > >> Looks like we currently don't set the max perm gen for tests, so you
> > >> get the default - I think we want to change that regardless - we
> > >> don't want it to vary IMO - it should work like Xmx.
> > >>
> > >> I think we should just set it to 128 mb, and these tests should have
> > >> plenty of room to run.
> > >>
> > >> - Mark
> > >>
> > >> On Jul 8, 2013, at 11:06 PM, Mark Miller <ma...@gmail.com>
> wrote:
> > >>
> > >> >
> > >> > On Jul 8, 2013, at 2:26 PM, Uwe Schindler <uw...@thetaphi.de> wrote:
> > >> >
> > >> >> Next one:
> > >> >> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6395/console
> > >> >>
> > >> >> I will vote all releases of 4.4 with -1 until this is fixed!
> > >> >
> > >> > You can't veto a release, so it's kind of a hollow threat ;)
> > >> >
> > >> >> It hangs on my local computer, too! Tests pass only ½ of the time,
> > >> >> the
> > >> remaining time it hangs with permgen errors.
> > >> >
> > >> > Depending on the host OS and java version, I have had to raise the
> > >> > max
> > >> perm gem a bit higher for heavy SolrCloud tests that also start up
> > >> hdfs. I think I remember raising it from 96 MB to 128 MB. It's simply
> > >> a test resource issue - those tests are very heavy, which is why most
> > >> are set to run nightly - I've seen the code run heavily without any
> > >> perm gen issues though - and on some of my machines, I don't have to
> > raise the perm gen at all.
> > >> >
> > >> > - Mark
> > >> >
> > >> >>
> > >> >> Uwe
> > >> >>
> > >> >> -----
> > >> >> Uwe Schindler
> > >> >> H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
> > >> >> eMail: uwe@thetaphi.de
> > >> >>
> > >> >> From: dawid.weiss@gmail.com [mailto:dawid.weiss@gmail.com] On
> > >> Behalf
> > >> >> Of Dawid Weiss
> > >> >> Sent: Monday, July 08, 2013 2:16 PM
> > >> >> To: dev@lucene.apache.org
> > >> >> Subject: Re: Several builds hanging pecause of permgen
> > >> >>
> > >> >>
> > >> >> Not much I can do from my side about permgen errors. There is
> > >> >> really no
> > >> way to deal with these from within Java (the same process) -- you
> > >> cannot effectively handle anything because your own classes may not
> > load at all.
> > >> >>
> > >> >> Dawid
> > >> >>
> > >> >> On Mon, Jul 8, 2013 at 1:35 PM, Uwe Schindler <uw...@thetaphi.de>
> > >> wrote:
> > >> >> Another one, this time on OSX:
> > >> >> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/617/
> > >> >>
> > >> >> -----
> > >> >> Uwe Schindler
> > >> >> H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
> > >> >> eMail: uwe@thetaphi.de
> > >> >>
> > >> >> From: Robert Muir [mailto:rcmuir@gmail.com]
> > >> >> Sent: Sunday, July 07, 2013 11:15 PM
> > >> >> To: dev@lucene.apache.org
> > >> >> Subject: Re: Several builds hanging pecause of permgen
> > >> >>
> > >> >> When there were leaks from static classes, we added a checker to
> > >> LuceneTestCase that looks for RAM > N and fails with debugging
> > information.
> > >> >>
> > >> >> I wonder if some similar check is possible for this case (to make
> > >> >> it easier
> > >> than going thru heapdumps, and to find issues before crash-time)...
> > >> >>
> > >> >> On Sun, Jul 7, 2013 at 4:10 PM, Uwe Schindler <uw...@thetaphi.de>
> > wrote:
> > >> >> Another one:
> > >> >> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-
> Linux/6385/testRepo
> > >> >> rt/
> > >> >>
> > >>
> > junit/junit.framework/TestSuite/org_apache_solr_request_SimpleFacetsT
> > >> >> est/
> > >> >>
> > >> >> Had to be killed with kill -9
> > >> >>
> > >> >> -----
> > >> >> Uwe Schindler
> > >> >> H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
> > >> >> eMail: uwe@thetaphi.de
> > >> >>
> > >> >>
> > >> >>> -----Original Message-----
> > >> >>> From: Uwe Schindler [mailto:uwe@thetaphi.de]
> > >> >>> Sent: Saturday, July 06, 2013 10:16 PM
> > >> >>> To: dev@lucene.apache.org
> > >> >>> Subject: RE: Several builds hanging pecause of permgen
> > >> >>>
> > >> >>> Another one:
> > >> >>> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6375/console
> > >> >>>
> > >> >>> I was only able to kill the JVM with kill -9 I am sure, it's horrible
> > >> slowdoop!
> > >> >>>
> > >> >>> -----
> > >> >>> Uwe Schindler
> > >> >>> H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
> > >> >>> eMail: uwe@thetaphi.de
> > >> >>>
> > >> >>>
> > >> >>>> -----Original Message-----
> > >> >>>> From: Uwe Schindler [mailto:uwe@thetaphi.de]
> > >> >>>> Sent: Friday, July 05, 2013 3:59 PM
> > >> >>>> To: dev@lucene.apache.org
> > >> >>>> Subject: Several builds hanging pecause of permgen
> > >> >>>>
> > >> >>>> Several Jenkins builds now hang because of permgen. The runner
> > JVM
> > >> >>>> is dead (can only be killed by -9), last example:
> > >> >>>>
> > >> >>>> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-
> Linux/6360/console
> > >> >>>>
> > >> >>>> -----
> > >> >>>> Uwe Schindler
> > >> >>>> H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
> > >> >>>> eMail: uwe@thetaphi.de
> > >> >>>>
> > >> >>>>
> > >> >>>>
> > >> >>>>
> > >> >>>> -------------------------------------------------------------------
> > >> >>>> -- To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For
> > >> >>>> additional commands, e-mail: dev-help@lucene.apache.org
> > >> >>>
> > >> >>>
> > >> >>> --------------------------------------------------------------------
> > >> >>> - To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
> > >> >>> additional commands, e-mail: dev-help@lucene.apache.org
> > >> >>
> > >> >>
> > >> >> ---------------------------------------------------------------------
> > >> >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
> > >> >> additional commands, e-mail: dev-help@lucene.apache.org
> > >> >>
> > >> >>
> > >> >
> > >>
> > >>
> > >> ---------------------------------------------------------------------
> > >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
> > additional
> > >> commands, e-mail: dev-help@lucene.apache.org
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> > > For additional commands, e-mail: dev-help@lucene.apache.org
> > >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: dev-help@lucene.apache.org
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


RE: Several builds hanging pecause of permgen

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi,

Yeah, and permgen is not included in the dump :( http://stackoverflow.com/questions/4080010/how-to-dump-permgen
So you can only look at a live JVM, which is hard because you don't know when it dies. Once it died, it's too late as jmap in most cases cannot connect anymore (and only kill -9 helps to kill).

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: dawid.weiss@gmail.com [mailto:dawid.weiss@gmail.com] On Behalf
> Of Dawid Weiss
> Sent: Tuesday, July 09, 2013 9:52 AM
> To: dev@lucene.apache.org
> Subject: Re: Several builds hanging pecause of permgen
> 
> > How do I get the number of interned Strings from jvisualvm?
> 
> Don't know about jvisualvm but:
> 
> "The jmap -permgen command prints statistics for the objects in the
> permanent generation, including information about internalized String
> instances. See   2.7.4 Getting Information on the Permanent
> Generation."
> 
> http://www.oracle.com/technetwork/java/javase/tooldescr-
> 136044.html#gblmm
> 
> There's also a magic switch to hotspot that dumps those strings:
> 
>   product(bool, PrintStringTableStatistics, false,                          \
>           "print statistics about the StringTable and SymbolTable")         \
>                                                                             \
>   notproduct(bool, PrintSymbolTableSizeHistogram, false,                    \
>           "print histogram of the symbol table")                            \
> 
> 
> D.
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For additional
> commands, e-mail: dev-help@lucene.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: Several builds hanging pecause of permgen

Posted by Dawid Weiss <da...@cs.put.poznan.pl>.
> How do I get the number of interned Strings from jvisualvm?

Don't know about jvisualvm but:

"The jmap -permgen command prints statistics for the objects in the
permanent generation, including information about internalized String
instances. See   2.7.4 Getting Information on the Permanent
Generation."

http://www.oracle.com/technetwork/java/javase/tooldescr-136044.html#gblmm

There's also a magic switch to hotspot that dumps those strings:

  product(bool, PrintStringTableStatistics, false,                          \
          "print statistics about the StringTable and SymbolTable")         \
                                                                            \
  notproduct(bool, PrintSymbolTableSizeHistogram, false,                    \
          "print histogram of the symbol table")                            \


D.

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


RE: Several builds hanging pecause of permgen

Posted by Uwe Schindler <uw...@thetaphi.de>.
I inspected the heap dumps already, 2 things are interesting:

- The heap dump file is quite small (only approx. 50 MB), so the permgen issues start when not even -Xmx is used completely.
- There are lots of (250) reflection class loaders, but no leaks. All classloaders exist one time, just the reflection-based sun.relect.DelegatingClassLoader with the stubs for speeding up reflection shows up, but only with one class/loader. Which is perfectly fine.
- ~2500 classes loaded, which is not much.

How do I get the number of interned Strings from jvisualvm?

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: dawid.weiss@gmail.com [mailto:dawid.weiss@gmail.com] On Behalf
> Of Dawid Weiss
> Sent: Tuesday, July 09, 2013 9:13 AM
> To: dev@lucene.apache.org
> Subject: Re: Several builds hanging pecause of permgen
> 
> What's the core of those permgen leaks, guys? Is it the number of classes
> (via different classloaders)? Number of interned strings?
> Maybe something is going on that we should reflect on before we raise the
> default permgen limit? Our typical permgen headaches were due to
> repeated classloader leaks -- is Hadoop using something like that?
> 
> Dawid
> 
> On Tue, Jul 9, 2013 at 8:25 AM, Uwe Schindler <uw...@thetaphi.de> wrote:
> > Hi Mark,
> >
> > The problem with raising permgen is:
> > - It's Hotspot specific only, so does not work with other JVMs
> > - Its no longer available in Java 8
> >
> > I would really prefer to maybe tune the tests and maybe not create so
> many nodes in the cloud tests. It looks like the bug happens more often with
> higher test multiplier (-Dtests.multiplier=3), so maybe we can really tune
> that.
> > If we want to raise permgen, we have to do it in a similar way like we
> > do enable the heap dumps - with lots of <condition/> tasks in ANT...
> > :(
> >
> > Uwe
> >
> > -----
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: uwe@thetaphi.de
> >
> >
> >> -----Original Message-----
> >> From: Mark Miller [mailto:markrmiller@gmail.com]
> >> Sent: Tuesday, July 09, 2013 5:20 AM
> >> To: dev@lucene.apache.org
> >> Subject: Re: Several builds hanging pecause of permgen
> >>
> >> Looks like we currently don't set the max perm gen for tests, so you
> >> get the default - I think we want to change that regardless - we
> >> don't want it to vary IMO - it should work like Xmx.
> >>
> >> I think we should just set it to 128 mb, and these tests should have
> >> plenty of room to run.
> >>
> >> - Mark
> >>
> >> On Jul 8, 2013, at 11:06 PM, Mark Miller <ma...@gmail.com> wrote:
> >>
> >> >
> >> > On Jul 8, 2013, at 2:26 PM, Uwe Schindler <uw...@thetaphi.de> wrote:
> >> >
> >> >> Next one:
> >> >> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6395/console
> >> >>
> >> >> I will vote all releases of 4.4 with -1 until this is fixed!
> >> >
> >> > You can't veto a release, so it's kind of a hollow threat ;)
> >> >
> >> >> It hangs on my local computer, too! Tests pass only ½ of the time,
> >> >> the
> >> remaining time it hangs with permgen errors.
> >> >
> >> > Depending on the host OS and java version, I have had to raise the
> >> > max
> >> perm gem a bit higher for heavy SolrCloud tests that also start up
> >> hdfs. I think I remember raising it from 96 MB to 128 MB. It's simply
> >> a test resource issue - those tests are very heavy, which is why most
> >> are set to run nightly - I've seen the code run heavily without any
> >> perm gen issues though - and on some of my machines, I don't have to
> raise the perm gen at all.
> >> >
> >> > - Mark
> >> >
> >> >>
> >> >> Uwe
> >> >>
> >> >> -----
> >> >> Uwe Schindler
> >> >> H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
> >> >> eMail: uwe@thetaphi.de
> >> >>
> >> >> From: dawid.weiss@gmail.com [mailto:dawid.weiss@gmail.com] On
> >> Behalf
> >> >> Of Dawid Weiss
> >> >> Sent: Monday, July 08, 2013 2:16 PM
> >> >> To: dev@lucene.apache.org
> >> >> Subject: Re: Several builds hanging pecause of permgen
> >> >>
> >> >>
> >> >> Not much I can do from my side about permgen errors. There is
> >> >> really no
> >> way to deal with these from within Java (the same process) -- you
> >> cannot effectively handle anything because your own classes may not
> load at all.
> >> >>
> >> >> Dawid
> >> >>
> >> >> On Mon, Jul 8, 2013 at 1:35 PM, Uwe Schindler <uw...@thetaphi.de>
> >> wrote:
> >> >> Another one, this time on OSX:
> >> >> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/617/
> >> >>
> >> >> -----
> >> >> Uwe Schindler
> >> >> H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
> >> >> eMail: uwe@thetaphi.de
> >> >>
> >> >> From: Robert Muir [mailto:rcmuir@gmail.com]
> >> >> Sent: Sunday, July 07, 2013 11:15 PM
> >> >> To: dev@lucene.apache.org
> >> >> Subject: Re: Several builds hanging pecause of permgen
> >> >>
> >> >> When there were leaks from static classes, we added a checker to
> >> LuceneTestCase that looks for RAM > N and fails with debugging
> information.
> >> >>
> >> >> I wonder if some similar check is possible for this case (to make
> >> >> it easier
> >> than going thru heapdumps, and to find issues before crash-time)...
> >> >>
> >> >> On Sun, Jul 7, 2013 at 4:10 PM, Uwe Schindler <uw...@thetaphi.de>
> wrote:
> >> >> Another one:
> >> >> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6385/testRepo
> >> >> rt/
> >> >>
> >>
> junit/junit.framework/TestSuite/org_apache_solr_request_SimpleFacetsT
> >> >> est/
> >> >>
> >> >> Had to be killed with kill -9
> >> >>
> >> >> -----
> >> >> Uwe Schindler
> >> >> H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
> >> >> eMail: uwe@thetaphi.de
> >> >>
> >> >>
> >> >>> -----Original Message-----
> >> >>> From: Uwe Schindler [mailto:uwe@thetaphi.de]
> >> >>> Sent: Saturday, July 06, 2013 10:16 PM
> >> >>> To: dev@lucene.apache.org
> >> >>> Subject: RE: Several builds hanging pecause of permgen
> >> >>>
> >> >>> Another one:
> >> >>> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6375/console
> >> >>>
> >> >>> I was only able to kill the JVM with kill -9 I am sure, it's horrible
> >> slowdoop!
> >> >>>
> >> >>> -----
> >> >>> Uwe Schindler
> >> >>> H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
> >> >>> eMail: uwe@thetaphi.de
> >> >>>
> >> >>>
> >> >>>> -----Original Message-----
> >> >>>> From: Uwe Schindler [mailto:uwe@thetaphi.de]
> >> >>>> Sent: Friday, July 05, 2013 3:59 PM
> >> >>>> To: dev@lucene.apache.org
> >> >>>> Subject: Several builds hanging pecause of permgen
> >> >>>>
> >> >>>> Several Jenkins builds now hang because of permgen. The runner
> JVM
> >> >>>> is dead (can only be killed by -9), last example:
> >> >>>>
> >> >>>> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6360/console
> >> >>>>
> >> >>>> -----
> >> >>>> Uwe Schindler
> >> >>>> H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
> >> >>>> eMail: uwe@thetaphi.de
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>>
> >> >>>> -------------------------------------------------------------------
> >> >>>> -- To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
> >> >>>> additional commands, e-mail: dev-help@lucene.apache.org
> >> >>>
> >> >>>
> >> >>> --------------------------------------------------------------------
> >> >>> - To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
> >> >>> additional commands, e-mail: dev-help@lucene.apache.org
> >> >>
> >> >>
> >> >> ---------------------------------------------------------------------
> >> >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
> >> >> additional commands, e-mail: dev-help@lucene.apache.org
> >> >>
> >> >>
> >> >
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
> additional
> >> commands, e-mail: dev-help@lucene.apache.org
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> > For additional commands, e-mail: dev-help@lucene.apache.org
> >
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: Several builds hanging pecause of permgen

Posted by Dawid Weiss <da...@cs.put.poznan.pl>.
What's the core of those permgen leaks, guys? Is it the number of
classes (via different classloaders)? Number of interned strings?
Maybe something is going on that we should reflect on before we raise
the default permgen limit? Our typical permgen headaches were due to
repeated classloader leaks -- is Hadoop using something like that?

Dawid

On Tue, Jul 9, 2013 at 8:25 AM, Uwe Schindler <uw...@thetaphi.de> wrote:
> Hi Mark,
>
> The problem with raising permgen is:
> - It's Hotspot specific only, so does not work with other JVMs
> - Its no longer available in Java 8
>
> I would really prefer to maybe tune the tests and maybe not create so many nodes in the cloud tests. It looks like the bug happens more often with higher test multiplier (-Dtests.multiplier=3), so maybe we can really tune that.
> If we want to raise permgen, we have to do it in a similar way like we do enable the heap dumps - with lots of <condition/> tasks in ANT... :(
>
> Uwe
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
>
>> -----Original Message-----
>> From: Mark Miller [mailto:markrmiller@gmail.com]
>> Sent: Tuesday, July 09, 2013 5:20 AM
>> To: dev@lucene.apache.org
>> Subject: Re: Several builds hanging pecause of permgen
>>
>> Looks like we currently don't set the max perm gen for tests, so you get the
>> default - I think we want to change that regardless - we don't want it to vary
>> IMO - it should work like Xmx.
>>
>> I think we should just set it to 128 mb, and these tests should have plenty of
>> room to run.
>>
>> - Mark
>>
>> On Jul 8, 2013, at 11:06 PM, Mark Miller <ma...@gmail.com> wrote:
>>
>> >
>> > On Jul 8, 2013, at 2:26 PM, Uwe Schindler <uw...@thetaphi.de> wrote:
>> >
>> >> Next one:
>> >> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6395/console
>> >>
>> >> I will vote all releases of 4.4 with -1 until this is fixed!
>> >
>> > You can't veto a release, so it's kind of a hollow threat ;)
>> >
>> >> It hangs on my local computer, too! Tests pass only ½ of the time, the
>> remaining time it hangs with permgen errors.
>> >
>> > Depending on the host OS and java version, I have had to raise the max
>> perm gem a bit higher for heavy SolrCloud tests that also start up hdfs. I think
>> I remember raising it from 96 MB to 128 MB. It's simply a test resource issue -
>> those tests are very heavy, which is why most are set to run nightly - I've
>> seen the code run heavily without any perm gen issues though - and on
>> some of my machines, I don't have to raise the perm gen at all.
>> >
>> > - Mark
>> >
>> >>
>> >> Uwe
>> >>
>> >> -----
>> >> Uwe Schindler
>> >> H.-H.-Meier-Allee 63, D-28213 Bremen
>> >> http://www.thetaphi.de
>> >> eMail: uwe@thetaphi.de
>> >>
>> >> From: dawid.weiss@gmail.com [mailto:dawid.weiss@gmail.com] On
>> Behalf
>> >> Of Dawid Weiss
>> >> Sent: Monday, July 08, 2013 2:16 PM
>> >> To: dev@lucene.apache.org
>> >> Subject: Re: Several builds hanging pecause of permgen
>> >>
>> >>
>> >> Not much I can do from my side about permgen errors. There is really no
>> way to deal with these from within Java (the same process) -- you cannot
>> effectively handle anything because your own classes may not load at all.
>> >>
>> >> Dawid
>> >>
>> >> On Mon, Jul 8, 2013 at 1:35 PM, Uwe Schindler <uw...@thetaphi.de>
>> wrote:
>> >> Another one, this time on OSX:
>> >> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/617/
>> >>
>> >> -----
>> >> Uwe Schindler
>> >> H.-H.-Meier-Allee 63, D-28213 Bremen
>> >> http://www.thetaphi.de
>> >> eMail: uwe@thetaphi.de
>> >>
>> >> From: Robert Muir [mailto:rcmuir@gmail.com]
>> >> Sent: Sunday, July 07, 2013 11:15 PM
>> >> To: dev@lucene.apache.org
>> >> Subject: Re: Several builds hanging pecause of permgen
>> >>
>> >> When there were leaks from static classes, we added a checker to
>> LuceneTestCase that looks for RAM > N and fails with debugging information.
>> >>
>> >> I wonder if some similar check is possible for this case (to make it easier
>> than going thru heapdumps, and to find issues before crash-time)...
>> >>
>> >> On Sun, Jul 7, 2013 at 4:10 PM, Uwe Schindler <uw...@thetaphi.de> wrote:
>> >> Another one:
>> >> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6385/testReport/
>> >>
>> junit/junit.framework/TestSuite/org_apache_solr_request_SimpleFacetsT
>> >> est/
>> >>
>> >> Had to be killed with kill -9
>> >>
>> >> -----
>> >> Uwe Schindler
>> >> H.-H.-Meier-Allee 63, D-28213 Bremen
>> >> http://www.thetaphi.de
>> >> eMail: uwe@thetaphi.de
>> >>
>> >>
>> >>> -----Original Message-----
>> >>> From: Uwe Schindler [mailto:uwe@thetaphi.de]
>> >>> Sent: Saturday, July 06, 2013 10:16 PM
>> >>> To: dev@lucene.apache.org
>> >>> Subject: RE: Several builds hanging pecause of permgen
>> >>>
>> >>> Another one:
>> >>> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6375/console
>> >>>
>> >>> I was only able to kill the JVM with kill -9 I am sure, it's horrible
>> slowdoop!
>> >>>
>> >>> -----
>> >>> Uwe Schindler
>> >>> H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
>> >>> eMail: uwe@thetaphi.de
>> >>>
>> >>>
>> >>>> -----Original Message-----
>> >>>> From: Uwe Schindler [mailto:uwe@thetaphi.de]
>> >>>> Sent: Friday, July 05, 2013 3:59 PM
>> >>>> To: dev@lucene.apache.org
>> >>>> Subject: Several builds hanging pecause of permgen
>> >>>>
>> >>>> Several Jenkins builds now hang because of permgen. The runner JVM
>> >>>> is dead (can only be killed by -9), last example:
>> >>>>
>> >>>> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6360/console
>> >>>>
>> >>>> -----
>> >>>> Uwe Schindler
>> >>>> H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
>> >>>> eMail: uwe@thetaphi.de
>> >>>>
>> >>>>
>> >>>>
>> >>>>
>> >>>> -------------------------------------------------------------------
>> >>>> -- To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
>> >>>> additional commands, e-mail: dev-help@lucene.apache.org
>> >>>
>> >>>
>> >>> --------------------------------------------------------------------
>> >>> - To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
>> >>> additional commands, e-mail: dev-help@lucene.apache.org
>> >>
>> >>
>> >> ---------------------------------------------------------------------
>> >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
>> >> additional commands, e-mail: dev-help@lucene.apache.org
>> >>
>> >>
>> >
>>
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For additional
>> commands, e-mail: dev-help@lucene.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


RE: Several builds hanging pecause of permgen

Posted by Uwe Schindler <uw...@thetaphi.de>.
Hi Mark,

The problem with raising permgen is:
- It's Hotspot specific only, so does not work with other JVMs
- Its no longer available in Java 8

I would really prefer to maybe tune the tests and maybe not create so many nodes in the cloud tests. It looks like the bug happens more often with higher test multiplier (-Dtests.multiplier=3), so maybe we can really tune that.
If we want to raise permgen, we have to do it in a similar way like we do enable the heap dumps - with lots of <condition/> tasks in ANT... :(

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Mark Miller [mailto:markrmiller@gmail.com]
> Sent: Tuesday, July 09, 2013 5:20 AM
> To: dev@lucene.apache.org
> Subject: Re: Several builds hanging pecause of permgen
> 
> Looks like we currently don't set the max perm gen for tests, so you get the
> default - I think we want to change that regardless - we don't want it to vary
> IMO - it should work like Xmx.
> 
> I think we should just set it to 128 mb, and these tests should have plenty of
> room to run.
> 
> - Mark
> 
> On Jul 8, 2013, at 11:06 PM, Mark Miller <ma...@gmail.com> wrote:
> 
> >
> > On Jul 8, 2013, at 2:26 PM, Uwe Schindler <uw...@thetaphi.de> wrote:
> >
> >> Next one:
> >> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6395/console
> >>
> >> I will vote all releases of 4.4 with -1 until this is fixed!
> >
> > You can't veto a release, so it's kind of a hollow threat ;)
> >
> >> It hangs on my local computer, too! Tests pass only ½ of the time, the
> remaining time it hangs with permgen errors.
> >
> > Depending on the host OS and java version, I have had to raise the max
> perm gem a bit higher for heavy SolrCloud tests that also start up hdfs. I think
> I remember raising it from 96 MB to 128 MB. It's simply a test resource issue -
> those tests are very heavy, which is why most are set to run nightly - I've
> seen the code run heavily without any perm gen issues though - and on
> some of my machines, I don't have to raise the perm gen at all.
> >
> > - Mark
> >
> >>
> >> Uwe
> >>
> >> -----
> >> Uwe Schindler
> >> H.-H.-Meier-Allee 63, D-28213 Bremen
> >> http://www.thetaphi.de
> >> eMail: uwe@thetaphi.de
> >>
> >> From: dawid.weiss@gmail.com [mailto:dawid.weiss@gmail.com] On
> Behalf
> >> Of Dawid Weiss
> >> Sent: Monday, July 08, 2013 2:16 PM
> >> To: dev@lucene.apache.org
> >> Subject: Re: Several builds hanging pecause of permgen
> >>
> >>
> >> Not much I can do from my side about permgen errors. There is really no
> way to deal with these from within Java (the same process) -- you cannot
> effectively handle anything because your own classes may not load at all.
> >>
> >> Dawid
> >>
> >> On Mon, Jul 8, 2013 at 1:35 PM, Uwe Schindler <uw...@thetaphi.de>
> wrote:
> >> Another one, this time on OSX:
> >> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/617/
> >>
> >> -----
> >> Uwe Schindler
> >> H.-H.-Meier-Allee 63, D-28213 Bremen
> >> http://www.thetaphi.de
> >> eMail: uwe@thetaphi.de
> >>
> >> From: Robert Muir [mailto:rcmuir@gmail.com]
> >> Sent: Sunday, July 07, 2013 11:15 PM
> >> To: dev@lucene.apache.org
> >> Subject: Re: Several builds hanging pecause of permgen
> >>
> >> When there were leaks from static classes, we added a checker to
> LuceneTestCase that looks for RAM > N and fails with debugging information.
> >>
> >> I wonder if some similar check is possible for this case (to make it easier
> than going thru heapdumps, and to find issues before crash-time)...
> >>
> >> On Sun, Jul 7, 2013 at 4:10 PM, Uwe Schindler <uw...@thetaphi.de> wrote:
> >> Another one:
> >> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6385/testReport/
> >>
> junit/junit.framework/TestSuite/org_apache_solr_request_SimpleFacetsT
> >> est/
> >>
> >> Had to be killed with kill -9
> >>
> >> -----
> >> Uwe Schindler
> >> H.-H.-Meier-Allee 63, D-28213 Bremen
> >> http://www.thetaphi.de
> >> eMail: uwe@thetaphi.de
> >>
> >>
> >>> -----Original Message-----
> >>> From: Uwe Schindler [mailto:uwe@thetaphi.de]
> >>> Sent: Saturday, July 06, 2013 10:16 PM
> >>> To: dev@lucene.apache.org
> >>> Subject: RE: Several builds hanging pecause of permgen
> >>>
> >>> Another one:
> >>> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6375/console
> >>>
> >>> I was only able to kill the JVM with kill -9 I am sure, it's horrible
> slowdoop!
> >>>
> >>> -----
> >>> Uwe Schindler
> >>> H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
> >>> eMail: uwe@thetaphi.de
> >>>
> >>>
> >>>> -----Original Message-----
> >>>> From: Uwe Schindler [mailto:uwe@thetaphi.de]
> >>>> Sent: Friday, July 05, 2013 3:59 PM
> >>>> To: dev@lucene.apache.org
> >>>> Subject: Several builds hanging pecause of permgen
> >>>>
> >>>> Several Jenkins builds now hang because of permgen. The runner JVM
> >>>> is dead (can only be killed by -9), last example:
> >>>>
> >>>> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6360/console
> >>>>
> >>>> -----
> >>>> Uwe Schindler
> >>>> H.-H.-Meier-Allee 63, D-28213 Bremen http://www.thetaphi.de
> >>>> eMail: uwe@thetaphi.de
> >>>>
> >>>>
> >>>>
> >>>>
> >>>> -------------------------------------------------------------------
> >>>> -- To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
> >>>> additional commands, e-mail: dev-help@lucene.apache.org
> >>>
> >>>
> >>> --------------------------------------------------------------------
> >>> - To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
> >>> additional commands, e-mail: dev-help@lucene.apache.org
> >>
> >>
> >> ---------------------------------------------------------------------
> >> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
> >> additional commands, e-mail: dev-help@lucene.apache.org
> >>
> >>
> >
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For additional
> commands, e-mail: dev-help@lucene.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: Several builds hanging pecause of permgen

Posted by Mark Miller <ma...@gmail.com>.
Looks like we currently don't set the max perm gen for tests, so you get the default - I think we want to change that regardless - we don't want it to vary IMO - it should work like Xmx.

I think we should just set it to 128 mb, and these tests should have plenty of room to run.

- Mark

On Jul 8, 2013, at 11:06 PM, Mark Miller <ma...@gmail.com> wrote:

> 
> On Jul 8, 2013, at 2:26 PM, Uwe Schindler <uw...@thetaphi.de> wrote:
> 
>> Next one: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6395/console
>> 
>> I will vote all releases of 4.4 with -1 until this is fixed!
> 
> You can't veto a release, so it's kind of a hollow threat ;)
> 
>> It hangs on my local computer, too! Tests pass only ½ of the time, the remaining time it hangs with permgen errors.
> 
> Depending on the host OS and java version, I have had to raise the max perm gem a bit higher for heavy SolrCloud tests that also start up hdfs. I think I remember raising it from 96 MB to 128 MB. It's simply a test resource issue - those tests are very heavy, which is why most are set to run nightly - I've seen the code run heavily without any perm gen issues though - and on some of my machines, I don't have to raise the perm gen at all.
> 
> - Mark
> 
>> 
>> Uwe
>> 
>> -----
>> Uwe Schindler
>> H.-H.-Meier-Allee 63, D-28213 Bremen
>> http://www.thetaphi.de
>> eMail: uwe@thetaphi.de
>> 
>> From: dawid.weiss@gmail.com [mailto:dawid.weiss@gmail.com] On Behalf Of Dawid Weiss
>> Sent: Monday, July 08, 2013 2:16 PM
>> To: dev@lucene.apache.org
>> Subject: Re: Several builds hanging pecause of permgen
>> 
>> 
>> Not much I can do from my side about permgen errors. There is really no way to deal with these from within Java (the same process) -- you cannot effectively handle anything because your own classes may not load at all. 
>> 
>> Dawid
>> 
>> On Mon, Jul 8, 2013 at 1:35 PM, Uwe Schindler <uw...@thetaphi.de> wrote:
>> Another one, this time on OSX:
>> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/617/
>> 
>> -----
>> Uwe Schindler
>> H.-H.-Meier-Allee 63, D-28213 Bremen
>> http://www.thetaphi.de
>> eMail: uwe@thetaphi.de
>> 
>> From: Robert Muir [mailto:rcmuir@gmail.com] 
>> Sent: Sunday, July 07, 2013 11:15 PM
>> To: dev@lucene.apache.org
>> Subject: Re: Several builds hanging pecause of permgen
>> 
>> When there were leaks from static classes, we added a checker to LuceneTestCase that looks for RAM > N and fails with debugging information.
>> 
>> I wonder if some similar check is possible for this case (to make it easier than going thru heapdumps, and to find issues before crash-time)...
>> 
>> On Sun, Jul 7, 2013 at 4:10 PM, Uwe Schindler <uw...@thetaphi.de> wrote:
>> Another one: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6385/testReport/junit/junit.framework/TestSuite/org_apache_solr_request_SimpleFacetsTest/
>> 
>> Had to be killed with kill -9
>> 
>> -----
>> Uwe Schindler
>> H.-H.-Meier-Allee 63, D-28213 Bremen
>> http://www.thetaphi.de
>> eMail: uwe@thetaphi.de
>> 
>> 
>>> -----Original Message-----
>>> From: Uwe Schindler [mailto:uwe@thetaphi.de]
>>> Sent: Saturday, July 06, 2013 10:16 PM
>>> To: dev@lucene.apache.org
>>> Subject: RE: Several builds hanging pecause of permgen
>>> 
>>> Another one:
>>> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6375/console
>>> 
>>> I was only able to kill the JVM with kill -9 I am sure, it's horrible slowdoop!
>>> 
>>> -----
>>> Uwe Schindler
>>> H.-H.-Meier-Allee 63, D-28213 Bremen
>>> http://www.thetaphi.de
>>> eMail: uwe@thetaphi.de
>>> 
>>> 
>>>> -----Original Message-----
>>>> From: Uwe Schindler [mailto:uwe@thetaphi.de]
>>>> Sent: Friday, July 05, 2013 3:59 PM
>>>> To: dev@lucene.apache.org
>>>> Subject: Several builds hanging pecause of permgen
>>>> 
>>>> Several Jenkins builds now hang because of permgen. The runner JVM is
>>>> dead (can only be killed by -9), last example:
>>>> 
>>>> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6360/console
>>>> 
>>>> -----
>>>> Uwe Schindler
>>>> H.-H.-Meier-Allee 63, D-28213 Bremen
>>>> http://www.thetaphi.de
>>>> eMail: uwe@thetaphi.de
>>>> 
>>>> 
>>>> 
>>>> 
>>>> ---------------------------------------------------------------------
>>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
>>>> additional commands, e-mail: dev-help@lucene.apache.org
>>> 
>>> 
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For additional
>>> commands, e-mail: dev-help@lucene.apache.org
>> 
>> 
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
>> For additional commands, e-mail: dev-help@lucene.apache.org
>> 
>> 
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


Re: Several builds hanging pecause of permgen

Posted by Mark Miller <ma...@gmail.com>.
On Jul 8, 2013, at 2:26 PM, Uwe Schindler <uw...@thetaphi.de> wrote:

> Next one: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6395/console
>  
> I will vote all releases of 4.4 with -1 until this is fixed!

You can't veto a release, so it's kind of a hollow threat ;)

> It hangs on my local computer, too! Tests pass only ½ of the time, the remaining time it hangs with permgen errors.

Depending on the host OS and java version, I have had to raise the max perm gem a bit higher for heavy SolrCloud tests that also start up hdfs. I think I remember raising it from 96 MB to 128 MB. It's simply a test resource issue - those tests are very heavy, which is why most are set to run nightly - I've seen the code run heavily without any perm gen issues though - and on some of my machines, I don't have to raise the perm gen at all.

- Mark

>  
> Uwe
>  
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>  
> From: dawid.weiss@gmail.com [mailto:dawid.weiss@gmail.com] On Behalf Of Dawid Weiss
> Sent: Monday, July 08, 2013 2:16 PM
> To: dev@lucene.apache.org
> Subject: Re: Several builds hanging pecause of permgen
>  
>  
> Not much I can do from my side about permgen errors. There is really no way to deal with these from within Java (the same process) -- you cannot effectively handle anything because your own classes may not load at all. 
>  
> Dawid
> 
> On Mon, Jul 8, 2013 at 1:35 PM, Uwe Schindler <uw...@thetaphi.de> wrote:
> Another one, this time on OSX:
> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/617/
>  
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>  
> From: Robert Muir [mailto:rcmuir@gmail.com] 
> Sent: Sunday, July 07, 2013 11:15 PM
> To: dev@lucene.apache.org
> Subject: Re: Several builds hanging pecause of permgen
>  
> When there were leaks from static classes, we added a checker to LuceneTestCase that looks for RAM > N and fails with debugging information.
> 
> I wonder if some similar check is possible for this case (to make it easier than going thru heapdumps, and to find issues before crash-time)...
> 
> On Sun, Jul 7, 2013 at 4:10 PM, Uwe Schindler <uw...@thetaphi.de> wrote:
> Another one: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6385/testReport/junit/junit.framework/TestSuite/org_apache_solr_request_SimpleFacetsTest/
> 
> Had to be killed with kill -9
> 
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
> 
> 
> > -----Original Message-----
> > From: Uwe Schindler [mailto:uwe@thetaphi.de]
> > Sent: Saturday, July 06, 2013 10:16 PM
> > To: dev@lucene.apache.org
> > Subject: RE: Several builds hanging pecause of permgen
> >
> > Another one:
> > http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6375/console
> >
> > I was only able to kill the JVM with kill -9 I am sure, it's horrible slowdoop!
> >
> > -----
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: uwe@thetaphi.de
> >
> >
> > > -----Original Message-----
> > > From: Uwe Schindler [mailto:uwe@thetaphi.de]
> > > Sent: Friday, July 05, 2013 3:59 PM
> > > To: dev@lucene.apache.org
> > > Subject: Several builds hanging pecause of permgen
> > >
> > > Several Jenkins builds now hang because of permgen. The runner JVM is
> > > dead (can only be killed by -9), last example:
> > >
> > > http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6360/console
> > >
> > > -----
> > > Uwe Schindler
> > > H.-H.-Meier-Allee 63, D-28213 Bremen
> > > http://www.thetaphi.de
> > > eMail: uwe@thetaphi.de
> > >
> > >
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
> > > additional commands, e-mail: dev-help@lucene.apache.org
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For additional
> > commands, e-mail: dev-help@lucene.apache.org
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
> 
>  


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


RE: Several builds hanging pecause of permgen

Posted by Uwe Schindler <uw...@thetaphi.de>.
Next one: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6395/console

 

I will vote all releases of 4.4 with -1 until this is fixed! It hangs on my local computer, too! Tests pass only ½ of the time, the remaining time it hangs with permgen errors.

 

Uwe

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

 <http://www.thetaphi.de/> http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

From: dawid.weiss@gmail.com [mailto:dawid.weiss@gmail.com] On Behalf Of Dawid Weiss
Sent: Monday, July 08, 2013 2:16 PM
To: dev@lucene.apache.org
Subject: Re: Several builds hanging pecause of permgen

 

 

Not much I can do from my side about permgen errors. There is really no way to deal with these from within Java (the same process) -- you cannot effectively handle anything because your own classes may not load at all. 

 

Dawid

On Mon, Jul 8, 2013 at 1:35 PM, Uwe Schindler <uw...@thetaphi.de> wrote:

Another one, this time on OSX:

http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/617/

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

 <http://www.thetaphi.de/> http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

From: Robert Muir [mailto:rcmuir@gmail.com] 
Sent: Sunday, July 07, 2013 11:15 PM
To: dev@lucene.apache.org
Subject: Re: Several builds hanging pecause of permgen

 

When there were leaks from static classes, we added a checker to LuceneTestCase that looks for RAM > N and fails with debugging information.

I wonder if some similar check is possible for this case (to make it easier than going thru heapdumps, and to find issues before crash-time)...

On Sun, Jul 7, 2013 at 4:10 PM, Uwe Schindler <uw...@thetaphi.de> wrote:

Another one: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6385/testReport/junit/junit.framework/TestSuite/org_apache_solr_request_SimpleFacetsTest/

Had to be killed with kill -9


-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Uwe Schindler [mailto:uwe@thetaphi.de]

> Sent: Saturday, July 06, 2013 10:16 PM
> To: dev@lucene.apache.org
> Subject: RE: Several builds hanging pecause of permgen
>
> Another one:
> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6375/console
>
> I was only able to kill the JVM with kill -9 I am sure, it's horrible slowdoop!
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
>
> > -----Original Message-----
> > From: Uwe Schindler [mailto:uwe@thetaphi.de]
> > Sent: Friday, July 05, 2013 3:59 PM
> > To: dev@lucene.apache.org
> > Subject: Several builds hanging pecause of permgen
> >
> > Several Jenkins builds now hang because of permgen. The runner JVM is
> > dead (can only be killed by -9), last example:
> >
> > http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6360/console
> >
> > -----
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: uwe@thetaphi.de
> >
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
> > additional commands, e-mail: dev-help@lucene.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For additional
> commands, e-mail: dev-help@lucene.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

 

 


Re: Several builds hanging pecause of permgen

Posted by Dawid Weiss <da...@cs.put.poznan.pl>.
Not much I can do from my side about permgen errors. There is really no way
to deal with these from within Java (the same process) -- you cannot
effectively handle anything because your own classes may not load at all.

Dawid

On Mon, Jul 8, 2013 at 1:35 PM, Uwe Schindler <uw...@thetaphi.de> wrote:

> Another one, this time on OSX:****
>
> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/617/****
>
> ** **
>
> -----****
>
> Uwe Schindler****
>
> H.-H.-Meier-Allee 63, D-28213 Bremen****
>
> http://www.thetaphi.de****
>
> eMail: uwe@thetaphi.de****
>
> ** **
>
> *From:* Robert Muir [mailto:rcmuir@gmail.com]
> *Sent:* Sunday, July 07, 2013 11:15 PM
> *To:* dev@lucene.apache.org
> *Subject:* Re: Several builds hanging pecause of permgen****
>
> ** **
>
> When there were leaks from static classes, we added a checker to
> LuceneTestCase that looks for RAM > N and fails with debugging information.
>
> I wonder if some similar check is possible for this case (to make it
> easier than going thru heapdumps, and to find issues before crash-time)...
> ****
>
> On Sun, Jul 7, 2013 at 4:10 PM, Uwe Schindler <uw...@thetaphi.de> wrote:****
>
> Another one:
> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6385/testReport/junit/junit.framework/TestSuite/org_apache_solr_request_SimpleFacetsTest/
>
> Had to be killed with kill -9****
>
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
>
> > -----Original Message-----
> > From: Uwe Schindler [mailto:uwe@thetaphi.de]****
>
> > Sent: Saturday, July 06, 2013 10:16 PM
> > To: dev@lucene.apache.org
> > Subject: RE: Several builds hanging pecause of permgen
> >
> > Another one:
> > http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6375/console
> >
> > I was only able to kill the JVM with kill -9 I am sure, it's horrible
> slowdoop!
> >
> > -----
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: uwe@thetaphi.de
> >
> >
> > > -----Original Message-----
> > > From: Uwe Schindler [mailto:uwe@thetaphi.de]
> > > Sent: Friday, July 05, 2013 3:59 PM
> > > To: dev@lucene.apache.org
> > > Subject: Several builds hanging pecause of permgen
> > >
> > > Several Jenkins builds now hang because of permgen. The runner JVM is
> > > dead (can only be killed by -9), last example:
> > >
> > > http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6360/console
> > >
> > > -----
> > > Uwe Schindler
> > > H.-H.-Meier-Allee 63, D-28213 Bremen
> > > http://www.thetaphi.de
> > > eMail: uwe@thetaphi.de
> > >
> > >
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
> > > additional commands, e-mail: dev-help@lucene.apache.org
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For additional
> > commands, e-mail: dev-help@lucene.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org****
>
> ** **
>

RE: Several builds hanging pecause of permgen

Posted by Uwe Schindler <uw...@thetaphi.de>.
Another one, this time on OSX:

http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-MacOSX/617/

 

-----

Uwe Schindler

H.-H.-Meier-Allee 63, D-28213 Bremen

 <http://www.thetaphi.de/> http://www.thetaphi.de

eMail: uwe@thetaphi.de

 

From: Robert Muir [mailto:rcmuir@gmail.com] 
Sent: Sunday, July 07, 2013 11:15 PM
To: dev@lucene.apache.org
Subject: Re: Several builds hanging pecause of permgen

 

When there were leaks from static classes, we added a checker to LuceneTestCase that looks for RAM > N and fails with debugging information.

I wonder if some similar check is possible for this case (to make it easier than going thru heapdumps, and to find issues before crash-time)...

On Sun, Jul 7, 2013 at 4:10 PM, Uwe Schindler <uw...@thetaphi.de> wrote:

Another one: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6385/testReport/junit/junit.framework/TestSuite/org_apache_solr_request_SimpleFacetsTest/

Had to be killed with kill -9


-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Uwe Schindler [mailto:uwe@thetaphi.de]

> Sent: Saturday, July 06, 2013 10:16 PM
> To: dev@lucene.apache.org
> Subject: RE: Several builds hanging pecause of permgen
>
> Another one:
> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6375/console
>
> I was only able to kill the JVM with kill -9 I am sure, it's horrible slowdoop!
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
>
> > -----Original Message-----
> > From: Uwe Schindler [mailto:uwe@thetaphi.de]
> > Sent: Friday, July 05, 2013 3:59 PM
> > To: dev@lucene.apache.org
> > Subject: Several builds hanging pecause of permgen
> >
> > Several Jenkins builds now hang because of permgen. The runner JVM is
> > dead (can only be killed by -9), last example:
> >
> > http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6360/console
> >
> > -----
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: uwe@thetaphi.de
> >
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
> > additional commands, e-mail: dev-help@lucene.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For additional
> commands, e-mail: dev-help@lucene.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

 


Re: Several builds hanging pecause of permgen

Posted by Robert Muir <rc...@gmail.com>.
When there were leaks from static classes, we added a checker to
LuceneTestCase that looks for RAM > N and fails with debugging information.

I wonder if some similar check is possible for this case (to make it easier
than going thru heapdumps, and to find issues before crash-time)...

On Sun, Jul 7, 2013 at 4:10 PM, Uwe Schindler <uw...@thetaphi.de> wrote:

> Another one:
> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6385/testReport/junit/junit.framework/TestSuite/org_apache_solr_request_SimpleFacetsTest/
>
> Had to be killed with kill -9
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
>
>
> > -----Original Message-----
> > From: Uwe Schindler [mailto:uwe@thetaphi.de]
> > Sent: Saturday, July 06, 2013 10:16 PM
> > To: dev@lucene.apache.org
> > Subject: RE: Several builds hanging pecause of permgen
> >
> > Another one:
> > http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6375/console
> >
> > I was only able to kill the JVM with kill -9 I am sure, it's horrible
> slowdoop!
> >
> > -----
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: uwe@thetaphi.de
> >
> >
> > > -----Original Message-----
> > > From: Uwe Schindler [mailto:uwe@thetaphi.de]
> > > Sent: Friday, July 05, 2013 3:59 PM
> > > To: dev@lucene.apache.org
> > > Subject: Several builds hanging pecause of permgen
> > >
> > > Several Jenkins builds now hang because of permgen. The runner JVM is
> > > dead (can only be killed by -9), last example:
> > >
> > > http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6360/console
> > >
> > > -----
> > > Uwe Schindler
> > > H.-H.-Meier-Allee 63, D-28213 Bremen
> > > http://www.thetaphi.de
> > > eMail: uwe@thetaphi.de
> > >
> > >
> > >
> > >
> > > ---------------------------------------------------------------------
> > > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
> > > additional commands, e-mail: dev-help@lucene.apache.org
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For additional
> > commands, e-mail: dev-help@lucene.apache.org
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
> For additional commands, e-mail: dev-help@lucene.apache.org
>
>

RE: Several builds hanging pecause of permgen

Posted by Uwe Schindler <uw...@thetaphi.de>.
Another one: http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6385/testReport/junit/junit.framework/TestSuite/org_apache_solr_request_SimpleFacetsTest/

Had to be killed with kill -9

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Uwe Schindler [mailto:uwe@thetaphi.de]
> Sent: Saturday, July 06, 2013 10:16 PM
> To: dev@lucene.apache.org
> Subject: RE: Several builds hanging pecause of permgen
> 
> Another one:
> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6375/console
> 
> I was only able to kill the JVM with kill -9 I am sure, it's horrible slowdoop!
> 
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
> 
> 
> > -----Original Message-----
> > From: Uwe Schindler [mailto:uwe@thetaphi.de]
> > Sent: Friday, July 05, 2013 3:59 PM
> > To: dev@lucene.apache.org
> > Subject: Several builds hanging pecause of permgen
> >
> > Several Jenkins builds now hang because of permgen. The runner JVM is
> > dead (can only be killed by -9), last example:
> >
> > http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6360/console
> >
> > -----
> > Uwe Schindler
> > H.-H.-Meier-Allee 63, D-28213 Bremen
> > http://www.thetaphi.de
> > eMail: uwe@thetaphi.de
> >
> >
> >
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For
> > additional commands, e-mail: dev-help@lucene.apache.org
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For additional
> commands, e-mail: dev-help@lucene.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org


RE: Several builds hanging pecause of permgen

Posted by Uwe Schindler <uw...@thetaphi.de>.
Another one:
http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6375/console

I was only able to kill the JVM with kill -9
I am sure, it's horrible slowdoop!

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: uwe@thetaphi.de


> -----Original Message-----
> From: Uwe Schindler [mailto:uwe@thetaphi.de]
> Sent: Friday, July 05, 2013 3:59 PM
> To: dev@lucene.apache.org
> Subject: Several builds hanging pecause of permgen
> 
> Several Jenkins builds now hang because of permgen. The runner JVM is
> dead (can only be killed by -9), last example:
> 
> http://jenkins.thetaphi.de/job/Lucene-Solr-4.x-Linux/6360/console
> 
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: uwe@thetaphi.de
> 
> 
> 
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org For additional
> commands, e-mail: dev-help@lucene.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org