You are viewing a plain text version of this content. The canonical link for it is here.
Posted to user@geronimo.apache.org by Geronimo User <ge...@gmail.com> on 2007/07/31 16:30:52 UTC

Asking again: No registered context in security getCurrentContext() after apps run for two days.

We are using geronimo-1.1.1 and have a reproducible problem.

When we start Geronimo with our apps deployed in it, and leave it to
run for two days, we cannot then access geronimo for some requests.

When we submit a simple request, e.g., http://localhost:8080/console
we get the error shown at the end of this message.

We've looked in Geronimo's JIRA system for bugs in these classes and
packages, searched Google, looked at the source code in Geronimo's
1.1.1 SVN repository, tried to turn on logging in various packages in
Geronimo.

We have reduced our session timeout settings to a very small number to
try to reproduce the issue in case it depended on sessions timing out;
but that didn't cause the problem.

We've looked for other settings that establish timeouts on other
things, and didn't find them.

What should we do now in order to find out why this is happening?
Are there any troubleshooting properties we can turn on?

Thanks.

2007-07-30 09:31:33,415 ERROR [CoyoteAdapter] An exception or error
occurred in the container during the request processing
java.lang.AssertionError: No registered context
       at org.apache.geronimo.security.ContextManager.getCurrentContext(ContextManager.java:165)
       at org.apache.geronimo.tomcat.realm.TomcatGeronimoRealm.hasResourcePermission(TomcatGeronimoRealm.java:194)
       at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:506)
       at org.apache.geronimo.tomcat.GeronimoStandardContext$SystemMethodValve.invoke(GeronimoStandardContext.java:342)
       at org.apache.geronimo.tomcat.valve.GeronimoBeforeAfterValve.invoke(GeronimoBeforeAfterValve.java:31)
       at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:126)
       at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:105)
       at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:107)
       at org.apache.catalina.valves.AccessLogValve.invoke(AccessLogValve.java:541)
       at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:148)
       at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:869)
       at org.apache.coyote.http11.Http11BaseProtocol$Http11ConnectionHandler.processConnection(Http11BaseProtocol.java:667)
       at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket(PoolTcpEndpoint.java:527)
       at org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt(LeaderFollowerWorkerThread.java:80)
       at org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:684)
       at java.lang.Thread.run(Thread.java:595)

Re: Asking again: No registered context in security getCurrentContext() after apps run for two days.

Posted by Geronimo User <ge...@gmail.com>.
On 7/31/07, Kevan Miller <ke...@gmail.com> wrote:
>
> So, all is well for the first two days?

Yes, no issues seen in the geronimo.log.

> Then some requests fail (e.g.
> the console), but others work? No other indications of problems in
> the server log?

I'm working on the hypothesis that requests that would force a j_securitycheck
workflow are the ones that fail. That is just a hypothesis, though.

I can get other requests to work by submitting, via a browser, requests
that I can see have been previously submitted, by looking at the catalina logs.

> I'd guess you're either running out of some resource (threads/
> connectors, etc) or you have a memory leak. Have you checked your
> memory utilization? Assuming Java 5, you could run with the following
> JAVA_OPTS="-XX:+PrintGCDetails -XX:
> +HeapDumpOnOutOfMemoryError" (assuming you're using startup/geronimo
> sh/bat scripts). Alternatively, bump up your heap and permgen
> settings and see if the time-to-failure increases -- JAVA_OPTS="-
> Xmx256m -XX:MaxPermSize=128m", for example...
>
> --kevan

Here are our JAVA_OPTS, which were in effect when we reproduced the problem.
We didn't see any memory-related issues.

JAVA_OPTS=-enableassertions -Xms1024M -Xmx1024M
-Xrunjdwp:transport=dt_socket,server=y,suspend=y,address=8787
-verbose:gc -XX:+UseConcMarkSweepGC -XX:+CMSPermGenSweepingEnabl
ed -XX:+CMSClassUnloadingEnabled -XX:MaxPermSize=128m

I am starting Geronimo from the command line in .../geronimo-1.1.1/bin, using:

"C:\Program Files\informiam\geronimo-1.1.1\bin\startup.bat"

Thanks.

Re: Asking again: No registered context in security getCurrentContext() after apps run for two days.

Posted by Kevan Miller <ke...@gmail.com>.
On Jul 31, 2007, at 10:30 AM, Geronimo User wrote:

> We are using geronimo-1.1.1 and have a reproducible problem.
>
> When we start Geronimo with our apps deployed in it, and leave it to
> run for two days, we cannot then access geronimo for some requests.
>
> When we submit a simple request, e.g., http://localhost:8080/console
> we get the error shown at the end of this message.
>
> We've looked in Geronimo's JIRA system for bugs in these classes and
> packages, searched Google, looked at the source code in Geronimo's
> 1.1.1 SVN repository, tried to turn on logging in various packages in
> Geronimo.
>
> We have reduced our session timeout settings to a very small number to
> try to reproduce the issue in case it depended on sessions timing out;
> but that didn't cause the problem.
>
> We've looked for other settings that establish timeouts on other
> things, and didn't find them.
>
> What should we do now in order to find out why this is happening?
> Are there any troubleshooting properties we can turn on?

Heh. Nice email address...

So, all is well for the first two days? Then some requests fail (e.g.  
the console), but others work? No other indications of problems in  
the server log?

I'd guess you're either running out of some resource (threads/ 
connectors, etc) or you have a memory leak. Have you checked your  
memory utilization? Assuming Java 5, you could run with the following  
JAVA_OPTS="-XX:+PrintGCDetails -XX: 
+HeapDumpOnOutOfMemoryError" (assuming you're using startup/geronimo  
sh/bat scripts). Alternatively, bump up your heap and permgen  
settings and see if the time-to-failure increases -- JAVA_OPTS="- 
Xmx256m -XX:MaxPermSize=128m", for example...

--kevan

>
> Thanks.
>
> 2007-07-30 09:31:33,415 ERROR [CoyoteAdapter] An exception or error
> occurred in the container during the request processing
> java.lang.AssertionError: No registered context
>        at  
> org.apache.geronimo.security.ContextManager.getCurrentContext 
> (ContextManager.java:165)
>        at  
> org.apache.geronimo.tomcat.realm.TomcatGeronimoRealm.hasResourcePermis 
> sion(TomcatGeronimoRealm.java:194)
>        at org.apache.catalina.authenticator.AuthenticatorBase.invoke 
> (AuthenticatorBase.java:506)
>        at org.apache.geronimo.tomcat.GeronimoStandardContext 
> $SystemMethodValve.invoke(GeronimoStandardContext.java:342)
>        at  
> org.apache.geronimo.tomcat.valve.GeronimoBeforeAfterValve.invoke 
> (GeronimoBeforeAfterValve.java:31)
>        at org.apache.catalina.core.StandardHostValve.invoke 
> (StandardHostValve.java:126)
>        at org.apache.catalina.valves.ErrorReportValve.invoke 
> (ErrorReportValve.java:105)
>        at org.apache.catalina.core.StandardEngineValve.invoke 
> (StandardEngineValve.java:107)
>        at org.apache.catalina.valves.AccessLogValve.invoke 
> (AccessLogValve.java:541)
>        at org.apache.catalina.connector.CoyoteAdapter.service 
> (CoyoteAdapter.java:148)
>        at org.apache.coyote.http11.Http11Processor.process 
> (Http11Processor.java:869)
>        at org.apache.coyote.http11.Http11BaseProtocol 
> $Http11ConnectionHandler.processConnection(Http11BaseProtocol.java: 
> 667)
>        at org.apache.tomcat.util.net.PoolTcpEndpoint.processSocket 
> (PoolTcpEndpoint.java:527)
>        at  
> org.apache.tomcat.util.net.LeaderFollowerWorkerThread.runIt 
> (LeaderFollowerWorkerThread.java:80)
>        at org.apache.tomcat.util.threads.ThreadPool 
> $ControlRunnable.run(ThreadPool.java:684)
>        at java.lang.Thread.run(Thread.java:595)


RE: Asking again: No registered context in security getCurrentContext() after apps run for two days.

Posted by Scott Stanchfield <Sc...@hillcrestlabs.com>.
Cool. Sounds even better. Prolly need to separate (in ExpirationMonitor)

    if (session.isDone() || age > maxLoginDurationMillis) {
        list.add(session); // to unregister
        session.setDone(true);
        it.remove();
    }

into

    if (session.isDone()) {
        list.add(session); // to unregister
        session.setDone(true);
        it.remove();
    } else if (age > maxLoginDurationMillis) {
        // replace with anon session
        // (haven't dug in to find how to create it)
    }

- Scott


> -----Original Message-----
> From: David Jencks [mailto:david_jencks@yahoo.com] 
> Sent: Wednesday, October 10, 2007 8:56 PM
> To: user@geronimo.apache.org
> Subject: Re: Asking again: No registered context in security 
> getCurrentContext() after apps run for two days.
> 
> I wonder if it would be appropriate to replace the expired 
> context with something representing the "no one" identity?  
> then the ACE might occur naturally when "no one" can't do 
> something but stuff that  
> is actually allowed for an unauthenticated user would still work...   
> so I think this would modify the timeout method rather than 
> the code you are looking at.
> 
> haven't looked into it in detail yet.
> 
> thanks
> david jencks

Re: Asking again: No registered context in security getCurrentContext() after apps run for two days.

Posted by David Jencks <da...@yahoo.com>.
I wonder if it would be appropriate to replace the expired context  
with something representing the "no one" identity?  then the ACE  
might occur naturally when "no one" can't do something but stuff that  
is actually allowed for an unauthenticated user would still work...   
so I think this would modify the timeout method rather than the code  
you are looking at.

haven't looked into it in detail yet.

thanks
david jencks

On Oct 10, 2007, at 4:08 PM, Scott Stanchfield wrote:

>
> Sorry -- forgot the code...
>
> Replace
>
>   assert context != null : "No registered context";
>
> with
>
>   if (context == null)
>     throw new AccessControlException("No registered context (may have
> expired)");
>
>
> Scott Stanchfield wrote:
>>
>> Suggested Fix
>>
>> Change getCurrentContext() to throw and AccessControlException when
>> context is null to prevent skipping of "after" processing in
>> BeforeAfterValve. All callers of getCurrentContext() already catch
>> AccessControlException and treat it as "not authorized".
>>
>
> -- 
> View this message in context: http://www.nabble.com/Asking-again%3A- 
> No-registered-context-in-security-getCurrentContext%28%29-after- 
> apps-run-for-two-days.-tf4193578s134.html#a13146591
> Sent from the Apache Geronimo - Users mailing list archive at  
> Nabble.com.
>


Re: Asking again: No registered context in security getCurrentContext() after apps run for two days.

Posted by Scott Stanchfield <sc...@hcrest.com>.
Sorry -- forgot the code...

Replace

  assert context != null : "No registered context";

with

  if (context == null) 
    throw new AccessControlException("No registered context (may have
expired)");


Scott Stanchfield wrote:
> 
> Suggested Fix
> 
> Change getCurrentContext() to throw and AccessControlException when
> context is null to prevent skipping of "after" processing in
> BeforeAfterValve. All callers of getCurrentContext() already catch
> AccessControlException and treat it as "not authorized".
> 

-- 
View this message in context: http://www.nabble.com/Asking-again%3A-No-registered-context-in-security-getCurrentContext%28%29-after-apps-run-for-two-days.-tf4193578s134.html#a13146591
Sent from the Apache Geronimo - Users mailing list archive at Nabble.com.


Re: Asking again: No registered context in security getCurrentContext() after apps run for two days.

Posted by Scott Stanchfield <sc...@hcrest.com>.
We had the same issue, and didn't see much about it online (except this note
and bug GERONIMO-2100)

I debugged through the Geronimo code for a while today, and I think I've
found the solution (which I also think is the proper solution to
GERONIMO-2100 as well...)

The problem

[In this description, getCurrentContext() refers to 
org.apache.geronimo.security.ContextManager.getCurrentContext() ]

The JaasLoginService.ExpirationMonitor kills contexts based on a maximum
login time (the default is one day). To see if this initiating the problem,
I changed the timeout to two minutes. The problem appeared, though not on
every access.

I debugged and watched what was happening thread-wise. It seemed that once a
thread went bad, all requests handled in that thread die with the
NullPointerException (or if assertions are on, the AssertionError.)

The exception is thrown from getCurrentContext() because it is being asked
for a context that has been unregistered by ExpirationMonitor.

There are some BeforeAfter processing that take place around calls to
getCurrentContext(). In particular,
org.apache.geronimo.tomcat.interceptor.PolicyContextBeforeAfter calls
ContextManager's setCallers() and popCallers() methods.

When getCurrentContext() throws a NullPointerException, the "after"
processing is skipped; this fails to call popCallers(). This is very evil
indeed, as the current thread is left in a bad state, thinking it already
has a subject, which has recently been invalidated by the ExpirationMonitor.

Any further HTTP requests assigned to that thread are now treated as though
they have a Subject, which in turn causes a NullPointerException from this
method.


Suggested Fix

Change getCurrentContext() to throw and AccessControlException when context
is null to prevent skipping of "after" processing in BeforeAfterValve. All
callers of getCurrentContext() already catch AccessControlException and
treat it as "not authorized".

I haven't tried this yet (I'm settting up the sources for a build), but I've
got warm fuzzies about it...

Thoughts?
-- Scott


Geronimo User wrote:
> 
> [...]
> When we start Geronimo with our apps deployed in it, and leave it to
> run for two days, we cannot then access geronimo for some requests.
> 
> [...]
> 
> 2007-07-30 09:31:33,415 ERROR [CoyoteAdapter] An exception or error
> occurred in the container during the request processing
> java.lang.AssertionError: No registered context
>        at
> org.apache.geronimo.security.ContextManager.getCurrentContext(ContextManager.java:165)
> [...]
> 

-- 
View this message in context: http://www.nabble.com/Asking-again%3A-No-registered-context-in-security-getCurrentContext%28%29-after-apps-run-for-two-days.-tf4193578s134.html#a13146261
Sent from the Apache Geronimo - Users mailing list archive at Nabble.com.