Posted to users@tomcat.apache.org by Joel Klein <jf...@wolfram.com> on 2011/05/05 21:35:16 UTC

Preventing or recovering from SocketException: Interrupted system call

I'm using Tomcat 5.5 to host a webapp that we deploy on Windows, Linux, 
and OS X.  The webapp includes a Java Native Library.

Right now we're seeing a scenario only on Mac OS X where Tomcat shuts 
down with this message:

SEVERE: StandardServer.await: accept:
java.net.SocketException: Interrupted system call
	at java.net.PlainSocketImpl.socketAccept(Native Method)
	at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:402)
	at java.net.ServerSocket.implAccept(ServerSocket.java:450)
	at java.net.ServerSocket.accept(ServerSocket.java:421)
	at org.apache.catalina.core.StandardServer.await(StandardServer.java:379)

The user scenario that reproduces this is one where we're pretty sure 
system signals are being fired, which could account for why the system 
call is interrupted.

This has been seen with Java 1.5 on OS X 10.5.8 and Java 1.6.0_24 on OS 
X 10.6.7.

My question to the list is whether it will work to modify 
StandardServer.java to ignore the exception and continue on.  My 
understanding is the interrupted system call isn't actually fatal.

Are there other ideas for either preventing or recovering from this 
situation?

-- 
Joel Klein
Kernel Developer, Wolfram Research, Inc.



Re: Preventing or recovering from SocketException: Interrupted system call

Posted by Martin Kuen <ma...@gmail.com>.
Hi Joel,

I did some IPC programming on Linux a while ago. I have hardly ever looked
at the JVM's source code, and I don't have much experience with C/C++
cross-platform development. However, as I haven't seen any responses so
far, here are my two cents:

On Thu, May 5, 2011 at 9:35 PM, Joel Klein <jf...@wolfram.com> wrote:

>
> I'm using Tomcat 5.5 to host a webapp that we deploy on Windows, Linux, and
> OS X.  The webapp includes a Java Native Library.


> Right now we're seeing a scenario only on Mac OS X where Tomcat shuts down
> with this message:
>
> SEVERE: StandardServer.await: accept:
> java.net.SocketException: Interrupted system call
>        at java.net.PlainSocketImpl.socketAccept(Native Method)
>        at java.net.PlainSocketImpl.accept(PlainSocketImpl.java:402)
>        at java.net.ServerSocket.implAccept(ServerSocket.java:450)
>        at java.net.ServerSocket.accept(ServerSocket.java:421)
>        at org.apache.catalina.core.StandardServer.await(StandardServer.java:379)
>
> The user scenario that reproduces this is one where we're pretty sure
> system signals are being fired, which could account for why the system call
> is interrupted.
>

You see this not simply because a signal was raised (although that is the
trigger), but because of how the process installed the handler for that
signal (see the SA_RESTART flag). Judging from your stack trace,
"PlainSocketImpl.socketAccept" ends up calling a native blocking "I/O
primitive", and that call is what gets interrupted.

If a handler is installed with the SA_RESTART flag, the interrupted
operation is silently restarted. Otherwise the call fails with EINTR
("Interrupted system call") and it's up to the programmer to decide:
restart or abandon ship?

If the JVM programmers installed all their handlers with SA_RESTART, there
is no need for them to care about an interrupted primitive. But what does
your library do?
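
To make the SA_RESTART point concrete, here is a minimal C sketch of
installing a handler with and without the flag. The names install_handler
and on_signal are made up for illustration; this is not code from the JVM or
from any particular library.

```c
/* Ask for POSIX declarations even under a strict -std=c99 build. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <signal.h>
#include <string.h>

/* Hypothetical do-nothing handler, used only for illustration. */
void on_signal(int signo)
{
    (void)signo;
}

/*
 * Install `handler` for `signo`. If `restart` is non-zero, SA_RESTART is
 * set, so most blocking calls interrupted by this signal are restarted
 * automatically instead of failing with EINTR.
 * Returns 0 on success, -1 on failure (errno is set by sigaction).
 */
int install_handler(int signo, void (*handler)(int), int restart)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = restart ? SA_RESTART : 0;

    return sigaction(signo, &sa, NULL);
}
```

A library that calls something like install_handler(SIGUSR1, on_signal, 0)
leaves every thread blocked in accept() or read() exposed to EINTR when that
signal arrives; passing 1 instead avoids that for the calls the platform is
willing to restart.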

A process can have exactly one handler installed per signal. If handler A
was installed for a signal and something in your program later installs
handler B for the same signal, handler A is replaced. The C library itself
provides no chaining or composition of handlers.
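
Since the C library does not chain handlers, a library that wants to coexist
with an earlier handler has to do it by hand. A rough sketch, assuming the
earlier handler is a plain one-argument handler; all function and variable
names here are hypothetical:

```c
/* Ask for POSIX declarations even under a strict -std=c99 build. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <signal.h>
#include <string.h>

/* Counters so the effect of each handler can be observed. */
volatile sig_atomic_t first_calls = 0;
volatile sig_atomic_t second_calls = 0;

/* Whatever was installed before install_chained() ran. */
static struct sigaction previous_action;

/* "Handler A": stands in for a handler installed earlier. */
void first_handler(int signo)
{
    (void)signo;
    first_calls++;
}

/* "Handler B": does its own work, then forwards to the saved handler. */
void chaining_handler(int signo)
{
    second_calls++;
    if (!(previous_action.sa_flags & SA_SIGINFO) &&
        previous_action.sa_handler != SIG_DFL &&
        previous_action.sa_handler != SIG_IGN) {
        previous_action.sa_handler(signo);
    }
}

/* Install a plain handler, discarding whatever was there before. */
int install_plain(int signo, void (*handler)(int))
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(signo, &sa, NULL);
}

/* Install handler B and remember the old action so B can forward to it. */
int install_chained(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = chaining_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(signo, &sa, &previous_action);
}
```

After install_plain(SIGUSR1, first_handler) followed by
install_chained(SIGUSR1), a single raise(SIGUSR1) runs both handlers. If the
second install passed NULL instead of &previous_action, handler A would be
lost.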

It's possible that your library
a) overwrote a JVM signal handler, or
b) installed a signal handler the wrong way (from the JVM's point of view).


>
> This has been seen with Java 1.5 on OS X 10.5.8 and Java 1.6.0_24 on OS X
> 10.6.7.
>

Keep in mind that there may be "#define" directives in the library's code
that alter the source depending on the build environment.


>
> My question to the list is whether it will work to modify
> StandardServer.java to ignore the exception and continue on.


No, at least not in the long run. FileInputStream.read() may show the same
behaviour (I guess this method ends up in such a primitive as well).
Reading a byte from a file is simply much faster than waiting for incoming
connections, so it is less likely to be interrupted. As for why a handler
would lack SA_RESTART: if, for example, the library's author listens for
TCP client connections in blocking mode, he/she may have found it easiest
to return from that call by raising and handling a signal without
SA_RESTART.
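
For comparison, this is roughly what "restart it yourself" looks like at a
single call site in C. It is only a sketch (read_retrying is a made-up
name), and it illustrates the problem with patching one place: every
blocking call in the process would need the same loop.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Like read(2), but retries when the call is interrupted by a signal
 * (EINTR) before any data was transferred. Every other result, including
 * other errors, is returned to the caller unchanged.
 */
ssize_t read_retrying(int fd, void *buf, size_t len)
{
    ssize_t n;

    do {
        n = read(fd, buf, len);
    } while (n == -1 && errno == EINTR);

    return n;
}
```

Catching the SocketException in StandardServer.await would be the Java-side
equivalent for accept() only; reads, writes, and other blocking calls
elsewhere would remain exposed.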


> My understanding is the interrupted system call isn't actually fatal.


> Are there other ideas for either preventing or recovering from this
> situation?
>


a) Search for "java signal-chaining". This way your JNI lib should be able
to install signal handlers without affecting the JVM's signal handlers. I
have never tried this.

b) You could enable additional JNI error checks, with something like
-Xcheck:jni. If I recall correctly this only affects parameter validation
(sorry, not sure).

c) It's possible that the OS X build has a piece of buggy code (or code that
needed special treatment) which is never compiled for other platforms.

d) You may want to think about writing a second application, which basically
wraps the functionality provided by your native code and exposes it e.g.
using RMI. This way you could at least degrade gracefully if "application
wrapping JNI functionality" crashes. You may want to use jsvc to wrap this
app so the downtime is minimized (or use a cronjob for that). Yes, this is
ugly, but it's Friday :)



Best Regards,

Martin