You are viewing a plain text version of this content. The canonical link for it is here.

Posted to dev@tomcat.apache.org by Mladen Turk <mt...@apache.org> on 2013/10/01 06:53:36 UTC

Re: Recent tcnative null-dereference with 8.0.0-RC3 and 7.0.45 [tcnative-1.dll+0x7e23]

On 09/30/2013 08:50 PM, Christopher Schultz wrote:
> Mladen,
>
> Unless there is significant objection,

Yes there is.
The transition from  Java to JNI costs 10 times then
then a simple 'if (socket == OL)' in java

> I'd like to add such NULL checks
> to limit JVM crashes in cases where the Java code isn't absolutely
> perfect (with sincere apologies to all who manage the socket-management
> code in Tomcat).
>

-1. You are just hiding the reason for crash.


Regards
-- 
^TM

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org

Re: Recent tcnative null-dereference with 8.0.0-RC3 and 7.0.45 [tcnative-1.dll+0x7e23]

Posted by Christopher Schultz <ch...@christopherschultz.net>.

Rainer,

On 10/3/13 7:01 PM, Rainer Jung wrote:
> On 04.10.2013 00:37, Christopher Schultz wrote:
>> Rainer,
>>
>> On 10/3/13 5:40 PM, Rainer Jung wrote:
>>> On 03.10.2013 21:52, Mark Thomas wrote:
>>>> On 03/10/2013 17:34, Christopher Schultz wrote:
>>>>> On 10/3/13 11:42 AM, Mark Thomas wrote:
>>>
>>>> I was thinking maybe an APR function that gave a textual error message
>>>> for a given error code. Currently, the APR Java code reports just the
>>>> error number. It would be nice to include a meaningful text message.
>>>
>>> The function is apr_strerror() and already wired via
>>>
>>> public static native org.apache.tomcat.jni.Error.strerror(int statcode)
>>
>> There's also void tcn_ThrowAPRException(JNIEnv *e, apr_status_t err)
>> which throws an exception (actually, java.lang.Error) with the
>> appropriate error message.
> 
> I think it throws org.apache.tomcat.jni.Error which extends Exception.

Ah, yes. I had traced the definition of TCN_ERROR_CLASS to here:

  #define TCN_ERROR_CLASS TCN_CLASS_PATH "Error"

...and foolishly ignored the TCN_CLASS_PATH constant.

>> I think it would be useful to have a TomcatNativeException instead of
>> throwing java.lang.Error or maybe even be smart about converting certain
>> error codes (e.g. APR_EOF) to Java exception types (e.g.
>> java.io.EOFException).
>>
>> On the other hand, Mladen wants tcnative to be a thin wrapper around
>> APR, so maybe he doesn't want exceptions to be thrown directly from
>> native. I think if you're gonna use JNI, you may as well make it as nice
>> as possible.
> 
> There are already several places whre tcnative calls
> tcn_ThrowAPRException() or other tcn_Throw...Exception().

Yup. I was just suggesting that maybe we could improve upon that.

-chris

Re: Recent tcnative null-dereference with 8.0.0-RC3 and 7.0.45 [tcnative-1.dll+0x7e23]

Posted by Rainer Jung <ra...@kippdata.de>.

On 04.10.2013 00:37, Christopher Schultz wrote:
> Rainer,
> 
> On 10/3/13 5:40 PM, Rainer Jung wrote:
>> On 03.10.2013 21:52, Mark Thomas wrote:
>>> On 03/10/2013 17:34, Christopher Schultz wrote:
>>>> On 10/3/13 11:42 AM, Mark Thomas wrote:
>>
>>> I was thinking maybe an APR function that gave a textual error message
>>> for a given error code. Currently, the APR Java code reports just the
>>> error number. It would be nice to include a meaningful text message.
>>
>> The function is apr_strerror() and already wired via
>>
>> public static native org.apache.tomcat.jni.Error.strerror(int statcode)
> 
> There's also void tcn_ThrowAPRException(JNIEnv *e, apr_status_t err)
> which throws an exception (actually, java.lang.Error) with the
> appropriate error message.

I think it throws org.apache.tomcat.jni.Error which extends Exception.

> I think it would be useful to have a TomcatNativeException instead of
> throwing java.lang.Error or maybe even be smart about converting certain
> error codes (e.g. APR_EOF) to Java exception types (e.g.
> java.io.EOFException).
> 
> On the other hand, Mladen wants tcnative to be a thin wrapper around
> APR, so maybe he doesn't want exceptions to be thrown directly from
> native. I think if you're gonna use JNI, you may as well make it as nice
> as possible.

There are already several places whre tcnative calls
tcn_ThrowAPRException() or other tcn_Throw...Exception().

Regards,

Rainer


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org

Re: Recent tcnative null-dereference with 8.0.0-RC3 and 7.0.45 [tcnative-1.dll+0x7e23]

Posted by Christopher Schultz <ch...@christopherschultz.net>.

Rainer,

On 10/3/13 5:40 PM, Rainer Jung wrote:
> On 03.10.2013 21:52, Mark Thomas wrote:
>> On 03/10/2013 17:34, Christopher Schultz wrote:
>>> On 10/3/13 11:42 AM, Mark Thomas wrote:
> 
>> I was thinking maybe an APR function that gave a textual error message
>> for a given error code. Currently, the APR Java code reports just the
>> error number. It would be nice to include a meaningful text message.
> 
> The function is apr_strerror() and already wired via
> 
> public static native org.apache.tomcat.jni.Error.strerror(int statcode)

There's also void tcn_ThrowAPRException(JNIEnv *e, apr_status_t err)
which throws an exception (actually, java.lang.Error) with the
appropriate error message.

I think it would be useful to have a TomcatNativeException instead of
throwing java.lang.Error or maybe even be smart about converting certain
error codes (e.g. APR_EOF) to Java exception types (e.g.
java.io.EOFException).

On the other hand, Mladen wants tcnative to be a thin wrapper around
APR, so maybe he doesn't want exceptions to be thrown directly from
native. I think if you're gonna use JNI, you may as well make it as nice
as possible.

-chris

Re: Recent tcnative null-dereference with 8.0.0-RC3 and 7.0.45 [tcnative-1.dll+0x7e23]

Posted by Rainer Jung <ra...@kippdata.de>.

On 03.10.2013 21:52, Mark Thomas wrote:
> On 03/10/2013 17:34, Christopher Schultz wrote:
>> On 10/3/13 11:42 AM, Mark Thomas wrote:

> I was thinking maybe an APR function that gave a textual error message
> for a given error code. Currently, the APR Java code reports just the
> error number. It would be nice to include a meaningful text message.

The function is apr_strerror() and already wired via

public static native org.apache.tomcat.jni.Error.strerror(int statcode)

Regards,

Rainer

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org

Re: Recent tcnative null-dereference with 8.0.0-RC3 and 7.0.45 [tcnative-1.dll+0x7e23]

Posted by Christopher Schultz <ch...@christopherschultz.net>.

Mark,

On 10/3/13 6:29 PM, Christopher Schultz wrote:
> Okay, that's something I can do: a JNI wrapper around apr_strerror
> should be trivial.

Already exists:

    public static native String org.apache.tomcat.jni.Error.strerror(int
statcode);

-chris

Re: Recent tcnative null-dereference with 8.0.0-RC3 and 7.0.45 [tcnative-1.dll+0x7e23]

Posted by Christopher Schultz <ch...@christopherschultz.net>.

Mark,

On 10/3/13 3:52 PM, Mark Thomas wrote:
> On 03/10/2013 17:34, Christopher Schultz wrote:
>> On 10/3/13 11:42 AM, Mark Thomas wrote:
> 
>>> On the other hand, a JVM crash is a very strong motivation to fix
>>> an issue.
> 
>> For *you*, or for the user?
> 
> For me. I haven't looked at the historical bugs. The APR code has
> changed so much for WebSocket I'd be inclined to close all the
> historical issues as WORKSFORME (assuming they didn't come with a
> reproducible test case) and focus on the current code.

I kind of agree.... many of those reports are ooooold.

>> Certainly it is for the user, but given the number of unfixed crash
>> reports in Bugzilla, it doesn't seem like tcnative-crash=quick-fix
>> from the team. I've been trying to investigate them whenever
>> possible but it's hard for me to get more information and, frankly,
>> I don't know anything about APR socket management itself so my time
>> is only of limited use.
> 
> The AprEndpoint can be hard to get your head around starting from
> scratch. It has taken me a long time to feel comfortable making
> changes to that code. More comments in the AprEndpoint code should
> help. I've been trying to add them as I make changes.
> 
>>> - documenting some of the constraints around using SSL would
>>> have saved me some time when getting SSL and WebSocket working
> 
>> Can you be more specific without just writing the documentation 
>> yourself? I'm hoping to help, but I'm not sure what SSL constraints
>> you are alluding to.
> 
> If you get a partial read/write you have to repeat the call with
> exactly the same parameters (i.e. the same objects) as you made the
> first call.

Interesting. So partial-write means that you should just idempotently
re-try?

>>> -730053
> 
>> This one isn't valid, as far as I can tell (the error string is 
>> "Unrecognized resolver error"). 70053 is "Error string not
>> specified yet" so I'm not sure what that one is.
> 
> 720000 is the offset for OS native error codes.
> 10053 is client abort on Windows.

Aah, Windows. No wonder I couldn't find the error code on Linux ;)

>>> I can look them up to figure what they mean but it would save
>>> some time if the error report included a text version.
> 
>> tcnative doesn't have an error log, so where could those strings
>> go? Or were you thinking of having a bridge from Java code into
>> apr_strerr?
> 
> I was thinking maybe an APR function that gave a textual error message
> for a given error code. Currently, the APR Java code reports just the
> error number. It would be nice to include a meaningful text message.

Okay, that's something I can do: a JNI wrapper around apr_strerror
should be trivial.

>> What about a program like perror that understands APR error codes?
>> I've written a simple one that could be helpful, but you probably
>> did that yourself already.
> 
> Nope. I use Google :)

LOL. Really.

I've got a quick and dirty C program that compiles cleanly on Mac,
Linux. I suppose it would compile on Windows, but I don't have a win32
compiler.

>> Anywhere more information can be added, I'm happy to help.
> 
>>>>> - Refactoring the connectors so all socket access goes
>>>>> through the SocketWrapper so there is a much smaller set of
>>>>> code to validate.
>>>>
>>>> I'm guessing you are tackling this task slowly over time.
>>>
>>> I am moving slowly in this direction. My ultimate aim is to have
>>> the connector type specific code only in the Endpoint and the 
>>> SocketWrapper. No idea if that is possible. It is a longer term
>>> goal for Tomcat 9+ at this point.
>>>
>>> At the very least whenever I add functionality to the connectors
>>> (e.g. non-blocking IO) I do enough refactoring that I only have
>>> to add the new stuff once.
> 
>> Sounds good. Having unified code with only certain aspects
>> separated into BIO, NIO, and APR will certainly help folks like me
>> understand the "true" conceptual relationship between all of these
>> components and make it easier to actually help work on that part of
>> the code.
> 
> Exactly. Simpler maintenance is one of my goals when reducing the
> duplication in the connector code. It is also something that can often
> be done in small and/or simple steps. There is always scope for
> someone to start contributing in that area (with the caveat that they
> need to be careful not to bite off more than they can chew - it is
> also easy to get into a right mess).

:)

-chris

Re: Recent tcnative null-dereference with 8.0.0-RC3 and 7.0.45 [tcnative-1.dll+0x7e23]

Posted by Mark Thomas <ma...@apache.org>.

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 03/10/2013 17:34, Christopher Schultz wrote:
> On 10/3/13 11:42 AM, Mark Thomas wrote:

>> On the other hand, a JVM crash is a very strong motivation to fix
>> an issue.
> 
> For *you*, or for the user?

For me. I haven't looked at the historical bugs. The APR code has
changed so much for WebSocket I'd be inclined to close all the
historical issues as WORKSFORME (assuming they didn't come with a
reproducible test case) and focus on the current code.

> Certainly it is for the user, but given the number of unfixed crash
> reports in Bugzilla, it doesn't seem like tcnative-crash=quick-fix
> from the team. I've been trying to investigate them whenever
> possible but it's hard for me to get more information and, frankly,
> I don't know anything about APR socket management itself so my time
> is only of limited use.

The AprEndpoint can be hard to get your head around starting from
scratch. It has taken me a long time to feel comfortable making
changes to that code. More comments in the AprEndpoint code should
help. I've been trying to add them as I make changes.

>> - documenting some of the constraints around using SSL would
>> have saved me some time when getting SSL and WebSocket working
> 
> Can you be more specific without just writing the documentation 
> yourself? I'm hoping to help, but I'm not sure what SSL constraints
> you are alluding to.

If you get a partial read/write you have to repeat the call with
exactly the same parameters (i.e. the same objects) as you made the
first call.

>> -730053
> 
> This one isn't valid, as far as I can tell (the error string is 
> "Unrecognized resolver error"). 70053 is "Error string not
> specified yet" so I'm not sure what that one is.

720000 is the offset for OS native error codes.
10053 is client abort on Windows.

>> I can look them up to figure what they mean but it would save
>> some time if the error report included a text version.
> 
> tcnative doesn't have an error log, so where could those strings
> go? Or were you thinking of having a bridge from Java code into
> apr_strerr?

I was thinking maybe an APR function that gave a textual error message
for a given error code. Currently, the APR Java code reports just the
error number. It would be nice to include a meaningful text message.

> What about a program like perror that understands APR error codes?
> I've written a simple one that could be helpful, but you probably
> did that yourself already.

Nope. I use Google :)

> Anywhere more information can be added, I'm happy to help.
> 
>>>> - Refactoring the connectors so all socket access goes
>>>> through the SocketWrapper so there is a much smaller set of
>>>> code to validate.
>>> 
>>> I'm guessing you are tackling this task slowly over time.
>> 
>> I am moving slowly in this direction. My ultimate aim is to have
>> the connector type specific code only in the Endpoint and the 
>> SocketWrapper. No idea if that is possible. It is a longer term
>> goal for Tomcat 9+ at this point.
>> 
>> At the very least whenever I add functionality to the connectors
>> (e.g. non-blocking IO) I do enough refactoring that I only have
>> to add the new stuff once.
> 
> Sounds good. Having unified code with only certain aspects
> separated into BIO, NIO, and APR will certainly help folks like me
> understand the "true" conceptual relationship between all of these
> components and make it easier to actually help work on that part of
> the code.

Exactly. Simpler maintenance is one of my goals when reducing the
duplication in the connector code. It is also something that can often
be done in small and/or simple steps. There is always scope for
someone to start contributing in that area (with the caveat that they
need to be careful not to bite off more than they can chew - it is
also easy to get into a right mess).

Mark

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.9 (MingW32)
Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/

iQIcBAEBAgAGBQJSTcrlAAoJEBDAHFovYFnnlfcP/A0O/1uiL0wBZ4ZrNZo9OQkR
134nCL7lt8RmpojWEH5vbySn8+94Qm1ycgwKXxX5a0RccSIPsqg2YZaH9BXLXqA8
hWaT4Yr38owGeY4ODXnVjsP4NGClDUOWjaxsBzg7AU5k7krwnNnGiaoBzPfT1CS1
juBr2OjcqiyJ2nC7eMO5Nfjmkv9Ru5B6Ksd28hMoRqBxuORXIUal2DEjxSdC1f6D
iHkvn3a/zAdWGu+qpL2yR7Kqb3LT6yYDDCGiwZM3tsDqg+lRwu2yXmRbvo7zBEfL
8lG+WUHuvkC8+gqJeWDFg7ECUWSh6ZuX2AN0zudJtrvpFqcac+CHBR2ksapq0QUB
s3qq82OZ0qF5ibJNCO9w1JO2XHxY3HGo6wUKZ4J9d0A1NMuiU7/Dnyn6ob0npOBV
qOwRv0qrS0gaxOoLWWOvThgJnFMHBnhEra9CC9fkWF8huZo4zlupDVa6Zxr1FgyQ
zwm8PFKHUPIh/MMzKGtn5jp9M1JW/0XGUbPHEe2sasTKmmlp3WizlVoborajRSfG
P4vqPAYAZRl+r7kkr3ffMzQ/5sUkouPeD11qO4hRxUgntf2PeGSyONTsQRRNGEjV
oT7uHinDWi+0HmEEASTvzMhJ84A7m5p9vmkGC323oPL/fgitUSSTWPRryuHmEt+a
QMBvTmVxM8uaCzE6qfCW
=D89L
-----END PGP SIGNATURE-----

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org

Re: Recent tcnative null-dereference with 8.0.0-RC3 and 7.0.45 [tcnative-1.dll+0x7e23]

Posted by Christopher Schultz <ch...@christopherschultz.net>.

Mark,

On 10/3/13 11:42 AM, Mark Thomas wrote:
> On 03/10/2013 16:12, Christopher Schultz wrote:
>> On 10/3/13 9:18 AM, Mark Thomas wrote:
> 
>>> Having been responsible for introducing and fixing a number of
>>> these issues I disagree. It is usually fairly easy to tell what
>>> the problem is from the crash report (double close on a socket,
>>> invalid socket in the poller, etc). What is far more difficult is
>>> figuring out how things got into the known invalid state in the
>>> first place. No amount of debug information at the point of the
>>> crash is going to help with that.
>>
>> Agreed. Given that you have been researching these issues (and
>> fixing them -- thanks!) you have a unique perspective: even though
>> it's easy for you to determine the cause of a crash when you can
>> reproduce it, how about getting a crash report with little other
>> context? Would it help you to have more information or is the crash
>> report sufficient?
> 
> If I can't reproduce it then the fall-back is manual code review to
> try and figure out a code path (perhaps with some hints from the OP's
> report) that would result in the observed issue. As long as there is
> enough info in the crash report (there typically is) to figure out
> what is null / invalid then all is good. I haven't yet felt the need
> for further context.

Okay.

>>> A hard to reproduce bug in APR that triggers a crash is no more
>>> or less difficult to work with than a hard to reproduce bug in
>>> NIO that triggers an unexpected socket close.
>>
>> In those two cases -- NIO and APR -- is the NIO case any less 
>> catastrophic?
> 
> Sometimes yes, sometimes no.
>
>> I've been arguing that we should stop the JVM from crashing, but
>> locking-up the NIO connector and rendering the server inert is
>> roughly the same outcome. Is there actually any utility in stopping
>> the JVM from coming down? I guess you could get a thread dump from
>> an otherwise hung Tomcat, but probably not much else.
> 
> With a lock-up of infinite loop you can find out where you are (as
> with APR) but the how you got there is what you really want.
> 
> For an error that would otherwise result in a socket closure then a
> JVM crash is bad.

So there is at least some utility in preventing JVM crashes. The
question is whether or not avoiding a crash can meaningfully recover
that particular socket, poller, etc. If not, crashing is probably not
much better than the alternative.

> On the other hand, a JVM crash is a very strong motivation to fix an
> issue.

For *you*, or for the user? Certainly it is for the user, but given the
number of unfixed crash reports in Bugzilla, it doesn't seem like
tcnative-crash=quick-fix from the team. I've been trying to investigate
them whenever possible but it's hard for me to get more information and,
frankly, I don't know anything about APR socket management itself so my
time is only of limited use.

>>> I can think of several things that would be more useful: - Better
>>> Javadoc for the native methods. I can think of a number of times
>>> where better docs would have saved me a fair amount of time 
>>> debugging unexpected behaviour.
>>
>> Do you mean more documentation about how the method works, or even
>> just a simple description of what happens *at all*?
> 
> Some examples might help:
> - documenting the return codes for the non-blocking reads would have
> highlighted the problem that required the 1.1.28 release

Ok.

> - documenting the return codes for removing sockets from the poller
> would have highlighted the problem that is going to require the 1.1.29
> release

Ok.

> - documenting some of the constraints around using SSL would have
> saved me some time when getting SSL and WebSocket working

Can you be more specific without just writing the documentation
yourself? I'm hoping to help, but I'm not sure what SSL constraints you
are alluding to.

> As I have worked more with APR/native I have discovered various things
> that it would have been a great help to have in the docs. I've tried
> to remember to document them when I have found them but I've probably
> missed some.
> 
>>> - Something to turn an APR error response code into meaningful
>>> test.
>>
>> Can you explain this in a little more detail? For example, an APR
>> error code might be "bad socket", but as you say, the circumstances
>> of the bug are more important than the error code. How could such a
>> code be turned into a test case?
> 
> In the last few days I have come across the following error codes:
> -70014

I'm sure you know already that this is "End of file found"

> -730053

This one isn't valid, as far as I can tell (the error string is
"Unrecognized resolver error"). 70053 is "Error string not specified
yet" so I'm not sure what that one is.

The defined error codes for APR only go up to 70025 and even 70025 says
"Error string not specified".

> I can look them up to figure what they mean but it would save some
> time if the error report included a text version.

tcnative doesn't have an error log, so where could those strings go? Or
were you thinking of having a bridge from Java code into apr_strerr?

What about a program like perror that understands APR error codes? I've
written a simple one that could be helpful, but you probably did that
yourself already.

Anywhere more information can be added, I'm happy to help.

>>> - Refactoring the connectors so all socket access goes through
>>> the SocketWrapper so there is a much smaller set of code to
>>> validate.
>>
>> I'm guessing you are tackling this task slowly over time.
> 
> I am moving slowly in this direction. My ultimate aim is to have the
> connector type specific code only in the Endpoint and the
> SocketWrapper. No idea if that is possible. It is a longer term goal
> for Tomcat 9+ at this point.
> 
> At the very least whenever I add functionality to the connectors (e.g.
> non-blocking IO) I do enough refactoring that I only have to add the
> new stuff once.

Sounds good. Having unified code with only certain aspects separated
into BIO, NIO, and APR will certainly help folks like me understand the
"true" conceptual relationship between all of these components and make
it easier to actually help work on that part of the code.

-chris

Re: Recent tcnative null-dereference with 8.0.0-RC3 and 7.0.45 [tcnative-1.dll+0x7e23]

Posted by Mark Thomas <ma...@apache.org>.

On 03/10/2013 16:12, Christopher Schultz wrote:
> On 10/3/13 9:18 AM, Mark Thomas wrote:

>> Having been responsible for introducing and fixing a number of
>> these issues I disagree. It is usually fairly easy to tell what
>> the problem is from the crash report (double close on a socket,
>> invalid socket in the poller, etc). What is far more difficult is
>> figuring out how things got into the known invalid state in the
>> first place. No amount of debug information at the point of the
>> crash is going to help with that.
> 
> Agreed. Given that you have been researching these issues (and
> fixing them -- thanks!) you have a unique perspective: even though
> it's easy for you to determine the cause of a crash when you can
> reproduce it, how about getting a crash report with little other
> context? Would it help you to have more information or is the crash
> report sufficient?

If I can't reproduce it then the fall-back is manual code review to
try and figure out a code path (perhaps with some hints from the OP's
report) that would result in the observed issue. As long as there is
enough info in the crash report (there typically is) to figure out
what is null / invalid then all is good. I haven't yet felt the need
for further context.

>> A hard to reproduce bug in APR that triggers a crash is no more
>> or less difficult to work with than a hard to reproduce bug in
>> NIO that triggers an unexpected socket close.
> 
> In those two cases -- NIO and APR -- is the NIO case any less 
> catastrophic?

Sometimes yes, sometimes no.

> I've been arguing that we should stop the JVM from crashing, but
> locking-up the NIO connector and rendering the server inert is
> roughly the same outcome. Is there actually any utility in stopping
> the JVM from coming down? I guess you could get a thread dump from
> an otherwise hung Tomcat, but probably not much else.

With a lock-up of infinite loop you can find out where you are (as
with APR) but the how you got there is what you really want.

For an error that would otherwise result in a socket closure then a
JVM crash is bad. On the other hand, a JVM crash is a very strong
motivation to fix an issue.

>> You may have noticed that I have slowly been adding debug code to
>> the low level connector code, primarily in the Endpoints and the
>> Protocol implementations. All of this debug code has proven
>> useful in tracking down the type of bug that triggers a crash
>> with APR.
>> 
>> Additional validity checks in the native code provide for a more 
>> graceful failure mode but offer little other benefit as the
>> useful information is more focused on how the current state was
>> achieved rather than what the current state is.
> 
> Agreed. Remember, I was just trying to stop crashes. We have a load
> of crash reports in various versions of tcnative and if the
> problem actually isn't in tcnative, it would be nice to get those
> reported against the components that actually have bugs.

Most crashes in APR/native are going to be bugs in AprEndpoint or less
likely in the APR processor and protocol.

>> I'm -0 on adding additional checks to the native code.
> 
> Noted.
> 
>> I can think of several things that would be more useful: - Better
>> Javadoc for the native methods. I can think of a number of times
>> where better docs would have saved me a fair amount of time 
>> debugging unexpected behaviour.
> 
> Do you mean more documentation about how the method works, or even
> just a simple description of what happens *at all*?

Some examples might help:
- documenting the return codes for the non-blocking reads would have
highlighted the problem that required the 1.1.28 release
- documenting the return codes for removing sockets from the poller
would have highlighted the problem that is going to require the 1.1.29
release
- documenting some of the constraints around using SSL would have
saved me some time when getting SSL and WebSocket working

As I have worked more with APR/native I have discovered various things
that it would have been a great help to have in the docs. I've tried
to remember to document them when I have found them but I've probably
missed some.

>> - Something to turn an APR error response code into meaningful
>> test.
> 
> Can you explain this in a little more detail? For example, an APR
> error code might be "bad socket", but as you say, the circumstances
> of the bug are more important than the error code. How could such a
> code be turned into a test case?

In the last few days I have come across the following error codes:
-70014
-730053

I can look them up to figure what they mean but it would save some
time if the error report included a text version.

>> - Refactoring the connectors so all socket access goes through
>> the SocketWrapper so there is a much smaller set of code to
>> validate.
> 
> I'm guessing you are tackling this task slowly over time.

I am moving slowly in this direction. My ultimate aim is to have the
connector type specific code only in the Endpoint and the
SocketWrapper. No idea if that is possible. It is a longer term goal
for Tomcat 9+ at this point.

At the very least whenever I add functionality to the connectors (e.g.
non-blocking IO) I do enough refactoring that I only have to add the
new stuff once.

Mark


---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org

Re: Recent tcnative null-dereference with 8.0.0-RC3 and 7.0.45 [tcnative-1.dll+0x7e23]

Posted by Christopher Schultz <ch...@christopherschultz.net>.

Mark,

On 10/3/13 9:18 AM, Mark Thomas wrote:
> On 03/10/2013 13:51, Christopher Schultz wrote:
>> Sebb,
>>
>> On 10/3/13 8:06 AM, sebb wrote:
>>> The problem is that bugs that reveal themselves as JVM crashes
>>> are much harder to debug.
>>
>> +1
>>
>> This is exactly the point I was arguing. When we get a JVM crash
>> report, the stack dump could be completely different depending upon
>> which architecture, kernel, compiler, optimization flags, etc. that
>> were used when compiling and/or running the library. Converting
>> SIGSEGV into an exception gives us a lot of freedom to publish tons
>> of useful information when reporting errors to the user.
>>
>> I'd rather get a report from a user that says something like "here
>> is my stack trace and error message, complete with name of variable
>> that was NULL and line of source code" rather than "here's my JVM
>> crash report, sorry I didn't get a core file, I'm having trouble
>> reproducing the error" (which is essentially what all
>> currently-open JVM-crash error reports look like in BZ for
>> tcnative).
> 
> Having been responsible for introducing and fixing a number of these
> issues I disagree. It is usually fairly easy to tell what the problem
> is from the crash report (double close on a socket, invalid socket in
> the poller, etc). What is far more difficult is figuring out how
> things got into the known invalid state in the first place. No amount
> of debug information at the point of the crash is going to help with that.

Agreed. Given that you have been researching these issues (and fixing
them -- thanks!) you have a unique perspective: even though it's easy
for you to determine the cause of a crash when you can reproduce it, how
about getting a crash report with little other context? Would it help
you to have more information or is the crash report sufficient?

> A hard to reproduce bug in APR that triggers a crash is no more or
> less difficult to work with than a hard to reproduce bug in NIO that
> triggers an unexpected socket close.

In those two cases -- NIO and APR -- is the NIO case any less
catastrophic? I've been arguing that we should stop the JVM from
crashing, but locking-up the NIO connector and rendering the server
inert is roughly the same outcome. Is there actually any utility in
stopping the JVM from coming down? I guess you could get a thread dump
from an otherwise hung Tomcat, but probably not much else.

> You may have noticed that I have slowly been adding debug code to the
> low level connector code, primarily in the Endpoints and the Protocol
> implementations. All of this debug code has proven useful in tracking
> down the type of bug that triggers a crash with APR.
> 
> Additional validity checks in the native code provide for a more
> graceful failure mode but offer little other benefit as the useful
> information is more focused on how the current state was achieved
> rather than what the current state is.

Agreed. Remember, I was just trying to stop crashes. We have a load of
crash reports in various versions of tcnative and if the problem
actually isn't in tcnative, it would be nice to get those reported
against the components that actually have bugs.

> I'm -0 on adding additional checks to the native code.

Noted.

> I can think of several things that would be more useful:
> - Better Javadoc for the native methods. I can think of a number of
> times where better docs would have saved me a fair amount of time
> debugging unexpected behaviour.

Do you mean more documentation about how the method works, or even just
a simple description of what happens *at all*?

> - Something to turn an APR error response code into meaningful test.

Can you explain this in a little more detail? For example, an APR error
code might be "bad socket", but as you say, the circumstances of the bug
are more important than the error code. How could such a code be turned
into a test case?

> - Refactoring the connectors so all socket access goes through the
>   SocketWrapper so there is a much smaller set of code to validate.

I'm guessing you are tackling this task slowly over time.

-chris

Re: Recent tcnative null-dereference with 8.0.0-RC3 and 7.0.45 [tcnative-1.dll+0x7e23]

Posted by Mark Thomas <ma...@apache.org>.

On 03/10/2013 13:51, Christopher Schultz wrote:
> Sebb,
> 
> On 10/3/13 8:06 AM, sebb wrote:
>> The problem is that bugs that reveal themselves as JVM crashes
>> are much harder to debug.
> 
> +1
> 
> This is exactly the point I was arguing. When we get a JVM crash
> report, the stack dump could be completely different depending upon
> which architecture, kernel, compiler, optimization flags, etc. that
> were used when compiling and/or running the library. Converting
> SIGSEGV into an exception gives us a lot of freedom to publish tons
> of useful information when reporting errors to the user.
> 
> I'd rather get a report from a user that says something like "here
> is my stack trace and error message, complete with name of variable
> that was NULL and line of source code" rather than "here's my JVM
> crash report, sorry I didn't get a core file, I'm having trouble
> reproducing the error" (which is essentially what all
> currently-open JVM-crash error reports look like in BZ for
> tcnative).

Having been responsible for introducing and fixing a number of these
issues I disagree. It is usually fairly easy to tell what the problem
is from the crash report (double close on a socket, invalid socket in
the poller, etc). What is far more difficult is figuring out how
things got into the known invalid state in the first place. No amount
of debug information at the point of the crash is going to help with that.

A hard to reproduce bug in APR that triggers a crash is no more or
less difficult to work with than a hard to reproduce bug in NIO that
triggers an unexpected socket close.

You may have noticed that I have slowly been adding debug code to the
low level connector code, primarily in the Endpoints and the Protocol
implementations. All of this debug code has proven useful in tracking
down the type of bug that triggers a crash with APR.

Additional validity checks in the native code provide for a more
graceful failure mode but offer little other benefit as the useful
information is more focussed on how the current state was achieved
rather than what the current state is.

In all of the recent APR issues I have been working on, by far the
most useful tool has been the reproducible test case. Unfortunately, I
know of no way to make those easier to generate.

>>> We can add compile time '#if defined(MAINTAINER_MODE) ...
>>> #endif' checks for easier debugging at development,
>> 
>> Any crashes that occur in a released version of TC are likely to
>> be fairly rare, otherwise they would be detected in testing. So
>> the MAINTAINER_MODE is not likely to be much use after the
>> initial shakedown period. Unless the debugging checks are
>> expensive, why not leave them in?
> 
> Obviously given my previous comments, I agree with this. If the 
> MAINTAINER had an exhaustive set of unit tests, there would be no
> error reports, right? ;) I argue that MAINTAINER_MODE is exactly
> the opposite of what you want: you want real users to have this
> information precisely because they are *not* programmers (at least
> not C programmers).

I'm -0 on adding additional checks to the native code.

I can think of several things that would be more useful:
- Better Javadoc for the native methods. I can think of a number of
times where better docs would have saved me a fair amount of time
debugging unexpected behaviour.
- Something to turn an APR error response code into meaningful test.
- Refactoring the connectors so all socket access goes through the
SocketWrapper so there is a much smaller set of code to validate.

Mark

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org

Re: Recent tcnative null-dereference with 8.0.0-RC3 and 7.0.45 [tcnative-1.dll+0x7e23]

Posted by Christopher Schultz <ch...@christopherschultz.net>.

Sebb,

On 10/3/13 8:06 AM, sebb wrote:
> The problem is that bugs that reveal themselves as JVM crashes are
> much harder to debug.

+1

This is exactly the point I was arguing. When we get a JVM crash report,
the stack dump could be completely different depending upon which
architecture, kernel, compiler, optimization flags, etc. that were used
when compiling and/or running the library. Converting SIGSEGV into an
exception gives us a lot of freedom to publish tons of useful
information when reporting errors to the user.

I'd rather get a report from a user that says something like "here is my
stack trace and error message, complete with name of variable that was
NULL and line of source code" rather than "here's my JVM crash report,
sorry I didn't get a core file, I'm having trouble reproducing the
error" (which is essentially what all currently-open JVM-crash error
reports look like in BZ for tcnative).

>> We can add compile time '#if defined(MAINTAINER_MODE) ... #endif' checks
>> for easier debugging at development,
> 
> Any crashes that occur in a released version of TC are likely to be
> fairly rare, otherwise they would be detected in testing.
> So the MAINTAINER_MODE is not likely to be much use after the initial
> shakedown period.
> Unless the debugging checks are expensive, why not leave them in?

Obviously given my previous comments, I agree with this. If the
MAINTAINER had an exhaustive set of unit tests, there would be no error
reports, right? ;) I argue that MAINTAINER_MODE is exactly the opposite
of what you want: you want real users to have this information precisely
because they are *not* programmers (at least not C programmers).

-chris

Re: Recent tcnative null-dereference with 8.0.0-RC3 and 7.0.45 [tcnative-1.dll+0x7e23]

Posted by sebb <se...@gmail.com>.

On 2 October 2013 06:18, Mladen Turk <mt...@apache.org> wrote:
> On 10/01/2013 07:32 PM, sebb wrote:
>>
>>
>> If a Java application succeeds in crashing the JVM, then IMO the JVM
>> has a bug. I believe that all native code should strive to behave the
>> same way.
>>
>
> This is conceptual difference.

Partly.

> Most of those checks are done again inside Java.
> However inside JVM the Java API hides its native methods and
> ensures params are validated.
> Our API is Servlet spec and our VM is Tomcat.

AFAIK tcnative does not hide its native methods.
If it did, I agree it should be possible to guarantee correct
parameters, regardless of application code behaviour.

> All the invalid data should be checked in java part which can be
> invalid as part of normal operation. Our native code already checks
> for some invalid data which can be invalid in such situations.
> OTOH invalid data passed to native caused by bug is just that, a bug.
> So fix the bug and you won't need the check.

The problem is that bugs that reveal themselves as JVM crashes are
much harder to debug.

> We can add compile time '#if defined(MAINTAINER_MODE) ... #endif' checks
> for easier debugging at development,

Any crashes that occur in a released version of TC are likely to be
fairly rare, otherwise they would be detected in testing.
So the MAINTAINER_MODE is not likely to be much use after the initial
shakedown period.
Unless the debugging checks are expensive, why not leave them in?

> but all the checks inside native method
> can be equally well coded before the actual JNI call and since our API is
> servlet
> and no use code can pass beyond that.
>

Tomcat code that calls tcnative directly needs to validate its
parameters; the more places that TC code has direct access to
tcnative, the more likely that some checks will be overlooked. And
more effort will be required to retro-fit any extra checks that are
found necessary.

Also AIUI tcnative code may be used separately from Tomcat.

>
>
> Regards
> --
> ^TM
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
> For additional commands, e-mail: dev-help@tomcat.apache.org
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org

Re: Recent tcnative null-dereference with 8.0.0-RC3 and 7.0.45 [tcnative-1.dll+0x7e23]

Posted by Mladen Turk <mt...@apache.org>.

On 10/01/2013 07:32 PM, sebb wrote:
>
> If a Java application succeeds in crashing the JVM, then IMO the JVM
> has a bug. I believe that all native code should strive to behave the
> same way.
>

This is conceptual difference.

Most of those checks are done again inside Java.
However inside JVM the Java API hides its native methods and
ensures params are validated. Our API is Servlet spec and our VM is Tomcat.

All the invalid data should be checked in java part which can be
invalid as part of normal operation. Our native code already checks
for some invalid data which can be invalid in such situations.
OTOH invalid data passed to native caused by bug is just that, a bug.
So fix the bug and you won't need the check.

We can add compile time '#if defined(MAINTAINER_MODE) ... #endif' checks
for easier debugging at development, but all the checks inside native method
can be equally well coded before the actual JNI call and since our API is servlet
and no use code can pass beyond that.

Regards
-- 
^TM

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org

Re: Recent tcnative null-dereference with 8.0.0-RC3 and 7.0.45 [tcnative-1.dll+0x7e23]

Posted by sebb <se...@gmail.com>.

On 1 October 2013 15:47, Christopher Schultz
<ch...@christopherschultz.net> wrote:
> Mladen,
>
> On 10/1/13 10:39 AM, Mladen Turk wrote:
>> On 10/01/2013 04:15 PM, Christopher Schultz wrote:
>>> Mladen and Rainer,
>>>>>
>>>>> -1. You are just hiding the reason for crash.
>>>
>>> I disagree: the reason for the crash can still be reported. It just
>>> won't be reported with a JVM crash: instead, there will be an exception.
>>
>> I disagree on your disagreement :)
>
> ;)
>
> Just to be sure: I'm not arguing against fixing the Java code, or
> providing checks in the Java code to avoid SIGSEGVs. I'm just saying
> that native code should be as safe as reasonably possible. NULL-checks
> are cheap, too, so I don't think there's any reason to avoid them in
> native code.
>
>> This can be easily done in Java with much cheaper computing cost.
>
> Agreed. Feel free to avoid native calls if you know the data is bad. But
> we can't force people to use specific versions of Tomcat and tcnative
> together (though obviously, newer versions of Tomcat can refuse to load
> older versions of tcnative). Someone pointed out in Bugzilla (though it
> mostly fell on deaf ears) that tcnative isn't used exclusively for
> Tomcat: some folks are using it as a convenient library to access APR
> and other native capabilities in a Java wrapper.
>
>> If you suspect the data fed to native call can be faulty, well check it
>> and throw java exception instead calling native, and distinguishing
>> between valid
>> and error return values from native and then still have a code which will
>> pass/throw.
>
> I was suggesting throwing an exception directly from native code, not
> trying to return a status code and then throwing from Java-land.
>
>> I know it requires a different thinking then average library,
>> but it's not an average library. It's wrapper for APR and APR expects you
>> provide valid data.
>>
>> Even checking in native won't solve crashes. You can fed a long (pointer)
>> which is in zombie state (closed but not null) and you'll have a memory
>> location which will pass null check but can still crash or even corrupt
>> other memory location if it was reused by another alloc.
>
> Sure, but a NULL-check (whether in Java or C) isn't going to catch that
> either way, and nothing we've been discussing here will have any bearing
> on that situation: it will still fail, it will likely still crash or
> behave strangely and ...
>
>> Much harder to track down.
>
> ... it will be hard to track down. Adding NULL-checks will not
> complicate any of that. It will not mask any of that. It will make it no
> more difficult to track-down or debug. It will only prevent the JVM
> going down when a pointer ends up being NULL, which is the only reason
> I'm suggesting it.
>
> Given the two -1s and lack of any +1s on this suggestion, I won't be
> committing anything, but I still think it's a worthwhile endeavor
> because it's so easy to do and protects us from such disastrous results.

+1 (non-binding!)

I agree with Mladen that relying on tcnative to do all the validation
is wrong; the Java code should be written so as to only pass valid
parameters. But mistakes happen, and not all applications using
tcnative are written by Tomcat developers who should be familiar with
the calling conventions.

The JVM throws an exception if a Java application uses an invalid
array index or tries to read past the end of an iterator. That does
not mean that Java applications should use these exceptions as part of
standard coding; of course the application should try to use valid
parameters. The point is that such exceptions make debugging much
easier. If Java crashed whenever an invalid parameter was provided, it
would not be very popular with developers.

If a Java application succeeds in crashing the JVM, then IMO the JVM
has a bug. I believe that all native code should strive to behave the
same way.

> -chris
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org

Re: Recent tcnative null-dereference with 8.0.0-RC3 and 7.0.45 [tcnative-1.dll+0x7e23]

Posted by Christopher Schultz <ch...@christopherschultz.net>.

Mladen,

On 10/1/13 10:39 AM, Mladen Turk wrote:
> On 10/01/2013 04:15 PM, Christopher Schultz wrote:
>> Mladen and Rainer,
>>>>
>>>> -1. You are just hiding the reason for crash.
>>
>> I disagree: the reason for the crash can still be reported. It just
>> won't be reported with a JVM crash: instead, there will be an exception.
> 
> I disagree on your disagreement :)

;)

Just to be sure: I'm not arguing against fixing the Java code, or
providing checks in the Java code to avoid SIGSEGVs. I'm just saying
that native code should be as safe as reasonably possible. NULL-checks
are cheap, too, so I don't think there's any reason to avoid them in
native code.

> This can be easily done in Java with much cheaper computing cost.

Agreed. Feel free to avoid native calls if you know the data is bad. But
we can't force people to use specific versions of Tomcat and tcnative
together (though obviously, newer versions of Tomcat can refuse to load
older versions of tcnative). Someone pointed out in Bugzilla (though it
mostly fell on deaf ears) that tcnative isn't used exclusively for
Tomcat: some folks are using it as a convenient library to access APR
and other native capabilities in a Java wrapper.

> If you suspect the data fed to native call can be faulty, well check it
> and throw java exception instead calling native, and distinguishing
> between valid
> and error return values from native and then still have a code which will
> pass/throw.

I was suggesting throwing an exception directly from native code, not
trying to return a status code and then throwing from Java-land.

> I know it requires a different thinking then average library,
> but it's not an average library. It's wrapper for APR and APR expects you
> provide valid data.
> 
> Even checking in native won't solve crashes. You can fed a long (pointer)
> which is in zombie state (closed but not null) and you'll have a memory
> location which will pass null check but can still crash or even corrupt
> other memory location if it was reused by another alloc.

Sure, but a NULL-check (whether in Java or C) isn't going to catch that
either way, and nothing we've been discussing here will have any bearing
on that situation: it will still fail, it will likely still crash or
behave strangely and ...

> Much harder to track down.

... it will be hard to track down. Adding NULL-checks will not
complicate any of that. It will not mask any of that. It will make it no
more difficult to track-down or debug. It will only prevent the JVM
going down when a pointer ends up being NULL, which is the only reason
I'm suggesting it.

Given the two -1s and lack of any +1s on this suggestion, I won't be
committing anything, but I still think it's a worthwhile endeavor
because it's so easy to do and protects us from such disastrous results.

-chris

Re: Recent tcnative null-dereference with 8.0.0-RC3 and 7.0.45 [tcnative-1.dll+0x7e23]

Posted by Mladen Turk <mt...@apache.org>.

On 10/01/2013 04:15 PM, Christopher Schultz wrote:
> Mladen and Rainer,
>>>
>>> -1. You are just hiding the reason for crash.
>
> I disagree: the reason for the crash can still be reported. It just
> won't be reported with a JVM crash: instead, there will be an exception.

I disagree on your disagreement :)

This can be easily done in Java with much cheaper computing cost.
If you suspect the data fed to native call can be faulty, well check it
and throw java exception instead calling native, and distinguishing between valid
and error return values from native and then still have a code which will
pass/throw. I know it requires a different thinking then average library,
but it's not an average library. It's wrapper for APR and APR expects you
provide valid data.

Even checking in native won't solve crashes. You can fed a long (pointer)
which is in zombie state (closed but not null) and you'll have a memory
location which will pass null check but can still crash or even corrupt
other memory location if it was reused by another alloc.
Much harder to track down.

Regards
-- 
^TM

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org

Re: Recent tcnative null-dereference with 8.0.0-RC3 and 7.0.45 [tcnative-1.dll+0x7e23]

Posted by Christopher Schultz <ch...@christopherschultz.net>.

Mladen and Rainer,

On 10/1/13 2:59 AM, Rainer Jung wrote:
> On 01.10.2013 06:53, Mladen Turk wrote:
>> On 09/30/2013 08:50 PM, Christopher Schultz wrote:
>>> Mladen,
>>>
>>> Unless there is significant objection,
>>
>> Yes there is.
>> The transition from  Java to JNI costs 10 times then
>> then a simple 'if (socket == OL)' in java

Okay, so we can add those in Java as well. I still think that tcnative
should be coded defensively.

>>> I'd like to add such NULL checks
>>> to limit JVM crashes in cases where the Java code isn't absolutely
>>> perfect (with sincere apologies to all who manage the socket-management
>>> code in Tomcat).
>>>
>>
>> -1. You are just hiding the reason for crash.

I disagree: the reason for the crash can still be reported. It just
won't be reported with a JVM crash: instead, there will be an exception.
We can even rig it to report the line number in the native method if you
want. Wouldn't that be better than bringing-down the whole JVM? I'd
rather have a complete lock-up of the JVM with a handful of exceptions
explaining the problem than having to examine a core file to figure out
what happened. I'm sure that's true of most of our users, too.

> I hate these crashes too, but I'm agreeing with Mladen here, especially
> if the native call does not return an error on such unexpected input
> data and the error is not handled in Java land (logging, maybe more).

I wasn't suggesting that the Java component take no action, here: I was
just suggesting that a exception (even if fatal) is better than a JVM
crash. I don't see why both components can't check their own stuff.

> In cases where the crashes indicate that there was a data corruption
> before, like a synchronization issue, we need to get aware of that.
> Otherwise we might serve invalid results.
> 
> It would be nice though, if there were a simple validity check in Java
> land, so that we could e.g. log an error if we encounter invalid data
> and handle the error accordingly.

We can add those, too. I'm trying to understand which data are intended
to be native pointers so we can do that kind of thing. I'm happy to add
zero-checks in Java as well as NULL checks in the native code. I just
didn't want to go around changing things in Java-land if Mark is doing
the same: his changes will likely be more meaningful than anything I do.

-chris

Re: Recent tcnative null-dereference with 8.0.0-RC3 and 7.0.45 [tcnative-1.dll+0x7e23]

Posted by Rainer Jung <ra...@kippdata.de>.

On 01.10.2013 06:53, Mladen Turk wrote:
> On 09/30/2013 08:50 PM, Christopher Schultz wrote:
>> Mladen,
>>
>> Unless there is significant objection,
> 
> Yes there is.
> The transition from  Java to JNI costs 10 times then
> then a simple 'if (socket == OL)' in java
> 
>> I'd like to add such NULL checks
>> to limit JVM crashes in cases where the Java code isn't absolutely
>> perfect (with sincere apologies to all who manage the socket-management
>> code in Tomcat).
>>
> 
> -1. You are just hiding the reason for crash.

I hate these crashes too, but I'm agreeing with Mladen here, especially
if the native call does not return an error on such unexpected input
data and the error is not handled in Java land (logging, maybe more).

In cases where the crashes indicate that there was a data corruption
before, like a synchronization issue, we need to get aware of that.
Otherwise we might serve invalid results.

It would be nice though, if there were a simple validity check in Java
land, so that we could e.g. log an error if we encounter invalid data
and handle the error accordingly.

Regards,

Rainer

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org