You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@tomcat.apache.org by GitBox <gi...@apache.org> on 2020/02/18 14:15:56 UTC

[GitHub] [tomcat] mads1980 opened a new pull request #246: OpenSSLEngine improvements to guard against multiple shutdown() calls triggered by construction exception and finalize() later

mads1980 opened a new pull request #246: OpenSSLEngine improvements to guard against multiple shutdown() calls triggered by construction exception and finalize() later
URL: https://github.com/apache/tomcat/pull/246
 
 
   We started getting random SIGSEGV JVM crashes on our servers after upgrading to OpenJDK 13.0.2
   
   The crashes are always in tcnative code, on the JVM Finalizer daemon thread, at OpenSSLEngine.shutdown(). We're on Tomcat 9.0.31, tcnative 1.2.23, OpenSSL 1.1.1d and APR 1.7.0 using NIO connectors. This combination works fine with Oracle JDK 8u202 as well as OpenJDK 8u242, but crashes began immediately after upgrading to OpenJDK 13.0.2
   
   Since the failure is related to tcnative code and finalization, we guess the new JVM is probably invoking the finalizers sooner, so this is exposing a bug in OpenSSLEngine which was already present, but was not being triggered on previous JVMs because of different timing.
   
   The crash occurs when SSL.freeSSL() is invoked, so this probably means either the pointer is zero, or a double SSL.freeSSL() is invoked. This can happen if there are uncaught exceptions within the OpenSSLEngine constructor, as finalization can execute concurrently with object construction.
   
   We adapted the solution found in OpenSSLContext.destroy() which evidently was causing similar issues. The current volatile "destroyed" checks found in OpenSSLEngine seem to be insufficient, so we replaced it with an AtomicBoolean. Also, we check whether "networkBIO" and
   "ssl" are zero in shutdown() to avoid attempting to free a null pointer.
   
   We found a discussing regarding these issues in the following thread:
   http://mail-archives.apache.org/mod_mbox/tomcat-dev/201901.mbox/%3C44ce40f1-36a1-df9b-b648-8635d9d84eb1@kippdata.de%3E
   
   It seems the improvements made it to OpenSSLContext but not OpenSSLEngine. This pull request should fix remaining issues.
   
   See attached sample HotSpot error dumps.
   
   [hs_err_pid28766.log](https://github.com/apache/tomcat/files/4219652/hs_err_pid28766.log)
   [hs_err_pid32426.log](https://github.com/apache/tomcat/files/4219653/hs_err_pid32426.log)
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[GitHub] [tomcat] rmaucher commented on a change in pull request #246: OpenSSLEngine improvements to guard against multiple shutdown() calls triggered by construction exception and finalize() later

Posted by GitBox <gi...@apache.org>.
rmaucher commented on a change in pull request #246: OpenSSLEngine improvements to guard against multiple shutdown() calls triggered by construction exception and finalize() later
URL: https://github.com/apache/tomcat/pull/246#discussion_r381181675
 
 

 ##########
 File path: java/org/apache/tomcat/util/net/openssl/OpenSSLEngine.java
 ##########
 @@ -219,10 +218,14 @@ public String getNegotiatedProtocol() {
      * Destroys this engine.
      */
     public synchronized void shutdown() {
-        if (!destroyed) {
-            destroyed = true;
-            SSL.freeBIO(networkBIO);
-            SSL.freeSSL(ssl);
+        // Guard against multiple shutdown() calls triggered by construction exception and finalize() later
+        if (destroyed.compareAndSet(false, true)) {
+            if (networkBIO != 0) {
+                SSL.freeBIO(networkBIO);
+            }
+            if (ssl != 0) {
+                SSL.freeSSL(ssl);
+            }
 
 Review comment:
   That's the best part of the change since it could take care of mysterious errors.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


Re: [GitHub] [tomcat] mads1980 commented on issue #246: OpenSSLEngine improvements to guard against multiple shutdown() calls triggered by construction exception and finalize() later

Posted by Rémy Maucherat <re...@apache.org>.
On Wed, Feb 19, 2020 at 4:26 PM GitBox <gi...@apache.org> wrote:

> mads1980 commented on issue #246: OpenSSLEngine improvements to guard
> against multiple shutdown() calls triggered by construction exception and
> finalize() later
> URL: https://github.com/apache/tomcat/pull/246#issuecomment-588286325
>
>
>    > Well, personally I don't approve until there's an explanation.
>    >
>    > * I still don't see what the atomic boolean brings
>
>    OpenSSLContext has a very similar protection (this is were I got the
> idea for this solution), but it uses AtomicInteger instead of AtomicBoolean
> (not sure why, since AtomicBoolean is more semantically "readable", and
> both implementations internally use a volatile int for storage, so memory
> usage is the same).
>
>    >  all relevant code is already synchronized and the check on the
> boolean flag should be identical.
>
>    The fact that shutdown() is synchronized does not prevent it from
> executing concurrently with a failed constructor. There is no
> synchronization within the constructor, while shutdown() is being
> concurrently execute as a result from finalize() being invoked by the JVM
> GC concurrently with the failed constructor.
>
>    > I suppose the actual "fix" is the check if (networkBIO != 0).
>
>    Actually the HotStop error log shows the stack trace on freeSSL() but I
> guess that it does not hurt protecting both networkBIO and ssl from null
> pointer frees.
>
>    > * "This can happen if there are uncaught exceptions within the
> OpenSSLEngine constructor, as finalization can execute concurrently with
> object construction": ok, so what is the uncaught exception ?
>
>    This is not being captured by the HotSpot error logs, since a SIGSEGV
> crash occurs before any useful logging. However, it must be some kind of
> exception arising from the JNI calls within the OpenSSLEngine constructor.
>

GitHub has decided I cannot comment, so posting it here.

I still cannot reproduce this with the GC config (which is missing
-XX:+UnlockExperimentalVMOptions). As with
https://bz.apache.org/bugzilla/show_bug.cgi?id=63802 I will not add weird
fixes for issues that don't exist.

So I don't understand what difference the AtomicBoolean makes, what
"matters" to me here is the != 0 check in shutdown. The AtomicInteger in
OpenSSLContext is not to be used for comparison and comes almost straight
from some Netty code (it was an AtomicIntegerFieldUpdater which got changed
for some reason). I don't quite see why it's needed either over a simple
boolean since it's all synced, but there it's really not used at all so it
stays.

Rémy


>
> ----------------------------------------------------------------
> This is an automated message from the Apache Git Service.
> To respond to the message, please log on to GitHub and use the
> URL above to go to the specific comment.
>
> For queries about this service, please contact Infrastructure at:
> users@infra.apache.org
>
>
> With regards,
> Apache Git Services
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
> For additional commands, e-mail: dev-help@tomcat.apache.org
>
>

[GitHub] [tomcat] mads1980 commented on issue #246: OpenSSLEngine improvements to guard against multiple shutdown() calls triggered by construction exception and finalize() later

Posted by GitBox <gi...@apache.org>.
mads1980 commented on issue #246: OpenSSLEngine improvements to guard against multiple shutdown() calls triggered by construction exception and finalize() later
URL: https://github.com/apache/tomcat/pull/246#issuecomment-588286325
 
 
   > Well, personally I don't approve until there's an explanation.
   > 
   > * I still don't see what the atomic boolean brings
   
   OpenSSLContext has a very similar protection (this is were I got the idea for this solution), but it uses AtomicInteger instead of AtomicBoolean (not sure why, since AtomicBoolean is more semantically "readable", and both implementations internally use a volatile int for storage, so memory usage is the same).
   
   >  all relevant code is already synchronized and the check on the boolean flag should be identical.
   
   The fact that shutdown() is synchronized does not prevent it from executing concurrently with a failed constructor. There is no synchronization within the constructor, while shutdown() is being concurrently execute as a result from finalize() being invoked by the JVM GC concurrently with the failed constructor.
   
   > I suppose the actual "fix" is the check if (networkBIO != 0).
   
   Actually the HotStop error log shows the stack trace on freeSSL() but I guess that it does not hurt protecting both networkBIO and ssl from null pointer frees.
   
   > * "This can happen if there are uncaught exceptions within the OpenSSLEngine constructor, as finalization can execute concurrently with object construction": ok, so what is the uncaught exception ?
   
   This is not being captured by the HotSpot error logs, since a SIGSEGV crash occurs before any useful logging. However, it must be some kind of exception arising from the JNI calls within the OpenSSLEngine constructor.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[GitHub] [tomcat] mads1980 commented on issue #246: OpenSSLEngine improvements to guard against multiple shutdown() calls triggered by construction exception and finalize() later

Posted by GitBox <gi...@apache.org>.
mads1980 commented on issue #246: OpenSSLEngine improvements to guard against multiple shutdown() calls triggered by construction exception and finalize() later
URL: https://github.com/apache/tomcat/pull/246#issuecomment-587507009
 
 
   Hi Remy, thank you for your feedback. Object.finalize() has been 
   deprecated since Java 9 because of its weird subtleties. Still, even if 
   the JVM behaviour is broken, shouldn't we still fix this? The solution 
   seems simple enough, especially if this would avoid JVM crashes.
   *
   Manuel Dominguez Sarmiento*
   
   On 18/02/2020 11:31, Rémy Maucherat wrote:
   >
   > It's quite normal as the problem was evident and legitimate with 
   > OpenSSLContext, while with OpenSSLEngine I consider the JVM is broken. 
   > I'd like to remind that the engine is referenced directly from the 
   > channel, and then the JVM seems to happily discard everything in a 
   > random order. So: whatever ...
   >
   > —
   > You are receiving this because you authored the thread.
   > Reply to this email directly, view it on GitHub 
   > <https://github.com/apache/tomcat/pull/246?email_source=notifications&email_token=ABDXTFDYUKPSC62S7QRWL43RDPWNBA5CNFSM4KXFMPI2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEMCFW4A#issuecomment-587488112>, 
   > or unsubscribe 
   > <https://github.com/notifications/unsubscribe-auth/ABDXTFB3FH4YLXDFOW2KRN3RDPWNBANCNFSM4KXFMPIQ>.
   >
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[GitHub] [tomcat] mads1980 commented on issue #246: OpenSSLEngine improvements to guard against multiple shutdown() calls triggered by construction exception and finalize() later

Posted by GitBox <gi...@apache.org>.
mads1980 commented on issue #246: OpenSSLEngine improvements to guard against multiple shutdown() calls triggered by construction exception and finalize() later
URL: https://github.com/apache/tomcat/pull/246#issuecomment-588285958
 
 
   > Personally I can concurrently shutdown Tomcat on JDK 14 while running ab.
   
   Since finalization() and garbage collection are involved, it is expected that different JDKs with different GC configurations will produce different results. We are using G1 GC with the following options:
   
   -XX:+UseG1GC -XX:MaxGCPauseMillis=200 -XX:GCTimeRatio=4 -XX:G1NewSizePercent=30 -XX:G1MaxNewSizePercent=60 -XX:InitiatingHeapOccupancyPercent=0 -XX:G1ReservePercent=10 -XX:G1HeapRegionSize=32M
   
   Even so, this same G1 configuration produced no errors on the following JVMs:
   - Oracle JDK 8u202
   - AdoptOpenJDK 8u242 (Upstream build contributed by Redhat)
   
   After upgrading to OpenJDK 13.0.2 (Oracle build) we started seeing random crashes from some of our servers. Not on startup, not on shutdown, but randomnly during otherwise normal operation.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[GitHub] [tomcat] rmaucher commented on issue #246: OpenSSLEngine improvements to guard against multiple shutdown() calls triggered by construction exception and finalize() later

Posted by GitBox <gi...@apache.org>.
rmaucher commented on issue #246: OpenSSLEngine improvements to guard against multiple shutdown() calls triggered by construction exception and finalize() later
URL: https://github.com/apache/tomcat/pull/246#issuecomment-588121408
 
 
   Well, personally I don't approve until there's an explanation.
   - I still don't see what the atomic boolean brings, all relevant code is already synchronized and the check on the boolean flag should be identical. I suppose the actual "fix" is the check if (networkBIO != 0).
   - "This can happen if there are uncaught exceptions within the OpenSSLEngine constructor, as finalization can execute concurrently with object construction": ok, so what is the uncaught exception ?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[GitHub] [tomcat] mads1980 commented on a change in pull request #246: OpenSSLEngine improvements to guard against multiple shutdown() calls triggered by construction exception and finalize() later

Posted by GitBox <gi...@apache.org>.
mads1980 commented on a change in pull request #246: OpenSSLEngine improvements to guard against multiple shutdown() calls triggered by construction exception and finalize() later
URL: https://github.com/apache/tomcat/pull/246#discussion_r381351602
 
 

 ##########
 File path: java/org/apache/tomcat/util/net/openssl/OpenSSLEngine.java
 ##########
 @@ -141,7 +142,7 @@
     private boolean handshakeFinished;
     private int currentHandshake;
     private boolean receivedShutdown;
-    private volatile boolean destroyed;
+    private final AtomicBoolean destroyed = new AtomicBoolean(false);
 
 Review comment:
   OpenSSLContext has a very similar protection (this is were I got the idea for this solution), but it uses AtomicInteger instead of AtomicBoolean (not sure why, since AtomicBoolean is more semantically "readable", and both implementations internally use a volatile int for storage, so memory usage is the same).
   
   Making both "networkBIO" and "ssl" final would not hurt, but it would likely not resolve the issue either. The problem arises if an exception is thrown within the constructor, and the JVM concurrently invokes finalize() on the same condition. Even if the variables are final, under a failed constructor scenario, they would not be initialized (so both would be zero while the concurrent finalize() executes)

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[GitHub] [tomcat] markt-asf commented on issue #246: OpenSSLEngine improvements to guard against multiple shutdown() calls triggered by construction exception and finalize() later

Posted by GitBox <gi...@apache.org>.
markt-asf commented on issue #246: OpenSSLEngine improvements to guard against multiple shutdown() calls triggered by construction exception and finalize() later
URL: https://github.com/apache/tomcat/pull/246#issuecomment-592474362
 
 
   Rémy has committed a variation of this.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[GitHub] [tomcat] rmaucher commented on a change in pull request #246: OpenSSLEngine improvements to guard against multiple shutdown() calls triggered by construction exception and finalize() later

Posted by GitBox <gi...@apache.org>.
rmaucher commented on a change in pull request #246: OpenSSLEngine improvements to guard against multiple shutdown() calls triggered by construction exception and finalize() later
URL: https://github.com/apache/tomcat/pull/246#discussion_r381180570
 
 

 ##########
 File path: java/org/apache/tomcat/util/net/openssl/OpenSSLEngine.java
 ##########
 @@ -197,10 +198,8 @@
             throw new IllegalArgumentException(sm.getString("engine.noSSLContext"));
         }
         session = new OpenSSLSession();
-        destroyed = true;
         ssl = SSL.newSSL(sslCtx, !clientMode);
         networkBIO = SSL.makeNetworkBIO(ssl);
-        destroyed = false;
 
 Review comment:
   It's good to remove that, it is a hack. But I would tighten things up by making ssl and networkBIO final. Then keep the if (networkBIO != 0) and if (ssl != 0) checks in destroy but drop the atomic boolean change.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[GitHub] [tomcat] rmaucher commented on a change in pull request #246: OpenSSLEngine improvements to guard against multiple shutdown() calls triggered by construction exception and finalize() later

Posted by GitBox <gi...@apache.org>.
rmaucher commented on a change in pull request #246: OpenSSLEngine improvements to guard against multiple shutdown() calls triggered by construction exception and finalize() later
URL: https://github.com/apache/tomcat/pull/246#discussion_r381180960
 
 

 ##########
 File path: java/org/apache/tomcat/util/net/openssl/OpenSSLEngine.java
 ##########
 @@ -141,7 +142,7 @@
     private boolean handshakeFinished;
     private int currentHandshake;
     private boolean receivedShutdown;
-    private volatile boolean destroyed;
+    private final AtomicBoolean destroyed = new AtomicBoolean(false);
 
 Review comment:
   Try without the AtomicBoolean but ssl and networkBIO should be final.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[GitHub] [tomcat] mads1980 commented on issue #246: OpenSSLEngine improvements to guard against multiple shutdown() calls triggered by construction exception and finalize() later

Posted by GitBox <gi...@apache.org>.
mads1980 commented on issue #246: OpenSSLEngine improvements to guard against multiple shutdown() calls triggered by construction exception and finalize() later
URL: https://github.com/apache/tomcat/pull/246#issuecomment-592486250
 
 
   This is great, thanks for your attention & support.

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[GitHub] [tomcat] rmaucher commented on issue #246: OpenSSLEngine improvements to guard against multiple shutdown() calls triggered by construction exception and finalize() later

Posted by GitBox <gi...@apache.org>.
rmaucher commented on issue #246: OpenSSLEngine improvements to guard against multiple shutdown() calls triggered by construction exception and finalize() later
URL: https://github.com/apache/tomcat/pull/246#issuecomment-587488112
 
 
   It's quite normal as the problem was evident and legitimate with OpenSSLContext, while with OpenSSLEngine I consider the JVM is broken. I'd like to remind that the engine is referenced directly from the channel, and then the JVM seems to happily discard everything in a random order. So: whatever ...

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org


[GitHub] [tomcat] markt-asf closed pull request #246: OpenSSLEngine improvements to guard against multiple shutdown() calls triggered by construction exception and finalize() later

Posted by GitBox <gi...@apache.org>.
markt-asf closed pull request #246: OpenSSLEngine improvements to guard against multiple shutdown() calls triggered by construction exception and finalize() later
URL: https://github.com/apache/tomcat/pull/246
 
 
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org