Posted to reviews@spark.apache.org by "JoshRosen (via GitHub)" <gi...@apache.org> on 2023/09/21 23:11:20 UTC

[GitHub] [spark] JoshRosen commented on a diff in pull request #42685: [WIP][SPARK-44937][CORE] Add SSL/TLS support for RPC and Shuffle communications

JoshRosen commented on code in PR #42685:
URL: https://github.com/apache/spark/pull/42685#discussion_r1333678652


##########
common/network-common/src/main/java/org/apache/spark/network/client/TransportClient.java:
##########
@@ -325,7 +325,10 @@ public TransportResponseHandler getHandler() {
 
   @Override
   public void close() {
-    // close is a local operation and should finish with milliseconds; timeout just to be safe
+    // Mark the connection as timed out, so we do not return a connection that's being closed
+    // from the TransportClientFactory if closing takes some time (e.g. with SSL)
+    this.timedOut = true;

Review Comment:
   This change is necessary for SSL because SSL connections can take a long time to close, but I think it's also fine for it to apply to regular unencrypted connections (as you've done here): if a client is in the process of closing, marking it as timed out during the close avoids a race condition in which a pooled client is handed to a requester only to be closed immediately. 👍 
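   To illustrate the race this avoids, here's a minimal sketch using a simplified stand-in for `TransportClient`/`TransportClientFactory` (only the `timedOut` flag is from the actual diff; the class, the `isUsable()` check, and the pool behavior are hypothetical simplifications):
   
   ```java
   // Minimal sketch of the race-avoidance pattern: set the timed-out flag *before*
   // starting a potentially slow close, so a concurrent pool lookup never hands out
   // a client that is mid-close. Names here are illustrative, not Spark's real API.
   public class PooledClientSketch {
     static class Client {
       volatile boolean timedOut = false;
   
       // A factory validity check: the pool skips clients that report unusable.
       boolean isUsable() {
         return !timedOut;
       }
   
       void close() {
         // Flag first, then do the slow teardown (e.g. the SSL close_notify
         // exchange); any concurrent isUsable() call sees the flag immediately.
         timedOut = true;
         // ... potentially slow teardown would happen here ...
       }
     }
   
     public static void main(String[] args) {
       Client c = new Client();
       System.out.println("usable before close: " + c.isUsable());
       c.close();
       System.out.println("usable after close started: " + c.isUsable());
     }
   }
   ```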



##########
common/network-common/src/main/java/org/apache/spark/network/ssl/ReloadingX509TrustManager.java:
##########
@@ -0,0 +1,201 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.network.ssl;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import javax.net.ssl.TrustManager;
+import javax.net.ssl.TrustManagerFactory;
+import javax.net.ssl.X509TrustManager;
+import java.io.File;
+import java.io.FileInputStream;
+import java.io.IOException;
+import java.security.GeneralSecurityException;
+import java.security.KeyStore;
+import java.security.cert.CertificateException;
+import java.security.cert.X509Certificate;
+import java.util.concurrent.atomic.AtomicReference;
+
+/**
+ * A {@link TrustManager} implementation that reloads its configuration when
+ * the truststore file on disk changes.
+ * This implementation is based entirely on the
+ * org.apache.hadoop.security.ssl.ReloadingX509TrustManager class in the Apache Hadoop Encrypted
+ * Shuffle implementation.
+ *
+ * @see <a href="https://hadoop.apache.org/docs/current/hadoop-mapreduce-client/hadoop-mapreduce-client-core/EncryptedShuffle.html">Hadoop MapReduce Next Generation - Encrypted Shuffle</a>
+ */
+public final class ReloadingX509TrustManager

Review Comment:
   Bundling this as source avoids the need to add `hadoop-common` dependencies to `common/network-common`. Currently it looks like nothing under `common/` depends on Hadoop stuff. Based on https://github.com/c9n/hadoop/blob/master/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/security/ssl/ReloadingX509TrustManager.java it looks like this code hasn't changed in over a decade, so this is unlikely to pose maintainability risks.
   
   My read of https://infra.apache.org/licensing-howto.html is that bundling this source is permissible under
   
   - https://infra.apache.org/licensing-howto.html#alv2-dep
   - https://infra.apache.org/licensing-howto.html#bundle-asf-product
   
   I also don't see any `NOTICE` portions in `hadoop-common` that specifically pertain to this file, so I believe we're clear from a license perspective 👍 
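   For reference, the core reload pattern the bundled class relies on can be sketched roughly like this. This is a simplified check-on-read variant using a plain string payload (the Hadoop class loads a real `KeyStore` and polls from a background thread); class and method names here are illustrative:
   
   ```java
   import java.io.File;
   import java.io.FileWriter;
   import java.io.IOException;
   import java.nio.file.Files;
   import java.util.concurrent.atomic.AtomicReference;
   
   // Sketch of reload-on-file-change: keep the loaded artifact in an AtomicReference
   // and swap it only when the backing file's last-modified timestamp changes, so
   // readers always see a fully-loaded value and never block on a reload in progress.
   public class ReloadOnChangeSketch {
     private final File file;
     private long lastLoaded = -1;
     private final AtomicReference<String> current = new AtomicReference<>();
   
     ReloadOnChangeSketch(File file) { this.file = file; }
   
     String get() throws IOException {
       long mtime = file.lastModified();
       if (mtime != lastLoaded) {  // first load, or the file changed on disk
         current.set(Files.readString(file.toPath()));
         lastLoaded = mtime;
       }
       return current.get();
     }
   
     public static void main(String[] args) throws Exception {
       File f = File.createTempFile("trust", ".txt");
       try (FileWriter w = new FileWriter(f)) { w.write("v1"); }
       ReloadOnChangeSketch r = new ReloadOnChangeSketch(f);
       System.out.println(r.get());
       // Rewrite the file and bump the timestamp to guarantee a visible change.
       try (FileWriter w = new FileWriter(f)) { w.write("v2"); }
       f.setLastModified(f.lastModified() + 2000);
       System.out.println(r.get());
       f.delete();
     }
   }
   ```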



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
For additional commands, e-mail: reviews-help@spark.apache.org