You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "hasnain-db (via GitHub)" <gi...@apache.org> on 2023/10/06 02:17:37 UTC

[PR] [SPARK-45427][CORE] Add RPC SSL settings to SSLOptions and SparkTransportConf [spark]

hasnain-db opened a new pull request, #43238:
URL: https://github.com/apache/spark/pull/43238

   ### What changes were proposed in this pull request?
   
   This PR adds the options added in https://github.com/apache/spark/pull/43220  to `SSLOptions` and `SparkTransportConf`.
   
   By adding it to the `SSLOptions` we can support inheritance of options, so settings for the UI and RPC SSL settings can be shared as much as possible. The `SparkTransportConf` changes are needed to support propagating these settings.
   
   I also make some changes to `SecurityManager` to log when this feature is enabled, and make the existing `spark.authenticate` and `spark.network.crypto` options mutually exclusive with this new settings (it would just involve double encryption then).
   
   Lastly, make these flags propagate down to when executors are launch, and allow the passwords to be sent via environment variables (similar to how it's done for an existing secret). This ensures they are not visible in plaintext, but also ensures they are available at executor startup (otherwise it can never talk to the driver/worker)
   
   ### Why are the changes needed?
   
   The propagation of these options are needed for the RPC functionality to work
   
   ### Does this PR introduce _any_ user-facing change?
   
   No
   
   ### How was this patch tested?
   
   CI
   
   Added some unit tests which I verified passed:
   
   ```
   build/sbt
   > project core
   > testOnly org.apache.spark.SparkConfSuite org.apache.spark.SSLOptionsSuite org.apache.spark.SecurityManagerSuite org.apache.spark.deploy.worker.CommandUtilsSuite
   ``` 
   
   The rest of the changes and integration were tested as part of https://github.com/apache/spark/pull/42685
   
   ### Was this patch authored or co-authored using generative AI tooling?
   
   No
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45427][CORE] Add RPC SSL settings to SSLOptions and SparkTransportConf [spark]

Posted by "hasnain-db (via GitHub)" <gi...@apache.org>.
hasnain-db commented on code in PR #43238:
URL: https://github.com/apache/spark/pull/43238#discussion_r1349630129


##########
core/src/main/scala/org/apache/spark/network/netty/SparkTransportConf.scala:
##########
@@ -38,12 +56,14 @@ object SparkTransportConf {
    *                       This restriction will only occur if these properties are not already set.
    * @param role           optional role, could be driver, executor, worker and master. Default is
    *                      [[None]], means no role specific configurations.
+   * @param sslOptions SSL config options
    */
-  def fromSparkConf(
+  def fromSparkConfWithSslOptions(

Review Comment:
   Tests fail saying that this breaks binary compatibility. I initially went with that but changed it for that reason



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45427][CORE] Add RPC SSL settings to SSLOptions and SparkTransportConf [spark]

Posted by "mridulm (via GitHub)" <gi...@apache.org>.
mridulm commented on PR #43238:
URL: https://github.com/apache/spark/pull/43238#issuecomment-1762701923

   The failures are not related.
   Merging to master.
   Thanks for working on this @hasnain-db !
   Thanks for the review @JoshRosen :-)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45427][CORE] Add RPC SSL settings to SSLOptions and SparkTransportConf [spark]

Posted by "mridulm (via GitHub)" <gi...@apache.org>.
mridulm commented on PR #43238:
URL: https://github.com/apache/spark/pull/43238#issuecomment-1760357448

   I was trying to see if `spark.network.ssl` made more sense, and why we have `spark.ssl` and not `spark.ui.ssl` (given current use is for UI).
   
   We used to have more ssl related configs in the past under that namespace - `spark.ssl.akka`, `spark.ssl.fs` (file server), etc ... with the intent to keep `spark.ssl` as the namespace for various module's ssl config.
   
   Given this, even though today it is just UI, it does make more sense to go back to that approach.
   Let us keep it as proposed in this PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45427][CORE] Add RPC SSL settings to SSLOptions and SparkTransportConf [spark]

Posted by "hasnain-db (via GitHub)" <gi...@apache.org>.
hasnain-db commented on code in PR #43238:
URL: https://github.com/apache/spark/pull/43238#discussion_r1354109836


##########
core/src/main/scala/org/apache/spark/SSLOptions.scala:
##########
@@ -152,15 +207,25 @@ private[spark] object SSLOptions extends Logging {
    * $ - `[ns].port` - the port where to bind the SSL server
    * $ - `[ns].keyStore` - a path to the key-store file; can be relative to the current directory
    * $ - `[ns].keyStorePassword` - a password to the key-store file
+   * $ - `[ns].privateKey` - a PKCS#8 private key file in PEM format
    * $ - `[ns].keyPassword` - a password to the private key
    * $ - `[ns].keyStoreType` - the type of the key-store
    * $ - `[ns].needClientAuth` - whether SSL needs client authentication
+   * $ - `[ns].certChain` - an X.509 certificate chain file in PEM format
    * $ - `[ns].trustStore` - a path to the trust-store file; can be relative to the current
    *                         directory
    * $ - `[ns].trustStorePassword` - a password to the trust-store file
    * $ - `[ns].trustStoreType` - the type of trust-store
+   * $ - `[ns].trustStoreReloadingEnabled` - enables or disables using a trust-store
+   * that reloads its configuration when the trust-store file on disk changes
+   * $ - `[ns].trustStoreReloadIntervalMs` - the interval, in milliseconds, the
+   * trust-store will reload its configuration
+   * $ - `[ns].openSslEnabled` - enables or disables using an OpenSSL implementation
+   * (if available on host system), requires certChain and keyFile arguments
    * $ - `[ns].protocol` - a protocol name supported by a particular Java version
    * $ - `[ns].enabledAlgorithms` - a comma separated list of ciphers
+   * $ - `[ns].dangerouslyFallbackIfKeysNotPresent` - whether to fallback to unencrypted
+   *                                                  communication if keys are not present.

Review Comment:
   removed.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45427][CORE] Add RPC SSL settings to SSLOptions and SparkTransportConf [spark]

Posted by "mridulm (via GitHub)" <gi...@apache.org>.
mridulm commented on code in PR #43238:
URL: https://github.com/apache/spark/pull/43238#discussion_r1349768410


##########
core/src/main/scala/org/apache/spark/SparkConf.scala:
##########
@@ -739,6 +739,9 @@ private[spark] object SparkConf extends Logging {
     (name.startsWith("spark.auth") && name != SecurityManager.SPARK_AUTH_SECRET_CONF) ||
     name.startsWith("spark.rpc") ||
     name.startsWith("spark.network") ||
+    // We need SSL configs to propagate as they may be needed for RPCs.
+    // Passwords are propagated separately though.
+    (name.startsWith("spark.ssl") && !name.contains("Password")) ||

Review Comment:
   Sounds good, thanks for clarifying



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45427][CORE] Add RPC SSL settings to SSLOptions and SparkTransportConf [spark]

Posted by "hasnain-db (via GitHub)" <gi...@apache.org>.
hasnain-db commented on PR #43238:
URL: https://github.com/apache/spark/pull/43238#issuecomment-1749927614

   cc: @mridulm @JoshRosen 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45427][CORE] Add RPC SSL settings to SSLOptions and SparkTransportConf [spark]

Posted by "hasnain-db (via GitHub)" <gi...@apache.org>.
hasnain-db commented on PR #43238:
URL: https://github.com/apache/spark/pull/43238#issuecomment-1762251118

   test failure looks unrelated, retrying for the third time


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45427][CORE] Add RPC SSL settings to SSLOptions and SparkTransportConf [spark]

Posted by "mridulm (via GitHub)" <gi...@apache.org>.
mridulm commented on code in PR #43238:
URL: https://github.com/apache/spark/pull/43238#discussion_r1357425365


##########
core/src/main/scala/org/apache/spark/SecurityManager.scala:
##########
@@ -269,9 +276,24 @@ private[spark] class SecurityManager(
    * @return Whether to enable encryption when connecting to services that support it.
    */
   def isEncryptionEnabled(): Boolean = {
-    sparkConf.get(Network.NETWORK_CRYPTO_ENABLED) || sparkConf.get(SASL_ENCRYPTION_ENABLED)
+    if (sparkConf.get(Network.NETWORK_CRYPTO_ENABLED) || sparkConf.get(SASL_ENCRYPTION_ENABLED)) {
+      if (rpcSSLOptions.enabled) {
+        logWarning("Network encryption disabled as RPC SSL encryption is enabled")
+        false
+      } else {
+        true
+      }
+    } else {
+      false
+    }

Review Comment:
   nit:
   
   ```suggestion
       val encryptionEnabled = sparkConf.get(Network.NETWORK_CRYPTO_ENABLED) || sparkConf.get(SASL_ENCRYPTION_ENABLED)
       if (encryptionEnabled && rpcSSLOptions.enabled) {
         logWarning("Network encryption disabled as RPC SSL encryption is enabled")
         false
       } else 
         encryptionEnabled
       }
   ```



##########
core/src/main/scala/org/apache/spark/network/netty/SparkTransportConf.scala:
##########
@@ -31,19 +31,21 @@ object SparkTransportConf {
 
   /**
    * Utility for creating a [[TransportConf]] from a [[SparkConf]].
-   * @param _conf the [[SparkConf]]
-   * @param module the module name
+   * @param _conf          the [[SparkConf]]
+   * @param module         the module name
    * @param numUsableCores if nonzero, this will restrict the server and client threads to only
    *                       use the given number of cores, rather than all of the machine's cores.
    *                       This restriction will only occur if these properties are not already set.
    * @param role           optional role, could be driver, executor, worker and master. Default is
-   *                      [[None]], means no role specific configurations.
+   *                       [[None]], means no role specific configurations.
+   * @param sslOptions SSL config options

Review Comment:
   nit: Remove white space changes ?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45427][CORE] Add RPC SSL settings to SSLOptions and SparkTransportConf [spark]

Posted by "hasnain-db (via GitHub)" <gi...@apache.org>.
hasnain-db commented on PR #43238:
URL: https://github.com/apache/spark/pull/43238#issuecomment-1756833647

   addressed part of the feedback. Still waiting for @JoshRosen to also chime in with his thoughts re: which config namespace to use.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45427][CORE] Add RPC SSL settings to SSLOptions and SparkTransportConf [spark]

Posted by "hasnain-db (via GitHub)" <gi...@apache.org>.
hasnain-db commented on PR #43238:
URL: https://github.com/apache/spark/pull/43238#issuecomment-1758023704

   FYI The open PR for the docs is here: https://github.com/apache/spark/pull/43240 -- happy to address and make changes there.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45427][CORE] Add RPC SSL settings to SSLOptions and SparkTransportConf [spark]

Posted by "hasnain-db (via GitHub)" <gi...@apache.org>.
hasnain-db commented on code in PR #43238:
URL: https://github.com/apache/spark/pull/43238#discussion_r1349630092


##########
core/src/main/scala/org/apache/spark/SSLOptions.scala:
##########
@@ -152,15 +207,25 @@ private[spark] object SSLOptions extends Logging {
    * $ - `[ns].port` - the port where to bind the SSL server
    * $ - `[ns].keyStore` - a path to the key-store file; can be relative to the current directory
    * $ - `[ns].keyStorePassword` - a password to the key-store file
+   * $ - `[ns].privateKey` - a PKCS#8 private key file in PEM format
    * $ - `[ns].keyPassword` - a password to the private key
    * $ - `[ns].keyStoreType` - the type of the key-store
    * $ - `[ns].needClientAuth` - whether SSL needs client authentication
+   * $ - `[ns].certChain` - an X.509 certificate chain file in PEM format
    * $ - `[ns].trustStore` - a path to the trust-store file; can be relative to the current
    *                         directory
    * $ - `[ns].trustStorePassword` - a password to the trust-store file
    * $ - `[ns].trustStoreType` - the type of trust-store
+   * $ - `[ns].trustStoreReloadingEnabled` - enables or disables using a trust-store
+   * that reloads its configuration when the trust-store file on disk changes
+   * $ - `[ns].trustStoreReloadIntervalMs` - the interval, in milliseconds, the
+   * trust-store will reload its configuration
+   * $ - `[ns].openSslEnabled` - enables or disables using an OpenSSL implementation
+   * (if available on host system), requires certChain and keyFile arguments
    * $ - `[ns].protocol` - a protocol name supported by a particular Java version
    * $ - `[ns].enabledAlgorithms` - a comma separated list of ciphers
+   * $ - `[ns].dangerouslyFallbackIfKeysNotPresent` - whether to fallback to unencrypted
+   *                                                  communication if keys are not present.

Review Comment:
   this is helpful in certain deployment settings where a key might not be available at node startup but is available before it needs to communicate to other machines. Happy to keep this out of OSS spark though, I don't feel too strongly about this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45427][CORE] Add RPC SSL settings to SSLOptions and SparkTransportConf [spark]

Posted by "hasnain-db (via GitHub)" <gi...@apache.org>.
hasnain-db commented on code in PR #43238:
URL: https://github.com/apache/spark/pull/43238#discussion_r1349630607


##########
core/src/main/scala/org/apache/spark/SecurityManager.scala:
##########
@@ -258,18 +265,46 @@ private[spark] class SecurityManager(
     isUserInACL(user, modifyAcls, modifyAclsGroups)
   }
 
+  /**
+   * Check to see if authentication is enabled and SSL for RPC is disabled (as they are mutually
+   * exclusive)
+   * @return Whether authentication is enabled and SSL for RPC is disabled
+   */
+  private def isAuthenticationEnabledAndSslRpcDisabled(): Boolean = {
+    if (sparkConf.get(NETWORK_AUTH_ENABLED)) {
+      if (rpcSSLOptions.enabled) {
+        logWarning("Network auth disabled as RPC SSL encryption is enabled")
+        false
+      } else {
+        true
+      }
+    } else {
+      false
+    }
+  }
+
+
   /**
    * Check to see if authentication for the Spark communication protocols is enabled
    * @return true if authentication is enabled, otherwise false
    */
-  def isAuthenticationEnabled(): Boolean = authOn
+  def isAuthenticationEnabled(): Boolean = isAuthenticationEnabledAndSslRpcDisabled
 
   /**
    * Checks whether network encryption should be enabled.
    * @return Whether to enable encryption when connecting to services that support it.
    */
   def isEncryptionEnabled(): Boolean = {

Review Comment:
   ah, good catch: https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/SparkEnv.scala#L310-L315 should not warn if TLS is enabled I imagine. I'll try to fix this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45427][CORE] Add RPC SSL settings to SSLOptions and SparkTransportConf [spark]

Posted by "hasnain-db (via GitHub)" <gi...@apache.org>.
hasnain-db commented on code in PR #43238:
URL: https://github.com/apache/spark/pull/43238#discussion_r1349630702


##########
core/src/main/scala/org/apache/spark/SparkConf.scala:
##########
@@ -739,6 +739,9 @@ private[spark] object SparkConf extends Logging {
     (name.startsWith("spark.auth") && name != SecurityManager.SPARK_AUTH_SECRET_CONF) ||
     name.startsWith("spark.rpc") ||
     name.startsWith("spark.network") ||
+    // We need SSL configs to propagate as they may be needed for RPCs.
+    // Passwords are propagated separately though.
+    (name.startsWith("spark.ssl") && !name.contains("Password")) ||

Review Comment:
   We do, and we pass them over environment variables. `isExecutorStartupConf` is used for arguments that are passed in the command line invocation to start the executor. If we passed the password there, anyone on the system using `ps` could see the password in plaintext, which we don't want. There is an existing test which fails, due to exactly this bug, that I had to adjust in this PR.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45427][CORE] Add RPC SSL settings to SSLOptions and SparkTransportConf [spark]

Posted by "mridulm (via GitHub)" <gi...@apache.org>.
mridulm commented on code in PR #43238:
URL: https://github.com/apache/spark/pull/43238#discussion_r1349768184


##########
core/src/main/scala/org/apache/spark/network/netty/SparkTransportConf.scala:
##########
@@ -38,12 +56,14 @@ object SparkTransportConf {
    *                       This restriction will only occur if these properties are not already set.
    * @param role           optional role, could be driver, executor, worker and master. Default is
    *                      [[None]], means no role specific configurations.
+   * @param sslOptions SSL config options
    */
-  def fromSparkConf(
+  def fromSparkConfWithSslOptions(

Review Comment:
   We can add it to mima exclusion rules as the change is preserving source compatibility.
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45427][CORE] Add RPC SSL settings to SSLOptions and SparkTransportConf [spark]

Posted by "mridulm (via GitHub)" <gi...@apache.org>.
mridulm commented on code in PR #43238:
URL: https://github.com/apache/spark/pull/43238#discussion_r1349768255


##########
core/src/main/scala/org/apache/spark/SecurityManager.scala:
##########
@@ -258,18 +265,46 @@ private[spark] class SecurityManager(
     isUserInACL(user, modifyAcls, modifyAclsGroups)
   }
 
+  /**
+   * Check to see if authentication is enabled and SSL for RPC is disabled (as they are mutually
+   * exclusive)
+   * @return Whether authentication is enabled and SSL for RPC is disabled
+   */
+  private def isAuthenticationEnabledAndSslRpcDisabled(): Boolean = {

Review Comment:
   Agree, does not make sense to do existing encryption over TLS :-)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45427][CORE] Add RPC SSL settings to SSLOptions and SparkTransportConf [spark]

Posted by "hasnain-db (via GitHub)" <gi...@apache.org>.
hasnain-db commented on PR #43238:
URL: https://github.com/apache/spark/pull/43238#issuecomment-1758897493

   @JoshRosen I updated the docs PR to hopefully add more clarity around the various encryption options (and, to your request, clarified how they overlap with the authentication setting).
   
   > There may be some advantages to the hierarchical namespacing of SSL configurations...
   
   Agreed.  If we reuse the `spark.ssl.rpc` namespace as is done in the PR the implementation is also much easier -- but we can also make the settings inheritance work if we use a different namespace for it - implementation complexity shouldn't hinder UX IMO.
   
   >  I'm wondering if we use some long-term vision of a future end state to guide us here.
   
   I don't have enough context to make a call here and defer to @mridulm . I will say that as a random observer here who hasn't had to manage too many spark deployments just yet, I feel like the current encryption mechanism has a strong point in its favor: it's quite simple to configure just a shared secret. SSL requires provisioning keys and certs which is always more work.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45427][CORE] Add RPC SSL settings to SSLOptions and SparkTransportConf [spark]

Posted by "hasnain-db (via GitHub)" <gi...@apache.org>.
hasnain-db commented on PR #43238:
URL: https://github.com/apache/spark/pull/43238#issuecomment-1762546624

   `org.apache.spark.sql.kafka010.KafkaSourceStressSuite` is failing on this and the other outstanding PR, and seems to be unrelated to the changes here. I wonder if that is related to the new kafka version bump since it looks like it also failed on https://github.com/apache/spark/actions/runs/6497942586/job/17648192318


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45427][CORE] Add RPC SSL settings to SSLOptions and SparkTransportConf [spark]

Posted by "mridulm (via GitHub)" <gi...@apache.org>.
mridulm closed pull request #43238: [SPARK-45427][CORE] Add RPC SSL settings to SSLOptions and SparkTransportConf
URL: https://github.com/apache/spark/pull/43238


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45427][CORE] Add RPC SSL settings to SSLOptions and SparkTransportConf [spark]

Posted by "mridulm (via GitHub)" <gi...@apache.org>.
mridulm commented on code in PR #43238:
URL: https://github.com/apache/spark/pull/43238#discussion_r1349607873


##########
core/src/main/scala/org/apache/spark/network/netty/SparkTransportConf.scala:
##########
@@ -57,12 +77,14 @@ object SparkTransportConf {
       conf.set(s"spark.$module.io.$suffix", value)
     }
 
-    new TransportConf(module, new ConfigProvider {
+    val configProvider = sslOptions.map(_.createConfigProvider(conf)).getOrElse({

Review Comment:
   Remove the suffix `{`



##########
core/src/main/scala/org/apache/spark/SecurityManager.scala:
##########
@@ -258,18 +265,46 @@ private[spark] class SecurityManager(
     isUserInACL(user, modifyAcls, modifyAclsGroups)
   }
 
+  /**
+   * Check to see if authentication is enabled and SSL for RPC is disabled (as they are mutually
+   * exclusive)
+   * @return Whether authentication is enabled and SSL for RPC is disabled
+   */
+  private def isAuthenticationEnabledAndSslRpcDisabled(): Boolean = {
+    if (sparkConf.get(NETWORK_AUTH_ENABLED)) {
+      if (rpcSSLOptions.enabled) {
+        logWarning("Network auth disabled as RPC SSL encryption is enabled")
+        false
+      } else {
+        true
+      }
+    } else {
+      false
+    }
+  }
+
+
   /**
    * Check to see if authentication for the Spark communication protocols is enabled
    * @return true if authentication is enabled, otherwise false
    */
-  def isAuthenticationEnabled(): Boolean = authOn
+  def isAuthenticationEnabled(): Boolean = isAuthenticationEnabledAndSslRpcDisabled
 
   /**
    * Checks whether network encryption should be enabled.
    * @return Whether to enable encryption when connecting to services that support it.
    */
   def isEncryptionEnabled(): Boolean = {

Review Comment:
   Fix the warning in `SparkEnv` as well ?



##########
core/src/main/scala/org/apache/spark/SparkConf.scala:
##########
@@ -739,6 +739,9 @@ private[spark] object SparkConf extends Logging {
     (name.startsWith("spark.auth") && name != SecurityManager.SPARK_AUTH_SECRET_CONF) ||
     name.startsWith("spark.rpc") ||
     name.startsWith("spark.network") ||
+    // We need SSL configs to propagate as they may be needed for RPCs.
+    // Passwords are propagated separately though.
+    (name.startsWith("spark.ssl") && !name.contains("Password")) ||

Review Comment:
   Dont you need the password in order to load the keys ?



##########
core/src/main/scala/org/apache/spark/network/netty/SparkTransportConf.scala:
##########
@@ -38,12 +56,14 @@ object SparkTransportConf {
    *                       This restriction will only occur if these properties are not already set.
    * @param role           optional role, could be driver, executor, worker and master. Default is
    *                      [[None]], means no role specific configurations.
+   * @param sslOptions SSL config options
    */
-  def fromSparkConf(
+  def fromSparkConfWithSslOptions(

Review Comment:
   Why not add `sslOptions: Option[SSLOptions] = None` to `fromSparkConf` ?



##########
core/src/main/scala/org/apache/spark/SSLOptions.scala:
##########
@@ -152,15 +207,25 @@ private[spark] object SSLOptions extends Logging {
    * $ - `[ns].port` - the port where to bind the SSL server
    * $ - `[ns].keyStore` - a path to the key-store file; can be relative to the current directory
    * $ - `[ns].keyStorePassword` - a password to the key-store file
+   * $ - `[ns].privateKey` - a PKCS#8 private key file in PEM format
    * $ - `[ns].keyPassword` - a password to the private key
    * $ - `[ns].keyStoreType` - the type of the key-store
    * $ - `[ns].needClientAuth` - whether SSL needs client authentication
+   * $ - `[ns].certChain` - an X.509 certificate chain file in PEM format
    * $ - `[ns].trustStore` - a path to the trust-store file; can be relative to the current
    *                         directory
    * $ - `[ns].trustStorePassword` - a password to the trust-store file
    * $ - `[ns].trustStoreType` - the type of trust-store
+   * $ - `[ns].trustStoreReloadingEnabled` - enables or disables using a trust-store
+   * that reloads its configuration when the trust-store file on disk changes
+   * $ - `[ns].trustStoreReloadIntervalMs` - the interval, in milliseconds, the
+   * trust-store will reload its configuration
+   * $ - `[ns].openSslEnabled` - enables or disables using an OpenSSL implementation
+   * (if available on host system), requires certChain and keyFile arguments
    * $ - `[ns].protocol` - a protocol name supported by a particular Java version
    * $ - `[ns].enabledAlgorithms` - a comma separated list of ciphers
+   * $ - `[ns].dangerouslyFallbackIfKeysNotPresent` - whether to fallback to unencrypted
+   *                                                  communication if keys are not present.

Review Comment:
   IMO we should not support `dangerouslyFallbackIfKeysNotPresent`.
   Thoughts @JoshRosen, @hasnain-db ?



##########
core/src/main/scala/org/apache/spark/SecurityManager.scala:
##########
@@ -258,18 +265,46 @@ private[spark] class SecurityManager(
     isUserInACL(user, modifyAcls, modifyAclsGroups)
   }
 
+  /**
+   * Check to see if authentication is enabled and SSL for RPC is disabled (as they are mutually
+   * exclusive)
+   * @return Whether authentication is enabled and SSL for RPC is disabled
+   */
+  private def isAuthenticationEnabledAndSslRpcDisabled(): Boolean = {

Review Comment:
   Why is authentication mutually exclusive from rpc ssl ?
   For example, external shuffle service uses it to validate identity of the incoming request.
   
   IMO let us keep SSL for securing over-the-wire comm



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45427][CORE] Add RPC SSL settings to SSLOptions and SparkTransportConf [spark]

Posted by "beliefer (via GitHub)" <gi...@apache.org>.
beliefer commented on code in PR #43238:
URL: https://github.com/apache/spark/pull/43238#discussion_r1349612574


##########
core/src/main/scala/org/apache/spark/SSLOptions.scala:
##########
@@ -134,12 +157,44 @@ private[spark] case class SSLOptions(
     supported
   }
 
+  /**
+   *
+   * @return
+   */

Review Comment:
   Please supply the comment or remove it.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45427][CORE] Add RPC SSL settings to SSLOptions and SparkTransportConf [spark]

Posted by "hasnain-db (via GitHub)" <gi...@apache.org>.
hasnain-db commented on code in PR #43238:
URL: https://github.com/apache/spark/pull/43238#discussion_r1349630416


##########
core/src/main/scala/org/apache/spark/SecurityManager.scala:
##########
@@ -258,18 +265,46 @@ private[spark] class SecurityManager(
     isUserInACL(user, modifyAcls, modifyAclsGroups)
   }
 
+  /**
+   * Check to see if authentication is enabled and SSL for RPC is disabled (as they are mutually
+   * exclusive)
+   * @return Whether authentication is enabled and SSL for RPC is disabled
+   */
+  private def isAuthenticationEnabledAndSslRpcDisabled(): Boolean = {

Review Comment:
   Thanks! I had misread the docs earlier and thought authentication required encryption, but re-reading now I realize it's the other way around. I can adjust. While TLS can handle authentication too (with the right trust stores) it makes sense to keep existing auth mechanisms working over TLS.
   
   Just to confirm, we do want to keep the legacy encryption and TLS mutually exclusive right? I don't see a benefit to double encrypting (just overhead)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-45427][CORE] Add RPC SSL settings to SSLOptions and SparkTransportConf [spark]

Posted by "hasnain-db (via GitHub)" <gi...@apache.org>.
hasnain-db commented on PR #43238:
URL: https://github.com/apache/spark/pull/43238#issuecomment-1751928928

   > w.r.t the config namespace, we currently have ...
   > IMO we should namespace to spark.network and add ssl there, instead of moving these configs for rpc to spark.ssl.
   
   My main preference here was to use the existing `spark.ssl` settings because there is a nice interop with how SSL works for the UI: we can reuse the same keystore and trustore and protocol settings and have things be consistent. I agree it's a little confusing as it's not under the network settings, but I don't see it being a stretch for a user to expect SSL settings to be under `spark.ssl` - that's naively where I checked initially when looking for RPC SSL support.
   
   What do you think? I'm happy to defer to you and @JoshRosen for the final call here.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org