You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "zhenlineo (via GitHub)" <gi...@apache.org> on 2023/02/22 23:54:56 UTC

[GitHub] [spark] zhenlineo opened a new pull request, #40133: [SPARK-42533][CONNECT][Scala] Add ssl for Scala client

zhenlineo opened a new pull request, #40133:
URL: https://github.com/apache/spark/pull/40133

   ### What changes were proposed in this pull request?
   Adding SSL encryption and access token support for Scala client
   
   ### Why are the changes needed?
   To support basic client side encryption to protect data sent over the network.
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   ### How was this patch tested?
   Unit tests. TODO: Manual tests.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] hvanhovell closed pull request #40133: [SPARK-42533][CONNECT][Scala] Add ssl for Scala client

Posted by "hvanhovell (via GitHub)" <gi...@apache.org>.
hvanhovell closed pull request #40133: [SPARK-42533][CONNECT][Scala] Add ssl for Scala client
URL: https://github.com/apache/spark/pull/40133


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] hvanhovell commented on a diff in pull request #40133: [SPARK-42533][CONNECT][Scala] Add ssl for Scala client

Posted by "hvanhovell (via GitHub)" <gi...@apache.org>.
hvanhovell commented on code in PR #40133:
URL: https://github.com/apache/spark/pull/40133#discussion_r1115215682


##########
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/connect/client/SparkConnectClient.scala:
##########
@@ -117,6 +126,53 @@ object SparkConnectClient {
       this
     }
 
+    /**
+     * Setting the token implicitly sets the use_ssl=true. All the following examples yield the
+     * same results:
+     *
+     * {{{
+     * sc://localhost/;token=aaa
+     * sc://localhost/;use_ssl=true;token=aaa
+     * sc://localhost/;token=aaa;use_ssl=true
+     * }}}
+     *
+     * Throws exception if the token is set but use_ssl=false.
+     *
+     * @param inputToken
+     *   the user token.
+     * @return
+     *   this builder.
+     */
+    def token(inputToken: String): Builder = {
+      require(inputToken != null && inputToken.nonEmpty)
+      token = Some(inputToken)
+      // Only set the isSSlEnabled if it is not yet set
+      isSslEnabled match {
+        case None => isSslEnabled = Some(true)
+        case Some(false) =>
+          throw new IllegalArgumentException(AUTH_TOKEN_ON_INSECURE_CONN_ERROR_MSG)
+        case Some(true) => // Good, the ssl is enabled
+      }
+      this
+    }
+
+    def enableSsl(): Builder = {
+      isSslEnabled = Some(true)
+      this
+    }
+
+    /**
+     * Disables the SSL. Throws exception if the token has been set.
+     *
+     * @return
+     *   this builder.
+     */
+    def disableSsl(): Builder = {
+      require(token.isEmpty, AUTH_TOKEN_ON_INSECURE_CONN_ERROR_MSG)

Review Comment:
   So there is no way to disable this when the token is set?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] grundprinzip commented on a diff in pull request #40133: [SPARK-42533][CONNECT][Scala] Add ssl for Scala client

Posted by "grundprinzip (via GitHub)" <gi...@apache.org>.
grundprinzip commented on code in PR #40133:
URL: https://github.com/apache/spark/pull/40133#discussion_r1116938358


##########
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/connect/client/SparkConnectClient.scala:
##########
@@ -117,6 +126,53 @@ object SparkConnectClient {
       this
     }
 
+    /**
+     * Setting the token implicitly sets the use_ssl=true. All the following examples yield the
+     * same results:
+     *
+     * {{{
+     * sc://localhost/;token=aaa
+     * sc://localhost/;use_ssl=true;token=aaa
+     * sc://localhost/;token=aaa;use_ssl=true
+     * }}}
+     *
+     * Throws exception if the token is set but use_ssl=false.
+     *
+     * @param inputToken
+     *   the user token.
+     * @return
+     *   this builder.
+     */
+    def token(inputToken: String): Builder = {
+      require(inputToken != null && inputToken.nonEmpty)
+      token = Some(inputToken)
+      // Only set the isSSlEnabled if it is not yet set
+      isSslEnabled match {
+        case None => isSslEnabled = Some(true)
+        case Some(false) =>
+          throw new IllegalArgumentException(AUTH_TOKEN_ON_INSECURE_CONN_ERROR_MSG)
+        case Some(true) => // Good, the ssl is enabled
+      }
+      this
+    }
+
+    def enableSsl(): Builder = {
+      isSslEnabled = Some(true)
+      this
+    }
+
+    /**
+     * Disables the SSL. Throws exception if the token has been set.
+     *
+     * @return
+     *   this builder.
+     */
+    def disableSsl(): Builder = {
+      require(token.isEmpty, AUTH_TOKEN_ON_INSECURE_CONN_ERROR_MSG)

Review Comment:
   No, from the Python GRPC implementation it basically says that you cannot use a token on plaintext and we should follow this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] hvanhovell commented on a diff in pull request #40133: [SPARK-42533][CONNECT][Scala] Add ssl for Scala client

Posted by "hvanhovell (via GitHub)" <gi...@apache.org>.
hvanhovell commented on code in PR #40133:
URL: https://github.com/apache/spark/pull/40133#discussion_r1115216626


##########
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/connect/client/SparkConnectClient.scala:
##########
@@ -189,9 +245,54 @@ object SparkConnectClient {
     }
 
     def build(): SparkConnectClient = {
-      val channelBuilder = ManagedChannelBuilder.forAddress(host, port).usePlaintext()
-      val channel: ManagedChannel = channelBuilder.build()
+      def insecureCredentials(): ChannelCredentials = {
+        InsecureChannelCredentials.create()
+      }
+      val creds = isSslEnabled match {
+        case None => insecureCredentials()

Review Comment:
   NIT, you can combine cases, e.g.: `case None | Some(false) => insecureCredentials()`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] hvanhovell commented on a diff in pull request #40133: [SPARK-42533][CONNECT][Scala] Add ssl for Scala client

Posted by "hvanhovell (via GitHub)" <gi...@apache.org>.
hvanhovell commented on code in PR #40133:
URL: https://github.com/apache/spark/pull/40133#discussion_r1115216249


##########
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/connect/client/SparkConnectClient.scala:
##########
@@ -158,13 +214,14 @@ object SparkConnectClient {
           }
           (arr(0), arr(1))
         }
-        if (key == URIParams.PARAM_USER_ID) {
-          userContextBuilder.setUserId(value)
-        } else {
-          // TODO(SPARK-41917): Support SSL and Auth tokens.
-          throw new UnsupportedOperationException(
-            "Parameters apart from user_id" +
-              " are currently unsupported.")
+        key match {
+          case URIParams.PARAM_USER_ID => userContextBuilder.setUserId(value)
+          case URIParams.PARAM_TOKEN => token(value)
+          case URIParams.PARAM_USE_SSL =>
+            if (true.toString.equalsIgnoreCase(value)) enableSsl() else disableSsl()

Review Comment:
   `Boolean.valueOf()`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhenlineo commented on a diff in pull request #40133: [SPARK-42533][CONNECT][Scala] Add ssl for Scala client

Posted by "zhenlineo (via GitHub)" <gi...@apache.org>.
zhenlineo commented on code in PR #40133:
URL: https://github.com/apache/spark/pull/40133#discussion_r1116335683


##########
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/connect/client/SparkConnectClient.scala:
##########
@@ -117,6 +126,53 @@ object SparkConnectClient {
       this
     }
 
+    /**
+     * Setting the token implicitly sets the use_ssl=true. All the following examples yield the
+     * same results:
+     *
+     * {{{
+     * sc://localhost/;token=aaa
+     * sc://localhost/;use_ssl=true;token=aaa
+     * sc://localhost/;token=aaa;use_ssl=true
+     * }}}
+     *
+     * Throws exception if the token is set but use_ssl=false.
+     *
+     * @param inputToken
+     *   the user token.
+     * @return
+     *   this builder.
+     */
+    def token(inputToken: String): Builder = {
+      require(inputToken != null && inputToken.nonEmpty)
+      token = Some(inputToken)
+      // Only set the isSSlEnabled if it is not yet set
+      isSslEnabled match {
+        case None => isSslEnabled = Some(true)
+        case Some(false) =>
+          throw new IllegalArgumentException(AUTH_TOKEN_ON_INSECURE_CONN_ERROR_MSG)
+        case Some(true) => // Good, the ssl is enabled
+      }
+      this
+    }
+
+    def enableSsl(): Builder = {
+      isSslEnabled = Some(true)
+      this
+    }
+
+    /**
+     * Disables the SSL. Throws exception if the token has been set.
+     *
+     * @return
+     *   this builder.
+     */
+    def disableSsl(): Builder = {
+      require(token.isEmpty, AUTH_TOKEN_ON_INSECURE_CONN_ERROR_MSG)

Review Comment:
   @grundprinzip What's the correct action here? Should we allow users to turn off ssl when token is provided?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] hvanhovell commented on pull request #40133: [SPARK-42533][CONNECT][Scala] Add ssl for Scala client

Posted by "hvanhovell (via GitHub)" <gi...@apache.org>.
hvanhovell commented on PR #40133:
URL: https://github.com/apache/spark/pull/40133#issuecomment-1443634139

   Merging to master/3.4


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhenlineo commented on a diff in pull request #40133: [SPARK-42533][CONNECT][Scala] Add ssl for Scala client

Posted by "zhenlineo (via GitHub)" <gi...@apache.org>.
zhenlineo commented on code in PR #40133:
URL: https://github.com/apache/spark/pull/40133#discussion_r1116336199


##########
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/client/MyTestSuite.scala:
##########
@@ -0,0 +1,44 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.connect.client
+
+import org.scalatest.funsuite.AnyFunSuite // scalastyle:ignore funsuite
+
+import org.apache.spark.sql.{SaveMode, SparkSession}
+
+class MyTestSuite extends AnyFunSuite { // scalastyle:ignore funsuite
+
+  private var client: SparkConnectClient = _
+
+//  System.setProperty("javax.net.debug", "all")
+  test("Test ssl and token manually") {
+    client = SparkConnectClient
+      .builder()
+      .connectionString(

Review Comment:
   The client method name is not changed. But the `SparkSession.remote` method is added.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] zhenlineo commented on pull request #40133: [SPARK-42533][CONNECT][Scala] Add ssl for Scala client

Posted by "zhenlineo (via GitHub)" <gi...@apache.org>.
zhenlineo commented on PR #40133:
URL: https://github.com/apache/spark/pull/40133#issuecomment-1441018465

   cc @grundprinzip 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] grundprinzip commented on a diff in pull request #40133: [SPARK-42533][CONNECT][Scala] Add ssl for Scala client

Posted by "grundprinzip (via GitHub)" <gi...@apache.org>.
grundprinzip commented on code in PR #40133:
URL: https://github.com/apache/spark/pull/40133#discussion_r1116232505


##########
connector/connect/client/jvm/src/test/scala/org/apache/spark/sql/connect/client/MyTestSuite.scala:
##########
@@ -0,0 +1,44 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.connect.client
+
+import org.scalatest.funsuite.AnyFunSuite // scalastyle:ignore funsuite
+
+import org.apache.spark.sql.{SaveMode, SparkSession}
+
+class MyTestSuite extends AnyFunSuite { // scalastyle:ignore funsuite
+
+  private var client: SparkConnectClient = _
+
+//  System.setProperty("javax.net.debug", "all")
+  test("Test ssl and token manually") {
+    client = SparkConnectClient
+      .builder()
+      .connectionString(

Review Comment:
   ```suggestion
         .remote(
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org