You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/10/07 08:50:25 UTC

[GitHub] [spark] gaborgsomogyi opened a new pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

gaborgsomogyi opened a new pull request #29964:
URL: https://github.com/apache/spark/pull/29964


   ### What changes were proposed in this pull request?
   At the moment there is no possibility to turn off JDBC authentication providers which exists on the classpath. This can be problematic because service providers are loaded with service loader. In this PR I've added `spark.security.jdbc.connection.provider.{providerName}.enabled` configuration possibility (default: true).
   
   ### Why are the changes needed?
   No possibility to turn off JDBC authentication providers.
   
   ### Does this PR introduce _any_ user-facing change?
   Yes, it introduces new configuration option.
   
   ### How was this patch tested?
   * Existing + newly added unit tests.
   * Existing integration tests.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-705619907


   **[Test build #129557 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129557/testReport)** for PR 29964 at commit [`738c90f`](https://github.com/apache/spark/commit/738c90f05b3921c482a66598e879f09b6fd370b4).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gaborgsomogyi edited a comment on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi edited a comment on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-708226412


   @HyukjinKwon The main intention is to make the framework as self-serving as possible. I see mainly 2 use-cases (I'm sure others can come up more):
   * The built-in provider works fine for a specific database (for example MSSQL) but the user wants different authentication functionality which is not able to be done with the built-in one. Such case the user implements a new provider but when added then 2 providers will be in place to handle MSSQL authentication (the binding is class name based). Such case the built-in one must be turned off to use the new one.
   * The built-in provider contains a bug for a specific database. Such case full Spark patch build can be provided in normal case but this configuration opens new ways in testing and patching. Namely one can easily send a custom provider to proof/solve provider related issues (of course temporarily).
   
   As a general opinion all the issue solutions and good features must be added upstream but it could take some time so having such flexibility would pay off in terms of reaction time.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] maropu commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
maropu commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-708381996


   To make other developers understood smoothly, could you move the comment above into the `Why are the changes needed?` section in the PR description?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gaborgsomogyi commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-704861726


   Yeah, here is a separate jira for it: https://issues.apache.org/jira/browse/SPARK-31816


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-704833552


   BTW, quick question: do you plan to write a doc for JDBC security stuff? It should be good to have it.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] maropu commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
maropu commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-704881030


   > BTW, quick question: do you plan to write a doc for JDBC security stuff? It should be good to have it.
   
   +1; it sound nice.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-705619907






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gaborgsomogyi commented on a change in pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi commented on a change in pull request #29964:
URL: https://github.com/apache/spark/pull/29964#discussion_r501788251



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/DB2ConnectionProvider.scala
##########
@@ -28,6 +28,8 @@ import org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions
 private[sql] class DB2ConnectionProvider extends SecureConnectionProvider {
   override val driverClass = "com.ibm.db2.jcc.DB2Driver"
 
+  override def name: String = "db2"

Review comment:
       Fixed.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-704947776






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gaborgsomogyi commented on a change in pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi commented on a change in pull request #29964:
URL: https://github.com/apache/spark/pull/29964#discussion_r501592794



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/ConnectionProvider.scala
##########
@@ -46,8 +49,13 @@ private[jdbc] object ConnectionProvider extends Logging {
           logError(s"Failed to load built in provider.", t)
       }
     }
-    // Seems duplicate but it's needed for Scala 2.13
-    providers.toSeq
+
+    // toSeq seems duplicate but it's needed for Scala 2.13
+    val sparkConf = SparkEnv.get.conf
+    providers.filter { p =>
+      val key = providerEnabledConfig.format(p.name)
+      sparkConf.getOption(key).forall(_.toBoolean)

Review comment:
       Adding more descriptive message.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-705655874






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-705772334






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-704793304


   **[Test build #129498 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129498/testReport)** for PR 29964 at commit [`80f67b8`](https://github.com/apache/spark/commit/80f67b81910bc3bd8685e36ef289787c759eb438).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-704829693






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gaborgsomogyi edited a comment on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi edited a comment on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-708226412


   @HyukjinKwon The main intention is to make the framework as self-serving as possible. I see mainly 2 use-cases (I'm sure others can came up more):
   * The built-in provider works fine for a specific database (for example MSSQL) but the user wants different authentication functionality which is not able to be done with the built-in one. Such case the user implements a new provider but when added then 2 providers will be in place to handle MSSQL authentication (the binding is class name based). Such case the built-in one must be turned off to use the new one.
   * The built-in provider contains a bug for a specific database. Such case full Spark patch build can be provided in normal case but this configuration opens new ways in testing and patching. Namely one can easily send a custom provider to proof/solve provider related issues (of course temporarily).
   
   As a general opinion all the issue solutions and good features must be added upstream but it could take some time so having such flexibility would pay off in terms of reaction time.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-706834736


   Merged to master.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gaborgsomogyi commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-708226412


   @HyukjinKwon The main intention is to make the framework as self-serving as possible. I see mainly 2 use-cases (I'm sure others can came up more):
   * The built-in provider works fine for a specific database but the user wants different authentication functionality which is not able to be done with the built-in one. Such case the user implements a new provider but when added 2 providers will be in place to handle MSSQL authentication (the binding is class name based). Such case the built-in one must be turned off to use the new one.
   * The built-in provider contains a bug for a specific database. Such case full Spark patch build can be provided in normal case but this configuration opens new ways in testing and patching. Namely one can easily send a custom provider to proof/solve provider related issues (of course temporarily).
   
   As a general opinion all the issue solutions and good features must be added upstream but it could take some time so having such flexibility would pay off in terms of reaction time.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] maropu commented on a change in pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #29964:
URL: https://github.com/apache/spark/pull/29964#discussion_r500962666



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/ConnectionProvider.scala
##########
@@ -22,12 +22,15 @@ import java.util.ServiceLoader
 
 import scala.collection.mutable
 
+import org.apache.spark.{SparkConf, SparkEnv}
 import org.apache.spark.internal.Logging
 import org.apache.spark.security.SecurityConfigurationLock
 import org.apache.spark.sql.jdbc.JdbcConnectionProvider
 import org.apache.spark.util.Utils
 
 private[jdbc] object ConnectionProvider extends Logging {
+  private val providerEnabledConfig = "spark.security.jdbc.connection.provider.%s.enabled"

Review comment:
       Ah, I see. Thanks for the info. Adding it in `SQLConf` looks better if possible.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a change in pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #29964:
URL: https://github.com/apache/spark/pull/29964#discussion_r500890819



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/ConnectionProvider.scala
##########
@@ -46,8 +49,13 @@ private[jdbc] object ConnectionProvider extends Logging {
           logError(s"Failed to load built in provider.", t)
       }
     }
-    // Seems duplicate but it's needed for Scala 2.13
-    providers.toSeq
+
+    // toSeq seems duplicate but it's needed for Scala 2.13
+    val sparkConf = SparkEnv.get.conf
+    providers.filter { p =>
+      val key = providerEnabledConfig.format(p.name)
+      sparkConf.getOption(key).forall(_.toBoolean)

Review comment:
       Should we maybe throw a better exception saying that the configuration only takes `true` or `false`?




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gaborgsomogyi commented on a change in pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi commented on a change in pull request #29964:
URL: https://github.com/apache/spark/pull/29964#discussion_r501790366



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/ConnectionProvider.scala
##########
@@ -46,8 +47,10 @@ private[jdbc] object ConnectionProvider extends Logging {
           logError(s"Failed to load built in provider.", t)
       }
     }
-    // Seems duplicate but it's needed for Scala 2.13
-    providers.toSeq
+
+    val disabledProviders = SQLConf.get.disabledJdbcConnectionProviders.split(",")

Review comment:
       Changed, however I've taken this solution from existing params. After this I'm going to file a PR to change them.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gaborgsomogyi commented on a change in pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi commented on a change in pull request #29964:
URL: https://github.com/apache/spark/pull/29964#discussion_r501787679



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/ConnectionProvider.scala
##########
@@ -46,8 +49,13 @@ private[jdbc] object ConnectionProvider extends Logging {
           logError(s"Failed to load built in provider.", t)
       }
     }
-    // Seems duplicate but it's needed for Scala 2.13
-    providers.toSeq
+
+    // toSeq seems duplicate but it's needed for Scala 2.13

Review comment:
       Fixed.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gaborgsomogyi commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-713365565


   @maropu almost missed your last comment. Was just hanging around and spotted it. Adding it which will be mentioned in the doc too.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-705674415


   Test FAILed.
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/34166/
   Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gaborgsomogyi commented on a change in pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi commented on a change in pull request #29964:
URL: https://github.com/apache/spark/pull/29964#discussion_r501787551



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/ConnectionProvider.scala
##########
@@ -22,12 +22,15 @@ import java.util.ServiceLoader
 
 import scala.collection.mutable
 
+import org.apache.spark.{SparkConf, SparkEnv}
 import org.apache.spark.internal.Logging
 import org.apache.spark.security.SecurityConfigurationLock
 import org.apache.spark.sql.jdbc.JdbcConnectionProvider
 import org.apache.spark.util.Utils
 
 private[jdbc] object ConnectionProvider extends Logging {
+  private val providerEnabledConfig = "spark.security.jdbc.connection.provider.%s.enabled"

Review comment:
       Just changed it.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a change in pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #29964:
URL: https://github.com/apache/spark/pull/29964#discussion_r501786746



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/ConnectionProvider.scala
##########
@@ -46,8 +47,10 @@ private[jdbc] object ConnectionProvider extends Logging {
           logError(s"Failed to load built in provider.", t)
       }
     }
-    // Seems duplicate but it's needed for Scala 2.13
-    providers.toSeq
+
+    val disabledProviders = SQLConf.get.disabledJdbcConnectionProviders.split(",")

Review comment:
       Looks like there's a util for this `Utils.stringToSeq`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a change in pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #29964:
URL: https://github.com/apache/spark/pull/29964#discussion_r500959911



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/ConnectionProvider.scala
##########
@@ -22,12 +22,15 @@ import java.util.ServiceLoader
 
 import scala.collection.mutable
 
+import org.apache.spark.{SparkConf, SparkEnv}
 import org.apache.spark.internal.Logging
 import org.apache.spark.security.SecurityConfigurationLock
 import org.apache.spark.sql.jdbc.JdbcConnectionProvider
 import org.apache.spark.util.Utils
 
 private[jdbc] object ConnectionProvider extends Logging {
+  private val providerEnabledConfig = "spark.security.jdbc.connection.provider.%s.enabled"

Review comment:
       This kind of pattern `provider.{providerName}.enabled` is used often in other non-SQL codebase e.g.) `"spark.security.credentials.%s.enabled"` at `HadoopDelegationTokenManager`. I think he's more used to this kind of pattern.
   
   Since we're in SQL, maybe it's better to stick to the suggestion above if possible.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-705619907






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] maropu commented on a change in pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #29964:
URL: https://github.com/apache/spark/pull/29964#discussion_r500946426



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/ConnectionProvider.scala
##########
@@ -22,12 +22,15 @@ import java.util.ServiceLoader
 
 import scala.collection.mutable
 
+import org.apache.spark.{SparkConf, SparkEnv}
 import org.apache.spark.internal.Logging
 import org.apache.spark.security.SecurityConfigurationLock
 import org.apache.spark.sql.jdbc.JdbcConnectionProvider
 import org.apache.spark.util.Utils
 
 private[jdbc] object ConnectionProvider extends Logging {
+  private val providerEnabledConfig = "spark.security.jdbc.connection.provider.%s.enabled"

Review comment:
       Why didn't you put this config in `SQLConfig`? We assume users don't care about it? I read the PR description and I thought first we would add a config like `spark.sql.sources.useV1SourceList`;
   ```
   For example;
   
   
     val USE_JDBC_CONN_PROVIDER_LIST = buildConf("spark.sql.sources.useJdbcConnProviderList")
       .internal()
       .doc("...")
       .version("3.1.0")
       .stringConf
       .createWithDefault("db2,mysql,postgresql,...")
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gaborgsomogyi edited a comment on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi edited a comment on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-704861726


   Yeah, here is a separate jira for it: SPARK-31816


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-704816986


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34104/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gaborgsomogyi commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-705629951


   Changed the description too.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-706820897


   I think it looks okay. WDYT @maropu?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-704946407


   **[Test build #129498 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129498/testReport)** for PR 29964 at commit [`80f67b8`](https://github.com/apache/spark/commit/80f67b81910bc3bd8685e36ef289787c759eb438).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-704829693






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on a change in pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on a change in pull request #29964:
URL: https://github.com/apache/spark/pull/29964#discussion_r501786746



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/ConnectionProvider.scala
##########
@@ -46,8 +47,10 @@ private[jdbc] object ConnectionProvider extends Logging {
           logError(s"Failed to load built in provider.", t)
       }
     }
-    // Seems duplicate but it's needed for Scala 2.13
-    providers.toSeq
+
+    val disabledProviders = SQLConf.get.disabledJdbcConnectionProviders.split(",")

Review comment:
       Looks like there's a util for this `Utils.stringToSeq`.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] maropu edited a comment on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
maropu edited a comment on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-704881030


   > BTW, quick question: do you plan to write a doc for JDBC security stuff? It should be good to have it.
   
   +1; it sounds nice.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-705674388


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34166/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-705655849


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34164/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-705772334






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-705674406






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-704793304


   **[Test build #129498 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129498/testReport)** for PR 29964 at commit [`80f67b8`](https://github.com/apache/spark/commit/80f67b81910bc3bd8685e36ef289787c759eb438).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] maropu commented on a change in pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #29964:
URL: https://github.com/apache/spark/pull/29964#discussion_r500952785



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/DB2ConnectionProvider.scala
##########
@@ -28,6 +28,8 @@ import org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions
 private[sql] class DB2ConnectionProvider extends SecureConnectionProvider {
   override val driverClass = "com.ibm.db2.jcc.DB2Driver"
 
+  override def name: String = "db2"

Review comment:
       nit: `def` -> `val`




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-704832289


   Looks good to me otherwise. cc @vanzin, @maropu and @viirya 


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] maropu commented on a change in pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
maropu commented on a change in pull request #29964:
URL: https://github.com/apache/spark/pull/29964#discussion_r500946754



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/ConnectionProvider.scala
##########
@@ -46,8 +49,13 @@ private[jdbc] object ConnectionProvider extends Logging {
           logError(s"Failed to load built in provider.", t)
       }
     }
-    // Seems duplicate but it's needed for Scala 2.13
-    providers.toSeq
+
+    // toSeq seems duplicate but it's needed for Scala 2.13

Review comment:
       Could you move this comment in L58?
   ```
       }.toSeq // toSeq seems duplicate but it's needed for Scala 2.13
   ```




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gaborgsomogyi commented on a change in pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi commented on a change in pull request #29964:
URL: https://github.com/apache/spark/pull/29964#discussion_r501578769



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/ConnectionProvider.scala
##########
@@ -22,12 +22,15 @@ import java.util.ServiceLoader
 
 import scala.collection.mutable
 
+import org.apache.spark.{SparkConf, SparkEnv}
 import org.apache.spark.internal.Logging
 import org.apache.spark.security.SecurityConfigurationLock
 import org.apache.spark.sql.jdbc.JdbcConnectionProvider
 import org.apache.spark.util.Utils
 
 private[jdbc] object ConnectionProvider extends Logging {
+  private val providerEnabledConfig = "spark.security.jdbc.connection.provider.%s.enabled"

Review comment:
       It makes sense to make it SQL like so adapting. What I think however we need blacklist and not whitelist. Having a blacklist doesn't expect any configuration from user when new provider added.

##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/ConnectionProvider.scala
##########
@@ -46,8 +49,13 @@ private[jdbc] object ConnectionProvider extends Logging {
           logError(s"Failed to load built in provider.", t)
       }
     }
-    // Seems duplicate but it's needed for Scala 2.13
-    providers.toSeq
+
+    // toSeq seems duplicate but it's needed for Scala 2.13
+    val sparkConf = SparkEnv.get.conf
+    providers.filter { p =>
+      val key = providerEnabledConfig.format(p.name)
+      sparkConf.getOption(key).forall(_.toBoolean)

Review comment:
       Adding more descriptive message.

##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/ConnectionProvider.scala
##########
@@ -22,12 +22,15 @@ import java.util.ServiceLoader
 
 import scala.collection.mutable
 
+import org.apache.spark.{SparkConf, SparkEnv}
 import org.apache.spark.internal.Logging
 import org.apache.spark.security.SecurityConfigurationLock
 import org.apache.spark.sql.jdbc.JdbcConnectionProvider
 import org.apache.spark.util.Utils
 
 private[jdbc] object ConnectionProvider extends Logging {
+  private val providerEnabledConfig = "spark.security.jdbc.connection.provider.%s.enabled"

Review comment:
       Just changed it.

##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/ConnectionProvider.scala
##########
@@ -46,8 +49,13 @@ private[jdbc] object ConnectionProvider extends Logging {
           logError(s"Failed to load built in provider.", t)
       }
     }
-    // Seems duplicate but it's needed for Scala 2.13
-    providers.toSeq
+
+    // toSeq seems duplicate but it's needed for Scala 2.13

Review comment:
       Fixed.

##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/ConnectionProvider.scala
##########
@@ -46,8 +49,13 @@ private[jdbc] object ConnectionProvider extends Logging {
           logError(s"Failed to load built in provider.", t)
       }
     }
-    // Seems duplicate but it's needed for Scala 2.13
-    providers.toSeq
+
+    // toSeq seems duplicate but it's needed for Scala 2.13
+    val sparkConf = SparkEnv.get.conf
+    providers.filter { p =>
+      val key = providerEnabledConfig.format(p.name)
+      sparkConf.getOption(key).forall(_.toBoolean)

Review comment:
       Code changed so not needed.

##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/DB2ConnectionProvider.scala
##########
@@ -28,6 +28,8 @@ import org.apache.spark.sql.execution.datasources.jdbc.JDBCOptions
 private[sql] class DB2ConnectionProvider extends SecureConnectionProvider {
   override val driverClass = "com.ibm.db2.jcc.DB2Driver"
 
+  override def name: String = "db2"

Review comment:
       Fixed.

##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/ConnectionProvider.scala
##########
@@ -46,8 +47,10 @@ private[jdbc] object ConnectionProvider extends Logging {
           logError(s"Failed to load built in provider.", t)
       }
     }
-    // Seems duplicate but it's needed for Scala 2.13
-    providers.toSeq
+
+    val disabledProviders = SQLConf.get.disabledJdbcConnectionProviders.split(",")

Review comment:
       Changed, however I've taken this solution from existing params. After this I'm going to file a PR to change them.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-704829674


   Kubernetes integration test status success
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34104/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-705655874






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gaborgsomogyi commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-704795754


   I've just update the example connection provider to show the usage: https://github.com/gaborgsomogyi/spark-jdbc-connection-provider


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-705655874






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon closed pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
HyukjinKwon closed pull request #29964:
URL: https://github.com/apache/spark/pull/29964


   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-705619907


   **[Test build #129557 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129557/testReport)** for PR 29964 at commit [`738c90f`](https://github.com/apache/spark/commit/738c90f05b3921c482a66598e879f09b6fd370b4).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gaborgsomogyi commented on a change in pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi commented on a change in pull request #29964:
URL: https://github.com/apache/spark/pull/29964#discussion_r501787919



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/ConnectionProvider.scala
##########
@@ -46,8 +49,13 @@ private[jdbc] object ConnectionProvider extends Logging {
           logError(s"Failed to load built in provider.", t)
       }
     }
-    // Seems duplicate but it's needed for Scala 2.13
-    providers.toSeq
+
+    // toSeq seems duplicate but it's needed for Scala 2.13
+    val sparkConf = SparkEnv.get.conf
+    providers.filter { p =>
+      val key = providerEnabledConfig.format(p.name)
+      sparkConf.getOption(key).forall(_.toBoolean)

Review comment:
       Code changed so not needed.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gaborgsomogyi commented on a change in pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi commented on a change in pull request #29964:
URL: https://github.com/apache/spark/pull/29964#discussion_r501578769



##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/jdbc/connection/ConnectionProvider.scala
##########
@@ -22,12 +22,15 @@ import java.util.ServiceLoader
 
 import scala.collection.mutable
 
+import org.apache.spark.{SparkConf, SparkEnv}
 import org.apache.spark.internal.Logging
 import org.apache.spark.security.SecurityConfigurationLock
 import org.apache.spark.sql.jdbc.JdbcConnectionProvider
 import org.apache.spark.util.Utils
 
 private[jdbc] object ConnectionProvider extends Logging {
+  private val providerEnabledConfig = "spark.security.jdbc.connection.provider.%s.enabled"

Review comment:
       It makes sense to make it SQL like so adapting. What I think however we need blacklist and not whitelist. Having a blacklist doesn't expect any configuration from user when new provider added.




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gaborgsomogyi edited a comment on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi edited a comment on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-708226412


   @HyukjinKwon The main intention is to make the framework as self-serving as possible. I see mainly 2 use-cases (I'm sure others can came up more):
   * The built-in provider works fine for a specific database but the user wants different authentication functionality which is not able to be done with the built-in one. Such case the user implements a new provider but when added then 2 providers will be in place to handle MSSQL authentication (the binding is class name based). Such case the built-in one must be turned off to use the new one.
   * The built-in provider contains a bug for a specific database. Such case full Spark patch build can be provided in normal case but this configuration opens new ways in testing and patching. Namely one can easily send a custom provider to proof/solve provider related issues (of course temporarily).
   
   As a general opinion all the issue solutions and good features must be added upstream but it could take some time so having such flexibility would pay off in terms of reaction time.
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-705766291






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-705645238


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34164/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-705674406


   Merged build finished. Test FAILed.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-705770972


   **[Test build #129560 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129560/testReport)** for PR 29964 at commit [`b6458cf`](https://github.com/apache/spark/commit/b6458cf988f482134925414309956223af4d6739).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-705766291






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-705629938


   **[Test build #129560 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129560/testReport)** for PR 29964 at commit [`b6458cf`](https://github.com/apache/spark/commit/b6458cf988f482134925414309956223af4d6739).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-704947776






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] AmplabJenkins removed a comment on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
AmplabJenkins removed a comment on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-705655874






----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-705656133


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/34166/
   


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA removed a comment on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
SparkQA removed a comment on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-705629938


   **[Test build #129560 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129560/testReport)** for PR 29964 at commit [`b6458cf`](https://github.com/apache/spark/commit/b6458cf988f482134925414309956223af4d6739).


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] gaborgsomogyi commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
gaborgsomogyi commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-705629951


   Changed the description too.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] HyukjinKwon commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
HyukjinKwon commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-708116061


   @gaborgsomogyi, just a question. When do we disable some providers? Would you mind elaborating the usecase?


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] SparkQA commented on pull request #29964: [SPARK-32047][SQL]Add JDBC connection provider disable possibility

Posted by GitBox <gi...@apache.org>.
SparkQA commented on pull request #29964:
URL: https://github.com/apache/spark/pull/29964#issuecomment-705765370


   **[Test build #129557 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/129557/testReport)** for PR 29964 at commit [`738c90f`](https://github.com/apache/spark/commit/738c90f05b3921c482a66598e879f09b6fd370b4).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.


----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org