Posted to reviews@spark.apache.org by "beliefer (via GitHub)" <gi...@apache.org> on 2024/03/05 03:21:07 UTC

Re: [PR] [SPARK-46989][SQL][CONNECT] Improve concurrency performance for SparkSession [spark]

beliefer commented on code in PR #45046:
URL: https://github.com/apache/spark/pull/45046#discussion_r1482658250


##########
connector/connect/client/jvm/src/main/scala/org/apache/spark/sql/SparkSession.scala:
##########
@@ -854,7 +855,7 @@ object SparkSession extends Logging {
     // the remote() function, as it takes precedence over the SPARK_REMOTE environment variable.
     private val builder = SparkConnectClient.builder().loadFromEnvironment()
     private var client: SparkConnectClient = _
-    private[this] val options = new scala.collection.mutable.HashMap[String, String]
+    private[this] val options = new ConcurrentHashMap[String, String]

Review Comment:
   We cannot assume which threading model the client will adopt.
   Regardless of how much concurrency there is in the user's scenario, a `ConcurrentHashMap` lets us avoid wrapping every access in `synchronized`.
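   For illustration only, a minimal sketch of the pattern this change enables (not the actual SparkSession.Builder code; the object and method names below are hypothetical):

       import java.util.concurrent.ConcurrentHashMap

       object OptionsSketch {
         // Before: a plain mutable HashMap is not thread-safe, so every
         // read/write must be guarded by an external lock.
         private val legacyOptions = new scala.collection.mutable.HashMap[String, String]
         def putLegacy(key: String, value: String): Unit = legacyOptions.synchronized {
           legacyOptions.put(key, value)
         }

         // After: ConcurrentHashMap handles concurrent access internally,
         // so callers can mutate it from any thread without a lock.
         private val options = new ConcurrentHashMap[String, String]
         def put(key: String, value: String): Unit = options.put(key, value)

         def main(args: Array[String]): Unit = {
           put("spark.some.option", "value") // safe from any thread
           println(options.get("spark.some.option"))
         }
       }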



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org