
Posted to reviews@spark.apache.org by "beliefer (via GitHub)" <gi...@apache.org> on 2024/01/26 04:30:34 UTC

[PR] [WIP][SQL] Remove unnecessary synchronized [spark]

beliefer opened a new pull request, #44892:
URL: https://github.com/apache/spark/pull/44892

   ### What changes were proposed in this pull request?
   This PR proposes to remove the unnecessary `synchronized` from `SessionCatalog` and `CatalogManager`.
   
   
   ### Why are the changes needed?
   I investigated and found that two uses of `synchronized` are unnecessary because the returned objects are always different.
   `functionRegistry.lookupFunction` always returns a different `Option` object, even when it looks up the same function.
   `catalogs.keys.toSeq` returns a different object each time too.
   
   
   ### Does this PR introduce _any_ user-facing change?
   'No'.
   
   
   ### How was this patch tested?
   GitHub Actions (GA).
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   'No'.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

Re: [PR] [SPARK-46877][SQL] Remove unnecessary synchronized [spark]

Posted by "beliefer (via GitHub)" <gi...@apache.org>.

beliefer commented on code in PR #44892:
URL: https://github.com/apache/spark/pull/44892#discussion_r1469484842


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogManager.scala:
##########
@@ -138,7 +138,7 @@ class CatalogManager(
   }
 
   def listCatalogs(pattern: Option[String]): Seq[String] = {
-    val allCatalogs = (synchronized(catalogs.keys.toSeq) :+ SESSION_CATALOG_NAME).distinct.sorted
+    val allCatalogs = (catalogs.keys.toSeq :+ SESSION_CATALOG_NAME).distinct.sorted

Review Comment:
   If accesses to `catalogs` need to be synchronized, we should use
   ```
   synchronized(catalogs) {
     catalogs.keys.toSeq
   }
   ```
   instead.
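
The idea of locking on the map itself can be sketched as follows. This is a minimal illustration, not Spark code: the class `NameStore` and its members are hypothetical. It also assumes the usual Scala spelling for locking on a specific object, `obj.synchronized { ... }`, since an unqualified `synchronized(x)` treats `x` as the body rather than as the lock. Whatever lock is chosen, every reader and writer of the map has to use the same one.

```scala
import scala.collection.mutable

// Hypothetical stand-in for a class that guards a mutable map; not Spark's CatalogManager.
class NameStore {
  private val names = mutable.HashMap.empty[String, String]

  // Writers take the monitor of the map itself.
  def register(name: String, impl: String): Unit = names.synchronized {
    names.put(name, impl)
  }

  // Readers take the same monitor, so the key snapshot is never built
  // while another thread is in the middle of a put.
  def list(): Seq[String] = names.synchronized {
    names.keys.toSeq.sorted
  }
}

object NameStoreDemo {
  def main(args: Array[String]): Unit = {
    val store = new NameStore
    store.register("spark_catalog", "impl-a")
    store.register("testcat", "impl-b")
    println(store.list())
  }
}
```

Locking on `this` instead of on the map works equally well, as long as all accesses agree on it.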



Re: [PR] [SPARK-46877][SQL] Remove unnecessary synchronized [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.

cloud-fan commented on code in PR #44892:
URL: https://github.com/apache/spark/pull/44892#discussion_r1469234702


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogManager.scala:
##########
@@ -138,7 +138,7 @@ class CatalogManager(
   }
 
   def listCatalogs(pattern: Option[String]): Seq[String] = {
-    val allCatalogs = (synchronized(catalogs.keys.toSeq) :+ SESSION_CATALOG_NAME).distinct.sorted
+    val allCatalogs = (catalogs.keys.toSeq :+ SESSION_CATALOG_NAME).distinct.sorted

Review Comment:
   Accesses to `catalogs` should be synchronized, or am I missing something?



Re: [PR] [SPARK-46877][SQL] Remove unnecessary synchronized [spark]

Posted by "beliefer (via GitHub)" <gi...@apache.org>.

beliefer closed pull request #44892: [SPARK-46877][SQL] Remove unnecessary synchronized
URL: https://github.com/apache/spark/pull/44892


Re: [PR] [SPARK-46877][SQL] Remove unnecessary synchronized [spark]

Posted by "beliefer (via GitHub)" <gi...@apache.org>.

beliefer commented on PR #44892:
URL: https://github.com/apache/spark/pull/44892#issuecomment-1914688674

   `synchronized(catalogs.keys.toSeq)` uses `this` as the lock.
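
The point above can be checked with a small sketch. In Scala, an unqualified `synchronized(expr)` inside a class is a call to `this.synchronized(expr)`: the argument is the body evaluated under the lock, not the object being locked. The class `Registry` and its members are hypothetical and only mirror the shape of the expression discussed in this thread.

```scala
import scala.collection.mutable

// Hypothetical class for illustration; not Spark code.
class Registry {
  private val entries = mutable.HashMap.empty[String, Int]

  // `synchronized { ... }` is shorthand for `this.synchronized { ... }`.
  def put(name: String, value: Int): Unit = synchronized {
    entries.put(name, value)
  }

  // Same shape as `synchronized(catalogs.keys.toSeq)`: the parenthesized
  // expression is evaluated while holding the monitor of `this`.
  def names: Seq[String] = synchronized(entries.keys.toSeq)

  // Thread.holdsLock reports whether the current thread owns the monitor of `this`.
  def holdsThisLockInside: Boolean = synchronized(Thread.holdsLock(this))

  def holdsThisLockOutside: Boolean = Thread.holdsLock(this)
}

object RegistryDemo {
  def main(args: Array[String]): Unit = {
    val r = new Registry
    r.put("a", 1)
    r.put("b", 2)
    println(r.names.sorted)
    println(s"inside: ${r.holdsThisLockInside}, outside: ${r.holdsThisLockOutside}")
  }
}
```

Under this reading, whether the returned `Seq` is a fresh object on every call has no bearing on the locking, because the returned value is never the lock.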

