You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "mihailom-db (via GitHub)" <gi...@apache.org> on 2024/02/22 13:55:35 UTC

[PR] [SPARK-47102][SQL] Adding COLLATION_ENABLED config [WIP] [spark]

mihailom-db opened a new pull request, #45218:
URL: https://github.com/apache/spark/pull/45218

   ### What changes were proposed in this pull request?
   This PR adds COLLATION_ENABLED config to `SQLConf` and introduces new error class `COLLATION_SUPPORT_DISABLED` to appropriately report error on usage of feature under development. 
   
   ### Why are the changes needed?
   We want to make collations configurable on this some flag. These changes disable usage of `collate` and `collation` functions, along with any `COLLATE` syntax when the flag is set to false. By default, the flag is set to false.
   
   
   ### Does this PR introduce _any_ user-facing change?
   Yes. It introduces new error along with an appropriate message.
   
   
   ### How was this patch tested?
   ```
   ./build/mvn -Dtest=none -DwildcardSuites=org.apache.spark.sql.errors.QueryCompilationErrorsSuite test
   ./build/mvn -Dtest=none -DwildcardSuites=org.apache.spark.sql.catalyst.expressions.CollationExpressionSuite test
   ./build/mvn -Dtest=none -DwildcardSuites=org.apache.spark.sql.CollationSuite test
   ```
   
   
   ### Was this patch authored or co-authored using generative AI tooling?
   No.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47102][SQL][COLLATION] Adding COLLATION_ENABLED config [spark]

Posted by "MaxGekk (via GitHub)" <gi...@apache.org>.
MaxGekk commented on code in PR #45218:
URL: https://github.com/apache/spark/pull/45218#discussion_r1502950928


##########
common/utils/src/main/resources/error/error-classes.json:
##########
@@ -469,6 +469,12 @@
     ],
     "sqlState" : "42704"
   },
+  "COLLATION_SUPPORT_NOT_ENABLED" : {

Review Comment:
   Shouldn't we add a sub-class of `UNSUPPORTED_FEATURE`?



##########
common/utils/src/main/resources/error/error-classes.json:
##########
@@ -469,6 +469,12 @@
     ],
     "sqlState" : "42704"
   },
+  "COLLATION_SUPPORT_NOT_ENABLED" : {
+    "message" : [
+      "Collation support is under development and not all features are supported. To enable existing features set `spark.sql.collation.enabled` to `true`."

Review Comment:
   Replace `spark.sql.collation.enabled` by <config>, see other items in the file.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47102][SQL] Adding COLLATION_ENABLED config [spark]

Posted by "dbatomic (via GitHub)" <gi...@apache.org>.
dbatomic commented on code in PR #45218:
URL: https://github.com/apache/spark/pull/45218#discussion_r1499518489


##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollationExpressionSuite.scala:
##########
@@ -19,59 +19,75 @@ package org.apache.spark.sql.catalyst.expressions
 
 import org.apache.spark.{SparkException, SparkFunSuite}
 import org.apache.spark.sql.catalyst.util.CollationFactory
+import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.types._
 
 class CollationExpressionSuite extends SparkFunSuite with ExpressionEvalHelper {
   test("validate default collation") {
-    val collationId = CollationFactory.collationNameToId("UCS_BASIC")
-    assert(collationId == 0)
-    val collateExpr = Collate(Literal("abc"), "UCS_BASIC")
-    assert(collateExpr.dataType === StringType(collationId))
-    collateExpr.dataType.asInstanceOf[StringType].collationId == 0
-    checkEvaluation(collateExpr, "abc")
+    withSQLConf(SQLConf.COLLATION_ENABLED.key -> "true") {

Review Comment:
   Can we avoid adding `withSQLConf` to every test?
   Instead, maybe we can use `beforeAll` override and add `withSparkEnvConfs` there?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47102][SQL] Adding COLLATION_ENABLED config [spark]

Posted by "mkaravel (via GitHub)" <gi...@apache.org>.
mkaravel commented on code in PR #45218:
URL: https://github.com/apache/spark/pull/45218#discussion_r1500231222


##########
common/utils/src/main/resources/error/error-classes.json:
##########
@@ -463,6 +463,12 @@
     ],
     "sqlState" : "42704"
   },
+  "COLLATION_SUPPORT_DISABLED" : {
+    "message" : [
+      "Collation feature is under development and not all features are supported. To enable existing features toggle `COLLATION_ENABLED` flag to true."

Review Comment:
   ```suggestion
         "Collation support is under development and not all features are supported. To enable existing features set `spark.sql.collation.enabled` to `true`."
   ```



##########
common/utils/src/main/resources/error/error-classes.json:
##########
@@ -463,6 +463,12 @@
     ],
     "sqlState" : "42704"
   },
+  "COLLATION_SUPPORT_DISABLED" : {
+    "message" : [
+      "Collation feature is under development and not all features are supported. To enable existing features toggle `COLLATION_ENABLED` flag to true."

Review Comment:
   Out of curiosity: do we have or want any mechanism to disallow the enablement of the feature?



##########
common/utils/src/main/resources/error/error-classes.json:
##########
@@ -463,6 +463,12 @@
     ],
     "sqlState" : "42704"
   },
+  "COLLATION_SUPPORT_DISABLED" : {

Review Comment:
   How about renaming this to `COLLATION_SUPPORT_NOT_ENABLED`?
   
   Somehow it feels better: we want to convey the information that the feature is not enabled, rather than that it is disabled.



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala:
##########
@@ -289,14 +289,21 @@ private[sql] object QueryCompilationErrors extends QueryErrorsBase with Compilat
   }
 
   def namedArgumentsNotEnabledError(functionName: String, argumentName: String): Throwable = {
-    new AnalysisException(
+    new SparkUnsupportedOperationException(
       errorClass = "NAMED_PARAMETER_SUPPORT_DISABLED",
       messageParameters = Map(
         "functionName" -> toSQLId(functionName),
         "argument" -> toSQLId(argumentName))
     )
   }
 
+  def collationDisabledError(): Throwable = {

Review Comment:
   In the same spirit as my previous comment, I would rename this to `collationNotEnabledError`.



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala:
##########
@@ -289,14 +289,21 @@ private[sql] object QueryCompilationErrors extends QueryErrorsBase with Compilat
   }
 
   def namedArgumentsNotEnabledError(functionName: String, argumentName: String): Throwable = {
-    new AnalysisException(
+    new SparkUnsupportedOperationException(

Review Comment:
   Why are we changing this from an `AnalysisException` to a `SparkUnsupportedOperationException`?
   
   Another way to ask kind of the same thing: why are we not using `SparkUnsupportedOperationException` in OSS?



##########
docs/sql-error-conditions.md:
##########
@@ -386,6 +386,12 @@ For more details see [CODEC_NOT_AVAILABLE](sql-error-conditions-codec-not-availa
 
 Cannot find a short name for the codec `<codecName>`.
 
+### COLLATION_SUPPORT_DISABLED
+
+[SQLSTATE: 0A000](sql-error-conditions-sqlstates.html#class-0A-feature-not-supported)
+
+Collation feature is under development and not all features are supported. To enable existing features toggle `COLLATION_ENABLED` flag to true.

Review Comment:
   ```suggestion
   Collation support is under development and not all features are supported. To enable existing features set `spark.sql.collation.enabled` to `true`.
   ```



##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollationExpressionSuite.scala:
##########
@@ -19,59 +19,75 @@ package org.apache.spark.sql.catalyst.expressions
 
 import org.apache.spark.{SparkException, SparkFunSuite}
 import org.apache.spark.sql.catalyst.util.CollationFactory
+import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.types._
 
 class CollationExpressionSuite extends SparkFunSuite with ExpressionEvalHelper {
   test("validate default collation") {
-    val collationId = CollationFactory.collationNameToId("UCS_BASIC")
-    assert(collationId == 0)
-    val collateExpr = Collate(Literal("abc"), "UCS_BASIC")
-    assert(collateExpr.dataType === StringType(collationId))
-    collateExpr.dataType.asInstanceOf[StringType].collationId == 0
-    checkEvaluation(collateExpr, "abc")
+    withSQLConf(SQLConf.COLLATION_ENABLED.key -> "true") {

Review Comment:
   +1



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collationExpressions.scala:
##########
@@ -67,6 +68,9 @@ object CollateExpressionBuilder extends ExpressionBuilder {
  */
 case class Collate(child: Expression, collationName: String)
   extends UnaryExpression with ExpectsInputTypes {
+  if (!SQLConf.get.collationEnabled) {

Review Comment:
   +1



##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollationExpressionSuite.scala:
##########
@@ -19,59 +19,75 @@ package org.apache.spark.sql.catalyst.expressions
 
 import org.apache.spark.{SparkException, SparkFunSuite}
 import org.apache.spark.sql.catalyst.util.CollationFactory
+import org.apache.spark.sql.internal.SQLConf
 import org.apache.spark.sql.types._
 
 class CollationExpressionSuite extends SparkFunSuite with ExpressionEvalHelper {
   test("validate default collation") {
-    val collationId = CollationFactory.collationNameToId("UCS_BASIC")
-    assert(collationId == 0)
-    val collateExpr = Collate(Literal("abc"), "UCS_BASIC")
-    assert(collateExpr.dataType === StringType(collationId))
-    collateExpr.dataType.asInstanceOf[StringType].collationId == 0
-    checkEvaluation(collateExpr, "abc")
+    withSQLConf(SQLConf.COLLATION_ENABLED.key -> "true") {

Review Comment:
   Another option is to enable the feature for testing purposes and disable it otherwise. I believe (not 100% sure) that this is doable if we define the config as:
   ```scala
   val COLLATION_ENABLED =
       buildConf("spark.sql.collation.enabled")
         .doc("Collations feature is under development and its use should be done under this" +
           "feature flag.")
         .version("4.0.0")
         .booleanConf
         .createWithDefault(Utils.isTesting)
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47102][SQL][COLLATION] Adding COLLATION_ENABLED config [spark]

Posted by "mihailom-db (via GitHub)" <gi...@apache.org>.
mihailom-db commented on PR #45218:
URL: https://github.com/apache/spark/pull/45218#issuecomment-1963490566

   PR should be ready now, but until I get GitHub Actions fixed for my account, CI runs can't be run before merging. Will ping when that gets fixed.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47102][SQL][COLLATION] Adding COLLATION_ENABLED config [spark]

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #45218:
URL: https://github.com/apache/spark/pull/45218#discussion_r1504403020


##########
common/utils/src/main/resources/error/error-classes.json:
##########
@@ -463,6 +463,12 @@
     ],
     "sqlState" : "42704"
   },
+  "COLLATION_SUPPORT_DISABLED" : {
+    "message" : [
+      "Collation feature is under development and not all features are supported. To enable existing features toggle `COLLATION_ENABLED` flag to true."

Review Comment:
   Preventing the use of a partially baked feature is a good reason here.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47102][SQL] Adding COLLATION_ENABLED config [spark]

Posted by "mihailom-db (via GitHub)" <gi...@apache.org>.
mihailom-db commented on code in PR #45218:
URL: https://github.com/apache/spark/pull/45218#discussion_r1500370094


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala:
##########
@@ -289,14 +289,21 @@ private[sql] object QueryCompilationErrors extends QueryErrorsBase with Compilat
   }
 
   def namedArgumentsNotEnabledError(functionName: String, argumentName: String): Throwable = {
-    new AnalysisException(
+    new SparkUnsupportedOperationException(

Review Comment:
   It slipped in, I was thinking whether my exception should be one or the other, so I accidentally changed this.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47102][SQL] Adding COLLATION_ENABLED config [spark]

Posted by "mihailom-db (via GitHub)" <gi...@apache.org>.
mihailom-db commented on code in PR #45218:
URL: https://github.com/apache/spark/pull/45218#discussion_r1500269743


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collationExpressions.scala:
##########
@@ -67,6 +68,9 @@ object CollateExpressionBuilder extends ExpressionBuilder {
  */
 case class Collate(child: Expression, collationName: String)
   extends UnaryExpression with ExpectsInputTypes {
+  if (!SQLConf.get.collationEnabled) {

Review Comment:
   I firstly did this, but changed it back, since visitCollate calls Collate() and I thought it would be double guard. Do we want to do double guard in this case?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47102][SQL] Adding COLLATION_ENABLED config [spark]

Posted by "mihailom-db (via GitHub)" <gi...@apache.org>.
mihailom-db commented on code in PR #45218:
URL: https://github.com/apache/spark/pull/45218#discussion_r1500368655


##########
common/utils/src/main/resources/error/error-classes.json:
##########
@@ -463,6 +463,12 @@
     ],
     "sqlState" : "42704"
   },
+  "COLLATION_SUPPORT_DISABLED" : {

Review Comment:
   Yeah, makes more sense.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47102][SQL] Adding COLLATION_ENABLED config [spark]

Posted by "dbatomic (via GitHub)" <gi...@apache.org>.
dbatomic commented on code in PR #45218:
URL: https://github.com/apache/spark/pull/45218#discussion_r1499504129


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collationExpressions.scala:
##########
@@ -67,6 +68,9 @@ object CollateExpressionBuilder extends ExpressionBuilder {
  */
 case class Collate(child: Expression, collationName: String)
   extends UnaryExpression with ExpectsInputTypes {
+  if (!SQLConf.get.collationEnabled) {

Review Comment:
   Can you also disable it in `visitCollate` in `AstBuilder.scala`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47102][SQL][COLLATION] Adding COLLATION_ENABLED config [spark]

Posted by "mihailom-db (via GitHub)" <gi...@apache.org>.
mihailom-db commented on PR #45218:
URL: https://github.com/apache/spark/pull/45218#issuecomment-1966490879

   Switched to https://github.com/apache/spark/pull/45285 as GitHub Actions are not enabled on my account. Please provide review on that PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47102][SQL] Adding COLLATION_ENABLED config [spark]

Posted by "mkaravel (via GitHub)" <gi...@apache.org>.
mkaravel commented on PR #45218:
URL: https://github.com/apache/spark/pull/45218#issuecomment-1960772736

   One more thing: it would be nice to add the `[Collation]` tag in the title of the PR.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47102][SQL][COLLATION] Adding COLLATION_ENABLED config [spark]

Posted by "dbatomic (via GitHub)" <gi...@apache.org>.
dbatomic commented on code in PR #45218:
URL: https://github.com/apache/spark/pull/45218#discussion_r1500450911


##########
common/utils/src/main/resources/error/error-classes.json:
##########
@@ -463,6 +463,12 @@
     ],
     "sqlState" : "42704"
   },
+  "COLLATION_SUPPORT_DISABLED" : {
+    "message" : [
+      "Collation feature is under development and not all features are supported. To enable existing features toggle `COLLATION_ENABLED` flag to true."

Review Comment:
   This is a good question. I am not sure what is the policy here - @cloud-fan and @MaxGekk - can you guide us here?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47102][SQL] Adding COLLATION_ENABLED config [spark]

Posted by "mkaravel (via GitHub)" <gi...@apache.org>.
mkaravel commented on code in PR #45218:
URL: https://github.com/apache/spark/pull/45218#discussion_r1500233180


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala:
##########
@@ -289,14 +289,21 @@ private[sql] object QueryCompilationErrors extends QueryErrorsBase with Compilat
   }
 
   def namedArgumentsNotEnabledError(functionName: String, argumentName: String): Throwable = {
-    new AnalysisException(
+    new SparkUnsupportedOperationException(

Review Comment:
   Why are we changing this from an `AnalysisException` to a `SparkUnsupportedOperationException`?



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47102][SQL][COLLATION] Adding COLLATION_ENABLED config [spark]

Posted by "dbatomic (via GitHub)" <gi...@apache.org>.
dbatomic commented on code in PR #45218:
URL: https://github.com/apache/spark/pull/45218#discussion_r1500454552


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/collationExpressions.scala:
##########
@@ -67,6 +68,9 @@ object CollateExpressionBuilder extends ExpressionBuilder {
  */
 case class Collate(child: Expression, collationName: String)
   extends UnaryExpression with ExpectsInputTypes {
+  if (!SQLConf.get.collationEnabled) {

Review Comment:
   Yeah, I guess that this is fine. In visitCollate we do very little so I guess raising an error in expression is ok.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47102][SQL][COLLATION] Adding COLLATION_ENABLED config [spark]

Posted by "dbatomic (via GitHub)" <gi...@apache.org>.
dbatomic commented on code in PR #45218:
URL: https://github.com/apache/spark/pull/45218#discussion_r1500455827


##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CollationExpressionSuite.scala:
##########
@@ -59,7 +59,9 @@ class CollationExpressionSuite extends SparkFunSuite with ExpressionEvalHelper {
 
   test("collate on non existing collation") {
     checkError(
-      exception = intercept[SparkException] { Collate(Literal("abc"), "UCS_BASIS") },
+      exception = intercept[SparkException] {

Review Comment:
   To make your life easier when merging the change further on, please avoid needless changes :)



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47102][SQL] Adding COLLATION_ENABLED config [spark]

Posted by "mihailom-db (via GitHub)" <gi...@apache.org>.
mihailom-db commented on code in PR #45218:
URL: https://github.com/apache/spark/pull/45218#discussion_r1500315583


##########
common/utils/src/main/resources/error/error-classes.json:
##########
@@ -463,6 +463,12 @@
     ],
     "sqlState" : "42704"
   },
+  "COLLATION_SUPPORT_DISABLED" : {
+    "message" : [
+      "Collation feature is under development and not all features are supported. To enable existing features toggle `COLLATION_ENABLED` flag to true."

Review Comment:
   If I understood correctly, we want to make sure that customer is prevented from trying to use partially built feature and getting in trouble because some functionality was not implemented at the time.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47102][SQL][COLLATION] Adding COLLATION_ENABLED config [spark]

Posted by "dbatomic (via GitHub)" <gi...@apache.org>.
dbatomic commented on code in PR #45218:
URL: https://github.com/apache/spark/pull/45218#discussion_r1500456651


##########
sql/core/src/test/scala/org/apache/spark/sql/errors/QueryCompilationErrorsSuite.scala:
##########
@@ -964,6 +964,18 @@ class QueryCompilationErrorsSuite
         "className" -> "org.apache.spark.sql.catalyst.expressions.UnsafeRow"))
   }
 
+  test("SPARK-47102: Collation query when COLLATION_ENABLED is false") {
+    withSQLConf(SQLConf.COLLATION_ENABLED.key -> "false") {
+      checkError(
+        exception = intercept[AnalysisException] {

Review Comment:
   Please add a test for `collation` expression as well.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47102][SQL] Adding COLLATION_ENABLED config [spark]

Posted by "dbatomic (via GitHub)" <gi...@apache.org>.
dbatomic commented on code in PR #45218:
URL: https://github.com/apache/spark/pull/45218#discussion_r1499498320


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -4962,6 +4970,9 @@ class SQLConf extends Serializable with Logging with SqlApiConf {
     }
   }
 
+  def collationEnabled: Boolean = {
+    getConf(COLLATION_ENABLED)
+  }

Review Comment:
   nit: empty line here



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


Re: [PR] [SPARK-47102][SQL][COLLATION] Adding COLLATION_ENABLED config [spark]

Posted by "MaxGekk (via GitHub)" <gi...@apache.org>.
MaxGekk closed pull request #45218: [SPARK-47102][SQL][COLLATION] Adding COLLATION_ENABLED config
URL: https://github.com/apache/spark/pull/45218


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org