You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by "anchovYu (via GitHub)" <gi...@apache.org> on 2023/07/05 17:41:54 UTC

[GitHub] [spark] anchovYu commented on a diff in pull request #41864: [SPARK-44059] Add analyzer support of named arguments for built-in functions

anchovYu commented on code in PR #41864:
URL: https://github.com/apache/spark/pull/41864#discussion_r1253421717


##########
common/utils/src/main/resources/error/error-classes.json:
##########
@@ -674,6 +674,12 @@
     ],
     "sqlState" : "23505"
   },
+  "DUPLICATE_ROUTINE_PARAMETER_ASSIGNMENT" : {
+    "message" : [
+      "Call to function <functionName> is invalid because it includes multiple argument assignments to the same name <parameterName>."
+    ],
+    "sqlState" : "42704K"

Review Comment:
   I feel that some of my previous comments are naturally collapsed so you can't see them without expanding. I'll copy those to this new PR.
   
   The error code should be 5 characters according to https://github.com/apache/spark/tree/master/common/utils/src/main/resources/error. 



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/expressions/maskExpressions.scala:
##########
@@ -320,4 +321,18 @@ object Mask {
       case _ => maskedChar(c, maskOther)
     }
   }
+  override def functionSignatures: Seq[FunctionSignature] = {
+    val strArg = NamedArgument("str", FixedArgumentType(StringType))
+    val upperCharArg = NamedArgument("upperChar",

Review Comment:
   nit: For code style could you split each argument on a new line?



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/SupportsNamedArguments.scala:
##########
@@ -0,0 +1,196 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.catalyst.plans.logical
+
+import scala.reflect.ClassTag
+
+import org.apache.spark.sql.catalyst.expressions.{Expression, NamedArgumentExpression}
+import org.apache.spark.sql.errors.QueryCompilationErrors
+import org.apache.spark.sql.types.AbstractDataType
+
+/**
+ * A general trait which is used to identify the DataType of the argument
+ */
+trait NamedArgumentType
+
+/**
+ * The standard case class used to represent a simple data type
+ *
+ * @param dataType The data type of some argument
+ */
+case class FixedArgumentType(dataType: AbstractDataType) extends NamedArgumentType
+
+/**
+ * A named parameter
+ *
+ * @param name     The name of the string.
+ * @param dataType The datatype of the argument.
+ * @param default  The default value of the argument. If the default is none, then that means the
+ *                 argument is required. If no argument is provided, an exception is thrown.
+ */
+case class NamedArgument(name: String,
+    dataType: NamedArgumentType,
+    default: Option[Expression] = None)
+
+/**
+ * Represents a method signature and the list of arguments it receives as input.
+ * Currently, overloads are not supported and only one FunctionSignature is allowed
+ * per function expression.
+ *
+ * @param parameters The list of arguments which the function takes
+ */
+case class FunctionSignature(parameters: Seq[NamedArgument])
+
+/**
+ * The class which companion objects of function expression implement to
+ * support named arguments for that function expression.
+ */
+abstract class SupportsNamedArguments {
+  def functionSignatures: Seq[FunctionSignature]

Review Comment:
   Could you add a comment here noting that now by default only one signature is allowed, with some reasoning?



##########
sql/core/src/test/resources/sql-tests/inputs/named-function-arguments.sql:
##########
@@ -1,5 +1,13 @@
+-- Test for named arguments
 SELECT mask('AbCD123-@$#', lowerChar => 'q', upperChar => 'Q', otherChar => 'o', digitChar => 'd');
 SELECT mask(lowerChar => 'q', upperChar => 'Q', otherChar => 'o', digitChar => 'd', str => 'AbCD123-@$#');
 SELECT mask('AbCD123-@$#', lowerChar => 'q', upperChar => 'Q', digitChar => 'd');
 SELECT mask(lowerChar => 'q', upperChar => 'Q', digitChar => 'd', str => 'AbCD123-@$#');
+-- Unexpected positional argument
 SELECT mask(lowerChar => 'q', 'AbCD123-@$#', upperChar => 'Q', otherChar => 'o', digitChar => 'd');
+-- Duplicate parameter assignment
+SELECT mask('AbCD123-@$#', lowerChar => 'q', upperChar => 'Q', otherChar => 'o', digitChar => 'd', digitChar => 'e');
+-- Required parameter not found
+SELECT mask(lowerChar => 'q', upperChar => 'Q', otherChar => 'o', digitChar => 'd');
+-- Unrecognized parameter name
+SELECT mask('AbCD123-@$#', lowerChar => 'q', upperChar => 'Q', otherChar => 'o', digitChar => 'd', cellular => 'automata');

Review Comment:
   Could you add one more test for `NAMED_ARGUMENTS_NOT_SUPPORTED`?



##########
sql/core/src/test/resources/sql-tests/inputs/named-function-arguments.sql:
##########
@@ -1,5 +1,13 @@
+-- Test for named arguments
 SELECT mask('AbCD123-@$#', lowerChar => 'q', upperChar => 'Q', otherChar => 'o', digitChar => 'd');
 SELECT mask(lowerChar => 'q', upperChar => 'Q', otherChar => 'o', digitChar => 'd', str => 'AbCD123-@$#');
 SELECT mask('AbCD123-@$#', lowerChar => 'q', upperChar => 'Q', digitChar => 'd');
 SELECT mask(lowerChar => 'q', upperChar => 'Q', digitChar => 'd', str => 'AbCD123-@$#');
+-- Unexpected positional argument
 SELECT mask(lowerChar => 'q', 'AbCD123-@$#', upperChar => 'Q', otherChar => 'o', digitChar => 'd');
+-- Duplicate parameter assignment
+SELECT mask('AbCD123-@$#', lowerChar => 'q', upperChar => 'Q', otherChar => 'o', digitChar => 'd', digitChar => 'e');

Review Comment:
   Could you add one more test for duplicated assignments on a positional arg and a named arg?



##########
common/utils/src/main/resources/error/error-classes.json:
##########
@@ -2114,6 +2125,11 @@
     ],
     "sqlState" : "42614"
   },
+  "REQUIRED_PARAMETER_NOT_FOUND" : {
+    "message" : [
+      "Cannot invoke function <functionName> because the parameter named <parameterName> is required, but the function call did not supply a value. Please update the function call to supply an argument value (either positionally or by name) and retry the query again."
+    ]

Review Comment:
   IIRC new error classes better come with sqlState whenever possible. cc @srielau 



##########
sql/core/src/test/resources/sql-tests/inputs/mask-functions.sql:
##########
@@ -55,4 +55,4 @@ SELECT mask(c1, replaceArg) from values('abcd-EFGH-8765-4321', 'a') as t(c1, rep
 SELECT mask(c1, replaceArg) from values('abcd-EFGH-8765-4321', 'ABC') as t(c1, replaceArg);
 SELECT mask(c1, replaceArg) from values('abcd-EFGH-8765-4321', 123) as t(c1, replaceArg);
 SELECT mask('abcd-EFGH-8765-4321', 'A', 'w', '');
-SELECT mask('abcd-EFGH-8765-4321', 'A', 'abc');
\ No newline at end of file
+SELECT mask('abcd-EFGH-8765-4321', 'A', 'abc');

Review Comment:
   Could you revert the changes of this file?



##########
sql/core/src/test/scala/org/apache/spark/sql/CountMinSketchAggQuerySuite.scala:
##########
@@ -61,4 +61,23 @@ class CountMinSketchAggQuerySuite extends QueryTest with SharedSparkSession {
 
     assert(sketch == reference)
   }
+
+  test("count-min sketch with named-arguments") {

Review Comment:
   Shall we add some e2e tests on the function to `sql/core/src/test/resources/sql-tests/inputs/named-function-arguments.sql`?



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/SupportsNamedArguments.scala:
##########
@@ -0,0 +1,196 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.spark.sql.catalyst.plans.logical
+
+import scala.reflect.ClassTag
+
+import org.apache.spark.sql.catalyst.expressions.{Expression, NamedArgumentExpression}
+import org.apache.spark.sql.errors.QueryCompilationErrors
+import org.apache.spark.sql.types.AbstractDataType
+
+/**
+ * A general trait which is used to identify the DataType of the argument
+ */
+trait NamedArgumentType
+
+/**
+ * The standard case class used to represent a simple data type
+ *
+ * @param dataType The data type of some argument
+ */
+case class FixedArgumentType(dataType: AbstractDataType) extends NamedArgumentType
+
+/**
+ * A named parameter
+ *
+ * @param name     The name of the string.
+ * @param dataType The datatype of the argument.
+ * @param default  The default value of the argument. If the default is none, then that means the
+ *                 argument is required. If no argument is provided, an exception is thrown.
+ */
+case class NamedArgument(name: String,

Review Comment:
   ```suggestion
   case class NamedArgument(
       name: String,
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org