Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2020/07/13 13:28:56 UTC

[GitHub] [spark] cloud-fan commented on a change in pull request #28840: [SPARK-31999][SQL] Add REFRESH FUNCTION command

cloud-fan commented on a change in pull request #28840:
URL: https://github.com/apache/spark/pull/28840#discussion_r453643288



##########
File path: docs/sql-ref-syntax-aux-cache-refresh-function.md
##########
@@ -0,0 +1,60 @@
+---
+layout: global
+title: REFRESH FUNCTION
+displayTitle: REFRESH FUNCTION
+license: |
+  Licensed to the Apache Software Foundation (ASF) under one or more
+  contributor license agreements.  See the NOTICE file distributed with
+  this work for additional information regarding copyright ownership.
+  The ASF licenses this file to You under the Apache License, Version 2.0
+  (the "License"); you may not use this file except in compliance with
+  the License.  You may obtain a copy of the License at
+ 
+     http://www.apache.org/licenses/LICENSE-2.0
+ 
+  Unless required by applicable law or agreed to in writing, software
+  distributed under the License is distributed on an "AS IS" BASIS,
+  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+  See the License for the specific language governing permissions and
+  limitations under the License.
+---
+
+### Description
+
+`REFRESH FUNCTION` statement invalidates the cached function entry, which includes a class name
+and resource location of the given function. The invalidated cache is populated right away.
+Note that `REFRESH FUNCTION` only works for permanent functions. Refreshing native functions or temporary functions will cause an exception.
+
+### Syntax
+
+```sql
+REFRESH FUNCTION function_identifier
+```
+
+### Parameters
+
+* **function_identifier**
+
+    Specifies a function name, which is either a qualified or unqualified name. If no database identifier is provided, uses the current database.
+
+    **Syntax:** `[ database_name. ] function_name`
+
+### Examples
+
+```sql
+-- The cached entry of the function will be refreshed
+-- The function is resolved from the current database as the function name is unqualified.
+REFRESH FUNCTION func1;
+
+-- The cached entry of the function will be refreshed
+-- The function is resolved from tempDB database as the function name is qualified.
+REFRESH FUNCTION tempDB.func1;   

Review comment:
       nit: use `db1.func1`? `tempDB` suggests that Spark supports temporary databases, which it doesn't.

##########
File path: sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/LookupCatalog.scala
##########
@@ -155,4 +155,31 @@ private[sql] trait LookupCatalog extends Logging {
         None
     }
   }
+
+  // TODO: move function related v2 statements to the new framework.

Review comment:
       @imback82 do you have time to work on this TODO?

##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
##########
@@ -236,6 +236,45 @@ case class ShowFunctionsCommand(
   }
 }
 
+
+/**
+ * A command for users to refresh the persistent function.
+ * The syntax of using this command in SQL is:
+ * {{{
+ *    REFRESH FUNCTION functionName
+ * }}}
+ */
+case class RefreshFunctionCommand(
+    databaseName: Option[String],
+    functionName: String)
+  extends RunnableCommand {
+
+  override def run(sparkSession: SparkSession): Seq[Row] = {
+    val catalog = sparkSession.sessionState.catalog
+    if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) {
+      throw new AnalysisException(s"Cannot refresh builtin function $functionName")
+    }
+    if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) {
+      throw new AnalysisException(s"Cannot refresh temporary function $functionName")
+    }
+
+    val identifier = FunctionIdentifier(
+      functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase)))
+    // we only refresh the permanent function.
+    // 1. clear cached function.
+    // 2. register function if exists.
+    catalog.unregisterFunction(identifier, true)
+    if (catalog.isPersistentFunction(identifier)) {
+      val func = catalog.getFunctionMetadata(identifier)
+      catalog.registerFunction(func, true)

Review comment:
       Does `registerFunction` overwrite an existing entry? If it does, then we don't need to add the `unregisterFunction` API.
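
To make the question concrete, here is a minimal, self-contained sketch of the semantics being asked about. `OverwriteSketch`, `ToyRegistry`, `lookup`, and the class names used with them are hypothetical stand-ins, not Spark's `SessionCatalog`. The sketch only assumes that registering with an override flag replaces the cached entry; whether the real `registerFunction` behaves this way is what needs confirming.

```scala
import scala.collection.mutable

object OverwriteSketch {
  // Hypothetical stand-in for the function registry; not Spark's API.
  final class ToyRegistry {
    private val cache = mutable.Map.empty[String, String]

    // Assumed semantics: overrideIfExists = true replaces any cached entry,
    // overrideIfExists = false rejects a name that is already cached.
    def registerFunction(name: String, className: String, overrideIfExists: Boolean): Unit = {
      if (cache.contains(name) && !overrideIfExists) {
        throw new IllegalStateException(s"Function $name already exists")
      }
      cache.update(name, className)
    }

    def lookup(name: String): Option[String] = cache.get(name)
  }

  // Refresh without a separate unregister step: the overriding register
  // call alone replaces the stale cached entry.
  def refresh(registry: ToyRegistry, name: String, latestClassName: String): Unit =
    registry.registerFunction(name, latestClassName, overrideIfExists = true)
}
```

Under that assumption, the refresh path for a persistent function needs only the overriding register call, so a new `unregisterFunction` API would not be required for it.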

##########
File path: sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala
##########
@@ -236,6 +236,45 @@ case class ShowFunctionsCommand(
   }
 }
 
+
+/**
+ * A command for users to refresh the persistent function.
+ * The syntax of using this command in SQL is:
+ * {{{
+ *    REFRESH FUNCTION functionName
+ * }}}
+ */
+case class RefreshFunctionCommand(
+    databaseName: Option[String],
+    functionName: String)
+  extends RunnableCommand {
+
+  override def run(sparkSession: SparkSession): Seq[Row] = {
+    val catalog = sparkSession.sessionState.catalog
+    if (FunctionRegistry.builtin.functionExists(FunctionIdentifier(functionName))) {
+      throw new AnalysisException(s"Cannot refresh builtin function $functionName")
+    }
+    if (catalog.isTemporaryFunction(FunctionIdentifier(functionName, databaseName))) {
+      throw new AnalysisException(s"Cannot refresh temporary function $functionName")
+    }
+
+    val identifier = FunctionIdentifier(
+      functionName, Some(databaseName.getOrElse(catalog.getCurrentDatabase)))
+    // we only refresh the permanent function.
+    // 1. clear cached function.
+    // 2. register function if exists.
+    catalog.unregisterFunction(identifier, true)

Review comment:
       Shall we move this into the `if (catalog.isPersistentFunction(identifier))` branch?
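
A small sketch of the two placements, using a toy model rather than Spark's catalog. `UnregisterPlacementSketch`, `ToyCatalog`, its two maps, and both `refresh...` helpers are hypothetical names introduced only for illustration; the model assumes a cached entry can outlive its persistent definition.

```scala
import scala.collection.mutable

object UnregisterPlacementSketch {
  // Hypothetical stand-in for the session catalog; not Spark's API.
  final class ToyCatalog {
    val persistent = mutable.Map.empty[String, String] // stored definitions
    val cache = mutable.Map.empty[String, String]      // cached registry entries

    def isPersistentFunction(name: String): Boolean = persistent.contains(name)
    def unregisterFunction(name: String): Unit = { cache.remove(name); () }
    // Copies the stored definition into the cache, replacing any old entry.
    def registerFunction(name: String): Unit = cache.update(name, persistent(name))
  }

  // Shape in the diff: always clear the cache, then re-register if persistent.
  def refreshUnregisterFirst(c: ToyCatalog, name: String): Unit = {
    c.unregisterFunction(name)
    if (c.isPersistentFunction(name)) c.registerFunction(name)
  }

  // Suggested shape: clear the cache only inside the persistent branch.
  def refreshUnregisterInBranch(c: ToyCatalog, name: String): Unit = {
    if (c.isPersistentFunction(name)) {
      c.unregisterFunction(name)
      c.registerFunction(name)
    }
  }
}
```

In this model the two shapes agree whenever the function is still persistent. They differ only when the definition is gone but a cached entry remains: the first clears it, the second leaves it in place. Whether that case matters depends on what follows the branch, which this hunk does not show.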



