You are viewing a plain text version of this content. The canonical link for it is here.
Posted to reviews@spark.apache.org by gatorsmile <gi...@git.apache.org> on 2017/05/30 05:02:25 UTC

[GitHub] spark pull request #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as fun...

GitHub user gatorsmile opened a pull request:

    https://github.com/apache/spark/pull/18142

    [SPARK-20918] [SQL] Use FunctionIdentifier as function identifiers in FunctionRegistry

    ### What changes were proposed in this pull request?
    Currently, the unquoted string of a function identifier is being used as the function identifier in the function registry. This could cause the incorrect the behavior when users use `.` in the function names. This PR is to take the `FunctionIdentifier` as the identifier in the function registry. 
    
    ### How was this patch tested?
    TODO: add extra test cases to verify the inclusive bug fixes. 

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/gatorsmile/spark fuctionRegistry

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/18142.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #18142
    
----
commit a374e9f14fd2486bb8a77c24d9ff8c3aa12d7bd4
Author: Xiao Li <ga...@gmail.com>
Date:   2017-05-30T04:38:18Z

    fix.

commit 201787f7f01cadf21ca1f9c30304aa4a26af8226
Author: Xiao Li <ga...@gmail.com>
Date:   2017-05-30T04:51:06Z

    fix.

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    @cloud-fan, should we update migration guide as well?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as fun...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18142#discussion_r119167774
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
    @@ -1152,27 +1151,27 @@ class SessionCatalog(
         // Note: the implementation of this function is a little bit convoluted.
         // We probably shouldn't use a single FunctionRegistry to register all three kinds of functions
         // (built-in, temp, and external).
    -    if (name.database.isEmpty && functionRegistry.functionExists(name.funcName)) {
    +    if (name.database.isEmpty && functionRegistry.functionExists(name)) {
           // This function has been already loaded into the function registry.
    -      return functionRegistry.lookupFunction(name.funcName, children)
    +      return functionRegistry.lookupFunction(name, children)
         }
     
         // If the name itself is not qualified, add the current database to it.
    -    val database = name.database.orElse(Some(currentDb)).map(formatDatabaseName)
    -    val qualifiedName = name.copy(database = database)
    +    val database = formatDatabaseName(name.database.getOrElse(getCurrentDatabase))
    +    val qualifiedName = name.copy(database = Some(database))
     
    -    if (functionRegistry.functionExists(qualifiedName.unquotedString)) {
    +    if (functionRegistry.functionExists(qualifiedName)) {
           // This function has been already loaded into the function registry.
           // Unlike the above block, we find this function by using the qualified name.
    -      return functionRegistry.lookupFunction(qualifiedName.unquotedString, children)
    +      return functionRegistry.lookupFunction(qualifiedName, children)
         }
     
         // The function has not been loaded to the function registry, which means
         // that the function is a permanent function (if it actually has been registered
         // in the metastore). We need to first put the function in the FunctionRegistry.
         // TODO: why not just check whether the function exists first?
         val catalogFunction = try {
    -      externalCatalog.getFunction(currentDb, name.funcName)
    --- End diff --
    
    Submitted a separate PR #18146 for easily backporting to the previous branch.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    **[Test build #77825 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77825/testReport)** for PR 18142 at commit [`5635c27`](https://github.com/apache/spark/commit/5635c272862be6a8f769cbe751cccc698c35a78e).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    Guys - please in the future separate bug fixes with refactoring. Don't mix a bunch of cosmetic changes with actual bug fixes together.



---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    > BTW, I believe there's no particular standard for backticks themselves since different DBMS uses different backtick implementations.
    
    You are right, but SQL standard does define how to quote identifiers, by using `"xyz"`. Databases usually support one more syntax to quote identifiers, e.g. backticks, square brackets, etc.
    
    So for this case, I think it's obvious that users want to quote the function name, and we have a bug.
    
    > One explicit problem here is, we claim Hive compatibility in Spark.
    
    Do we? Sometimes we follow hive behaviors, but we never guarantee that IIRC.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as fun...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/18142


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77815/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    > Spark SQL is designed to be compatible with the Hive Metastore, SerDes and UDFs.
    
    This is different from `Spark can run any Hive SQL`. Spark can load and use Hive UDFs, with the right SQL syntax.
    
    Anyway I'm happy to see a PR to note it. It's good to be verbose in the doc. But I won't treat it a must-have.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    retest this please


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as fun...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18142#discussion_r120961406
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala ---
    @@ -61,7 +61,7 @@ class UDFRegistration private[sql] (functionRegistry: FunctionRegistry) extends
             | dataType: ${udf.dataType}
           """.stripMargin)
     
    -    functionRegistry.registerFunction(name, udf.builder)
    +    functionRegistry.createOrReplaceTempFunction(name, udf.builder)
    --- End diff --
    
    Our current register function APIs in UDFRegistration does not allow users to specify the database names as part of `name`, since the registered functions are temporary.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by maropu <gi...@git.apache.org>.
Github user maropu commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    @gatorsmile btw, we've already file a lira to track these kinds of all the SQL-compiliant issues? not yet?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    We do not need to follow Hive if Hive does not follow SQL compliance. Our main goal is to follow the mainstream DBMS vendors. 
    
    BTW, we can enhance our parser to recognize the other symbols (e.g., double quotes) as the quotes instead of forcing users to choose backtick. cc @maropu 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    > This clearly violates the SQL semantic: the string inside backticks should be treated as a string literal.
    
    BTW, I believe there's no particular standard for backticks themselves since different DBMS uses different backtick implementations.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    @HyukjinKwon Thanks for the note! I think this behavior is better, I'm adding a `release_note` tag to the JIRA ticket, so that we don't forget to mention it in release notes.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    Yea, I didn't mean it super seriously @cloud-fan - I just left a comment in case for a better documentation since I see many users go from Hive to Spark.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as fun...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18142#discussion_r121177648
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
    @@ -1205,8 +1204,8 @@ class SessionCatalog(
         requireDbExists(dbName)
         val dbFunctions = externalCatalog.listFunctions(dbName, pattern).map { f =>
           FunctionIdentifier(f, Some(dbName)) }
    -    val loadedFunctions =
    -      StringUtils.filterPattern(functionRegistry.listFunction(), pattern).map { f =>
    +    val loadedFunctions = StringUtils
    +      .filterPattern(functionRegistry.listFunction().map(_.unquotedString), pattern).map { f =>
    --- End diff --
    
    we can fix it as a follow-up


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    LGTM, merging to master!


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    **[Test build #77538 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77538/testReport)** for PR 18142 at commit [`e8a534a`](https://github.com/apache/spark/commit/e8a534a382955c181acf1759bc2f44ee6e98aacd).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    After a second thought, isn't it a bug?
    ```
    hive> SELECT `d100.udf100`(`emp`.`name`) FROM `emp`;
    USER
    ```
    
    This clearly violates the SQL semantic: the string inside backticks should be treated as a string literal. I think we should update the JIRA ticket to explain this bug, but we don't need to put it in a release notes or migration guide.
    
    I'll update the ticket.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    I agree with this change too for clarification.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    **[Test build #77519 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77519/testReport)** for PR 18142 at commit [`201787f`](https://github.com/apache/spark/commit/201787f7f01cadf21ca1f9c30304aa4a26af8226).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    I mean https://spark.apache.org/docs/latest/sql-programming-guide.html#supported-hive-features and https://spark.apache.org/docs/latest/sql-programming-guide.html#unsupported-hive-functionality
    
    The issue is also related with VIEW support as well. Should be good to note.
    
    I don't mean we should necessarily follow Hive's behaviour for clarification .. Also, I agree with this change ..


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77519/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    **[Test build #77602 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77602/testReport)** for PR 18142 at commit [`794de15`](https://github.com/apache/spark/commit/794de15e594d19dc069546d6d071c10df7c3a40a).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as fun...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18142#discussion_r119013261
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
    @@ -1205,8 +1204,8 @@ class SessionCatalog(
         requireDbExists(dbName)
         val dbFunctions = externalCatalog.listFunctions(dbName, pattern).map { f =>
           FunctionIdentifier(f, Some(dbName)) }
    -    val loadedFunctions =
    -      StringUtils.filterPattern(functionRegistry.listFunction(), pattern).map { f =>
    +    val loadedFunctions = StringUtils
    +      .filterPattern(functionRegistry.listFunction().map(_.unquotedString), pattern).map { f =>
    --- End diff --
    
    This PR keeps the current behavior. However, I think it is also a bug. The user-specified `pattern` should not consider the database name. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as fun...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18142#discussion_r119012967
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala ---
    @@ -72,39 +89,53 @@ trait FunctionRegistry {
     
     class SimpleFunctionRegistry extends FunctionRegistry {
     
    -  protected val functionBuilders =
    -    StringKeyHashMap[(ExpressionInfo, FunctionBuilder)](caseSensitive = false)
    --- End diff --
    
    This has a bug. The database name could be case sensitive. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    Merged build finished. Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    One explicit problem here is, we claim Hive compatibility in Spark. The difference should be explained when we are clear on this.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    
    
    I am leaving a note (at least to myself) since it looked anyhow caused behaviour change.
    
    
    With this statements in Hive side:
    
    ```sql
    CREATE TABLE emp AS SELECT 'user' AS name, 'address' as address;
    CREATE DATABASE d100;
    CREATE FUNCTION d100.udf100 AS 'org.apache.hadoop.hive.ql.udf.generic.GenericUDFUpper';
    ```
    
    
    **Hive**
    
    
    ```
    hive> SELECT d100.udf100(`emp`.`name`) FROM `emp`;
    USER
    hive> SELECT `d100.udf100`(`emp`.`name`) FROM `emp`;
    USER
    ```
    
    
    **Spark**
    
    Before:
    
    ```
    scala> spark.sql("SELECT d100.udf100(`emp`.`name`) FROM `emp`").show
    +-----------------+
    |d100.udf100(name)|
    +-----------------+
    |             USER|
    +-----------------+
    
    
    scala> spark.sql("SELECT `d100.udf100`(`emp`.`name`) FROM `emp`").show
    +-----------------+
    |d100.udf100(name)|
    +-----------------+
    |             USER|
    +-----------------+
    ```
    
    After:
    
    ```
    scala> spark.sql("SELECT d100.udf100(`emp`.`name`) FROM `emp`").show
    +-----------------+
    |d100.udf100(name)|
    +-----------------+
    |             USER|
    +-----------------+
    
    
    scala> spark.sql("SELECT `d100.udf100`(`emp`.`name`) FROM `emp`").show
    org.apache.spark.sql.AnalysisException: Undefined function: 'd100.udf100'. This function is neither a registered temporary function nor a permanent function registered in the database 'default'.; line 1 pos 7
    ```
    
    
    **MySQL**
    
    This change causes a inconsistency with Hive although looks consistent compating to MySQL.
    
    ```
    mysql> SELECT `d100.udf100`(`emp`.`name`) FROM `emp`;
    ERROR 1305 (42000): FUNCTION hkwon.d100.udf100 does not exist
    mysql> SELECT d100.udf100(`emp`.`name`) FROM `emp`;
    +---------------------------+
    | d100.udf100(`emp`.`name`) |
    +---------------------------+
    | Hello, user!              |
    +---------------------------+
    1 row in set (0.01 sec)
    ```



---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    **[Test build #77520 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77520/testReport)** for PR 18142 at commit [`3f253f3`](https://github.com/apache/spark/commit/3f253f37b1660a6f69376986b470213c38c10cc6).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as fun...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18142#discussion_r119013093
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
    @@ -1116,8 +1115,8 @@ class SessionCatalog(
         // TODO: just make function registry take in FunctionIdentifier instead of duplicating this
         val database = name.database.orElse(Some(currentDb)).map(formatDatabaseName)
         val qualifiedName = name.copy(database = database)
    -    functionRegistry.lookupFunction(name.funcName)
    --- End diff --
    
    This also sounds a bug. This line ignores the database name. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as fun...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18142#discussion_r119959289
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala ---
    @@ -72,39 +95,53 @@ trait FunctionRegistry {
     
     class SimpleFunctionRegistry extends FunctionRegistry {
     
    -  protected val functionBuilders =
    -    StringKeyHashMap[(ExpressionInfo, FunctionBuilder)](caseSensitive = false)
    +  @GuardedBy("this")
    +  private val functionBuilders =
    +    new mutable.HashMap[FunctionIdentifier, (ExpressionInfo, FunctionBuilder)]
    +
    +  // Resolution of the function name is always case insensitive, but the database name
    --- End diff --
    
    this looks weird, database name is always case sensitive and function name is always case insenstive?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    **[Test build #77825 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77825/testReport)** for PR 18142 at commit [`5635c27`](https://github.com/apache/spark/commit/5635c272862be6a8f769cbe751cccc698c35a78e).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77825/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as fun...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18142#discussion_r119159618
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala ---
    @@ -1152,27 +1151,27 @@ class SessionCatalog(
         // Note: the implementation of this function is a little bit convoluted.
         // We probably shouldn't use a single FunctionRegistry to register all three kinds of functions
         // (built-in, temp, and external).
    -    if (name.database.isEmpty && functionRegistry.functionExists(name.funcName)) {
    +    if (name.database.isEmpty && functionRegistry.functionExists(name)) {
           // This function has been already loaded into the function registry.
    -      return functionRegistry.lookupFunction(name.funcName, children)
    +      return functionRegistry.lookupFunction(name, children)
         }
     
         // If the name itself is not qualified, add the current database to it.
    -    val database = name.database.orElse(Some(currentDb)).map(formatDatabaseName)
    -    val qualifiedName = name.copy(database = database)
    +    val database = formatDatabaseName(name.database.getOrElse(getCurrentDatabase))
    +    val qualifiedName = name.copy(database = Some(database))
     
    -    if (functionRegistry.functionExists(qualifiedName.unquotedString)) {
    +    if (functionRegistry.functionExists(qualifiedName)) {
           // This function has been already loaded into the function registry.
           // Unlike the above block, we find this function by using the qualified name.
    -      return functionRegistry.lookupFunction(qualifiedName.unquotedString, children)
    +      return functionRegistry.lookupFunction(qualifiedName, children)
         }
     
         // The function has not been loaded to the function registry, which means
         // that the function is a permanent function (if it actually has been registered
         // in the metastore). We need to first put the function in the FunctionRegistry.
         // TODO: why not just check whether the function exists first?
         val catalogFunction = try {
    -      externalCatalog.getFunction(currentDb, name.funcName)
    --- End diff --
    
    This is another bug that blocks users to use the function qualified for the other database. For example, 
    ```SQL
    SELECT db1.test_avg(1)
    ```


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by jerryshao <gi...@git.apache.org>.
Github user jerryshao commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    I see. Thanks for the note.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    hmm, then it's too late. Maybe we can add it in Spark 2.3.2, cc @jerryshao 


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as fun...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18142#discussion_r119012802
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala ---
    @@ -72,39 +89,53 @@ trait FunctionRegistry {
     
     class SimpleFunctionRegistry extends FunctionRegistry {
     
    -  protected val functionBuilders =
    -    StringKeyHashMap[(ExpressionInfo, FunctionBuilder)](caseSensitive = false)
    +  @GuardedBy("this")
    +  private val functionBuilders =
    +    new mutable.HashMap[FunctionIdentifier, (ExpressionInfo, FunctionBuilder)]
    +
    +  // Resolution of the function name is always case insensitive, but the database name
    +  // depends on the caller
    +  private def normalizeFuncName(name: FunctionIdentifier): FunctionIdentifier = {
    +    FunctionIdentifier(name.funcName.toLowerCase(Locale.ROOT), name.database)
    +  }
     
       override def registerFunction(
    -      name: String,
    +      name: FunctionIdentifier,
           info: ExpressionInfo,
           builder: FunctionBuilder): Unit = synchronized {
    -    functionBuilders.put(name, (info, builder))
    +    functionBuilders.put(normalizeFuncName(name), (info, builder))
       }
     
    -  override def lookupFunction(name: String, children: Seq[Expression]): Expression = {
    +  override def lookupFunction(name: FunctionIdentifier, children: Seq[Expression]): Expression = {
         val func = synchronized {
    -      functionBuilders.get(name).map(_._2).getOrElse {
    +      functionBuilders.get(normalizeFuncName(name)).map(_._2).getOrElse {
             throw new AnalysisException(s"undefined function $name")
           }
         }
         func(children)
       }
     
    -  override def listFunction(): Seq[String] = synchronized {
    -    functionBuilders.iterator.map(_._1).toList.sorted
    --- End diff --
    
    This `sorted` is useless.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as fun...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18142#discussion_r119959672
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/UDFRegistration.scala ---
    @@ -61,7 +61,7 @@ class UDFRegistration private[sql] (functionRegistry: FunctionRegistry) extends
             | dataType: ${udf.dataType}
           """.stripMargin)
     
    -    functionRegistry.registerFunction(name, udf.builder)
    +    functionRegistry.createOrReplaceTempFunction(name, udf.builder)
    --- End diff --
    
    is it same as before? what if the `name` contains database part?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77520/
    Test FAILed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as fun...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18142#discussion_r119959388
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala ---
    @@ -72,39 +89,53 @@ trait FunctionRegistry {
     
     class SimpleFunctionRegistry extends FunctionRegistry {
     
    -  protected val functionBuilders =
    -    StringKeyHashMap[(ExpressionInfo, FunctionBuilder)](caseSensitive = false)
    +  @GuardedBy("this")
    +  private val functionBuilders =
    +    new mutable.HashMap[FunctionIdentifier, (ExpressionInfo, FunctionBuilder)]
    +
    +  // Resolution of the function name is always case insensitive, but the database name
    +  // depends on the caller
    +  private def normalizeFuncName(name: FunctionIdentifier): FunctionIdentifier = {
    +    FunctionIdentifier(name.funcName.toLowerCase(Locale.ROOT), name.database)
    +  }
     
       override def registerFunction(
    -      name: String,
    +      name: FunctionIdentifier,
           info: ExpressionInfo,
           builder: FunctionBuilder): Unit = synchronized {
    -    functionBuilders.put(name, (info, builder))
    +    functionBuilders.put(normalizeFuncName(name), (info, builder))
       }
     
    -  override def lookupFunction(name: String, children: Seq[Expression]): Expression = {
    +  override def lookupFunction(name: FunctionIdentifier, children: Seq[Expression]): Expression = {
         val func = synchronized {
    -      functionBuilders.get(name).map(_._2).getOrElse {
    +      functionBuilders.get(normalizeFuncName(name)).map(_._2).getOrElse {
             throw new AnalysisException(s"undefined function $name")
           }
         }
         func(children)
       }
     
    -  override def listFunction(): Seq[String] = synchronized {
    -    functionBuilders.iterator.map(_._1).toList.sorted
    --- End diff --
    
    I think sorted output can make users easy to search for a function, shall we still keep it?


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    **[Test build #77815 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77815/testReport)** for PR 18142 at commit [`794de15`](https://github.com/apache/spark/commit/794de15e594d19dc069546d6d071c10df7c3a40a).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    BTW, this is changed at Spark 2.3.0. How did we handle this before?


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    **[Test build #77538 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77538/testReport)** for PR 18142 at commit [`e8a534a`](https://github.com/apache/spark/commit/e8a534a382955c181acf1759bc2f44ee6e98aacd).


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    Merged build finished. Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77538/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/77602/
    Test PASSed.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by HyukjinKwon <gi...@git.apache.org>.
Github user HyukjinKwon commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    Yea that was my impression as well. Let me bring this back when we're clear if this is a bug or not.


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as fun...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18142#discussion_r119014081
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala ---
    @@ -17,51 +17,74 @@
     
     package org.apache.spark.sql.catalyst.analysis
     
    -import java.lang.reflect.Modifier
    +import java.util.Locale
    +import javax.annotation.concurrent.GuardedBy
     
    +import scala.collection.mutable
     import scala.language.existentials
     import scala.reflect.ClassTag
     import scala.util.{Failure, Success, Try}
     
     import org.apache.spark.sql.AnalysisException
    +import org.apache.spark.sql.catalyst.FunctionIdentifier
     import org.apache.spark.sql.catalyst.analysis.FunctionRegistry.FunctionBuilder
     import org.apache.spark.sql.catalyst.expressions._
     import org.apache.spark.sql.catalyst.expressions.aggregate._
     import org.apache.spark.sql.catalyst.expressions.xml._
    -import org.apache.spark.sql.catalyst.util.StringKeyHashMap
     import org.apache.spark.sql.types._
     
     
     /**
      * A catalog for looking up user defined functions, used by an [[Analyzer]].
      *
    - * Note: The implementation should be thread-safe to allow concurrent access.
    + * Note:
    + *   1) The implementation should be thread-safe to allow concurrent access.
    + *   2) the database name is always case-sensitive here, callers are responsible to
    + *      format the database name w.r.t. case-sensitive config.
      */
     trait FunctionRegistry {
     
    -  final def registerFunction(name: String, builder: FunctionBuilder): Unit = {
    -    registerFunction(name, new ExpressionInfo(builder.getClass.getCanonicalName, name), builder)
    +  final def registerFunction(name: FunctionIdentifier, builder: FunctionBuilder): Unit = {
    +    val info = new ExpressionInfo(
    +      builder.getClass.getCanonicalName, name.database.orNull, name.funcName)
    +    registerFunction(name, info, builder)
       }
     
    -  def registerFunction(name: String, info: ExpressionInfo, builder: FunctionBuilder): Unit
    +  def registerFunction(
    +    name: FunctionIdentifier,
    +    info: ExpressionInfo,
    +    builder: FunctionBuilder): Unit
    +
    +  /* Create or replace a temporary function. */
    +  final def createOrReplaceTempFunction(name: String, builder: FunctionBuilder): Unit = {
    --- End diff --
    
    Since we already expose `FunctionRegistry` to the stable class `UDFRegistration`, I added this extra API for a helper function. 
    
    Ideally, this function should only exist in `SessionCatalog`. 


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    **[Test build #77602 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77602/testReport)** for PR 18142 at commit [`794de15`](https://github.com/apache/spark/commit/794de15e594d19dc069546d6d071c10df7c3a40a).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as function i...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/18142
  
    **[Test build #77815 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/77815/testReport)** for PR 18142 at commit [`794de15`](https://github.com/apache/spark/commit/794de15e594d19dc069546d6d071c10df7c3a40a).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #18142: [SPARK-20918] [SQL] Use FunctionIdentifier as fun...

Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:

    https://github.com/apache/spark/pull/18142#discussion_r120960771
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala ---
    @@ -72,39 +95,53 @@ trait FunctionRegistry {
     
     class SimpleFunctionRegistry extends FunctionRegistry {
     
    -  protected val functionBuilders =
    -    StringKeyHashMap[(ExpressionInfo, FunctionBuilder)](caseSensitive = false)
    +  @GuardedBy("this")
    +  private val functionBuilders =
    +    new mutable.HashMap[FunctionIdentifier, (ExpressionInfo, FunctionBuilder)]
    +
    +  // Resolution of the function name is always case insensitive, but the database name
    --- End diff --
    
    That is the resolution rule we are using now. : (


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastructure@apache.org or file a JIRA ticket
with INFRA.
---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org