Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/06/21 05:00:00 UTC

[GitHub] [spark] ulysses-you opened a new pull request, #36936: [SPARK-39503][SQL] Add session catalog name for v1 database table and function

ulysses-you opened a new pull request, #36936:
URL: https://github.com/apache/spark/pull/36936

   
   ### What changes were proposed in this pull request?
   - Add the session catalog name to identifiers, so that all identifiers become 3-part names
   - Make `CatalogImpl#refreshTable` support 3-part names. This is required since many commands use this method.
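   For illustration, a minimal standalone sketch (not the actual Spark code; it only assumes the session catalog is named `spark_catalog`) of how a database-qualified identifier becomes a 3-part name:
   
   ```scala
   // Minimal sketch, not the actual Spark implementation: render a v1
   // identifier with the session catalog name prepended when a database
   // is defined. Bare identifiers (e.g. local temp views) stay 1-part.
   object IdentifierSketch {
     val SESSION_CATALOG_NAME = "spark_catalog"
   
     // Escape backticks inside a name part, then wrap the part in backticks.
     private def quote(part: String): String = s"`${part.replace("`", "``")}`"
   
     def quotedString(database: Option[String], name: String): String =
       database match {
         case Some(db) => Seq(SESSION_CATALOG_NAME, db, name).map(quote).mkString(".")
         case None     => quote(name)
       }
   }
   ```
   
   With this, `quotedString(Some("default"), "t")` yields `` `spark_catalog`.`default`.`t` ``, while an unqualified name stays `` `t` ``.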
   
   ### Why are the changes needed?
   To make it clearer which catalog a table or function comes from. It affects:
   
   - the scan table(view) of the query plan
   - the target table(view) of the data writing
   - desc database
   - desc table(view)
   - show create table(view)
   - desc function
   
   ### Does this PR introduce _any_ user-facing change?
   Maybe yes, so this PR adds a new config `spark.sql.legacy.identifierOutputCatalogName` to restore the old behavior.
   
   ### How was this patch tested?
   Pass CI.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] ulysses-you commented on pull request #36936: [SPARK-39503][SQL] Add session catalog name for v1 database table and function

Posted by GitBox <gi...@apache.org>.
ulysses-you commented on PR #36936:
URL: https://github.com/apache/spark/pull/36936#issuecomment-1168150746

   The change list, to help reviewers:
   ```
   sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala                  |   3 ++-
   sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalog.scala               |  16 +++++++++++-----
   sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/catalog/interface.scala                    |  12 ++++++++++--
   sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/identifiers.scala                          |  40 +++++++++++++++++++++++++++++++---------
   sql/catalyst/src/main/scala/org/apache/spark/sql/errors/QueryCompilationErrors.scala                 |  47 ++++++++++++++++++++++++++++++-----------------
   sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala                              |   9 +++++++++
   sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala          |  41 +++++++++++++++++++++++------------------
   sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/parser/ExpressionParserSuite.scala         |   9 ++++++---
   sql/core/src/main/scala/org/apache/spark/sql/DataFrameWriter.scala                                   |   3 ++-
   sql/core/src/main/scala/org/apache/spark/sql/execution/DataSourceScanExec.scala                      |   3 ++-
   sql/core/src/main/scala/org/apache/spark/sql/execution/command/AnalyzePartitionCommand.scala         |   5 +++--
   sql/core/src/main/scala/org/apache/spark/sql/execution/command/createDataSourceTables.scala          |   9 ++++++---
   sql/core/src/main/scala/org/apache/spark/sql/execution/command/ddl.scala                             |  21 ++++++++++++---------
   sql/core/src/main/scala/org/apache/spark/sql/execution/command/functions.scala                       |   9 ++++++---
   sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala                          |  36 +++++++++++++++++++++++-------------
   sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/LogicalRelation.scala             |   4 +++-
   sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/rules.scala                       |  19 +++++++++++--------
   sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DescribeNamespaceExec.scala    |   1 +
   sql/core/src/main/scala/org/apache/spark/sql/internal/BaseSessionStateBuilder.scala                  |   8 ++++----
   sql/core/src/main/scala/org/apache/spark/sql/internal/CatalogImpl.scala                              |  57 ++++++++++++++++++++++++++++++++++----------------------
   
   sql/core/src/test/scala/org/apache/spark/sql/DataFrameWriterV2Suite.scala                            |  10 +++++++---
   sql/core/src/test/scala/org/apache/spark/sql/execution/QueryExecutionSuite.scala                     |   4 +++-
   sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala                        |  24 +++++++++++++++---------
   sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/DescribeNamespaceSuite.scala       |  12 +++++++-----
   sql/core/src/test/scala/org/apache/spark/sql/execution/command/v1/DescribeTableSuite.scala           |   2 ++
   sql/core/src/test/scala/org/apache/spark/sql/execution/command/v2/DescribeNamespaceSuite.scala       |   1 +
   sql/core/src/test/scala/org/apache/spark/sql/execution/metric/SQLMetricsSuite.scala                  |   3 ++-
   sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala                          |  19 ++++++++++---------
   sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveMetastoreCatalog.scala                         |   9 ++++++---
   sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveStrategies.scala                               |   4 +++-
   sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala                |   4 +++-
   sql/hive/src/test/scala/org/apache/spark/sql/hive/MetastoreDataSourcesSuite.scala                    |  24 ++++++++++++++----------
   sql/hive/src/test/scala/org/apache/spark/sql/hive/StatisticsSuite.scala                              |   7 ++++---
   sql/hive/src/test/scala/org/apache/spark/sql/hive/UDFSuite.scala                                     |   9 +++++----
   sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveDDLSuite.scala                       |  22 ++++++++++++++--------
   sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveExplainSuite.scala                   |   3 ++-
   sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveTableScanSuite.scala                 |  13 +++++++------
   sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/HiveUDFSuite.scala                       |   3 ++-
   sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/SQLQuerySuite.scala                      |  11 +++++++----
   sql/hive/src/test/scala/org/apache/spark/sql/hive/execution/command/DescribeTableSuite.scala         |   2 ++
   
   sql/core/src/test/resources/sql-tests/results/charvarchar.sql.out                                    |  16 +++++++++++++++-
   sql/core/src/test/resources/sql-tests/results/describe-part-after-analyze.sql.out                    |   7 +++++++
   sql/core/src/test/resources/sql-tests/results/describe-table-after-alter-table.sql.out               |   5 +++++
   sql/core/src/test/resources/sql-tests/results/describe.sql.out                                       |  23 +++++++++++++++--------
   sql/core/src/test/resources/sql-tests/results/explain-aqe.sql.out                                    | 104 +++++++++++++++++++++++++++++++++++++++++++++++++++----------------------------------------------------
   sql/core/src/test/resources/sql-tests/results/explain-cbo.sql.out                                    |   8 ++++----
   sql/core/src/test/resources/sql-tests/results/explain.sql.out                                        |  96 +++++++++++++++++++++++++++++++++++++++++++++++------------------------------------------------
   sql/core/src/test/resources/sql-tests/results/postgreSQL/numeric.sql.out                             |   2 +-
   sql/core/src/test/resources/sql-tests/results/show-tables.sql.out                                    |  10 ++++++----
   sql/core/src/test/resources/sql-tests/results/udaf.sql.out                                           |   2 +-
   sql/core/src/test/resources/sql-tests/results/udf/udf-udaf.sql.out                                   |   2 +-
   ```




[GitHub] [spark] ulysses-you closed pull request #36936: [SPARK-39503][SQL] Add session catalog name for v1 database table and function

Posted by GitBox <gi...@apache.org>.
ulysses-you closed pull request #36936: [SPARK-39503][SQL] Add session catalog name for v1 database table and function
URL: https://github.com/apache/spark/pull/36936




[GitHub] [spark] gatorsmile commented on pull request #36936: [SPARK-39503][SQL] Add session catalog name for v1 database table and function

Posted by GitBox <gi...@apache.org>.
gatorsmile commented on PR #36936:
URL: https://github.com/apache/spark/pull/36936#issuecomment-1165716964

   cc @cloud-fan @amaliujia @gengliangwang 




[GitHub] [spark] ulysses-you commented on a diff in pull request #36936: [SPARK-39503][SQL] Add session catalog name for v1 database table and function

Posted by GitBox <gi...@apache.org>.
ulysses-you commented on code in PR #36936:
URL: https://github.com/apache/spark/pull/36936#discussion_r909614603


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala:
##########
@@ -895,7 +905,7 @@ case class ShowTablesCommand(
       val normalizedSpec = PartitioningUtils.normalizePartitionSpec(
         partitionSpec.get,
         table.partitionSchema,
-        tableIdent.quotedString,
+        tableIdent.quotedString(SESSION_CATALOG_NAME),
         sparkSession.sessionState.conf.resolver)
       val partition = catalog.getPartition(tableIdent, normalizedSpec)
       val database = tableIdent.database.getOrElse("")

Review Comment:
   Thank you @tomvanbussel for pointing this out; it is actually out of scope for this PR. We can improve the output attributes of the related commands in the future.





[GitHub] [spark] tomvanbussel commented on a diff in pull request #36936: [SPARK-39503][SQL] Add session catalog name for v1 database table and function

Posted by GitBox <gi...@apache.org>.
tomvanbussel commented on code in PR #36936:
URL: https://github.com/apache/spark/pull/36936#discussion_r908095046


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala:
##########
@@ -895,7 +905,7 @@ case class ShowTablesCommand(
       val normalizedSpec = PartitioningUtils.normalizePartitionSpec(
         partitionSpec.get,
         table.partitionSchema,
-        tableIdent.quotedString,
+        tableIdent.quotedString(SESSION_CATALOG_NAME),
         sparkSession.sessionState.conf.resolver)
       val partition = catalog.getPartition(tableIdent, normalizedSpec)
       val database = tableIdent.database.getOrElse("")

Review Comment:
   Do we also want to add the catalog name to the output of this function?



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/identifiers.scala:
##########
@@ -33,15 +36,34 @@ sealed trait IdentifierWithDatabase {
    */
   private def quoteIdentifier(name: String): String = name.replace("`", "``")
 
-  def quotedString: String = {
-    val replacedId = quoteIdentifier(identifier)
-    val replacedDb = database.map(quoteIdentifier(_))
+  def quotedString: String = quotedString(None)
+  def quotedString(catalog: String): String = quotedString(Some(catalog))
 
-    if (replacedDb.isDefined) s"`${replacedDb.get}`.`$replacedId`" else s"`$replacedId`"
+  def quotedString(catalog: Option[String]): String = {

Review Comment:
   I don't think we should add these new methods. These methods make it easier to omit the catalog than to include it. I'm worried that this leads to most new code (and some existing code) not including the catalog. AFAIK it's always safe to include the catalog name if the database name is set (as in that case it won't be a local temporary view), so the following should work:
   ```scala
   def quotedString: String = {
     val replacedId = quoteIdentifier(identifier)
     val replacedDb = database.map(quoteIdentifier)
     if (database.isDefined) {
       if (SQLConf.get.getConf(LEGACY_NON_IDENTIFIER_OUTPUT_CATALOG_NAME)) {
         s"`${replacedDb.get}`.`$replacedId`"
       } else {
         s"`$SESSION_CATALOG_NAME`.`${replacedDb.get}`.`$replacedId`"
       }
     } else {
       s"`$replacedId`"
     }
   }
   ```



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/unresolved.scala:
##########
@@ -234,7 +235,7 @@ case class UnresolvedGenerator(name: FunctionIdentifier, children: Seq[Expressio
   override def nullable: Boolean = throw new UnresolvedException("nullable")
   override lazy val resolved = false
 
-  override def prettyName: String = name.unquotedString
+  override def prettyName: String = name.unquotedString(SESSION_CATALOG_NAME)
   override def toString: String = s"'$name(${children.mkString(", ")})"

Review Comment:
   This will include the unquoted identifier *without* the catalog, as the `toString` method of `FunctionIdentifier` is implemented as `unquotedString`. Is this intentional?





[GitHub] [spark] cloud-fan commented on a diff in pull request #36936: [SPARK-39503][SQL] Add session catalog name for v1 database table and function

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36936:
URL: https://github.com/apache/spark/pull/36936#discussion_r907137742


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -3848,6 +3848,15 @@ object SQLConf {
       .booleanConf
       .createWithDefault(false)
 
+  val LEGACY_IDENTIFIER_OUTPUT_CATALOG_NAME =
+    buildConf("spark.sql.legacy.identifierOutputCatalogName")
+      .internal()
+      .doc("When set to true, the identifier will output catalog name if database is defined. " +
+        "When set to false, it restores the legacy behavior that does not output catalog name.")

Review Comment:
   nit: usually `true` means the legacy behavior; we probably need to rename the config a bit.





[GitHub] [spark] ulysses-you commented on pull request #36936: [SPARK-39503][SQL] Add session catalog name for v1 database table and function

Posted by GitBox <gi...@apache.org>.
ulysses-you commented on PR #36936:
URL: https://github.com/apache/spark/pull/36936#issuecomment-1173244010

   Close this PR in favor of https://github.com/apache/spark/pull/37021




[GitHub] [spark] tomvanbussel commented on a diff in pull request #36936: [SPARK-39503][SQL] Add session catalog name for v1 database table and function

Posted by GitBox <gi...@apache.org>.
tomvanbussel commented on code in PR #36936:
URL: https://github.com/apache/spark/pull/36936#discussion_r908106880


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/identifiers.scala:
##########
@@ -33,15 +36,34 @@ sealed trait IdentifierWithDatabase {
    */
   private def quoteIdentifier(name: String): String = name.replace("`", "``")
 
-  def quotedString: String = {
-    val replacedId = quoteIdentifier(identifier)
-    val replacedDb = database.map(quoteIdentifier(_))
+  def quotedString: String = quotedString(None)
+  def quotedString(catalog: String): String = quotedString(Some(catalog))
 
-    if (replacedDb.isDefined) s"`${replacedDb.get}`.`$replacedId`" else s"`$replacedId`"
+  def quotedString(catalog: Option[String]): String = {

Review Comment:
   I don't think we should add these new methods. These methods make it easier to omit the catalog than to include it. I'm worried that this leads to most new code (and some existing code) not including the catalog. AFAIK it's always safe to include the catalog name if the database name is set (as in that case we know that it won't be a local temporary view), so the following should work:
   ```scala
   def quotedString: String = {
     val replacedId = quoteIdentifier(identifier)
     val replacedDb = database.map(quoteIdentifier)
     if (database.isDefined) {
       if (SQLConf.get.getConf(LEGACY_NON_IDENTIFIER_OUTPUT_CATALOG_NAME) || isTempView) {
         s"`${replacedDb.get}`.`$replacedId`"
       } else {
         s"`$SESSION_CATALOG_NAME`.`${replacedDb.get}`.`$replacedId`"
       }
     } else {
       s"`$replacedId`"
     }
   }
   ```





[GitHub] [spark] ulysses-you commented on pull request #36936: [SPARK-39503][SQL] Add session catalog name for v1 database table and function

Posted by GitBox <gi...@apache.org>.
ulysses-you commented on PR #36936:
URL: https://github.com/apache/spark/pull/36936#issuecomment-1169961468

   I created a new PR, https://github.com/apache/spark/pull/37021, for this issue with a new approach:
   - add a catalog field to the identifier, so the identifier prints the catalog only if it is defined
   - inject the catalog at the beginning of the identifier's lifecycle
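   A minimal sketch of that catalog-as-a-field approach (the class and field names here are illustrative, not the actual #37021 code), where printing needs no conf lookup inside the identifier:
   
   ```scala
   // Sketch of the alternative: the identifier carries an optional catalog
   // field, injected once when the identifier is created, so quotedString
   // simply prints whatever parts are present.
   case class QualifiedIdent(
       catalog: Option[String],   // e.g. Some("spark_catalog"), set at creation time
       database: Option[String],
       name: String) {
     private def quote(part: String): String = s"`${part.replace("`", "``")}`"
     def quotedString: String =
       (catalog.toSeq ++ database.toSeq :+ name).map(quote).mkString(".")
   }
   ```
   
   Here the code that resolves the identifier decides whether to attach the catalog (e.g. only when a database is defined), so temp views naturally stay unqualified.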




[GitHub] [spark] tomvanbussel commented on pull request #36936: [SPARK-39503][SQL] Add session catalog name for v1 database table and function

Posted by GitBox <gi...@apache.org>.
tomvanbussel commented on PR #36936:
URL: https://github.com/apache/spark/pull/36936#issuecomment-1164368403

   @ulysses-you Fully agreed that we should leave _local_ temporary views as they are. They also do not belong to a database AFAIK, so it indeed would be weird to include the session catalog in the identifier. For _global_ temporary views, I believe we should display the catalog, as they belong to the `global_temp` database in the session catalog.
   
   I'm not sure if I'm the right person to discuss how views should be integrated with the v2 catalog, but it seems like a neat idea. Might be worth exploring in a follow-up.




[GitHub] [spark] ulysses-you commented on pull request #36936: [SPARK-39503][SQL] Add session catalog name for v1 database table and function

Posted by GitBox <gi...@apache.org>.
ulysses-you commented on PR #36936:
URL: https://github.com/apache/spark/pull/36936#issuecomment-1164335379

   @tomvanbussel yeah, I'm fine with also adding the session catalog for views. But I want to clarify some differences and trade-offs for views:
   - A local temp view does not belong to any catalog, so it should stay as is
   - For a global temp view, ideally we should treat it as a special catalog, but we cannot for now because we do not support a v2 view catalog, so we also cannot add the session catalog for it
   - What we can support is the permanent view




[GitHub] [spark] cloud-fan commented on a diff in pull request #36936: [SPARK-39503][SQL] Add session catalog name for v1 database table and function

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on code in PR #36936:
URL: https://github.com/apache/spark/pull/36936#discussion_r907136211


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/identifiers.scala:
##########
@@ -34,13 +38,35 @@ sealed trait IdentifierWithDatabase {
   private def quoteIdentifier(name: String): String = name.replace("`", "``")
 
   def quotedString: String = {
+    if (SQLConf.get.getConf(LEGACY_IDENTIFIER_OUTPUT_CATALOG_NAME) && database.isDefined) {
+      val replacedId = quoteIdentifier(identifier)
+      val replacedDb = database.map(quoteIdentifier(_))

Review Comment:
   ```suggestion
         val replacedDb = quoteIdentifier(database.get)
   ```





[GitHub] [spark] cloud-fan commented on pull request #36936: [SPARK-39503][SQL] Add session catalog name for v1 database table and function

Posted by GitBox <gi...@apache.org>.
cloud-fan commented on PR #36936:
URL: https://github.com/apache/spark/pull/36936#issuecomment-1167076595

   looks good in general. I'm taking a closer look at where we use `unquotedStringWithoutCatalog`






[GitHub] [spark] ulysses-you commented on a diff in pull request #36936: [SPARK-39503][SQL] Add session catalog name for v1 database table and function

Posted by GitBox <gi...@apache.org>.
ulysses-you commented on code in PR #36936:
URL: https://github.com/apache/spark/pull/36936#discussion_r909617245


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/identifiers.scala:
##########
@@ -33,15 +36,34 @@ sealed trait IdentifierWithDatabase {
    */
   private def quoteIdentifier(name: String): String = name.replace("`", "``")
 
-  def quotedString: String = {
-    val replacedId = quoteIdentifier(identifier)
-    val replacedDb = database.map(quoteIdentifier(_))
+  def quotedString: String = quotedString(None)
+  def quotedString(catalog: String): String = quotedString(Some(catalog))
 
-    if (replacedDb.isDefined) s"`${replacedDb.get}`.`$replacedId`" else s"`$replacedId`"
+  def quotedString(catalog: Option[String]): String = {

Review Comment:
   It is hacky to access the conf inside the identifier, since the identifier may appear anywhere. I made a new PR that pulls the config out; see https://github.com/apache/spark/pull/37021


