Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/06/05 21:57:45 UTC

[GitHub] [spark] dtenedor opened a new pull request, #36771: [WIP][SPARK-39383][SQL] Support DEFAULT columns in ALTER TABLE ADD COLUMNS to V2 data sources

dtenedor opened a new pull request, #36771:
URL: https://github.com/apache/spark/pull/36771

   ### What changes were proposed in this pull request?
   
   Extend DEFAULT column support in ALTER TABLE ADD COLUMNS commands to include V2 data sources.
   
   Example:
   
   ```
   > create or replace table t (a string default 'abc') using $v2Source
   > insert into t values (default)
   > alter table t add column (b string default 'def')
   > insert into t values ("ghi")
   > select * from t
   "abc", "def"
   "ghi", "def"
   ```
   
   ### Why are the changes needed?
   
   This makes V2 data sources easier to use and extend.
   
   ### Does this PR introduce _any_ user-facing change?
   
   Yes.
   
   ### How was this patch tested?
   
   This PR includes new test coverage.
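
   For illustration, a hedged sketch of what this kind of coverage can look like in a QueryTest-style suite with `SharedSparkSession` (the test name, the `v2Format` helper, and other details here are assumptions; the PR's actual tests may differ):
   
   ```
   import org.apache.spark.sql.Row
   import org.apache.spark.sql.internal.SQLConf
   
   test("ALTER TABLE ADD COLUMNS with DEFAULT values on a V2 source (sketch)") {
     // Assumes `v2Format` names a provider listed in DEFAULT_COLUMN_ALLOWED_PROVIDERS.
     withSQLConf(SQLConf.DEFAULT_COLUMN_ALLOWED_PROVIDERS.key -> v2Format) {
       withTable("t") {
         sql(s"CREATE TABLE t (a STRING DEFAULT 'abc') USING $v2Format")
         sql("INSERT INTO t VALUES (DEFAULT)")
         sql("ALTER TABLE t ADD COLUMN (b STRING DEFAULT 'def')")
         sql("INSERT INTO t VALUES ('ghi')")
         // Rows written both before and after the ALTER surface the new default.
         checkAnswer(sql("SELECT * FROM t"), Seq(Row("abc", "def"), Row("ghi", "def")))
       }
     }
   }
   ```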


[GitHub] [spark] dtenedor commented on a diff in pull request #36771: [SPARK-39383][SQL] Support DEFAULT columns in ALTER TABLE ADD COLUMNS to V2 data sources

Posted by GitBox <gi...@apache.org>.
dtenedor commented on code in PR #36771:
URL: https://github.com/apache/spark/pull/36771#discussion_r894798568


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ResolveDefaultColumnsUtil.scala:
##########
@@ -95,7 +95,7 @@ object ResolveDefaultColumns {
     if (SQLConf.get.enableDefaultColumns) {
       val allowedTableProviders: Array[String] =
         SQLConf.get.getConf(SQLConf.DEFAULT_COLUMN_ALLOWED_PROVIDERS)
-          .toLowerCase().split(",").map(_.trim)
+          .toLowerCase().split(",").map(_.trim) ++ Seq("")

Review Comment:
   Good question: we include the empty string here to allow the 'tableProvider' argument to be empty. This only happens in tests, for example when using the InMemoryTableCatalog; in all other cases, the code requires a non-empty table provider to be passed to this method. I added a comment here explaining this.
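
   As a rough sketch of the check being discussed (the enclosing method name below is hypothetical; only the hunk above is from the actual code), the effect is:
   
   ```
   import org.apache.spark.sql.internal.SQLConf
   
   // Hypothetical helper illustrating the allowed-providers check around the diff above.
   def defaultColumnsAllowedForProvider(tableProvider: String): Boolean = {
     val allowedTableProviders: Array[String] =
       SQLConf.get.getConf(SQLConf.DEFAULT_COLUMN_ALLOWED_PROVIDERS)
         .toLowerCase().split(",").map(_.trim) ++ Seq("")
     // An empty 'tableProvider' (e.g. tests using InMemoryTableCatalog) is now accepted.
     allowedTableProviders.contains(tableProvider.toLowerCase())
   }
   ```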
   



[GitHub] [spark] dtenedor commented on a diff in pull request #36771: [SPARK-39383][SQL] Support DEFAULT columns in ALTER TABLE ADD COLUMNS to V2 data sources

Posted by GitBox <gi...@apache.org>.
dtenedor commented on code in PR #36771:
URL: https://github.com/apache/spark/pull/36771#discussion_r895767393


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2SessionCatalog.scala:
##########
@@ -43,6 +43,8 @@ class V2SessionCatalog(catalog: SessionCatalog)
   extends TableCatalog with FunctionCatalog with SupportsNamespaces with SQLConfHelper {
   import V2SessionCatalog._
 
+  var defaultColumnAnalyzer: Analyzer = null

Review Comment:
   @gengliangwang Good question. I tried this earlier, but it appears that the `SimpleAnalyzer` uses an empty `FunctionRegistry`. Therefore default expressions like `concat('abc', 'def')`, which show up in several places in the existing tests, fail to resolve with "function not found" errors.
   
   So this leaves us with a couple of options; what do you think?
   
   1. Pass in the main `Analyzer` instance into the `constantFoldCurrentDefaultsToExistDefaults` method, as is done in this PR.
   2. Create a new variant of `SimpleAnalyzer` with the built-in functions from the `FunctionRegistry` [a], e.g. `SimpleAnalyzerWithBuiltInFunctions`.
   
   [a]  sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/FunctionRegistry.scala
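
   For reference, a minimal sketch of option 2, modeled on how `SimpleAnalyzer` itself is constructed (the object name and the constructor shapes are assumptions based on the Spark 3.3-era catalyst code, not the final implementation):
   
   ```
   import org.apache.spark.sql.catalyst.analysis.{Analyzer, FakeV2SessionCatalog, FunctionRegistry, TableFunctionRegistry}
   import org.apache.spark.sql.catalyst.catalog.{InMemoryCatalog, SessionCatalog}
   import org.apache.spark.sql.connector.catalog.CatalogManager
   
   // Like SimpleAnalyzer, but with the built-in function registries so that default
   // expressions such as concat('abc', 'def') resolve during constant folding.
   object SimpleAnalyzerWithBuiltInFunctions extends Analyzer(
     new CatalogManager(
       FakeV2SessionCatalog,
       new SessionCatalog(
         new InMemoryCatalog,
         FunctionRegistry.builtin,
         TableFunctionRegistry.builtin)))
   ```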



[GitHub] [spark] gengliangwang commented on pull request #36771: [SPARK-39383][SQL] Support DEFAULT columns in ALTER TABLE ADD COLUMNS to V2 data sources

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on PR #36771:
URL: https://github.com/apache/spark/pull/36771#issuecomment-1159275091

   LGTM except one comment


[GitHub] [spark] gengliangwang commented on pull request #36771: [SPARK-39383][SQL] Support DEFAULT columns in ALTER TABLE ADD COLUMNS to V2 data sources

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on PR #36771:
URL: https://github.com/apache/spark/pull/36771#issuecomment-1162026733

   Merging to master


[GitHub] [spark] gengliangwang commented on a diff in pull request #36771: [SPARK-39383][SQL] Support DEFAULT columns in ALTER TABLE ADD COLUMNS to V2 data sources

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on code in PR #36771:
URL: https://github.com/apache/spark/pull/36771#discussion_r898205938


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2SessionCatalog.scala:
##########
@@ -43,6 +43,8 @@ class V2SessionCatalog(catalog: SessionCatalog)
   extends TableCatalog with FunctionCatalog with SupportsNamespaces with SQLConfHelper {
   import V2SessionCatalog._
 
+  var defaultColumnAnalyzer: Analyzer = null

Review Comment:
   I like the idea of SimpleAnalyzerWithBuiltinFunctions, or we could name it ColumnDefaultSimpleAnalyzer.
   It would be a dedicated Analyzer for constant folding of default column values; that is more reasonable than passing the Analyzer that contains the column-default analysis rule into the column-default analysis rule itself.



[GitHub] [spark] gengliangwang commented on a diff in pull request #36771: [SPARK-39383][SQL] Support DEFAULT columns in ALTER TABLE ADD COLUMNS to V2 data sources

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on code in PR #36771:
URL: https://github.com/apache/spark/pull/36771#discussion_r900999638


##########
sql/core/src/test/scala/org/apache/spark/sql/connector/AlterTableTests.scala:
##########
@@ -310,6 +311,26 @@ trait AlterTableTests extends SharedSparkSession {
     }
   }
 
+  test("SPARK-39383 DEFAULT columns on V2 data sources with ALTER TABLE ADD COLUMN") {
+    withSQLConf(SQLConf.DEFAULT_COLUMN_ALLOWED_PROVIDERS.key -> s"$v2Format, ") {

Review Comment:
   Hmm, why isn't setting the "v2Format" enough?



[GitHub] [spark] dtenedor commented on a diff in pull request #36771: [SPARK-39383][SQL] Support DEFAULT columns in ALTER TABLE ADD COLUMNS to V2 data sources

Posted by GitBox <gi...@apache.org>.
dtenedor commented on code in PR #36771:
URL: https://github.com/apache/spark/pull/36771#discussion_r900965411


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ResolveDefaultColumnsUtil.scala:
##########
@@ -98,7 +98,10 @@ object ResolveDefaultColumns {
     if (SQLConf.get.enableDefaultColumns) {
       val allowedTableProviders: Array[String] =
         SQLConf.get.getConf(SQLConf.DEFAULT_COLUMN_ALLOWED_PROVIDERS)
-          .toLowerCase().split(",").map(_.trim)
+          .toLowerCase().split(",").map(_.trim) ++
+          // We include the empty string here to allow the 'tableProvider' argument to be empty.
+          // This is only empty in tests, for example when using the InMemoryTableCatalog.

Review Comment:
   @gengliangwang sounds good, I followed the second option.



[GitHub] [spark] dtenedor commented on a diff in pull request #36771: [SPARK-39383][SQL] Support DEFAULT columns in ALTER TABLE ADD COLUMNS to V2 data sources

Posted by GitBox <gi...@apache.org>.
dtenedor commented on code in PR #36771:
URL: https://github.com/apache/spark/pull/36771#discussion_r899211242


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2SessionCatalog.scala:
##########
@@ -43,6 +43,8 @@ class V2SessionCatalog(catalog: SessionCatalog)
   extends TableCatalog with FunctionCatalog with SupportsNamespaces with SQLConfHelper {
   import V2SessionCatalog._
 
+  var defaultColumnAnalyzer: Analyzer = null

Review Comment:
   @gengliangwang Now that the refactoring PR is merged, I synced it into this PR and committed the change. This PR is now much simpler and ready for your review again; thank you for suggesting that idea.



[GitHub] [spark] gengliangwang commented on a diff in pull request #36771: [SPARK-39383][SQL] Support DEFAULT columns in ALTER TABLE ADD COLUMNS to V2 data sources

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on code in PR #36771:
URL: https://github.com/apache/spark/pull/36771#discussion_r895248853


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2SessionCatalog.scala:
##########
@@ -43,6 +43,8 @@ class V2SessionCatalog(catalog: SessionCatalog)
   extends TableCatalog with FunctionCatalog with SupportsNamespaces with SQLConfHelper {
   import V2SessionCatalog._
 
+  var defaultColumnAnalyzer: Analyzer = null

Review Comment:
   If possible, we should avoid using mutable variables.
   How about always using the `SimpleAnalyzer` to analyze the column default expression? I can't recall why we pass in an analyzer in the various column-default PRs, but `SimpleAnalyzer` seems sufficient for simple foldable expressions.
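
   For context, here is a rough sketch of what analyzing and constant-folding a column default expression involves (the `foldDefault` helper and the exact steps are assumptions; the PR's `constantFoldCurrentDefaultsToExistDefaults` differs in detail):
   
   ```
   import org.apache.spark.sql.catalyst.analysis.Analyzer
   import org.apache.spark.sql.catalyst.expressions.{Alias, Expression}
   import org.apache.spark.sql.catalyst.optimizer.ConstantFolding
   import org.apache.spark.sql.catalyst.parser.CatalystSqlParser
   import org.apache.spark.sql.catalyst.plans.logical.{OneRowRelation, Project}
   
   // Parse the default expression, analyze it over a one-row plan, then constant fold it.
   def foldDefault(analyzer: Analyzer, defaultSql: String): Expression = {
     val parsed = CatalystSqlParser.parseExpression(defaultSql)
     val plan = Project(Seq(Alias(parsed, "default")()), OneRowRelation())
     val analyzed = analyzer.execute(plan)
     analyzer.checkAnalysis(analyzed)
     val folded = ConstantFolding(analyzed)
     folded.collectFirst { case Project(Seq(a: Alias), _) => a.child }.get
   }
   ```
   
   With the current `SimpleAnalyzer`, a call like `foldDefault(SimpleAnalyzer, "concat('abc', 'def')")` would fail at the resolution step because its `FunctionRegistry` is empty, which is the problem described above.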



[GitHub] [spark] gengliangwang commented on a diff in pull request #36771: [SPARK-39383][SQL] Support DEFAULT columns in ALTER TABLE ADD COLUMNS to V2 data sources

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on code in PR #36771:
URL: https://github.com/apache/spark/pull/36771#discussion_r901000714


##########
sql/core/src/test/scala/org/apache/spark/sql/connector/AlterTableTests.scala:
##########
@@ -310,6 +311,26 @@ trait AlterTableTests extends SharedSparkSession {
     }
   }
 
+  test("SPARK-39383 DEFAULT columns on V2 data sources with ALTER TABLE ADD COLUMN") {
+    withSQLConf(SQLConf.DEFAULT_COLUMN_ALLOWED_PROVIDERS.key -> s"$v2Format, ") {

Review Comment:
   Got it:)



[GitHub] [spark] dtenedor commented on pull request #36771: [SPARK-39383][SQL] Support DEFAULT columns in ALTER TABLE ADD COLUMNS to V2 data sources

Posted by GitBox <gi...@apache.org>.
dtenedor commented on PR #36771:
URL: https://github.com/apache/spark/pull/36771#issuecomment-1151836833

   @gengliangwang @HyukjinKwon Hi all, I looked at the V2 data sources, and it seems like this PR is needed for ALTER TABLE commands to work. Please let me know what you think.


[GitHub] [spark] gengliangwang commented on a diff in pull request #36771: [SPARK-39383][SQL] Support DEFAULT columns in ALTER TABLE ADD COLUMNS to V2 data sources

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on code in PR #36771:
URL: https://github.com/apache/spark/pull/36771#discussion_r894785751


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ResolveDefaultColumnsUtil.scala:
##########
@@ -95,7 +95,7 @@ object ResolveDefaultColumns {
     if (SQLConf.get.enableDefaultColumns) {
       val allowedTableProviders: Array[String] =
         SQLConf.get.getConf(SQLConf.DEFAULT_COLUMN_ALLOWED_PROVIDERS)
-          .toLowerCase().split(",").map(_.trim)
+          .toLowerCase().split(",").map(_.trim) ++ Seq("")

Review Comment:
   why do we need this?



[GitHub] [spark] dtenedor commented on a diff in pull request #36771: [SPARK-39383][SQL] Support DEFAULT columns in ALTER TABLE ADD COLUMNS to V2 data sources

Posted by GitBox <gi...@apache.org>.
dtenedor commented on code in PR #36771:
URL: https://github.com/apache/spark/pull/36771#discussion_r898323664


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2SessionCatalog.scala:
##########
@@ -43,6 +43,8 @@ class V2SessionCatalog(catalog: SessionCatalog)
   extends TableCatalog with FunctionCatalog with SupportsNamespaces with SQLConfHelper {
   import V2SessionCatalog._
 
+  var defaultColumnAnalyzer: Analyzer = null

Review Comment:
   @gengliangwang SG, I started making that change, but it seemed better to do the refactoring first as a separate PR. I sent that out in https://github.com/apache/spark/pull/36880. Once that's merged, I can sync this PR with the latest changes and it will become simpler.



[GitHub] [spark] AmplabJenkins commented on pull request #36771: [WIP][SPARK-39383][SQL] Support DEFAULT columns in ALTER TABLE ADD COLUMNS to V2 data sources

Posted by GitBox <gi...@apache.org>.
AmplabJenkins commented on PR #36771:
URL: https://github.com/apache/spark/pull/36771#issuecomment-1146896095

   Can one of the admins verify this patch?


[GitHub] [spark] gengliangwang commented on a diff in pull request #36771: [SPARK-39383][SQL] Support DEFAULT columns in ALTER TABLE ADD COLUMNS to V2 data sources

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on code in PR #36771:
URL: https://github.com/apache/spark/pull/36771#discussion_r895248971


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/V2SessionCatalog.scala:
##########
@@ -43,6 +43,8 @@ class V2SessionCatalog(catalog: SessionCatalog)
   extends TableCatalog with FunctionCatalog with SupportsNamespaces with SQLConfHelper {
   import V2SessionCatalog._
 
+  var defaultColumnAnalyzer: Analyzer = null

Review Comment:
   This is my biggest concern about this PR. We could either do the refactoring in a separate PR before merging this one, or use `SimpleAnalyzer` in this PR and do the refactoring later.



[GitHub] [spark] gengliangwang commented on a diff in pull request #36771: [SPARK-39383][SQL] Support DEFAULT columns in ALTER TABLE ADD COLUMNS to V2 data sources

Posted by GitBox <gi...@apache.org>.
gengliangwang commented on code in PR #36771:
URL: https://github.com/apache/spark/pull/36771#discussion_r900529038


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ResolveDefaultColumnsUtil.scala:
##########
@@ -98,7 +98,10 @@ object ResolveDefaultColumns {
     if (SQLConf.get.enableDefaultColumns) {
       val allowedTableProviders: Array[String] =
         SQLConf.get.getConf(SQLConf.DEFAULT_COLUMN_ALLOWED_PROVIDERS)
-          .toLowerCase().split(",").map(_.trim)
+          .toLowerCase().split(",").map(_.trim) ++
+          // We include the empty string here to allow the 'tableProvider' argument to be empty.
+          // This is only empty in tests, for example when using the InMemoryTableCatalog.

Review Comment:
   @dtenedor I don't think this is a good practice. We can do one of the following:
   
   - use a valid provider in the tests, e.g. "parquet"
   - set the config for the allowed providers in the tests
   - use `if(isTesting)...` here. This is not recommended though.



[GitHub] [spark] gengliangwang closed pull request #36771: [SPARK-39383][SQL] Support DEFAULT columns in ALTER TABLE ADD COLUMNS to V2 data sources

Posted by GitBox <gi...@apache.org>.
gengliangwang closed pull request #36771: [SPARK-39383][SQL] Support DEFAULT columns in ALTER TABLE ADD COLUMNS to V2 data sources
URL: https://github.com/apache/spark/pull/36771


[GitHub] [spark] dtenedor commented on a diff in pull request #36771: [SPARK-39383][SQL] Support DEFAULT columns in ALTER TABLE ADD COLUMNS to V2 data sources

Posted by GitBox <gi...@apache.org>.
dtenedor commented on code in PR #36771:
URL: https://github.com/apache/spark/pull/36771#discussion_r894798568


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ResolveDefaultColumnsUtil.scala:
##########
@@ -95,7 +95,7 @@ object ResolveDefaultColumns {
     if (SQLConf.get.enableDefaultColumns) {
       val allowedTableProviders: Array[String] =
         SQLConf.get.getConf(SQLConf.DEFAULT_COLUMN_ALLOWED_PROVIDERS)
-          .toLowerCase().split(",").map(_.trim)
+          .toLowerCase().split(",").map(_.trim) ++ Seq("")

Review Comment:
   Good question: we include the empty string here to allow the 'tableProvider' argument to be empty. This only happens in tests, for example when using the InMemoryTableCatalog; in all other cases, the code requires a non-empty table provider to be passed to this method. I added a comment here explaining this.
   (Sorry for the accidental force-push; the only change was adding this comment.)



[GitHub] [spark] dtenedor commented on pull request #36771: [SPARK-39383][SQL] Support DEFAULT columns in ALTER TABLE ADD COLUMNS to V2 data sources

Posted by GitBox <gi...@apache.org>.
dtenedor commented on PR #36771:
URL: https://github.com/apache/spark/pull/36771#issuecomment-1158446164

   @gengliangwang this one is ready again :)


[GitHub] [spark] dtenedor commented on a diff in pull request #36771: [SPARK-39383][SQL] Support DEFAULT columns in ALTER TABLE ADD COLUMNS to V2 data sources

Posted by GitBox <gi...@apache.org>.
dtenedor commented on code in PR #36771:
URL: https://github.com/apache/spark/pull/36771#discussion_r901000165


##########
sql/core/src/test/scala/org/apache/spark/sql/connector/AlterTableTests.scala:
##########
@@ -310,6 +311,26 @@ trait AlterTableTests extends SharedSparkSession {
     }
   }
 
+  test("SPARK-39383 DEFAULT columns on V2 data sources with ALTER TABLE ADD COLUMN") {
+    withSQLConf(SQLConf.DEFAULT_COLUMN_ALLOWED_PROVIDERS.key -> s"$v2Format, ") {

Review Comment:
   Good Q. The tests use the `InMemoryTableCatalog`, which passes an empty table provider for intermediate in-memory tables. Allowing both `v2Format` and the empty string covers both tables created with `s"using $v2Format"` and the in-memory catalog tables used in testing.
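
   To make that concrete, the trailing ", " in the conf value yields an empty entry after the lowercase/split/trim in ResolveDefaultColumnsUtil.scala (illustrative snippet; the "foo" placeholder stands in for the suite's real format name):
   
   ```
   val v2Format = "foo"               // placeholder for the suite's actual format name
   val confValue = s"$v2Format, "     // the value set via withSQLConf above
   val allowed = confValue.toLowerCase().split(",").map(_.trim)
   // allowed == Array("foo", ""), so the empty provider reported by
   // InMemoryTableCatalog is accepted alongside the real format.
   ```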


