Posted to reviews@spark.apache.org by "cloud-fan (via GitHub)" <gi...@apache.org> on 2023/02/08 11:56:21 UTC

[GitHub] [spark] cloud-fan opened a new pull request, #39942: [WIP] refine default column value framework

cloud-fan opened a new pull request, #39942:
URL: https://github.com/apache/spark/pull/39942

   <!--
   Thanks for sending a pull request!  Here are some tips for you:
     1. If this is your first time, please read our contributor guidelines: https://spark.apache.org/contributing.html
     2. Ensure you have added or run the appropriate tests for your PR: https://spark.apache.org/developer-tools.html
     3. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP][SPARK-XXXX] Your PR title ...'.
     4. Be sure to keep the PR description updated to reflect all changes.
     5. Please write your PR title to summarize what this PR proposes.
     6. If possible, provide a concise example to reproduce the issue for a faster review.
     7. If you want to add a new configuration, please read the guideline first for naming configurations in
        'core/src/main/scala/org/apache/spark/internal/config/ConfigEntry.scala'.
     8. If you want to add or modify an error type or message, please read the guideline first in
        'core/src/main/resources/error/README.md'.
   -->
   
   ### What changes were proposed in this pull request?
   <!--
   Please clarify what changes you are proposing. The purpose of this section is to outline the changes and how this PR fixes the issue. 
   If possible, please consider writing useful notes for better and faster reviews in your PR. See the examples below.
     1. If you refactor some codes with changing classes, showing the class hierarchy will help reviewers.
     2. If you fix some SQL features, you can provide some references of other DBMSes.
     3. If there is design documentation, please add the link.
     4. If there is a discussion in the mailing list, please add the link.
   -->
   
   
   ### Why are the changes needed?
   <!--
   Please clarify why the changes are needed. For instance,
     1. If you propose a new API, clarify the use case for a new API.
     2. If you fix a bug, you can clarify why it is a bug.
   -->
   
   
   ### Does this PR introduce _any_ user-facing change?
   <!--
   Note that it means *any* user-facing change including all aspects such as the documentation fix.
   If yes, please clarify the previous behavior and the change this PR proposes - provide the console output, description and/or an example to show the behavior difference if possible.
   If possible, please also clarify if this is a user-facing change compared to the released Spark versions or within the unreleased branches such as master.
   If no, write 'No'.
   -->
   
   
   ### How was this patch tested?
   <!--
   If tests were added, say they were added here. Please make sure to add some test cases that check the changes thoroughly including negative and positive cases if possible.
   If it was tested in a way different from regular unit tests, please clarify how you tested step by step, ideally copy and paste-able, so that other reviewers can test and check, and descendants can verify in the future.
   If tests were not added, please describe why they were not added and/or why it was difficult to add.
   If benchmark tests were added, please run the benchmarks in GitHub Actions for the consistent environment, and the instructions could accord to: https://spark.apache.org/developer-tools.html#github-workflow-benchmarks.
   -->
   




[GitHub] [spark] dtenedor commented on a diff in pull request #39942: [SPARK-42398][SQL] Refine default column value framework

Posted by "dtenedor (via GitHub)" <gi...@apache.org>.
dtenedor commented on code in PR #39942:
URL: https://github.com/apache/spark/pull/39942#discussion_r1106172353


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/types/StructType.scala:
##########
@@ -394,6 +395,31 @@ case class StructType(fields: Array[StructField]) extends DataType with Seq[Stru
 
   protected[sql] def toAttributes: Seq[AttributeReference] = map(field => field.toAttribute)
 
+  private[sql] def toColumns: Array[Column] = fields.map { f =>
+    val defaultValue = f.getCurrentDefaultValue().map { sql =>
+      val existDefaultOpt = f.getExistenceDefaultValue()
+      assert(existDefaultOpt.isDefined, "current and exist default must be both set or neither")

Review Comment:
   Technically, the existence default value can be set but the current default value can be absent if we do `create table t (col int default 42) using parquet` and then `alter table t alter column col drop default`. Maybe reword this message to "if the current default value is set, the existence default value must also be set"?
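
   For readers of the archive, here is a minimal sketch of the scenario described above (assuming a `SparkSession` named `spark` and the default-columns feature enabled):
   
   ```scala
   spark.sql("CREATE TABLE t (col INT DEFAULT 42) USING parquet")
   // Both the current default and the existence default of `col` are now "42".
   spark.sql("ALTER TABLE t ALTER COLUMN col DROP DEFAULT")
   // The current default is cleared, but the existence default recorded earlier
   // is kept, so the "existence set / current absent" combination is reachable.
   ```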



##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/ColumnDefaultValue.java:
##########
@@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.catalog;
+
+import java.util.Objects;
+import javax.annotation.Nonnull;
+
+import org.apache.spark.sql.connector.expressions.Literal;
+
+public class ColumnDefaultValue {

Review Comment:
   Could we add a class comment to describe what this is used for, the corresponding SQL syntax, etc.?



##########
sql/catalyst/src/test/scala/org/apache/spark/sql/types/StructTypeSuite.scala:
##########
@@ -471,34 +471,22 @@ class StructTypeSuite extends SparkFunSuite with SQLHelper {
     assert(source1.existenceDefaultValues(1) == UTF8String.fromString("abc"))
     assert(source1.existenceDefaultValues(2) == null)
 
-    // Positive test: StructType.defaultValues works because the existence default value parses and
-    // resolves successfully, then evaluates to a non-literal expression: this is constant-folded at

Review Comment:
   do we still test this constant-folding somewhere?



##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DescribeTableExec.scala:
##########
@@ -37,6 +38,7 @@ case class DescribeTableExec(
     addPartitioning(rows)
 
     if (isExtended) {
+      addColumnDefaultValue(rows)

Review Comment:
   does this add the column default information earlier in the result than before? that might be OK...



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/types/StructType.scala:
##########
@@ -394,6 +395,31 @@ case class StructType(fields: Array[StructField]) extends DataType with Seq[Stru
 
   protected[sql] def toAttributes: Seq[AttributeReference] = map(field => field.toAttribute)
 
+  private[sql] def toColumns: Array[Column] = fields.map { f =>
+    val defaultValue = f.getCurrentDefaultValue().map { sql =>
+      val existDefaultOpt = f.getExistenceDefaultValue()
+      assert(existDefaultOpt.isDefined, "current and exist default must be both set or neither")
+      val e = CatalystSqlParser.parseExpression(f.getExistenceDefaultValue().get)

Review Comment:
   ```suggestion
         val e = CatalystSqlParser.parseExpression(existDefaultOpt.get)
   ```



##########
sql/core/src/main/scala/org/apache/spark/sql/catalyst/analysis/ResolveSessionCatalog.scala:
##########
@@ -50,17 +50,26 @@ class ResolveSessionCatalog(val catalogManager: CatalogManager)
   import org.apache.spark.sql.execution.datasources.v2.DataSourceV2Implicits._
 
   override def apply(plan: LogicalPlan): LogicalPlan = plan.resolveOperatorsUp {
-    case AddColumns(ResolvedV1TableIdentifier(ident), cols) =>
-      cols.foreach { c =>
+    case a @ AddColumns(ResolvedV1TableIdentifier(ident), colsToAdd) if a.resolved =>
+      colsToAdd.foreach { c =>
         if (c.name.length > 1) {
           throw QueryCompilationErrors.operationOnlySupportedWithV2TableError(
             Seq(ident.catalog.get, ident.database.get, ident.table),
             "ADD COLUMN with qualified column")
         }
-        if (!c.nullable) {
+        if (!c.column.nullable) {
           throw QueryCompilationErrors.addColumnWithV1TableCannotSpecifyNotNullError
         }
       }
+      // Check default values before converting to v1 command.
+      DefaultColumnUtil.checkDefaultValuesInPlan(a, isForV1 = true)
+      val cols = if (colsToAdd.exists(_.column.defaultValue.isDefined)) {
+        // Do a constant-folding, as we need to store the expression SQL string which should be in

Review Comment:
   we should probably move this into a util method in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ResolveDefaultColumnsUtil.scala for better code clarity



##########
sql/core/src/test/scala/org/apache/spark/sql/connector/TestV2SessionCatalogBase.scala:
##########
@@ -64,6 +64,15 @@ private[connector] trait TestV2SessionCatalogBase[T <: Table] extends Delegating
     }
   }
 
+  override def createTable(
+      ident: Identifier,
+      columns: Array[Column],
+      partitions: Array[Transform],
+      properties: java.util.Map[String, String]): Table = {
+    createTable(ident, CatalogV2Util.v2ColumnsToStructType(columns), partitions, properties)
+  }
+
+  // TODO: remove it when no tests calling this deprecated method.

Review Comment:
   There are a lot of method changes like this in this PR. Would it be better to clean up the code by updating the call sites? Or should we do that in a separate PR?



##########
sql/core/src/test/scala/org/apache/spark/sql/sources/InsertSuite.scala:
##########
@@ -1982,47 +1970,7 @@ class InsertSuite extends DataSourceTest with SharedSparkSession {
     ).foreach { query =>
       assert(intercept[AnalysisException] {
         sql(query)
-      }.getMessage.contains(
-        QueryCompilationErrors.defaultValuesMayNotContainSubQueryExpressions().getMessage))
-    }
-  }
-
-  test("SPARK-39844 Restrict adding DEFAULT columns for existing tables to certain sources") {

Review Comment:
   Per response to that comment, it looks likely we'll have to retain this functionality.



##########
sql/core/src/test/scala/org/apache/spark/sql/sources/InsertSuite.scala:
##########
@@ -1058,21 +1057,21 @@ class InsertSuite extends DataSourceTest with SharedSparkSession {
       assert(intercept[AnalysisException] {
         sql("create table t(i boolean, s bigint default (select min(x) from badtable)) " +
           "using parquet")
-      }.getMessage.contains(Errors.BAD_SUBQUERY))

Review Comment:
   this no longer returns an error that subquery expressions are not allowed in default values. Can we have a test that checks this case?
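
   A sketch of the kind of regression test being requested, written inside `InsertSuite` (the asserted message fragment is a guess; the real test should match whatever error the new framework raises):
   
   ```scala
   test("default values may not contain subquery expressions") {
     val err = intercept[AnalysisException] {
       sql("create table t(i boolean, s bigint default (select min(x) from badtable)) " +
         "using parquet")
     }
     assert(err.getMessage.contains("subquery"))
   }
   ```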





[GitHub] [spark] cloud-fan commented on a diff in pull request #39942: [SPARK-42398][SQL] Refine default column value framework

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #39942:
URL: https://github.com/apache/spark/pull/39942#discussion_r1106969565


##########
sql/core/src/test/scala/org/apache/spark/sql/connector/TestV2SessionCatalogBase.scala:
##########
@@ -64,6 +64,15 @@ private[connector] trait TestV2SessionCatalogBase[T <: Table] extends Delegating
     }
   }
 
+  override def createTable(
+      ident: Identifier,
+      columns: Array[Column],
+      partitions: Array[Transform],
+      properties: java.util.Map[String, String]): Table = {
+    createTable(ident, CatalogV2Util.v2ColumnsToStructType(columns), partitions, properties)
+  }
+
+  // TODO: remove it when no tests calling this deprecated method.

Review Comment:
   There are too many call sites in the tests; I'd like to clean them up in a separate PR.





[GitHub] [spark] github-actions[bot] closed pull request #39942: [SPARK-42398][SQL] Refine default column value framework

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] closed pull request #39942: [SPARK-42398][SQL] Refine default column value framework
URL: https://github.com/apache/spark/pull/39942




[GitHub] [spark] cloud-fan commented on a diff in pull request #39942: [WIP] refine default column value framework

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #39942:
URL: https://github.com/apache/spark/pull/39942#discussion_r1102394700


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala:
##########
@@ -4077,19 +4061,18 @@ class AstBuilder extends SqlBaseParserBaseVisitor[AnyRef] with SQLConfHelper wit
     } else {
       None
     }
-    val setDefaultExpression: Option[String] =
-      if (action.defaultExpression != null) {
-        Option(action.defaultExpression()).map(visitDefaultExpression)
-      } else if (action.dropDefault != null) {
-        Some("")
-      } else {
-        None
-      }
-    if (setDefaultExpression.isDefined && !conf.getConf(SQLConf.ENABLE_DEFAULT_COLUMNS)) {
+    val defaultExpression: Option[Expression] = if (action.defaultExpression != null) {
+      Some(visitDefaultExpression(action.defaultExpression))
+    } else if (action.dropDefault != null) {
+      Some(DropDefaultColumnValue)

Review Comment:
   We could use `Some(null)` to indicate removing the default value, but I think this is better.
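
   For context, a minimal sketch of the sentinel pattern under discussion (the shape is inferred from the diff above; illustrative, not the exact Spark source):
   
   ```scala
   import org.apache.spark.sql.catalyst.expressions.{LeafExpression, Unevaluable}
   import org.apache.spark.sql.types.{DataType, NullType}
   
   // A dedicated marker is self-documenting where `Some(null)` would be opaque.
   case object DropDefaultColumnValue extends LeafExpression with Unevaluable {
     override def nullable: Boolean = true
     override def dataType: DataType = NullType
   }
   
   // The ALTER COLUMN action then has three explicit states:
   //   None                         -> leave the default unchanged
   //   Some(DropDefaultColumnValue) -> DROP DEFAULT
   //   Some(expr)                   -> SET DEFAULT expr
   ```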





[GitHub] [spark] dtenedor commented on a diff in pull request #39942: [SPARK-42398][SQL] Refine default column value framework

Posted by "dtenedor (via GitHub)" <gi...@apache.org>.
dtenedor commented on code in PR #39942:
URL: https://github.com/apache/spark/pull/39942#discussion_r1105182862


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/Column.java:
##########
@@ -0,0 +1,78 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.catalog;
+
+import javax.annotation.Nullable;
+
+import org.apache.spark.sql.connector.expressions.Literal;
+import org.apache.spark.sql.internal.connector.ColumnImpl;
+import org.apache.spark.sql.types.DataType;
+
+public interface Column {

Review Comment:
   it could help to have a high-level comment describing what this interface represents and who is expected to implement it.



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala:
##########
@@ -1429,4 +1435,97 @@ trait CheckAnalysis extends PredicateHelper with LookupCatalog with QueryErrorsB
       case _ =>
     }
   }
+
+  def checkDefaultValuesInPlan(plan: LogicalPlan, isForV1: Boolean = false): Unit = {

Review Comment:
   This helper could go in the `ResolveDefaultColumns` object in sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/util/ResolveDefaultColumnsUtil.scala.



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala:
##########
@@ -1429,4 +1435,97 @@ trait CheckAnalysis extends PredicateHelper with LookupCatalog with QueryErrorsB
       case _ =>
     }
   }
+
+  def checkDefaultValuesInPlan(plan: LogicalPlan, isForV1: Boolean = false): Unit = {
+    // Do not check anything if the children are not resolved yet.
+    if (!plan.childrenResolved) return
+    plan match {
+      case AlterColumn(t: ResolvedTable, col: ResolvedFieldName, _, _, _, _,
+          Some(default: DefaultValueExpression)) =>
+        checkTableProvider(t.catalog, t.name, getTableProviderFromProp(t.table.properties()))
+        checkDefaultValue(default, t.name, col.name, col.field.dataType, isForV1)
+
+      case cmd: V2CreateTablePlan if cmd.columns.exists(_.defaultValue.isDefined) =>
+        val ident = cmd.name.asInstanceOf[ResolvedIdentifier]

Review Comment:
   should we bind this in the pattern match instead of using `asInstanceOf`? Same on L1459 below.



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala:
##########
@@ -1429,4 +1435,97 @@ trait CheckAnalysis extends PredicateHelper with LookupCatalog with QueryErrorsB
       case _ =>
     }
   }
+
+  def checkDefaultValuesInPlan(plan: LogicalPlan, isForV1: Boolean = false): Unit = {
+    // Do not check anything if the children are not resolved yet.
+    if (!plan.childrenResolved) return

Review Comment:
   optional: instead of the `return`, we could just add a `case _ if !plan.childrenResolved => return` below. This might be generally safer in the vicinity of Catalyst rules which accept partial functions as arguments.
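
   As background on why `return` is risky there: inside a Scala closure it is compiled into a thrown `scala.runtime.NonLocalReturnControl`, which surrounding framework code can accidentally intercept. A self-contained illustration (not Spark code):
   
   ```scala
   object ReturnTrap {
     def firstEven(xs: Seq[Int]): Option[Int] = {
       try {
         // This `return` executes inside the closure passed to foreach, so it is
         // implemented by throwing NonLocalReturnControl...
         xs.foreach { x => if (x % 2 == 0) return Some(x) }
         None
       } catch {
         // ...which a careless catch-all swallows, silently turning a successful
         // lookup into None. Guarded `case` clauses avoid the hazard entirely.
         case _: Throwable => None
       }
     }
   }
   ```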



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogV2Util.scala:
##########
@@ -133,32 +135,33 @@ private[sql] object CatalogV2Util {
    */
   def applySchemaChanges(
       schema: StructType,
-      changes: Seq[TableChange],
-      tableProvider: Option[String],
-      statementType: String): StructType = {
+      changes: Seq[TableChange]): StructType = {
     changes.foldLeft(schema) { (schema, change) =>
       change match {
         case add: AddColumn =>
           add.fieldNames match {
             case Array(name) =>
               val field = StructField(name, add.dataType, nullable = add.isNullable)
-              val fieldWithDefault: StructField =
-                Option(add.defaultValue).map(field.withCurrentDefaultValue).getOrElse(field)
+              val fieldWithDefault: StructField = Option(add.defaultValue).map { default =>
+                val defaultSQL = getDefaultValueSQL(default)
+                field.withCurrentDefaultValue(defaultSQL).withExistenceDefaultValue(defaultSQL)
+              }.getOrElse(field)
               val fieldWithComment: StructField =
                 Option(add.comment).map(fieldWithDefault.withComment).getOrElse(fieldWithDefault)
-              addField(schema, fieldWithComment, add.position(), tableProvider, statementType, true)
+              addField(schema, fieldWithComment, add.position())
             case names =>
               replace(schema, names.init, parent => parent.dataType match {
                 case parentType: StructType =>
                   val field = StructField(names.last, add.dataType, nullable = add.isNullable)
-                  val fieldWithDefault: StructField =
-                    Option(add.defaultValue).map(field.withCurrentDefaultValue).getOrElse(field)
+                  val fieldWithDefault: StructField = Option(add.defaultValue).map { default =>

Review Comment:
   this is duplicated with L145-148 above; should we dedup into one place?



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala:
##########
@@ -3977,26 +3954,31 @@ class AstBuilder extends SqlBaseParserBaseVisitor[AnyRef] with SQLConfHelper wit
     }
   }
 
+  override def visitDefaultExpression(ctx: DefaultExpressionContext): Expression = withOrigin(ctx) {
+    expression(ctx.expression())
+  }
+
   /**
    * Parse new column info from ADD COLUMN into a QualifiedColType.
    */
   override def visitQualifiedColTypeWithPosition(
       ctx: QualifiedColTypeWithPositionContext): QualifiedColType = withOrigin(ctx) {
     val name = typedVisit[Seq[String]](ctx.name)
-    // Add the 'DEFAULT expression' clause in the column definition, if any, to the column metadata.
     val defaultExpr = Option(ctx.defaultExpression()).map(visitDefaultExpression)
     if (defaultExpr.isDefined && !conf.getConf(SQLConf.ENABLE_DEFAULT_COLUMNS)) {
       throw QueryParsingErrors.defaultColumnNotEnabledError(ctx)
     }
     QualifiedColType(
-      path = if (name.length > 1) Some(UnresolvedFieldName(name.init)) else None,
-      colName = name.last,
-      dataType = typedVisit[DataType](ctx.dataType),
-      nullable = ctx.NULL == null,
-      comment = Option(ctx.commentSpec()).map(visitCommentSpec),
+      path = if (name.length > 1) UnresolvedFieldName(name.init) else RootTableSchema,

Review Comment:
   Sounds good, it is convenient if `QualifiedColType` is an expression.



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/connector/catalog/CatalogV2Util.scala:
##########
@@ -431,4 +436,45 @@ private[sql] object CatalogV2Util {
       .getOrElse(catalogManager.v2SessionCatalog)
       .asTableCatalog
   }
+
+  def v2ColumnsToStructType(columns: Array[Column]): StructType = {
+    StructType(columns.map(v2ColumnToStructField))
+  }
+
+  def v2ColumnToStructField(col: Column): StructField = {
+    val metadata = Option(col.metadataInJSON()).map(Metadata.fromJson).getOrElse(Metadata.empty)
+    var f = StructField(col.name(), col.dataType(), col.nullable(), metadata)
+    Option(col.comment()).foreach { comment =>
+      f = f.withComment(comment)
+    }
+    Option(col.defaultValue()).foreach { default =>
+      val defaultSQL = CatalogV2Util.getDefaultValueSQL(default)
+      f = f.withExistenceDefaultValue(defaultSQL).withCurrentDefaultValue(defaultSQL)
+    }
+    f
+  }
+
+  def structTypeToV2Columns(schema: StructType): Array[Column] = {
+    schema.fields.map { f =>
+      // TODO: remove it when our testing v2 tables no longer implements and replies on the
+      //       deprecated `Table.schema`.
+      val defaultValue = f.getCurrentDefaultValue().map { sql =>
+        val e = CatalystSqlParser.parseExpression(sql)
+        if (!e.resolved || !e.foldable) {
+          // This should not happen as we do validation before setting column default values, but
+          // in case something goes wrong, we treat it as no default value.
+          null

Review Comment:
   we should throw an exception here instead of returning NULL, since that would produce an incorrect result.
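
   A sketch of the suggested fix, reusing the names from the hunk above (the exception type is a placeholder; the real patch would presumably use a proper error class):
   
   ```scala
   val defaultValue = f.getCurrentDefaultValue().map { sql =>
     val e = CatalystSqlParser.parseExpression(sql)
     if (!e.resolved || !e.foldable) {
       // Corrupt or stale metadata: fail loudly rather than silently dropping
       // the default, which would surface later as wrong query results.
       throw new IllegalStateException(
         s"Invalid stored default value for column ${f.name}: $sql")
     }
     e
   }
   ```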



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala:
##########
@@ -4077,19 +4061,18 @@ class AstBuilder extends SqlBaseParserBaseVisitor[AnyRef] with SQLConfHelper wit
     } else {
       None
     }
-    val setDefaultExpression: Option[String] =
-      if (action.defaultExpression != null) {
-        Option(action.defaultExpression()).map(visitDefaultExpression)
-      } else if (action.dropDefault != null) {
-        Some("")
-      } else {
-        None
-      }
-    if (setDefaultExpression.isDefined && !conf.getConf(SQLConf.ENABLE_DEFAULT_COLUMNS)) {
+    val defaultExpression: Option[Expression] = if (action.defaultExpression != null) {
+      Some(visitDefaultExpression(action.defaultExpression))
+    } else if (action.dropDefault != null) {
+      Some(DropDefaultColumnValue)

Review Comment:
   Yes, this is more explicit.



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala:
##########
@@ -1429,4 +1435,97 @@ trait CheckAnalysis extends PredicateHelper with LookupCatalog with QueryErrorsB
       case _ =>
     }
   }
+
+  def checkDefaultValuesInPlan(plan: LogicalPlan, isForV1: Boolean = false): Unit = {
+    // Do not check anything if the children are not resolved yet.
+    if (!plan.childrenResolved) return
+    plan match {
+      case AlterColumn(t: ResolvedTable, col: ResolvedFieldName, _, _, _, _,
+          Some(default: DefaultValueExpression)) =>
+        checkTableProvider(t.catalog, t.name, getTableProviderFromProp(t.table.properties()))
+        checkDefaultValue(default, t.name, col.name, col.field.dataType, isForV1)
+
+      case cmd: V2CreateTablePlan if cmd.columns.exists(_.defaultValue.isDefined) =>
+        val ident = cmd.name.asInstanceOf[ResolvedIdentifier]
+        checkTableProvider(ident.catalog, ident.name, cmd.tableSpec.provider)
+        cmd.columns.filter(_.defaultValue.isDefined).foreach { col =>
+          val Column(name, dataType, _, _, Some(default), _) = col
+          // CREATE/REPLACE TABLE only has top-level columns
+          val colName = Seq(name)
+          checkDefaultValue(default, ident.name, colName, dataType, isForV1)
+        }
+
+      case cmd: AlterTableCommand =>
+        val table = cmd.table.asInstanceOf[ResolvedTable]
+        cmd.transformExpressionsDown {
+          case q @ QualifiedColType(path, Column(name, dataType, _, _, Some(default), _), _)
+              if path.resolved =>
+            checkTableProvider(
+              table.catalog, table.name, getTableProviderFromProp(table.table.properties()))
+            checkDefaultValue(
+              default,
+              table.name,
+              path.name :+ name,
+              dataType,
+              isForV1)
+            q
+        }
+
+      case _ =>
+    }
+  }
+
+  private def getTableProviderFromProp(props: JMap[String, String]): Option[String] = {
+    Option(props.get(TableCatalog.PROP_PROVIDER))
+  }
+
+  private def checkTableProvider(
+      catalog: CatalogPlugin,
+      tableName: String,
+      provider: Option[String]): Unit = {
+    // We only need to check table provider for the session catalog. Other custom v2 catalogs
+    // can check table providers in their implementations of createTable, alterTable, etc.
+    if (!CatalogV2Util.isSessionCatalog(catalog)) return
+    val conf = SQLConf.get
+    val allowedProviders: Array[String] = conf.getConf(SQLConf.DEFAULT_COLUMN_ALLOWED_PROVIDERS)
+      .toLowerCase().split(",").map(_.trim)
+    if (!allowedProviders.contains(provider.getOrElse(conf.defaultDataSourceName).toLowerCase())) {
+      throw QueryCompilationErrors.defaultReferencesNotAllowedInDataSource(tableName)
+    }
+  }
+
+  private def checkDefaultValue(
+      default: DefaultValueExpression,
+      tblName: String,
+      colName: Seq[String],
+      targetType: DataType,
+      isForV1: Boolean): Unit = {
+    if (default.resolved) {
+      if (!default.child.foldable) {
+        throw QueryCompilationErrors.notConstantDefaultValueError(
+          tblName, colName, default.originalSQL)
+      } else if (!Cast.canUpCast(default.child.dataType, targetType)) {

Review Comment:
   no need for `else` since we throw an error in the `if` block above.



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -3091,12 +3091,11 @@ object SQLConf {
     buildConf("spark.sql.defaultColumn.allowedProviders")
       .internal()
       .doc("List of table providers wherein SQL commands are permitted to assign DEFAULT column " +
-        "values. Comma-separated list, whitespace ignored, case-insensitive. If an asterisk " +
-        "appears after any table provider in this list, any command may assign DEFAULT column " +
-        "except `ALTER TABLE ... ADD COLUMN`. Otherwise, if no asterisk appears, all commands " +

Review Comment:
   The argument makes sense in spirit, but at least one data source has already used this functionality of the default column providers, and it would be broken by this change. Let's talk offline.



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala:
##########
@@ -1429,4 +1435,97 @@ trait CheckAnalysis extends PredicateHelper with LookupCatalog with QueryErrorsB
       case _ =>
     }
   }
+
+  def checkDefaultValuesInPlan(plan: LogicalPlan, isForV1: Boolean = false): Unit = {
+    // Do not check anything if the children are not resolved yet.
+    if (!plan.childrenResolved) return
+    plan match {
+      case AlterColumn(t: ResolvedTable, col: ResolvedFieldName, _, _, _, _,
+          Some(default: DefaultValueExpression)) =>
+        checkTableProvider(t.catalog, t.name, getTableProviderFromProp(t.table.properties()))
+        checkDefaultValue(default, t.name, col.name, col.field.dataType, isForV1)
+
+      case cmd: V2CreateTablePlan if cmd.columns.exists(_.defaultValue.isDefined) =>
+        val ident = cmd.name.asInstanceOf[ResolvedIdentifier]
+        checkTableProvider(ident.catalog, ident.name, cmd.tableSpec.provider)
+        cmd.columns.filter(_.defaultValue.isDefined).foreach { col =>
+          val Column(name, dataType, _, _, Some(default), _) = col
+          // CREATE/REPLACE TABLE only has top-level columns
+          val colName = Seq(name)
+          checkDefaultValue(default, ident.name, colName, dataType, isForV1)
+        }
+
+      case cmd: AlterTableCommand =>
+        val table = cmd.table.asInstanceOf[ResolvedTable]
+        cmd.transformExpressionsDown {
+          case q @ QualifiedColType(path, Column(name, dataType, _, _, Some(default), _), _)
+              if path.resolved =>
+            checkTableProvider(
+              table.catalog, table.name, getTableProviderFromProp(table.table.properties()))
+            checkDefaultValue(
+              default,
+              table.name,
+              path.name :+ name,
+              dataType,
+              isForV1)
+            q
+        }
+
+      case _ =>
+    }
+  }
+
+  private def getTableProviderFromProp(props: JMap[String, String]): Option[String] = {
+    Option(props.get(TableCatalog.PROP_PROVIDER))
+  }
+
+  private def checkTableProvider(
+      catalog: CatalogPlugin,
+      tableName: String,
+      provider: Option[String]): Unit = {
+    // We only need to check table provider for the session catalog. Other custom v2 catalogs
+    // can check table providers in their implementations of createTable, alterTable, etc.
+    if (!CatalogV2Util.isSessionCatalog(catalog)) return

Review Comment:
   same as above, we should probably reverse the logic and put the code below in the `if` block body instead.



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala:
##########
@@ -1429,4 +1435,97 @@ trait CheckAnalysis extends PredicateHelper with LookupCatalog with QueryErrorsB
       case _ =>
     }
   }
+
+  def checkDefaultValuesInPlan(plan: LogicalPlan, isForV1: Boolean = false): Unit = {
+    // Do not check anything if the children are not resolved yet.
+    if (!plan.childrenResolved) return
+    plan match {
+      case AlterColumn(t: ResolvedTable, col: ResolvedFieldName, _, _, _, _,
+          Some(default: DefaultValueExpression)) =>
+        checkTableProvider(t.catalog, t.name, getTableProviderFromProp(t.table.properties()))
+        checkDefaultValue(default, t.name, col.name, col.field.dataType, isForV1)
+
+      case cmd: V2CreateTablePlan if cmd.columns.exists(_.defaultValue.isDefined) =>
+        val ident = cmd.name.asInstanceOf[ResolvedIdentifier]
+        checkTableProvider(ident.catalog, ident.name, cmd.tableSpec.provider)
+        cmd.columns.filter(_.defaultValue.isDefined).foreach { col =>
+          val Column(name, dataType, _, _, Some(default), _) = col
+          // CREATE/REPLACE TABLE only has top-level columns
+          val colName = Seq(name)
+          checkDefaultValue(default, ident.name, colName, dataType, isForV1)
+        }
+
+      case cmd: AlterTableCommand =>
+        val table = cmd.table.asInstanceOf[ResolvedTable]
+        cmd.transformExpressionsDown {
+          case q @ QualifiedColType(path, Column(name, dataType, _, _, Some(default), _), _)
+              if path.resolved =>
+            checkTableProvider(
+              table.catalog, table.name, getTableProviderFromProp(table.table.properties()))
+            checkDefaultValue(
+              default,
+              table.name,
+              path.name :+ name,
+              dataType,
+              isForV1)
+            q
+        }
+
+      case _ =>
+    }
+  }
+
+  private def getTableProviderFromProp(props: JMap[String, String]): Option[String] = {
+    Option(props.get(TableCatalog.PROP_PROVIDER))
+  }
+
+  private def checkTableProvider(
+      catalog: CatalogPlugin,
+      tableName: String,
+      provider: Option[String]): Unit = {
+    // We only need to check table provider for the session catalog. Other custom v2 catalogs
+    // can check table providers in their implementations of createTable, alterTable, etc.
+    if (!CatalogV2Util.isSessionCatalog(catalog)) return
+    val conf = SQLConf.get
+    val allowedProviders: Array[String] = conf.getConf(SQLConf.DEFAULT_COLUMN_ALLOWED_PROVIDERS)
+      .toLowerCase().split(",").map(_.trim)
+    if (!allowedProviders.contains(provider.getOrElse(conf.defaultDataSourceName).toLowerCase())) {
+      throw QueryCompilationErrors.defaultReferencesNotAllowedInDataSource(tableName)
+    }
+  }
+
+  private def checkDefaultValue(
+      default: DefaultValueExpression,
+      tblName: String,
+      colName: Seq[String],
+      targetType: DataType,
+      isForV1: Boolean): Unit = {
+    if (default.resolved) {
+      if (!default.child.foldable) {
+        throw QueryCompilationErrors.notConstantDefaultValueError(
+          tblName, colName, default.originalSQL)
+      } else if (!Cast.canUpCast(default.child.dataType, targetType)) {
+        throw QueryCompilationErrors.incompatibleTypeDefaultValueError(
+          tblName, colName, targetType, default.child, default.originalSQL)
+      } else {
+        // OK.
+      }
+    } else {
+      // Ideally we should let the rest of `CheckAnalysis` to report errors about why the default
+      // expression is unresolved. But we should report a better error here if the default
+      // expression references columns or contains subquery expressions, which means it's not a
+      // constant for sure.
+      if (default.references.nonEmpty || default.containsPattern(PLAN_EXPRESSION)) {

Review Comment:
   you can combine this with the `else` above as `else if`.



##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala:
##########
@@ -1429,4 +1435,97 @@ trait CheckAnalysis extends PredicateHelper with LookupCatalog with QueryErrorsB
       case _ =>
     }
   }
+
+  def checkDefaultValuesInPlan(plan: LogicalPlan, isForV1: Boolean = false): Unit = {
+    // Do not check anything if the children are not resolved yet.
+    if (!plan.childrenResolved) return
+    plan match {
+      case AlterColumn(t: ResolvedTable, col: ResolvedFieldName, _, _, _, _,
+          Some(default: DefaultValueExpression)) =>
+        checkTableProvider(t.catalog, t.name, getTableProviderFromProp(t.table.properties()))
+        checkDefaultValue(default, t.name, col.name, col.field.dataType, isForV1)
+
+      case cmd: V2CreateTablePlan if cmd.columns.exists(_.defaultValue.isDefined) =>
+        val ident = cmd.name.asInstanceOf[ResolvedIdentifier]
+        checkTableProvider(ident.catalog, ident.name, cmd.tableSpec.provider)
+        cmd.columns.filter(_.defaultValue.isDefined).foreach { col =>
+          val Column(name, dataType, _, _, Some(default), _) = col
+          // CREATE/REPLACE TABLE only has top-level columns
+          val colName = Seq(name)
+          checkDefaultValue(default, ident.name, colName, dataType, isForV1)
+        }
+
+      case cmd: AlterTableCommand =>
+        val table = cmd.table.asInstanceOf[ResolvedTable]
+        cmd.transformExpressionsDown {
+          case q @ QualifiedColType(path, Column(name, dataType, _, _, Some(default), _), _)
+              if path.resolved =>
+            checkTableProvider(
+              table.catalog, table.name, getTableProviderFromProp(table.table.properties()))
+            checkDefaultValue(
+              default,
+              table.name,
+              path.name :+ name,
+              dataType,
+              isForV1)
+            q
+        }
+
+      case _ =>
+    }
+  }
+
+  private def getTableProviderFromProp(props: JMap[String, String]): Option[String] = {
+    Option(props.get(TableCatalog.PROP_PROVIDER))
+  }
+
+  private def checkTableProvider(
+      catalog: CatalogPlugin,
+      tableName: String,
+      provider: Option[String]): Unit = {
+    // We only need to check table provider for the session catalog. Other custom v2 catalogs
+    // can check table providers in their implementations of createTable, alterTable, etc.
+    if (!CatalogV2Util.isSessionCatalog(catalog)) return
+    val conf = SQLConf.get
+    val allowedProviders: Array[String] = conf.getConf(SQLConf.DEFAULT_COLUMN_ALLOWED_PROVIDERS)
+      .toLowerCase().split(",").map(_.trim)
+    if (!allowedProviders.contains(provider.getOrElse(conf.defaultDataSourceName).toLowerCase())) {
+      throw QueryCompilationErrors.defaultReferencesNotAllowedInDataSource(tableName)
+    }
+  }
+
+  private def checkDefaultValue(
+      default: DefaultValueExpression,
+      tblName: String,
+      colName: Seq[String],
+      targetType: DataType,
+      isForV1: Boolean): Unit = {
+    if (default.resolved) {
+      if (!default.child.foldable) {
+        throw QueryCompilationErrors.notConstantDefaultValueError(
+          tblName, colName, default.originalSQL)
+      } else if (!Cast.canUpCast(default.child.dataType, targetType)) {
+        throw QueryCompilationErrors.incompatibleTypeDefaultValueError(
+          tblName, colName, targetType, default.child, default.originalSQL)
+      } else {

Review Comment:
   same here.





[GitHub] [spark] cloud-fan commented on a diff in pull request #39942: [WIP] refine default column value framework

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #39942:
URL: https://github.com/apache/spark/pull/39942#discussion_r1102410248


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -3091,12 +3091,11 @@ object SQLConf {
     buildConf("spark.sql.defaultColumn.allowedProviders")
       .internal()
       .doc("List of table providers wherein SQL commands are permitted to assign DEFAULT column " +
-        "values. Comma-separated list, whitespace ignored, case-insensitive. If an asterisk " +
-        "appears after any table provider in this list, any command may assign DEFAULT column " +
-        "except `ALTER TABLE ... ADD COLUMN`. Otherwise, if no asterisk appears, all commands " +

Review Comment:
   I don't think it is reasonable to allow a built-in data source to partially support column default values. Custom v2 catalogs can reject ADD COLUMN by themselves in `TableCatalog.alterTable`, but built-in file sources must fully support column defaults, as ADD COLUMN is not the only issue: people can do `CREATE TABLE t (with default value) USING parquet LOCATION ...` where the existing data files do not contain the columns with default values.
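
   A sketch of the failure mode, assuming the path `/data/old` already holds Parquet files written without `c2`:
   
   ```scala
   spark.sql("""
     CREATE TABLE t (c1 INT, c2 INT DEFAULT 42)
     USING parquet LOCATION '/data/old'
   """)
   // Scanning t must fill c2 = 42 for the pre-existing files, which is exactly
   // the "existence default" machinery that ADD COLUMN needs. Supporting CREATE
   // TABLE defaults but not ADD COLUMN defaults would be an arbitrary split.
   ```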





[GitHub] [spark] cloud-fan commented on a diff in pull request #39942: [WIP] refine default column value framework

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #39942:
URL: https://github.com/apache/spark/pull/39942#discussion_r1102423056


##########
sql/core/src/test/scala/org/apache/spark/sql/sources/InsertSuite.scala:
##########
@@ -1982,47 +1970,7 @@ class InsertSuite extends DataSourceTest with SharedSparkSession {
     ).foreach { query =>
       assert(intercept[AnalysisException] {
         sql(query)
-      }.getMessage.contains(
-        QueryCompilationErrors.defaultValuesMayNotContainSubQueryExpressions().getMessage))
-    }
-  }
-
-  test("SPARK-39844 Restrict adding DEFAULT columns for existing tables to certain sources") {

Review Comment:
   see https://github.com/apache/spark/pull/39942/files#r1102410248





[GitHub] [spark] cloud-fan commented on a diff in pull request #39942: [WIP] refine default column value framework

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #39942:
URL: https://github.com/apache/spark/pull/39942#discussion_r1102393297


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/parser/AstBuilder.scala:
##########
@@ -3977,26 +3954,31 @@ class AstBuilder extends SqlBaseParserBaseVisitor[AnyRef] with SQLConfHelper wit
     }
   }
 
+  override def visitDefaultExpression(ctx: DefaultExpressionContext): Expression = withOrigin(ctx) {
+    expression(ctx.expression())
+  }
+
   /**
    * Parse new column info from ADD COLUMN into a QualifiedColType.
    */
   override def visitQualifiedColTypeWithPosition(
       ctx: QualifiedColTypeWithPositionContext): QualifiedColType = withOrigin(ctx) {
     val name = typedVisit[Seq[String]](ctx.name)
-    // Add the 'DEFAULT expression' clause in the column definition, if any, to the column metadata.
     val defaultExpr = Option(ctx.defaultExpression()).map(visitDefaultExpression)
     if (defaultExpr.isDefined && !conf.getConf(SQLConf.ENABLE_DEFAULT_COLUMNS)) {
       throw QueryParsingErrors.defaultColumnNotEnabledError(ctx)
     }
     QualifiedColType(
-      path = if (name.length > 1) Some(UnresolvedFieldName(name.init)) else None,
-      colName = name.last,
-      dataType = typedVisit[DataType](ctx.dataType),
-      nullable = ctx.NULL == null,
-      comment = Option(ctx.commentSpec()).map(visitCommentSpec),
+      path = if (name.length > 1) UnresolvedFieldName(name.init) else RootTableSchema,

Review Comment:
   This change is necessary to make `QualifiedColType` an expression; otherwise we can't properly implement `withChildren`, as `QualifiedColType` has 2 optional children.
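
   The problem with optional children, sketched on a hypothetical toy node (not the real `QualifiedColType`):
   
   ```scala
   import org.apache.spark.sql.catalyst.expressions.Expression
   
   // With two optional children, the flattened `children` Seq loses which slot
   // each child came from.
   case class Node(a: Option[Expression], b: Option[Expression]) {
     def children: Seq[Expression] = a.toSeq ++ b.toSeq
   
     def withNewChildren(newChildren: Seq[Expression]): Node = newChildren match {
       case Seq(x, y) => Node(Some(x), Some(y))
       case Seq(x)    => ??? // ambiguous: does x replace `a` or `b`?
       case Seq()     => Node(None, None)
     }
   }
   ```
   
   Making each slot always-present (a `RootTableSchema` marker instead of `None`) keeps the child positions stable.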





[GitHub] [spark] gengliangwang commented on pull request #39942: [SPARK-42398][SQL] Refine default column value framework

Posted by "gengliangwang (via GitHub)" <gi...@apache.org>.
gengliangwang commented on PR #39942:
URL: https://github.com/apache/spark/pull/39942#issuecomment-1428923475

   cc @dtenedor 




[GitHub] [spark] cloud-fan commented on a diff in pull request #39942: [WIP] refine default column value framework

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #39942:
URL: https://github.com/apache/spark/pull/39942#discussion_r1102237616


##########
connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/v2/V2JDBCNamespaceTest.scala:
##########
@@ -118,7 +119,7 @@ private[v2] trait V2JDBCNamespaceTest extends SharedSparkSession with DockerInte
       // Drop non empty namespace without cascade
       catalog.createNamespace(Array("foo"), commentMap.asJava)
       assert(catalog.namespaceExists(Array("foo")) === true)
-      catalog.createTable(ident1, schema, Array.empty, emptyProps)
+      catalog.createTable(ident1, schema, Array.empty[Transform], emptyProps)

Review Comment:
   Scala compiler type inference doesn't work well with arrays and function overloads. This is not a breaking change, as users only implement `createTable` and Spark calls it.
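
   A standalone illustration of the inference problem (deliberately simplified; the real overloads differ in a `StructType` vs. `Array[Column]` parameter):
   
   ```scala
   object OverloadDemo {
     def createTable(partitions: Array[String]): Unit = ()
     def createTable(partitions: Array[Int]): Unit = ()
   
     // createTable(Array.empty)        // does not compile: element type is ambiguous
     createTable(Array.empty[String])   // compiles: element type stated explicitly
   }
   ```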







[GitHub] [spark] cloud-fan commented on a diff in pull request #39942: [WIP] refine default column value framework

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #39942:
URL: https://github.com/apache/spark/pull/39942#discussion_r1102393531


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/v2ResolutionPlans.scala:
##########
@@ -195,6 +196,10 @@ case class ResolvedFieldName(path: Seq[String], field: StructField) extends Fiel
   def name: Seq[String] = path :+ field.name
 }
 
+case object RootTableSchema extends FieldName {

Review Comment:
   The reason to add it: https://github.com/apache/spark/pull/39942/files#r1102393297





[GitHub] [spark] cloud-fan commented on a diff in pull request #39942: [SPARK-42398][SQL] Refine default column value framework

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #39942:
URL: https://github.com/apache/spark/pull/39942#discussion_r1106955626


##########
sql/catalyst/src/test/scala/org/apache/spark/sql/types/StructTypeSuite.scala:
##########
@@ -471,34 +471,22 @@ class StructTypeSuite extends SparkFunSuite with SQLHelper {
     assert(source1.existenceDefaultValues(1) == UTF8String.fromString("abc"))
     assert(source1.existenceDefaultValues(2) == null)
 
-    // Positive test: StructType.defaultValues works because the existence default value parses and
-    // resolves successfully, then evaluates to a non-literal expression: this is constant-folded at

Review Comment:
   This is done in a more type-safe way, via `DefaultValueExpression`. The analyzer makes sure `DefaultValueExpression.child` is foldable, and at execution time we require `DefaultValueExpression.child` to be a literal before passing it to catalogs.
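
   A simplified sketch of that two-phase guarantee (the name follows the PR, but the bodies here are illustrative):
   
   ```scala
   import org.apache.spark.sql.catalyst.expressions.{Expression, Literal}
   
   case class DefaultValueExpression(child: Expression, originalSQL: String)
   
   object DefaultValueChecks {
     // Analysis time: anything non-foldable is rejected up front.
     def checkAnalyzed(d: DefaultValueExpression): Unit =
       require(d.child.foldable, s"DEFAULT value must be a constant: ${d.originalSQL}")
   
     // Execution time: the foldable child is reduced to a Literal before the
     // value is handed to a catalog, so catalogs only ever see constants.
     def toLiteral(d: DefaultValueExpression): Literal = d.child match {
       case lit: Literal => lit
       case other        => Literal.create(other.eval(), other.dataType)
     }
   }
   ```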





[GitHub] [spark] github-actions[bot] commented on pull request #39942: [SPARK-42398][SQL] Refine default column value framework

Posted by "github-actions[bot] (via GitHub)" <gi...@apache.org>.
github-actions[bot] commented on PR #39942:
URL: https://github.com/apache/spark/pull/39942#issuecomment-1566322960

   We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
   If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!




[GitHub] [spark] dtenedor commented on pull request #39942: [SPARK-42398][SQL] Refine default column value framework

Posted by "dtenedor (via GitHub)" <gi...@apache.org>.
dtenedor commented on PR #39942:
URL: https://github.com/apache/spark/pull/39942#issuecomment-1428974755

   Thanks @cloud-fan for working on this, I reviewed the first 20/68 files (up through and including `SQLConf.scala`).




[GitHub] [spark] cloud-fan commented on a diff in pull request #39942: [SPARK-42398][SQL] Refine default column value framework

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #39942:
URL: https://github.com/apache/spark/pull/39942#discussion_r1105523323


##########
connector/connect/common/src/test/resources/query-tests/queries/function_udf_2.12.json:
##########
@@ -21,7 +21,7 @@
           }
         }],
         "scalarScalaUdf": {
-          "payload": "rO0ABXNyAC1vcmcuYXBhY2hlLnNwYXJrLnNxbC5jb25uZWN0LmNvbW1vbi5VZGZQYWNrZXR7DRDpFmMz1QIAA0wACGZ1bmN0aW9udAASTGphdmEvbGFuZy9PYmplY3Q7TAANaW5wdXRFbmNvZGVyc3QAFkxzY2FsYS9jb2xsZWN0aW9uL1NlcTtMAA1vdXRwdXRFbmNvZGVydAA4TG9yZy9hcGFjaGUvc3Bhcmsvc3FsL2NhdGFseXN0L2VuY29kZXJzL0Fnbm9zdGljRW5jb2Rlcjt4cHNyACVvcmcuYXBhY2hlLnNwYXJrLnNxbC5UZXN0VURGcyQkYW5vbiQyWQlE7Ce2cPkCAAB4cHNyACRzY2FsYS5jb2xsZWN0aW9uLm11dGFibGUuQXJyYXlCdWZmZXIVOLBTg4KOcwIAA0kAC2luaXRpYWxTaXplSQAFc2l6ZTBbAAVhcnJheXQAE1tMamF2YS9sYW5nL09iamVjdDt4cAAAABAAAAABdXIAE1tMamF2YS5sYW5nLk9iamVjdDuQzlifEHMpbAIAAHhwAAAAAXNyAE1vcmcuYXBhY2hlLnNwYXJrLnNxbC5jYXRhbHlzdC5lbmNvZGVycy5BZ25vc3RpY0VuY29kZXJzJFByaW1pdGl2ZUxvbmdFbmNvZGVyJE3e2W1O/wUaAgAAeHIATG9yZy5hcGFjaGUuc3Bhcmsuc3FsLmNhdGFseXN0LmVuY29kZXJzLkFnbm9zdGljRW5jb2RlcnMkUHJpbWl0aXZlTGVhZkVuY29kZXLf4z0LufMQmwIAAHhyAENvcmcuYXBhY2hlLnNwYXJrLnNxbC5jYXRhbHlzdC5lbmNvZGVycy5BZ25vc3RpY0VuY29kZXJzJExlYWZFbmNvZGVyKtoOee2SqKMCAANaAAtpc1ByaW1pdGl2ZUwABmNsc1RhZ3QAGExzY2FsYS9yZWZsZWN0L0NsYXN
 zVGFnO0wACGRhdGFUeXBldAAlTG9yZy9hcGFjaGUvc3Bhcmsvc3FsL3R5cGVzL0RhdGFUeXBlO3hwAXNyACpzY2FsYS5yZWZsZWN0Lk1hbmlmZXN0RmFjdG9yeSRMb25nTWFuaWZlc3QAAAAAAAAAAQIAAHhyABxzY2FsYS5yZWZsZWN0LkFueVZhbE1hbmlmZXN0AAAAAAAAAAECAAFMAAh0b1N0cmluZ3QAEkxqYXZhL2xhbmcvU3RyaW5nO3hwdAAETG9uZ3NyACRvcmcuYXBhY2hlLnNwYXJrLnNxbC50eXBlcy5Mb25nVHlwZSStAg1XC0z3OwIAAHhwc3IARm9yZy5hcGFjaGUuc3Bhcmsuc3FsLmNhdGFseXN0LmVuY29kZXJzLkFnbm9zdGljRW5jb2RlcnMkUHJvZHVjdEVuY29kZXIcqKluUDodYQIAA0wABmNsc1RhZ3EAfgAPTAAGZmllbGRzcQB+AAJMAAZzY2hlbWF0ACdMb3JnL2FwYWNoZS9zcGFyay9zcWwvdHlwZXMvU3RydWN0VHlwZTt4cHNyACZzY2FsYS5yZWZsZWN0LkNsYXNzVGFnJEdlbmVyaWNDbGFzc1RhZwAAAAAAAAABAgABTAAMcnVudGltZUNsYXNzdAARTGphdmEvbGFuZy9DbGFzczt4cHZyAAxzY2FsYS5UdXBsZTHPDUf18JuzPAIAAUwAAl8xcQB+AAF4cHNyADJzY2FsYS5jb2xsZWN0aW9uLmltbXV0YWJsZS5MaXN0JFNlcmlhbGl6YXRpb25Qcm94eQAAAAAAAAABAwAAeHBzcgBEb3JnLmFwYWNoZS5zcGFyay5zcWwuY2F0YWx5c3QuZW5jb2RlcnMuQWdub3N0aWNFbmNvZGVycyRFbmNvZGVyRmllbGTkpQPl0rttUgIABloACG51bGxhYmxlTAADZW5jcQB+AANMAAhtZXRhZGF0YXQAJUxvcmcvYXBhY2hlL3Nw
 YXJrL3NxbC90eXBlcy9NZXRhZGF0YTtMAARuYW1lcQB+ABRMAApyZWFkTWV0aG9kdAAOTHNjYWxhL09wdGlvbjtMAAt3cml0ZU1ldGhvZHEAfgAleHAAcQB+ABFzcgAjb3JnLmFwYWNoZS5zcGFyay5zcWwudHlwZXMuTWV0YWRhdGFtjWLvldkl+wIAA0kACV9oYXNoQ29kZVoACGJpdG1hcCQwTAADbWFwdAAgTHNjYWxhL2NvbGxlY3Rpb24vaW1tdXRhYmxlL01hcDt4cAAAAAAAc3IAKHNjYWxhLmNvbGxlY3Rpb24uaW1tdXRhYmxlLk1hcCRFbXB0eU1hcCSx6xuFbkKAywIAAHhwdAACXzFzcgALc2NhbGEuTm9uZSRGUCT2U8qUrAIAAHhyAAxzY2FsYS5PcHRpb27+aTf92w5mdAIAAHhwcQB+AC9zcgAsc2NhbGEuY29sbGVjdGlvbi5pbW11dGFibGUuTGlzdFNlcmlhbGl6ZUVuZCSKXGNb91MLbQIAAHhweHNyACVvcmcuYXBhY2hlLnNwYXJrLnNxbC50eXBlcy5TdHJ1Y3RUeXBl9I34EN1tjkUCAAlJAAlfaGFzaENvZGVCAAhiaXRtYXAkMFoAGWhhc0V4aXN0ZW5jZURlZmF1bHRWYWx1ZXNbABZleGlzdGVuY2VEZWZhdWx0VmFsdWVzcQB+AAhbABhleGlzdGVuY2VEZWZhdWx0c0JpdG1hc2t0AAJbWkwADWZpZWxkTmFtZXNTZXR0ACBMc2NhbGEvY29sbGVjdGlvbi9pbW11dGFibGUvU2V0O1sABmZpZWxkc3QAKVtMb3JnL2FwYWNoZS9zcGFyay9zcWwvdHlwZXMvU3RydWN0RmllbGQ7TAALbmFtZVRvRmllbGR0ABZMc2NhbGEvY29sbGVjdGlvbi9NYXA7TAALbmFtZVRvSW5kZXhxAH4ANnhwAAAAAAAAcHBwdXIAKVtMb3JnLmFwY
 WNoZS5zcGFyay5zcWwudHlwZXMuU3RydWN0RmllbGQ7tWPEaGAaDUcCAAB4cAAAAAFzcgAmb3JnLmFwYWNoZS5zcGFyay5zcWwudHlwZXMuU3RydWN0RmllbGQrgSSJZ9l3nwIABFoACG51bGxhYmxlTAAIZGF0YVR5cGVxAH4AEEwACG1ldGFkYXRhcQB+ACRMAARuYW1lcQB+ABR4cABxAH4AGHEAfgApcQB+ACxwcA==",

Review Comment:
   this change is because some methods in `StructType` are now marked as private.
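   
   As context for why a base64 blob in a test resource changes at all: Java serialization encodes the class structure into the stream (its fields, plus a default serialVersionUID hash computed over non-private members), so adding fields or changing member visibility alters the bytes. A minimal, self-contained sketch, with a made-up `Box` class standing in for `StructType`:
   
   ```scala
   import java.io.{ByteArrayOutputStream, ObjectOutputStream}
   import java.util.Base64

   // Illustrative stand-in class; Scala case classes are Serializable by default.
   case class Box(value: Long)

   object SerializationDemo {
     def main(args: Array[String]): Unit = {
       val bos = new ByteArrayOutputStream()
       val oos = new ObjectOutputStream(bos)
       oos.writeObject(Box(42L))
       oos.close()
       // Re-run after changing Box's members or their visibility:
       // the encoded payload differs, just like the diff above.
       println(Base64.getEncoder.encodeToString(bos.toByteArray))
     }
   }
   ```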





[GitHub] [spark] cloud-fan commented on a diff in pull request #39942: [SPARK-42398][SQL] Refine default column value framework

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #39942:
URL: https://github.com/apache/spark/pull/39942#discussion_r1105545893


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala:
##########
@@ -1429,4 +1435,97 @@ trait CheckAnalysis extends PredicateHelper with LookupCatalog with QueryErrorsB
       case _ =>
     }
   }
+
+  def checkDefaultValuesInPlan(plan: LogicalPlan, isForV1: Boolean = false): Unit = {
+    // Do not check anything if the children are not resolved yet.
+    if (!plan.childrenResolved) return
+    plan match {
+      case AlterColumn(t: ResolvedTable, col: ResolvedFieldName, _, _, _, _,
+          Some(default: DefaultValueExpression)) =>
+        checkTableProvider(t.catalog, t.name, getTableProviderFromProp(t.table.properties()))
+        checkDefaultValue(default, t.name, col.name, col.field.dataType, isForV1)
+
+      case cmd: V2CreateTablePlan if cmd.columns.exists(_.defaultValue.isDefined) =>
+        val ident = cmd.name.asInstanceOf[ResolvedIdentifier]

Review Comment:
   It would make the `case` statement very long and hard to read. Let me improve it a bit.
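   
   For illustration, a minimal, self-contained sketch of that readability fix: the long validation body moves into a private helper so the match arm stays short. `Plan`, `CreateTable`, and `checkCreateDefaults` are illustrative stand-ins, not the actual Spark classes.
   
   ```scala
   sealed trait Plan
   final case class CreateTable(columns: Seq[(String, Option[String])]) extends Plan

   object CheckDefaults {
     def check(plan: Plan): Unit = plan match {
       // The arm stays one line; the long body lives in the helper below.
       case c: CreateTable if c.columns.exists(_._2.isDefined) =>
         checkCreateDefaults(c)
       case _ => ()
     }

     private def checkCreateDefaults(c: CreateTable): Unit = {
       c.columns.foreach { case (name, default) =>
         default.foreach(d => require(d.nonEmpty, s"invalid DEFAULT value for column $name"))
       }
     }
   }
   ```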





[GitHub] [spark] cloud-fan commented on a diff in pull request #39942: [WIP] refine default column value framework

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #39942:
URL: https://github.com/apache/spark/pull/39942#discussion_r1102422842


##########
sql/core/src/test/scala/org/apache/spark/sql/sources/InsertSuite.scala:
##########
@@ -1043,9 +1043,8 @@ class InsertSuite extends DataSourceTest with SharedSparkSession {
 
   test("SPARK-38336 INSERT INTO statements with tables with default columns: negative tests") {
     object Errors {
-      val COMMON_SUBSTRING = " has a DEFAULT value"
+      val COMMON_SUBSTRING = "has an invalid DEFAULT value"

Review Comment:
   TODO: we should move the default value tests to a new suite and use the new error framework to check errors.
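   
   A hedged sketch of what that TODO could look like in a dedicated suite: assert on a structured error class instead of a message substring. This is a suite fragment, not runnable standalone; `checkError` mirrors the helper in Spark's test framework, and the error class name, SQL, and parameters are illustrative assumptions.
   
   ```scala
   test("INSERT INTO with an invalid DEFAULT column value") {
     val e = intercept[AnalysisException] {
       sql("CREATE TABLE t(i INT DEFAULT (SELECT 1)) USING parquet")
     }
     checkError(
       exception = e,
       errorClass = "INVALID_DEFAULT_VALUE.SUBQUERY_EXPRESSION", // illustrative name
       parameters = Map("colName" -> "`i`")) // illustrative parameters
   }
   ```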





[GitHub] [spark] cloud-fan commented on a diff in pull request #39942: [WIP] refine default column value framework

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #39942:
URL: https://github.com/apache/spark/pull/39942#discussion_r1102412729


##########
sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/catalog/SessionCatalogSuite.scala:
##########
@@ -122,72 +121,6 @@ abstract class SessionCatalogSuite extends AnalysisTest with Eventually {
     assert(e.contains(s"`$name` is not a valid name for tables/databases."))
   }
 
-  test("create table with default columns") {
-    def test: Unit = withBasicCatalog { catalog =>
-      assert(catalog.externalCatalog.listTables("db1").isEmpty)
-      assert(catalog.externalCatalog.listTables("db2").toSet == Set("tbl1", "tbl2"))
-      catalog.createTable(newTable(
-        "tbl3", Some("db1"), defaultColumns = true), ignoreIfExists = false)
-      catalog.createTable(newTable(
-        "tbl3", Some("db2"), defaultColumns = true), ignoreIfExists = false)
-      assert(catalog.externalCatalog.listTables("db1").toSet == Set("tbl3"))
-      assert(catalog.externalCatalog.listTables("db2").toSet == Set("tbl1", "tbl2", "tbl3"))
-      // Inspect the default column values.
-      val db1tbl3 = catalog.externalCatalog.getTable("db1", "tbl3")
-      val currentDefault = ResolveDefaultColumns.CURRENT_DEFAULT_COLUMN_METADATA_KEY
-
-      def findField(name: String, schema: StructType): StructField =
-        schema.fields.filter(_.name == name).head
-      val columnA: StructField = findField("a", db1tbl3.schema)
-      val columnB: StructField = findField("b", db1tbl3.schema)
-      val columnC: StructField = findField("c", db1tbl3.schema)
-      val columnD: StructField = findField("d", db1tbl3.schema)
-      val columnE: StructField = findField("e", db1tbl3.schema)
-
-      val defaultValueColumnA: String = columnA.metadata.getString(currentDefault)
-      val defaultValueColumnB: String = columnB.metadata.getString(currentDefault)
-      val defaultValueColumnC: String = columnC.metadata.getString(currentDefault)
-      val defaultValueColumnD: String = columnD.metadata.getString(currentDefault)
-      val defaultValueColumnE: String = columnE.metadata.getString(currentDefault)
-
-      assert(defaultValueColumnA == "42")
-      assert(defaultValueColumnB == "\"abc\"")
-      assert(defaultValueColumnC == "_@#$%")
-      assert(defaultValueColumnD == "(select min(x) from badtable)")
-      assert(defaultValueColumnE == "41 + 1")
-
-      // Analyze the default column values.
-      val statementType = "CREATE TABLE"
-      assert(ResolveDefaultColumns.analyze(columnA, statementType).sql == "42")

Review Comment:
   No need to test this specific analyzer any more: the default expression is now resolved together with the commands by the main analyzer.
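   
   To make that concrete, a hedged usage sketch (assuming an active `spark` session and the default-column feature enabled): the DEFAULT expression is analyzed as part of the command itself, so a constant expression like `41 + 1` goes through the ordinary analysis path.
   
   ```scala
   spark.sql("CREATE TABLE t(e INT DEFAULT 41 + 1) USING parquet")
   spark.sql("INSERT INTO t VALUES (DEFAULT)")
   spark.sql("SELECT * FROM t").show() // expect a single row with e = 42
   ```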





[GitHub] [spark] cloud-fan commented on a diff in pull request #39942: [WIP] refine default column value framework

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #39942:
URL: https://github.com/apache/spark/pull/39942#discussion_r1102895359


##########
connector/docker-integration-tests/src/test/scala/org/apache/spark/sql/jdbc/v2/V2JDBCTest.scala:
##########
@@ -48,18 +48,16 @@ private[v2] trait V2JDBCTest extends SharedSparkSession with DockerIntegrationFu
 
   def notSupportsTableComment: Boolean = false
 
-  val defaultMetadata = new MetadataBuilder().putLong("scale", 0).build()

Review Comment:
   With the new `Column` abstraction, JDBC tables no longer need to expose this internal special field metadata, which is only used by `JDBCRDD`.
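   
   A hedged before/after sketch of that simplification: the test's expected schema previously had to carry connector-internal metadata (the `scale` key consumed by `JDBCRDD`); with the v2 `Column` abstraction it no longer does. Column name and type are illustrative.
   
   ```scala
   import org.apache.spark.sql.types._

   // Before: internal metadata leaked into the expected schema.
   val internalMetadata = new MetadataBuilder().putLong("scale", 0).build()
   val before = new StructType().add("ID", IntegerType, nullable = true, internalMetadata)

   // After: the expected schema carries no connector-internal details.
   val after = new StructType().add("ID", IntegerType, nullable = true)
   ```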





[GitHub] [spark] cloud-fan commented on a diff in pull request #39942: [SPARK-42398][SQL] Refine default column value framework

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #39942:
URL: https://github.com/apache/spark/pull/39942#discussion_r1106934399


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/Column.java:
##########
@@ -0,0 +1,78 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.catalog;
+
+import javax.annotation.Nullable;
+
+import org.apache.spark.sql.connector.expressions.Literal;
+import org.apache.spark.sql.internal.connector.ColumnImpl;
+import org.apache.spark.sql.types.DataType;
+
+public interface Column {

Review Comment:
   done





[GitHub] [spark] cloud-fan commented on a diff in pull request #39942: [SPARK-42398][SQL] Refine default column value framework

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #39942:
URL: https://github.com/apache/spark/pull/39942#discussion_r1106943619


##########
sql/catalyst/src/main/java/org/apache/spark/sql/connector/catalog/ColumnDefaultValue.java:
##########
@@ -0,0 +1,61 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.connector.catalog;
+
+import java.util.Objects;
+import javax.annotation.Nonnull;
+
+import org.apache.spark.sql.connector.expressions.Literal;
+
+public class ColumnDefaultValue {

Review Comment:
   done





[GitHub] [spark] cloud-fan commented on a diff in pull request #39942: [SPARK-42398][SQL] Refine default column value framework

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #39942:
URL: https://github.com/apache/spark/pull/39942#discussion_r1106968795


##########
sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/v2/DescribeTableExec.scala:
##########
@@ -37,6 +38,7 @@ case class DescribeTableExec(
     addPartitioning(rows)
 
     if (isExtended) {
+      addColumnDefaultValue(rows)

Review Comment:
   Yes, I think this is more important than general table properties like owner and location.
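   
   A hedged illustration of the behavior under discussion (assuming an active `spark` session); the section name in the expected output is an assumption, not verbatim.
   
   ```scala
   spark.sql("CREATE TABLE t(i INT DEFAULT 42) USING parquet")
   spark.sql("DESCRIBE TABLE EXTENDED t").show(truncate = false)
   // Expected to include a section mapping column `i` to its default, 42.
   ```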





[GitHub] [spark] cloud-fan commented on a diff in pull request #39942: [SPARK-42398][SQL] Refine default column value framework

Posted by "cloud-fan (via GitHub)" <gi...@apache.org>.
cloud-fan commented on code in PR #39942:
URL: https://github.com/apache/spark/pull/39942#discussion_r1110153301


##########
sql/catalyst/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala:
##########
@@ -3108,7 +3108,7 @@ object SQLConf {
         "provided values when the corresponding fields are not present in storage.")
       .version("3.4.0")
       .stringConf
-      .createWithDefault("csv,json,orc,parquet")
+      .createWithDefault("csv,json,orc,parquet,hive")

Review Comment:
   ```suggestion
         .createWithDefault("csv,json,orc,parquet")
   ```
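   
   With the suggestion applied, Hive tables stay off the built-in provider list, but a session can still opt in explicitly. A hedged sketch, assuming the config key this entry defines is `spark.sql.defaultColumn.allowedProviders`:
   
   ```scala
   // Opt a session into DEFAULT column support for Hive tables.
   spark.conf.set("spark.sql.defaultColumn.allowedProviders", "csv,json,orc,parquet,hive")
   ```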


