Posted to reviews@spark.apache.org by rdblue <gi...@git.apache.org> on 2018/08/08 18:37:16 UTC

[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...

GitHub user rdblue opened a pull request:

    https://github.com/apache/spark/pull/22043

    [SPARK-24251][SQL] Add analysis tests for AppendData.

    ## What changes were proposed in this pull request?
    
    This is a follow-up to #21305 that adds a test suite for AppendData analysis.
    
    ## How was this patch tested?
    
    This PR adds a test suite for AppendData analysis.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/rdblue/spark SPARK-24251-add-append-data-analysis-tests

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/22043.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #22043
    
----
commit e58d4fc666aa3d13c5d24af3823afc4c4bc31535
Author: Ryan Blue <bl...@...>
Date:   2018-08-08T18:33:17Z

    SPARK-24251: Add analysis tests for AppendData.

----


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark issue #22043: [SPARK-24251][SQL] Add analysis tests for AppendData.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22043
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1964/
    Test PASSed.


---



[GitHub] spark issue #22043: [SPARK-24251][SQL] Add analysis tests for AppendData.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22043
  
    Merged build finished. Test PASSed.


---



[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...

Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22043#discussion_r208989784
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
    @@ -0,0 +1,411 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.analysis
    +
    +import java.util.Locale
    +
    +import org.apache.spark.sql.catalyst.expressions.{Alias, AttributeReference, Cast, UpCast}
    +import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LeafNode, LogicalPlan, Project}
    +import org.apache.spark.sql.types.{DoubleType, FloatType, StructField, StructType}
    +
    +case class TestRelation(output: Seq[AttributeReference]) extends LeafNode with NamedRelation {
    +  override def name: String = "table-name"
    +}
    +
    +class DataSourceV2AnalysisSuite extends AnalysisTest {
    +  val table = TestRelation(StructType(Seq(
    +    StructField("x", FloatType),
    +    StructField("y", FloatType))).toAttributes)
    +
    +  val requiredTable = TestRelation(StructType(Seq(
    +    StructField("x", FloatType, nullable = false),
    +    StructField("y", FloatType, nullable = false))).toAttributes)
    +
    +  val widerTable = TestRelation(StructType(Seq(
    +    StructField("x", DoubleType),
    +    StructField("y", DoubleType))).toAttributes)
    +
    +  test("Append.byName: basic behavior") {
    +    val query = TestRelation(table.schema.toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    checkAnalysis(parsedPlan, parsedPlan)
    +    assertResolved(parsedPlan)
    +  }
    +
    +  test("Append.byName: does not match by position") {
    --- End diff --
    
    Yes, this tests that a query which would succeed when matching by position fails when matching by name.
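    The distinction can be sketched with plain Scala collections. This is a standalone illustration, not Spark's actual resolution code; `matchByPosition` and `matchByName` are hypothetical helpers:

    ```scala
    // Standalone sketch: why a query that lines up by position can still fail by name.
    // Spark's real logic lives in the analyzer (ResolveOutputRelation); this only
    // illustrates the matching semantics under test.
    case class Col(name: String)

    // Positional matching: only the arity has to agree.
    def matchByPosition(table: Seq[Col], query: Seq[Col]): Option[Seq[(Col, Col)]] =
      if (table.size == query.size) Some(table.zip(query)) else None

    // Name matching: every table column must find a query column with the same name.
    def matchByName(table: Seq[Col], query: Seq[Col]): Option[Seq[(Col, Col)]] = {
      val byName = query.map(c => c.name -> c).toMap
      val matched = table.flatMap(t => byName.get(t.name).map(q => (t, q)))
      if (matched.size == table.size) Some(matched) else None
    }

    val table = Seq(Col("x"), Col("y"))
    val query = Seq(Col("a"), Col("b")) // same arity, different names

    assert(matchByPosition(table, query).isDefined) // positional matching succeeds
    assert(matchByName(table, query).isEmpty)       // name matching finds neither 'x' nor 'y'
    ```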


---



[GitHub] spark issue #22043: [SPARK-24251][SQL] Add analysis tests for AppendData.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22043
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2013/
    Test PASSed.


---



[GitHub] spark issue #22043: [SPARK-24251][SQL] Add analysis tests for AppendData.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22043
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22043: [SPARK-24251][SQL] Add analysis tests for AppendData.

Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on the issue:

    https://github.com/apache/spark/pull/22043
  
    Thanks for reviewing, @cloud-fan!


---



[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22043#discussion_r208899807
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
    @@ -0,0 +1,411 @@
    [... quoted license header and earlier test code elided ...]
    +  test("Append.byName: does not match by position") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("a", FloatType),
    +      StructField("b", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'", "'y'"))
    +  }
    +
    +  test("Append.byName: case sensitive column resolution") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("X", FloatType), // doesn't match case!
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'"))
    +  }
    +
    +  test("Append.byName: case insensitive column resolution") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("X", FloatType), // doesn't match case!
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val X = query.output.toIndexedSeq(0)
    --- End diff --
    
    can't we just call `query.output.head`?
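    For reference, the two forms are equivalent on a `Seq`; `head` simply avoids the intermediate `IndexedSeq` conversion (a trivial standalone check, not Spark code):

    ```scala
    // output.toIndexedSeq(0) and output.head return the same element.
    val output = Seq("X", "y")
    assert(output.toIndexedSeq(0) == output.head)
    assert(output.toIndexedSeq(1) == output(1)) // Seq.apply also works without the conversion
    ```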


---



[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22043#discussion_r208896182
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
    @@ -0,0 +1,411 @@
    [... quoted license header and earlier test code elided ...]
    +  test("Append.byName: case insensitive column resolution") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("X", FloatType), // doesn't match case!
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val X = query.output.toIndexedSeq(0)
    --- End diff --
    
    can't we just call `query.output.head`?


---



[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...

Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22043#discussion_r209016303
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
    @@ -0,0 +1,411 @@
    [... quoted license header and earlier test code elided ...]
    +  test("Append.byName: case insensitive column resolution") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("X", FloatType), // doesn't match case!
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val X = query.output.toIndexedSeq(0)
    +    val y = query.output.toIndexedSeq(1)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +    val expectedPlan = AppendData.byName(table,
    +      Project(Seq(
    +        Alias(Cast(toLower(X), FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
    +        Alias(Cast(y, FloatType, Some(conf.sessionLocalTimeZone)), "y")()),
    +        query))
    +
    +    assertNotResolved(parsedPlan)
    +    checkAnalysis(parsedPlan, expectedPlan, caseSensitive = false)
    +    assertResolved(expectedPlan)
    +  }
    +
    +  test("Append.byName: data columns are reordered by name") {
    +    // out of order
    +    val query = TestRelation(StructType(Seq(
    +      StructField("y", FloatType),
    +      StructField("x", FloatType))).toAttributes)
    +
    +    val y = query.output.toIndexedSeq(0)
    +    val x = query.output.toIndexedSeq(1)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +    val expectedPlan = AppendData.byName(table,
    +      Project(Seq(
    +        Alias(Cast(x, FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
    +        Alias(Cast(y, FloatType, Some(conf.sessionLocalTimeZone)), "y")()),
    +        query))
    +
    +    assertNotResolved(parsedPlan)
    +    checkAnalysis(parsedPlan, expectedPlan)
    +    assertResolved(expectedPlan)
    +  }
    +
    +  test("Append.byName: fail nullable data written to required columns") {
    +    val parsedPlan = AppendData.byName(requiredTable, table)
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot write nullable values to non-null column", "'x'", "'y'"))
    +  }
    +
    +  test("Append.byName: allow required data written to nullable columns") {
    +    val parsedPlan = AppendData.byName(table, requiredTable)
    +    assertResolved(parsedPlan)
    +    checkAnalysis(parsedPlan, parsedPlan)
    +  }
    +
    +  test("Append.byName: missing columns are identified by name") {
    +    // missing optional field x
    +    val query = TestRelation(StructType(Seq(
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'"))
    +  }
    +
    +  test("Append.byName: missing required columns cause failure and are identified by name") {
    +    // missing required field x
    +    val query = TestRelation(StructType(Seq(
    +      StructField("y", FloatType, nullable = false))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(requiredTable, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'"))
    +  }
    +
    +  test("Append.byName: missing optional columns cause failure and are identified by name") {
    +    // missing optional field x
    +    val query = TestRelation(StructType(Seq(
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'"))
    +  }
    +
    +  test("Append.byName: fail canWrite check") {
    +    val parsedPlan = AppendData.byName(table, widerTable)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write", "'table-name'",
    +      "Cannot safely cast", "'x'", "'y'", "DoubleType to FloatType"))
    +  }
    +
    +  test("Append.byName: insert safe cast") {
    +    val x = table.output.toIndexedSeq(0)
    +    val y = table.output.toIndexedSeq(1)
    +
    +    val parsedPlan = AppendData.byName(widerTable, table)
    +    val expectedPlan = AppendData.byName(widerTable,
    +      Project(Seq(
    +        Alias(Cast(x, DoubleType, Some(conf.sessionLocalTimeZone)), "x")(),
    +        Alias(Cast(y, DoubleType, Some(conf.sessionLocalTimeZone)), "y")()),
    +        table))
    +
    +    assertNotResolved(parsedPlan)
    +    checkAnalysis(parsedPlan, expectedPlan)
    +    assertResolved(expectedPlan)
    +  }
    +
    +  test("Append.byName: fail extra data fields") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("x", FloatType),
    +      StructField("y", FloatType),
    +      StructField("z", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write", "'table-name'", "too many data columns",
    +      "Table columns: 'x', 'y'",
    +      "Data columns: 'x', 'y', 'z'"))
    +  }
    +
    +  test("Append.byName: multiple field errors are reported") {
    +    val xRequiredTable = TestRelation(StructType(Seq(
    +      StructField("x", FloatType, nullable = false),
    +      StructField("y", DoubleType))).toAttributes)
    +
    +    val query = TestRelation(StructType(Seq(
    +      StructField("x", DoubleType),
    +      StructField("b", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(xRequiredTable, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot safely cast", "'x'", "DoubleType to FloatType",
    +      "Cannot write nullable values to non-null column", "'x'",
    +      "Cannot find data for output column", "'y'"))
    +  }
    +
    +  test("Append.byPosition: basic behavior") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("a", FloatType),
    +      StructField("b", FloatType))).toAttributes)
    +
    +    val a = query.output.toIndexedSeq(0)
    +    val b = query.output.toIndexedSeq(1)
    +
    +    val parsedPlan = AppendData.byPosition(table, query)
    +    val expectedPlan = AppendData.byPosition(table,
    +      Project(Seq(
    +        Alias(Cast(a, FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
    +        Alias(Cast(b, FloatType, Some(conf.sessionLocalTimeZone)), "y")()),
    +        query))
    +
    +    assertNotResolved(parsedPlan)
    +    checkAnalysis(parsedPlan, expectedPlan, caseSensitive = false)
    +    assertResolved(expectedPlan)
    +  }
    +
    +  test("Append.byPosition: case does not fail column resolution") {
    --- End diff --
    
    Removed.


---



[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22043#discussion_r208884689
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala ---
    @@ -367,6 +367,7 @@ case class AppendData(
           case (inAttr, outAttr) =>
               // names and types must match, nullability must be compatible
               inAttr.name == outAttr.name &&
    +          inAttr.resolved && outAttr.resolved &&
    --- End diff --
    
    I think it's more clear to write `table.resolved && query.resolved && query.output.size == table.output.size && ...`
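    The suggested structure can be sketched in isolation. This is illustrative only; `Attr`, `Rel`, and `appendResolved` are simplified stand-ins for Spark's attributes, relations, and the `AppendData.resolved` definition, which also checks types and nullability:

    ```scala
    // Illustrative shape of the suggested `resolved` check: confirm both sides are
    // resolved and the sizes line up before comparing attributes pairwise.
    case class Attr(name: String, resolved: Boolean)
    case class Rel(output: Seq[Attr]) {
      def resolved: Boolean = output.forall(_.resolved)
    }

    def appendResolved(table: Rel, query: Rel): Boolean =
      table.resolved && query.resolved &&
        query.output.size == table.output.size &&
        query.output.zip(table.output).forall {
          case (inAttr, outAttr) => inAttr.name == outAttr.name // plus type/nullability checks
        }

    val t = Rel(Seq(Attr("x", resolved = true), Attr("y", resolved = true)))
    val q = Rel(Seq(Attr("x", resolved = true), Attr("y", resolved = true)))
    assert(appendResolved(t, q))
    // An unresolved attribute on the query side short-circuits the whole check.
    assert(!appendResolved(t, Rel(Seq(Attr("x", resolved = false), Attr("y", resolved = true)))))
    ```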


---



[GitHub] spark issue #22043: [SPARK-24251][SQL] Add analysis tests for AppendData.

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:

    https://github.com/apache/spark/pull/22043
  
    thanks, merging to master!


---



[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22043#discussion_r208896977
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
    @@ -0,0 +1,411 @@
    [... quoted license header and earlier test code elided ...]
    +  test("Append.byName: case insensitive column resolution") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("X", FloatType), // doesn't match case!
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val X = query.output.toIndexedSeq(0)
    +    val y = query.output.toIndexedSeq(1)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +    val expectedPlan = AppendData.byName(table,
    +      Project(Seq(
    +        Alias(Cast(toLower(X), FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
    --- End diff --
    
    where do we lowercase the attribute name in `ResolveOutputRelation`?
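    For context, the kind of case-insensitive by-name matching being asked about can be sketched like this (a hypothetical helper with made-up names, not the actual `ResolveOutputRelation` code — it illustrates matching each table column to a query column under the session's case sensitivity, rather than lowercasing the attribute itself):

    ```scala
    // Hypothetical sketch of by-name output resolution (not Spark's code).
    // For each table column, find the query column whose name matches under
    // the configured case sensitivity; report table columns with no match.
    object ByNameMatchSketch {
      case class Col(name: String)

      // Returns the query columns reordered to the table's column order,
      // or the names of table columns for which no data column was found.
      def matchByName(
          tableCols: Seq[Col],
          queryCols: Seq[Col],
          caseSensitive: Boolean): Either[Seq[String], Seq[Col]] = {
        val matched = tableCols.map { out =>
          out.name -> queryCols.find { in =>
            if (caseSensitive) in.name == out.name
            else in.name.equalsIgnoreCase(out.name)
          }
        }
        val missing = matched.collect { case (name, None) => name }
        if (missing.nonEmpty) Left(missing) else Right(matched.flatMap(_._2))
      }
    }
    ```

    Under this sketch, `X` fails to match `x` only when `caseSensitive` is true, which mirrors the behavior the two case-resolution tests above are checking.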


---

---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22043#discussion_r208899867
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
    @@ -0,0 +1,411 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.analysis
    +
    +import java.util.Locale
    +
    +import org.apache.spark.sql.catalyst.expressions.{Alias, AttributeReference, Cast, UpCast}
    +import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LeafNode, LogicalPlan, Project}
    +import org.apache.spark.sql.types.{DoubleType, FloatType, StructField, StructType}
    +
    +case class TestRelation(output: Seq[AttributeReference]) extends LeafNode with NamedRelation {
    +  override def name: String = "table-name"
    +}
    +
    +class DataSourceV2AnalysisSuite extends AnalysisTest {
    +  val table = TestRelation(StructType(Seq(
    +    StructField("x", FloatType),
    +    StructField("y", FloatType))).toAttributes)
    +
    +  val requiredTable = TestRelation(StructType(Seq(
    +    StructField("x", FloatType, nullable = false),
    +    StructField("y", FloatType, nullable = false))).toAttributes)
    +
    +  val widerTable = TestRelation(StructType(Seq(
    +    StructField("x", DoubleType),
    +    StructField("y", DoubleType))).toAttributes)
    +
    +  test("Append.byName: basic behavior") {
    +    val query = TestRelation(table.schema.toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    checkAnalysis(parsedPlan, parsedPlan)
    +    assertResolved(parsedPlan)
    +  }
    +
    +  test("Append.byName: does not match by position") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("a", FloatType),
    +      StructField("b", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'", "'y'"))
    +  }
    +
    +  test("Append.byName: case sensitive column resolution") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("X", FloatType), // doesn't match case!
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'"))
    +  }
    +
    +  test("Append.byName: case insensitive column resolution") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("X", FloatType), // doesn't match case!
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val X = query.output.toIndexedSeq(0)
    +    val y = query.output.toIndexedSeq(1)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +    val expectedPlan = AppendData.byName(table,
    +      Project(Seq(
    +        Alias(Cast(toLower(X), FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
    +        Alias(Cast(y, FloatType, Some(conf.sessionLocalTimeZone)), "y")()),
    +        query))
    +
    +    assertNotResolved(parsedPlan)
    +    checkAnalysis(parsedPlan, expectedPlan, caseSensitive = false)
    +    assertResolved(expectedPlan)
    +  }
    +
    +  test("Append.byName: data columns are reordered by name") {
    +    // out of order
    +    val query = TestRelation(StructType(Seq(
    +      StructField("y", FloatType),
    +      StructField("x", FloatType))).toAttributes)
    +
    +    val y = query.output.toIndexedSeq(0)
    +    val x = query.output.toIndexedSeq(1)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +    val expectedPlan = AppendData.byName(table,
    +      Project(Seq(
    +        Alias(Cast(x, FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
    +        Alias(Cast(y, FloatType, Some(conf.sessionLocalTimeZone)), "y")()),
    +        query))
    +
    +    assertNotResolved(parsedPlan)
    +    checkAnalysis(parsedPlan, expectedPlan)
    +    assertResolved(expectedPlan)
    +  }
    +
    +  test("Append.byName: fail nullable data written to required columns") {
    +    val parsedPlan = AppendData.byName(requiredTable, table)
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot write nullable values to non-null column", "'x'", "'y'"))
    +  }
    +
    +  test("Append.byName: allow required data written to nullable columns") {
    +    val parsedPlan = AppendData.byName(table, requiredTable)
    +    assertResolved(parsedPlan)
    +    checkAnalysis(parsedPlan, parsedPlan)
    +  }
    +
    +  test("Append.byName: missing columns are identified by name") {
    +    // missing optional field x
    +    val query = TestRelation(StructType(Seq(
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'"))
    +  }
    +
    +  test("Append.byName: missing required columns cause failure and are identified by name") {
    +    // missing required field x
    +    val query = TestRelation(StructType(Seq(
    +      StructField("y", FloatType, nullable = false))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(requiredTable, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'"))
    +  }
    +
    +  test("Append.byName: missing optional columns cause failure and are identified by name") {
    +    // missing optional field x
    +    val query = TestRelation(StructType(Seq(
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'"))
    +  }
    +
    +  test("Append.byName: fail canWrite check") {
    +    val parsedPlan = AppendData.byName(table, widerTable)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write", "'table-name'",
    +      "Cannot safely cast", "'x'", "'y'", "DoubleType to FloatType"))
    +  }
    +
    +  test("Append.byName: insert safe cast") {
    +    val x = table.output.toIndexedSeq(0)
    +    val y = table.output.toIndexedSeq(1)
    +
    +    val parsedPlan = AppendData.byName(widerTable, table)
    +    val expectedPlan = AppendData.byName(widerTable,
    +      Project(Seq(
    +        Alias(Cast(x, DoubleType, Some(conf.sessionLocalTimeZone)), "x")(),
    +        Alias(Cast(y, DoubleType, Some(conf.sessionLocalTimeZone)), "y")()),
    +        table))
    +
    +    assertNotResolved(parsedPlan)
    +    checkAnalysis(parsedPlan, expectedPlan)
    +    assertResolved(expectedPlan)
    +  }
    +
    +  test("Append.byName: fail extra data fields") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("x", FloatType),
    +      StructField("y", FloatType),
    +      StructField("z", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write", "'table-name'", "too many data columns",
    +      "Table columns: 'x', 'y'",
    +      "Data columns: 'x', 'y', 'z'"))
    +  }
    +
    +  test("Append.byName: multiple field errors are reported") {
    +    val xRequiredTable = TestRelation(StructType(Seq(
    +      StructField("x", FloatType, nullable = false),
    +      StructField("y", DoubleType))).toAttributes)
    +
    +    val query = TestRelation(StructType(Seq(
    +      StructField("x", DoubleType),
    +      StructField("b", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(xRequiredTable, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot safely cast", "'x'", "DoubleType to FloatType",
    +      "Cannot write nullable values to non-null column", "'x'",
    +      "Cannot find data for output column", "'y'"))
    +  }
    +
    +  test("Append.byPosition: basic behavior") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("a", FloatType),
    +      StructField("b", FloatType))).toAttributes)
    +
    +    val a = query.output.toIndexedSeq(0)
    +    val b = query.output.toIndexedSeq(1)
    +
    +    val parsedPlan = AppendData.byPosition(table, query)
    +    val expectedPlan = AppendData.byPosition(table,
    +      Project(Seq(
    +        Alias(Cast(a, FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
    +        Alias(Cast(b, FloatType, Some(conf.sessionLocalTimeZone)), "y")()),
    +        query))
    +
    +    assertNotResolved(parsedPlan)
    +    checkAnalysis(parsedPlan, expectedPlan, caseSensitive = false)
    +    assertResolved(expectedPlan)
    +  }
    +
    +  test("Append.byPosition: case does not fail column resolution") {
    --- End diff --
    
    do we need this test? In "Append.byPosition: basic behavior" we already showed that append works even when the column names are different.
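    The point is that by-position matching never consults column names at all — a minimal sketch (hypothetical helper, not Spark's actual code) makes that explicit:

    ```scala
    // Hypothetical sketch of by-position output resolution (not Spark's
    // code): table and query columns are simply zipped in order, so names
    // play no role — only arity (and, in Spark, type compatibility) matter.
    object ByPositionMatchSketch {
      case class Col(name: String)

      def matchByPosition(
          tableCols: Seq[Col],
          queryCols: Seq[Col]): Either[String, Seq[(Col, Col)]] = {
        if (queryCols.length != tableCols.length) {
          Left(s"expected ${tableCols.length} columns, got ${queryCols.length}")
        } else {
          Right(tableCols.zip(queryCols)) // pair columns purely by index
        }
      }
    }
    ```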


---



[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22043#discussion_r208898898
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
    @@ -0,0 +1,411 @@
    +  test("Append.byName: missing columns are identified by name") {
    +    // missing optional field x
    +    val query = TestRelation(StructType(Seq(
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'"))
    +  }
    +
    +  test("Append.byName: missing required columns cause failure and are identified by name") {
    +    // missing required field x
    +    val query = TestRelation(StructType(Seq(
    +      StructField("y", FloatType, nullable = false))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(requiredTable, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'"))
    +  }
    +
    +  test("Append.byName: missing optional columns cause failure and are identified by name") {
    --- End diff --
    
    this test is identical to "Append.byName: missing columns are identified by name"


---



[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22043#discussion_r208898738
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
    @@ -0,0 +1,411 @@
    +  test("Append.byName: missing columns are identified by name") {
    +    // missing optional field x
    +    val query = TestRelation(StructType(Seq(
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'"))
    +  }
    +
    +  test("Append.byName: missing required columns cause failure and are identified by name") {
    --- End diff --
    
    is there really a difference between missing required columns and missing optional columns?


---



[GitHub] spark issue #22043: [SPARK-24251][SQL] Add analysis tests for AppendData.

Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on the issue:

    https://github.com/apache/spark/pull/22043
  
    @cloud-fan, here are tests to validate the analysis of AppendData logical plans.


---



[GitHub] spark issue #22043: [SPARK-24251][SQL] Add analysis tests for AppendData.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22043
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94516/
    Test PASSed.


---



[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22043#discussion_r208895537
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
    @@ -0,0 +1,411 @@
    +  test("Append.byName: basic behavior") {
    +    val query = TestRelation(table.schema.toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    checkAnalysis(parsedPlan, parsedPlan)
    +    assertResolved(parsedPlan)
    +  }
    +
    +  test("Append.byName: does not match by position") {
    --- End diff --
    
    this test is by name.


---



[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...

Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22043#discussion_r208994951
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
    @@ -0,0 +1,411 @@
    +  test("Append.byName: does not match by position") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("a", FloatType),
    +      StructField("b", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'", "'y'"))
    +  }
    +
    +  test("Append.byName: case sensitive column resolution") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("X", FloatType), // doesn't match case!
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'"))
    +  }
    +
    +  test("Append.byName: case insensitive column resolution") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("X", FloatType), // doesn't match case!
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val X = query.output.toIndexedSeq(0)
    +    val y = query.output.toIndexedSeq(1)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +    val expectedPlan = AppendData.byName(table,
    +      Project(Seq(
    +        Alias(Cast(toLower(X), FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
    +        Alias(Cast(y, FloatType, Some(conf.sessionLocalTimeZone)), "y")()),
    +        query))
    +
    +    assertNotResolved(parsedPlan)
    +    checkAnalysis(parsedPlan, expectedPlan, caseSensitive = false)
    +    assertResolved(expectedPlan)
    +  }
    +
    +  test("Append.byName: data columns are reordered by name") {
    +    // out of order
    +    val query = TestRelation(StructType(Seq(
    +      StructField("y", FloatType),
    +      StructField("x", FloatType))).toAttributes)
    +
    +    val y = query.output.toIndexedSeq(0)
    +    val x = query.output.toIndexedSeq(1)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +    val expectedPlan = AppendData.byName(table,
    +      Project(Seq(
    +        Alias(Cast(x, FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
    +        Alias(Cast(y, FloatType, Some(conf.sessionLocalTimeZone)), "y")()),
    +        query))
    +
    +    assertNotResolved(parsedPlan)
    +    checkAnalysis(parsedPlan, expectedPlan)
    +    assertResolved(expectedPlan)
    +  }
    +
    +  test("Append.byName: fail nullable data written to required columns") {
    +    val parsedPlan = AppendData.byName(requiredTable, table)
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot write nullable values to non-null column", "'x'", "'y'"))
    +  }
    +
    +  test("Append.byName: allow required data written to nullable columns") {
    +    val parsedPlan = AppendData.byName(table, requiredTable)
    +    assertResolved(parsedPlan)
    +    checkAnalysis(parsedPlan, parsedPlan)
    +  }
    +
    +  test("Append.byName: missing columns are identified by name") {
    +    // missing optional field x
    +    val query = TestRelation(StructType(Seq(
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'"))
    +  }
    +
    +  test("Append.byName: missing required columns cause failure and are identified by name") {
    +    // missing required field x
    +    val query = TestRelation(StructType(Seq(
    +      StructField("y", FloatType, nullable = false))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(requiredTable, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'"))
    +  }
    +
    +  test("Append.byName: missing optional columns cause failure and are identified by name") {
    +    // missing optional field x
    +    val query = TestRelation(StructType(Seq(
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'"))
    +  }
    +
    +  test("Append.byName: fail canWrite check") {
    +    val parsedPlan = AppendData.byName(table, widerTable)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write", "'table-name'",
    +      "Cannot safely cast", "'x'", "'y'", "DoubleType to FloatType"))
    +  }
    +
    +  test("Append.byName: insert safe cast") {
    +    val x = table.output.toIndexedSeq(0)
    +    val y = table.output.toIndexedSeq(1)
    +
    +    val parsedPlan = AppendData.byName(widerTable, table)
    +    val expectedPlan = AppendData.byName(widerTable,
    +      Project(Seq(
    +        Alias(Cast(x, DoubleType, Some(conf.sessionLocalTimeZone)), "x")(),
    +        Alias(Cast(y, DoubleType, Some(conf.sessionLocalTimeZone)), "y")()),
    +        table))
    +
    +    assertNotResolved(parsedPlan)
    +    checkAnalysis(parsedPlan, expectedPlan)
    +    assertResolved(expectedPlan)
    +  }
    +
    +  test("Append.byName: fail extra data fields") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("x", FloatType),
    +      StructField("y", FloatType),
    +      StructField("z", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write", "'table-name'", "too many data columns",
    +      "Table columns: 'x', 'y'",
    +      "Data columns: 'x', 'y', 'z'"))
    +  }
    +
    +  test("Append.byName: multiple field errors are reported") {
    +    val xRequiredTable = TestRelation(StructType(Seq(
    +      StructField("x", FloatType, nullable = false),
    +      StructField("y", DoubleType))).toAttributes)
    +
    +    val query = TestRelation(StructType(Seq(
    +      StructField("x", DoubleType),
    +      StructField("b", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(xRequiredTable, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot safely cast", "'x'", "DoubleType to FloatType",
    +      "Cannot write nullable values to non-null column", "'x'",
    +      "Cannot find data for output column", "'y'"))
    +  }
    +
    +  test("Append.byPosition: basic behavior") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("a", FloatType),
    +      StructField("b", FloatType))).toAttributes)
    +
    +    val a = query.output.toIndexedSeq(0)
    +    val b = query.output.toIndexedSeq(1)
    +
    +    val parsedPlan = AppendData.byPosition(table, query)
    +    val expectedPlan = AppendData.byPosition(table,
    +      Project(Seq(
    +        Alias(Cast(a, FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
    +        Alias(Cast(b, FloatType, Some(conf.sessionLocalTimeZone)), "y")()),
    +        query))
    +
    +    assertNotResolved(parsedPlan)
    +    checkAnalysis(parsedPlan, expectedPlan, caseSensitive = false)
    +    assertResolved(expectedPlan)
    +  }
    +
    +  test("Append.byPosition: case does not fail column resolution") {
    --- End diff --
    
    I can remove it. I was including most test cases for both byName and byPosition to validate the different behaviors.
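    
    A simplified, self-contained sketch of the distinction being tested (not Spark code; all names here are made up for illustration): by-name resolution looks each output column up by name and fails on a miss, while by-position resolution simply zips the incoming columns with the table columns in order.
    
    ```scala
    // Illustrative only: models the two resolution strategies the tests exercise.
    case class Col(name: String, value: Any)
    
    // By-name: match incoming columns to table columns by name; report misses.
    def resolveByName(tableCols: Seq[String], data: Seq[Col]): Either[String, Seq[Col]] = {
      val byName = data.map(c => c.name -> c).toMap
      val missing = tableCols.filterNot(byName.contains)
      if (missing.nonEmpty) {
        Left(s"Cannot find data for output column(s): ${missing.mkString(", ")}")
      } else {
        Right(tableCols.map(byName))
      }
    }
    
    // By-position: ignore incoming names and pair columns up in order.
    def resolveByPosition(tableCols: Seq[String], data: Seq[Col]): Seq[Col] =
      tableCols.zip(data).map { case (outName, in) => Col(outName, in.value) }
    ```
    
    On the same input, `resolveByName(Seq("x", "y"), Seq(Col("a", 1.0f), Col("b", 2.0f)))` fails, while `resolveByPosition` happily writes `a` to `x` and `b` to `y` — which is why most cases are covered for both modes.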


---



[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...

Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22043#discussion_r208993025
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
    @@ -0,0 +1,411 @@
    +  test("Append.byName: missing required columns cause failure and are identified by name") {
    +    // missing required field x
    +    val query = TestRelation(StructType(Seq(
    +      StructField("y", FloatType, nullable = false))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(requiredTable, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'"))
    +  }
    +
    +  test("Append.byName: missing optional columns cause failure and are identified by name") {
    --- End diff --
    
    I probably intended to update it for byPosition. I'll fix it.


---



[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...

Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22043#discussion_r208992737
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
    @@ -0,0 +1,411 @@
    +  test("Append.byName: missing columns are identified by name") {
    +    // missing optional field x
    +    val query = TestRelation(StructType(Seq(
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'"))
    +  }
    +
    +  test("Append.byName: missing required columns cause failure and are identified by name") {
    --- End diff --
    
    Missing optional columns may be allowed in the future. We've already had a team request this feature (enabled by a flag) to support schema evolution. The use case is that you don't want to fail existing jobs when you add a column to the table. Iceberg supports this, so Spark should too.
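    
    A hypothetical sketch of that flag-gated behavior (not from this PR; the flag and all names are illustrative): when a nullable output column is absent from the incoming data, project a null instead of failing analysis.
    
    ```scala
    // Illustrative only: by-name projection with an opt-in for missing
    // nullable columns, as might be used for schema evolution.
    case class OutCol(name: String, nullable: Boolean)
    
    def projectByName(
        tableCols: Seq[OutCol],
        data: Map[String, Any],
        allowMissingOptional: Boolean): Either[String, Seq[(String, Any)]] = {
      val resolved = tableCols.map { col =>
        data.get(col.name) match {
          case Some(v) => Right(col.name -> v)
          // With the flag on, a missing nullable column becomes a null value.
          case None if col.nullable && allowMissingOptional => Right(col.name -> null)
          case None => Left(s"Cannot find data for output column '${col.name}'")
        }
      }
      val errors = resolved.collect { case Left(e) => e }
      if (errors.nonEmpty) Left(errors.mkString("; "))
      else Right(resolved.collect { case Right(p) => p })
    }
    ```
    
    With `allowMissingOptional = false` this matches the current behavior the tests assert; flipping it on would let existing jobs keep running after a nullable column is added to the table.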


---



[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...

Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22043#discussion_r209015906
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
    @@ -0,0 +1,411 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.analysis
    +
    +import java.util.Locale
    +
    +import org.apache.spark.sql.catalyst.expressions.{Alias, AttributeReference, Cast, UpCast}
    +import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LeafNode, LogicalPlan, Project}
    +import org.apache.spark.sql.types.{DoubleType, FloatType, StructField, StructType}
    +
    +case class TestRelation(output: Seq[AttributeReference]) extends LeafNode with NamedRelation {
    +  override def name: String = "table-name"
    +}
    +
    +class DataSourceV2AnalysisSuite extends AnalysisTest {
    +  val table = TestRelation(StructType(Seq(
    +    StructField("x", FloatType),
    +    StructField("y", FloatType))).toAttributes)
    +
    +  val requiredTable = TestRelation(StructType(Seq(
    +    StructField("x", FloatType, nullable = false),
    +    StructField("y", FloatType, nullable = false))).toAttributes)
    +
    +  val widerTable = TestRelation(StructType(Seq(
    +    StructField("x", DoubleType),
    +    StructField("y", DoubleType))).toAttributes)
    +
    +  test("Append.byName: basic behavior") {
    +    val query = TestRelation(table.schema.toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    checkAnalysis(parsedPlan, parsedPlan)
    +    assertResolved(parsedPlan)
    +  }
    +
    +  test("Append.byName: does not match by position") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("a", FloatType),
    +      StructField("b", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'", "'y'"))
    +  }
    +
    +  test("Append.byName: case sensitive column resolution") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("X", FloatType), // doesn't match case!
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'"))
    +  }
    +
    +  test("Append.byName: case insensitive column resolution") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("X", FloatType), // doesn't match case!
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val X = query.output.toIndexedSeq(0)
    --- End diff --
    
    Fixed.


---



[GitHub] spark issue #22043: [SPARK-24251][SQL] Add analysis tests for AppendData.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22043
  
    **[Test build #94449 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94449/testReport)** for PR 22043 at commit [`e58d4fc`](https://github.com/apache/spark/commit/e58d4fc666aa3d13c5d24af3823afc4c4bc31535).


---



[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...

Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22043#discussion_r209016366
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala ---
    @@ -367,6 +367,7 @@ case class AppendData(
           case (inAttr, outAttr) =>
               // names and types must match, nullability must be compatible
               inAttr.name == outAttr.name &&
    +          inAttr.resolved && outAttr.resolved &&
    --- End diff --
    
    Agreed.


---



[GitHub] spark issue #22043: [SPARK-24251][SQL] Add analysis tests for AppendData.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22043
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94449/
    Test PASSed.


---



[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/22043


---



[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...

Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22043#discussion_r209014458
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
    @@ -0,0 +1,411 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.analysis
    +
    +import java.util.Locale
    +
    +import org.apache.spark.sql.catalyst.expressions.{Alias, AttributeReference, Cast, UpCast}
    +import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LeafNode, LogicalPlan, Project}
    +import org.apache.spark.sql.types.{DoubleType, FloatType, StructField, StructType}
    +
    +case class TestRelation(output: Seq[AttributeReference]) extends LeafNode with NamedRelation {
    +  override def name: String = "table-name"
    +}
    +
    +class DataSourceV2AnalysisSuite extends AnalysisTest {
    +  val table = TestRelation(StructType(Seq(
    +    StructField("x", FloatType),
    +    StructField("y", FloatType))).toAttributes)
    +
    +  val requiredTable = TestRelation(StructType(Seq(
    +    StructField("x", FloatType, nullable = false),
    +    StructField("y", FloatType, nullable = false))).toAttributes)
    +
    +  val widerTable = TestRelation(StructType(Seq(
    +    StructField("x", DoubleType),
    +    StructField("y", DoubleType))).toAttributes)
    +
    +  test("Append.byName: basic behavior") {
    +    val query = TestRelation(table.schema.toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    checkAnalysis(parsedPlan, parsedPlan)
    +    assertResolved(parsedPlan)
    +  }
    +
    +  test("Append.byName: does not match by position") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("a", FloatType),
    +      StructField("b", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'", "'y'"))
    +  }
    +
    +  test("Append.byName: case sensitive column resolution") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("X", FloatType), // doesn't match case!
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    --- End diff --
    
    Updated.


---



[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...

Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22043#discussion_r209016955
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
    @@ -0,0 +1,411 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.analysis
    +
    +import java.util.Locale
    +
    +import org.apache.spark.sql.catalyst.expressions.{Alias, AttributeReference, Cast, UpCast}
    +import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LeafNode, LogicalPlan, Project}
    +import org.apache.spark.sql.types.{DoubleType, FloatType, StructField, StructType}
    +
    +case class TestRelation(output: Seq[AttributeReference]) extends LeafNode with NamedRelation {
    +  override def name: String = "table-name"
    +}
    +
    +class DataSourceV2AnalysisSuite extends AnalysisTest {
    +  val table = TestRelation(StructType(Seq(
    +    StructField("x", FloatType),
    +    StructField("y", FloatType))).toAttributes)
    --- End diff --
    
    Symbol literals are rarely used in Scala, so I think it is better to build a StructType and convert it to attributes. That matches what users actually write more closely.
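
    For context, the two styles under discussion look roughly like this (a sketch only; it assumes the catalyst DSL imports are available and reuses the `TestRelation` helper from the diff above):

    ```scala
    import org.apache.spark.sql.types.{FloatType, StructField, StructType}

    // Explicit style kept in this PR: build a StructType, then convert
    // the fields to attributes, mirroring how users declare schemas.
    val table = TestRelation(StructType(Seq(
      StructField("x", FloatType),
      StructField("y", FloatType))).toAttributes)

    // DSL style suggested in review: symbol literals via the catalyst DSL.
    // More compact, but symbol literals are uncommon in user-facing code.
    import org.apache.spark.sql.catalyst.dsl.expressions._
    val tableDsl = TestRelation(Seq('x.float, 'y.float))
    ```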


---



[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22043#discussion_r208885915
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
    @@ -0,0 +1,411 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.analysis
    +
    +import java.util.Locale
    +
    +import org.apache.spark.sql.catalyst.expressions.{Alias, AttributeReference, Cast, UpCast}
    +import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LeafNode, LogicalPlan, Project}
    +import org.apache.spark.sql.types.{DoubleType, FloatType, StructField, StructType}
    +
    +case class TestRelation(output: Seq[AttributeReference]) extends LeafNode with NamedRelation {
    +  override def name: String = "table-name"
    +}
    +
    +class DataSourceV2AnalysisSuite extends AnalysisTest {
    +  val table = TestRelation(StructType(Seq(
    +    StructField("x", FloatType),
    +    StructField("y", FloatType))).toAttributes)
    --- End diff --
    
    nit:
    ```
    import org.apache.spark.sql.catalyst.dsl.expressions._
    import org.apache.spark.sql.catalyst.dsl.plans._
    
    val table = TestRelation(Seq('x.float, 'y.float))
    val requiredTable = TestRelation(Seq('x.float.notNull, 'y.float.notNull))
    ...
    ```


---



[GitHub] spark issue #22043: [SPARK-24251][SQL] Add analysis tests for AppendData.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22043
  
    Merged build finished. Test PASSed.


---



[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...

Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22043#discussion_r209016101
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
    @@ -0,0 +1,411 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.analysis
    +
    +import java.util.Locale
    +
    +import org.apache.spark.sql.catalyst.expressions.{Alias, AttributeReference, Cast, UpCast}
    +import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LeafNode, LogicalPlan, Project}
    +import org.apache.spark.sql.types.{DoubleType, FloatType, StructField, StructType}
    +
    +case class TestRelation(output: Seq[AttributeReference]) extends LeafNode with NamedRelation {
    +  override def name: String = "table-name"
    +}
    +
    +class DataSourceV2AnalysisSuite extends AnalysisTest {
    +  val table = TestRelation(StructType(Seq(
    +    StructField("x", FloatType),
    +    StructField("y", FloatType))).toAttributes)
    +
    +  val requiredTable = TestRelation(StructType(Seq(
    +    StructField("x", FloatType, nullable = false),
    +    StructField("y", FloatType, nullable = false))).toAttributes)
    +
    +  val widerTable = TestRelation(StructType(Seq(
    +    StructField("x", DoubleType),
    +    StructField("y", DoubleType))).toAttributes)
    +
    +  test("Append.byName: basic behavior") {
    +    val query = TestRelation(table.schema.toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    checkAnalysis(parsedPlan, parsedPlan)
    +    assertResolved(parsedPlan)
    +  }
    +
    +  test("Append.byName: does not match by position") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("a", FloatType),
    +      StructField("b", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'", "'y'"))
    +  }
    +
    +  test("Append.byName: case sensitive column resolution") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("X", FloatType), // doesn't match case!
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'"))
    +  }
    +
    +  test("Append.byName: case insensitive column resolution") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("X", FloatType), // doesn't match case!
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val X = query.output.toIndexedSeq(0)
    +    val y = query.output.toIndexedSeq(1)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +    val expectedPlan = AppendData.byName(table,
    +      Project(Seq(
    +        Alias(Cast(toLower(X), FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
    +        Alias(Cast(y, FloatType, Some(conf.sessionLocalTimeZone)), "y")()),
    +        query))
    +
    +    assertNotResolved(parsedPlan)
    +    checkAnalysis(parsedPlan, expectedPlan, caseSensitive = false)
    +    assertResolved(expectedPlan)
    +  }
    +
    +  test("Append.byName: data columns are reordered by name") {
    +    // out of order
    +    val query = TestRelation(StructType(Seq(
    +      StructField("y", FloatType),
    +      StructField("x", FloatType))).toAttributes)
    +
    +    val y = query.output.toIndexedSeq(0)
    +    val x = query.output.toIndexedSeq(1)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +    val expectedPlan = AppendData.byName(table,
    +      Project(Seq(
    +        Alias(Cast(x, FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
    +        Alias(Cast(y, FloatType, Some(conf.sessionLocalTimeZone)), "y")()),
    +        query))
    +
    +    assertNotResolved(parsedPlan)
    +    checkAnalysis(parsedPlan, expectedPlan)
    +    assertResolved(expectedPlan)
    +  }
    +
    +  test("Append.byName: fail nullable data written to required columns") {
    +    val parsedPlan = AppendData.byName(requiredTable, table)
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot write nullable values to non-null column", "'x'", "'y'"))
    +  }
    +
    +  test("Append.byName: allow required data written to nullable columns") {
    +    val parsedPlan = AppendData.byName(table, requiredTable)
    +    assertResolved(parsedPlan)
    +    checkAnalysis(parsedPlan, parsedPlan)
    +  }
    +
    +  test("Append.byName: missing columns are identified by name") {
    +    // missing optional field x
    +    val query = TestRelation(StructType(Seq(
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'"))
    +  }
    +
    +  test("Append.byName: missing required columns cause failure and are identified by name") {
    +    // missing required field x
    +    val query = TestRelation(StructType(Seq(
    +      StructField("y", FloatType, nullable = false))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(requiredTable, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'"))
    +  }
    +
    +  test("Append.byName: missing optional columns cause failure and are identified by name") {
    --- End diff --
    
    Removed. It looks like it was left over from when I split out the tests for required/optional columns.


---



[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22043#discussion_r208896173
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
    @@ -0,0 +1,411 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.analysis
    +
    +import java.util.Locale
    +
    +import org.apache.spark.sql.catalyst.expressions.{Alias, AttributeReference, Cast, UpCast}
    +import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LeafNode, LogicalPlan, Project}
    +import org.apache.spark.sql.types.{DoubleType, FloatType, StructField, StructType}
    +
    +case class TestRelation(output: Seq[AttributeReference]) extends LeafNode with NamedRelation {
    +  override def name: String = "table-name"
    +}
    +
    +class DataSourceV2AnalysisSuite extends AnalysisTest {
    +  val table = TestRelation(StructType(Seq(
    +    StructField("x", FloatType),
    +    StructField("y", FloatType))).toAttributes)
    +
    +  val requiredTable = TestRelation(StructType(Seq(
    +    StructField("x", FloatType, nullable = false),
    +    StructField("y", FloatType, nullable = false))).toAttributes)
    +
    +  val widerTable = TestRelation(StructType(Seq(
    +    StructField("x", DoubleType),
    +    StructField("y", DoubleType))).toAttributes)
    +
    +  test("Append.byName: basic behavior") {
    +    val query = TestRelation(table.schema.toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    checkAnalysis(parsedPlan, parsedPlan)
    +    assertResolved(parsedPlan)
    +  }
    +
    +  test("Append.byName: does not match by position") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("a", FloatType),
    +      StructField("b", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'", "'y'"))
    +  }
    +
    +  test("Append.byName: case sensitive column resolution") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("X", FloatType), // doesn't match case!
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'"))
    +  }
    +
    +  test("Append.byName: case insensitive column resolution") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("X", FloatType), // doesn't match case!
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val X = query.output.toIndexedSeq(0)
    +    val y = query.output.toIndexedSeq(1)
    --- End diff --
    
    query.output.last


---



[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...

Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22043#discussion_r209015928
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
    @@ -0,0 +1,411 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.analysis
    +
    +import java.util.Locale
    +
    +import org.apache.spark.sql.catalyst.expressions.{Alias, AttributeReference, Cast, UpCast}
    +import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LeafNode, LogicalPlan, Project}
    +import org.apache.spark.sql.types.{DoubleType, FloatType, StructField, StructType}
    +
    +case class TestRelation(output: Seq[AttributeReference]) extends LeafNode with NamedRelation {
    +  override def name: String = "table-name"
    +}
    +
    +class DataSourceV2AnalysisSuite extends AnalysisTest {
    +  val table = TestRelation(StructType(Seq(
    +    StructField("x", FloatType),
    +    StructField("y", FloatType))).toAttributes)
    +
    +  val requiredTable = TestRelation(StructType(Seq(
    +    StructField("x", FloatType, nullable = false),
    +    StructField("y", FloatType, nullable = false))).toAttributes)
    +
    +  val widerTable = TestRelation(StructType(Seq(
    +    StructField("x", DoubleType),
    +    StructField("y", DoubleType))).toAttributes)
    +
    +  test("Append.byName: basic behavior") {
    +    val query = TestRelation(table.schema.toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    checkAnalysis(parsedPlan, parsedPlan)
    +    assertResolved(parsedPlan)
    +  }
    +
    +  test("Append.byName: does not match by position") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("a", FloatType),
    +      StructField("b", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'", "'y'"))
    +  }
    +
    +  test("Append.byName: case sensitive column resolution") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("X", FloatType), // doesn't match case!
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'"))
    +  }
    +
    +  test("Append.byName: case insensitive column resolution") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("X", FloatType), // doesn't match case!
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val X = query.output.toIndexedSeq(0)
    +    val y = query.output.toIndexedSeq(1)
    --- End diff --
    
    Fixed.


---



[GitHub] spark issue #22043: [SPARK-24251][SQL] Add analysis tests for AppendData.

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/22043
  
    Merged build finished. Test PASSed.


---



[GitHub] spark issue #22043: [SPARK-24251][SQL] Add analysis tests for AppendData.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22043
  
    **[Test build #94516 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94516/testReport)** for PR 22043 at commit [`765c5b4`](https://github.com/apache/spark/commit/765c5b4fb7dd8f90a1a0e71d43ee4f2312c39552).


---



[GitHub] spark issue #22043: [SPARK-24251][SQL] Add analysis tests for AppendData.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22043
  
    **[Test build #94516 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94516/testReport)** for PR 22043 at commit [`765c5b4`](https://github.com/apache/spark/commit/765c5b4fb7dd8f90a1a0e71d43ee4f2312c39552).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark issue #22043: [SPARK-24251][SQL] Add analysis tests for AppendData.

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/22043
  
    **[Test build #94449 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94449/testReport)** for PR 22043 at commit [`e58d4fc`](https://github.com/apache/spark/commit/e58d4fc666aa3d13c5d24af3823afc4c4bc31535).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.


---



[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...

Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22043#discussion_r208991938
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
    @@ -0,0 +1,411 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.analysis
    +
    +import java.util.Locale
    +
    +import org.apache.spark.sql.catalyst.expressions.{Alias, AttributeReference, Cast, UpCast}
    +import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LeafNode, LogicalPlan, Project}
    +import org.apache.spark.sql.types.{DoubleType, FloatType, StructField, StructType}
    +
    +case class TestRelation(output: Seq[AttributeReference]) extends LeafNode with NamedRelation {
    +  override def name: String = "table-name"
    +}
    +
    +class DataSourceV2AnalysisSuite extends AnalysisTest {
    +  val table = TestRelation(StructType(Seq(
    +    StructField("x", FloatType),
    +    StructField("y", FloatType))).toAttributes)
    +
    +  val requiredTable = TestRelation(StructType(Seq(
    +    StructField("x", FloatType, nullable = false),
    +    StructField("y", FloatType, nullable = false))).toAttributes)
    +
    +  val widerTable = TestRelation(StructType(Seq(
    +    StructField("x", DoubleType),
    +    StructField("y", DoubleType))).toAttributes)
    +
    +  test("Append.byName: basic behavior") {
    +    val query = TestRelation(table.schema.toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    checkAnalysis(parsedPlan, parsedPlan)
    +    assertResolved(parsedPlan)
    +  }
    +
    +  test("Append.byName: does not match by position") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("a", FloatType),
    +      StructField("b", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'", "'y'"))
    +  }
    +
    +  test("Append.byName: case sensitive column resolution") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("X", FloatType), // doesn't match case!
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'"))
    +  }
    +
    +  test("Append.byName: case insensitive column resolution") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("X", FloatType), // doesn't match case!
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val X = query.output.toIndexedSeq(0)
    +    val y = query.output.toIndexedSeq(1)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +    val expectedPlan = AppendData.byName(table,
    +      Project(Seq(
    +        Alias(Cast(toLower(X), FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
    --- End diff --
    
    I don't know, but this is required for the tests to pass. Other parts of the code also assume that case may not match (like [PruneFileSourcePartitions](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PruneFileSourcePartitions.scala#L42-L50)), so I didn't investigate further.
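
    To illustrate why the case-sensitivity setting matters for by-name resolution, a minimal sketch of the matching logic (a hypothetical helper for illustration, not Spark's actual implementation):

    ```scala
    // Match a table output column name against the query's column names,
    // using a resolver that depends on the case-sensitivity setting.
    def findDataColumn(
        queryCols: Seq[String],
        outputCol: String,
        caseSensitive: Boolean): Option[String] = {
      val resolver: (String, String) => Boolean =
        if (caseSensitive) (a, b) => a == b
        else (a, b) => a.equalsIgnoreCase(b)
      queryCols.find(resolver(_, outputCol))
    }
    ```

    With `caseSensitive = true`, a query column `"X"` does not satisfy the output column `"x"`, which is why the case-sensitive test above expects an analysis error, while the case-insensitive test expects a `Project` that renames and casts the matched column.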


---



[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22043#discussion_r208895791
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
    @@ -0,0 +1,411 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.analysis
    +
    +import java.util.Locale
    +
    +import org.apache.spark.sql.catalyst.expressions.{Alias, AttributeReference, Cast, UpCast}
    +import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LeafNode, LogicalPlan, Project}
    +import org.apache.spark.sql.types.{DoubleType, FloatType, StructField, StructType}
    +
    +case class TestRelation(output: Seq[AttributeReference]) extends LeafNode with NamedRelation {
    +  override def name: String = "table-name"
    +}
    +
    +class DataSourceV2AnalysisSuite extends AnalysisTest {
    +  val table = TestRelation(StructType(Seq(
    +    StructField("x", FloatType),
    +    StructField("y", FloatType))).toAttributes)
    +
    +  val requiredTable = TestRelation(StructType(Seq(
    +    StructField("x", FloatType, nullable = false),
    +    StructField("y", FloatType, nullable = false))).toAttributes)
    +
    +  val widerTable = TestRelation(StructType(Seq(
    +    StructField("x", DoubleType),
    +    StructField("y", DoubleType))).toAttributes)
    +
    +  test("Append.byName: basic behavior") {
    +    val query = TestRelation(table.schema.toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    checkAnalysis(parsedPlan, parsedPlan)
    +    assertResolved(parsedPlan)
    +  }
    +
    +  test("Append.byName: does not match by position") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("a", FloatType),
    +      StructField("b", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'", "'y'"))
    +  }
    +
    +  test("Append.byName: case sensitive column resolution") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("X", FloatType), // doesn't match case!
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    --- End diff --
    
    it's clearer to specify the `caseSensitive` parameter of `assertAnalysisError`
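
    For illustration only — a minimal stand-in for the helper being discussed (the real `assertAnalysisError` lives in Spark's `AnalysisTest`, and the signature here is an assumption), showing how an explicit `caseSensitive` argument makes the test's intent obvious at the call site:

```scala
// Hypothetical mini version of AnalysisTest.assertAnalysisError. In the real
// helper the flag configures the analyzer; here it only documents intent and
// drives a trivial substring check on the error message.
object MiniAnalysisTest {
  def assertAnalysisError(
      actualMessage: String,
      expectedErrors: Seq[String],
      caseSensitive: Boolean = true): Unit = {
    val missing = expectedErrors.filterNot(actualMessage.contains)
    require(missing.isEmpty,
      s"(caseSensitive=$caseSensitive) missing error fragments: $missing")
  }
}
```

    A call like `assertAnalysisError(parsedPlan, errors, caseSensitive = true)` reads unambiguously, whereas relying on the default hides which analyzer mode the test exercises.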


---



[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/22043#discussion_r208899715
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
    @@ -0,0 +1,411 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.catalyst.analysis
    +
    +import java.util.Locale
    +
    +import org.apache.spark.sql.catalyst.expressions.{Alias, AttributeReference, Cast, UpCast}
    +import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LeafNode, LogicalPlan, Project}
    +import org.apache.spark.sql.types.{DoubleType, FloatType, StructField, StructType}
    +
    +case class TestRelation(output: Seq[AttributeReference]) extends LeafNode with NamedRelation {
    +  override def name: String = "table-name"
    +}
    +
    +class DataSourceV2AnalysisSuite extends AnalysisTest {
    +  val table = TestRelation(StructType(Seq(
    +    StructField("x", FloatType),
    +    StructField("y", FloatType))).toAttributes)
    +
    +  val requiredTable = TestRelation(StructType(Seq(
    +    StructField("x", FloatType, nullable = false),
    +    StructField("y", FloatType, nullable = false))).toAttributes)
    +
    +  val widerTable = TestRelation(StructType(Seq(
    +    StructField("x", DoubleType),
    +    StructField("y", DoubleType))).toAttributes)
    +
    +  test("Append.byName: basic behavior") {
    +    val query = TestRelation(table.schema.toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    checkAnalysis(parsedPlan, parsedPlan)
    +    assertResolved(parsedPlan)
    +  }
    +
    +  test("Append.byName: does not match by position") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("a", FloatType),
    +      StructField("b", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'", "'y'"))
    +  }
    +
    +  test("Append.byName: case sensitive column resolution") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("X", FloatType), // doesn't match case!
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'"))
    +  }
    +
    +  test("Append.byName: case insensitive column resolution") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("X", FloatType), // doesn't match case!
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val X = query.output.toIndexedSeq(0)
    +    val y = query.output.toIndexedSeq(1)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +    val expectedPlan = AppendData.byName(table,
    +      Project(Seq(
    +        Alias(Cast(toLower(X), FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
    +        Alias(Cast(y, FloatType, Some(conf.sessionLocalTimeZone)), "y")()),
    +        query))
    +
    +    assertNotResolved(parsedPlan)
    +    checkAnalysis(parsedPlan, expectedPlan, caseSensitive = false)
    +    assertResolved(expectedPlan)
    +  }
    +
    +  test("Append.byName: data columns are reordered by name") {
    +    // out of order
    +    val query = TestRelation(StructType(Seq(
    +      StructField("y", FloatType),
    +      StructField("x", FloatType))).toAttributes)
    +
    +    val y = query.output.toIndexedSeq(0)
    +    val x = query.output.toIndexedSeq(1)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +    val expectedPlan = AppendData.byName(table,
    +      Project(Seq(
    +        Alias(Cast(x, FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
    +        Alias(Cast(y, FloatType, Some(conf.sessionLocalTimeZone)), "y")()),
    +        query))
    +
    +    assertNotResolved(parsedPlan)
    +    checkAnalysis(parsedPlan, expectedPlan)
    +    assertResolved(expectedPlan)
    +  }
    +
    +  test("Append.byName: fail nullable data written to required columns") {
    +    val parsedPlan = AppendData.byName(requiredTable, table)
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot write nullable values to non-null column", "'x'", "'y'"))
    +  }
    +
    +  test("Append.byName: allow required data written to nullable columns") {
    +    val parsedPlan = AppendData.byName(table, requiredTable)
    +    assertResolved(parsedPlan)
    +    checkAnalysis(parsedPlan, parsedPlan)
    +  }
    +
    +  test("Append.byName: missing columns are identified by name") {
    +    // missing optional field x
    +    val query = TestRelation(StructType(Seq(
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'"))
    +  }
    +
    +  test("Append.byName: missing required columns cause failure and are identified by name") {
    +    // missing required field x
    +    val query = TestRelation(StructType(Seq(
    +      StructField("y", FloatType, nullable = false))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(requiredTable, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'"))
    +  }
    +
    +  test("Append.byName: missing optional columns cause failure and are identified by name") {
    +    // missing optional field x
    +    val query = TestRelation(StructType(Seq(
    +      StructField("y", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot find data for output column", "'x'"))
    +  }
    +
    +  test("Append.byName: fail canWrite check") {
    +    val parsedPlan = AppendData.byName(table, widerTable)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write", "'table-name'",
    +      "Cannot safely cast", "'x'", "'y'", "DoubleType to FloatType"))
    +  }
    +
    +  test("Append.byName: insert safe cast") {
    +    val x = table.output.toIndexedSeq(0)
    +    val y = table.output.toIndexedSeq(1)
    +
    +    val parsedPlan = AppendData.byName(widerTable, table)
    +    val expectedPlan = AppendData.byName(widerTable,
    +      Project(Seq(
    +        Alias(Cast(x, DoubleType, Some(conf.sessionLocalTimeZone)), "x")(),
    +        Alias(Cast(y, DoubleType, Some(conf.sessionLocalTimeZone)), "y")()),
    +        table))
    +
    +    assertNotResolved(parsedPlan)
    +    checkAnalysis(parsedPlan, expectedPlan)
    +    assertResolved(expectedPlan)
    +  }
    +
    +  test("Append.byName: fail extra data fields") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("x", FloatType),
    +      StructField("y", FloatType),
    +      StructField("z", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(table, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write", "'table-name'", "too many data columns",
    +      "Table columns: 'x', 'y'",
    +      "Data columns: 'x', 'y', 'z'"))
    +  }
    +
    +  test("Append.byName: multiple field errors are reported") {
    +    val xRequiredTable = TestRelation(StructType(Seq(
    +      StructField("x", FloatType, nullable = false),
    +      StructField("y", DoubleType))).toAttributes)
    +
    +    val query = TestRelation(StructType(Seq(
    +      StructField("x", DoubleType),
    +      StructField("b", FloatType))).toAttributes)
    +
    +    val parsedPlan = AppendData.byName(xRequiredTable, query)
    +
    +    assertNotResolved(parsedPlan)
    +    assertAnalysisError(parsedPlan, Seq(
    +      "Cannot write incompatible data to table", "'table-name'",
    +      "Cannot safely cast", "'x'", "DoubleType to FloatType",
    +      "Cannot write nullable values to non-null column", "'x'",
    +      "Cannot find data for output column", "'y'"))
    +  }
    +
    +  test("Append.byPosition: basic behavior") {
    +    val query = TestRelation(StructType(Seq(
    +      StructField("a", FloatType),
    +      StructField("b", FloatType))).toAttributes)
    +
    +    val a = query.output.toIndexedSeq(0)
    +    val b = query.output.toIndexedSeq(1)
    +
    +    val parsedPlan = AppendData.byPosition(table, query)
    +    val expectedPlan = AppendData.byPosition(table,
    +      Project(Seq(
    +        Alias(Cast(a, FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
    +        Alias(Cast(b, FloatType, Some(conf.sessionLocalTimeZone)), "y")()),
    +        query))
    +
    +    assertNotResolved(parsedPlan)
    +    checkAnalysis(parsedPlan, expectedPlan, caseSensitive = false)
    +    assertResolved(expectedPlan)
    +  }
    +
    +  test("Append.byPosition: case does not fail column resolution") {
    --- End diff --
    
    do we need this test? In "Append.byPosition: basic behavior" we already showed that appending works even when the column names are different.


---
