Posted to reviews@spark.apache.org by rdblue <gi...@git.apache.org> on 2018/08/08 18:37:16 UTC
[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...
GitHub user rdblue opened a pull request:
https://github.com/apache/spark/pull/22043
[SPARK-24251][SQL] Add analysis tests for AppendData.
## What changes were proposed in this pull request?
This is a follow-up to #21305 that adds a test suite for AppendData analysis.
## How was this patch tested?
This PR adds a test suite for AppendData analysis.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/rdblue/spark SPARK-24251-add-append-data-analysis-tests
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/22043.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #22043
----
commit e58d4fc666aa3d13c5d24af3823afc4c4bc31535
Author: Ryan Blue <bl...@...>
Date: 2018-08-08T18:33:17Z
SPARK-24251: Add analysis tests for AppendData.
----
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #22043: [SPARK-24251][SQL] Add analysis tests for AppendData.
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22043
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/1964/
Test PASSed.
---
[GitHub] spark issue #22043: [SPARK-24251][SQL] Add analysis tests for AppendData.
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22043
Merged build finished. Test PASSed.
---
[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...
Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22043#discussion_r208989784
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
@@ -0,0 +1,411 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.analysis
+
+import java.util.Locale
+
+import org.apache.spark.sql.catalyst.expressions.{Alias, AttributeReference, Cast, UpCast}
+import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LeafNode, LogicalPlan, Project}
+import org.apache.spark.sql.types.{DoubleType, FloatType, StructField, StructType}
+
+case class TestRelation(output: Seq[AttributeReference]) extends LeafNode with NamedRelation {
+ override def name: String = "table-name"
+}
+
+class DataSourceV2AnalysisSuite extends AnalysisTest {
+ val table = TestRelation(StructType(Seq(
+ StructField("x", FloatType),
+ StructField("y", FloatType))).toAttributes)
+
+ val requiredTable = TestRelation(StructType(Seq(
+ StructField("x", FloatType, nullable = false),
+ StructField("y", FloatType, nullable = false))).toAttributes)
+
+ val widerTable = TestRelation(StructType(Seq(
+ StructField("x", DoubleType),
+ StructField("y", DoubleType))).toAttributes)
+
+ test("Append.byName: basic behavior") {
+ val query = TestRelation(table.schema.toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ checkAnalysis(parsedPlan, parsedPlan)
+ assertResolved(parsedPlan)
+ }
+
+ test("Append.byName: does not match by position") {
--- End diff --
Yes, this is testing that the query that would work when matching by position fails when matching by name.
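For readers skimming the archive, the by-name/by-position distinction being tested can be modeled without Spark. Below is a minimal sketch in plain Scala (runnable as a Scala 3 script); `Col`, `matchByName`, and `matchByPosition` are illustrative names, not Spark API:

```scala
// Illustrative model of the two AppendData matching strategies.
// A "column" is just a (name, type) pair here; Spark's real resolution
// lives in ResolveOutputRelation and is considerably richer.
case class Col(name: String, dataType: String)

// By position: the i-th query column feeds the i-th table column,
// regardless of names.
def matchByPosition(table: Seq[Col], query: Seq[Col]): Option[Seq[(Col, Col)]] =
  if (table.size == query.size) Some(table.zip(query)) else None

// By name: each table column must find a query column with the same name.
def matchByName(table: Seq[Col], query: Seq[Col]): Option[Seq[(Col, Col)]] = {
  val byName = query.map(c => c.name -> c).toMap
  val pairs = table.flatMap(t => byName.get(t.name).map(q => (t, q)))
  if (pairs.size == table.size) Some(pairs) else None
}

val table = Seq(Col("x", "float"), Col("y", "float"))
val query = Seq(Col("a", "float"), Col("b", "float"))

// The same query matches positionally but fails by name --
// exactly the behavior the test above is checking.
assert(matchByPosition(table, query).isDefined)
assert(matchByName(table, query).isEmpty)
```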
---
[GitHub] spark issue #22043: [SPARK-24251][SQL] Add analysis tests for AppendData.
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22043
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/testing-k8s-prb-make-spark-distribution-unified/2013/
Test PASSed.
---
[GitHub] spark issue #22043: [SPARK-24251][SQL] Add analysis tests for AppendData.
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22043
Merged build finished. Test PASSed.
---
[GitHub] spark issue #22043: [SPARK-24251][SQL] Add analysis tests for AppendData.
Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/22043
Thanks for reviewing, @cloud-fan!
---
[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/22043#discussion_r208899807
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
@@ -0,0 +1,411 @@
+ test("Append.byName: does not match by position") {
+ val query = TestRelation(StructType(Seq(
+ StructField("a", FloatType),
+ StructField("b", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'", "'y'"))
+ }
+
+ test("Append.byName: case sensitive column resolution") {
+ val query = TestRelation(StructType(Seq(
+ StructField("X", FloatType), // doesn't match case!
+ StructField("y", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'"))
+ }
+
+ test("Append.byName: case insensitive column resolution") {
+ val query = TestRelation(StructType(Seq(
+ StructField("X", FloatType), // doesn't match case!
+ StructField("y", FloatType))).toAttributes)
+
+ val X = query.output.toIndexedSeq(0)
--- End diff --
can't we just call `query.output.head`?
---
[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/22043#discussion_r208896182
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
@@ -0,0 +1,411 @@
+ test("Append.byName: case insensitive column resolution") {
+ val query = TestRelation(StructType(Seq(
+ StructField("X", FloatType), // doesn't match case!
+ StructField("y", FloatType))).toAttributes)
+
+ val X = query.output.toIndexedSeq(0)
--- End diff --
can't we just call `query.output.head`?
---
[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...
Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22043#discussion_r209016303
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
@@ -0,0 +1,411 @@
+ test("Append.byName: case insensitive column resolution") {
+ val query = TestRelation(StructType(Seq(
+ StructField("X", FloatType), // doesn't match case!
+ StructField("y", FloatType))).toAttributes)
+
+ val X = query.output.toIndexedSeq(0)
+ val y = query.output.toIndexedSeq(1)
+
+ val parsedPlan = AppendData.byName(table, query)
+ val expectedPlan = AppendData.byName(table,
+ Project(Seq(
+ Alias(Cast(toLower(X), FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
+ Alias(Cast(y, FloatType, Some(conf.sessionLocalTimeZone)), "y")()),
+ query))
+
+ assertNotResolved(parsedPlan)
+ checkAnalysis(parsedPlan, expectedPlan, caseSensitive = false)
+ assertResolved(expectedPlan)
+ }
+
+ test("Append.byName: data columns are reordered by name") {
+ // out of order
+ val query = TestRelation(StructType(Seq(
+ StructField("y", FloatType),
+ StructField("x", FloatType))).toAttributes)
+
+ val y = query.output.toIndexedSeq(0)
+ val x = query.output.toIndexedSeq(1)
+
+ val parsedPlan = AppendData.byName(table, query)
+ val expectedPlan = AppendData.byName(table,
+ Project(Seq(
+ Alias(Cast(x, FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
+ Alias(Cast(y, FloatType, Some(conf.sessionLocalTimeZone)), "y")()),
+ query))
+
+ assertNotResolved(parsedPlan)
+ checkAnalysis(parsedPlan, expectedPlan)
+ assertResolved(expectedPlan)
+ }
+
+ test("Append.byName: fail nullable data written to required columns") {
+ val parsedPlan = AppendData.byName(requiredTable, table)
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot write nullable values to non-null column", "'x'", "'y'"))
+ }
+
+ test("Append.byName: allow required data written to nullable columns") {
+ val parsedPlan = AppendData.byName(table, requiredTable)
+ assertResolved(parsedPlan)
+ checkAnalysis(parsedPlan, parsedPlan)
+ }
+
+ test("Append.byName: missing columns are identified by name") {
+ // missing optional field x
+ val query = TestRelation(StructType(Seq(
+ StructField("y", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'"))
+ }
+
+ test("Append.byName: missing required columns cause failure and are identified by name") {
+ // missing required field x
+ val query = TestRelation(StructType(Seq(
+ StructField("y", FloatType, nullable = false))).toAttributes)
+
+ val parsedPlan = AppendData.byName(requiredTable, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'"))
+ }
+
+ test("Append.byName: missing optional columns cause failure and are identified by name") {
+ // missing optional field x
+ val query = TestRelation(StructType(Seq(
+ StructField("y", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'"))
+ }
+
+ test("Append.byName: fail canWrite check") {
+ val parsedPlan = AppendData.byName(table, widerTable)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write", "'table-name'",
+ "Cannot safely cast", "'x'", "'y'", "DoubleType to FloatType"))
+ }
+
+ test("Append.byName: insert safe cast") {
+ val x = table.output.toIndexedSeq(0)
+ val y = table.output.toIndexedSeq(1)
+
+ val parsedPlan = AppendData.byName(widerTable, table)
+ val expectedPlan = AppendData.byName(widerTable,
+ Project(Seq(
+ Alias(Cast(x, DoubleType, Some(conf.sessionLocalTimeZone)), "x")(),
+ Alias(Cast(y, DoubleType, Some(conf.sessionLocalTimeZone)), "y")()),
+ table))
+
+ assertNotResolved(parsedPlan)
+ checkAnalysis(parsedPlan, expectedPlan)
+ assertResolved(expectedPlan)
+ }
+
+ test("Append.byName: fail extra data fields") {
+ val query = TestRelation(StructType(Seq(
+ StructField("x", FloatType),
+ StructField("y", FloatType),
+ StructField("z", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write", "'table-name'", "too many data columns",
+ "Table columns: 'x', 'y'",
+ "Data columns: 'x', 'y', 'z'"))
+ }
+
+ test("Append.byName: multiple field errors are reported") {
+ val xRequiredTable = TestRelation(StructType(Seq(
+ StructField("x", FloatType, nullable = false),
+ StructField("y", DoubleType))).toAttributes)
+
+ val query = TestRelation(StructType(Seq(
+ StructField("x", DoubleType),
+ StructField("b", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(xRequiredTable, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot safely cast", "'x'", "DoubleType to FloatType",
+ "Cannot write nullable values to non-null column", "'x'",
+ "Cannot find data for output column", "'y'"))
+ }
+
+ test("Append.byPosition: basic behavior") {
+ val query = TestRelation(StructType(Seq(
+ StructField("a", FloatType),
+ StructField("b", FloatType))).toAttributes)
+
+ val a = query.output.toIndexedSeq(0)
+ val b = query.output.toIndexedSeq(1)
+
+ val parsedPlan = AppendData.byPosition(table, query)
+ val expectedPlan = AppendData.byPosition(table,
+ Project(Seq(
+ Alias(Cast(a, FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
+ Alias(Cast(b, FloatType, Some(conf.sessionLocalTimeZone)), "y")()),
+ query))
+
+ assertNotResolved(parsedPlan)
+ checkAnalysis(parsedPlan, expectedPlan, caseSensitive = false)
+ assertResolved(expectedPlan)
+ }
+
+ test("Append.byPosition: case does not fail column resolution") {
--- End diff --
Removed.
---
[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/22043#discussion_r208884689
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala ---
@@ -367,6 +367,7 @@ case class AppendData(
case (inAttr, outAttr) =>
// names and types must match, nullability must be compatible
inAttr.name == outAttr.name &&
+ inAttr.resolved && outAttr.resolved &&
--- End diff --
I think it's more clear to write `table.resolved && query.resolved && query.output.size == table.output.size && ...`
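The suggested restructuring can be sketched as a self-contained model (plain Scala, runnable as a Scala 3 script; `Attr` and `appendResolved` are invented for illustration and deliberately omit the type-compatibility check the real `AppendData.resolved` performs):

```scala
// Model of the suggested ordering: verify the coarse, whole-plan
// conditions first (both sides resolved, same arity), then do the
// per-column zip, so the intent reads top-down.
case class Attr(name: String, nullable: Boolean)

def appendResolved(
    tableResolved: Boolean, queryResolved: Boolean,
    tableOut: Seq[Attr], queryOut: Seq[Attr]): Boolean =
  tableResolved && queryResolved &&
    queryOut.size == tableOut.size &&
    queryOut.zip(tableOut).forall { case (in, out) =>
      // names must match, nullability must be compatible
      in.name == out.name && (out.nullable || !in.nullable)
    }

// Non-nullable data into a nullable column is allowed...
assert(appendResolved(true, true,
  Seq(Attr("x", nullable = true)), Seq(Attr("x", nullable = false))))
// ...but nullable data into a required column is not.
assert(!appendResolved(true, true,
  Seq(Attr("x", nullable = false)), Seq(Attr("x", nullable = true))))
```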
---
[GitHub] spark issue #22043: [SPARK-24251][SQL] Add analysis tests for AppendData.
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on the issue:
https://github.com/apache/spark/pull/22043
thanks, merging to master!
---
[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/22043#discussion_r208896977
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
@@ -0,0 +1,411 @@
+ test("Append.byName: case insensitive column resolution") {
+ val query = TestRelation(StructType(Seq(
+ StructField("X", FloatType), // doesn't match case!
+ StructField("y", FloatType))).toAttributes)
+
+ val X = query.output.toIndexedSeq(0)
+ val y = query.output.toIndexedSeq(1)
+
+ val parsedPlan = AppendData.byName(table, query)
+ val expectedPlan = AppendData.byName(table,
+ Project(Seq(
+ Alias(Cast(toLower(X), FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
--- End diff --
where do we lowercase the attribute name in `ResolveOutputRelation`?
---
[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/22043#discussion_r208899867
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
@@ -0,0 +1,411 @@
+ test("Append.byName: fail nullable data written to required columns") {
+ val parsedPlan = AppendData.byName(requiredTable, table)
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot write nullable values to non-null column", "'x'", "'y'"))
+ }
+
+ test("Append.byName: allow required data written to nullable columns") {
+ val parsedPlan = AppendData.byName(table, requiredTable)
+ assertResolved(parsedPlan)
+ checkAnalysis(parsedPlan, parsedPlan)
+ }
+
+ test("Append.byName: missing columns are identified by name") {
+ // missing optional field x
+ val query = TestRelation(StructType(Seq(
+ StructField("y", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'"))
+ }
+
+ test("Append.byName: missing required columns cause failure and are identified by name") {
+ // missing required field x
+ val query = TestRelation(StructType(Seq(
+ StructField("y", FloatType, nullable = false))).toAttributes)
+
+ val parsedPlan = AppendData.byName(requiredTable, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'"))
+ }
+
+ test("Append.byName: missing optional columns cause failure and are identified by name") {
+ // missing optional field x
+ val query = TestRelation(StructType(Seq(
+ StructField("y", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'"))
+ }
+
+ test("Append.byName: fail canWrite check") {
+ val parsedPlan = AppendData.byName(table, widerTable)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write", "'table-name'",
+ "Cannot safely cast", "'x'", "'y'", "DoubleType to FloatType"))
+ }
+
+ test("Append.byName: insert safe cast") {
+ val x = table.output.toIndexedSeq(0)
+ val y = table.output.toIndexedSeq(1)
+
+ val parsedPlan = AppendData.byName(widerTable, table)
+ val expectedPlan = AppendData.byName(widerTable,
+ Project(Seq(
+ Alias(Cast(x, DoubleType, Some(conf.sessionLocalTimeZone)), "x")(),
+ Alias(Cast(y, DoubleType, Some(conf.sessionLocalTimeZone)), "y")()),
+ table))
+
+ assertNotResolved(parsedPlan)
+ checkAnalysis(parsedPlan, expectedPlan)
+ assertResolved(expectedPlan)
+ }
+
+ test("Append.byName: fail extra data fields") {
+ val query = TestRelation(StructType(Seq(
+ StructField("x", FloatType),
+ StructField("y", FloatType),
+ StructField("z", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write", "'table-name'", "too many data columns",
+ "Table columns: 'x', 'y'",
+ "Data columns: 'x', 'y', 'z'"))
+ }
+
+ test("Append.byName: multiple field errors are reported") {
+ val xRequiredTable = TestRelation(StructType(Seq(
+ StructField("x", FloatType, nullable = false),
+ StructField("y", DoubleType))).toAttributes)
+
+ val query = TestRelation(StructType(Seq(
+ StructField("x", DoubleType),
+ StructField("b", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(xRequiredTable, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot safely cast", "'x'", "DoubleType to FloatType",
+ "Cannot write nullable values to non-null column", "'x'",
+ "Cannot find data for output column", "'y'"))
+ }
+
+ test("Append.byPosition: basic behavior") {
+ val query = TestRelation(StructType(Seq(
+ StructField("a", FloatType),
+ StructField("b", FloatType))).toAttributes)
+
+ val a = query.output.toIndexedSeq(0)
+ val b = query.output.toIndexedSeq(1)
+
+ val parsedPlan = AppendData.byPosition(table, query)
+ val expectedPlan = AppendData.byPosition(table,
+ Project(Seq(
+ Alias(Cast(a, FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
+ Alias(Cast(b, FloatType, Some(conf.sessionLocalTimeZone)), "y")()),
+ query))
+
+ assertNotResolved(parsedPlan)
+ checkAnalysis(parsedPlan, expectedPlan, caseSensitive = false)
+ assertResolved(expectedPlan)
+ }
+
+ test("Append.byPosition: case does not fail column resolution") {
--- End diff --
do we need this test? In "Append.byPosition: basic behavior" we already showed that append works even when the column names are different.
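The distinction the reviewer is probing can be sketched without any Spark internals. This is a hypothetical model of the two resolution modes, not the actual analyzer code: byName matches query columns to table columns by name, while byPosition ignores names entirely and only checks arity.

```scala
// Hypothetical sketch of byName vs byPosition resolution (not Spark's analyzer).
object ResolutionSketch {
  case class Col(name: String, value: Float)

  // byName: look up each table column among the query columns by name;
  // fail if any table column has no matching query column.
  def resolveByName(tableCols: Seq[String], queryCols: Seq[Col]): Option[Seq[Col]] = {
    val resolved = tableCols.map(t => queryCols.find(_.name == t))
    if (resolved.forall(_.isDefined)) Some(resolved.flatten) else None
  }

  // byPosition: names are ignored; only the number of columns matters.
  def resolveByPosition(tableCols: Seq[String], queryCols: Seq[Col]): Option[Seq[Col]] =
    if (tableCols.length == queryCols.length) Some(queryCols) else None

  def main(args: Array[String]): Unit = {
    val table = Seq("x", "y")
    val query = Seq(Col("a", 1.0f), Col("b", 2.0f))
    println(resolveByName(table, query))     // fails: no 'x' or 'y' in the query
    println(resolveByPosition(table, query)) // succeeds: names don't matter
  }
}
```

Under this model, "Append.byPosition: basic behavior" already exercises mismatched names, which is why the extra case-sensitivity test for byPosition adds little.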
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/22043#discussion_r208898898
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
+ test("Append.byName: missing optional columns cause failure and are identified by name") {
--- End diff --
this test is identical to "Append.byName: missing columns are identified by name"
---
[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/22043#discussion_r208898738
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
@@ -0,0 +1,411 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.analysis
+
+import java.util.Locale
+
+import org.apache.spark.sql.catalyst.expressions.{Alias, AttributeReference, Cast, UpCast}
+import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LeafNode, LogicalPlan, Project}
+import org.apache.spark.sql.types.{DoubleType, FloatType, StructField, StructType}
+
+case class TestRelation(output: Seq[AttributeReference]) extends LeafNode with NamedRelation {
+ override def name: String = "table-name"
+}
+
+class DataSourceV2AnalysisSuite extends AnalysisTest {
+ val table = TestRelation(StructType(Seq(
+ StructField("x", FloatType),
+ StructField("y", FloatType))).toAttributes)
+
+ val requiredTable = TestRelation(StructType(Seq(
+ StructField("x", FloatType, nullable = false),
+ StructField("y", FloatType, nullable = false))).toAttributes)
+
+ val widerTable = TestRelation(StructType(Seq(
+ StructField("x", DoubleType),
+ StructField("y", DoubleType))).toAttributes)
+
+ test("Append.byName: basic behavior") {
+ val query = TestRelation(table.schema.toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ checkAnalysis(parsedPlan, parsedPlan)
+ assertResolved(parsedPlan)
+ }
+
+ test("Append.byName: does not match by position") {
+ val query = TestRelation(StructType(Seq(
+ StructField("a", FloatType),
+ StructField("b", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'", "'y'"))
+ }
+
+ test("Append.byName: case sensitive column resolution") {
+ val query = TestRelation(StructType(Seq(
+ StructField("X", FloatType), // doesn't match case!
+ StructField("y", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'"))
+ }
+
+ test("Append.byName: case insensitive column resolution") {
+ val query = TestRelation(StructType(Seq(
+ StructField("X", FloatType), // doesn't match case!
+ StructField("y", FloatType))).toAttributes)
+
+ val X = query.output.toIndexedSeq(0)
+ val y = query.output.toIndexedSeq(1)
+
+ val parsedPlan = AppendData.byName(table, query)
+ val expectedPlan = AppendData.byName(table,
+ Project(Seq(
+ Alias(Cast(toLower(X), FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
+ Alias(Cast(y, FloatType, Some(conf.sessionLocalTimeZone)), "y")()),
+ query))
+
+ assertNotResolved(parsedPlan)
+ checkAnalysis(parsedPlan, expectedPlan, caseSensitive = false)
+ assertResolved(expectedPlan)
+ }
+
+ test("Append.byName: data columns are reordered by name") {
+ // out of order
+ val query = TestRelation(StructType(Seq(
+ StructField("y", FloatType),
+ StructField("x", FloatType))).toAttributes)
+
+ val y = query.output.toIndexedSeq(0)
+ val x = query.output.toIndexedSeq(1)
+
+ val parsedPlan = AppendData.byName(table, query)
+ val expectedPlan = AppendData.byName(table,
+ Project(Seq(
+ Alias(Cast(x, FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
+ Alias(Cast(y, FloatType, Some(conf.sessionLocalTimeZone)), "y")()),
+ query))
+
+ assertNotResolved(parsedPlan)
+ checkAnalysis(parsedPlan, expectedPlan)
+ assertResolved(expectedPlan)
+ }
+
+ test("Append.byName: fail nullable data written to required columns") {
+ val parsedPlan = AppendData.byName(requiredTable, table)
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot write nullable values to non-null column", "'x'", "'y'"))
+ }
+
+ test("Append.byName: allow required data written to nullable columns") {
+ val parsedPlan = AppendData.byName(table, requiredTable)
+ assertResolved(parsedPlan)
+ checkAnalysis(parsedPlan, parsedPlan)
+ }
+
+ test("Append.byName: missing columns are identified by name") {
+ // missing optional field x
+ val query = TestRelation(StructType(Seq(
+ StructField("y", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'"))
+ }
+
+ test("Append.byName: missing required columns cause failure and are identified by name") {
--- End diff --
is there really a difference between missing required columns and missing optional columns?
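The question can be made concrete with a small model of the analysis checks. This is a hypothetical sketch, not Spark's actual rule: when a table column has no matching query column, the failure is the same regardless of the table column's nullability, because the nullability check only runs once a match exists.

```scala
// Hypothetical model of the AppendData analysis errors (not Spark internals).
object MissingColumnSketch {
  case class Field(name: String, nullable: Boolean)

  def analysisErrors(tableCols: Seq[Field], queryCols: Seq[Field]): Seq[String] =
    tableCols.flatMap { t =>
      queryCols.find(_.name == t.name) match {
        // No matching query column: same error whether t is nullable or not.
        case None => Seq(s"Cannot find data for output column '${t.name}'")
        // A match exists: only now does nullability come into play.
        case Some(q) if q.nullable && !t.nullable =>
          Seq(s"Cannot write nullable values to non-null column '${t.name}'")
        case _ => Nil
      }
    }

  def main(args: Array[String]): Unit = {
    val query = Seq(Field("y", nullable = false))
    // Missing optional 'x' and missing required 'x' produce the same error:
    println(analysisErrors(Seq(Field("x", true), Field("y", true)), query))
    println(analysisErrors(Seq(Field("x", false), Field("y", false)), query))
  }
}
```

Under this model the two "missing column" tests exercise the same code path, which supports collapsing them into one.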
---
[GitHub] spark issue #22043: [SPARK-24251][SQL] Add analysis tests for AppendData.
Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on the issue:
https://github.com/apache/spark/pull/22043
@cloud-fan, here are tests to validate the analysis of AppendData logical plans.
---
[GitHub] spark issue #22043: [SPARK-24251][SQL] Add analysis tests for AppendData.
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22043
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94516/
Test PASSed.
---
[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/22043#discussion_r208895537
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
@@ -0,0 +1,411 @@
+ test("Append.byName: does not match by position") {
--- End diff --
this test is by name.
---
[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...
Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22043#discussion_r208994951
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
@@ -0,0 +1,411 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.analysis
+
+import java.util.Locale
+
+import org.apache.spark.sql.catalyst.expressions.{Alias, AttributeReference, Cast, UpCast}
+import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LeafNode, LogicalPlan, Project}
+import org.apache.spark.sql.types.{DoubleType, FloatType, StructField, StructType}
+
+case class TestRelation(output: Seq[AttributeReference]) extends LeafNode with NamedRelation {
+ override def name: String = "table-name"
+}
+
+class DataSourceV2AnalysisSuite extends AnalysisTest {
+ val table = TestRelation(StructType(Seq(
+ StructField("x", FloatType),
+ StructField("y", FloatType))).toAttributes)
+
+ val requiredTable = TestRelation(StructType(Seq(
+ StructField("x", FloatType, nullable = false),
+ StructField("y", FloatType, nullable = false))).toAttributes)
+
+ val widerTable = TestRelation(StructType(Seq(
+ StructField("x", DoubleType),
+ StructField("y", DoubleType))).toAttributes)
+
+ test("Append.byName: basic behavior") {
+ val query = TestRelation(table.schema.toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ checkAnalysis(parsedPlan, parsedPlan)
+ assertResolved(parsedPlan)
+ }
+
+ test("Append.byName: does not match by position") {
+ val query = TestRelation(StructType(Seq(
+ StructField("a", FloatType),
+ StructField("b", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'", "'y'"))
+ }
+
+ test("Append.byName: case sensitive column resolution") {
+ val query = TestRelation(StructType(Seq(
+ StructField("X", FloatType), // doesn't match case!
+ StructField("y", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'"))
+ }
+
+ test("Append.byName: case insensitive column resolution") {
+ val query = TestRelation(StructType(Seq(
+ StructField("X", FloatType), // doesn't match case!
+ StructField("y", FloatType))).toAttributes)
+
+ val X = query.output.toIndexedSeq(0)
+ val y = query.output.toIndexedSeq(1)
+
+ val parsedPlan = AppendData.byName(table, query)
+ val expectedPlan = AppendData.byName(table,
+ Project(Seq(
+ Alias(Cast(toLower(X), FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
+ Alias(Cast(y, FloatType, Some(conf.sessionLocalTimeZone)), "y")()),
+ query))
+
+ assertNotResolved(parsedPlan)
+ checkAnalysis(parsedPlan, expectedPlan, caseSensitive = false)
+ assertResolved(expectedPlan)
+ }
+
+ test("Append.byName: data columns are reordered by name") {
+ // out of order
+ val query = TestRelation(StructType(Seq(
+ StructField("y", FloatType),
+ StructField("x", FloatType))).toAttributes)
+
+ val y = query.output.toIndexedSeq(0)
+ val x = query.output.toIndexedSeq(1)
+
+ val parsedPlan = AppendData.byName(table, query)
+ val expectedPlan = AppendData.byName(table,
+ Project(Seq(
+ Alias(Cast(x, FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
+ Alias(Cast(y, FloatType, Some(conf.sessionLocalTimeZone)), "y")()),
+ query))
+
+ assertNotResolved(parsedPlan)
+ checkAnalysis(parsedPlan, expectedPlan)
+ assertResolved(expectedPlan)
+ }
+
+ test("Append.byName: fail nullable data written to required columns") {
+ val parsedPlan = AppendData.byName(requiredTable, table)
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot write nullable values to non-null column", "'x'", "'y'"))
+ }
+
+ test("Append.byName: allow required data written to nullable columns") {
+ val parsedPlan = AppendData.byName(table, requiredTable)
+ assertResolved(parsedPlan)
+ checkAnalysis(parsedPlan, parsedPlan)
+ }
+
+ test("Append.byName: missing columns are identified by name") {
+ // missing optional field x
+ val query = TestRelation(StructType(Seq(
+ StructField("y", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'"))
+ }
+
+ test("Append.byName: missing required columns cause failure and are identified by name") {
+ // missing required field x
+ val query = TestRelation(StructType(Seq(
+ StructField("y", FloatType, nullable = false))).toAttributes)
+
+ val parsedPlan = AppendData.byName(requiredTable, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'"))
+ }
+
+ test("Append.byName: missing optional columns cause failure and are identified by name") {
+ // missing optional field x
+ val query = TestRelation(StructType(Seq(
+ StructField("y", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'"))
+ }
+
+ test("Append.byName: fail canWrite check") {
+ val parsedPlan = AppendData.byName(table, widerTable)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write", "'table-name'",
+ "Cannot safely cast", "'x'", "'y'", "DoubleType to FloatType"))
+ }
+
+ test("Append.byName: insert safe cast") {
+ val x = table.output.toIndexedSeq(0)
+ val y = table.output.toIndexedSeq(1)
+
+ val parsedPlan = AppendData.byName(widerTable, table)
+ val expectedPlan = AppendData.byName(widerTable,
+ Project(Seq(
+ Alias(Cast(x, DoubleType, Some(conf.sessionLocalTimeZone)), "x")(),
+ Alias(Cast(y, DoubleType, Some(conf.sessionLocalTimeZone)), "y")()),
+ table))
+
+ assertNotResolved(parsedPlan)
+ checkAnalysis(parsedPlan, expectedPlan)
+ assertResolved(expectedPlan)
+ }
+
+ test("Append.byName: fail extra data fields") {
+ val query = TestRelation(StructType(Seq(
+ StructField("x", FloatType),
+ StructField("y", FloatType),
+ StructField("z", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write", "'table-name'", "too many data columns",
+ "Table columns: 'x', 'y'",
+ "Data columns: 'x', 'y', 'z'"))
+ }
+
+ test("Append.byName: multiple field errors are reported") {
+ val xRequiredTable = TestRelation(StructType(Seq(
+ StructField("x", FloatType, nullable = false),
+ StructField("y", DoubleType))).toAttributes)
+
+ val query = TestRelation(StructType(Seq(
+ StructField("x", DoubleType),
+ StructField("b", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(xRequiredTable, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot safely cast", "'x'", "DoubleType to FloatType",
+ "Cannot write nullable values to non-null column", "'x'",
+ "Cannot find data for output column", "'y'"))
+ }
+
+ test("Append.byPosition: basic behavior") {
+ val query = TestRelation(StructType(Seq(
+ StructField("a", FloatType),
+ StructField("b", FloatType))).toAttributes)
+
+ val a = query.output.toIndexedSeq(0)
+ val b = query.output.toIndexedSeq(1)
+
+ val parsedPlan = AppendData.byPosition(table, query)
+ val expectedPlan = AppendData.byPosition(table,
+ Project(Seq(
+ Alias(Cast(a, FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
+ Alias(Cast(b, FloatType, Some(conf.sessionLocalTimeZone)), "y")()),
+ query))
+
+ assertNotResolved(parsedPlan)
+ checkAnalysis(parsedPlan, expectedPlan, caseSensitive = false)
+ assertResolved(expectedPlan)
+ }
+
+ test("Append.byPosition: case does not fail column resolution") {
--- End diff --
I can remove it. I was including most test cases for both byName and byPosition to validate the different behaviors.
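For readers following the thread, the difference between the two append modes under discussion can be sketched in plain Scala. This is a simplified illustration, not Spark's analyzer code; `Col`, `byName`, and `byPosition` are hypothetical helpers invented for this sketch:

```scala
// Simplified model of a column: just a name and a type label.
case class Col(name: String, dataType: String)

// byName: each table column looks up a data column with the same name.
// Returns Left(missing table column names) when any lookup fails.
def byName(table: Seq[Col], data: Seq[Col]): Either[Seq[String], Seq[Col]] = {
  val byCol = data.map(c => c.name -> c).toMap
  val missing = table.map(_.name).filterNot(byCol.contains)
  if (missing.nonEmpty) Left(missing)
  else Right(table.map(t => byCol(t.name)))
}

// byPosition: data columns are matched to table columns by ordinal,
// so names are ignored and only the column count must line up.
def byPosition(table: Seq[Col], data: Seq[Col]): Either[Seq[String], Seq[Col]] =
  if (table.length != data.length) {
    Left(Seq(s"expected ${table.length} columns, got ${data.length}"))
  } else {
    Right(data)
  }
```

Under this sketch, appending data columns `a`, `b` to a table with columns `x`, `y` fails in name mode but succeeds in position mode, which mirrors the "does not match by position" test in the quoted diff.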
---
[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...
Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22043#discussion_r208993025
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
@@ -0,0 +1,411 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.analysis
+
+import java.util.Locale
+
+import org.apache.spark.sql.catalyst.expressions.{Alias, AttributeReference, Cast, UpCast}
+import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LeafNode, LogicalPlan, Project}
+import org.apache.spark.sql.types.{DoubleType, FloatType, StructField, StructType}
+
+case class TestRelation(output: Seq[AttributeReference]) extends LeafNode with NamedRelation {
+ override def name: String = "table-name"
+}
+
+class DataSourceV2AnalysisSuite extends AnalysisTest {
+ val table = TestRelation(StructType(Seq(
+ StructField("x", FloatType),
+ StructField("y", FloatType))).toAttributes)
+
+ val requiredTable = TestRelation(StructType(Seq(
+ StructField("x", FloatType, nullable = false),
+ StructField("y", FloatType, nullable = false))).toAttributes)
+
+ val widerTable = TestRelation(StructType(Seq(
+ StructField("x", DoubleType),
+ StructField("y", DoubleType))).toAttributes)
+
+ test("Append.byName: basic behavior") {
+ val query = TestRelation(table.schema.toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ checkAnalysis(parsedPlan, parsedPlan)
+ assertResolved(parsedPlan)
+ }
+
+ test("Append.byName: does not match by position") {
+ val query = TestRelation(StructType(Seq(
+ StructField("a", FloatType),
+ StructField("b", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'", "'y'"))
+ }
+
+ test("Append.byName: case sensitive column resolution") {
+ val query = TestRelation(StructType(Seq(
+ StructField("X", FloatType), // doesn't match case!
+ StructField("y", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'"))
+ }
+
+ test("Append.byName: case insensitive column resolution") {
+ val query = TestRelation(StructType(Seq(
+ StructField("X", FloatType), // doesn't match case!
+ StructField("y", FloatType))).toAttributes)
+
+ val X = query.output.toIndexedSeq(0)
+ val y = query.output.toIndexedSeq(1)
+
+ val parsedPlan = AppendData.byName(table, query)
+ val expectedPlan = AppendData.byName(table,
+ Project(Seq(
+ Alias(Cast(toLower(X), FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
+ Alias(Cast(y, FloatType, Some(conf.sessionLocalTimeZone)), "y")()),
+ query))
+
+ assertNotResolved(parsedPlan)
+ checkAnalysis(parsedPlan, expectedPlan, caseSensitive = false)
+ assertResolved(expectedPlan)
+ }
+
+ test("Append.byName: data columns are reordered by name") {
+ // out of order
+ val query = TestRelation(StructType(Seq(
+ StructField("y", FloatType),
+ StructField("x", FloatType))).toAttributes)
+
+ val y = query.output.toIndexedSeq(0)
+ val x = query.output.toIndexedSeq(1)
+
+ val parsedPlan = AppendData.byName(table, query)
+ val expectedPlan = AppendData.byName(table,
+ Project(Seq(
+ Alias(Cast(x, FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
+ Alias(Cast(y, FloatType, Some(conf.sessionLocalTimeZone)), "y")()),
+ query))
+
+ assertNotResolved(parsedPlan)
+ checkAnalysis(parsedPlan, expectedPlan)
+ assertResolved(expectedPlan)
+ }
+
+ test("Append.byName: fail nullable data written to required columns") {
+ val parsedPlan = AppendData.byName(requiredTable, table)
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot write nullable values to non-null column", "'x'", "'y'"))
+ }
+
+ test("Append.byName: allow required data written to nullable columns") {
+ val parsedPlan = AppendData.byName(table, requiredTable)
+ assertResolved(parsedPlan)
+ checkAnalysis(parsedPlan, parsedPlan)
+ }
+
+ test("Append.byName: missing columns are identified by name") {
+ // missing optional field x
+ val query = TestRelation(StructType(Seq(
+ StructField("y", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'"))
+ }
+
+ test("Append.byName: missing required columns cause failure and are identified by name") {
+ // missing required field x
+ val query = TestRelation(StructType(Seq(
+ StructField("y", FloatType, nullable = false))).toAttributes)
+
+ val parsedPlan = AppendData.byName(requiredTable, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'"))
+ }
+
+ test("Append.byName: missing optional columns cause failure and are identified by name") {
--- End diff --
I probably intended to update it for byPosition. I'll fix it.
---
[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...
Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22043#discussion_r208992737
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
@@ -0,0 +1,411 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.analysis
+
+import java.util.Locale
+
+import org.apache.spark.sql.catalyst.expressions.{Alias, AttributeReference, Cast, UpCast}
+import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LeafNode, LogicalPlan, Project}
+import org.apache.spark.sql.types.{DoubleType, FloatType, StructField, StructType}
+
+case class TestRelation(output: Seq[AttributeReference]) extends LeafNode with NamedRelation {
+ override def name: String = "table-name"
+}
+
+class DataSourceV2AnalysisSuite extends AnalysisTest {
+ val table = TestRelation(StructType(Seq(
+ StructField("x", FloatType),
+ StructField("y", FloatType))).toAttributes)
+
+ val requiredTable = TestRelation(StructType(Seq(
+ StructField("x", FloatType, nullable = false),
+ StructField("y", FloatType, nullable = false))).toAttributes)
+
+ val widerTable = TestRelation(StructType(Seq(
+ StructField("x", DoubleType),
+ StructField("y", DoubleType))).toAttributes)
+
+ test("Append.byName: basic behavior") {
+ val query = TestRelation(table.schema.toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ checkAnalysis(parsedPlan, parsedPlan)
+ assertResolved(parsedPlan)
+ }
+
+ test("Append.byName: does not match by position") {
+ val query = TestRelation(StructType(Seq(
+ StructField("a", FloatType),
+ StructField("b", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'", "'y'"))
+ }
+
+ test("Append.byName: case sensitive column resolution") {
+ val query = TestRelation(StructType(Seq(
+ StructField("X", FloatType), // doesn't match case!
+ StructField("y", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'"))
+ }
+
+ test("Append.byName: case insensitive column resolution") {
+ val query = TestRelation(StructType(Seq(
+ StructField("X", FloatType), // doesn't match case!
+ StructField("y", FloatType))).toAttributes)
+
+ val X = query.output.toIndexedSeq(0)
+ val y = query.output.toIndexedSeq(1)
+
+ val parsedPlan = AppendData.byName(table, query)
+ val expectedPlan = AppendData.byName(table,
+ Project(Seq(
+ Alias(Cast(toLower(X), FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
+ Alias(Cast(y, FloatType, Some(conf.sessionLocalTimeZone)), "y")()),
+ query))
+
+ assertNotResolved(parsedPlan)
+ checkAnalysis(parsedPlan, expectedPlan, caseSensitive = false)
+ assertResolved(expectedPlan)
+ }
+
+ test("Append.byName: data columns are reordered by name") {
+ // out of order
+ val query = TestRelation(StructType(Seq(
+ StructField("y", FloatType),
+ StructField("x", FloatType))).toAttributes)
+
+ val y = query.output.toIndexedSeq(0)
+ val x = query.output.toIndexedSeq(1)
+
+ val parsedPlan = AppendData.byName(table, query)
+ val expectedPlan = AppendData.byName(table,
+ Project(Seq(
+ Alias(Cast(x, FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
+ Alias(Cast(y, FloatType, Some(conf.sessionLocalTimeZone)), "y")()),
+ query))
+
+ assertNotResolved(parsedPlan)
+ checkAnalysis(parsedPlan, expectedPlan)
+ assertResolved(expectedPlan)
+ }
+
+ test("Append.byName: fail nullable data written to required columns") {
+ val parsedPlan = AppendData.byName(requiredTable, table)
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot write nullable values to non-null column", "'x'", "'y'"))
+ }
+
+ test("Append.byName: allow required data written to nullable columns") {
+ val parsedPlan = AppendData.byName(table, requiredTable)
+ assertResolved(parsedPlan)
+ checkAnalysis(parsedPlan, parsedPlan)
+ }
+
+ test("Append.byName: missing columns are identified by name") {
+ // missing optional field x
+ val query = TestRelation(StructType(Seq(
+ StructField("y", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'"))
+ }
+
+ test("Append.byName: missing required columns cause failure and are identified by name") {
--- End diff --
Missing optional columns may be allowed in the future. We've already had a team request this feature (enabled by a flag) to support schema evolution. The use case is that you don't want to fail existing jobs when you add a column to the table. Iceberg supports this, so Spark should too.
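The flag-gated behavior described here could look roughly like the following. This is a hypothetical plain-Scala sketch, not Spark code; the `allowMissingOptional` flag and `resolve` helper are invented for illustration:

```scala
// A column with a nullability bit, mirroring StructField's nullable flag.
case class Column(name: String, nullable: Boolean)

// Resolve table columns against data columns by name. When
// allowMissingOptional is set, a missing *nullable* table column is
// tolerated (None, i.e. it would be written as null); a missing
// required column is always an error, matching the tests above.
def resolve(
    table: Seq[Column],
    data: Seq[Column],
    allowMissingOptional: Boolean): Either[Seq[String], Seq[Option[Column]]] = {
  val byName = data.map(c => c.name -> c).toMap
  val results = table.map { t =>
    byName.get(t.name) match {
      case Some(d) => Right(Some(d))
      case None if allowMissingOptional && t.nullable => Right(None) // fill with null
      case None => Left(s"Cannot find data for output column '${t.name}'")
    }
  }
  val errors = results.collect { case Left(e) => e }
  if (errors.nonEmpty) Left(errors)
  else Right(results.collect { case Right(c) => c })
}
```

With the flag on, adding a nullable column to a table would no longer break existing append jobs, which is the schema-evolution use case described above.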
---
[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...
Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22043#discussion_r209015906
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
@@ -0,0 +1,411 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.analysis
+
+import java.util.Locale
+
+import org.apache.spark.sql.catalyst.expressions.{Alias, AttributeReference, Cast, UpCast}
+import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LeafNode, LogicalPlan, Project}
+import org.apache.spark.sql.types.{DoubleType, FloatType, StructField, StructType}
+
+case class TestRelation(output: Seq[AttributeReference]) extends LeafNode with NamedRelation {
+ override def name: String = "table-name"
+}
+
+class DataSourceV2AnalysisSuite extends AnalysisTest {
+ val table = TestRelation(StructType(Seq(
+ StructField("x", FloatType),
+ StructField("y", FloatType))).toAttributes)
+
+ val requiredTable = TestRelation(StructType(Seq(
+ StructField("x", FloatType, nullable = false),
+ StructField("y", FloatType, nullable = false))).toAttributes)
+
+ val widerTable = TestRelation(StructType(Seq(
+ StructField("x", DoubleType),
+ StructField("y", DoubleType))).toAttributes)
+
+ test("Append.byName: basic behavior") {
+ val query = TestRelation(table.schema.toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ checkAnalysis(parsedPlan, parsedPlan)
+ assertResolved(parsedPlan)
+ }
+
+ test("Append.byName: does not match by position") {
+ val query = TestRelation(StructType(Seq(
+ StructField("a", FloatType),
+ StructField("b", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'", "'y'"))
+ }
+
+ test("Append.byName: case sensitive column resolution") {
+ val query = TestRelation(StructType(Seq(
+ StructField("X", FloatType), // doesn't match case!
+ StructField("y", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'"))
+ }
+
+ test("Append.byName: case insensitive column resolution") {
+ val query = TestRelation(StructType(Seq(
+ StructField("X", FloatType), // doesn't match case!
+ StructField("y", FloatType))).toAttributes)
+
+ val X = query.output.toIndexedSeq(0)
--- End diff --
Fixed.
---
[GitHub] spark issue #22043: [SPARK-24251][SQL] Add analysis tests for AppendData.
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22043
**[Test build #94449 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94449/testReport)** for PR 22043 at commit [`e58d4fc`](https://github.com/apache/spark/commit/e58d4fc666aa3d13c5d24af3823afc4c4bc31535).
---
[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...
Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22043#discussion_r209016366
--- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/plans/logical/basicLogicalOperators.scala ---
@@ -367,6 +367,7 @@ case class AppendData(
case (inAttr, outAttr) =>
// names and types must match, nullability must be compatible
inAttr.name == outAttr.name &&
+ inAttr.resolved && outAttr.resolved &&
--- End diff --
Agreed.
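The rule the quoted snippet enforces — names and types must match, both sides must be resolved, and nullability must be compatible — can be illustrated in isolation. This is a simplified sketch; `Attr` stands in for Spark's `AttributeReference`:

```scala
case class Attr(name: String, dataType: String, nullable: Boolean, resolved: Boolean = true)

// A write lines up only when every input attribute matches its output
// attribute: same name, same type, both sides themselves resolved, and
// nullable input is allowed only if the output column is nullable.
def outputCompatible(in: Seq[Attr], out: Seq[Attr]): Boolean =
  in.length == out.length && in.zip(out).forall { case (i, o) =>
    i.name == o.name &&
      i.resolved && o.resolved &&
      i.dataType == o.dataType &&
      (o.nullable || !i.nullable) // nullable data cannot go into a required column
  }
```

The `i.resolved && o.resolved` conjunct is the fix discussed here: without it, an unresolved attribute with a matching name could make the plan look resolved prematurely.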
---
[GitHub] spark issue #22043: [SPARK-24251][SQL] Add analysis tests for AppendData.
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22043
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/94449/
Test PASSed.
---
[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/22043
---
[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...
Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22043#discussion_r209014458
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
@@ -0,0 +1,411 @@
+ test("Append.byName: case sensitive column resolution") {
+ val query = TestRelation(StructType(Seq(
+ StructField("X", FloatType), // doesn't match case!
+ StructField("y", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
--- End diff --
Updated.
---
[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...
Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22043#discussion_r209016955
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
@@ -0,0 +1,411 @@
+class DataSourceV2AnalysisSuite extends AnalysisTest {
+ val table = TestRelation(StructType(Seq(
+ StructField("x", FloatType),
+ StructField("y", FloatType))).toAttributes)
--- End diff --
Symbols are rarely used in Scala, so I think it is better to use the StructType and convert. It matches what users do more closely.
---
[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/22043#discussion_r208885915
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
@@ -0,0 +1,411 @@
+class DataSourceV2AnalysisSuite extends AnalysisTest {
+ val table = TestRelation(StructType(Seq(
+ StructField("x", FloatType),
+ StructField("y", FloatType))).toAttributes)
--- End diff --
nit:
```
import org.apache.spark.sql.catalyst.dsl.expressions._
import org.apache.spark.sql.catalyst.dsl.plans._
val table = TestRelation(Seq('x.float, 'y.float))
val requiredTable = TestRelation(Seq('x.float.notNull, 'y.float.notNull))
...
```
---
[GitHub] spark issue #22043: [SPARK-24251][SQL] Add analysis tests for AppendData.
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22043
Merged build finished. Test PASSed.
---
[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...
Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22043#discussion_r209016101
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
@@ -0,0 +1,411 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.analysis
+
+import java.util.Locale
+
+import org.apache.spark.sql.catalyst.expressions.{Alias, AttributeReference, Cast, UpCast}
+import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LeafNode, LogicalPlan, Project}
+import org.apache.spark.sql.types.{DoubleType, FloatType, StructField, StructType}
+
+case class TestRelation(output: Seq[AttributeReference]) extends LeafNode with NamedRelation {
+ override def name: String = "table-name"
+}
+
+class DataSourceV2AnalysisSuite extends AnalysisTest {
+ val table = TestRelation(StructType(Seq(
+ StructField("x", FloatType),
+ StructField("y", FloatType))).toAttributes)
+
+ val requiredTable = TestRelation(StructType(Seq(
+ StructField("x", FloatType, nullable = false),
+ StructField("y", FloatType, nullable = false))).toAttributes)
+
+ val widerTable = TestRelation(StructType(Seq(
+ StructField("x", DoubleType),
+ StructField("y", DoubleType))).toAttributes)
+
+ test("Append.byName: basic behavior") {
+ val query = TestRelation(table.schema.toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ checkAnalysis(parsedPlan, parsedPlan)
+ assertResolved(parsedPlan)
+ }
+
+ test("Append.byName: does not match by position") {
+ val query = TestRelation(StructType(Seq(
+ StructField("a", FloatType),
+ StructField("b", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'", "'y'"))
+ }
+
+ test("Append.byName: case sensitive column resolution") {
+ val query = TestRelation(StructType(Seq(
+ StructField("X", FloatType), // doesn't match case!
+ StructField("y", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'"))
+ }
+
+ test("Append.byName: case insensitive column resolution") {
+ val query = TestRelation(StructType(Seq(
+ StructField("X", FloatType), // doesn't match case!
+ StructField("y", FloatType))).toAttributes)
+
+ val X = query.output.toIndexedSeq(0)
+ val y = query.output.toIndexedSeq(1)
+
+ val parsedPlan = AppendData.byName(table, query)
+ val expectedPlan = AppendData.byName(table,
+ Project(Seq(
+ Alias(Cast(toLower(X), FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
+ Alias(Cast(y, FloatType, Some(conf.sessionLocalTimeZone)), "y")()),
+ query))
+
+ assertNotResolved(parsedPlan)
+ checkAnalysis(parsedPlan, expectedPlan, caseSensitive = false)
+ assertResolved(expectedPlan)
+ }
+
+ test("Append.byName: data columns are reordered by name") {
+ // out of order
+ val query = TestRelation(StructType(Seq(
+ StructField("y", FloatType),
+ StructField("x", FloatType))).toAttributes)
+
+ val y = query.output.toIndexedSeq(0)
+ val x = query.output.toIndexedSeq(1)
+
+ val parsedPlan = AppendData.byName(table, query)
+ val expectedPlan = AppendData.byName(table,
+ Project(Seq(
+ Alias(Cast(x, FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
+ Alias(Cast(y, FloatType, Some(conf.sessionLocalTimeZone)), "y")()),
+ query))
+
+ assertNotResolved(parsedPlan)
+ checkAnalysis(parsedPlan, expectedPlan)
+ assertResolved(expectedPlan)
+ }
+
+ test("Append.byName: fail nullable data written to required columns") {
+ val parsedPlan = AppendData.byName(requiredTable, table)
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot write nullable values to non-null column", "'x'", "'y'"))
+ }
+
+ test("Append.byName: allow required data written to nullable columns") {
+ val parsedPlan = AppendData.byName(table, requiredTable)
+ assertResolved(parsedPlan)
+ checkAnalysis(parsedPlan, parsedPlan)
+ }
+
+ test("Append.byName: missing columns are identified by name") {
+ // missing optional field x
+ val query = TestRelation(StructType(Seq(
+ StructField("y", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'"))
+ }
+
+ test("Append.byName: missing required columns cause failure and are identified by name") {
+ // missing required field x
+ val query = TestRelation(StructType(Seq(
+ StructField("y", FloatType, nullable = false))).toAttributes)
+
+ val parsedPlan = AppendData.byName(requiredTable, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'"))
+ }
+
+ test("Append.byName: missing optional columns cause failure and are identified by name") {
--- End diff --
Removed. Looks like it was from when I split out the tests for required/optional.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/22043#discussion_r208896173
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
@@ -0,0 +1,411 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.spark.sql.catalyst.analysis
+
+import java.util.Locale
+
+import org.apache.spark.sql.catalyst.expressions.{Alias, AttributeReference, Cast, UpCast}
+import org.apache.spark.sql.catalyst.plans.logical.{AppendData, LeafNode, LogicalPlan, Project}
+import org.apache.spark.sql.types.{DoubleType, FloatType, StructField, StructType}
+
+case class TestRelation(output: Seq[AttributeReference]) extends LeafNode with NamedRelation {
+ override def name: String = "table-name"
+}
+
+class DataSourceV2AnalysisSuite extends AnalysisTest {
+ val table = TestRelation(StructType(Seq(
+ StructField("x", FloatType),
+ StructField("y", FloatType))).toAttributes)
+
+ val requiredTable = TestRelation(StructType(Seq(
+ StructField("x", FloatType, nullable = false),
+ StructField("y", FloatType, nullable = false))).toAttributes)
+
+ val widerTable = TestRelation(StructType(Seq(
+ StructField("x", DoubleType),
+ StructField("y", DoubleType))).toAttributes)
+
+ test("Append.byName: basic behavior") {
+ val query = TestRelation(table.schema.toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ checkAnalysis(parsedPlan, parsedPlan)
+ assertResolved(parsedPlan)
+ }
+
+ test("Append.byName: does not match by position") {
+ val query = TestRelation(StructType(Seq(
+ StructField("a", FloatType),
+ StructField("b", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'", "'y'"))
+ }
+
+ test("Append.byName: case sensitive column resolution") {
+ val query = TestRelation(StructType(Seq(
+ StructField("X", FloatType), // doesn't match case!
+ StructField("y", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot find data for output column", "'x'"))
+ }
+
+ test("Append.byName: case insensitive column resolution") {
+ val query = TestRelation(StructType(Seq(
+ StructField("X", FloatType), // doesn't match case!
+ StructField("y", FloatType))).toAttributes)
+
+ val X = query.output.toIndexedSeq(0)
+ val y = query.output.toIndexedSeq(1)
--- End diff --
query.output.last
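The suggestion can be shown with a plain Scala collection — a minimal, self-contained sketch where a `Seq[String]` stands in for Spark's `Seq[AttributeReference]`:

```scala
// Stand-in for query.output; a plain Seq rather than Spark's Seq[AttributeReference].
val output = Seq("X", "y")

// Indexed access, as written in the PR:
val byIndex = (output.toIndexedSeq(0), output.toIndexedSeq(1))

// head/last, as suggested in the review -- same result, clearer intent:
val byName = (output.head, output.last)

assert(byIndex == byName)
```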
---
[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...
Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22043#discussion_r209015928
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
@@ -0,0 +1,411 @@
+ val X = query.output.toIndexedSeq(0)
+ val y = query.output.toIndexedSeq(1)
--- End diff --
Fixed.
---
[GitHub] spark issue #22043: [SPARK-24251][SQL] Add analysis tests for AppendData.
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/22043
Merged build finished. Test PASSed.
---
[GitHub] spark issue #22043: [SPARK-24251][SQL] Add analysis tests for AppendData.
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22043
**[Test build #94516 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94516/testReport)** for PR 22043 at commit [`765c5b4`](https://github.com/apache/spark/commit/765c5b4fb7dd8f90a1a0e71d43ee4f2312c39552).
---
[GitHub] spark issue #22043: [SPARK-24251][SQL] Add analysis tests for AppendData.
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22043
**[Test build #94516 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94516/testReport)** for PR 22043 at commit [`765c5b4`](https://github.com/apache/spark/commit/765c5b4fb7dd8f90a1a0e71d43ee4f2312c39552).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #22043: [SPARK-24251][SQL] Add analysis tests for AppendData.
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/22043
**[Test build #94449 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/94449/testReport)** for PR 22043 at commit [`e58d4fc`](https://github.com/apache/spark/commit/e58d4fc666aa3d13c5d24af3823afc4c4bc31535).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...
Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on a diff in the pull request:
https://github.com/apache/spark/pull/22043#discussion_r208991938
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
@@ -0,0 +1,411 @@
+ test("Append.byName: case insensitive column resolution") {
+ val query = TestRelation(StructType(Seq(
+ StructField("X", FloatType), // doesn't match case!
+ StructField("y", FloatType))).toAttributes)
+
+ val X = query.output.toIndexedSeq(0)
+ val y = query.output.toIndexedSeq(1)
+
+ val parsedPlan = AppendData.byName(table, query)
+ val expectedPlan = AppendData.byName(table,
+ Project(Seq(
+ Alias(Cast(toLower(X), FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
--- End diff --
I don't know, but this is required for the tests to pass. Other parts of the code also assume that case may not match (like in [PruneFileSourcePartitions](https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/PruneFileSourcePartitions.scala#L42-L50)), so I didn't investigate further.
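As a toy illustration of why the expected plan lower-cases the mismatched attribute: case-insensitive resolution is commonly implemented by comparing lower-cased names. This is not Spark's resolver — the `resolve` helper below is assumed purely for illustration:

```scala
import java.util.Locale

// Hypothetical resolver, for illustration only (not Spark's implementation).
def resolve(columns: Seq[String], name: String, caseSensitive: Boolean): Option[String] =
  if (caseSensitive) columns.find(_ == name)
  else columns.find(_.toLowerCase(Locale.ROOT) == name.toLowerCase(Locale.ROOT))

// Case-sensitive analysis cannot match data column "X" to output column "x"...
assert(resolve(Seq("X", "y"), "x", caseSensitive = true).isEmpty)
// ...but case-insensitive analysis can, which is why the test's expected plan
// carries a lower-cased form of the attribute into the cast.
assert(resolve(Seq("X", "y"), "x", caseSensitive = false).contains("X"))
```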
---
[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/22043#discussion_r208895791
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
@@ -0,0 +1,411 @@
+ test("Append.byName: case sensitive column resolution") {
+ val query = TestRelation(StructType(Seq(
+ StructField("X", FloatType), // doesn't match case!
+ StructField("y", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
--- End diff --
it's clearer to specify the `caseSensitive` parameter of `assertAnalysisError`
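A sketch of the suggestion — naming the flag at the call site. The helper body here is a stub; the real `assertAnalysisError` lives in `AnalysisTest`, and its exact signature is assumed for illustration:

```scala
// Stub standing in for AnalysisTest.assertAnalysisError; signature assumed, body is a placeholder.
def assertAnalysisError(
    plan: String,
    expectedErrors: Seq[String],
    caseSensitive: Boolean = true): Unit = {
  // A real implementation would run the analyzer with the given case sensitivity
  // and check that every expected fragment appears in the analysis error message.
  assert(plan.nonEmpty && expectedErrors.nonEmpty)
}

// Relying on the default hides the intent of a case-sensitivity test:
assertAnalysisError("parsedPlan", Seq("Cannot find data for output column"))
// Naming the parameter makes the assumption explicit at the call site:
assertAnalysisError("parsedPlan", Seq("Cannot find data for output column"), caseSensitive = true)
```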
---
[GitHub] spark pull request #22043: [SPARK-24251][SQL] Add analysis tests for AppendD...
Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:
https://github.com/apache/spark/pull/22043#discussion_r208899715
--- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/analysis/DataSourceV2AnalysisSuite.scala ---
@@ -0,0 +1,411 @@
+ test("Append.byName: fail canWrite check") {
+ val parsedPlan = AppendData.byName(table, widerTable)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write", "'table-name'",
+ "Cannot safely cast", "'x'", "'y'", "DoubleType to FloatType"))
+ }
+
+ test("Append.byName: insert safe cast") {
+ val x = table.output.toIndexedSeq(0)
+ val y = table.output.toIndexedSeq(1)
+
+ val parsedPlan = AppendData.byName(widerTable, table)
+ val expectedPlan = AppendData.byName(widerTable,
+ Project(Seq(
+ Alias(Cast(x, DoubleType, Some(conf.sessionLocalTimeZone)), "x")(),
+ Alias(Cast(y, DoubleType, Some(conf.sessionLocalTimeZone)), "y")()),
+ table))
+
+ assertNotResolved(parsedPlan)
+ checkAnalysis(parsedPlan, expectedPlan)
+ assertResolved(expectedPlan)
+ }
+
+ test("Append.byName: fail extra data fields") {
+ val query = TestRelation(StructType(Seq(
+ StructField("x", FloatType),
+ StructField("y", FloatType),
+ StructField("z", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(table, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write", "'table-name'", "too many data columns",
+ "Table columns: 'x', 'y'",
+ "Data columns: 'x', 'y', 'z'"))
+ }
+
+ test("Append.byName: multiple field errors are reported") {
+ val xRequiredTable = TestRelation(StructType(Seq(
+ StructField("x", FloatType, nullable = false),
+ StructField("y", DoubleType))).toAttributes)
+
+ val query = TestRelation(StructType(Seq(
+ StructField("x", DoubleType),
+ StructField("b", FloatType))).toAttributes)
+
+ val parsedPlan = AppendData.byName(xRequiredTable, query)
+
+ assertNotResolved(parsedPlan)
+ assertAnalysisError(parsedPlan, Seq(
+ "Cannot write incompatible data to table", "'table-name'",
+ "Cannot safely cast", "'x'", "DoubleType to FloatType",
+ "Cannot write nullable values to non-null column", "'x'",
+ "Cannot find data for output column", "'y'"))
+ }
+
+ test("Append.byPosition: basic behavior") {
+ val query = TestRelation(StructType(Seq(
+ StructField("a", FloatType),
+ StructField("b", FloatType))).toAttributes)
+
+ val a = query.output.toIndexedSeq(0)
+ val b = query.output.toIndexedSeq(1)
+
+ val parsedPlan = AppendData.byPosition(table, query)
+ val expectedPlan = AppendData.byPosition(table,
+ Project(Seq(
+ Alias(Cast(a, FloatType, Some(conf.sessionLocalTimeZone)), "x")(),
+ Alias(Cast(b, FloatType, Some(conf.sessionLocalTimeZone)), "y")()),
+ query))
+
+ assertNotResolved(parsedPlan)
+ checkAnalysis(parsedPlan, expectedPlan, caseSensitive = false)
+ assertResolved(expectedPlan)
+ }
+
+ test("Append.byPosition: case does not fail column resolution") {
--- End diff --
do we need this test? In "Append.byPosition: basic behavior" we already showed that the append works even when the column names are different.
---