Posted to reviews@spark.apache.org by dongjoon-hyun <gi...@git.apache.org> on 2017/10/20 21:27:56 UTC
[GitHub] spark pull request #19545: [SPARK-21929][SQL] Support `ALTER TABLE table_nam...
GitHub user dongjoon-hyun opened a pull request:
https://github.com/apache/spark/pull/19545
[SPARK-21929][SQL] Support `ALTER TABLE table_name ADD COLUMNS(..)` for ORC data source
## What changes were proposed in this pull request?
When SPARK-19261 implemented `ALTER TABLE ADD COLUMNS`, the ORC data source was omitted because of SPARK-14387, SPARK-16628, and SPARK-18355. Those issues are now fixed, and Spark 2.3 uses the Spark schema to read ORC tables instead of the ORC file schema. This PR enables `ALTER TABLE ADD COLUMNS` for the ORC data source.
## How was this patch tested?
This patch passes the updated and newly added test cases.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/dongjoon-hyun/spark SPARK-21929
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/19545.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #19545
---
---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org
[GitHub] spark issue #19545: [SPARK-21929][SQL] Support `ALTER TABLE table_name ADD C...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19545
**[Test build #82949 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82949/testReport)** for PR 19545 at commit [`dfc59fc`](https://github.com/apache/spark/commit/dfc59fc7426eedf40b66d63f75d6f3f133ec240d).
---
[GitHub] spark issue #19545: [SPARK-21929][SQL] Support `ALTER TABLE table_name ADD C...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:
https://github.com/apache/spark/pull/19545
Hi, @gatorsmile. Could you review this PR?
---
[GitHub] spark issue #19545: [SPARK-21929][SQL] Support `ALTER TABLE table_name ADD C...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19545
Merged build finished. Test FAILed.
---
[GitHub] spark issue #19545: [SPARK-21929][SQL] Support `ALTER TABLE table_name ADD C...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:
https://github.com/apache/spark/pull/19545
Retest this please.
---
[GitHub] spark issue #19545: [SPARK-21929][SQL] Support `ALTER TABLE table_name ADD C...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/19545
Thanks! Merged to master.
---
[GitHub] spark issue #19545: [SPARK-21929][SQL] Support `ALTER TABLE table_name ADD C...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19545
**[Test build #82939 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82939/testReport)** for PR 19545 at commit [`cc52547`](https://github.com/apache/spark/commit/cc525479951868ff7094097aea886819c29fb549).
---
[GitHub] spark issue #19545: [SPARK-21929][SQL] Support `ALTER TABLE table_name ADD C...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:
https://github.com/apache/spark/pull/19545
Thanks a lot, @gatorsmile!
---
[GitHub] spark issue #19545: [SPARK-21929][SQL] Support `ALTER TABLE table_name ADD C...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19545
**[Test build #82949 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82949/testReport)** for PR 19545 at commit [`dfc59fc`](https://github.com/apache/spark/commit/dfc59fc7426eedf40b66d63f75d6f3f133ec240d).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark pull request #19545: [SPARK-21929][SQL] Support `ALTER TABLE table_nam...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/19545#discussion_r146096038
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ---
@@ -2202,56 +2202,64 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils {
     }
   }
+  def testAddColumn(provider: String): Unit = {
+    withTable("t1") {
+      sql(s"CREATE TABLE t1 (c1 int) USING $provider")
+      sql("INSERT INTO t1 VALUES (1)")
+      sql("ALTER TABLE t1 ADD COLUMNS (c2 int)")
+      checkAnswer(
+        spark.table("t1"),
+        Seq(Row(1, null))
+      )
+      checkAnswer(
+        sql("SELECT * FROM t1 WHERE c2 is null"),
+        Seq(Row(1, null))
+      )
+
+      sql("INSERT INTO t1 VALUES (3, 2)")
+      checkAnswer(
+        sql("SELECT * FROM t1 WHERE c2 = 2"),
+        Seq(Row(3, 2))
+      )
+    }
+  }
+
+  def testAddColumnPartitioned(provider: String): Unit = {
--- End diff --
Nit: `protected`
---
[GitHub] spark issue #19545: [SPARK-21929][SQL] Support `ALTER TABLE table_name ADD C...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19545
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82949/
Test PASSed.
---
[GitHub] spark pull request #19545: [SPARK-21929][SQL] Support `ALTER TABLE table_nam...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on a diff in the pull request:
https://github.com/apache/spark/pull/19545#discussion_r146090102
--- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/command/tables.scala ---
@@ -235,11 +235,10 @@ case class AlterTableAddColumnsCommand(
       DataSource.lookupDataSource(catalogTable.provider.get).newInstance() match {
         // For datasource table, this command can only support the following File format.
         // TextFileFormat only default to one column "value"
-        // OrcFileFormat can not handle difference between user-specified schema and
-        // inferred schema yet. TODO, once this issue is resolved , we can add Orc back.
         // Hive type is already considered as hive serde table, so the logic will not
         // come in here.
         case _: JsonFileFormat | _: CSVFileFormat | _: ParquetFileFormat =>
+        case s if s.getClass.getCanonicalName.endsWith("OrcFileFormat") =>
--- End diff --
After implementing `OrcFileFormat` based on Apache ORC, we can move `OrcFileFormat` from the `sql/hive` module into the `sql/core` module.
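For readers following the diff above: the ORC case matches on the class name rather than on the type, presumably because `OrcFileFormat` still lives in `sql/hive` and cannot be referenced from `sql/core` at compile time. Below is a minimal, Spark-free sketch of that pattern. Every class here and the `supportsAddColumns` helper are hypothetical stand-ins, not Spark's actual code, and the `Option` wrapper is an added null guard that the diff does not have.

```scala
// Hypothetical stand-ins for Spark's file formats (not the real classes).
trait FileFormat
class JsonFileFormat extends FileFormat
class CSVFileFormat extends FileFormat
class ParquetFileFormat extends FileFormat
class TextFileFormat extends FileFormat
// Imagine this one is defined in a module the caller cannot import.
class OrcFileFormat extends FileFormat

object AddColumnsCheck {
  // Returns true when the format is on the allow-list for ADD COLUMNS.
  def supportsAddColumns(format: FileFormat): Boolean = format match {
    // Formats visible at compile time are matched by type.
    case _: JsonFileFormat | _: CSVFileFormat | _: ParquetFileFormat => true
    // A format that is not visible at compile time is matched by class name.
    // getCanonicalName returns null for anonymous or local classes, hence Option.
    case s if Option(s.getClass.getCanonicalName).exists(_.endsWith("OrcFileFormat")) => true
    case _ => false
  }
}
```

The suffix match accepts any class whose name ends in `OrcFileFormat`, so it is looser than a type test. That looseness is the trade-off for avoiding a compile-time dependency.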
---
[GitHub] spark pull request #19545: [SPARK-21929][SQL] Support `ALTER TABLE table_nam...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on a diff in the pull request:
https://github.com/apache/spark/pull/19545#discussion_r146096042
--- Diff: sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala ---
@@ -2202,56 +2202,64 @@ abstract class DDLSuite extends QueryTest with SQLTestUtils {
     }
   }
+  def testAddColumn(provider: String): Unit = {
--- End diff --
Nit: `protected`
---
[GitHub] spark issue #19545: [SPARK-21929][SQL] Support `ALTER TABLE table_name ADD C...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19545
Test FAILed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82942/
Test FAILed.
---
[GitHub] spark pull request #19545: [SPARK-21929][SQL] Support `ALTER TABLE table_nam...
Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:
https://github.com/apache/spark/pull/19545
---
[GitHub] spark issue #19545: [SPARK-21929][SQL] Support `ALTER TABLE table_name ADD C...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19545
**[Test build #82942 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82942/testReport)** for PR 19545 at commit [`dfc59fc`](https://github.com/apache/spark/commit/dfc59fc7426eedf40b66d63f75d6f3f133ec240d).
* This patch **fails due to an unknown error code, -9**.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #19545: [SPARK-21929][SQL] Support `ALTER TABLE table_name ADD C...
Posted by gatorsmile <gi...@git.apache.org>.
Github user gatorsmile commented on the issue:
https://github.com/apache/spark/pull/19545
LGTM
---
[GitHub] spark issue #19545: [SPARK-21929][SQL] Support `ALTER TABLE table_name ADD C...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19545
**[Test build #82939 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82939/testReport)** for PR 19545 at commit [`cc52547`](https://github.com/apache/spark/commit/cc525479951868ff7094097aea886819c29fb549).
* This patch passes all tests.
* This patch merges cleanly.
* This patch adds no public classes.
---
[GitHub] spark issue #19545: [SPARK-21929][SQL] Support `ALTER TABLE table_name ADD C...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19545
Test PASSed.
Refer to this link for build results (access rights to CI server needed):
https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/82939/
Test PASSed.
---
[GitHub] spark issue #19545: [SPARK-21929][SQL] Support `ALTER TABLE table_name ADD C...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19545
Merged build finished. Test PASSed.
---
[GitHub] spark issue #19545: [SPARK-21929][SQL] Support `ALTER TABLE table_name ADD C...
Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:
https://github.com/apache/spark/pull/19545
Merged build finished. Test PASSed.
---
[GitHub] spark issue #19545: [SPARK-21929][SQL] Support `ALTER TABLE table_name ADD C...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:
https://github.com/apache/spark/pull/19545
It’s ready for review again, @gatorsmile. Thanks.
---
[GitHub] spark issue #19545: [SPARK-21929][SQL] Support `ALTER TABLE table_name ADD C...
Posted by dongjoon-hyun <gi...@git.apache.org>.
Github user dongjoon-hyun commented on the issue:
https://github.com/apache/spark/pull/19545
Thank you for the review, @gatorsmile! I updated the PR.
---
[GitHub] spark issue #19545: [SPARK-21929][SQL] Support `ALTER TABLE table_name ADD C...
Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:
https://github.com/apache/spark/pull/19545
**[Test build #82942 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/82942/testReport)** for PR 19545 at commit [`dfc59fc`](https://github.com/apache/spark/commit/dfc59fc7426eedf40b66d63f75d6f3f133ec240d).
---