Posted to reviews@spark.apache.org by yhuai <gi...@git.apache.org> on 2016/06/19 07:12:21 UTC

[GitHub] spark pull request #13769: [SPARK-16030] [SQL] Allow specifying static parti...

GitHub user yhuai opened a pull request:

    https://github.com/apache/spark/pull/13769

    [SPARK-16030] [SQL] Allow specifying static partitions when inserting to data source tables

    ## What changes were proposed in this pull request?
    This PR adds static partition support to the INSERT statement when the target table is a data source table.
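
    For illustration, a minimal sketch of what this enables (the table and column names below are made up):

    ```scala
    // Assume a data source table `t` with data columns (a, b), partitioned by (p1, p2).
    // Fully static: both partition values are constants in the PARTITION clause.
    spark.sql("INSERT INTO TABLE t PARTITION (p1 = 1, p2 = 2) SELECT 3, 'four'")
    // Mixed: p1 is static, p2 is dynamic and comes from the trailing SELECT column.
    spark.sql("INSERT INTO TABLE t PARTITION (p1 = 1, p2) SELECT 3, 'four', 2")
    ```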
    
    ## How was this patch tested?
    New tests in InsertIntoHiveTableSuite and DataSourceAnalysisSuite.
    
    **Note: This PR is based on https://github.com/apache/spark/pull/13766. The last commit is the actual change.**

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/yhuai/spark SPARK-16030-1

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/13769.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #13769
    
----
commit 03aea3c75c6b4897d3d65d76e3e877b39fb3b3fd
Author: Yin Huai <yh...@databricks.com>
Date:   2016-06-19T01:26:53Z

    Remove unneeded code in DataFrameWriter.insertInto.

commit 1e43dffe6712d618a6017891da2d78050c560ceb
Author: Yin Huai <yh...@databricks.com>
Date:   2016-06-19T01:29:30Z

    Follow up of SPARK-16033. Improve the error message (https://github.com/apache/spark/pull/13747#discussion_r67591585).

commit 28d130dabda491d19f4ec1b625597f8cab4e2b6d
Author: Yin Huai <yh...@databricks.com>
Date:   2016-06-19T03:52:55Z

    Follow up of SPARK-16034.

commit 13008137c86321f04b2b54fbaaaebb8f3643bf75
Author: Yin Huai <yh...@databricks.com>
Date:   2016-06-19T04:45:59Z

    Follow up of SPARK-16036

commit 5315b80128e5cabc31ec0a77df9aca9a42aa10c5
Author: Yin Huai <yh...@databricks.com>
Date:   2016-06-19T07:10:38Z

    [SPARK-16030] [SQL] Allow specifying static partitions when inserting to data source tables

----




[GitHub] spark issue #13769: [SPARK-16030] [SQL] Allow specifying static partitions w...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13769
  
    Merged build finished. Test FAILed.




[GitHub] spark issue #13769: [SPARK-16030] [SQL] Allow specifying static partitions w...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on the issue:

    https://github.com/apache/spark/pull/13769
  
    https://github.com/apache/spark/pull/13769/commits/5315b80128e5cabc31ec0a77df9aca9a42aa10c5 is the actual change. Other commits are from https://github.com/apache/spark/pull/13766. 




[GitHub] spark issue #13769: [SPARK-16030] [SQL] Allow specifying static partitions w...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13769
  
    **[Test build #60833 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60833/consoleFull)** for PR 13769 at commit [`ba9c04c`](https://github.com/apache/spark/commit/ba9c04cfe46680e5145859b086357f3ed1a76ff1).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #13769: [SPARK-16030] [SQL] Allow specifying static parti...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13769#discussion_r67638513
  
    --- Diff: sql/catalyst/src/main/scala/org/apache/spark/sql/catalyst/analysis/CheckAnalysis.scala ---
    @@ -313,13 +313,32 @@ trait CheckAnalysis extends PredicateHelper {
                      |${s.catalogTable.identifier}
                    """.stripMargin)
     
    +          // TODO: We need to consolidate this kind of checks for InsertIntoTable
    +          // with the rule of PreWriteCheck defined in extendedCheckRules.
               case InsertIntoTable(s: SimpleCatalogRelation, _, _, _, _) =>
                 failAnalysis(
                   s"""
                      |Hive support is required to insert into the following tables:
                      |${s.catalogTable.identifier}
                    """.stripMargin)
     
    +          case InsertIntoTable(t, _, _, _, _)
    --- End diff --
    
    Why do we move these checks from `PreWriteCheck` to here?




[GitHub] spark pull request #13769: [SPARK-16030] [SQL] Allow specifying static parti...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13769#discussion_r67637488
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ---
    @@ -43,8 +43,128 @@ import org.apache.spark.unsafe.types.UTF8String
      * Replaces generic operations with specific variants that are designed to work with Spark
      * SQL Data Sources.
      */
    -private[sql] object DataSourceAnalysis extends Rule[LogicalPlan] {
    +private[sql] case class DataSourceAnalysis(conf: CatalystConf) extends Rule[LogicalPlan] {
    +
    +  def resolver: Resolver = {
    +    if (conf.caseSensitiveAnalysis) {
    +      caseSensitiveResolution
    +    } else {
    +      caseInsensitiveResolution
    +    }
    +  }
    +
    +  // The access modifier is used to expose this method to tests.
    +  private[sql] def convertStaticPartitions(
    +    sourceAttributes: Seq[Attribute],
    +    providedPartitions: Map[String, Option[String]],
    +    targetAttributes: Seq[Attribute],
    +    targetPartitionSchema: StructType): Seq[NamedExpression] = {
    +
    +    assert(providedPartitions.exists(_._2.isDefined))
    +
    +    val staticPartitions = providedPartitions.flatMap {
    +      case (partKey, Some(partValue)) => (partKey, partValue) :: Nil
    +      case (_, None) => Nil
    +    }
    +
    +    // The sum of the number of static partition columns and columns provided in the SELECT
    +    // clause needs to match the number of columns of the target table.
    +    if (staticPartitions.size + sourceAttributes.size != targetAttributes.size) {
    +      throw new AnalysisException(
    +        s"The data to be inserted needs to have the same number of " +
    +          s"columns as the target table: target table has ${targetAttributes.size} " +
    +          s"column(s) but the inserted data has ${sourceAttributes.size + staticPartitions.size} " +
    +          s"column(s), which contain ${staticPartitions.size} partition column(s) having " +
    +          s"assigned constant values.")
    +    }
    +
    +    if (providedPartitions.size != targetPartitionSchema.fields.size) {
    +      throw new AnalysisException(
    +        s"The data to be inserted needs to have the same number of " +
    +          s"partition columns as the target table: target table " +
    +          s"has ${targetPartitionSchema.fields.size} partition column(s) but the inserted " +
    +          s"data has ${providedPartitions.size} partition columns specified.")
    +    }
    +
    +    staticPartitions.foreach {
    +      case (partKey, partValue) =>
    +        if (!targetPartitionSchema.fields.exists(field => resolver(field.name, partKey))) {
    +          throw new AnalysisException(
    +            s"$partKey is not a partition column. Partition columns are " +
    +              s"${targetPartitionSchema.fields.map(_.name).mkString("[", ",", "]")}")
    +        }
    +    }
    +
    +    val partitionList = targetPartitionSchema.fields.map { field =>
    +      val potentialSpecs = staticPartitions.filter {
    +        case (partKey, partValue) => resolver(field.name, partKey)
    +      }
    +      if (potentialSpecs.size == 0) {
    +        None
    +      } else if (potentialSpecs.size == 1) {
    +        val partValue = potentialSpecs.head._2
    +        Some(Alias(Cast(Literal(partValue), field.dataType), "_staticPart")())
    +      } else {
    +        throw new AnalysisException(
    +          s"Partition column ${field.name} have multiple values specified, " +
    +            s"${potentialSpecs.mkString("[", ", ", "]")}. Please only specify a single value.")
    +      }
    +    }
    +
    +    partitionList.sliding(2).foreach { v =>
    --- End diff --
    
    Thanks!




[GitHub] spark issue #13769: [SPARK-16030] [SQL] Allow specifying static partitions w...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13769
  
    **[Test build #60812 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60812/consoleFull)** for PR 13769 at commit [`7dc04af`](https://github.com/apache/spark/commit/7dc04af4e8c4a6ff978d277335dce81654260ed8).




[GitHub] spark issue #13769: [SPARK-16030] [SQL] Allow specifying static partitions w...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13769
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request #13769: [SPARK-16030] [SQL] Allow specifying static parti...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13769#discussion_r67611543
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ---
    @@ -43,8 +43,128 @@ import org.apache.spark.unsafe.types.UTF8String
      * Replaces generic operations with specific variants that are designed to work with Spark
      * SQL Data Sources.
      */
    -private[sql] object DataSourceAnalysis extends Rule[LogicalPlan] {
    +private[sql] case class DataSourceAnalysis(conf: CatalystConf) extends Rule[LogicalPlan] {
    +
    +  def resolver: Resolver = {
    +    if (conf.caseSensitiveAnalysis) {
    +      caseSensitiveResolution
    +    } else {
    +      caseInsensitiveResolution
    +    }
    +  }
    +
    +  // The access modifier is used to expose this method to tests.
    +  private[sql] def convertStaticPartitions(
    --- End diff --
    
    This is the new rule: it uses a dynamic partition insert to evaluate INSERT statements that specify static partitions when the target table is a HadoopFsRelation.
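
    Concretely, the rewrite wraps each static value as a typed constant column, using the same expression shape as the diff above (a sketch; the table, column names, and the INT type are assumptions):

    ```scala
    import org.apache.spark.sql.catalyst.expressions.{Alias, Cast, Literal}
    import org.apache.spark.sql.types.IntegerType

    // The static value "1" for an assumed INT partition column p1:
    val staticValue = Alias(Cast(Literal("1"), IntegerType), "_staticPart")()
    // so that, conceptually,
    //   INSERT INTO t PARTITION (p1 = 1, p2) SELECT a, b, c
    // is evaluated as the dynamic partition insert
    //   INSERT INTO t PARTITION (p1, p2) SELECT a, b, CAST('1' AS INT), c
    ```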




[GitHub] spark issue #13769: [SPARK-16030] [SQL] Allow specifying static partitions w...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13769
  
    **[Test build #60833 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60833/consoleFull)** for PR 13769 at commit [`ba9c04c`](https://github.com/apache/spark/commit/ba9c04cfe46680e5145859b086357f3ed1a76ff1).




[GitHub] spark issue #13769: [SPARK-16030] [SQL] Allow specifying static partitions w...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13769
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60833/
    Test PASSed.




[GitHub] spark pull request #13769: [SPARK-16030] [SQL] Allow specifying static parti...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13769#discussion_r67611526
  
    --- Diff: sql/hive/src/main/scala/org/apache/spark/sql/hive/execution/InsertIntoHiveTable.scala ---
    @@ -196,7 +197,7 @@ case class InsertIntoHiveTable(
           // Report error if any static partition appears after a dynamic partition
           val isDynamic = partitionColumnNames.map(partitionSpec(_).isEmpty)
           if (isDynamic.init.zip(isDynamic.tail).contains((true, false))) {
    -        throw new SparkException(ErrorMsg.PARTITION_DYN_STA_ORDER.getMsg)
    +        throw new AnalysisException(ErrorMsg.PARTITION_DYN_STA_ORDER.getMsg)
    --- End diff --
    
    We can have another PR to change other exceptions in this node from `SparkException` to `AnalysisException`.




[GitHub] spark issue #13769: [SPARK-16030] [SQL] Allow specifying static partitions w...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13769
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request #13769: [SPARK-16030] [SQL] Allow specifying static parti...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13769#discussion_r67643557
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/DataSourceAnalysisSuite.scala ---
    @@ -0,0 +1,202 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.sources
    +
    +import org.scalatest.BeforeAndAfterAll
    +
    +import org.apache.spark.SparkFunSuite
    +import org.apache.spark.sql.AnalysisException
    +import org.apache.spark.sql.catalyst.SimpleCatalystConf
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.expressions.{Alias, Attribute, Cast, Expression, Literal}
    +import org.apache.spark.sql.execution.datasources.DataSourceAnalysis
    +import org.apache.spark.sql.types.{IntegerType, StructType}
    +
    +class DataSourceAnalysisSuite extends SparkFunSuite with BeforeAndAfterAll {
    +
    +  private var targetAttributes: Seq[Attribute] = _
    +  private var targetPartitionSchema: StructType = _
    --- End diff --
    
    Why are these `var`s? Seems that they are immutable throughout this test suite.
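
    A minimal sketch of the `val`-based alternative (the concrete values are made up, since the suite's initialization isn't shown in this hunk):

    ```scala
    private val targetPartitionSchema = new StructType()
      .add("b", IntegerType)
      .add("c", IntegerType)
    private val targetAttributes: Seq[Attribute] =
      Seq('a.int, 'd.int) ++ targetPartitionSchema.toAttributes
    ```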




[GitHub] spark issue #13769: [SPARK-16030] [SQL] Allow specifying static partitions w...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13769
  
    Test FAILed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60805/
    Test FAILed.




[GitHub] spark issue #13769: [SPARK-16030] [SQL] Allow specifying static partitions w...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13769
  
    Merged build finished. Test PASSed.




[GitHub] spark pull request #13769: [SPARK-16030] [SQL] Allow specifying static parti...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13769#discussion_r67611551
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/DataSourceAnalysisSuite.scala ---
    @@ -0,0 +1,190 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.sources
    +
    +import org.scalatest.BeforeAndAfterAll
    +
    +import org.apache.spark.SparkFunSuite
    +import org.apache.spark.sql.AnalysisException
    +import org.apache.spark.sql.catalyst.SimpleCatalystConf
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.expressions.{Alias, Attribute, Cast, Expression, Literal}
    +import org.apache.spark.sql.execution.datasources.DataSourceAnalysis
    +import org.apache.spark.sql.types.{IntegerType, StructType}
    +
    +class DataSourceAnalysisSuite extends SparkFunSuite with BeforeAndAfterAll {
    --- End diff --
    
    Tests in this file are testing the behavior of DataSourceAnalysis.convertStaticPartitions.
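
    For instance, a test in this style might look like the following (a sketch built only from the imports and method signature shown above; the attribute names and values are made up):

    ```scala
    test("static partition value is cast to the partition column's type") {
      val rule = DataSourceAnalysis(SimpleCatalystConf(caseSensitiveAnalysis = true))
      val project = rule.convertStaticPartitions(
        sourceAttributes = Seq('a.int, 'd.int, 'c.int), // 'c feeds the dynamic partition c
        providedPartitions = Map("b" -> Some("1"), "c" -> None),
        targetAttributes = Seq('a.int, 'd.int, 'b.int, 'c.int),
        targetPartitionSchema = new StructType().add("b", IntegerType).add("c", IntegerType))
      // The static value "1" should surface as a constant cast to the partition type.
      assert(project.exists {
        case Alias(Cast(Literal(v, _), IntegerType), _) => v.toString == "1"
        case _ => false
      })
    }
    ```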




[GitHub] spark pull request #13769: [SPARK-16030] [SQL] Allow specifying static parti...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13769#discussion_r67617959
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertIntoHiveTableSuite.scala ---
    @@ -372,4 +380,93 @@ class InsertIntoHiveTableSuite extends QueryTest with TestHiveSingleton with Bef
           assert(!logical.resolved, "Should not resolve: missing partition data")
         }
       }
    +
    +  testPartitionedTable(
    +    "SPARK-16036: better error message when insert into a table with mismatch schema") {
    +    tableName =>
    +      val e = intercept[AnalysisException] {
    +        sql(s"INSERT INTO TABLE $tableName PARTITION(b=1, c=2) SELECT 1, 2, 3")
    +      }
    +      assert(e.message.contains("the number of columns are different"))
    +  }
    +
    +  testPartitionedTable("SPARK-16037: INSERT statement should match columns by position") {
    +    tableName =>
    +      withSQLConf("hive.exec.dynamic.partition.mode" -> "nonstrict") {
    +        sql(s"INSERT INTO TABLE $tableName SELECT 1, 4, 2 AS c, 3 AS b")
    +        checkAnswer(sql(s"SELECT a, b, c, d FROM $tableName"), Row(1, 2, 3, 4))
    +        sql(s"INSERT OVERWRITE TABLE $tableName SELECT 1, 4, 2, 3")
    +        checkAnswer(sql(s"SELECT a, b, c, 4 FROM $tableName"), Row(1, 2, 3, 4))
    +      }
    +  }
    +
    +  testPartitionedTable("INSERT INTO a partitioned table (semantic and error handling)") {
    --- End diff --
    
    These are new tests.




[GitHub] spark pull request #13769: [SPARK-16030] [SQL] Allow specifying static parti...

Posted by rxin <gi...@git.apache.org>.
Github user rxin commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13769#discussion_r67749677
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ---
    @@ -43,8 +43,128 @@ import org.apache.spark.unsafe.types.UTF8String
      * Replaces generic operations with specific variants that are designed to work with Spark
      * SQL Data Sources.
      */
    -private[sql] object DataSourceAnalysis extends Rule[LogicalPlan] {
    +private[sql] case class DataSourceAnalysis(conf: CatalystConf) extends Rule[LogicalPlan] {
    +
    +  def resolver: Resolver = {
    +    if (conf.caseSensitiveAnalysis) {
    +      caseSensitiveResolution
    +    } else {
    +      caseInsensitiveResolution
    +    }
    +  }
    +
    +  // The access modifier is used to expose this method to tests.
    +  private[sql] def convertStaticPartitions(
    --- End diff --
    
    Yea that's part of this: https://issues.apache.org/jira/browse/SPARK-15691
    
    Let me know if you want to help out.





[GitHub] spark issue #13769: [SPARK-16030] [SQL] Allow specifying static partitions w...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on the issue:

    https://github.com/apache/spark/pull/13769
  
    LGTM except for minor issues. I'm merging this to master and branch-2.0 first since this is a critical issue. We can address the comments later.




[GitHub] spark pull request #13769: [SPARK-16030] [SQL] Allow specifying static parti...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13769#discussion_r67638211
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ---
    @@ -43,8 +43,127 @@ import org.apache.spark.unsafe.types.UTF8String
      * Replaces generic operations with specific variants that are designed to work with Spark
      * SQL Data Sources.
      */
    -private[sql] object DataSourceAnalysis extends Rule[LogicalPlan] {
    +private[sql] case class DataSourceAnalysis(conf: CatalystConf) extends Rule[LogicalPlan] {
    +
    +  def resolver: Resolver = {
    +    if (conf.caseSensitiveAnalysis) {
    +      caseSensitiveResolution
    +    } else {
    +      caseInsensitiveResolution
    +    }
    +  }
    +
    +  // The access modifier is used to expose this method to tests.
    +  private[sql] def convertStaticPartitions(
    +    sourceAttributes: Seq[Attribute],
    +    providedPartitions: Map[String, Option[String]],
    +    targetAttributes: Seq[Attribute],
    +    targetPartitionSchema: StructType): Seq[NamedExpression] = {
    +
    +    assert(providedPartitions.exists(_._2.isDefined))
    +
    +    val staticPartitions = providedPartitions.flatMap {
    +      case (partKey, Some(partValue)) => (partKey, partValue) :: Nil
    +      case (_, None) => Nil
    +    }
    +
    +    // The sum of the number of static partition columns and columns provided in the SELECT
    +    // clause needs to match the number of columns of the target table.
    +    if (staticPartitions.size + sourceAttributes.size != targetAttributes.size) {
    --- End diff --
    
    Looks like we already have this check somewhere?




[GitHub] spark issue #13769: [SPARK-16030] [SQL] Allow specifying static partitions w...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13769
  
    **[Test build #60831 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60831/consoleFull)** for PR 13769 at commit [`6bd7b6f`](https://github.com/apache/spark/commit/6bd7b6fb992af794216fa7752d25747d64d82280).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #13769: [SPARK-16030] [SQL] Allow specifying static partitions w...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13769
  
    **[Test build #60812 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60812/consoleFull)** for PR 13769 at commit [`7dc04af`](https://github.com/apache/spark/commit/7dc04af4e8c4a6ff978d277335dce81654260ed8).
     * This patch passes all tests.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark pull request #13769: [SPARK-16030] [SQL] Allow specifying static parti...

Posted by rdblue <gi...@git.apache.org>.
Github user rdblue commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13769#discussion_r67749047
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ---
    @@ -43,8 +43,128 @@ import org.apache.spark.unsafe.types.UTF8String
      * Replaces generic operations with specific variants that are designed to work with Spark
      * SQL Data Sources.
      */
    -private[sql] object DataSourceAnalysis extends Rule[LogicalPlan] {
    +private[sql] case class DataSourceAnalysis(conf: CatalystConf) extends Rule[LogicalPlan] {
    +
    +  def resolver: Resolver = {
    +    if (conf.caseSensitiveAnalysis) {
    +      caseSensitiveResolution
    +    } else {
    +      caseInsensitiveResolution
    +    }
    +  }
    +
    +  // The access modifier is used to expose this method to tests.
    +  private[sql] def convertStaticPartitions(
    --- End diff --
    
    Why is this rule only applied to HadoopFsRelation? It would be fine to change Hive to write using this rule as well and we would need fewer relation-specific rules.
    
    This isn't a huge issue, but I'm concerned about the proliferation of fixes for either Hive or data sources that are never applied to the other. We should be consolidating the implementation wherever possible.




[GitHub] spark pull request #13769: [SPARK-16030] [SQL] Allow specifying static parti...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13769#discussion_r67618071
  
    --- Diff: sql/core/src/test/scala/org/apache/spark/sql/sources/DataSourceAnalysisSuite.scala ---
    @@ -0,0 +1,190 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *    http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.spark.sql.sources
    +
    +import org.scalatest.BeforeAndAfterAll
    +
    +import org.apache.spark.SparkFunSuite
    +import org.apache.spark.sql.AnalysisException
    +import org.apache.spark.sql.catalyst.SimpleCatalystConf
    +import org.apache.spark.sql.catalyst.dsl.expressions._
    +import org.apache.spark.sql.catalyst.expressions.{Alias, Attribute, Cast, Expression, Literal}
    +import org.apache.spark.sql.execution.datasources.DataSourceAnalysis
    +import org.apache.spark.sql.types.{IntegerType, StructType}
    +
    +class DataSourceAnalysisSuite extends SparkFunSuite with BeforeAndAfterAll {
    --- End diff --
    
    @rxin These tests are unit tests for the rewrite logic.




[GitHub] spark pull request #13769: [SPARK-16030] [SQL] Allow specifying static parti...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13769#discussion_r67637303
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ---
    @@ -43,8 +43,128 @@ import org.apache.spark.unsafe.types.UTF8String
      * Replaces generic operations with specific variants that are designed to work with Spark
      * SQL Data Sources.
      */
    -private[sql] object DataSourceAnalysis extends Rule[LogicalPlan] {
    +private[sql] case class DataSourceAnalysis(conf: CatalystConf) extends Rule[LogicalPlan] {
    +
    +  def resolver: Resolver = {
    +    if (conf.caseSensitiveAnalysis) {
    +      caseSensitiveResolution
    +    } else {
    +      caseInsensitiveResolution
    +    }
    +  }
    +
    +  // The access modifier is used to expose this method to tests.
    +  private[sql] def convertStaticPartitions(
    +    sourceAttributes: Seq[Attribute],
    +    providedPartitions: Map[String, Option[String]],
    +    targetAttributes: Seq[Attribute],
    +    targetPartitionSchema: StructType): Seq[NamedExpression] = {
    +
    +    assert(providedPartitions.exists(_._2.isDefined))
    +
    +    val staticPartitions = providedPartitions.flatMap {
    +      case (partKey, Some(partValue)) => (partKey, partValue) :: Nil
    +      case (_, None) => Nil
    +    }
    +
    +    // The sum of the number of static partition columns and columns provided in the SELECT
    +    // clause needs to match the number of columns of the target table.
    +    if (staticPartitions.size + sourceAttributes.size != targetAttributes.size) {
    +      throw new AnalysisException(
    +        s"The data to be inserted needs to have the same number of " +
    +          s"columns as the target table: target table has ${targetAttributes.size} " +
    +          s"column(s) but the inserted data has ${sourceAttributes.size + staticPartitions.size} " +
    +          s"column(s), which contain ${staticPartitions.size} partition column(s) having " +
    +          s"assigned constant values.")
    +    }
    +
    +    if (providedPartitions.size != targetPartitionSchema.fields.size) {
    +      throw new AnalysisException(
    +        s"The data to be inserted needs to have the same number of " +
    +          s"partition columns as the target table: target table " +
    +          s"has ${targetPartitionSchema.fields.size} partition column(s) but the inserted " +
    +          s"data has ${providedPartitions.size} partition columns specified.")
    +    }
    +
    +    staticPartitions.foreach {
    +      case (partKey, partValue) =>
    +        if (!targetPartitionSchema.fields.exists(field => resolver(field.name, partKey))) {
    +          throw new AnalysisException(
    +            s"$partKey is not a partition column. Partition columns are " +
    +              s"${targetPartitionSchema.fields.map(_.name).mkString("[", ",", "]")}")
    +        }
    +    }
    +
    +    val partitionList = targetPartitionSchema.fields.map { field =>
    +      val potentialSpecs = staticPartitions.filter {
    +        case (partKey, partValue) => resolver(field.name, partKey)
    +      }
    +      if (potentialSpecs.size == 0) {
    +        None
    +      } else if (potentialSpecs.size == 1) {
    +        val partValue = potentialSpecs.head._2
    +        Some(Alias(Cast(Literal(partValue), field.dataType), "_staticPart")())
    +      } else {
    +        throw new AnalysisException(
    +          s"Partition column ${field.name} have multiple values specified, " +
    +            s"${potentialSpecs.mkString("[", ", ", "]")}. Please only specify a single value.")
    +      }
    +    }
    +
    +    partitionList.sliding(2).foreach { v =>
    --- End diff --
    
    We can use the following check instead:
    
    ```scala
    partitionList.dropWhile(_.isDefined).collectFirst {
      case Some(_) =>
        throw new AnalysisException("...")
    }
    ```
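
    For reference, a quick check of why this catches the invalid layout (placeholder values):

    ```scala
    // Valid: all static (defined) entries come first, nothing remains after dropWhile.
    Seq(Some(1), None, None).dropWhile(_.isDefined)    // => List(None, None)
    // Invalid: a static entry follows a dynamic one; collectFirst hits the Some.
    Seq(Some(1), None, Some(2)).dropWhile(_.isDefined) // => List(None, Some(2))
    ```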




[GitHub] spark pull request #13769: [SPARK-16030] [SQL] Allow specifying static parti...

Posted by liancheng <gi...@git.apache.org>.
Github user liancheng commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13769#discussion_r67637019
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ---
    @@ -43,8 +43,128 @@ import org.apache.spark.unsafe.types.UTF8String
      * Replaces generic operations with specific variants that are designed to work with Spark
      * SQL Data Sources.
      */
    -private[sql] object DataSourceAnalysis extends Rule[LogicalPlan] {
    +private[sql] case class DataSourceAnalysis(conf: CatalystConf) extends Rule[LogicalPlan] {
    +
    +  def resolver: Resolver = {
    +    if (conf.caseSensitiveAnalysis) {
    +      caseSensitiveResolution
    +    } else {
    +      caseInsensitiveResolution
    +    }
    +  }
    +
    +  // The access modifier is used to expose this method to tests.
    +  private[sql] def convertStaticPartitions(
    +    sourceAttributes: Seq[Attribute],
    +    providedPartitions: Map[String, Option[String]],
    +    targetAttributes: Seq[Attribute],
    +    targetPartitionSchema: StructType): Seq[NamedExpression] = {
    +
    +    assert(providedPartitions.exists(_._2.isDefined))
    +
    +    val staticPartitions = providedPartitions.flatMap {
    +      case (partKey, Some(partValue)) => (partKey, partValue) :: Nil
    +      case (_, None) => Nil
    +    }
    +
    +    // The sum of the number of static partition columns and columns provided in the SELECT
    +    // clause needs to match the number of columns of the target table.
    +    if (staticPartitions.size + sourceAttributes.size != targetAttributes.size) {
    +      throw new AnalysisException(
    +        s"The data to be inserted needs to have the same number of " +
    +          s"columns as the target table: target table has ${targetAttributes.size} " +
    +          s"column(s) but the inserted data has ${sourceAttributes.size + staticPartitions.size} " +
    +          s"column(s), which contain ${staticPartitions.size} partition column(s) having " +
    +          s"assigned constant values.")
    +    }
    +
    +    if (providedPartitions.size != targetPartitionSchema.fields.size) {
    +      throw new AnalysisException(
    +        s"The data to be inserted needs to have the same number of " +
    +          s"partition columns as the target table: target table " +
    +          s"has ${targetPartitionSchema.fields.size} partition column(s) but the inserted " +
    +          s"data has ${providedPartitions.size} partition columns specified.")
    +    }
    +
    +    staticPartitions.foreach {
    +      case (partKey, partValue) =>
    +        if (!targetPartitionSchema.fields.exists(field => resolver(field.name, partKey))) {
    +          throw new AnalysisException(
    +            s"$partKey is not a partition column. Partition columns are " +
    +              s"${targetPartitionSchema.fields.map(_.name).mkString("[", ",", "]")}")
    +        }
    +    }
    +
    +    val partitionList = targetPartitionSchema.fields.map { field =>
    +      val potentialSpecs = staticPartitions.filter {
    +        case (partKey, partValue) => resolver(field.name, partKey)
    +      }
    +      if (potentialSpecs.size == 0) {
    +        None
    +      } else if (potentialSpecs.size == 1) {
    +        val partValue = potentialSpecs.head._2
    +        Some(Alias(Cast(Literal(partValue), field.dataType), "_staticPart")())
    +      } else {
    +        throw new AnalysisException(
    +          s"Partition column ${field.name} have multiple values specified, " +
    +            s"${potentialSpecs.mkString("[", ", ", "]")}. Please only specify a single value.")
    +      }
    +    }
    +
    +    partitionList.sliding(2).foreach { v =>
    --- End diff --
    
    `sliding(2)` can be dangerous for single-element collections:
    
    ```
    scala> Seq(1).sliding(2).foreach(println)
    List(1)
    ```
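
    In particular, if the loop body destructures two elements per window (the real body isn't shown in this hunk, so this is a placeholder), a single partition column fails at runtime:

    ```
    scala> Seq(Option(1)).sliding(2).foreach { case Seq(a, b) => println((a, b)) }
    scala.MatchError: List(Some(1)) (of class scala.collection.immutable.$colon$colon)
    ```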




[GitHub] spark issue #13769: [SPARK-16030] [SQL] Allow specifying static partitions w...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13769
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60812/
    Test PASSed.




[GitHub] spark pull request #13769: [SPARK-16030] [SQL] Allow specifying static parti...

Posted by yhuai <gi...@git.apache.org>.
Github user yhuai commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13769#discussion_r67611533
  
    --- Diff: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertIntoHiveTableSuite.scala ---
    @@ -372,4 +380,95 @@ class InsertIntoHiveTableSuite extends QueryTest with TestHiveSingleton with Bef
           assert(!logical.resolved, "Should not resolve: missing partition data")
         }
       }
    +
    +  testPartitionedTable(
    +    "SPARK-16036: better error message when insert into a table with mismatch schema") {
    +    tableName =>
    +      val e = intercept[AnalysisException] {
    +        sql(s"INSERT INTO TABLE $tableName PARTITION(b=1, c=2) SELECT 1, 2, 3")
    +      }
    +      assert(e.message.contains("the number of columns are different"))
    +  }
    +
    +  testPartitionedTable(
    +    "SPARK-16037: INSERT statement should match columns by position") {
    +    tableName =>
    +      withSQLConf("hive.exec.dynamic.partition.mode" -> "nonstrict") {
    +        sql(s"INSERT INTO TABLE $tableName SELECT 1, 4, 2 AS c, 3 AS b")
    +        checkAnswer(sql(s"SELECT a, b, c, d FROM $tableName"), Row(1, 2, 3, 4))
    +        sql(s"INSERT OVERWRITE TABLE $tableName SELECT 1, 4, 2, 3")
    +        checkAnswer(sql(s"SELECT a, b, c, 4 FROM $tableName"), Row(1, 2, 3, 4))
    +      }
    +  }
    +
    +  testPartitionedTable(
    --- End diff --
    
    These are new tests.




[GitHub] spark issue #13769: [SPARK-16030] [SQL] Allow specifying static partitions w...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13769
  
    **[Test build #60805 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60805/consoleFull)** for PR 13769 at commit [`5315b80`](https://github.com/apache/spark/commit/5315b80128e5cabc31ec0a77df9aca9a42aa10c5).
     * This patch **fails Spark unit tests**.
     * This patch merges cleanly.
     * This patch adds no public classes.




[GitHub] spark issue #13769: [SPARK-16030] [SQL] Allow specifying static partitions w...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13769
  
    **[Test build #60831 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60831/consoleFull)** for PR 13769 at commit [`6bd7b6f`](https://github.com/apache/spark/commit/6bd7b6fb992af794216fa7752d25747d64d82280).




[GitHub] spark issue #13769: [SPARK-16030] [SQL] Allow specifying static partitions w...

Posted by SparkQA <gi...@git.apache.org>.
Github user SparkQA commented on the issue:

    https://github.com/apache/spark/pull/13769
  
    **[Test build #60805 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/60805/consoleFull)** for PR 13769 at commit [`5315b80`](https://github.com/apache/spark/commit/5315b80128e5cabc31ec0a77df9aca9a42aa10c5).




[GitHub] spark issue #13769: [SPARK-16030] [SQL] Allow specifying static partitions w...

Posted by AmplabJenkins <gi...@git.apache.org>.
Github user AmplabJenkins commented on the issue:

    https://github.com/apache/spark/pull/13769
  
    Test PASSed.
    Refer to this link for build results (access rights to CI server needed): 
    https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/60831/
    Test PASSed.




[GitHub] spark pull request #13769: [SPARK-16030] [SQL] Allow specifying static parti...

Posted by asfgit <gi...@git.apache.org>.
Github user asfgit closed the pull request at:

    https://github.com/apache/spark/pull/13769




[GitHub] spark pull request #13769: [SPARK-16030] [SQL] Allow specifying static parti...

Posted by cloud-fan <gi...@git.apache.org>.
Github user cloud-fan commented on a diff in the pull request:

    https://github.com/apache/spark/pull/13769#discussion_r67638318
  
    --- Diff: sql/core/src/main/scala/org/apache/spark/sql/execution/datasources/DataSourceStrategy.scala ---
    @@ -43,8 +43,127 @@ import org.apache.spark.unsafe.types.UTF8String
      * Replaces generic operations with specific variants that are designed to work with Spark
      * SQL Data Sources.
      */
    -private[sql] object DataSourceAnalysis extends Rule[LogicalPlan] {
    +private[sql] case class DataSourceAnalysis(conf: CatalystConf) extends Rule[LogicalPlan] {
    +
    +  def resolver: Resolver = {
    +    if (conf.caseSensitiveAnalysis) {
    +      caseSensitiveResolution
    +    } else {
    +      caseInsensitiveResolution
    +    }
    +  }
    +
    +  // The access modifier is used to expose this method to tests.
    +  private[sql] def convertStaticPartitions(
    +    sourceAttributes: Seq[Attribute],
    +    providedPartitions: Map[String, Option[String]],
    +    targetAttributes: Seq[Attribute],
    +    targetPartitionSchema: StructType): Seq[NamedExpression] = {
    +
    +    assert(providedPartitions.exists(_._2.isDefined))
    +
    +    val staticPartitions = providedPartitions.flatMap {
    +      case (partKey, Some(partValue)) => (partKey, partValue) :: Nil
    +      case (_, None) => Nil
    +    }
    +
    +    // The sum of the number of static partition columns and columns provided in the SELECT
    +    // clause needs to match the number of columns of the target table.
    +    if (staticPartitions.size + sourceAttributes.size != targetAttributes.size) {
    --- End diff --
    
    in `PreprocessTableInsertion`

