Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2021/10/08 03:30:56 UTC

[GitHub] [spark] AngersZhuuuu opened a new pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

AngersZhuuuu opened a new pull request #34218:
URL: https://github.com/apache/spark/pull/34218


   
   ### What changes were proposed in this pull request?
   When converting to a HiveTable, respect the case of the table schema.
   
   ### Why are the changes needed?
   When a user creates a Hive bucketed table with an upper-case schema, the table schema is stored in lower case, while the bucket column info keeps the case the user entered.
   
   If we then try to insert into this table, a HiveException reports that the bucket column is not part of the table columns.
   
   Here is a simple repro:
   ```
   spark.sql("""
     CREATE TABLE TEST1(
       V1 BIGINT,
       S1 INT)
     PARTITIONED BY (PK BIGINT)
     CLUSTERED BY (V1)
     SORTED BY (S1)
     INTO 200 BUCKETS
     STORED AS PARQUET """).show
   
   spark.sql("INSERT INTO TEST1 SELECT * FROM VALUES(1,1,1)").show
   ```
   Error message:
   ```
   scala> spark.sql("INSERT INTO TEST1 SELECT * FROM VALUES(1,1,1)").show
   org.apache.spark.sql.AnalysisException: org.apache.hadoop.hive.ql.metadata.HiveException: Bucket columns V1 is not part of the table columns ([FieldSchema(name:v1, type:bigint, comment:null), FieldSchema(name:s1, type:int, comment:null)]
     at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:112)
     at org.apache.spark.sql.hive.HiveExternalCatalog.listPartitions(HiveExternalCatalog.scala:1242)
     at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.listPartitions(ExternalCatalogWithListener.scala:254)
     at org.apache.spark.sql.catalyst.catalog.SessionCatalog.listPartitions(SessionCatalog.scala:1166)
     at org.apache.spark.sql.execution.datasources.InsertIntoHadoopFsRelationCommand.run(InsertIntoHadoopFsRelationCommand.scala:103)
     at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult$lzycompute(commands.scala:108)
     at org.apache.spark.sql.execution.command.DataWritingCommandExec.sideEffectResult(commands.scala:106)
     at org.apache.spark.sql.execution.command.DataWritingCommandExec.executeCollect(commands.scala:120)
     at org.apache.spark.sql.Dataset.$anonfun$logicalPlan$1(Dataset.scala:228)
     at org.apache.spark.sql.Dataset.$anonfun$withAction$1(Dataset.scala:3687)
     at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$5(SQLExecution.scala:103)
     at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:163)
     at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:90)
     at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:772)
     at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:64)
     at org.apache.spark.sql.Dataset.withAction(Dataset.scala:3685)
     at org.apache.spark.sql.Dataset.<init>(Dataset.scala:228)
     at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:99)
     at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:772)
     at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:96)
     at org.apache.spark.sql.SparkSession.$anonfun$sql$1(SparkSession.scala:615)
     at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:772)
     at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:610)
     ... 47 elided
   Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: Bucket columns V1 is not part of the table columns ([FieldSchema(name:v1, type:bigint, comment:null), FieldSchema(name:s1, type:int, comment:null)]
     at org.apache.hadoop.hive.ql.metadata.Table.setBucketCols(Table.java:552)
     at org.apache.spark.sql.hive.client.HiveClientImpl$.toHiveTable(HiveClientImpl.scala:1082)
     at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$getPartitions$1(HiveClientImpl.scala:732)
     at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$withHiveState$1(HiveClientImpl.scala:291)
     at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:224)
     at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:223)
     at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:273)
     at org.apache.spark.sql.hive.client.HiveClientImpl.getPartitions(HiveClientImpl.scala:731)
     at org.apache.spark.sql.hive.client.HiveClient.getPartitions(HiveClient.scala:222)
     at org.apache.spark.sql.hive.client.HiveClient.getPartitions$(HiveClient.scala:218)
     at org.apache.spark.sql.hive.client.HiveClientImpl.getPartitions(HiveClientImpl.scala:91)
     at org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$listPartitions$1(HiveExternalCatalog.scala:1245)
     at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:102)
     ... 69 more
   ```
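   
   The mismatch behind this exception can be sketched in plain Scala without Spark or Hive. This is only an illustration: `Field` and `missingBucketCols` are hypothetical names, and the case-sensitive comparison is an assumption inferred from the error message above, not a copy of Hive's `Table.setBucketCols`.
   ```scala
   object BucketColumnCaseSketch {
     // Hypothetical stand-in for Hive's FieldSchema; only the name matters here.
     final case class Field(name: String, dataType: String)

     // Returns the bucket columns that cannot be found among the table columns.
     // caseSensitive = true models the assumed failing check,
     // caseSensitive = false models a check that ignores case.
     def missingBucketCols(
         cols: Seq[Field],
         bucketCols: Seq[String],
         caseSensitive: Boolean): Seq[String] = {
       bucketCols.filterNot { b =>
         cols.exists { c =>
           if (caseSensitive) c.name == b else c.name.equalsIgnoreCase(b)
         }
       }
     }

     def main(args: Array[String]): Unit = {
       // Schema as stored (lower case) vs. bucket column as typed by the user.
       val storedCols = Seq(Field("v1", "bigint"), Field("s1", "int"))
       val userBucketCols = Seq("V1")

       println(missingBucketCols(storedCols, userBucketCols, caseSensitive = true))
       println(missingBucketCols(storedCols, userBucketCols, caseSensitive = false))
     }
   }
   ```
   With the case-sensitive comparison, `V1` is reported as missing even though `v1` is in the schema; ignoring case, nothing is missing.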
   
   ### Does this PR introduce any user-facing change?
   No
   
   ### How was this patch tested?
   Added a UT.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org



---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org

[GitHub] [spark] AmplabJenkins removed a comment on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

AmplabJenkins removed a comment on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938705277


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144025/
   



[GitHub] [spark] SparkQA removed a comment on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA removed a comment on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938322164


   **[Test build #144007 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144007/testReport)** for PR 34218 at commit [`d19c211`](https://github.com/apache/spark/commit/d19c211d2b52fb92c8b78ecbeca9fe4a899388b1).



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939232462


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48523/
   



[GitHub] [spark] AmplabJenkins commented on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

AmplabJenkins commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939419515


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144060/
   



[GitHub] [spark] SparkQA removed a comment on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA removed a comment on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939219880


   **[Test build #144046 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144046/testReport)** for PR 34218 at commit [`f1d5771`](https://github.com/apache/spark/commit/f1d5771c4d280950ad25554b3bf89bbd4966dc93).



[GitHub] [spark] dongjoon-hyun commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

dongjoon-hyun commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938347287


   Is this a regression, @AngersZhuuuu?
   
   cc @gengliangwang since SPARK-35531 seems to be opened for 3.2.0.



[GitHub] [spark] AmplabJenkins removed a comment on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

AmplabJenkins removed a comment on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939799669


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144067/
   



[GitHub] [spark] AngersZhuuuu commented on a change in pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

AngersZhuuuu commented on a change in pull request #34218:
URL: https://github.com/apache/spark/pull/34218#discussion_r725027401



##########
File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala
##########
@@ -733,14 +733,22 @@ private[hive] class HiveClientImpl(
     Option(hivePartition).map(fromHivePartition)
   }
 
+  override def getPartitions(
+      db: String,
+      table: String,
+      partialSpec: Option[TablePartitionSpec]): Seq[CatalogTablePartition] = withHiveState {
+    getPartitions(
+      getRawTableOption(db, table).getOrElse(throw new NoSuchTableException(db, table)),
+      partialSpec)
+  }

Review comment:
       Implemented here to resolve a classloader problem.





[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939223605


   **[Test build #144044 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144044/testReport)** for PR 34218 at commit [`f4619da`](https://github.com/apache/spark/commit/f4619da87c490f005dfcef0605288968c1ee0ec7).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] SparkQA removed a comment on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

SparkQA removed a comment on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939269181


   **[Test build #144054 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144054/testReport)** for PR 34218 at commit [`8e22a4d`](https://github.com/apache/spark/commit/8e22a4dcdc005bfbceebeaa999983a9cde493f76).



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939231342


   **[Test build #144046 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144046/testReport)** for PR 34218 at commit [`f1d5771`](https://github.com/apache/spark/commit/f1d5771c4d280950ad25554b3bf89bbd4966dc93).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins removed a comment on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

AmplabJenkins removed a comment on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938432228


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144011/
   



[GitHub] [spark] AngersZhuuuu commented on a change in pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

AngersZhuuuu commented on a change in pull request #34218:
URL: https://github.com/apache/spark/pull/34218#discussion_r725816586



##########
File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala
##########
@@ -733,6 +733,23 @@ private[hive] class HiveClientImpl(
     Option(hivePartition).map(fromHivePartition)
   }
 
+  override def getPartitions(
+      db: String,
+      table: String,
+      spec: Option[TablePartitionSpec]): Seq[CatalogTablePartition] = withHiveState {
+    val hiveTable =
+      getRawTableOption(db, table).getOrElse(throw new NoSuchTableException(db, table))
+    val partSpec = spec match {
+      case None => CatalogTypes.emptyTablePartitionSpec
+      case Some(s) =>
+        assert(s.values.forall(_.nonEmpty), s"partition spec '$s' is invalid")
+        s
+    }
+    val parts = client.getPartitions(hiveTable, partSpec.asJava).asScala.map(fromHivePartition)
+    HiveCatalogMetrics.incrementFetchedPartitions(parts.length)
+    parts.toSeq
+  }
+

Review comment:
       Done





[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938663045


   **[Test build #144025 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144025/testReport)** for PR 34218 at commit [`79fbc4f`](https://github.com/apache/spark/commit/79fbc4f740e005bec489fc68862b0ae4936c6f97).



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938345522


   **[Test build #144007 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144007/testReport)** for PR 34218 at commit [`d19c211`](https://github.com/apache/spark/commit/d19c211d2b52fb92c8b78ecbeca9fe4a899388b1).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938748795


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48502/
   



[GitHub] [spark] AmplabJenkins removed a comment on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

AmplabJenkins removed a comment on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939786073


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48545/
   



[GitHub] [spark] AmplabJenkins commented on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

AmplabJenkins commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939419115


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48538/
   



[GitHub] [spark] AmplabJenkins removed a comment on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

AmplabJenkins removed a comment on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939234287







[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938736654


   **[Test build #144028 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144028/testReport)** for PR 34218 at commit [`0b05e37`](https://github.com/apache/spark/commit/0b05e37423b4a0abd65f9a06886c84dfbddce211).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

AmplabJenkins commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938364201


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48484/
   



[GitHub] [spark] AngersZhuuuu commented on a change in pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

AngersZhuuuu commented on a change in pull request #34218:
URL: https://github.com/apache/spark/pull/34218#discussion_r725424966



##########
File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala
##########
@@ -733,6 +733,23 @@ private[hive] class HiveClientImpl(
     Option(hivePartition).map(fromHivePartition)
   }
 
+  override def getPartitions(
+      db: String,
+      table: String,
+      spec: Option[TablePartitionSpec]): Seq[CatalogTablePartition] = withHiveState {
+    val hiveTable =
+      getRawTableOption(db, table).getOrElse(throw new NoSuchTableException(db, table))
+    val partSpec = spec match {
+      case None => CatalogTypes.emptyTablePartitionSpec
+      case Some(s) =>
+        assert(s.values.forall(_.nonEmpty), s"partition spec '$s' is invalid")
+        s
+    }
+    val parts = client.getPartitions(hiveTable, partSpec.asJava).asScala.map(fromHivePartition)
+    HiveCatalogMetrics.incrementFetchedPartitions(parts.length)
+    parts.toSeq
+  }
+

Review comment:
       How about removing
   ```
    override def getPartitions(
         table: CatalogTable,
         spec: Option[TablePartitionSpec])
   ```
   from HiveClient, since it is only used in a UT? @cloud-fan 





[GitHub] [spark] AmplabJenkins removed a comment on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

AmplabJenkins removed a comment on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938748934







[GitHub] [spark] cloud-fan commented on a change in pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

cloud-fan commented on a change in pull request #34218:
URL: https://github.com/apache/spark/pull/34218#discussion_r725034758



##########
File path: sql/hive/src/test/scala/org/apache/spark/sql/hive/InsertSuite.scala
##########
@@ -854,4 +855,64 @@ class InsertSuite extends QueryTest with TestHiveSingleton with BeforeAndAfter
       assert(e.contains("Partition spec is invalid"))
     }
   }
+
+  test("SPARK-35531: Insert data with different cases of bucket column") {
+    withTable("TEST1") {
+      val createHive =
+        """
+          |CREATE TABLE TEST1(
+          |v1 BIGINT,
+          |s1 INT)
+          |PARTITIONED BY (pk BIGINT)
+          |CLUSTERED BY (v1)
+          |SORTED BY (s1)
+          |INTO 200 BUCKETS
+          |STORED AS PARQUET
+        """.stripMargin
+
+      val insertString =
+        """
+          |INSERT INTO test1
+          |SELECT * FROM VALUES(1,1,1)
+        """.stripMargin
+
+      val dropString = "DROP TABLE IF EXISTS test1"
+
+      spark.sql(dropString)

Review comment:
       doesn't `sql(dropString)` work?





[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938431875


   **[Test build #144011 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144011/testReport)** for PR 34218 at commit [`5b6e2a0`](https://github.com/apache/spark/commit/5b6e2a034ccfa71531f276bed51220703b1784c7).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] cloud-fan commented on a change in pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

cloud-fan commented on a change in pull request #34218:
URL: https://github.com/apache/spark/pull/34218#discussion_r725804468



##########
File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala
##########
@@ -733,6 +733,23 @@ private[hive] class HiveClientImpl(
     Option(hivePartition).map(fromHivePartition)
   }
 
+  override def getPartitions(
+      db: String,
+      table: String,
+      spec: Option[TablePartitionSpec]): Seq[CatalogTablePartition] = withHiveState {
+    val hiveTable =
+      getRawTableOption(db, table).getOrElse(throw new NoSuchTableException(db, table))
+    val partSpec = spec match {
+      case None => CatalogTypes.emptyTablePartitionSpec
+      case Some(s) =>
+        assert(s.values.forall(_.nonEmpty), s"partition spec '$s' is invalid")
+        s
+    }
+    val parts = client.getPartitions(hiveTable, partSpec.asJava).asScala.map(fromHivePartition)
+    HiveCatalogMetrics.incrementFetchedPartitions(parts.length)
+    parts.toSeq
+  }
+

Review comment:
       SGTM





[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939269181


   **[Test build #144054 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144054/testReport)** for PR 34218 at commit [`8e22a4d`](https://github.com/apache/spark/commit/8e22a4dcdc005bfbceebeaa999983a9cde493f76).



[GitHub] [spark] AmplabJenkins commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

AmplabJenkins commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938465456


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48488/
   



[GitHub] [spark] AmplabJenkins removed a comment on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

AmplabJenkins removed a comment on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938465456


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48488/
   



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939419347


   **[Test build #144060 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144060/testReport)** for PR 34218 at commit [`8e22a4d`](https://github.com/apache/spark/commit/8e22a4dcdc005bfbceebeaa999983a9cde493f76).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] SparkQA removed a comment on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

SparkQA removed a comment on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939242355


   **[Test build #144050 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144050/testReport)** for PR 34218 at commit [`04a973e`](https://github.com/apache/spark/commit/04a973ea40648287a72bcf16d0ca85c603fee074).



[GitHub] [spark] AmplabJenkins commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

AmplabJenkins commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939227072







[GitHub] [spark] SparkQA removed a comment on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA removed a comment on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938708225


   **[Test build #144028 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144028/testReport)** for PR 34218 at commit [`0b05e37`](https://github.com/apache/spark/commit/0b05e37423b4a0abd65f9a06886c84dfbddce211).



[GitHub] [spark] AngersZhuuuu commented on a change in pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

AngersZhuuuu commented on a change in pull request #34218:
URL: https://github.com/apache/spark/pull/34218#discussion_r725027401



##########
File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala
##########
@@ -733,14 +733,22 @@ private[hive] class HiveClientImpl(
     Option(hivePartition).map(fromHivePartition)
   }
 
+  override def getPartitions(
+      db: String,
+      table: String,
+      partialSpec: Option[TablePartitionSpec]): Seq[CatalogTablePartition] = withHiveState {
+    getPartitions(
+      getRawTableOption(db, table).getOrElse(throw new NoSuchTableException(db, table)),
+      partialSpec)
+  }

Review comment:
       Implemented here to resolve a classloader problem.





[GitHub] [spark] dongjoon-hyun commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

dongjoon-hyun commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938353428


   Thank you for checking, @gengliangwang !



[GitHub] [spark] AmplabJenkins commented on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

AmplabJenkins commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939786073


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48545/
   



[GitHub] [spark] AmplabJenkins removed a comment on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

AmplabJenkins removed a comment on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939285795


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144054/
   



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939285555


   **[Test build #144054 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144054/testReport)** for PR 34218 at commit [`8e22a4d`](https://github.com/apache/spark/commit/8e22a4dcdc005bfbceebeaa999983a9cde493f76).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins removed a comment on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

AmplabJenkins removed a comment on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938847934







[GitHub] [spark] AngersZhuuuu commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

AngersZhuuuu commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938657932


   How about the current version, @cloud-fan?



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939411210


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48538/
   



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939217687


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48521/
   



[GitHub] [spark] AmplabJenkins commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

AmplabJenkins commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938847935







[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939250152


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48527/
   



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939407497


   **[Test build #144060 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144060/testReport)** for PR 34218 at commit [`8e22a4d`](https://github.com/apache/spark/commit/8e22a4dcdc005bfbceebeaa999983a9cde493f76).



[GitHub] [spark] AmplabJenkins removed a comment on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

AmplabJenkins removed a comment on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939227072







[GitHub] [spark] cloud-fan closed pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

cloud-fan closed pull request #34218:
URL: https://github.com/apache/spark/pull/34218


   



[GitHub] [spark] SparkQA removed a comment on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA removed a comment on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939212101


   **[Test build #144044 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144044/testReport)** for PR 34218 at commit [`f4619da`](https://github.com/apache/spark/commit/f4619da87c490f005dfcef0605288968c1ee0ec7).



[GitHub] [spark] AngersZhuuuu commented on a change in pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

AngersZhuuuu commented on a change in pull request #34218:
URL: https://github.com/apache/spark/pull/34218#discussion_r724710813



##########
File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala
##########
@@ -1236,9 +1236,10 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat
       db: String,
       table: String,
       partialSpec: Option[TablePartitionSpec] = None): Seq[CatalogTablePartition] = withClient {
-    val partColNameMap = buildLowerCasePartColNameMap(getTable(db, table))
+    val catalogTable = getTable(db, table)
+    val partColNameMap = buildLowerCasePartColNameMap(catalogTable)
     val metaStoreSpec = partialSpec.map(toMetaStorePartitionSpec)
-    val res = client.getPartitions(db, table, metaStoreSpec)
+    val res = client.getPartitions(catalogTable, metaStoreSpec)

Review comment:
       > @AngersZhuuuu what's the diff before/after? I couldn't follow.
   
   You can refer to https://github.com/apache/spark/pull/32675#discussion_r657699288 for the details.
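
As a rough illustration of the failure described in the PR description, the sketch below models a table whose schema names were lower-cased while its bucket columns kept the user's casing, and a case-sensitive membership check that rejects it. `BucketCaseSketch`, `TableDef`, and `validateBuckets` are hypothetical names for illustration only. The "raw" table being internally consistent is an assumption, not something this thread confirms.

```scala
object BucketCaseSketch {
  // Simplified stand-in for a table definition; not Spark's or Hive's real class.
  final case class TableDef(columns: Seq[String], bucketColumns: Seq[String])

  // Models a case-sensitive check like the one behind
  // "Bucket columns V1 is not part of the table columns".
  def validateBuckets(t: TableDef): Either[String, TableDef] =
    t.bucketColumns.find(c => !t.columns.contains(c)) match {
      case Some(missing) => Left(s"Bucket columns $missing is not part of the table columns")
      case None => Right(t)
    }

  // Converted view: schema lower-cased, bucket column kept as the user typed it.
  val converted: TableDef = TableDef(columns = Seq("v1", "s1"), bucketColumns = Seq("V1"))

  // Assumption: the raw table fetched by name uses one consistent casing.
  val rawAssumed: TableDef = TableDef(columns = Seq("v1", "s1"), bucketColumns = Seq("v1"))
}
```

Under these assumptions, `validateBuckets(converted)` yields a `Left` carrying the error message, while `validateBuckets(rawAssumed)` yields a `Right`, which is why fetching the raw table by name sidesteps the mismatch.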





[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939219880


   **[Test build #144046 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144046/testReport)** for PR 34218 at commit [`f1d5771`](https://github.com/apache/spark/commit/f1d5771c4d280950ad25554b3bf89bbd4966dc93).



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938708225


   **[Test build #144028 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144028/testReport)** for PR 34218 at commit [`0b05e37`](https://github.com/apache/spark/commit/0b05e37423b4a0abd65f9a06886c84dfbddce211).



[GitHub] [spark] AmplabJenkins commented on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

AmplabJenkins commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939799669


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144067/
   



[GitHub] [spark] AmplabJenkins removed a comment on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

AmplabJenkins removed a comment on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939285019


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48531/
   



[GitHub] [spark] AmplabJenkins removed a comment on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

AmplabJenkins removed a comment on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939260022


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144050/
   



[GitHub] [spark] AngersZhuuuu commented on a change in pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

AngersZhuuuu commented on a change in pull request #34218:
URL: https://github.com/apache/spark/pull/34218#discussion_r725027795



##########
File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala
##########
@@ -1236,13 +1236,14 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat
       db: String,
       table: String,
       partialSpec: Option[TablePartitionSpec] = None): Seq[CatalogTablePartition] = withClient {
-    val partColNameMap = buildLowerCasePartColNameMap(getTable(db, table))
-    val metaStoreSpec = partialSpec.map(toMetaStorePartitionSpec)
-    val res = client.getPartitions(db, table, metaStoreSpec)
+    val rawTable = getRawTable(db, table)
+    val catalogTable = restoreTableMetadata(rawTable)

Review comment:
       New code won't change here





[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939256840


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48527/
   



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939259777


   **[Test build #144050 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144050/testReport)** for PR 34218 at commit [`04a973e`](https://github.com/apache/spark/commit/04a973ea40648287a72bcf16d0ca85c603fee074).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939242355


   **[Test build #144050 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144050/testReport)** for PR 34218 at commit [`04a973e`](https://github.com/apache/spark/commit/04a973ea40648287a72bcf16d0ca85c603fee074).



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938322164


   **[Test build #144007 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144007/testReport)** for PR 34218 at commit [`d19c211`](https://github.com/apache/spark/commit/d19c211d2b52fb92c8b78ecbeca9fe4a899388b1).



[GitHub] [spark] AmplabJenkins commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

AmplabJenkins commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938345718


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144007/
   



[GitHub] [spark] AmplabJenkins commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

AmplabJenkins commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938748934







[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939781009


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48545/
   



[GitHub] [spark] HyukjinKwon commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

HyukjinKwon commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939234119


   @AngersZhuuuu can you fix the PR title? Grammatically it doesn't make sense, and it doesn't really describe what the PR proposes. The PR fixes the Hive client's partition retrieval logic to respect case sensitivity.
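
   For readers following the thread, the case mismatch behind this PR can be illustrated with a minimal sketch. It is plain Scala and does not use any Spark or Hive classes; `storedColumns`, `containsExact`, and `resolveIgnoreCase` are hypothetical names chosen for illustration, not code from the patch. It only assumes what the PR description states: the table schema is stored in lower case while the bucket column keeps the user's spelling.

   ```scala
   // Illustrative only: plain Scala, no Spark or Hive classes involved.
   object BucketColumnCaseSketch {
     // Hypothetical stand-in for a stored schema whose column names were lower-cased.
     val storedColumns: Seq[String] = Seq("v1", "s1")

     // Case-sensitive membership check: "V1" does not match "v1".
     def containsExact(columns: Seq[String], name: String): Boolean =
       columns.contains(name)

     // Case-insensitive resolution: returns the stored spelling if one matches.
     def resolveIgnoreCase(columns: Seq[String], name: String): Option[String] =
       columns.find(_.equalsIgnoreCase(name))

     def main(args: Array[String]): Unit = {
       val bucketColumn = "V1" // as typed in CLUSTERED BY (V1)
       println(containsExact(storedColumns, bucketColumn))     // false
       println(resolveIgnoreCase(storedColumns, bucketColumn)) // Some(v1)
     }
   }
   ```

   The exact-match check mirrors the kind of comparison that produces "Bucket columns V1 is not part of the table columns"; the case-insensitive lookup shows one way such a name could be reconciled, and is not necessarily how the merged change does it.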



[GitHub] [spark] HyukjinKwon commented on a change in pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

HyukjinKwon commented on a change in pull request #34218:
URL: https://github.com/apache/spark/pull/34218#discussion_r724750472



##########
File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala
##########
@@ -1236,13 +1236,14 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat
       db: String,
       table: String,
       partialSpec: Option[TablePartitionSpec] = None): Seq[CatalogTablePartition] = withClient {
-    val partColNameMap = buildLowerCasePartColNameMap(getTable(db, table))
-    val metaStoreSpec = partialSpec.map(toMetaStorePartitionSpec)
-    val res = client.getPartitions(db, table, metaStoreSpec)
+    val rawTable = getRawTable(db, table)

Review comment:
       Can we add some comments?





[GitHub] [spark] SparkQA removed a comment on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

SparkQA removed a comment on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939714513


   **[Test build #144067 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144067/testReport)** for PR 34218 at commit [`e1bddf5`](https://github.com/apache/spark/commit/e1bddf5e8185e474a2a905e5f2f0cebdfd7733e3).



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938336145


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48484/
   



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938746488


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48506/
   



[GitHub] [spark] AmplabJenkins commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

AmplabJenkins commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939234287







[GitHub] [spark] AmplabJenkins commented on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

AmplabJenkins commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939285795


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144054/
   



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939280528


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48531/
   



[GitHub] [spark] AmplabJenkins removed a comment on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

AmplabJenkins removed a comment on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939419515


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144060/
   



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939798390


   **[Test build #144067 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144067/testReport)** for PR 34218 at commit [`e1bddf5`](https://github.com/apache/spark/commit/e1bddf5e8185e474a2a905e5f2f0cebdfd7733e3).
    * This patch passes all tests.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] AmplabJenkins removed a comment on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

AmplabJenkins removed a comment on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939419115


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48538/
   



[GitHub] [spark] AmplabJenkins commented on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

AmplabJenkins commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939260022







[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939274743


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48531/
   



[GitHub] [spark] AngersZhuuuu commented on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

AngersZhuuuu commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939406657


   retest this please



[GitHub] [spark] cloud-fan commented on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

cloud-fan commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939877740


   thanks, merging to master!



[GitHub] [spark] AngersZhuuuu commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

AngersZhuuuu commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939207633


   retest this please



[GitHub] [spark] SparkQA removed a comment on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA removed a comment on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938663045


   **[Test build #144025 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144025/testReport)** for PR 34218 at commit [`79fbc4f`](https://github.com/apache/spark/commit/79fbc4f740e005bec489fc68862b0ae4936c6f97).



[GitHub] [spark] AngersZhuuuu commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

AngersZhuuuu commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938311230


   ping @cloud-fan 



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939714513


   **[Test build #144067 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144067/testReport)** for PR 34218 at commit [`e1bddf5`](https://github.com/apache/spark/commit/e1bddf5e8185e474a2a905e5f2f0cebdfd7733e3).



[GitHub] [spark] AmplabJenkins removed a comment on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

AmplabJenkins removed a comment on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939260045


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48527/
   



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939212101


   **[Test build #144044 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144044/testReport)** for PR 34218 at commit [`f4619da`](https://github.com/apache/spark/commit/f4619da87c490f005dfcef0605288968c1ee0ec7).



[GitHub] [spark] AmplabJenkins removed a comment on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

AmplabJenkins removed a comment on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938345718


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144007/
   



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938392995


   **[Test build #144011 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144011/testReport)** for PR 34218 at commit [`5b6e2a0`](https://github.com/apache/spark/commit/5b6e2a034ccfa71531f276bed51220703b1784c7).



[GitHub] [spark] AmplabJenkins commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

AmplabJenkins commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938432228


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144011/
   



[GitHub] [spark] AmplabJenkins commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

AmplabJenkins commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938705277


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder/144025/
   



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938703200


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48502/
   



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939225580


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48523/
   



[GitHub] [spark] AngersZhuuuu commented on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

AngersZhuuuu commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939239295


   > @AngersZhuuuu can you fix the PR title? Grammatically it doesn't make sense, and it doesn't really describe what the PR proposes. The PR fixes the Hive client's partition retrieval logic to respect case sensitivity.
   
   How about the current title?



[GitHub] [spark] SparkQA removed a comment on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA removed a comment on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938392995


   **[Test build #144011 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144011/testReport)** for PR 34218 at commit [`5b6e2a0`](https://github.com/apache/spark/commit/5b6e2a034ccfa71531f276bed51220703b1784c7).



[GitHub] [spark] HyukjinKwon commented on a change in pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

HyukjinKwon commented on a change in pull request #34218:
URL: https://github.com/apache/spark/pull/34218#discussion_r724693829



##########
File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala
##########
@@ -1236,9 +1236,10 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat
       db: String,
       table: String,
       partialSpec: Option[TablePartitionSpec] = None): Seq[CatalogTablePartition] = withClient {
-    val partColNameMap = buildLowerCasePartColNameMap(getTable(db, table))
+    val catalogTable = getTable(db, table)
+    val partColNameMap = buildLowerCasePartColNameMap(catalogTable)
     val metaStoreSpec = partialSpec.map(toMetaStorePartitionSpec)
-    val res = client.getPartitions(db, table, metaStoreSpec)
+    val res = client.getPartitions(catalogTable, metaStoreSpec)

Review comment:
       @AngersZhuuuu what's the difference before and after this change? I couldn't follow.
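
The mismatch this PR targets can be sketched without Spark or Hive. The snippet below is a minimal illustration under stated assumptions, not the code from the PR: `TableSketch` and `missingBucketColumns` are hypothetical names, and the case-sensitive check only imitates the kind of validation behind the `Bucket columns V1 is not part of the table columns` error quoted in the PR description.

```scala
// Hypothetical stand-in for a table definition; not a Spark or Hive class.
final case class TableSketch(columns: Seq[String], bucketColumns: Seq[String])

object BucketCaseSketch {
  // Case-sensitive membership check: returns the bucket columns that do not
  // appear, with identical casing, among the table columns.
  def missingBucketColumns(table: TableSketch): Seq[String] =
    table.bucketColumns.filterNot(table.columns.contains)

  def main(args: Array[String]): Unit = {
    // Schema stored in lower case while the bucket column keeps the user's casing.
    val lowered = TableSketch(columns = Seq("v1", "s1"), bucketColumns = Seq("V1"))
    // Schema that keeps the casing the user wrote in CREATE TABLE.
    val preserved = TableSketch(columns = Seq("V1", "S1"), bucketColumns = Seq("V1"))

    println(missingBucketColumns(lowered))   // List(V1)
    println(missingBucketColumns(preserved)) // List()
  }
}
```

With the lower-cased schema the check reports `V1` as missing, which mirrors the reported failure; when the schema keeps its original casing the same check passes.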





[GitHub] [spark] cloud-fan commented on a change in pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

cloud-fan commented on a change in pull request #34218:
URL: https://github.com/apache/spark/pull/34218#discussion_r724753714



##########
File path: sql/hive/src/main/scala/org/apache/spark/sql/hive/HiveExternalCatalog.scala
##########
@@ -1236,13 +1236,14 @@ private[spark] class HiveExternalCatalog(conf: SparkConf, hadoopConf: Configurat
       db: String,
       table: String,
       partialSpec: Option[TablePartitionSpec] = None): Seq[CatalogTablePartition] = withClient {
-    val partColNameMap = buildLowerCasePartColNameMap(getTable(db, table))
-    val metaStoreSpec = partialSpec.map(toMetaStorePartitionSpec)
-    val res = client.getPartitions(db, table, metaStoreSpec)
+    val rawTable = getRawTable(db, table)
+    val catalogTable = restoreTableMetadata(rawTable)

Review comment:
       can we just do `val tableDef = getTable(db, table)`?





[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938800756


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48506/
   



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938784278


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48504/
   



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938693632


   **[Test build #144025 has finished](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144025/testReport)** for PR 34218 at commit [`79fbc4f`](https://github.com/apache/spark/commit/79fbc4f740e005bec489fc68862b0ae4936c6f97).
    * This patch **fails Spark unit tests**.
    * This patch merges cleanly.
    * This patch adds no public classes.



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938440044


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48488/
   



[GitHub] [spark] HyukjinKwon commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

HyukjinKwon commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938336429


   @AngersZhuuuu, let's also revise the PR title "Fix can not insert into hive bucket table if create table with upper case schema"



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938414555


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48488/
   



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939417312


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48538/
   



[GitHub] [spark] gengliangwang commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

gengliangwang commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938353085


   @dongjoon-hyun I can reproduce the issue on 3.0.0 and 3.1.1. It's a long-standing bug.



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938739565


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48504/
   



[GitHub] [spark] AmplabJenkins removed a comment on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

AmplabJenkins removed a comment on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938364201


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48484/
   



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-938350784


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48484/
   



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939738932


   Kubernetes integration test starting
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48545/
   



[GitHub] [spark] SparkQA removed a comment on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

SparkQA removed a comment on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939407497


   **[Test build #144060 has started](https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/144060/testReport)** for PR 34218 at commit [`8e22a4d`](https://github.com/apache/spark/commit/8e22a4dcdc005bfbceebeaa999983a9cde493f76).



[GitHub] [spark] AmplabJenkins commented on pull request #34218: [SPARK-35531][SQL] Directly pass hive Table to HiveClient when call getPartitions to avoid unnecessary convert from HiveTable -> CatalogTable -> HiveTable

Posted by GitBox <gi...@apache.org>.

AmplabJenkins commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939285019


   
   Refer to this link for build results (access rights to CI server needed): 
   https://amplab.cs.berkeley.edu/jenkins//job/SparkPullRequestBuilder-K8s/48531/
   



[GitHub] [spark] SparkQA commented on pull request #34218: [SPARK-35531][SQL] Fix can not insert into hive bucket table if create table with upper case schema

Posted by GitBox <gi...@apache.org>.

SparkQA commented on pull request #34218:
URL: https://github.com/apache/spark/pull/34218#issuecomment-939223738


   Kubernetes integration test status failure
   URL: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder-K8s/48521/
   

