Posted to reviews@spark.apache.org by GitBox <gi...@apache.org> on 2022/09/08 16:55:18 UTC

[GitHub] [spark] peter-toth commented on a diff in pull request #36027: [SPARK-38717][SQL] Handle Hive's bucket spec case preserving behaviour

peter-toth commented on code in PR #36027:
URL: https://github.com/apache/spark/pull/36027#discussion_r966199347


##########
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala:
##########
@@ -1095,7 +1095,11 @@ private[hive] object HiveClientImpl extends Logging {
     table.bucketSpec match {
       case Some(bucketSpec) if !HiveExternalCatalog.isDatasourceTable(table) =>
         hiveTable.setNumBuckets(bucketSpec.numBuckets)
-        hiveTable.setBucketCols(bucketSpec.bucketColumnNames.toList.asJava)

Review Comment:
   The issue here is that `toHiveTable()` is called twice during the test below.
   The first time is when the table is created. At that point `table.schema` contains the uppercase `B_C`, and so does `table.bucketSpec` (`B_C`). So simply lowercasing `bucketColumnNames` here before calling `setBucketCols()` would throw an exception similar to the one in the description, only inverted: the bucket spec would be lowercase while the schema column would be uppercase.
   
   The second time is during the `collect()`, when the Hive table is restored from the metastore for a `listPartitionsByFilter()` Hive call. This time `schema` contains the lowercase `b_c` (column names are not case preserved) but `bucketSpec` still contains the uppercase `B_C` (the bucket spec is case preserved for some reason), so `setBucketCols()` throws the exception in the description. I'm trying to fix this issue.
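   
   As a minimal sketch of the kind of fix this points toward (a hypothetical helper, not necessarily the PR's actual change), the bucket column names could be resolved against the table's schema case-insensitively before being handed to `setBucketCols()`, so that a case-preserved bucket spec (`B_C`) still matches a lowercased metastore schema (`b_c`) and vice versa:
   
   ```scala
   import scala.collection.JavaConverters._
   
   // Hypothetical helper: map each bucket column name to the matching schema
   // field name, comparing case-insensitively; fall back to the original name
   // if no schema field matches.
   def normalizedBucketCols(
       bucketColumnNames: Seq[String],
       schemaFieldNames: Seq[String]): java.util.List[String] = {
     bucketColumnNames.map { col =>
       schemaFieldNames.find(_.equalsIgnoreCase(col)).getOrElse(col)
     }.toList.asJava
   }
   
   // At the call site in toHiveTable() this would replace the removed line:
   // hiveTable.setBucketCols(
   //   normalizedBucketCols(bucketSpec.bucketColumnNames, table.schema.fieldNames))
   ```
   
   With something like this, both invocations of `toHiveTable()` would see bucket columns that agree with whatever casing the schema happens to carry at that point.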




