Posted to notifications@kyuubi.apache.org by "Yikf (via GitHub)" <gi...@apache.org> on 2023/06/27 07:31:00 UTC

[GitHub] [kyuubi] Yikf opened a new pull request, #4999: codecov module should contain the spark 3.4 profile

Yikf opened a new pull request, #4999:
URL: https://github.com/apache/kyuubi/pull/4999

   <!--
   Thanks for sending a pull request!
   
   Here are some tips for you:
     1. If this is your first time, please read our contributor guidelines: https://kyuubi.readthedocs.io/en/latest/community/CONTRIBUTING.html
     2. If the PR is related to an issue in https://github.com/apache/kyuubi/issues, add '[KYUUBI #XXXX]' in your PR title, e.g., '[KYUUBI #XXXX] Your PR title ...'.
     3. If the PR is unfinished, add '[WIP]' in your PR title, e.g., '[WIP][KYUUBI #XXXX] Your PR title ...'.
   -->
   
   ### _Why are the changes needed?_
   <!--
   Please clarify why the changes are needed. For instance,
     1. If you add a feature, you can talk about the use case of it.
     2. If you fix a bug, you can clarify why it is a bug.
   -->
   The Apache Kyuubi `codecov` module should contain the spark-3.4 profile, so that the Kyuubi CI can cover the modules built under that profile.
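   As a rough sketch, the change adds a Maven profile of the following shape to `dev/kyuubi-codecov/pom.xml` (the artifact listed here is illustrative only; the exact module list is settled in the review discussion on that file):
   
   ```xml
   <!-- Sketch: a spark-3.4 profile mirroring the existing per-version
        profiles, pulling the Spark-3.4-capable connector modules into
        the codecov aggregation. Artifact name is illustrative. -->
   <profile>
       <id>spark-3.4</id>
       <dependencies>
           <dependency>
               <groupId>org.apache.kyuubi</groupId>
               <artifactId>kyuubi-spark-connector-hive_${scala.binary.version}</artifactId>
               <version>${project.version}</version>
           </dependency>
       </dependencies>
   </profile>
   ```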
   
   ### _How was this patch tested?_
   - [ ] Add some test cases that check the changes thoroughly including negative and positive cases if possible
   
   - [ ] Add screenshots for manual tests if appropriate
   
   - [ ] [Run test](https://kyuubi.readthedocs.io/en/master/contributing/code/testing.html#running-tests) locally before making a pull request
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: notifications-unsubscribe@kyuubi.apache.org
For additional commands, e-mail: notifications-help@kyuubi.apache.org


[GitHub] [kyuubi] pan3793 commented on a diff in pull request #4999: [KSHC] Kyuubi-Spark-Hive-Connector support Apache Spark 3.4

Posted by "pan3793 (via GitHub)" <gi...@apache.org>.
pan3793 commented on code in PR #4999:
URL: https://github.com/apache/kyuubi/pull/4999#discussion_r1251509100


##########
extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveConnectorUtils.scala:
##########
@@ -0,0 +1,258 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.kyuubi.spark.connector.hive
+
+import org.apache.spark.SPARK_VERSION
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.SparkSession
+import org.apache.spark.sql.catalyst.catalog.{CatalogTable, CatalogTablePartition}
+import org.apache.spark.sql.connector.catalog.TableChange
+import org.apache.spark.sql.connector.catalog.TableChange.{AddColumn, After, ColumnPosition, DeleteColumn, First, RenameColumn, UpdateColumnComment, UpdateColumnNullability, UpdateColumnPosition, UpdateColumnType}
+import org.apache.spark.sql.execution.command.CommandUtils
+import org.apache.spark.sql.execution.command.CommandUtils.{calculateMultipleLocationSizes, calculateSingleLocationSize}
+import org.apache.spark.sql.execution.datasources.PartitionedFile
+import org.apache.spark.sql.types.{ArrayType, MapType, StructField, StructType}
+
+import org.apache.kyuubi.spark.connector.common.SparkUtils
+import org.apache.kyuubi.util.SemanticVersion
+import org.apache.kyuubi.util.reflect.ReflectUtils.invokeAs
+
+object HiveConnectorUtils extends Logging {
+
+  def partitionedFilePath(file: PartitionedFile): String = {
+    if (SparkUtils.isSparkVersionAtLeast("3.4")) {
+      invokeAs[String](file, "urlEncodedPath")
+    } else if (SparkUtils.isSparkVersionAtLeast("3.3")) {
+      invokeAs[String](file, "filePath")
+    } else {
+      throw KyuubiHiveConnectorException(s"Spark version $SPARK_VERSION " +
+        s"is not supported by Kyuubi spark hive connector.")
+    }
+  }
+
+  def calculateTotalSize(
+      spark: SparkSession,
+      catalogTable: CatalogTable,
+      hiveTableCatalog: HiveTableCatalog): (BigInt, Seq[CatalogTablePartition]) = {
+    val sessionState = spark.sessionState
+    val startTime = System.nanoTime()
+    val (totalSize, newPartitions) = if (catalogTable.partitionColumnNames.isEmpty) {
+      (
+        calculateSingleLocationSize(
+          sessionState,
+          catalogTable.identifier,
+          catalogTable.storage.locationUri),
+        Seq())
+    } else {
+      // Calculate table size as a sum of the visible partitions. See SPARK-21079
+      val partitions = hiveTableCatalog.listPartitions(catalogTable.identifier)
+      logInfo(s"Starting to calculate sizes for ${partitions.length} partitions.")
+      val paths = partitions.map(_.storage.locationUri)
+      val sizes = calculateMultipleLocationSizes(spark, catalogTable.identifier, paths)
+      val newPartitions = partitions.zipWithIndex.flatMap { case (p, idx) =>
+        val newStats = CommandUtils.compareAndGetNewStats(p.stats, sizes(idx), None)
+        newStats.map(_ => p.copy(stats = newStats))
+      }
+      (sizes.sum, newPartitions)
+    }
+    logInfo(s"It took ${(System.nanoTime() - startTime) / (1000 * 1000)} ms to calculate" +
+      s" the total size for table ${catalogTable.identifier}.")
+    (totalSize, newPartitions)
+  }
+
+  def applySchemaChanges(schema: StructType, changes: Seq[TableChange]): StructType = {
+    changes.foldLeft(schema) { (schema, change) =>
+      change match {
+        case add: AddColumn =>
+          add.fieldNames match {
+            case Array(name) =>
+              val field = StructField(name, add.dataType, nullable = add.isNullable)
+              val newField = Option(add.comment).map(field.withComment).getOrElse(field)
+              addField(schema, newField, add.position())
+
+            case names =>
+              replace(
+                schema,
+                names.init,
+                parent =>
+                  parent.dataType match {
+                    case parentType: StructType =>
+                      val field = StructField(names.last, add.dataType, nullable = add.isNullable)
+                      val newField = Option(add.comment).map(field.withComment).getOrElse(field)
+                      Some(parent.copy(dataType = addField(parentType, newField, add.position())))
+
+                    case _ =>
+                      throw new IllegalArgumentException(s"Not a struct: ${names.init.last}")
+                  })
+          }
+
+        case rename: RenameColumn =>
+          replace(
+            schema,
+            rename.fieldNames,
+            field =>
+              Some(StructField(rename.newName, field.dataType, field.nullable, field.metadata)))
+
+        case update: UpdateColumnType =>
+          replace(
+            schema,
+            update.fieldNames,
+            field => Some(field.copy(dataType = update.newDataType)))
+
+        case update: UpdateColumnNullability =>
+          replace(
+            schema,
+            update.fieldNames,
+            field => Some(field.copy(nullable = update.nullable)))
+
+        case update: UpdateColumnComment =>
+          replace(
+            schema,
+            update.fieldNames,
+            field =>
+              Some(field.withComment(update.newComment)))

Review Comment:
   ```suggestion
               field => Some(field.withComment(update.newComment)))
   ```





[GitHub] [kyuubi] pan3793 closed pull request #4999: [KSHC] Kyuubi-Spark-Hive-Connector support Apache Spark 3.4

Posted by "pan3793 (via GitHub)" <gi...@apache.org>.
pan3793 closed pull request #4999: [KSHC] Kyuubi-Spark-Hive-Connector support Apache Spark 3.4
URL: https://github.com/apache/kyuubi/pull/4999




[GitHub] [kyuubi] pan3793 commented on a diff in pull request #4999: [KSHC] Kyuubi-Spark-Hive-Connector support Apache Spark 3.4

Posted by "pan3793 (via GitHub)" <gi...@apache.org>.
pan3793 commented on code in PR #4999:
URL: https://github.com/apache/kyuubi/pull/4999#discussion_r1251508164


##########
extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveConnectorUtils.scala:
##########
@@ -0,0 +1,262 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.kyuubi.spark.connector.hive
+
+import org.apache.spark.SPARK_VERSION
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.SparkSession
+import org.apache.spark.sql.catalyst.catalog.{CatalogTable, CatalogTablePartition}
+import org.apache.spark.sql.connector.catalog.TableChange
+import org.apache.spark.sql.connector.catalog.TableChange.{AddColumn, After, ColumnPosition, DeleteColumn, First, RenameColumn, UpdateColumnComment, UpdateColumnNullability, UpdateColumnPosition, UpdateColumnType}
+import org.apache.spark.sql.execution.command.CommandUtils
+import org.apache.spark.sql.execution.command.CommandUtils.{calculateMultipleLocationSizes, calculateSingleLocationSize}
+import org.apache.spark.sql.execution.datasources.PartitionedFile
+import org.apache.spark.sql.types.{ArrayType, MapType, StructField, StructType}
+
+import org.apache.kyuubi.spark.connector.common.SparkUtils
+import org.apache.kyuubi.util.SemanticVersion
+import org.apache.kyuubi.util.reflect.ReflectUtils.invokeAs
+
+object HiveConnectorUtils extends Logging {
+
+  def partitionedFilePath(file: PartitionedFile): String = {
+    if (SparkUtils.isSparkVersionAtLeast("3.4")) {
+      invokeAs[String](file, "urlEncodedPath")
+    } else if (SparkUtils.isSparkVersionAtLeast("3.3")) {
+      invokeAs[String](file, "filePath")
+    } else {
+      throw KyuubiHiveConnectorException(s"Spark version ${SemanticVersion(SPARK_VERSION)} " +

Review Comment:
   ```suggestion
         throw KyuubiHiveConnectorException(s"Spark version $SPARK_VERSION " +
   ```





[GitHub] [kyuubi] pan3793 commented on a diff in pull request #4999: [KSHC] Kyuubi-Spark-Hive-Connector support Apache Spark 3.4

Posted by "pan3793 (via GitHub)" <gi...@apache.org>.
pan3793 commented on code in PR #4999:
URL: https://github.com/apache/kyuubi/pull/4999#discussion_r1251585616


##########
extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveConnectorUtils.scala:
##########
@@ -0,0 +1,257 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.kyuubi.spark.connector.hive
+
+import org.apache.spark.SPARK_VERSION
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.SparkSession
+import org.apache.spark.sql.catalyst.catalog.{CatalogTable, CatalogTablePartition}
+import org.apache.spark.sql.connector.catalog.TableChange
+import org.apache.spark.sql.connector.catalog.TableChange.{AddColumn, After, ColumnPosition, DeleteColumn, First, RenameColumn, UpdateColumnComment, UpdateColumnNullability, UpdateColumnPosition, UpdateColumnType}
+import org.apache.spark.sql.execution.command.CommandUtils
+import org.apache.spark.sql.execution.command.CommandUtils.{calculateMultipleLocationSizes, calculateSingleLocationSize}
+import org.apache.spark.sql.execution.datasources.PartitionedFile
+import org.apache.spark.sql.types.{ArrayType, MapType, StructField, StructType}
+
+import org.apache.kyuubi.spark.connector.common.SparkUtils
+import org.apache.kyuubi.util.SemanticVersion

Review Comment:
   ```suggestion
   ```





[GitHub] [kyuubi] Yikf commented on pull request #4999: [KSHC] Kyuubi-Spark-Hive-Connector support Apache Spark 3.4

Posted by "Yikf (via GitHub)" <gi...@apache.org>.
Yikf commented on PR #4999:
URL: https://github.com/apache/kyuubi/pull/4999#issuecomment-1619881289

   Ah, thanks @pan3793 for helping me fix these points.




[GitHub] [kyuubi] pan3793 commented on a diff in pull request #4999: [KSHC] Kyuubi-Spark-Hive-Connector support Apache Spark 3.4

Posted by "pan3793 (via GitHub)" <gi...@apache.org>.
pan3793 commented on code in PR #4999:
URL: https://github.com/apache/kyuubi/pull/4999#discussion_r1250196009


##########
extensions/spark/kyuubi-spark-connector-hive/src/main/spark-3.4/org/apache/kyuubi/spark/connector/hive/PartitionedFileUtils.scala:
##########
@@ -0,0 +1,27 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.kyuubi.spark.connector.hive
+
+import org.apache.spark.sql.execution.datasources.PartitionedFile
+
+object PartitionedFileUtils {

Review Comment:
   we don't need to introduce spark-3.3 and spark-3.4 source folders; please use reflection to resolve it instead.
   
   BTW, it's expected that the connector compiled against Spark 3.3 can work on both 3.3 and 3.4.
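The reflection approach suggested here can be sketched as follows. This is a minimal illustration, not the connector's actual code: `File33`/`File34` are toy stand-ins for the two shapes of Spark's `PartitionedFile`, and the helper mirrors the `invokeAs`-style utility used in `HiveConnectorUtils` in this PR.

```scala
// Minimal sketch of version-dispatched reflection: invoke the accessor by
// name at runtime, so a single binary compiled against Spark 3.3 can read
// the path from both the 3.3 and 3.4 shapes of PartitionedFile.
object ReflectiveAccess {
  // call a no-arg method by name and cast the result
  def invokeAs[T](target: AnyRef, method: String): T =
    target.getClass.getMethod(method).invoke(target).asInstanceOf[T]
}

// toy stand-ins for the two PartitionedFile shapes (illustrative only)
final class File33(val filePath: String)
final class File34(val urlEncodedPath: String)

// dispatch on the runtime Spark version instead of per-version source sets
def partitionedFilePath(file: AnyRef, isAtLeastSpark34: Boolean): String =
  if (isAtLeastSpark34) ReflectiveAccess.invokeAs[String](file, "urlEncodedPath")
  else ReflectiveAccess.invokeAs[String](file, "filePath")
```

The design benefit is that one artifact works across both versions: the accessor is resolved at runtime, so no spark-3.3/spark-3.4 source folders or separate builds are needed.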





[GitHub] [kyuubi] pan3793 commented on pull request #4999: [KSHC] Kyuubi-Spark-Hive-Connector support Apache Spark 3.4

Posted by "pan3793 (via GitHub)" <gi...@apache.org>.
pan3793 commented on PR #4999:
URL: https://github.com/apache/kyuubi/pull/4999#issuecomment-1619883676

   Thanks, merged to master




[GitHub] [kyuubi] pan3793 commented on a diff in pull request #4999: [KSHC] Kyuubi-Spark-Hive-Connector support Apache Spark 3.4

Posted by "pan3793 (via GitHub)" <gi...@apache.org>.
pan3793 commented on code in PR #4999:
URL: https://github.com/apache/kyuubi/pull/4999#discussion_r1251505974


##########
extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveConnectorUtils.scala:
##########
@@ -0,0 +1,262 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.kyuubi.spark.connector.hive
+
+import org.apache.spark.SPARK_VERSION
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.SparkSession
+import org.apache.spark.sql.catalyst.catalog.{CatalogTable, CatalogTablePartition}
+import org.apache.spark.sql.connector.catalog.TableChange
+import org.apache.spark.sql.connector.catalog.TableChange.{AddColumn, After, ColumnPosition, DeleteColumn, First, RenameColumn, UpdateColumnComment, UpdateColumnNullability, UpdateColumnPosition, UpdateColumnType}
+import org.apache.spark.sql.execution.command.CommandUtils
+import org.apache.spark.sql.execution.command.CommandUtils.{calculateMultipleLocationSizes, calculateSingleLocationSize}
+import org.apache.spark.sql.execution.datasources.PartitionedFile
+import org.apache.spark.sql.types.{ArrayType, MapType, StructField, StructType}
+
+import org.apache.kyuubi.spark.connector.common.SparkUtils
+import org.apache.kyuubi.util.SemanticVersion
+import org.apache.kyuubi.util.reflect.ReflectUtils.invokeAs
+
+object HiveConnectorUtils extends Logging {
+
+  def partitionedFilePath(file: PartitionedFile): String = {
+    if (SparkUtils.isSparkVersionAtLeast("3.4.0")) {

Review Comment:
   ```suggestion
       if (SparkUtils.isSparkVersionAtLeast("3.4")) {
   ```





[GitHub] [kyuubi] codecov-commenter commented on pull request #4999: codecov module should contain the spark 3.4 profile

Posted by "codecov-commenter (via GitHub)" <gi...@apache.org>.
codecov-commenter commented on PR #4999:
URL: https://github.com/apache/kyuubi/pull/4999#issuecomment-1609073485

   ## [Codecov](https://app.codecov.io/gh/apache/kyuubi/pull/4999?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) Report
   > Merging [#4999](https://app.codecov.io/gh/apache/kyuubi/pull/4999?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (7f7cfa8) into [master](https://app.codecov.io/gh/apache/kyuubi/commit/1dd9db7492e7b6fd9cda173dc362235f374f89d6?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=apache) (1dd9db7) will **not change** coverage.
   > The diff coverage is `n/a`.
   
   ```diff
   @@          Coverage Diff           @@
   ##           master   #4999   +/-   ##
   ======================================
     Coverage    0.00%   0.00%           
   ======================================
     Files         563     563           
     Lines       31167   31167           
     Branches     4072    4072           
   ======================================
     Misses      31167   31167           
   ```
   
   
   
   




[GitHub] [kyuubi] pan3793 commented on pull request #4999: codecov module should contain the spark 3.4 profile

Posted by "pan3793 (via GitHub)" <gi...@apache.org>.
pan3793 commented on PR #4999:
URL: https://github.com/apache/kyuubi/pull/4999#issuecomment-1609097006

   > By the way, if you don't mind, why did you remove the kudu module?
   
   It's a dummy module; it was planned but has been delayed.




[GitHub] [kyuubi] Yikf commented on pull request #4999: [KSHC] Kyuubi-Spark-Hive-Connector support Apache Spark 3.4

Posted by "Yikf (via GitHub)" <gi...@apache.org>.
Yikf commented on PR #4999:
URL: https://github.com/apache/kyuubi/pull/4999#issuecomment-1617126136

   Could you please take another look when you find time ~ @pan3793 




[GitHub] [kyuubi] pan3793 commented on a diff in pull request #4999: [KSHC] Kyuubi-Spark-Hive-Connector support Apache Spark 3.4

Posted by "pan3793 (via GitHub)" <gi...@apache.org>.
pan3793 commented on code in PR #4999:
URL: https://github.com/apache/kyuubi/pull/4999#discussion_r1251507776


##########
extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveConnectorUtils.scala:
##########
@@ -0,0 +1,262 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.kyuubi.spark.connector.hive
+
+import org.apache.spark.SPARK_VERSION
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.SparkSession
+import org.apache.spark.sql.catalyst.catalog.{CatalogTable, CatalogTablePartition}
+import org.apache.spark.sql.connector.catalog.TableChange
+import org.apache.spark.sql.connector.catalog.TableChange.{AddColumn, After, ColumnPosition, DeleteColumn, First, RenameColumn, UpdateColumnComment, UpdateColumnNullability, UpdateColumnPosition, UpdateColumnType}
+import org.apache.spark.sql.execution.command.CommandUtils
+import org.apache.spark.sql.execution.command.CommandUtils.{calculateMultipleLocationSizes, calculateSingleLocationSize}
+import org.apache.spark.sql.execution.datasources.PartitionedFile
+import org.apache.spark.sql.types.{ArrayType, MapType, StructField, StructType}
+
+import org.apache.kyuubi.spark.connector.common.SparkUtils
+import org.apache.kyuubi.util.SemanticVersion
+import org.apache.kyuubi.util.reflect.ReflectUtils.invokeAs
+
+object HiveConnectorUtils extends Logging {
+
+  def partitionedFilePath(file: PartitionedFile): String = {
+    if (SparkUtils.isSparkVersionAtLeast("3.4.0")) {

Review Comment:
   ```suggestion
       if (SparkUtils.isSparkVersionAtLeast("3.4")) {
   ```





[GitHub] [kyuubi] pan3793 commented on a diff in pull request #4999: [KSHC] Kyuubi-Spark-Hive-Connector support Apache Spark 3.4

Posted by "pan3793 (via GitHub)" <gi...@apache.org>.
pan3793 commented on code in PR #4999:
URL: https://github.com/apache/kyuubi/pull/4999#discussion_r1251505806


##########
pom.xml:
##########
@@ -1889,9 +1889,9 @@
                         </java>
                         <scala>
                             <includes>
-                                <include>src/main/scala/**/*.scala</include>
-                                <include>src/test/scala/**/*.scala</include>
-                                <include>src/test/gen/scala/**/*.scala</include>
+                                <include>src/main/**/*.scala</include>
+                                <include>src/test/**/*.scala</include>
+                                <include>src/test/gen/**/*.scala</include>

Review Comment:
   ```suggestion
                                   <include>src/main/scala/**/*.scala</include>
                                   <include>src/test/scala/**/*.scala</include>
                                   <include>src/test/gen/scala/**/*.scala</include>
   ```





[GitHub] [kyuubi] pan3793 commented on a diff in pull request #4999: codecov module should contain the spark 3.4 profile

Posted by "pan3793 (via GitHub)" <gi...@apache.org>.
pan3793 commented on code in PR #4999:
URL: https://github.com/apache/kyuubi/pull/4999#discussion_r1243379727


##########
dev/kyuubi-codecov/pom.xml:
##########
@@ -209,5 +209,16 @@
                 </dependency>
             </dependencies>
         </profile>
+        <profile>
+            <id>spark-3.4</id>
+            <dependencies>
+                <dependency>
+                    <groupId>org.apache.kyuubi</groupId>
+                    <artifactId>kyuubi-spark-connector-kudu_${scala.binary.version}</artifactId>

Review Comment:
   actually, I'm going to remove this dummy module ...



##########
dev/kyuubi-codecov/pom.xml:
##########
@@ -209,5 +209,16 @@
                 </dependency>
             </dependencies>
         </profile>
+        <profile>
+            <id>spark-3.4</id>
+            <dependencies>
+                <dependency>
+                    <groupId>org.apache.kyuubi</groupId>
+                    <artifactId>kyuubi-spark-connector-kudu_${scala.binary.version}</artifactId>
+                    <version>${project.version}</version>
+                </dependency>
+                <!-- TODO Support Apache Spark 3.4 for KSHC -->

Review Comment:
   let's fix and add this one instead





[GitHub] [kyuubi] Yikf commented on pull request #4999: codecov module should contain the spark 3.4 profile

Posted by "Yikf (via GitHub)" <gi...@apache.org>.
Yikf commented on PR #4999:
URL: https://github.com/apache/kyuubi/pull/4999#issuecomment-1609093998

   Thanks @pan3793 for your points. Let me:
   - Remove codecov's reference to kudu in the spark-3.4 profile
   - Make KSHC support Spark 3.4 in this patch and add it to the codecov spark-3.4 profile
   
   By the way, if you don't mind, why did you remove the kudu module?




[GitHub] [kyuubi] pan3793 commented on a diff in pull request #4999: [KSHC] Kyuubi-Spark-Hive-Connector support Apache Spark 3.4

Posted by "pan3793 (via GitHub)" <gi...@apache.org>.
pan3793 commented on code in PR #4999:
URL: https://github.com/apache/kyuubi/pull/4999#discussion_r1251508918


##########
extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveConnectorUtils.scala:
##########
@@ -0,0 +1,260 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.kyuubi.spark.connector.hive
+
+import org.apache.spark.SPARK_VERSION
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.SparkSession
+import org.apache.spark.sql.catalyst.catalog.{CatalogTable, CatalogTablePartition}
+import org.apache.spark.sql.connector.catalog.TableChange
+import org.apache.spark.sql.connector.catalog.TableChange.{AddColumn, After, ColumnPosition, DeleteColumn, First, RenameColumn, UpdateColumnComment, UpdateColumnNullability, UpdateColumnPosition, UpdateColumnType}
+import org.apache.spark.sql.execution.command.CommandUtils
+import org.apache.spark.sql.execution.command.CommandUtils.{calculateMultipleLocationSizes, calculateSingleLocationSize}
+import org.apache.spark.sql.execution.datasources.PartitionedFile
+import org.apache.spark.sql.types.{ArrayType, MapType, StructField, StructType}
+
+import org.apache.kyuubi.spark.connector.common.SparkUtils
+import org.apache.kyuubi.util.SemanticVersion
+import org.apache.kyuubi.util.reflect.ReflectUtils.invokeAs
+
+object HiveConnectorUtils extends Logging {
+
+  def partitionedFilePath(file: PartitionedFile): String = {
+    if (SparkUtils.isSparkVersionAtLeast("3.4")) {
+      invokeAs[String](file, "urlEncodedPath")
+    } else if (SparkUtils.isSparkVersionAtLeast("3.3")) {
+      invokeAs[String](file, "filePath")
+    } else {
+      throw KyuubiHiveConnectorException(s"Spark version $SPARK_VERSION " +
+        s"is not supported by Kyuubi spark hive connector.")
+    }
+  }
+
+  def calculateTotalSize(
+      spark: SparkSession,
+      catalogTable: CatalogTable,
+      hiveTableCatalog: HiveTableCatalog): (BigInt, Seq[CatalogTablePartition]) = {
+    val sessionState = spark.sessionState
+    val startTime = System.nanoTime()
+    val (totalSize, newPartitions) = if (catalogTable.partitionColumnNames.isEmpty) {
+      (
+        calculateSingleLocationSize(
+          sessionState,
+          catalogTable.identifier,
+          catalogTable.storage.locationUri),
+        Seq())
+    } else {
+      // Calculate table size as a sum of the visible partitions. See SPARK-21079
+      val partitions = hiveTableCatalog.listPartitions(catalogTable.identifier)
+      logInfo(s"Starting to calculate sizes for ${partitions.length} partitions.")
+      val paths = partitions.map(_.storage.locationUri)
+      val sizes = calculateMultipleLocationSizes(spark, catalogTable.identifier, paths)
+      val newPartitions = partitions.zipWithIndex.flatMap { case (p, idx) =>
+        val newStats = CommandUtils.compareAndGetNewStats(p.stats, sizes(idx), None)
+        newStats.map(_ => p.copy(stats = newStats))
+      }
+      (sizes.sum, newPartitions)
+    }
+    logInfo(s"It took ${(System.nanoTime() - startTime) / (1000 * 1000)} ms to calculate" +
+      s" the total size for table ${catalogTable.identifier}.")
+    (totalSize, newPartitions)
+  }
+
+  def applySchemaChanges(schema: StructType, changes: Seq[TableChange]): StructType = {
+    changes.foldLeft(schema) { (schema, change) =>
+      change match {
+        case add: AddColumn =>
+          add.fieldNames match {
+            case Array(name) =>
+              val field = StructField(name, add.dataType, nullable = add.isNullable)
+              val newField = Option(add.comment).map(field.withComment).getOrElse(field)
+              addField(schema, newField, add.position())
+
+            case names =>
+              replace(
+                schema,
+                names.init,
+                parent =>
+                  parent.dataType match {
+                    case parentType: StructType =>
+                      val field = StructField(names.last, add.dataType, nullable = add.isNullable)
+                      val newField = Option(add.comment).map(field.withComment).getOrElse(field)
+                      Some(parent.copy(dataType = addField(parentType, newField, add.position())))
+
+                    case _ =>
+                      throw new IllegalArgumentException(s"Not a struct: ${names.init.last}")
+                  })
+          }
+
+        case rename: RenameColumn =>
+          replace(
+            schema,
+            rename.fieldNames,
+            field =>
+              Some(StructField(rename.newName, field.dataType, field.nullable, field.metadata)))
+
+        case update: UpdateColumnType =>
+          replace(
+            schema,
+            update.fieldNames,
+            field => Some(field.copy(dataType = update.newDataType)))
+
+        case update: UpdateColumnNullability =>
+          replace(
+            schema,
+            update.fieldNames,
+            field => {
+              Some(field.copy(nullable = update.nullable))
+            })

Review Comment:
   ```suggestion
               field => Some(field.copy(nullable = update.nullable)))
   ```
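
   [Editor's aside for archive readers: the `applySchemaChanges` method quoted above folds a sequence of `TableChange`s over a `StructType`. The sketch below shows the same foldLeft pattern over a toy schema (an ordered list of name/type pairs) so it runs without a Spark dependency; all names in it are illustrative, not part of the connector's API.]

   ```scala
   // Illustrative sketch only: the foldLeft-over-changes pattern used by
   // applySchemaChanges, applied to a toy schema instead of Spark's StructType.
   object SchemaFoldSketch {
     sealed trait Change
     case class Add(name: String, tpe: String) extends Change
     case class Rename(from: String, to: String) extends Change
     case class Drop(name: String) extends Change

     // Each change produces a new schema; foldLeft threads the result through.
     def applyChanges(
         schema: List[(String, String)],
         changes: Seq[Change]): List[(String, String)] =
       changes.foldLeft(schema) { (s, change) =>
         change match {
           case Add(n, t)    => s :+ (n -> t)
           case Rename(f, t) => s.map { case (n, tp) => if (n == f) (t, tp) else (n, tp) }
           case Drop(n)      => s.filterNot(_._1 == n)
         }
       }
   }
   ```

   For example, `applyChanges(List("a" -> "int"), Seq(Add("b", "string"), Rename("a", "c")))` yields `List("c" -> "int", "b" -> "string")`.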





[GitHub] [kyuubi] Yikf commented on a diff in pull request #4999: [KSHC] Kyuubi-Spark-Hive-Connector support Apache Spark 3.4

Posted by "Yikf (via GitHub)" <gi...@apache.org>.
Yikf commented on code in PR #4999:
URL: https://github.com/apache/kyuubi/pull/4999#discussion_r1250502015


##########
extensions/spark/kyuubi-spark-connector-hive/src/main/spark-3.4/org/apache/kyuubi/spark/connector/hive/PartitionedFileUtils.scala:
##########
@@ -0,0 +1,27 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.kyuubi.spark.connector.hive
+
+import org.apache.spark.sql.execution.datasources.PartitionedFile
+
+object PartitionedFileUtils {

Review Comment:
   Thanks, updated





[GitHub] [kyuubi] pan3793 commented on a diff in pull request #4999: [KSHC] Kyuubi-Spark-Hive-Connector support Apache Spark 3.4

Posted by "pan3793 (via GitHub)" <gi...@apache.org>.
pan3793 commented on code in PR #4999:
URL: https://github.com/apache/kyuubi/pull/4999#discussion_r1251508601


##########
extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveConnectorUtils.scala:
##########
@@ -0,0 +1,262 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.kyuubi.spark.connector.hive
+
+import org.apache.spark.SPARK_VERSION
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.SparkSession
+import org.apache.spark.sql.catalyst.catalog.{CatalogTable, CatalogTablePartition}
+import org.apache.spark.sql.connector.catalog.TableChange
+import org.apache.spark.sql.connector.catalog.TableChange.{AddColumn, After, ColumnPosition, DeleteColumn, First, RenameColumn, UpdateColumnComment, UpdateColumnNullability, UpdateColumnPosition, UpdateColumnType}
+import org.apache.spark.sql.execution.command.CommandUtils
+import org.apache.spark.sql.execution.command.CommandUtils.{calculateMultipleLocationSizes, calculateSingleLocationSize}
+import org.apache.spark.sql.execution.datasources.PartitionedFile
+import org.apache.spark.sql.types.{ArrayType, MapType, StructField, StructType}
+
+import org.apache.kyuubi.spark.connector.common.SparkUtils
+import org.apache.kyuubi.util.SemanticVersion
+import org.apache.kyuubi.util.reflect.ReflectUtils.invokeAs
+
+object HiveConnectorUtils extends Logging {
+
+  def partitionedFilePath(file: PartitionedFile): String = {
+    if (SparkUtils.isSparkVersionAtLeast("3.4")) {
+      invokeAs[String](file, "urlEncodedPath")
+    } else if (SparkUtils.isSparkVersionAtLeast("3.3")) {
+      invokeAs[String](file, "filePath")
+    } else {
+      throw KyuubiHiveConnectorException(s"Spark version $SPARK_VERSION " +
+        s"is not supported by Kyuubi spark hive connector.")
+    }
+  }
+
+  def calculateTotalSize(
+      spark: SparkSession,
+      catalogTable: CatalogTable,
+      hiveTableCatalog: HiveTableCatalog): (BigInt, Seq[CatalogTablePartition]) = {
+    val sessionState = spark.sessionState
+    val startTime = System.nanoTime()
+    val (totalSize, newPartitions) = if (catalogTable.partitionColumnNames.isEmpty) {
+      (
+        calculateSingleLocationSize(
+          sessionState,
+          catalogTable.identifier,
+          catalogTable.storage.locationUri),
+        Seq())
+    } else {
+      // Calculate table size as a sum of the visible partitions. See SPARK-21079
+      val partitions = hiveTableCatalog.listPartitions(catalogTable.identifier)
+      logInfo(s"Starting to calculate sizes for ${partitions.length} partitions.")
+      val paths = partitions.map(_.storage.locationUri)
+      val sizes = calculateMultipleLocationSizes(spark, catalogTable.identifier, paths)
+      val newPartitions = partitions.zipWithIndex.flatMap { case (p, idx) =>
+        val newStats = CommandUtils.compareAndGetNewStats(p.stats, sizes(idx), None)
+        newStats.map(_ => p.copy(stats = newStats))
+      }
+      (sizes.sum, newPartitions)
+    }
+    logInfo(s"It took ${(System.nanoTime() - startTime) / (1000 * 1000)} ms to calculate" +
+      s" the total size for table ${catalogTable.identifier}.")
+    (totalSize, newPartitions)
+  }
+
+  def applySchemaChanges(schema: StructType, changes: Seq[TableChange]): StructType = {
+    changes.foldLeft(schema) { (schema, change) =>
+      change match {
+        case add: AddColumn =>
+          add.fieldNames match {
+            case Array(name) =>
+              val field = StructField(name, add.dataType, nullable = add.isNullable)
+              val newField = Option(add.comment).map(field.withComment).getOrElse(field)
+              addField(schema, newField, add.position())
+
+            case names =>
+              replace(
+                schema,
+                names.init,
+                parent =>
+                  parent.dataType match {
+                    case parentType: StructType =>
+                      val field = StructField(names.last, add.dataType, nullable = add.isNullable)
+                      val newField = Option(add.comment).map(field.withComment).getOrElse(field)
+                      Some(parent.copy(dataType = addField(parentType, newField, add.position())))
+
+                    case _ =>
+                      throw new IllegalArgumentException(s"Not a struct: ${names.init.last}")
+                  })
+          }
+
+        case rename: RenameColumn =>
+          replace(
+            schema,
+            rename.fieldNames,
+            field =>
+              Some(StructField(rename.newName, field.dataType, field.nullable, field.metadata)))
+
+        case update: UpdateColumnType =>
+          replace(
+            schema,
+            update.fieldNames,
+            field => {
+              Some(field.copy(dataType = update.newDataType))
+            })

Review Comment:
   ```suggestion
               field => Some(field.copy(dataType = update.newDataType)))
   ```





[GitHub] [kyuubi] pan3793 commented on a diff in pull request #4999: [KSHC] Kyuubi-Spark-Hive-Connector support Apache Spark 3.4

Posted by "pan3793 (via GitHub)" <gi...@apache.org>.
pan3793 commented on code in PR #4999:
URL: https://github.com/apache/kyuubi/pull/4999#discussion_r1251507973


##########
extensions/spark/kyuubi-spark-connector-hive/src/main/scala/org/apache/kyuubi/spark/connector/hive/HiveConnectorUtils.scala:
##########
@@ -0,0 +1,262 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements.  See the NOTICE file distributed with
+ * this work for additional information regarding copyright ownership.
+ * The ASF licenses this file to You under the Apache License, Version 2.0
+ * (the "License"); you may not use this file except in compliance with
+ * the License.  You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.kyuubi.spark.connector.hive
+
+import org.apache.spark.SPARK_VERSION
+import org.apache.spark.internal.Logging
+import org.apache.spark.sql.SparkSession
+import org.apache.spark.sql.catalyst.catalog.{CatalogTable, CatalogTablePartition}
+import org.apache.spark.sql.connector.catalog.TableChange
+import org.apache.spark.sql.connector.catalog.TableChange.{AddColumn, After, ColumnPosition, DeleteColumn, First, RenameColumn, UpdateColumnComment, UpdateColumnNullability, UpdateColumnPosition, UpdateColumnType}
+import org.apache.spark.sql.execution.command.CommandUtils
+import org.apache.spark.sql.execution.command.CommandUtils.{calculateMultipleLocationSizes, calculateSingleLocationSize}
+import org.apache.spark.sql.execution.datasources.PartitionedFile
+import org.apache.spark.sql.types.{ArrayType, MapType, StructField, StructType}
+
+import org.apache.kyuubi.spark.connector.common.SparkUtils
+import org.apache.kyuubi.util.SemanticVersion
+import org.apache.kyuubi.util.reflect.ReflectUtils.invokeAs
+
+object HiveConnectorUtils extends Logging {
+
+  def partitionedFilePath(file: PartitionedFile): String = {
+    if (SparkUtils.isSparkVersionAtLeast("3.4")) {
+      invokeAs[String](file, "urlEncodedPath")
+    } else if (SparkUtils.isSparkVersionAtLeast("3.3.0")) {

Review Comment:
   ```suggestion
       } else if (SparkUtils.isSparkVersionAtLeast("3.3")) {
   ```
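
   [Editor's aside for archive readers: the dispatch under review exists because `PartitionedFile.filePath` was renamed to `urlEncodedPath` in Spark 3.4, and reflection lets one binary support both. Below is a minimal stand-alone sketch of the idea; the `FileV*` classes are toy stand-ins for `PartitionedFile`, and the naive string version check is only for the sketch — the connector uses `SemanticVersion`, which handles cases plain string comparison would not (e.g. "3.10").]

   ```scala
   // Illustrative sketch only: reflective access to a method whose name changed
   // between library versions, emulating the role of ReflectUtils.invokeAs.
   object VersionDispatchSketch {
     // Look up the method by name at runtime and cast the result.
     def invokeAs[T](target: AnyRef, method: String): T =
       target.getClass.getMethod(method).invoke(target).asInstanceOf[T]

     class FileV33 { def filePath: String = "/warehouse/t/part-0" }       // pre-3.4 shape
     class FileV34 { def urlEncodedPath: String = "/warehouse/t/part-0" } // 3.4+ shape

     // Naive version check, sufficient only for this toy example.
     def pathOf(file: AnyRef, sparkVersion: String): String =
       if (sparkVersion >= "3.4") invokeAs[String](file, "urlEncodedPath")
       else invokeAs[String](file, "filePath")
   }
   ```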


