Posted to reviews@spark.apache.org by "cxzl25 (via GitHub)" <gi...@apache.org> on 2023/07/17 05:02:58 UTC

[GitHub] [spark] cxzl25 opened a new pull request, #42033: [SPARK-44454][SQL][HIVE] HiveShim getTablesByType support fallback

cxzl25 opened a new pull request, #42033:
URL: https://github.com/apache/spark/pull/42033

   ### What changes were proposed in this pull request?
   When the `Shim_v2_3#getTablesByType` call fails because the metastore has no `get_tables_by_type` method, throw `SparkUnsupportedOperationException`.
   `HiveClientImpl#listTablesByType` catches it and takes its fallback path.
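
The fallback idea described above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the Spark code: `MetastoreShim`, `TableInfo`, `TableType`, `ListTablesFallback` and `OldMetastoreShim` are hypothetical stand-ins, and the thread does not show what the real fallback does, so the list-everything-then-filter approach here is an assumption.

```scala
// Hypothetical stand-ins; none of these are Spark or Hive classes.
object TableType extends Enumeration {
  val ManagedTable, VirtualView = Value
}

final case class TableInfo(name: String, tableType: TableType.Value)

trait MetastoreShim {
  // May throw UnsupportedOperationException when the server lacks the RPC.
  def getTablesByType(tableType: TableType.Value): Seq[String]
  // Assumed to be available on every metastore version.
  def getAllTables(): Seq[TableInfo]
}

object ListTablesFallback {
  def listTablesByType(shim: MetastoreShim, tableType: TableType.Value): Seq[String] = {
    try {
      // Fast path: let the metastore filter by type.
      shim.getTablesByType(tableType)
    } catch {
      case _: UnsupportedOperationException =>
        // Fallback (assumed): fetch every table and filter on the client side.
        shim.getAllTables().filter(_.tableType == tableType).map(_.name)
    }
  }
}

// Simulates an old metastore that does not know the typed listing call.
object OldMetastoreShim extends MetastoreShim {
  private val tables = Seq(
    TableInfo("t1", TableType.ManagedTable),
    TableInfo("v1", TableType.VirtualView))

  def getTablesByType(tableType: TableType.Value): Seq[String] =
    throw new UnsupportedOperationException("get_tables_by_type is not supported")

  def getAllTables(): Seq[TableInfo] = tables
}
```

Calling `ListTablesFallback.listTablesByType(OldMetastoreShim, TableType.VirtualView)` goes through the catch branch and filters on the client side. The point of throwing a subtype of `UnsupportedOperationException` from the shim is that the caller can tell "the server cannot do this" apart from every other failure.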
   
   ### Why are the changes needed?
   When a newer Hive client talks to an older Hive metastore, the call can fail with `Invalid method name: 'get_tables_by_type'`.
   
   ```java
   23/07/17 12:45:24,391 [main] DEBUG SparkSqlParser: Parsing command: show views
   23/07/17 12:45:24,489 [main] ERROR log: Got exception: org.apache.thrift.TApplicationException Invalid method name: 'get_tables_by_type'
   org.apache.thrift.TApplicationException: Invalid method name: 'get_tables_by_type'
       at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:79)
       at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_tables_by_type(ThriftHiveMetastore.java:1433)
       at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_tables_by_type(ThriftHiveMetastore.java:1418)
       at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.getTables(HiveMetaStoreClient.java:1411)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.lang.reflect.Method.invoke(Method.java:498)
       at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:173)
       at com.sun.proxy.$Proxy23.getTables(Unknown Source)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.lang.reflect.Method.invoke(Method.java:498)
       at org.apache.hadoop.hive.metastore.HiveMetaStoreClient$SynchronizedHandler.invoke(HiveMetaStoreClient.java:2344)
       at com.sun.proxy.$Proxy23.getTables(Unknown Source)
       at org.apache.hadoop.hive.ql.metadata.Hive.getTablesByType(Hive.java:1427)
       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
       at java.lang.reflect.Method.invoke(Method.java:498)
       at org.apache.spark.sql.hive.client.Shim_v2_3.getTablesByType(HiveShim.scala:1408)
       at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$listTablesByType$1(HiveClientImpl.scala:789)
       at org.apache.spark.sql.hive.client.HiveClientImpl.$anonfun$withHiveState$1(HiveClientImpl.scala:294)
       at org.apache.spark.sql.hive.client.HiveClientImpl.liftedTree1$1(HiveClientImpl.scala:225)
       at org.apache.spark.sql.hive.client.HiveClientImpl.retryLocked(HiveClientImpl.scala:224)
       at org.apache.spark.sql.hive.client.HiveClientImpl.withHiveState(HiveClientImpl.scala:274)
       at org.apache.spark.sql.hive.client.HiveClientImpl.listTablesByType(HiveClientImpl.scala:785)
       at org.apache.spark.sql.hive.HiveExternalCatalog.$anonfun$listViews$1(HiveExternalCatalog.scala:895)
       at org.apache.spark.sql.hive.HiveExternalCatalog.withClient(HiveExternalCatalog.scala:108)
       at org.apache.spark.sql.hive.HiveExternalCatalog.listViews(HiveExternalCatalog.scala:893)
       at org.apache.spark.sql.catalyst.catalog.ExternalCatalogWithListener.listViews(ExternalCatalogWithListener.scala:158)
       at org.apache.spark.sql.catalyst.catalog.SessionCatalog.listViews(SessionCatalog.scala:1040)
       at org.apache.spark.sql.execution.command.ShowViewsCommand.$anonfun$run$5(views.scala:407)
       at scala.Option.getOrElse(Option.scala:189)
       at org.apache.spark.sql.execution.command.ShowViewsCommand.run(views.scala:407) 
   ```
   
   ### Does this PR introduce _any_ user-facing change?
   No
   
   
   ### How was this patch tested?
   Tested by running the built-in Hive 2.3.9 client against a Hive metastore older than 2.3.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: reviews-unsubscribe@spark.apache.org
For additional commands, e-mail: reviews-help@spark.apache.org


[GitHub] [spark] wangyum commented on a diff in pull request #42033: [SPARK-44454][SQL][HIVE] HiveShim getTablesByType support fallback

Posted by "wangyum (via GitHub)" <gi...@apache.org>.
wangyum commented on code in PR #42033:
URL: https://github.com/apache/spark/pull/42033#discussion_r1272965017


##########
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala:
##########
@@ -1634,8 +1634,21 @@ private[client] class Shim_v2_3 extends Shim_v2_1 {
       pattern: String,
       tableType: TableType): Seq[String] = {
     recordHiveCall()
-    getTablesByTypeMethod.invoke(hive, dbName, pattern, tableType)
-      .asInstanceOf[JList[String]].asScala.toSeq
+    try {
+      getTablesByTypeMethod.invoke(hive, dbName, pattern, tableType)
+        .asInstanceOf[JList[String]].asScala.toSeq
+    } catch {
+      case ex: InvocationTargetException if ex.getCause.isInstanceOf[HiveException] =>
+        val cause = ex.getCause.getCause
+        if (cause != null && cause.isInstanceOf[MetaException] &&
+          cause.getMessage != null &&
+          cause.getMessage.contains("Invalid method name: 'get_tables_by_type'")) {
+          throw QueryExecutionErrors.getTablesByTypeUnsupportedByHiveVersionError()

Review Comment:
   Could you add a comment here? The exception should be aligned with https://github.com/apache/spark/blob/c5a23e9c23f7bd7066060d0791f290ad38fca76f/sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveClientImpl.scala#L835





[GitHub] [spark] wangyum closed pull request #42033: [SPARK-44454][SQL][HIVE] HiveShim getTablesByType support fallback

Posted by "wangyum (via GitHub)" <gi...@apache.org>.
wangyum closed pull request #42033: [SPARK-44454][SQL][HIVE] HiveShim getTablesByType support fallback 
URL: https://github.com/apache/spark/pull/42033




[GitHub] [spark] wangyum commented on pull request #42033: [SPARK-44454][SQL][HIVE] HiveShim getTablesByType support fallback

Posted by "wangyum (via GitHub)" <gi...@apache.org>.
wangyum commented on PR #42033:
URL: https://github.com/apache/spark/pull/42033#issuecomment-1649438672

   It's 2.3: https://issues.apache.org/jira/browse/HIVE-14558.




[GitHub] [spark] pan3793 commented on pull request #42033: [SPARK-44454][SQL][HIVE] HiveShim getTablesByType support fallback

Posted by "pan3793 (via GitHub)" <gi...@apache.org>.
pan3793 commented on PR #42033:
URL: https://github.com/apache/spark/pull/42033#issuecomment-1649188534

   Which HMS version first supports `get_tables_by_type`?




[GitHub] [spark] wangyum commented on pull request #42033: [SPARK-44454][SQL][HIVE] HiveShim getTablesByType support fallback

Posted by "wangyum (via GitHub)" <gi...@apache.org>.
wangyum commented on PR #42033:
URL: https://github.com/apache/spark/pull/42033#issuecomment-1653332871

   Thanks. Merged to master.




[GitHub] [spark] cxzl25 commented on pull request #42033: [SPARK-44454][SQL][HIVE] HiveShim getTablesByType support fallback

Posted by "cxzl25 (via GitHub)" <gi...@apache.org>.
cxzl25 commented on PR #42033:
URL: https://github.com/apache/spark/pull/42033#issuecomment-1648958579

   @wangyum
   Please help to review, thanks.




[GitHub] [spark] wangyum commented on pull request #42033: [SPARK-44454][SQL][HIVE] HiveShim getTablesByType support fallback

Posted by "wangyum (via GitHub)" <gi...@apache.org>.
wangyum commented on PR #42033:
URL: https://github.com/apache/spark/pull/42033#issuecomment-1649186697

   cc @pan3793  @turboFei




[GitHub] [spark] wangyum commented on a diff in pull request #42033: [SPARK-44454][SQL][HIVE] HiveShim getTablesByType support fallback

Posted by "wangyum (via GitHub)" <gi...@apache.org>.
wangyum commented on code in PR #42033:
URL: https://github.com/apache/spark/pull/42033#discussion_r1273007684


##########
sql/hive/src/main/scala/org/apache/spark/sql/hive/client/HiveShim.scala:
##########
@@ -1634,8 +1634,24 @@ private[client] class Shim_v2_3 extends Shim_v2_1 {
       pattern: String,
       tableType: TableType): Seq[String] = {
     recordHiveCall()
-    getTablesByTypeMethod.invoke(hive, dbName, pattern, tableType)
-      .asInstanceOf[JList[String]].asScala.toSeq
+    try {
+      getTablesByTypeMethod.invoke(hive, dbName, pattern, tableType)
+        .asInstanceOf[JList[String]].asScala.toSeq
+    } catch {
+      case ex: InvocationTargetException if ex.getCause.isInstanceOf[HiveException] =>
+        val cause = ex.getCause.getCause
+        if (cause != null && cause.isInstanceOf[MetaException] &&
+          cause.getMessage != null &&
+          cause.getMessage.contains("Invalid method name: 'get_tables_by_type'")) {
+          // SparkUnsupportedOperationException (inherited from UnsupportedOperationException)
+          // is thrown when the Shim_v2_3#getTablesByType call returns no get_tables_by_type method.
+          // HiveClientImpl#listTablesByType will have fallback processing.
+          throw QueryExecutionErrors.getTablesByTypeUnsupportedByHiveVersionError()
+        }
+        else {
+          throw ex
+        }

Review Comment:
   ```suggestion
           } else {
             throw ex
           }
   ```
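
For readers following the diff, the cause-chain check it performs (reflection wrapper, then Hive wrapper, then the metastore error message) can be tried in isolation with the sketch below. It is a rewrite using pattern matching rather than the `isInstanceOf` checks in the patch, and `FakeHiveException`, `FakeMetaException` and `MissingRpcDetector` are hypothetical stand-ins so that it runs without Hive on the classpath. Only `java.lang.reflect.InvocationTargetException` is the real class.

```scala
import java.lang.reflect.InvocationTargetException

// Hypothetical stand-ins for Hive's HiveException and MetaException.
class FakeHiveException(cause: Throwable) extends Exception(cause)
class FakeMetaException(message: String) extends Exception(message)

object MissingRpcDetector {
  private val Marker = "Invalid method name: 'get_tables_by_type'"

  // True only for: InvocationTargetException -> FakeHiveException ->
  // FakeMetaException whose message contains the marker text.
  def isMissingGetTablesByType(t: Throwable): Boolean = t match {
    case ite: InvocationTargetException =>
      ite.getCause match {
        case he: FakeHiveException =>
          he.getCause match {
            // A type pattern never matches null, so a missing cause
            // falls through to the default case.
            case me: FakeMetaException =>
              me.getMessage != null && me.getMessage.contains(Marker)
            case _ => false
          }
        case _ => false
      }
    case _ => false
  }
}
```

A caller would translate a `true` result into the "unsupported" exception and rethrow the original exception otherwise, which mirrors the if/else in the diff. Matching on the message text is inherently brittle, so the check is kept narrow: all three layers must be present before the message is inspected.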


