Posted to commits@hudi.apache.org by "Raymond Xu (Jira)" <ji...@apache.org> on 2022/03/28 18:07:00 UTC
[jira] [Commented] (HUDI-2524) Certify Hive sync on cloud platforms
[ https://issues.apache.org/jira/browse/HUDI-2524?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17513545#comment-17513545 ]
Raymond Xu commented on HUDI-2524:
----------------------------------
{code:java}
Exception in thread "main" org.apache.hudi.exception.HoodieException: Could not sync using the meta sync class org.apache.hudi.hive.HiveSyncTool
at org.apache.hudi.sync.common.util.SyncUtilHelpers.runHoodieMetaSync(SyncUtilHelpers.java:42)
at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncMeta(DeltaSync.java:704)
at org.apache.hudi.utilities.deltastreamer.DeltaSync.writeToSink(DeltaSync.java:623)
at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncOnce(DeltaSync.java:327)
at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.lambda$sync$2(HoodieDeltaStreamer.java:193)
at org.apache.hudi.common.util.Option.ifPresent(Option.java:97)
at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.sync(HoodieDeltaStreamer.java:191)
at org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer.main(HoodieDeltaStreamer.java:530)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:863)
at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:161)
at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:184)
at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:86)
at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:938)
at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:947)
at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.hudi.exception.HoodieException: Got runtime exception when hive syncing stocks20220328t175931
at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:141)
at org.apache.hudi.sync.common.util.SyncUtilHelpers.runHoodieMetaSync(SyncUtilHelpers.java:40)
... 19 more
Caused by: org.apache.hudi.hive.HoodieHiveSyncException: Failed to sync partitions for table stocks20220328t175931
at org.apache.hudi.hive.HiveSyncTool.syncPartitions(HiveSyncTool.java:412)
at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:230)
at org.apache.hudi.hive.HiveSyncTool.doSync(HiveSyncTool.java:150)
at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:138)
... 20 more
Caused by: org.apache.hudi.hive.HoodieHiveSyncException: Failed to get all partitions for table rxusandbox.stocks20220328t175931
at org.apache.hudi.hive.HoodieHiveClient.getAllPartitions(HoodieHiveClient.java:160)
at org.apache.hudi.hive.HiveSyncTool.syncPartitions(HiveSyncTool.java:388)
... 23 more
Caused by: NoSuchObjectException(message:rxusandbox.stocks20220328t175931 table not found)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result$get_partitions_resultStandardScheme.read(ThriftHiveMetastore.java:64527)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result$get_partitions_resultStandardScheme.read(ThriftHiveMetastore.java:64494)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$get_partitions_result.read(ThriftHiveMetastore.java:64428)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:86)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.recv_get_partitions(ThriftHiveMetastore.java:1998)
at org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore$Client.get_partitions(ThriftHiveMetastore.java:1983)
at org.apache.hadoop.hive.metastore.HiveMetaStoreClient.listPartitions(HiveMetaStoreClient.java:1058)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:498)
at org.apache.hadoop.hive.metastore.RetryingMetaStoreClient.invoke(RetryingMetaStoreClient.java:184)
at com.sun.proxy.$Proxy89.listPartitions(Unknown Source)
at org.apache.hudi.hive.HoodieHiveClient.getAllPartitions(HoodieHiveClient.java:155)
... 24 more{code}
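For context, the trace above comes from a HoodieDeltaStreamer run with Hive sync enabled. A minimal sketch of the relevant spark-submit flags is below; the jar path, table name, and metastore URI are illustrative placeholders, not the exact job that produced this trace:
{code:bash}
spark-submit \
  --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
  hudi-utilities-bundle.jar \
  --table-type COPY_ON_WRITE \
  --target-base-path s3://bucket/path/stocks \
  --target-table stocks \
  --enable-hive-sync \
  --hoodie-conf hoodie.datasource.hive_sync.database=rxusandbox \
  --hoodie-conf hoodie.datasource.hive_sync.partition_fields=date \
  --hoodie-conf hoodie.datasource.hive_sync.partition_extractor_class=org.apache.hudi.hive.MultiPartKeyValueExtractor \
  --hoodie-conf hoodie.datasource.hive_sync.jdbcurl=jdbc:hive2://localhost:10000
{code}
The NoSuchObjectException at the bottom of the trace indicates the metastore cannot find `rxusandbox.stocks20220328t175931` when listing partitions, i.e. the table was never created (or was created in a different database) before the partition sync step ran.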
> Certify Hive sync on cloud platforms
> ------------------------------------
>
> Key: HUDI-2524
> URL: https://issues.apache.org/jira/browse/HUDI-2524
> Project: Apache Hudi
> Issue Type: Task
> Reporter: Sagar Sumit
> Assignee: Raymond Xu
> Priority: Blocker
> Fix For: 0.11.0
>
>
> For instance, hive sync should work seamlessly not just with Apache Hive but also EMR Hive.
> EMR 6.x ships Hive 3.1.2, and later EMR 5.x releases ship Hive 2.3.x, while HiveSyncTool is only known to work with Hive 2.3.x.
> The scope of this ticket is to verify that Hive sync through Hudi also works with EMR Hive 3.1.x.
> We can refer to [https://docs.aws.amazon.com/emr/latest/ReleaseGuide/emr-hudi-work-with-dataset.html] for hive sync properties.
> The motivation for this verification is that hudi-hive-sync has Hive 2.3.1 as a compile-time dependency, so we need to check whether the Hive APIs used by the sync tool are compatible with Hive 3.1.x.
--
This message was sent by Atlassian Jira
(v8.20.1#820001)