Posted to commits@hudi.apache.org by "sivabalan narayanan (Jira)" <ji...@apache.org> on 2021/01/05 16:49:00 UTC

[jira] [Commented] (HUDI-1507) Hive sync having issues w/ Clustering

    [ https://issues.apache.org/jira/browse/HUDI-1507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17259040#comment-17259040 ] 

sivabalan narayanan commented on HUDI-1507:
-------------------------------------------

CC : [~satish] 

> Hive sync having issues w/ Clustering
> -------------------------------------
>
>                 Key: HUDI-1507
>                 URL: https://issues.apache.org/jira/browse/HUDI-1507
>             Project: Apache Hudi
>          Issue Type: Bug
>          Components: Storage Management
>    Affects Versions: 0.7.0
>            Reporter: sivabalan narayanan
>            Priority: Major
>
> I was trying out clustering with the test suite job and ran into Hive sync issues.
>  
> 21/01/05 16:45:05 WARN DagNode: Executing ClusteringNode node 5522853c-653b-4d92-acf4-d299c263a77f
> 21/01/05 16:45:05 WARN AbstractHoodieWriteClient: Scheduling clustering at instant time :20210105164505 clustering strategy org.apache.hudi.client.clustering.plan.strategy.SparkRecentDaysClusteringPlanStrategy, clustering sort cols : _row_key, target partitions for clustering :: 0, inline cluster max commit : 1
> 21/01/05 16:45:05 WARN HoodieTestSuiteWriter: Clustering instant :: 20210105164505
> 21/01/05 16:45:22 WARN DagScheduler: Executing node "second_hive_sync" :: {"queue_name":"adhoc","engine":"mr","name":"80325009-bb92-4df5-8c34-71bd75d001b8","config":"second_hive_sync"}
> 21/01/05 16:45:22 ERROR HiveSyncTool: Got runtime exception when hive syncing
> org.apache.hudi.exception.HoodieIOException: unknown action in timeline replacecommit
>  at org.apache.hudi.common.table.timeline.TimelineUtils.lambda$getAffectedPartitions$1(TimelineUtils.java:99)
>  at java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:267)
>  at java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1382)
>  at java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:482)
>  at java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:472)
>  at java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:708)
>  at java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
>  at java.util.stream.ReferencePipeline.collect(ReferencePipeline.java:499)
>  at org.apache.hudi.common.table.timeline.TimelineUtils.getAffectedPartitions(TimelineUtils.java:102)
>  at org.apache.hudi.common.table.timeline.TimelineUtils.getPartitionsWritten(TimelineUtils.java:50)
>  at org.apache.hudi.sync.common.AbstractSyncHoodieClient.getPartitionsWrittenToSince(AbstractSyncHoodieClient.java:136)
>  at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:145)
>  at org.apache.hudi.hive.HiveSyncTool.syncHoodieTable(HiveSyncTool.java:94)
>  at org.apache.hudi.utilities.deltastreamer.DeltaSync.syncHive(DeltaSync.java:589)
>  at org.apache.hudi.integ.testsuite.helpers.HiveServiceProvider.syncToLocalHiveIfNeeded(HiveServiceProvider.java:53)
>  at org.apache.hudi.integ.testsuite.dag.nodes.HiveSyncNode.execute(HiveSyncNode.java:41)
>  at org.apache.hudi.integ.testsuite.dag.scheduler.DagScheduler.executeNode(DagScheduler.java:139)
>  at org.apache.hudi.integ.testsuite.dag.scheduler.DagScheduler.lambda$execute$0(DagScheduler.java:105)
>  at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>  at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>  at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>  at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>  at java.lang.Thread.run(Thread.java:748)
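
For anyone reading the trace above: the exception is raised from a lambda inside TimelineUtils.getAffectedPartitions while it streams over timeline instants, and the message names the "replacecommit" action that clustering writes. A minimal sketch of that failure mode follows. It assumes the lookup dispatches on the action name and has no branch for "replacecommit". This is not Hudi's code: the class, the Instant type, and both method names are hypothetical, and it is also an assumption that a replace commit's partitions can be read the same way as a regular commit's.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical model of the failure; none of these names exist in Hudi.
class TimelineActionSketch {

    // Stand-in for a timeline instant: an action name plus the partitions it touched.
    static final class Instant {
        final String action;
        final List<String> partitions;

        Instant(String action, List<String> partitions) {
            this.action = action;
            this.partitions = partitions;
        }
    }

    // Assumed shape of the failing lookup: only commit and deltacommit are handled,
    // so any other action on the timeline aborts the whole scan.
    static List<String> affectedPartitionsStrict(List<Instant> timeline) {
        return timeline.stream().flatMap(i -> {
            switch (i.action) {
                case "commit":
                case "deltacommit":
                    return i.partitions.stream();
                default:
                    throw new IllegalStateException("unknown action in timeline " + i.action);
            }
        }).distinct().collect(Collectors.toList());
    }

    // Same lookup with an extra branch for replacecommit (the assumed direction of a fix).
    static List<String> affectedPartitionsWithReplace(List<Instant> timeline) {
        return timeline.stream().flatMap(i -> {
            switch (i.action) {
                case "commit":
                case "deltacommit":
                case "replacecommit":
                    return i.partitions.stream();
                default:
                    throw new IllegalStateException("unknown action in timeline " + i.action);
            }
        }).distinct().collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Instant> timeline = Arrays.asList(
            new Instant("commit", Arrays.asList("2021/01/04")),
            new Instant("replacecommit", Arrays.asList("2021/01/04", "2021/01/05")));

        try {
            affectedPartitionsStrict(timeline);
        } catch (IllegalStateException e) {
            System.out.println("strict lookup failed: " + e.getMessage());
        }
        System.out.println("with replacecommit handled: " + affectedPartitionsWithReplace(timeline));
    }
}
```

In this sketch the strict variant fails as soon as a clustering instant appears on the timeline, which matches the sequence in the log: clustering is scheduled at 16:45:05 and the next hive sync node fails at 16:45:22.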



--
This message was sent by Atlassian Jira
(v8.3.4#803005)