Posted to commits@hudi.apache.org by "Ethan Guo (Jira)" <ji...@apache.org> on 2021/11/09 05:40:00 UTC

[jira] [Commented] (HUDI-2332) Implement scheduling of compaction/ clustering for Kafka Connect

    [ https://issues.apache.org/jira/browse/HUDI-2332?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17440909#comment-17440909 ] 

Ethan Guo commented on HUDI-2332:
---------------------------------

clusteringjob.properties:
{code:java}
hoodie.datasource.write.recordkey.field=volume
hoodie.datasource.write.partitionpath.field=date
hoodie.deltastreamer.schemaprovider.registry.url=http://localhost:8081/subjects/hudi-test-topic/versions/latest


hoodie.clustering.plan.strategy.target.file.max.bytes=1073741824
hoodie.clustering.plan.strategy.small.file.limit=629145600
hoodie.clustering.execution.strategy.class=org.apache.hudi.client.clustering.run.strategy.SparkSortAndSizeExecutionStrategy
hoodie.clustering.plan.strategy.sort.columns=volume


hoodie.write.concurrency.mode=single_writer
{code}
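
The two size settings above are raw byte counts, which are hard to read at a glance. As a minimal sketch (not part of the original comment), the snippet below parses those two lines with `java.util.Properties` and prints them in MiB. The class name `ClusteringSizes` and its helper methods are hypothetical and exist only for illustration. Only the two keys and their values come from the properties file.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, for illustration only: not part of Hudi.
public class ClusteringSizes {
    // Convert a byte count to whole mebibytes (1 MiB = 1024 * 1024 bytes).
    static long toMiB(long bytes) {
        return bytes / (1024L * 1024L);
    }

    // Parse properties-format text and return the value of the given key as a long.
    static long readBytes(String propertiesText, String key) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return Long.parseLong(props.getProperty(key).trim());
    }

    public static void main(String[] args) {
        // The two size-related lines copied from clusteringjob.properties.
        String text = String.join("\n",
            "hoodie.clustering.plan.strategy.target.file.max.bytes=1073741824",
            "hoodie.clustering.plan.strategy.small.file.limit=629145600");

        long maxBytes = readBytes(text,
            "hoodie.clustering.plan.strategy.target.file.max.bytes");
        long smallLimit = readBytes(text,
            "hoodie.clustering.plan.strategy.small.file.limit");

        System.out.println("target.file.max.bytes = " + toMiB(maxBytes) + " MiB");
        System.out.println("small.file.limit      = " + toMiB(smallLimit) + " MiB");
    }
}
```

Running `main` reports 1024 MiB (1 GiB) for the target file size and 600 MiB for the small-file limit.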

> Implement scheduling of compaction/ clustering for Kafka Connect
> ----------------------------------------------------------------
>
>                 Key: HUDI-2332
>                 URL: https://issues.apache.org/jira/browse/HUDI-2332
>             Project: Apache Hudi
>          Issue Type: Sub-task
>            Reporter: Rajesh Mahindra
>            Assignee: Ethan Guo
>            Priority: Blocker
>              Labels: pull-request-available
>             Fix For: 0.10.0
>
>
> * Implement compaction/clustering etc. from Java client
> * Schedule from Coordinator



--
This message was sent by Atlassian Jira
(v8.20.1#820001)