Posted to commits@hudi.apache.org by "liujinhui (Jira)" <ji...@apache.org> on 2020/06/09 01:22:00 UTC

[jira] [Commented] (HUDI-914) support different target data clusters

    [ https://issues.apache.org/jira/browse/HUDI-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17128757#comment-17128757 ] 

liujinhui commented on HUDI-914:
--------------------------------

Some business teams want the Hudi dataset to appear only on their own clusters, and they do not want to be concerned with the specific tasks that write it.
[~vinoth]

> support different target data clusters
> --------------------------------------
>
>                 Key: HUDI-914
>                 URL: https://issues.apache.org/jira/browse/HUDI-914
>             Project: Apache Hudi
>          Issue Type: New Feature
>          Components: DeltaStreamer
>            Reporter: liujinhui
>            Assignee: liujinhui
>            Priority: Major
>   Original Estimate: 168h
>  Remaining Estimate: 168h
>
> Currently, Hudi DeltaStreamer does not support writing to different target clusters. The specific scenario is as follows: Hudi tasks generally run on an independent cluster. To write data to a target data cluster, you generally rely on that cluster's core-site.xml and hdfs-site.xml. Sometimes data must be written to a different target data cluster, but the cluster running the Hudi task does not have the core-site.xml and hdfs-site.xml of that target cluster. Although data can be written by specifying the NameNode IP address of the target cluster directly, this loses HDFS high availability. I therefore plan to take the contents of the target cluster's core-site.xml and hdfs-site.xml as configuration items and set them in Hudi's dfs-source.properties or kafka-source.properties file.
> Is there a better way to solve this problem?
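
To make the proposal above concrete, here is a minimal sketch of the client-side HDFS HA entries that would normally come from the target cluster's hdfs-site.xml, expressed as flat key=value pairs that could be placed in a properties file. The keys are the standard Hadoop HA client settings. The class name TargetClusterHaConf, the nameservice "targetns", and the host names are hypothetical and used only for illustration. Whether DeltaStreamer forwards such entries to the Hadoop configuration is exactly what this issue proposes, so this is not a description of existing behavior.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper (not part of Hudi or Hadoop): builds the client-side
// HDFS HA settings for one target nameservice as flat key/value pairs.
class TargetClusterHaConf {

    static Map<String, String> build(String nameservice, List<String> rpcAddresses) {
        Map<String, String> conf = new LinkedHashMap<>();
        StringJoiner ids = new StringJoiner(",");

        // Logical name of the target cluster; paths then use hdfs://<nameservice>/...
        conf.put("dfs.nameservices", nameservice);

        // One RPC address per NameNode, under generated ids nn1, nn2, ...
        for (int i = 0; i < rpcAddresses.size(); i++) {
            String id = "nn" + (i + 1);
            ids.add(id);
            conf.put("dfs.namenode.rpc-address." + nameservice + "." + id,
                     rpcAddresses.get(i));
        }
        conf.put("dfs.ha.namenodes." + nameservice, ids.toString());

        // Standard Hadoop class that lets the client fail over between NameNodes.
        conf.put("dfs.client.failover.proxy.provider." + nameservice,
                 "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider");
        return conf;
    }

    public static void main(String[] args) {
        // Assumed example values; replace with the target cluster's real settings.
        Map<String, String> conf = build("targetns",
                List.of("nn1.example.com:8020", "nn2.example.com:8020"));
        conf.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

With entries like these available to the HDFS client, the target path can refer to the logical nameservice (for example hdfs://targetns/...) instead of a single NameNode IP address, which is what preserves high availability.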



--
This message was sent by Atlassian Jira
(v8.3.4#803005)