Posted to commits@hudi.apache.org by "liujinhui (Jira)" <ji...@apache.org> on 2020/06/09 01:22:00 UTC
[jira] [Commented] (HUDI-914) support different target data clusters
[ https://issues.apache.org/jira/browse/HUDI-914?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17128757#comment-17128757 ]
liujinhui commented on HUDI-914:
--------------------------------
Some business teams want the Hudi dataset to appear only on their own clusters, and they do not want to deal with the specific tasks that write it.
[~vinoth]
> support different target data clusters
> --------------------------------------
>
> Key: HUDI-914
> URL: https://issues.apache.org/jira/browse/HUDI-914
> Project: Apache Hudi
> Issue Type: New Feature
> Components: DeltaStreamer
> Reporter: liujinhui
> Assignee: liujinhui
> Priority: Major
> Original Estimate: 168h
> Remaining Estimate: 168h
>
> Currently, Hudi DeltaStreamer does not support writing to a different target cluster. The scenario is as follows: Hudi tasks generally run on an independent cluster. To write data to a target data cluster, you generally rely on core-site.xml and hdfs-site.xml. Sometimes, however, data must be written to a different target data cluster, and the cluster running the Hudi task does not have the core-site.xml and hdfs-site.xml of that target cluster. Specifying the NameNode IP address of the target cluster makes the write possible, but this loses HDFS high availability. I therefore plan to take the contents of the target cluster's core-site.xml and hdfs-site.xml files and set them as configuration items in Hudi's dfs-source.properties or kafka-source.properties file.
> Is there a better way to solve this problem?
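
For illustration, here is a minimal sketch of which client-side settings from the target cluster's hdfs-site.xml would have to be carried over to keep NameNode failover working. The key names are the standard Hadoop HA client keys. The nameservice name, NameNode ids, and host names are hypothetical, and the sketch assumes that DeltaStreamer would forward these entries from the properties file into the Hadoop configuration, which is what this issue proposes rather than an existing behaviour.

```python
# Sketch only: builds the HDFS HA client keys for one remote nameservice.
# The helper names and the example cluster values are hypothetical.

FAILOVER_PROVIDER = (
    "org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider"
)


def ha_client_properties(nameservice, namenodes):
    """Return the client-side keys needed to resolve an HA nameservice.

    namenodes maps a logical NameNode id (e.g. "nn1") to its "host:port"
    RPC address on the target cluster.
    """
    if len(namenodes) < 2:
        raise ValueError("HDFS HA needs at least two NameNodes")
    props = {
        "dfs.nameservices": nameservice,
        # Comma-separated logical ids of the NameNodes in this nameservice.
        "dfs.ha.namenodes." + nameservice: ",".join(namenodes),
        # Class the HDFS client uses to find the active NameNode.
        "dfs.client.failover.proxy.provider." + nameservice: FAILOVER_PROVIDER,
    }
    for nn_id, address in namenodes.items():
        props["dfs.namenode.rpc-address.%s.%s" % (nameservice, nn_id)] = address
    return props


def to_properties_text(props):
    """Render the keys as key=value lines for a .properties file."""
    return "\n".join("%s=%s" % (key, value) for key, value in props.items())


if __name__ == "__main__":
    # Hypothetical target cluster with two NameNodes.
    example = ha_client_properties(
        "targetcluster",
        {"nn1": "nn1.target.example.com:8020", "nn2": "nn2.target.example.com:8020"},
    )
    print(to_properties_text(example))
```

With settings like these in place, the target path could refer to the logical nameservice (for example hdfs://targetcluster/...) instead of a single NameNode address, so a NameNode failover on the target cluster would not break the write.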
--
This message was sent by Atlassian Jira
(v8.3.4#803005)