Posted to commits@hudi.apache.org by yi...@apache.org on 2022/09/06 02:03:20 UTC

[hudi] branch asf-site updated: [HUDI-4634] Update schema provider configuration in MTDS blog (#6420)

This is an automated email from the ASF dual-hosted git repository.

yihua pushed a commit to branch asf-site
in repository https://gitbox.apache.org/repos/asf/hudi.git


The following commit(s) were added to refs/heads/asf-site by this push:
     new f89d6fe2d8 [HUDI-4634] Update schema provider configuration in MTDS blog (#6420)
f89d6fe2d8 is described below

commit f89d6fe2d8e6108de9b0da61861f8699ac4a62c0
Author: Pratyaksh Sharma <pr...@gmail.com>
AuthorDate: Tue Sep 6 07:33:13 2022 +0530

    [HUDI-4634] Update schema provider configuration in MTDS blog (#6420)
---
 ...2020-08-22-ingest-multiple-tables-using-hudi.md | 31 ++++++++++++++++++++++
 1 file changed, 31 insertions(+)

diff --git a/website/blog/2020-08-22-ingest-multiple-tables-using-hudi.md b/website/blog/2020-08-22-ingest-multiple-tables-using-hudi.md
index 9f68c8cf13..a340edf5a7 100644
--- a/website/blog/2020-08-22-ingest-multiple-tables-using-hudi.md
+++ b/website/blog/2020-08-22-ingest-multiple-tables-using-hudi.md
@@ -47,6 +47,37 @@ hoodie.deltastreamer.ingestion.db1.table1.configFile=s3:///tmp/config_table1.pro
 hoodie.deltastreamer.ingestion.db2.table2.configFile=s3:///tmp/config_table2.properties
 ``` 
 
+### Configuring schema providers
+
+You can configure a different schema provider for each table, or use the same schema provider class for all tables. Set the property `hoodie.deltastreamer.schemaprovider.class` according to your use case, as shown below:
+
+```java
+hoodie.deltastreamer.schemaprovider.class=org.apache.hudi.utilities.schema.FilebasedSchemaProvider
+```
+
+You can also configure different source and target schema registry URLs when `SchemaRegistryProvider` is the schema provider class. `HoodieMultiTableDeltaStreamer` was originally designed for use cases where the subject naming strategy is [TopicNameStrategy](https://docs.confluent.io/platform/current/schema-registry/serdes-develop/index.html#subject-name-strategy), which is the default provided by Confluent.
+With this default strategy, the subject name is derived from the Kafka topic name (for record values, `<topic>-value`). With TopicNameStrategy, the source and target schema registry URLs can be configured as shown below:
+
+```java
+hoodie.deltastreamer.schemaprovider.registry.baseUrl=http://localhost:8081/subjects/
+hoodie.deltastreamer.schemaprovider.registry.urlSuffix=-value/versions/latest
+```
+
+To consume different versions of your source and target subjects, configure separate source and target URL suffixes as shown below:
+
+```java
+hoodie.deltastreamer.schemaprovider.registry.baseUrl=http://localhost:8081/subjects/
+hoodie.deltastreamer.schemaprovider.registry.sourceUrlSuffix=-value/versions/latest
+hoodie.deltastreamer.schemaprovider.registry.targetUrlSuffix=-value/versions/1
+```
+
+The most straightforward option is to specify the complete source and target schema registry URLs directly, as shown below:
+
+```java
+hoodie.deltastreamer.schemaprovider.registry.url=http://localhost:8081/subjects/random-value/versions/latest
+hoodie.deltastreamer.schemaprovider.registry.targetUrl=http://localhost:8081/subjects/random-value/versions/latest
+```
+
 ### Run Command
 
 `HoodieMultiTableDeltaStreamer` can be run similar to how one runs `HoodieDeltaStreamer`. Please refer to the example given below for the command.