Posted to commits@hudi.apache.org by GitBox <gi...@apache.org> on 2021/08/31 15:28:30 UTC

[GitHub] [hudi] vinothchandar commented on a change in pull request #3566: HUDI-2348 Fixed typos in blog "Schema evolution with DeltaStreamer using KafkaSource"

vinothchandar commented on a change in pull request #3566:
URL: https://github.com/apache/hudi/pull/3566#discussion_r699409791



##########
File path: website/blog/2021-08-16-kafka-custom-deserializer.md
##########
@@ -5,33 +5,33 @@ author: sbernauer
 category: blog
 ---
 
-The schema used for data exchange between services can change change rapidly with new business requirements.
-Apache Hudi is often used in combination with kafka as a event stream where all events are transmitted according to an record schema.
+The schema used for data exchange between services can change rapidly with new business requirements.
+Apache Hudi is often used in combination with kafka as a event stream where all events are transmitted according to a record schema.
 In our case a Confluent schema registry is used to maintain the schema and as schema evolves, newer versions are updated in the schema registry.
 <!--truncate-->
 
 ## What do we want to achieve?
 We have multiple instances of DeltaStreamer running, consuming many topics with different schemas ingesting to multiple Hudi tables. Deltastreamer is a utility in Hudi to assist in ingesting data from multiple sources like DFS, kafka, etc into Hudi. If interested, you can read more about DeltaStreamer tool [here](https://hudi.apache.org/docs/writing_data#deltastreamer)
-Ideally every Topic should be able to evolve the schema to match new business requirements. Consumers start producing data with a new schema version and the DeltaStreamer picks up the new schema and ingests the data with the new schema. For this to work, we run our DeltaStreamer instances with the latest schema version available from the Schema Registry to ensure that we always use the freshest schema with all attributes.
-A prerequisites it that all the mentioned Schema evolutions must be `BACKWARD_TRANSITIVE` compatible (see [Schema Evolution and Compatibility of Avro Schema changes](https://docs.confluent.io/platform/current/schema-registry/avro.html). This ensures that every record in the kafka topic can always be read using the latest schema.
+Ideally every topic should be able to evolve the schema to match new business requirements. Producers start producing data with a new schema version and the DeltaStreamer picks up the new schema and ingests the data with the new schema. For this to work, we run our DeltaStreamer instances with the latest schema version available from the Schema Registry to ensure that we always use the freshest schema with all attributes.
+A prerequisites is that all the mentioned Schema evolutions must be `BACKWARD_TRANSITIVE` compatible (see [Schema Evolution and Compatibility of Avro Schema changes](https://docs.confluent.io/platform/current/schema-registry/avro.html). This ensures that every record in the kafka topic can always be read using the latest schema.

Review comment:
       "a prerequisite"?
   
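As an aside for readers of this thread: the quoted blog text describes running DeltaStreamer against the latest schema version available in the Confluent schema registry. A minimal sketch of such a launch follows; the jar name, topic, paths, and registry URL are placeholders, not the blog's exact command:

    spark-submit \
      --class org.apache.hudi.utilities.deltastreamer.HoodieDeltaStreamer \
      hudi-utilities-bundle.jar \
      --table-type COPY_ON_WRITE \
      --source-class org.apache.hudi.utilities.sources.AvroKafkaSource \
      --schemaprovider-class org.apache.hudi.utilities.schema.SchemaRegistryProvider \
      --source-ordering-field ts \
      --target-base-path /tmp/hudi/users \
      --target-table users \
      --hoodie-conf bootstrap.servers=localhost:9092 \
      --hoodie-conf hoodie.deltastreamer.source.kafka.topic=users \
      --hoodie-conf hoodie.deltastreamer.schemaprovider.registry.url=http://localhost:8081/subjects/users-value/versions/latest

Pointing hoodie.deltastreamer.schemaprovider.registry.url at the subject's /versions/latest endpoint is what keeps the pipeline on the freshest schema, as the post describes.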

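Likewise, the BACKWARD_TRANSITIVE requirement in the last quoted paragraph can be illustrated with a hypothetical Avro evolution: version 2 only adds a field with a default, so a reader holding the newest schema can still decode every older record in the topic.

    Version 1:
    {"type": "record", "name": "User", "fields": [
      {"name": "id", "type": "long"},
      {"name": "name", "type": "string"}
    ]}

    Version 2 (adds an optional field with a default):
    {"type": "record", "name": "User", "fields": [
      {"name": "id", "type": "long"},
      {"name": "name", "type": "string"},
      {"name": "email", "type": ["null", "string"], "default": null}
    ]}

Adding a field without a default, or changing a field's type, would break this guarantee, and the registry would reject the new version under that compatibility setting.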


