Posted to dev@eventmesh.apache.org by GitBox <gi...@apache.org> on 2021/08/20 02:51:06 UTC

[GitHub] [incubator-eventmesh] JunjieChou commented on a change in pull request #502: [ISSUE #339] add design doc for integrating OpenSchema

JunjieChou commented on a change in pull request #502:
URL: https://github.com/apache/incubator-eventmesh/pull/502#discussion_r692620496



##########
File path: docs/en/features/eventmesh-schemaregistry-design.md
##########
@@ -0,0 +1,71 @@
+# EventMesh SchemaRegistry (OpenSchema)
+
+## Introduction
+
+[EventMesh(incubating)](https://github.com/apache/incubator-eventmesh) is a dynamic cloud-native eventing infrastructure.
+
+## An Overview of Schema and Schema Registry
+
+### Schema
+
+A Schema describes serialized instances (string/stream/file/...) and has two properties. First, it is itself written in a serialization format. Second, it defines the requirements that such serialized instances should satisfy.
+
+Besides describing a serialized instance, a Schema may also be used to validate whether an instance is legitimate, because it defines the ```type``` (and other properties) of an instance and of the keys inside it. Taking JSON Schema as an example, it can not only be referred to when handling a JSON string, but also be used to validate whether the string satisfies the properties defined in the schema[[1]](#References).
+
+Common examples are JSON Schema, Protobuf Schema, and Avro Schema, which describe JSON instances, Protobuf instances, and Avro instances respectively.
+
+
+### Schema Registry
+
+Schema Registry is a server that provides RESTful interfaces. It can receive and store Schemas from clients, as well as provide interfaces for other clients to retrieve Schemas from it.
+
+It can be applied to the validation process and the (de-)serialization process.
+
+### A Comparison of Schema Registry in Other Projects
+
+Project | Application
+:---: | :---
+EMQ[[2]](#References) | Mainly in (de-)serialization process. Use "Schema Registry" and "Rule Matching" to transfer a message from one serialization format to another.
+Pulsar[[3]](#References) | Mainly in validation process. Use "Schema Registry" to validate a message.
+Confluentinc[[4]](#References) | In both validation and (de-)serialization process.
+
+## An Overview of OpenSchema
+
+OpenSchema[[5]](#References) proposes a specification for data schemas used when exchanging messages and events in modern cloud-native applications. It designs a RESTful interface for storing and retrieving schemas such as Avro, JSON Schema, and Protobuf3 from three aspects.
+
+
+## Requirements(Goals)
+
+| Requirement ID | Requirement Description                                      | Comments      |
+| :------------- | ------------------------------------------------------------ | ------------- |
+| F-1            | A message from a producer can be understood (its serialization type is known) by a consumer without a contract between them. | Functionality |
+| F-2            | The message content from a producer can be validated for correct serialization against the consumer's schema. | Functionality |
+
+
+## Design Details
+
+### Architecture
+
+![OpenSchema](https://user-images.githubusercontent.com/28994988/129255292-e61acc87-5250-4be5-ac9c-a099f6ef157c.png)
+
+### LifeCycle of Schema
+
+The high-level lifecycle of a schema in messages goes through 9 steps as follows:
+- step1: Producer registers a schema to OpenSchema Registry through OpenSchema service plugin.

Review comment:
   As far as I understand, the schema plugin provides (de-)serialization, validation, and caching, and exists in both eventmesh-runtime and the client SDK.
   
   The reason for having the client connect to the Schema Registry directly is that registering a schema then takes one step. Having the client connect to EventMesh, which in turn connects to the Schema Registry, takes two steps, so the direct path saves one step.
   However, the direct path requires EventMesh to retrieve the schema when a producer message arrives, which the path through EventMesh can skip because the schema is cached during registration. So I think both are OK.
   
   Of course, the client could connect only to eventmesh-runtime if you think that is necessary. I will design this soon.
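
   To make the trade-off concrete, here is a rough sketch of the two registration paths. It is only an illustration under my own assumptions: `SchemaRegistry`, `Runtime`, `direct_flow`, and `via_runtime_flow` are hypothetical in-memory stand-ins, not EventMesh or OpenSchema APIs. It counts how many times the runtime has to fetch a schema from the registry when the first producer message arrives.

```python
class SchemaRegistry:
    """Hypothetical in-memory stand-in for an OpenSchema registry."""

    def __init__(self):
        self._schemas = {}
        self.lookups = 0  # number of retrievals served to the runtime

    def register(self, subject, schema):
        self._schemas[subject] = schema

    def get(self, subject):
        self.lookups += 1
        return self._schemas[subject]


class Runtime:
    """Hypothetical stand-in for eventmesh-runtime with a schema cache."""

    def __init__(self, registry):
        self.registry = registry
        self.cache = {}

    def register(self, subject, schema):
        # Two-step path: client -> runtime -> registry, caching on the way.
        self.registry.register(subject, schema)
        self.cache[subject] = schema

    def on_message(self, subject):
        # Fetch from the registry only on a cache miss.
        if subject not in self.cache:
            self.cache[subject] = self.registry.get(subject)
        return self.cache[subject]


def direct_flow():
    """Client registers directly with the registry (one step)."""
    registry = SchemaRegistry()
    runtime = Runtime(registry)
    registry.register("order", {"type": "object"})
    runtime.on_message("order")
    return registry.lookups


def via_runtime_flow():
    """Client registers through the runtime (two steps)."""
    registry = SchemaRegistry()
    runtime = Runtime(registry)
    runtime.register("order", {"type": "object"})
    runtime.on_message("order")
    return registry.lookups
```

   In this sketch the direct path costs one registry lookup on the first message, while the path through the runtime costs none, at the price of an extra hop during registration.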




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@eventmesh.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org


