Posted to commits@pulsar.apache.org by si...@apache.org on 2019/08/02 13:01:18 UTC

[pulsar] branch master updated: [Doc] Add contents for *Get Started (Schema)* (#4859)

This is an automated email from the ASF dual-hosted git repository.

sijie pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/pulsar.git


The following commit(s) were added to refs/heads/master by this push:
     new 9c68f19  [Doc] Add contents for *Get Started (Schema)* (#4859)
9c68f19 is described below

commit 9c68f19670676f50f4f1fd5019aa1ca89cdb09c6
Author: Anonymitaet <50...@users.noreply.github.com>
AuthorDate: Fri Aug 2 21:01:12 2019 +0800

    [Doc] Add contents for *Get Started (Schema)* (#4859)
---
 site2/docs/schema-get-started.md | 32 +++++++++++++++++++++++++++++---
 1 file changed, 29 insertions(+), 3 deletions(-)

diff --git a/site2/docs/schema-get-started.md b/site2/docs/schema-get-started.md
index 9c001b4..fd9279f 100644
--- a/site2/docs/schema-get-started.md
+++ b/site2/docs/schema-get-started.md
@@ -4,6 +4,32 @@ title: Get started
 sidebar_label: Get started
 ---
 
+## Schema Registry
+
+Type safety is extremely important in any application built around a message bus like Pulsar. 
+
+Producers and consumers need a mechanism for coordinating types at the topic level to avoid potential problems, such as serialization and deserialization issues.
+
+Applications typically adopt one of the following approaches to guarantee type safety in messaging. Both approaches are available in Pulsar, and you're free to adopt one or the other or to mix and match on a per-topic basis.
+
+### Client-side approach
+
+Producers and consumers are responsible for not only serializing and deserializing messages (which consist of raw bytes) but also "knowing" which types are being transmitted via which topics. 
+
+If a producer is sending temperature sensor data on the topic `topic-1`, consumers of that topic will run into trouble if they attempt to parse that data as moisture sensor readings.
+
+Producers and consumers can send and receive messages consisting of raw byte arrays and leave all type safety enforcement to the application on an "out-of-band" basis.
+
+### Server-side approach 
+
+Producers and consumers inform the system which data types can be transmitted via the topic. 
+
+With this approach, the messaging system enforces type safety and ensures that producers and consumers remain synced.
+
+Pulsar has a built-in **schema registry** that enables clients to upload data schemas on a per-topic basis. Those schemas dictate which data types are recognized as valid for that topic.
+
+## Why use schema
+
 When a schema is enabled, Pulsar does parse data; it takes bytes as inputs and sends bytes as outputs. Because data has meaning beyond bytes, you need to parse it, and you might encounter parse exceptions, which mainly occur in the following situations:
 
 * The field does not exist
@@ -27,7 +53,7 @@ public class User {
 
 When constructing a producer with the _User_ class, you can either specify a schema or omit it, as shown below.
 
-## Without schema
+### Without schema
 
 If you construct a producer without specifying a schema, then the producer can only produce messages of type `byte[]`. If you have a POJO class, you need to serialize the POJO into bytes before sending messages.
 
@@ -41,7 +67,7 @@ User user = new User("Tom", 28);
 byte[] message = … // serialize the `user` by yourself;
 producer.send(message);
 ```
-## With schema
+### With schema
 
 If you construct a producer and specify a schema, you can send an instance of a class to a topic directly without worrying about how to serialize POJOs into bytes.
 
@@ -57,6 +83,6 @@ User user = new User("Tom", 28);
 producer.send(user);
 ```
 
-## Summary
+### Summary
 
 When constructing a producer with a schema, you do not need to serialize messages into bytes; instead, the Pulsar schema does this job in the background.
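
The contrast between the two producer styles can be sketched in plain Java without the Pulsar client. This is an illustrative model only, not Pulsar's implementation or API: `Topic`, `TypedProducer`, `serialize`, and `deserialize` are hypothetical names, and the `User` fields (`name`, `age`) are assumed from the `new User("Tom", 28)` call in the doc.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

public class SchemaSketch {
    // Assumed shape of the User POJO (fields inferred, not taken from the doc).
    static final class User {
        final String name;
        final int age;
        User(String name, int age) { this.name = name; this.age = age; }
    }

    // Hypothetical stand-in for a topic: it only ever stores raw bytes.
    static final class Topic {
        final List<byte[]> messages = new ArrayList<>();
    }

    // Hand-written encoding: 4-byte age followed by the UTF-8 name.
    static byte[] serialize(User u) {
        byte[] name = u.name.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(4 + name.length).putInt(u.age).put(name).array();
    }

    // Inverse of serialize; fails if the bytes were written in another layout.
    static User deserialize(byte[] data) {
        ByteBuffer buf = ByteBuffer.wrap(data);
        int age = buf.getInt();
        byte[] name = new byte[buf.remaining()];
        buf.get(name);
        return new User(new String(name, StandardCharsets.UTF_8), age);
    }

    // Hypothetical typed producer: the encoder plays the role of the schema,
    // so callers pass objects and never touch bytes.
    static final class TypedProducer<T> {
        private final Topic topic;
        private final Function<T, byte[]> encoder;
        TypedProducer(Topic topic, Function<T, byte[]> encoder) {
            this.topic = topic;
            this.encoder = encoder;
        }
        void send(T value) { topic.messages.add(encoder.apply(value)); }
    }

    public static void main(String[] args) {
        Topic topic = new Topic();

        // "Without schema": the application serializes before sending.
        topic.messages.add(serialize(new User("Tom", 28)));

        // "With schema": serialization happens inside the producer.
        TypedProducer<User> producer = new TypedProducer<>(topic, SchemaSketch::serialize);
        producer.send(new User("Tom", 28));

        User received = deserialize(topic.messages.get(1));
        System.out.println(received.name + " " + received.age); // prints "Tom 28"
    }
}
```

In both cases the topic holds identical bytes; the difference is only where the encoding logic lives and who is responsible for keeping it consistent.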