Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2022/06/09 07:56:57 UTC

[GitHub] [pulsar] Anonymitaet commented on a diff in pull request #15809: Add new draft of architecture-overview.md and accompanying images, add under-construction.md and accompanying images

Anonymitaet commented on code in PR #15809:
URL: https://github.com/apache/pulsar/pull/15809#discussion_r893129507


##########
site2/docs/architecture-overview.md:
##########
@@ -0,0 +1,143 @@
+---
+
+id: concepts-architecture-overview
+
+title: Architecture overview
+
+sidebar_label: Concepts

Review Comment:
   1. We already have a chapter named "Architecture", and its title is also "Architecture overview":
   https://pulsar.apache.org/docs/next/concepts-architecture-overview
   Do you want to create a new chapter named "Concepts" that sits parallel to "Architecture"? Can you explain your intention?
   
   <img width="327" alt="image" src="https://user-images.githubusercontent.com/50226895/172788806-97e12166-d829-480c-9ea7-4364c110dca0.png">
   <img width="472" alt="image" src="https://user-images.githubusercontent.com/50226895/172788927-a07cb63a-a760-43e3-9c2c-a1e9c38d44f8.png">
   
   
   2. As we discussed yesterday, to reduce potential issues, we'd better keep this metadata format the same as in other files, that is, remove the blank lines and add quotes.
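
   A minimal sketch of what item 2 could look like, using the values from this PR with the blank lines removed and the values quoted. Exactly which values are quoted in other files is an assumption here, so compare against an existing doc before copying:

   ```md
   ---
   id: concepts-architecture-overview
   title: "Architecture overview"
   sidebar_label: "Concepts"
   ---
   ```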



##########
site2/docs/architecture-overview.md:
##########
@@ -0,0 +1,143 @@
+---
+
+id: concepts-architecture-overview
+
+title: Architecture overview
+
+sidebar_label: Concepts
+
+---
+
+The following overview describes the components that make up a Pulsar cluster, from general to specific.  
+
+### Instance
+
+***
+
+A Pulsar instance is composed of one or more Pulsar clusters. Clusters within an instance can [replicate](concepts-replication.md) data amongst themselves.
+
+### Cluster
+
+***
+
+![Pulsar architecture diagram](/assets/pulsar-system-architecture.svg)
+
+In a Pulsar cluster:
+
+* One or more **brokers** handles and load balances incoming messages from **producers**, dispatches **messages** to **consumers**, communicates with the Pulsar **configuration store** to handle various coordination tasks, stores messages in BookKeeper instances (aka **bookies**), relies on a cluster-specific ZooKeeper cluster for certain tasks, and more.
+
+* A BookKeeper cluster consisting of one or more bookies handles [persistent storage](#persistent-storage) of messages.
+
+* A ZooKeeper cluster specific to that cluster handles coordination tasks between Pulsar clusters.
+
+An instance-wide ZooKeeper cluster called the Configuration Store handles coordination tasks involving multiple clusters, for example [geo-replication](concepts-replication.md).
+
+For a guide to managing Pulsar clusters, see the [clusters](admin-api-clusters.md) guide.

Review Comment:
   Grammar question: is "managing" used here rather than "manage" because "to" is part of a noun + preposition combination ("a guide to")?



##########
site2/docs/architecture-overview.md:
##########
@@ -0,0 +1,143 @@
+---
+
+id: concepts-architecture-overview
+
+title: Architecture overview
+
+sidebar_label: Concepts
+
+---
+
+The following overview describes the components that make up a Pulsar cluster, from general to specific.  
+
+### Instance
+
+***
+
+A Pulsar instance is composed of one or more Pulsar clusters. Clusters within an instance can [replicate](concepts-replication.md) data amongst themselves.
+
+### Cluster
+
+***
+
+![Pulsar architecture diagram](/assets/pulsar-system-architecture.svg)
+
+In a Pulsar cluster:
+
+* One or more **brokers** handles and load balances incoming messages from **producers**, dispatches **messages** to **consumers**, communicates with the Pulsar **configuration store** to handle various coordination tasks, stores messages in BookKeeper instances (aka **bookies**), relies on a cluster-specific ZooKeeper cluster for certain tasks, and more.
+
+* A BookKeeper cluster consisting of one or more bookies handles [persistent storage](#persistent-storage) of messages.
+
+* A ZooKeeper cluster specific to that cluster handles coordination tasks between Pulsar clusters.
+
+An instance-wide ZooKeeper cluster called the Configuration Store handles coordination tasks involving multiple clusters, for example [geo-replication](concepts-replication.md).
+
+For a guide to managing Pulsar clusters, see the [clusters](admin-api-clusters.md) guide.
+
+### Producer
+
+***
+
+A producer is a process that attaches to a topic and publishes messages to a Pulsar [broker](reference-terminology.md#broker). The Pulsar broker processes the messages.
+
+Refer to the [producer](concepts-producer.md) topic for more information.
+
+### Topic
+
+***
+
+![Topic](/assets/producer-topic-consumer.svg)
+
+As in other pub-sub systems, topics in Pulsar are named channels for transmitting messages from producers to consumers. Topic names are URLs that have a well-defined structure:
+
+```http
+
+{persistent|non-persistent}://tenant/namespace/topic
+
+```
+
+| Topic name component | Description |
+|:--------------------|:-----------|
+| persistent / non-persistent | This identifies the type of topic. Pulsar supports two kind of topics: [persistent](concepts-architecture-overview.md#persistent-storage) and [non-persistent](#non-persistent-topics). The default is persistent, so if you do not specify a type, the topic is persistent. With persistent topics, all messages are durably persisted on disks (if the broker is not standalone, messages are durably persisted on multiple disks), whereas data for non-persistent topics is not persisted to storage disks.
+tenant             | The topic tenant within the instance. Tenants are essential to multi-tenancy in Pulsar, and spread across clusters.
+|`namespace`          | The administrative unit of the topic, which acts as a grouping mechanism for related topics. Most topic configuration is performed at the [namespace](#namespaces) level. Each tenant has one or multiple namespaces.
+|topic              | The final part of the name. Topic names have no special meaning in a Pulsar instance.
+
+![tenants](/assets/tenants.svg)
+
+Refer to [topic](concepts-topic.md) for more information.
+
+### Consumer
+
+***
+
+A consumer is a process that attaches to a topic via a subscription and then receives messages.
+
+A consumer sends a [flow permit request](developing-binary-protocol.md#flow-control) to a broker to get messages. There is a queue at the consumer side to receive messages pushed from the broker. You can configure the queue size with the [`receiverQueueSize`](client-libraries-java.md#configure-consumer) parameter. The default size is `1000`). Each time `consumer.receive()` is called, a message is dequeued from the buffer.  
+
+Refer to the [consumer](concepts-consumer.md) topic for more information.
+
+### Broker
+
+***
+
+The **Pulsar message broker** is a stateless component that's primarily responsible for running two other components:
+
+* An HTTP server that exposes an {@inject: rest:REST:/} API for both administrative tasks and [topic lookup](concepts-clients.md#client-setup-phase) for producers and consumers. The producers connect to the brokers to publish messages and the consumers connect to the brokers to consume the messages.
+
+* A dispatcher, which is an asynchronous TCP server over a custom [binary protocol](developing-binary-protocol.md) used for all data transfers.
+
+![Broker](/assets/broker.svg)
+
+Messages are typically dispatched out of a [managed ledger](#managed-ledgers) cache for the sake of performance, *unless* the backlog exceeds the cache size. If the backlog grows too large for the cache, the broker will start reading entries from BookKeeper.
+
+Finally, to support geo-replication on global topics, the broker manages replicators that tail the entries published in the local region and republish them to the remote region using the Pulsar [Java client library](client-libraries-java.md).
+
+> For a guide to managing Pulsar brokers, see the [brokers](admin-api-brokers.md) guide.
+
+### Namespace
+
+***
+
+![Namespace](/assets/namespace.svg)
+
+A namespace is a logical nomenclature within a tenant. A tenant creates multiple namespaces via the [admin API](admin-api-namespaces.md#create). For instance, a tenant with different applications can create a separate namespace for each application. A namespace allows the application to create and manage a hierarchy of topics. The topic `my-tenant/app1` is a namespace for the application `app1` for `my-tenant`. You can create any number of [topics](#topics) under the namespace.
+
+### Metadata Store
+
+***
+
+The Pulsar metadata store maintains all the metadata of a Pulsar cluster, such as topic metadata, schema, broker load data, and so on. Pulsar uses [Apache ZooKeeper](https://zookeeper.apache.org/) for metadata storage, cluster configuration, and coordination. The Pulsar metadata store can be deployed on a separate ZooKeeper cluster or an existing ZooKeeper cluster. You can use one ZooKeeper cluster for both Pulsar metadata store and [BookKeeper metadata store](https://bookkeeper.apache.org/docs/latest/getting-started/concepts/#metadata-storage). If you want to deploy Pulsar brokers connected to an existing BookKeeper cluster, you need to deploy separate ZooKeeper clusters for Pulsar metadata store and BookKeeper metadata store respectively.
+
+> Pulsar also supports more metadata backend services, including [ETCD](https://etcd.io/) and [RocksDB](http://rocksdb.org/) (for standalone Pulsar only). 

Review Comment:
   If you want to highlight this part, please use an admonition so that it is rendered in color. Otherwise it has the same background as the running text (see the screenshot below) and does not stand out. Please check the layout in your local preview.
   https://docusaurus.io/docs/2.0.0-beta.20/markdown-features/admonitions
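
   For illustration, a sketch of the same note written as a Docusaurus admonition. The `:::note` type is an assumption; `:::tip` or another type from the page linked above may fit better:

   ```md
   :::note

   Pulsar also supports more metadata backend services, including [ETCD](https://etcd.io/) and [RocksDB](http://rocksdb.org/) (for standalone Pulsar only).

   :::
   ```
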
   
   <img width="1099" alt="image" src="https://user-images.githubusercontent.com/50226895/172792417-a76a3027-6940-4184-a748-519a639396b7.png">
   



##########
site2/docs/architecture-overview.md:
##########
@@ -0,0 +1,143 @@
+---
+
+id: concepts-architecture-overview
+
+title: Architecture overview
+
+sidebar_label: Concepts
+
+---
+
+The following overview describes the components that make up a Pulsar cluster, from general to specific.  
+
+### Instance
+
+***
+
+A Pulsar instance is composed of one or more Pulsar clusters. Clusters within an instance can [replicate](concepts-replication.md) data amongst themselves.
+
+### Cluster
+
+***
+
+![Pulsar architecture diagram](/assets/pulsar-system-architecture.svg)
+
+In a Pulsar cluster:
+
+* One or more **brokers** handles and load balances incoming messages from **producers**, dispatches **messages** to **consumers**, communicates with the Pulsar **configuration store** to handle various coordination tasks, stores messages in BookKeeper instances (aka **bookies**), relies on a cluster-specific ZooKeeper cluster for certain tasks, and more.
+
+* A BookKeeper cluster consisting of one or more bookies handles [persistent storage](#persistent-storage) of messages.
+
+* A ZooKeeper cluster specific to that cluster handles coordination tasks between Pulsar clusters.
+
+An instance-wide ZooKeeper cluster called the Configuration Store handles coordination tasks involving multiple clusters, for example [geo-replication](concepts-replication.md).
+
+For a guide to managing Pulsar clusters, see the [clusters](admin-api-clusters.md) guide.
+
+### Producer
+
+***
+
+A producer is a process that attaches to a topic and publishes messages to a Pulsar [broker](reference-terminology.md#broker). The Pulsar broker processes the messages.
+
+Refer to the [producer](concepts-producer.md) topic for more information.
+
+### Topic
+
+***
+
+![Topic](/assets/producer-topic-consumer.svg)
+
+As in other pub-sub systems, topics in Pulsar are named channels for transmitting messages from producers to consumers. Topic names are URLs that have a well-defined structure:
+
+```http
+
+{persistent|non-persistent}://tenant/namespace/topic
+
+```
+
+| Topic name component | Description |
+|:--------------------|:-----------|
+| persistent / non-persistent | This identifies the type of topic. Pulsar supports two kind of topics: [persistent](concepts-architecture-overview.md#persistent-storage) and [non-persistent](#non-persistent-topics). The default is persistent, so if you do not specify a type, the topic is persistent. With persistent topics, all messages are durably persisted on disks (if the broker is not standalone, messages are durably persisted on multiple disks), whereas data for non-persistent topics is not persisted to storage disks.
+tenant             | The topic tenant within the instance. Tenants are essential to multi-tenancy in Pulsar, and spread across clusters.
+|`namespace`          | The administrative unit of the topic, which acts as a grouping mechanism for related topics. Most topic configuration is performed at the [namespace](#namespaces) level. Each tenant has one or multiple namespaces.
+|topic              | The final part of the name. Topic names have no special meaning in a Pulsar instance.
+
+![tenants](/assets/tenants.svg)
+
+Refer to [topic](concepts-topic.md) for more information.
+
+### Consumer
+
+***
+
+A consumer is a process that attaches to a topic via a subscription and then receives messages.
+
+A consumer sends a [flow permit request](developing-binary-protocol.md#flow-control) to a broker to get messages. There is a queue at the consumer side to receive messages pushed from the broker. You can configure the queue size with the [`receiverQueueSize`](client-libraries-java.md#configure-consumer) parameter. The default size is `1000`). Each time `consumer.receive()` is called, a message is dequeued from the buffer.  
+
+Refer to the [consumer](concepts-consumer.md) topic for more information.
+
+### Broker
+
+***
+
+The **Pulsar message broker** is a stateless component that's primarily responsible for running two other components:
+
+* An HTTP server that exposes an {@inject: rest:REST:/} API for both administrative tasks and [topic lookup](concepts-clients.md#client-setup-phase) for producers and consumers. The producers connect to the brokers to publish messages and the consumers connect to the brokers to consume the messages.
+
+* A dispatcher, which is an asynchronous TCP server over a custom [binary protocol](developing-binary-protocol.md) used for all data transfers.
+
+![Broker](/assets/broker.svg)
+
+Messages are typically dispatched out of a [managed ledger](#managed-ledgers) cache for the sake of performance, *unless* the backlog exceeds the cache size. If the backlog grows too large for the cache, the broker will start reading entries from BookKeeper.
+
+Finally, to support geo-replication on global topics, the broker manages replicators that tail the entries published in the local region and republish them to the remote region using the Pulsar [Java client library](client-libraries-java.md).
+
+> For a guide to managing Pulsar brokers, see the [brokers](admin-api-brokers.md) guide.
+
+### Namespace
+
+***
+
+![Namespace](/assets/namespace.svg)
+
+A namespace is a logical nomenclature within a tenant. A tenant creates multiple namespaces via the [admin API](admin-api-namespaces.md#create). For instance, a tenant with different applications can create a separate namespace for each application. A namespace allows the application to create and manage a hierarchy of topics. The topic `my-tenant/app1` is a namespace for the application `app1` for `my-tenant`. You can create any number of [topics](#topics) under the namespace.
+
+### Metadata Store
+
+***
+
+The Pulsar metadata store maintains all the metadata of a Pulsar cluster, such as topic metadata, schema, broker load data, and so on. Pulsar uses [Apache ZooKeeper](https://zookeeper.apache.org/) for metadata storage, cluster configuration, and coordination. The Pulsar metadata store can be deployed on a separate ZooKeeper cluster or an existing ZooKeeper cluster. You can use one ZooKeeper cluster for both Pulsar metadata store and [BookKeeper metadata store](https://bookkeeper.apache.org/docs/latest/getting-started/concepts/#metadata-storage). If you want to deploy Pulsar brokers connected to an existing BookKeeper cluster, you need to deploy separate ZooKeeper clusters for Pulsar metadata store and BookKeeper metadata store respectively.
+
+> Pulsar also supports more metadata backend services, including [ETCD](https://etcd.io/) and [RocksDB](http://rocksdb.org/) (for standalone Pulsar only). 
+
+### Configuration Store
+
+***
+
+* A configuration store quorum stores configuration for tenants, namespaces, and other entities that need to be globally consistent.
+
+* Each cluster has its own local ZooKeeper ensemble that stores cluster-specific configuration and coordination such as which brokers are responsible for which topics as well as ownership metadata, broker load reports, BookKeeper ledger metadata, and more.
+
+The configuration store maintains all the configurations of a Pulsar instance, such as clusters, tenants, namespaces, partitioned topic related configurations, and so on. A Pulsar instance can have a single local cluster, multiple local clusters, or multiple cross-region clusters. Consequently, the configuration store can share the configurations across multiple clusters under a Pulsar instance. The configuration store can be deployed on a separate ZooKeeper cluster or deployed on an existing ZooKeeper cluster.
+
+### Persistent Messaging
+
+***
+
+Pulsar provides guaranteed message delivery for applications. If a message successfully reaches a Pulsar broker, it will be delivered to its intended target.

Review Comment:
   ```suggestion
   Pulsar provides guaranteed message delivery for applications. If a message successfully reaches a Pulsar broker, it is delivered to its intended target.
   
   ```
   Write in the simple present tense as much as possible if you are covering facts that were, are, and forever shall be true.
   https://docs.google.com/document/d/1lc5j4RtuLIzlEYCBo97AC8-U_3Erzs_lxpkDuseU0n4/edit#bookmark=id.e8uqh1awkcnp



##########
site2/docs/under-construction.md:
##########
@@ -0,0 +1,13 @@
+---
+Id: Under-construction
+title: Under construction
+Sidebar_label: 

Review Comment:
   Please keep this part consistent with the metadata in other files:
   - Use all lowercase (for example, `id` rather than `Id`).
   - Why is there no value for `sidebar_label`?
   - If you want to add a new chapter, you need to add it to https://github.com/apache/pulsar/blob/master/site2/website/sidebars.json as well.
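
   A sketch of front matter that would follow these points. The `sidebar_label` value is a hypothetical placeholder, not something proposed in the PR:

   ```md
   ---
   id: under-construction
   title: Under construction
   sidebar_label: "Under construction"
   ---
   ```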



##########
site2/docs/architecture-overview.md:
##########
@@ -0,0 +1,143 @@
+---
+
+id: concepts-architecture-overview
+
+title: Architecture overview
+
+sidebar_label: Concepts
+
+---
+
+The following overview describes the components that make up a Pulsar cluster, from general to specific.  
+
+### Instance
+
+***
+
+A Pulsar instance is composed of one or more Pulsar clusters. Clusters within an instance can [replicate](concepts-replication.md) data amongst themselves.
+
+### Cluster
+
+***
+
+![Pulsar architecture diagram](/assets/pulsar-system-architecture.svg)
+
+In a Pulsar cluster:
+
+* One or more **brokers** handles and load balances incoming messages from **producers**, dispatches **messages** to **consumers**, communicates with the Pulsar **configuration store** to handle various coordination tasks, stores messages in BookKeeper instances (aka **bookies**), relies on a cluster-specific ZooKeeper cluster for certain tasks, and more.
+
+* A BookKeeper cluster consisting of one or more bookies handles [persistent storage](#persistent-storage) of messages.
+
+* A ZooKeeper cluster specific to that cluster handles coordination tasks between Pulsar clusters.
+
+An instance-wide ZooKeeper cluster called the Configuration Store handles coordination tasks involving multiple clusters, for example [geo-replication](concepts-replication.md).
+
+For a guide to managing Pulsar clusters, see the [clusters](admin-api-clusters.md) guide.
+
+### Producer
+
+***
+
+A producer is a process that attaches to a topic and publishes messages to a Pulsar [broker](reference-terminology.md#broker). The Pulsar broker processes the messages.
+
+Refer to the [producer](concepts-producer.md) topic for more information.
+
+### Topic
+
+***
+
+![Topic](/assets/producer-topic-consumer.svg)
+
+As in other pub-sub systems, topics in Pulsar are named channels for transmitting messages from producers to consumers. Topic names are URLs that have a well-defined structure:
+
+```http
+
+{persistent|non-persistent}://tenant/namespace/topic
+
+```
+
+| Topic name component | Description |
+|:--------------------|:-----------|
+| persistent / non-persistent | This identifies the type of topic. Pulsar supports two kind of topics: [persistent](concepts-architecture-overview.md#persistent-storage) and [non-persistent](#non-persistent-topics). The default is persistent, so if you do not specify a type, the topic is persistent. With persistent topics, all messages are durably persisted on disks (if the broker is not standalone, messages are durably persisted on multiple disks), whereas data for non-persistent topics is not persisted to storage disks.
+tenant             | The topic tenant within the instance. Tenants are essential to multi-tenancy in Pulsar, and spread across clusters.
+|`namespace`          | The administrative unit of the topic, which acts as a grouping mechanism for related topics. Most topic configuration is performed at the [namespace](#namespaces) level. Each tenant has one or multiple namespaces.

Review Comment:
   Same question for `[namespace](#namespaces)`.



##########
site2/docs/architecture-overview.md:
##########
@@ -0,0 +1,143 @@
+---
+
+id: concepts-architecture-overview
+
+title: Architecture overview
+
+sidebar_label: Concepts
+
+---
+
+The following overview describes the components that make up a Pulsar cluster, from general to specific.  
+
+### Instance
+
+***
+
+A Pulsar instance is composed of one or more Pulsar clusters. Clusters within an instance can [replicate](concepts-replication.md) data amongst themselves.
+
+### Cluster
+
+***
+
+![Pulsar architecture diagram](/assets/pulsar-system-architecture.svg)
+
+In a Pulsar cluster:
+
+* One or more **brokers** handles and load balances incoming messages from **producers**, dispatches **messages** to **consumers**, communicates with the Pulsar **configuration store** to handle various coordination tasks, stores messages in BookKeeper instances (aka **bookies**), relies on a cluster-specific ZooKeeper cluster for certain tasks, and more.
+
+* A BookKeeper cluster consisting of one or more bookies handles [persistent storage](#persistent-storage) of messages.
+
+* A ZooKeeper cluster specific to that cluster handles coordination tasks between Pulsar clusters.
+
+An instance-wide ZooKeeper cluster called the Configuration Store handles coordination tasks involving multiple clusters, for example [geo-replication](concepts-replication.md).
+
+For a guide to managing Pulsar clusters, see the [clusters](admin-api-clusters.md) guide.
+
+### Producer
+
+***
+
+A producer is a process that attaches to a topic and publishes messages to a Pulsar [broker](reference-terminology.md#broker). The Pulsar broker processes the messages.
+
+Refer to the [producer](concepts-producer.md) topic for more information.
+
+### Topic
+
+***
+
+![Topic](/assets/producer-topic-consumer.svg)
+
+As in other pub-sub systems, topics in Pulsar are named channels for transmitting messages from producers to consumers. Topic names are URLs that have a well-defined structure:
+
+```http
+
+{persistent|non-persistent}://tenant/namespace/topic
+
+```
+
+| Topic name component | Description |
+|:--------------------|:-----------|
+| persistent / non-persistent | This identifies the type of topic. Pulsar supports two kind of topics: [persistent](concepts-architecture-overview.md#persistent-storage) and [non-persistent](#non-persistent-topics). The default is persistent, so if you do not specify a type, the topic is persistent. With persistent topics, all messages are durably persisted on disks (if the broker is not standalone, messages are durably persisted on multiple disks), whereas data for non-persistent topics is not persisted to storage disks.

Review Comment:
   For the "persistent" link, you use `(concepts-architecture-overview.md#persistent-storage)`, which is correct.
   
   But for the "non-persistent" link, why do you use `#non-persistent-topics`, an anchor that does not exist in the current file?
   
   Can you check all links before submitting a PR and attach screenshots of all your changes to the PR description? That is one way to verify that all changes render correctly.



##########
site2/docs/architecture-overview.md:
##########
@@ -0,0 +1,143 @@
+---
+
+id: concepts-architecture-overview
+
+title: Architecture overview
+
+sidebar_label: Concepts
+
+---
+
+The following overview describes the components that make up a Pulsar cluster, from general to specific.  
+
+### Instance
+
+***
+
+A Pulsar instance is composed of one or more Pulsar clusters. Clusters within an instance can [replicate](concepts-replication.md) data amongst themselves.
+
+### Cluster
+
+***
+
+![Pulsar architecture diagram](/assets/pulsar-system-architecture.svg)
+
+In a Pulsar cluster:
+
+* One or more **brokers** handles and load balances incoming messages from **producers**, dispatches **messages** to **consumers**, communicates with the Pulsar **configuration store** to handle various coordination tasks, stores messages in BookKeeper instances (aka **bookies**), relies on a cluster-specific ZooKeeper cluster for certain tasks, and more.
+
+* A BookKeeper cluster consisting of one or more bookies handles [persistent storage](#persistent-storage) of messages.
+
+* A ZooKeeper cluster specific to that cluster handles coordination tasks between Pulsar clusters.
+
+An instance-wide ZooKeeper cluster called the Configuration Store handles coordination tasks involving multiple clusters, for example [geo-replication](concepts-replication.md).
+
+For a guide to managing Pulsar clusters, see the [clusters](admin-api-clusters.md) guide.
+
+### Producer
+
+***
+
+A producer is a process that attaches to a topic and publishes messages to a Pulsar [broker](reference-terminology.md#broker). The Pulsar broker processes the messages.
+
+Refer to the [producer](concepts-producer.md) topic for more information.
+
+### Topic
+
+***
+
+![Topic](/assets/producer-topic-consumer.svg)
+
+As in other pub-sub systems, topics in Pulsar are named channels for transmitting messages from producers to consumers. Topic names are URLs that have a well-defined structure:
+
+```http
+
+{persistent|non-persistent}://tenant/namespace/topic
+
+```
+
+| Topic name component | Description |
+|:--------------------|:-----------|
+| persistent / non-persistent | This identifies the type of topic. Pulsar supports two kind of topics: [persistent](concepts-architecture-overview.md#persistent-storage) and [non-persistent](#non-persistent-topics). The default is persistent, so if you do not specify a type, the topic is persistent. With persistent topics, all messages are durably persisted on disks (if the broker is not standalone, messages are durably persisted on multiple disks), whereas data for non-persistent topics is not persisted to storage disks.
+tenant             | The topic tenant within the instance. Tenants are essential to multi-tenancy in Pulsar, and spread across clusters.
+|`namespace`          | The administrative unit of the topic, which acts as a grouping mechanism for related topics. Most topic configuration is performed at the [namespace](#namespaces) level. Each tenant has one or multiple namespaces.
+|topic              | The final part of the name. Topic names have no special meaning in a Pulsar instance.
+
+![tenants](/assets/tenants.svg)
+
+Refer to [topic](concepts-topic.md) for more information.
+
+### Consumer
+
+***
+
+A consumer is a process that attaches to a topic via a subscription and then receives messages.
+
+A consumer sends a [flow permit request](developing-binary-protocol.md#flow-control) to a broker to get messages. There is a queue at the consumer side to receive messages pushed from the broker. You can configure the queue size with the [`receiverQueueSize`](client-libraries-java.md#configure-consumer) parameter. The default size is `1000`). Each time `consumer.receive()` is called, a message is dequeued from the buffer.  
+
+Refer to the [consumer](concepts-consumer.md) topic for more information.
+
+### Broker
+
+***
+
+The **Pulsar message broker** is a stateless component that's primarily responsible for running two other components:
+
+* An HTTP server that exposes an {@inject: rest:REST:/} API for both administrative tasks and [topic lookup](concepts-clients.md#client-setup-phase) for producers and consumers. The producers connect to the brokers to publish messages and the consumers connect to the brokers to consume the messages.
+
+* A dispatcher, which is an asynchronous TCP server over a custom [binary protocol](developing-binary-protocol.md) used for all data transfers.
+
+![Broker](/assets/broker.svg)
+
+Messages are typically dispatched out of a [managed ledger](#managed-ledgers) cache for the sake of performance, *unless* the backlog exceeds the cache size. If the backlog grows too large for the cache, the broker will start reading entries from BookKeeper.
+
+Finally, to support geo-replication on global topics, the broker manages replicators that tail the entries published in the local region and republish them to the remote region using the Pulsar [Java client library](client-libraries-java.md).
+
+> For a guide to managing Pulsar brokers, see the [brokers](admin-api-brokers.md) guide.
+
+### Namespace
+
+***
+
+![Namespace](/assets/namespace.svg)
+
+A namespace is a logical nomenclature within a tenant. A tenant creates multiple namespaces via the [admin API](admin-api-namespaces.md#create). For instance, a tenant with different applications can create a separate namespace for each application. A namespace allows the application to create and manage a hierarchy of topics. The topic `my-tenant/app1` is a namespace for the application `app1` for `my-tenant`. You can create any number of [topics](#topics) under the namespace.

Review Comment:
   1. Same question for `[topics](#topics)`.
   
   2. For `[admin API](admin-api-namespaces.md#create)`: we do not send users to any chapter under `Admin API`. The reason is shown here:
   https://pulsar.apache.org/docs/next/admin-api-namespaces/
   <img width="1489" alt="image" src="https://user-images.githubusercontent.com/50226895/172791321-5fb48db8-bd05-4bd5-9a89-7f58545d9a2e.png">
   
   Instead, we point them to a more general page to reduce maintenance cost: https://pulsar.apache.org/tools/pulsar-admin/
   



##########
site2/docs/architecture-overview.md:
##########
@@ -0,0 +1,143 @@
+---
+
+id: concepts-architecture-overview
+
+title: Architecture overview
+
+sidebar_label: Concepts
+
+---
+
+The following overview describes the components that make up a Pulsar cluster, from general to specific.  
+
+### Instance
+
+***
+
+A Pulsar instance is composed of one or more Pulsar clusters. Clusters within an instance can [replicate](concepts-replication.md) data amongst themselves.
+
+### Cluster
+
+***
+
+![Pulsar architecture diagram](/assets/pulsar-system-architecture.svg)
+
+In a Pulsar cluster:
+
+* One or more **brokers** handles and load balances incoming messages from **producers**, dispatches **messages** to **consumers**, communicates with the Pulsar **configuration store** to handle various coordination tasks, stores messages in BookKeeper instances (aka **bookies**), relies on a cluster-specific ZooKeeper cluster for certain tasks, and more.
+
+* A BookKeeper cluster consisting of one or more bookies handles [persistent storage](#persistent-storage) of messages.
+
+* A ZooKeeper cluster specific to that cluster handles coordination tasks between Pulsar clusters.
+
+An instance-wide ZooKeeper cluster called the Configuration Store handles coordination tasks involving multiple clusters, for example [geo-replication](concepts-replication.md).
+
+For a guide to managing Pulsar clusters, see the [clusters](admin-api-clusters.md) guide.
+
+### Producer
+
+***
+
+A producer is a process that attaches to a topic and publishes messages to a Pulsar [broker](reference-terminology.md#broker). The Pulsar broker processes the messages.
+
+Refer to the [producer](concepts-producer.md) topic for more information.
+
+### Topic
+
+***
+
+![Topic](/assets/producer-topic-consumer.svg)
+
+As in other pub-sub systems, topics in Pulsar are named channels for transmitting messages from producers to consumers. Topic names are URLs that have a well-defined structure:
+
+```http
+
+{persistent|non-persistent}://tenant/namespace/topic
+
+```
+
+| Topic name component | Description |
+|:--------------------|:-----------|
+| persistent / non-persistent | This identifies the type of topic. Pulsar supports two kind of topics: [persistent](concepts-architecture-overview.md#persistent-storage) and [non-persistent](#non-persistent-topics). The default is persistent, so if you do not specify a type, the topic is persistent. With persistent topics, all messages are durably persisted on disks (if the broker is not standalone, messages are durably persisted on multiple disks), whereas data for non-persistent topics is not persisted to storage disks.
+tenant             | The topic tenant within the instance. Tenants are essential to multi-tenancy in Pulsar, and spread across clusters.
+|`namespace`          | The administrative unit of the topic, which acts as a grouping mechanism for related topics. Most topic configuration is performed at the [namespace](#namespaces) level. Each tenant has one or multiple namespaces.
+|topic              | The final part of the name. Topic names have no special meaning in a Pulsar instance.
+
+![tenants](/assets/tenants.svg)
+
+Refer to [topic](concepts-topic.md) for more information.
+
+### Consumer
+
+***
+
+A consumer is a process that attaches to a topic via a subscription and then receives messages.
+
+A consumer sends a [flow permit request](developing-binary-protocol.md#flow-control) to a broker to get messages. There is a queue at the consumer side to receive messages pushed from the broker. You can configure the queue size with the [`receiverQueueSize`](client-libraries-java.md#configure-consumer) parameter. The default size is `1000`). Each time `consumer.receive()` is called, a message is dequeued from the buffer.  
+
+Refer to the [consumer](concepts-consumer.md) topic for more information.
+
+### Broker
+
+***
+
+The **Pulsar message broker** is a stateless component that's primarily responsible for running two other components:
+
+* An HTTP server that exposes an {@inject: rest:REST:/} API for both administrative tasks and [topic lookup](concepts-clients.md#client-setup-phase) for producers and consumers. The producers connect to the brokers to publish messages and the consumers connect to the brokers to consume the messages.
+
+* A dispatcher, which is an asynchronous TCP server over a custom [binary protocol](developing-binary-protocol.md) used for all data transfers.
+
+![Broker](/assets/broker.svg)
+
+Messages are typically dispatched out of a [managed ledger](#managed-ledgers) cache for the sake of performance, *unless* the backlog exceeds the cache size. If the backlog grows too large for the cache, the broker will start reading entries from BookKeeper.
+
+Finally, to support geo-replication on global topics, the broker manages replicators that tail the entries published in the local region and republish them to the remote region using the Pulsar [Java client library](client-libraries-java.md).
+
+> For a guide to managing Pulsar brokers, see the [brokers](admin-api-brokers.md) guide.
+
+### Namespace
+
+***
+
+![Namespace](/assets/namespace.svg)
+
+A namespace is a logical nomenclature within a tenant. A tenant creates multiple namespaces via the [admin API](admin-api-namespaces.md#create). For instance, a tenant with different applications can create a separate namespace for each application. A namespace allows the application to create and manage a hierarchy of topics. The topic `my-tenant/app1` is a namespace for the application `app1` for `my-tenant`. You can create any number of [topics](#topics) under the namespace.
+
+### Metadata Store
+
+***
+
+The Pulsar metadata store maintains all the metadata of a Pulsar cluster, such as topic metadata, schema, broker load data, and so on. Pulsar uses [Apache ZooKeeper](https://zookeeper.apache.org/) for metadata storage, cluster configuration, and coordination. The Pulsar metadata store can be deployed on a separate ZooKeeper cluster or an existing ZooKeeper cluster. You can use one ZooKeeper cluster for both Pulsar metadata store and [BookKeeper metadata store](https://bookkeeper.apache.org/docs/latest/getting-started/concepts/#metadata-storage). If you want to deploy Pulsar brokers connected to an existing BookKeeper cluster, you need to deploy separate ZooKeeper clusters for Pulsar metadata store and BookKeeper metadata store respectively.

Review Comment:
   The link for `BookKeeper metadata store` is invalid. Please run a local preview and check all the content before submitting a PR.



##########
site2/docs/architecture-overview.md:
##########
@@ -0,0 +1,143 @@
+---
+
+id: concepts-architecture-overview
+
+title: Architecture overview
+
+sidebar_label: Concepts
+
+---
+
+The following overview describes the components that make up a Pulsar cluster, from general to specific.  
+
+### Instance
+
+***
+
+A Pulsar instance is composed of one or more Pulsar clusters. Clusters within an instance can [replicate](concepts-replication.md) data amongst themselves.
+
+### Cluster
+
+***
+
+![Pulsar architecture diagram](/assets/pulsar-system-architecture.svg)
+
+In a Pulsar cluster:
+
+* One or more **brokers** handles and load balances incoming messages from **producers**, dispatches **messages** to **consumers**, communicates with the Pulsar **configuration store** to handle various coordination tasks, stores messages in BookKeeper instances (aka **bookies**), relies on a cluster-specific ZooKeeper cluster for certain tasks, and more.
+
+* A BookKeeper cluster consisting of one or more bookies handles [persistent storage](#persistent-storage) of messages.
+
+* A ZooKeeper cluster specific to that cluster handles coordination tasks between Pulsar clusters.
+
+An instance-wide ZooKeeper cluster called the Configuration Store handles coordination tasks involving multiple clusters, for example [geo-replication](concepts-replication.md).
+
+For a guide to managing Pulsar clusters, see the [clusters](admin-api-clusters.md) guide.
+
+### Producer
+
+***
+
+A producer is a process that attaches to a topic and publishes messages to a Pulsar [broker](reference-terminology.md#broker). The Pulsar broker processes the messages.
+
+Refer to the [producer](concepts-producer.md) topic for more information.
+
+### Topic
+
+***
+
+![Topic](/assets/producer-topic-consumer.svg)
+
+As in other pub-sub systems, topics in Pulsar are named channels for transmitting messages from producers to consumers. Topic names are URLs that have a well-defined structure:
+
+```http
+
+{persistent|non-persistent}://tenant/namespace/topic
+
+```
+
+| Topic name component | Description |
+|:--------------------|:-----------|
+| persistent / non-persistent | This identifies the type of topic. Pulsar supports two kind of topics: [persistent](concepts-architecture-overview.md#persistent-storage) and [non-persistent](#non-persistent-topics). The default is persistent, so if you do not specify a type, the topic is persistent. With persistent topics, all messages are durably persisted on disks (if the broker is not standalone, messages are durably persisted on multiple disks), whereas data for non-persistent topics is not persisted to storage disks.
+tenant             | The topic tenant within the instance. Tenants are essential to multi-tenancy in Pulsar, and spread across clusters.
+|`namespace`          | The administrative unit of the topic, which acts as a grouping mechanism for related topics. Most topic configuration is performed at the [namespace](#namespaces) level. Each tenant has one or multiple namespaces.
+|topic              | The final part of the name. Topic names have no special meaning in a Pulsar instance.
+
+![tenants](/assets/tenants.svg)
+
+Refer to [topic](concepts-topic.md) for more information.
+
+### Consumer
+
+***
+
+A consumer is a process that attaches to a topic via a subscription and then receives messages.
+
+A consumer sends a [flow permit request](developing-binary-protocol.md#flow-control) to a broker to get messages. There is a queue at the consumer side to receive messages pushed from the broker. You can configure the queue size with the [`receiverQueueSize`](client-libraries-java.md#configure-consumer) parameter. The default size is `1000`). Each time `consumer.receive()` is called, a message is dequeued from the buffer.  
+
+Refer to the [consumer](concepts-consumer.md) topic for more information.
+
+### Broker
+
+***
+
+The **Pulsar message broker** is a stateless component that's primarily responsible for running two other components:
+
+* An HTTP server that exposes an {@inject: rest:REST:/} API for both administrative tasks and [topic lookup](concepts-clients.md#client-setup-phase) for producers and consumers. The producers connect to the brokers to publish messages and the consumers connect to the brokers to consume the messages.

Review Comment:
   Is `[namespace](#namespaces)` rendered correctly in your local preview?
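
In-page anchors like this one can also be checked without a preview. The following is a minimal sketch, not part of the PR or the Pulsar tooling. It assumes the site generator builds heading anchors by lowercasing the heading text, dropping punctuation, and replacing spaces with hyphens; the real slug rules may differ. The function names and the sample text are hypothetical.

```python
import re


def heading_slugs(markdown):
    """Collect anchor slugs for ATX headings, using a simplified slug rule."""
    slugs = set()
    for line in markdown.splitlines():
        match = re.match(r"^#{1,6}\s+(.*?)\s*$", line)
        if match:
            text = match.group(1).lower()
            # Assumed rule: drop punctuation, then turn whitespace runs into hyphens.
            text = re.sub(r"[^\w\s-]", "", text)
            slugs.add(re.sub(r"\s+", "-", text))
    return slugs


def broken_anchor_links(markdown):
    """Return in-page anchors such as '#namespaces' that match no heading."""
    slugs = heading_slugs(markdown)
    anchors = re.findall(r"\]\(#([^)\s]+)\)", markdown)
    return ["#" + anchor for anchor in anchors if anchor not in slugs]


# Hypothetical excerpt modeled on the draft under review.
sample = """### Namespace

Most topic configuration is performed at the [namespace](#namespaces) level.

### Metadata Store

See the [metadata store](#metadata-store) section.
"""

print(broken_anchor_links(sample))
```

Under that assumed slug rule, the heading `### Namespace` yields the anchor `#namespace`, so a link to `#namespaces` would be reported as unresolved.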



##########
site2/docs/under-construction.md:
##########
@@ -0,0 +1,13 @@
+---
+Id: Under-construction
+title: Under construction
+Sidebar_label: 
+---
+
+Please excuse our dust! We are working hard to bring you the most concise, organized and accessible documentation possible.  We thank you for your patience as we improve these documents!

Review Comment:
   ```suggestion
   Please excuse our dust! We are working hard to bring you the most concise, organized, and accessible documentation possible.  We thank you for your patience as we improve these documents!
   
   ```
   Use commas to separate items in a series of three or more. Use a comma before the conjunction that precedes the final item.
   https://docs.google.com/document/d/1lc5j4RtuLIzlEYCBo97AC8-U_3Erzs_lxpkDuseU0n4/edit#bookmark=id.b82f2ay5cpsc



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: commits-unsubscribe@pulsar.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org