Posted to commits@pulsar.apache.org by li...@apache.org on 2022/09/07 01:56:25 UTC

[pulsar] branch master updated: [improve][docs] Get started locally (#17475)

This is an automated email from the ASF dual-hosted git repository.

liuyu pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/pulsar.git


The following commit(s) were added to refs/heads/master by this push:
     new ae7c941ab9f [improve][docs] Get started locally (#17475)
ae7c941ab9f is described below

commit ae7c941ab9fb56aae273d8d9d372c362b521a00a
Author: tison <wa...@gmail.com>
AuthorDate: Wed Sep 7 09:56:16 2022 +0800

    [improve][docs] Get started locally (#17475)
---
 site2/docs/getting-started-docker.md     |   6 +-
 site2/docs/getting-started-helm.md       |  14 +-
 site2/docs/getting-started-home.md       |   8 +-
 site2/docs/getting-started-standalone.md | 271 ++++++++-----------------------
 4 files changed, 79 insertions(+), 220 deletions(-)

diff --git a/site2/docs/getting-started-docker.md b/site2/docs/getting-started-docker.md
index 8dd7779ca55..1c4f01600a0 100644
--- a/site2/docs/getting-started-docker.md
+++ b/site2/docs/getting-started-docker.md
@@ -1,10 +1,10 @@
 ---
 id: getting-started-docker
-title: Set up a standalone Pulsar in Docker
+title: Run a standalone Pulsar cluster in Docker
 sidebar_label: "Run Pulsar in Docker"
 ---
 
-For local development and testing, you can run Pulsar in standalone mode on your own machine within a Docker container. 
+For local development and testing, you can run Pulsar in standalone mode on your own machine within a Docker container.
 
 If you have not installed Docker, download the [Community edition](https://www.docker.com/community-edition) and follow the instructions for your OS.
 
@@ -48,7 +48,7 @@ After starting Pulsar successfully, you can see `INFO`-level log messages like t
 
 ## Use Pulsar in Docker
 
-Pulsar offers a variety of [client libraries](client-libraries.md), such as [Java](client-libraries-java.md), [Go](client-libraries-go.md), [Python](client-libraries-python.md), [C++](client-libraries-cpp.md). 
+Pulsar offers a variety of [client libraries](client-libraries.md), such as [Java](client-libraries-java.md), [Go](client-libraries-go.md), [Python](client-libraries-python.md), [C++](client-libraries-cpp.md).
 
 If you're running a local standalone cluster, you can use one of these root URLs to interact with your cluster:
 * `pulsar://localhost:6650`
diff --git a/site2/docs/getting-started-helm.md b/site2/docs/getting-started-helm.md
index 95e3c8079cb..1a373d1893e 100644
--- a/site2/docs/getting-started-helm.md
+++ b/site2/docs/getting-started-helm.md
@@ -1,6 +1,6 @@
 ---
 id: getting-started-helm
-title: Get started in Kubernetes
+title: Run a standalone Pulsar cluster in Kubernetes
 sidebar_label: "Run Pulsar in Kubernetes"
 ---
 
@@ -52,7 +52,7 @@ We use [Minikube](https://minikube.sigs.k8s.io/docs/start/) in this quick start
    minikube dashboard
    ```
 
-   The command automatically triggers opening a webpage in your browser. 
+   The command automatically triggers opening a webpage in your browser.
 
 ## Step 1: Install Pulsar Helm chart
 
@@ -88,7 +88,7 @@ We use [Minikube](https://minikube.sigs.k8s.io/docs/start/) in this quick start
        -c
    ```
 
-4. Use the Pulsar Helm chart to install a Pulsar cluster to Kubernetes. 
+4. Use the Pulsar Helm chart to install a Pulsar cluster to Kubernetes.
 
    ```bash
    helm install \
@@ -169,7 +169,7 @@ We use [Minikube](https://minikube.sigs.k8s.io/docs/start/) in this quick start
    bin/pulsar-admin tenants list
    ```
 
-   You should see a similar output as below. The tenant `apache` has been successfully created. 
+   You should see a similar output as below. The tenant `apache` has been successfully created.
 
    ```bash
    "apache"
@@ -189,7 +189,7 @@ We use [Minikube](https://minikube.sigs.k8s.io/docs/start/) in this quick start
    bin/pulsar-admin namespaces list apache
    ```
 
-   You should see a similar output as below. The namespace `apache/pulsar` has been successfully created. 
+   You should see a similar output as below. The namespace `apache/pulsar` has been successfully created.
 
    ```bash
    "apache/pulsar"
@@ -303,7 +303,7 @@ Then you can proceed with the following steps:
    - From the producer side
 
        **Output**
-       
+
        The messages have been produced successfully.
 
        ```bash
@@ -351,7 +351,7 @@ Then you can proceed with the following steps:
 
 2. The Pulsar Manager UI will be open in your browser. You can use the username `pulsar` and password `pulsar` to log into Pulsar Manager.
 
-3. In Pulsar Manager UI, you can create an environment. 
+3. In Pulsar Manager UI, you can create an environment.
 
    - Click **New Environment** in the upper-left corner.
    - Type `pulsar-mini` for the field `Environment Name` in the pop-up window.
diff --git a/site2/docs/getting-started-home.md b/site2/docs/getting-started-home.md
index 64ea38a8120..fd6f0d329e5 100644
--- a/site2/docs/getting-started-home.md
+++ b/site2/docs/getting-started-home.md
@@ -4,9 +4,9 @@ title: Get started
 sidebar_label: "Get Started"
 ---
 
-Getting up and running with Pulsar is simple. Download it, install it, and try it out.  
+Getting up and running with Pulsar is simple. Download it, install it, and try it out.
 
 You have three options. Click any of these links to begin your Pulsar journey!
-* [Run a standalone Pulsar locally](getting-started-standalone.md) - Run a single instance of Pulsar in standalone mode on a single machine.
-* [Run a standalone Pulsar in Docker](getting-started-docker.md) - Run one or more instances of Pulsar in a Docker container.
-* [Run a standalone Pulsar in Kubernetes](getting-started-helm.md) - Run one or more instances of Pulsar in Kubernetes using a Helm chart.
\ No newline at end of file
+* [Run a standalone Pulsar cluster locally](getting-started-standalone.md) - Run a single instance of Pulsar in standalone mode on a single machine.
+* [Run a standalone Pulsar cluster in Docker](getting-started-docker.md) - Run one or more instances of Pulsar in a Docker container.
+* [Run a standalone Pulsar cluster in Kubernetes](getting-started-helm.md) - Run one or more instances of Pulsar in Kubernetes using a Helm chart.
diff --git a/site2/docs/getting-started-standalone.md b/site2/docs/getting-started-standalone.md
index f89bad1da59..5cba7f981b9 100644
--- a/site2/docs/getting-started-standalone.md
+++ b/site2/docs/getting-started-standalone.md
@@ -1,278 +1,137 @@
 ---
 id: getting-started-standalone
-title: Set up a standalone Pulsar locally
+title: Run a standalone Pulsar cluster locally
 sidebar_label: "Run Pulsar locally"
 ---
 
-For local development and testing, you can run Pulsar in standalone mode on your machine. The standalone mode includes a Pulsar broker, the necessary [RocksDB](http://rocksdb.org/) and BookKeeper components running inside of a single Java Virtual Machine (JVM) process.
-
-> **Pulsar in production?**  
-> If you're looking to run a full production Pulsar installation, see the [Deploying a Pulsar instance](deploy-bare-metal.md) guide.
-
-## Install Pulsar standalone
-
-This tutorial guides you through every step of installing Pulsar locally.
-
-### System requirements
-
-Currently, Pulsar is available for 64-bit **macOS**, **Linux**, and **Windows**. To use Pulsar, you need to install 64-bit JRE/JDK.
-For the runtime Java version, see [Pulsar Runtime Java Version Recommendation](https://github.com/apache/pulsar/blob/master/README.md#pulsar-runtime-java-version-recommendation) according to your target Pulsar version.
+For local development and testing, you can run Pulsar in standalone mode on your machine. The standalone mode runs all components inside a single Java Virtual Machine (JVM) process.
 
 :::tip
 
-By default, Pulsar allocates 2G JVM heap memory to start. It can be changed in `conf/pulsar_env.sh` file under `PULSAR_MEM`. This is an extra option passed into JVM. 
+If you're looking to run a full production Pulsar installation, see the [Deploying a Pulsar instance](deploy-bare-metal.md) guide.
 
 :::
 
-:::note
+## Prerequisites
 
-Broker is only supported on 64-bit JVM.
+- JRE (64-bit). Different Pulsar versions rely on different JRE versions. For how to choose the JRE version, see [Pulsar Runtime Java Version Recommendation](https://github.com/apache/pulsar/blob/master/README.md#pulsar-runtime-java-version-recommendation).
 
-:::
-
-#### Install JDK on M1
-In the current version, Pulsar uses a BookKeeper version which in turn uses RocksDB. RocksDB is compiled to work on x86 architecture and not ARM. Therefore, Pulsar can only work with x86 JDK. This is planned to be fixed in future versions of Pulsar.
-
-One of the ways to easily install an x86 JDK is to use [SDKMan](http://sdkman.io). Follow instructions on the SDKMan website.
-
-2. Turn on Rosetta2 compatibility for SDKMan by editing `~/.sdkman/etc/config` and changing the following property from `false` to `true`.
-
-```properties
-sdkman_rosetta2_compatible=true
-```
+## Step 1. Download Pulsar distribution
 
-3. Close the current shell / terminal window and open a new one.
-4. Make sure you don't have any previously installed JVM of the same version by listing existing installed versions.
+Download the official Apache Pulsar distribution:
 
-```shell
-sdk list java|grep installed
-```
-
-Example output:
-
-```text
-               | >>> | 17.0.3.6.1   | amzn    | installed  | 17.0.3.6.1-amzn
-```
-
-If you have any Java 17 version installed, uninstall it.
-
-```shell
-sdk uinstall java 17.0.3.6.1
+```bash
+wget https://archive.apache.org/dist/pulsar/pulsar-@pulsar:version@/apache-pulsar-@pulsar:version@-bin.tar.gz
 ```
 
-5. Install any Java versions greater than Java 8.
+Once downloaded, unpack the tar file:
 
-```shell
- sdk install java 17.0.3.6.1-amzn
+```bash
+tar xvfz apache-pulsar-@pulsar:version@-bin.tar.gz
 ```
 
-### Install Pulsar using binary release
-
-To get started with Pulsar, download a binary tarball release in one of the following ways:
-
-* download from the Apache mirror (<a href="pulsar:binary_release_url" download>Pulsar @pulsar:version@ binary release</a>)
-
-* download from the Pulsar [downloads page](pulsar:download_page_url)  
-  
-* download from the Pulsar [releases page](https://github.com/apache/pulsar/releases/latest)
-  
-* use [wget](https://www.gnu.org/software/wget):
-
-  ```shell
-  wget pulsar:binary_release_url
-  ```
-
-After you download the tarball, untar it and use the `cd` command to navigate to the resulting directory:
+For the rest of this quickstart all commands are run from the root of the distribution folder, so switch to it:
 
 ```bash
-tar xvfz apache-pulsar-@pulsar:version@-bin.tar.gz
 cd apache-pulsar-@pulsar:version@
 ```
 
-#### What your package contains
-
-The Pulsar binary package initially contains the following directories:
-
-Directory | Contains
-:---------|:--------
-`bin` | Pulsar's command-line tools, such as [`pulsar`](reference-cli-tools.md#pulsar) and [`pulsar-admin`](/tools/pulsar-admin/).
-`conf` | Configuration files for Pulsar, including [broker configuration](reference-configuration.md#broker) and more.<br />**Note:** Pulsar standalone uses RocksDB as the local metadata store and its configuration file path [`metadataStoreConfigPath`](reference-configuration.md) is configurable in the `standalone.conf` file. For more information about the configurations of RocksDB, see [here](https://github.com/facebook/rocksdb/blob/main/examples/rocksdb_option_file_example.ini) and rel [...]
-`examples` | A Java JAR file containing [Pulsar Functions](functions-overview.md) example.
-`instances` | Artifacts created for [Pulsar Functions](functions-overview.md).
-`lib` | The [JAR](https://en.wikipedia.org/wiki/JAR_(file_format)) files used by Pulsar.
-`licenses` | License files, in the`.txt` form, for various components of the Pulsar [codebase](https://github.com/apache/pulsar).
-
-These directories are created once you begin running Pulsar.
-
-Directory | Contains
-:---------|:--------
-`data` | The data storage directory used by RocksDB and BookKeeper.
-`logs` | Logs created by the installation.
+List the contents by executing:
 
-:::tip
-
-If you want to use built-in connectors and tiered storage offloaders, you can install them according to the following instructions:
-* [Install built-in connectors (optional)](#install-built-in-connectors-optional)
-* [Install tiered storage offloaders (optional)](#install-tiered-storage-offloaders-optional)
-Otherwise, skip this step and perform the next step [Start Pulsar standalone](#start-pulsar-standalone). Pulsar can be successfully installed without installing built-in connectors and tiered storage offloaders.
-
-:::
-
-### Install built-in connectors (optional)
-
-Since `2.1.0-incubating` release, Pulsar releases a separate binary distribution, containing all the `built-in` connectors.
-To enable those `built-in` connectors, you can download the connectors tarball release in one of the following ways:
-
-* download from the Apache mirror <a href="pulsar:connector_release_url" download>Pulsar IO Connectors @pulsar:version@ release</a>
-
-* download from the Pulsar [downloads page](pulsar:download_page_url)
+```bash
+ls -1F
+```
 
-* download from the Pulsar [releases page](https://github.com/apache/pulsar/releases/latest)
+The distribution contains the following directories:
 
-* use [wget](https://www.gnu.org/software/wget):
+| Directory     | Description                                                                                         |
+| ------------- | --------------------------------------------------------------------------------------------------- |
+| **bin**       | The [`pulsar`](reference-cli-tools.md#pulsar) entry point script, and many other command-line tools |
+| **conf**      | Configuration files, including `broker.conf`                                                        |
+| **lib**       | JARs used by Pulsar                                                                                 |
+| **examples**  | [Pulsar Functions](functions-overview.md) examples                                                  |
+| **instances** | Artifacts for [Pulsar Functions](functions-overview.md)                                             |
 
-  ```shell
-  wget pulsar:connector_release_url/{connector}-@pulsar:version@.nar
-  ```
+## Step 2. Start a Pulsar standalone cluster
 
-After you download the NAR file, copy the file to the `connectors` directory in the pulsar directory. 
-For example, if you download the `pulsar-io-aerospike-@pulsar:version@.nar` connector file, enter the following commands:
+Run this command to start a standalone Pulsar cluster:
 
 ```bash
-mkdir connectors
-mv pulsar-io-aerospike-@pulsar:version@.nar connectors
-
-ls connectors
-pulsar-io-aerospike-@pulsar:version@.nar
-...
+bin/pulsar standalone
 ```
 
-:::note
-
-* If you are running Pulsar in a bare metal cluster, make sure `connectors` tarball is unzipped in every pulsar directory of the broker (or in every pulsar directory of function-worker if you are running a separate worker cluster for Pulsar Functions).
-* If you are [running Pulsar in Docker](getting-started-docker.md) or deploying Pulsar using a docker image (e.g. [K8S](deploy-kubernetes.md) or [DC/OS](https://dcos.io/), you can use the `apachepulsar/pulsar-all` image instead of the `apachepulsar/pulsar` image. `apachepulsar/pulsar-all` image has already bundled [all built-in connectors](io-overview.md#working-with-connectors).
+These directories are created once you start the Pulsar cluster:
 
-:::
-
-### Install tiered storage offloaders (optional)
+| Directory | Description                                |
+| --------- | ------------------------------------------ |
+| **data**  | All data created by BookKeeper and RocksDB |
+| **logs**  | All server-side logs                       |
 
 :::tip
 
-- Since `2.2.0` release, Pulsar releases a separate binary distribution, containing the tiered storage offloaders.
-- To enable the tiered storage feature, follow the instructions below; otherwise skip this section.
+* To run the service as a background process, you can use the `bin/pulsar-daemon start standalone` command. For more information, see [pulsar-daemon](reference-cli-tools.md#pulsar-daemon).
+* The `public/default` namespace is created when you start a Pulsar cluster. This namespace is for development purposes. All Pulsar topics are managed within namespaces. For more information, see [Namespaces](concepts-messaging.md#namespaces) and [Topics](concepts-messaging.md#topics).
 
 :::
 
-To get started with [tiered storage offloaders](concepts-tiered-storage.md), you need to download the offloaders tarball release on every broker node in one of the following ways:
-
-* download from the Apache mirror <a href="pulsar:offloader_release_url" download>Pulsar Tiered Storage Offloaders @pulsar:version@ release</a>
+## Step 3. Create a topic
 
-* download from the Pulsar [downloads page](pulsar:download_page_url)
+Pulsar stores messages in topics. It's good practice to create topics explicitly before using them, even though Pulsar can create topics automatically when they are first referenced.
 
-* download from the Pulsar [releases page](https://github.com/apache/pulsar/releases/latest)
-
-* use [wget](https://www.gnu.org/software/wget):
-
-  ```shell
-  wget pulsar:offloader_release_url
-  ```
-
-After you download the tarball, untar the offloaders package and copy the offloaders as `offloaders`
-in the pulsar directory:
+To create a new topic, run this command:
 
 ```bash
-tar xvfz apache-pulsar-offloaders-@pulsar:version@-bin.tar.gz
-
-// you will find a directory named `apache-pulsar-offloaders-@pulsar:version@` in the pulsar directory
-// then copy the offloaders
-
-mv apache-pulsar-offloaders-@pulsar:version@/offloaders offloaders
-
-ls offloaders
-tiered-storage-jcloud-@pulsar:version@.nar
+bin/pulsar-admin topics create persistent://public/default/my-topic
 ```
 
-For more information on how to configure tiered storage, see [Tiered storage cookbook](cookbooks-tiered-storage.md).
-
-:::note
-
-* If you are running Pulsar in a bare metal cluster, make sure that `offloaders` tarball is unzipped in every broker's pulsar directory.
-* If you are [running Pulsar in Docker](getting-started-docker.md) or deploying Pulsar using a docker image (e.g. [K8S](deploy-kubernetes.md) or DC/OS), you can use the `apachepulsar/pulsar-all` image instead of the `apachepulsar/pulsar` image. `apachepulsar/pulsar-all` image has already bundled tiered storage offloaders.
-
-:::
+## Step 4. Write messages to the topic
 
-## Start Pulsar standalone
+You can use the `pulsar` command line tool to write messages to a topic. This is useful for experimentation, but in practice you'll use the Producer API in your application code, or Pulsar IO connectors for pulling data in from other systems to Pulsar.
 
-Once you have an up-to-date local copy of the release, you can start a local cluster using the [`pulsar`](reference-cli-tools.md#pulsar) command, which is stored in the `bin` directory, and specifying that you want to start Pulsar in standalone mode.
+Run this command to produce a message:
 
 ```bash
-bin/pulsar standalone
+bin/pulsar-client produce my-topic --messages 'Hello Pulsar!'
 ```
 
-If you have started Pulsar successfully, you will see `INFO`-level log messages like this:
+## Step 5. Read messages from the topic
 
-```bash
-21:59:29.327 [DLM-/stream/storage-OrderedScheduler-3-0] INFO  org.apache.bookkeeper.stream.storage.impl.sc.StorageContainerImpl - Successfully started storage container (0).
-21:59:34.576 [main] INFO  org.apache.pulsar.broker.authentication.AuthenticationService - Authentication is disabled
-21:59:34.576 [main] INFO  org.apache.pulsar.websocket.WebSocketService - Pulsar WebSocket Service started
-```
-
-:::tip
-
-* The service is running on your terminal, which is under your direct control. If you need to run other commands, open a new terminal window. 
-* To run the service as a background process, you can use the `bin/pulsar-daemon start standalone` command. For more information, see [pulsar-daemon](/docs/en/reference-cli-tools/#pulsar-daemon).
-* To perform a health check, you can use the `bin/pulsar-admin brokers healthcheck` command. For more information, see [Pulsar-admin docs](/tools/pulsar-admin/).
-* When you start a local standalone cluster, a `public/default` [namespace](concepts-messaging.md#namespaces) is created automatically. The namespace is used for development purposes. All Pulsar topics are managed within namespaces. For more information, see [Topics](concepts-messaging.md#topics).
-* By default, there is no encryption, authentication, or authorization configured. Apache Pulsar can be accessed from a remote server without any authorization. See [Security Overview](security-overview.md) for how to secure your deployment. 
-
-:::
-
-## Use Pulsar standalone
-
-Pulsar provides a CLI tool called [`pulsar-client`](reference-cli-tools.md#pulsar-client). The pulsar-client tool enables you to consume and produce messages to a Pulsar topic in a running cluster. 
-
-### Consume a message
-
-The following command consumes a message with the subscription name `first-subscription` to the `my-topic` topic:
+Now that some messages have been written to the topic, run this command to launch the consumer and read those messages back:
 
 ```bash
-bin/pulsar-client consume my-topic -s "first-subscription"
+bin/pulsar-client consume my-topic -s 'my-subscription' -p Earliest -n 0
 ```
 
-If the message has been successfully consumed, you will see a confirmation like the following in the `pulsar-client` logs:
+`-p Earliest` starts consuming from the earliest **unconsumed** message. `-n` sets the number of messages to consume; `0` means consume forever.
 
-```
-22:17:16.781 [main] INFO  org.apache.pulsar.client.cli.PulsarClientTool - 1 messages successfully consumed
-```
+As before, this is useful for trialling things on the command line, but in practice you'll use the Consumer API in your application code, or Pulsar IO connectors for reading data from Pulsar to push to other systems.
 
-:::tip
+You'll see the messages that you produced in the previous step:
 
-As you have noticed that we do not explicitly create the `my-topic` topic, from which we consume the message. When you consume a message from a topic that does not yet exist, Pulsar creates that topic for you automatically. Producing a message to a topic that does not exist will automatically create that topic for you as well.
+```text
+----- got message -----
+key:[null], properties:[], content:Hello Pulsar!
+```
 
-:::
+## Step 6. Write some more messages
 
-### Produce a message
+Leave the consume command from the previous step running. If you've already closed it, just re-run it.
 
-The following command produces a message saying `hello-pulsar` to the `my-topic` topic:
+Now open a new terminal window and produce more messages. The default message separator is `,`:
 
 ```bash
-bin/pulsar-client produce my-topic --messages "hello-pulsar"
-```
-
-If the message has been successfully published to the topic, you will see a confirmation like the following in the `pulsar-client` logs:
-
-```
-22:21:08.693 [main] INFO  org.apache.pulsar.client.cli.PulsarClientTool - 1 messages successfully produced
+bin/pulsar-client produce my-topic --messages "$(seq -s, -f 'Message NO.%g' -t '\n' 1 10)"
 ```
 
-## Stop Pulsar standalone
+Note how they are displayed almost instantaneously in the consumer terminal.
 
-Press `Ctrl+C` to stop a local standalone Pulsar.
+## Step 7. Stop the Pulsar cluster
 
-:::tip
+Once you've finished you can shut down the Pulsar cluster. Press **Ctrl-C** in the terminal window in which you started the cluster.
 
-If the service runs as a background process using the `bin/pulsar-daemon start standalone` command, then use the `bin/pulsar-daemon stop standalone` command to stop the service.
-For more information, see [pulsar-daemon](reference-cli-tools.md#pulsar-daemon).
-
-:::
+## Further reading
 
+* Read [Pulsar Concepts and Architecture](concepts-architecture-overview.md) to learn more about Pulsar fundamentals.
+* Read [Pulsar Client Libraries](client-libraries.md) to connect Pulsar with your application.
+* Read [Pulsar Connectors](io-overview.md) to connect Pulsar with your existing data pipelines.
+* Read [Pulsar Functions](functions-overview.md) to run serverless computations against Pulsar.
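
A note on Step 6 in `getting-started-standalone.md`: the `seq ... -t '\n'` invocation relies on the `-t` option, which appears to be a BSD extension that GNU `seq` does not accept, so the command may fail on typical Linux systems. A minimal portable sketch, assuming only that `seq` and `paste` are installed, could build the same comma-separated message list as follows. `build_messages` is a hypothetical helper name used for illustration and is not part of Pulsar.

```bash
#!/bin/sh
# Build a comma-separated list "Message NO.1,...,Message NO.N".
# build_messages is a hypothetical helper, not a Pulsar command.
build_messages() {
  count="$1"
  # seq -f formats each number on its own line;
  # paste -s joins those lines into one, using "," as the delimiter.
  seq -f 'Message NO.%g' 1 "$count" | paste -sd, -
}

# Print a short sample list to check the format.
build_messages 3
```

With the standalone cluster running, the result could then be passed to the producer as `bin/pulsar-client produce my-topic --messages "$(build_messages 10)"`. Joining with `paste` avoids a trailing separator, which would otherwise be treated as an extra empty message.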