Posted to commits@pulsar.apache.org by GitBox <gi...@apache.org> on 2018/03/07 06:51:57 UTC

[GitHub] sijie closed pull request #1350: Pulsar Functions documentation

sijie closed pull request #1350: Pulsar Functions documentation
URL: https://github.com/apache/incubator-pulsar/pull/1350

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

diff --git a/site/_data/popovers.yaml b/site/_data/popovers.yaml
index 3db11d222..f00197097 100644
--- a/site/_data/popovers.yaml
+++ b/site/_data/popovers.yaml
@@ -88,6 +88,9 @@ pub-sub:
 pulsar:
   q: What is Pulsar?
   def: Pulsar is a distributed messaging system originally created by Yahoo but now under the stewardship of the Apache Software Foundation.
+pulsar-functions:
+  q: What are Pulsar Functions?
+  def: Pulsar Functions are lightweight functions that can consume messages from Pulsar topics, apply custom processing logic, and, if desired, publish results to topics.
 retention-policy:
   q: What is a retention policy?
   def: Size and/or time limits that you can set on a namespace to configure retention of messages that have already been acknowledged.
diff --git a/site/_data/pulsar-functions.yaml b/site/_data/pulsar-functions.yaml
new file mode 100644
index 000000000..bea316614
--- /dev/null
+++ b/site/_data/pulsar-functions.yaml
@@ -0,0 +1,85 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+#   http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing,
+# software distributed under the License is distributed on an
+# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+# KIND, either express or implied.  See the License for the
+# specific language governing permissions and limitations
+# under the License.
+#
+
+description: |
+  A tool for deploying and managing Pulsar Functions.
+example: |
+  pulsar-functions localrun \
+    --function-config my-function.yaml
+commands:
+- name: localrun
+  description: Runs a Pulsar Function
+- name: create
+  description: Creates a new Pulsar Function
+- name: delete
+  description: Deletes an existing Pulsar Function
+- name: update
+  description: Updates an existing Pulsar Function
+- name: get
+  description: Returns information about an existing Pulsar Function
+- name: list
+  description: Lists all currently existing Pulsar Functions
+  options:
+  - flags: --namespace
+    description: The namespace of the Pulsar Functions you'd like to list
+  - flags: --tenant
+    description: The tenant of the Pulsar Functions you'd like to list (you must also specify a namespace using the `--namespace` flag)
+- name: getstatus
+  description: Checks on the status of the specified Pulsar Function
+  options:
+  - flags: --namespace
+    description: The namespace of the Pulsar Function whose status you'd like to check
+  - flags: --tenant
+    description: The tenant of the Pulsar Function whose status you'd like to check
+  - flags: --tenant
+- name: querystate
+  description: Displays the current state of the specified Pulsar Function, by key
+  options:
+  - flags: -k, --key
+    description: The key for the desired value
+  - flags: --name
+    description: The name of the Pulsar Function whose current state you'd like to query
+  - flags: --namespace
+    description: The namespace of the Pulsar Function whose current state you'd like to query
+  - flags: -u, --storage-service-url
+    description: The URL of the storage service
+  - flags: --tenant
+    description: The tenant of the Pulsar Function whose current state you'd like to query
+  - flags: -w, --watch
+    description: If set, watch for changes in the current state of the specified Pulsar Function (by the key set using `-k`/`--key`)
+    default: 'false'
+options:
+  - flags: --name
+    description: The name of the Pulsar Function
+  - flags: --function-classname
+    description: The Java class name of the Pulsar Function
+  - flags: --function-classpath
+    description: The Java classpath of the Pulsar Function
+  - flags: --source-topic
+    description: The topic from which the Pulsar Function consumes its input
+  - flags: --sink-topic
+    description: The topic to which the Pulsar Function publishes its output (if any)
+  - flags: --input-serde-classname
+    description: Input SerDe
+    default: org.apache.pulsar.functions.runtime.serde.Utf8StringSerDe
+  - flags: --output-serde-classname
+    description: Output SerDe
+    default: org.apache.pulsar.functions.runtime.serde.Utf8StringSerDe
+  - flags: --function-config
+    description: The path for the Pulsar Function's YAML configuration file
\ No newline at end of file
diff --git a/site/docs/latest/functions/api.md b/site/docs/latest/functions/api.md
new file mode 100644
index 000000000..ba71c8665
--- /dev/null
+++ b/site/docs/latest/functions/api.md
@@ -0,0 +1,76 @@
+---
+title: The Pulsar Functions API
+---
+
+## Java
+
+Java API example:
+
+```java
+import java.util.function.Function;
+public class ExclamationFunction implements Function<String, String> {
+    @Override
+    public String apply(String input) { return String.format("%s!", input); }
+}
+```
+
+### With context
+
+```java
+public interface PulsarFunction<I, O> {
+    O process(I input, Context context) throws Exception;
+}
+```
+
+Context interface:
+
+```java
+public interface Context {
+    byte[] getMessageId();
+    String getTopicName();
+    Collection<String> getSourceTopics();
+    String getSinkTopic();
+    String getOutputSerdeClassName();
+    String getTenant();
+    String getNamespace();
+    String getFunctionName();
+    String getFunctionId();
+    String getInstanceId();
+    String getFunctionVersion();
+    Logger getLogger();
+    void incrCounter(String key, long amount);
+    String getUserConfigValue(String key);
+    void recordMetric(String metricName, double value);
+    <O> CompletableFuture<Void> publish(String topicName, O object, String serDeClassName);
+    <O> CompletableFuture<Void> publish(String topicName, O object);
+    CompletableFuture<Void> ack(byte[] messageId, String topic);
+}
+```
+
+### SerDe
+
+> SerDe stands for **Ser**ialization and **De**serialization.
+
+Pulsar Functions can use either the built-in SerDe or a custom one. For a custom SerDe, you need to implement this interface:
+
+```java
+public interface SerDe<T> {
+    T deserialize(byte[] input);
+    byte[] serialize(T input);
+}
+```
+
+The built-in is the `org.apache.pulsar.functions.api.DefaultSerDe` class:
+
+```java
+
+```
+
+The built-in SerDe should work fine for basic Java types. For more advanced types, you'll need to implement a custom SerDe.
+
+
+## Python
+
+```python
+def process(input): return "{}!".format(input)
+```
\ No newline at end of file
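As a rough illustration of the custom SerDe route described in api.md above, the sketch below implements the `SerDe<T>` interface for `Integer` values. The interface is re-declared locally so the example compiles on its own. The `IntegerSerDe` class name and the 4-byte big-endian encoding are assumptions made for illustration and are not taken from the Pulsar codebase.

```java
import java.nio.ByteBuffer;

// Local copy of the SerDe interface shown in api.md, so the sketch is self-contained.
interface SerDe<T> {
    T deserialize(byte[] input);
    byte[] serialize(T input);
}

// Hypothetical custom SerDe: encodes an Integer as 4 big-endian bytes.
class IntegerSerDe implements SerDe<Integer> {
    @Override
    public Integer deserialize(byte[] input) {
        return ByteBuffer.wrap(input).getInt();
    }

    @Override
    public byte[] serialize(Integer input) {
        return ByteBuffer.allocate(Integer.BYTES).putInt(input).array();
    }
}

public class SerDeExample {
    public static void main(String[] args) {
        SerDe<Integer> serde = new IntegerSerDe();
        byte[] bytes = serde.serialize(42);
        System.out.println(serde.deserialize(bytes)); // prints 42
    }
}
```

A round trip through `serialize` and `deserialize` should return the original value, which is the property any custom SerDe needs to hold.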
diff --git a/site/docs/latest/functions/deployment.md b/site/docs/latest/functions/deployment.md
new file mode 100644
index 000000000..ea2d79a97
--- /dev/null
+++ b/site/docs/latest/functions/deployment.md
@@ -0,0 +1,9 @@
+---
+title: Deploying Pulsar Functions
+---
+
+At the moment, Pulsar Functions are deployed
+
+## State storage
+
+By default, Pulsar uses [Apache BookKeeper](https://bookkeeper.apache.org) for state storage.
\ No newline at end of file
diff --git a/site/docs/latest/functions/guarantees.md b/site/docs/latest/functions/guarantees.md
new file mode 100644
index 000000000..3efd5494f
--- /dev/null
+++ b/site/docs/latest/functions/guarantees.md
@@ -0,0 +1,28 @@
+---
+title: Processing guarantees
+lead: Apply at-most-once, at-least-once, or effectively-once delivery semantics to Pulsar Functions
+---
+
+Pulsar Functions provide three different messaging semantics that you can apply to any Function:
+
+* **At-most-once** delivery
+* **At-least-once** delivery
+* **Effectively-once** delivery
+
+## How it works
+
+You can set the processing guarantees for a Pulsar Function when you create the Function. This [`pulsar-functions create`](../../reference/CliTools#pulsar-functions-create) command, for example, would apply effectively-once guarantees to the Function:
+
+```bash
+# TODO: supply the remaining flags needed to create the Function
+$ bin/pulsar-functions create \
+  --processingGuarantees EFFECTIVELY_ONCE
+```
+
+The available options are:
+
+* `ATMOST_ONCE`
+* `ATLEAST_ONCE`
+* `EFFECTIVELY_ONCE`
+
+{% include admonition.html type='info' content='By default, Pulsar Functions provide at-most-once delivery guarantees. If you create a function without supplying a value for the `--processingGuarantees` flag, then the Function will provide only at-most-once guarantees.' %}
\ No newline at end of file
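The practical difference between the three guarantees listed in guarantees.md above can be sketched with a small simulation. This is a conceptual model only, not how the Pulsar Functions runtime is implemented: it assumes a worker that crashes once between the two steps of handling one message, that at-most-once acknowledges before emitting output, that at-least-once emits before acknowledging, and that effectively-once adds deduplication by message ID. The `GuaranteesSketch` class and its `run` method are hypothetical names.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class GuaranteesSketch {
    enum Guarantee { ATMOST_ONCE, ATLEAST_ONCE, EFFECTIVELY_ONCE }

    // Processes message IDs 0..count-1. The worker crashes exactly once,
    // between the first and second step of handling message `crashAt`.
    static List<Integer> run(Guarantee g, int count, int crashAt) {
        List<Integer> output = new ArrayList<>();
        Set<Integer> seen = new HashSet<>(); // dedup state, used for EFFECTIVELY_ONCE only
        boolean crashed = false;
        int next = 0;                        // next unacknowledged message ID
        while (next < count) {
            int id = next;
            boolean crashNow = (id == crashAt && !crashed);
            if (g == Guarantee.ATMOST_ONCE) {
                next++;                      // step 1: acknowledge first
                if (crashNow) { crashed = true; continue; } // output is lost
                output.add(id);              // step 2: emit output
            } else {
                // step 1: emit output (deduplicated for effectively-once)
                if (g == Guarantee.ATLEAST_ONCE || seen.add(id)) {
                    output.add(id);
                }
                if (crashNow) { crashed = true; continue; } // not acked: redelivered
                next++;                      // step 2: acknowledge
            }
        }
        return output;
    }

    public static void main(String[] args) {
        System.out.println(run(Guarantee.ATMOST_ONCE, 3, 1));      // prints [0, 2]
        System.out.println(run(Guarantee.ATLEAST_ONCE, 3, 1));     // prints [0, 1, 1, 2]
        System.out.println(run(Guarantee.EFFECTIVELY_ONCE, 3, 1)); // prints [0, 1, 2]
    }
}
```

In this model, at-most-once can drop a message, at-least-once can duplicate one, and effectively-once yields each result exactly once despite the redelivery.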
diff --git a/site/docs/latest/functions/metrics-and-stats.md b/site/docs/latest/functions/metrics-and-stats.md
new file mode 100644
index 000000000..90f330657
--- /dev/null
+++ b/site/docs/latest/functions/metrics-and-stats.md
@@ -0,0 +1,19 @@
+---
+title: Metrics and stats for Pulsar Functions
+---
+
+Pulsar Functions can publish arbitrary metrics to the metrics interface (which can then be queried).
+
+## Java API
+
+To publish a metric to the metrics interface:
+
+```java
+void recordMetric(String metricName, double value);
+```
+
+Here's an example:
+
+```java
+context.recordMetric("my-custom-metrics", 475);
+```
\ No newline at end of file
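To make the `recordMetric` call from metrics-and-stats.md above easier to experiment with outside a running cluster, here is a minimal sketch that uses an in-memory stand-in for the context. The `MetricsContext` interface, the `InMemoryMetrics` recorder, and the `input-length` metric name are all hypothetical. How the real runtime aggregates and exposes metrics is not described in the docs above, so the sum-and-count bookkeeping here is an assumption.

```java
import java.util.HashMap;
import java.util.Map;

public class MetricsSketch {
    // Trimmed-down stand-in for Context: only the metrics method is modeled.
    interface MetricsContext {
        void recordMetric(String metricName, double value);
    }

    // Hypothetical recorder that keeps a running sum and count per metric name.
    static class InMemoryMetrics implements MetricsContext {
        final Map<String, Double> sums = new HashMap<>();
        final Map<String, Integer> counts = new HashMap<>();

        @Override
        public void recordMetric(String metricName, double value) {
            sums.merge(metricName, value, Double::sum);
            counts.merge(metricName, 1, Integer::sum);
        }
    }

    // Function body that records the length of each input as a metric
    // and passes the input through unchanged.
    static String process(String input, MetricsContext context) {
        context.recordMetric("input-length", input.length());
        return input;
    }

    public static void main(String[] args) {
        InMemoryMetrics metrics = new InMemoryMetrics();
        process("abc", metrics);
        process("hello", metrics);
        System.out.println(metrics.sums.get("input-length"));   // prints 8.0
        System.out.println(metrics.counts.get("input-length")); // prints 2
    }
}
```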
diff --git a/site/docs/latest/functions/overview.md b/site/docs/latest/functions/overview.md
new file mode 100644
index 000000000..6dff8d478
--- /dev/null
+++ b/site/docs/latest/functions/overview.md
@@ -0,0 +1,91 @@
+---
+title: Pulsar Functions overview
+lead: A bird's-eye look at Pulsar's lightweight, developer-friendly compute platform
+---
+
+
+**Pulsar Functions** are lightweight compute processes that
+
+* consume {% popover messages %} from one or more Pulsar {% popover topics %},
+* apply user-supplied processing logic to each message, and
+* publish the results of the computation to another topic.
+
+Here's an example Pulsar Function for Java:
+
+```java
+import java.util.function.Function;
+
+public class ExclamationFunction implements Function<String, String> {
+    @Override
+    public String apply(String input) { return String.format("%s!", input); }
+}
+```
+
+Functions are executed each time a message is published to the input topic. If a function is listening on the topic `tweet-stream`, for example, then the function would be run each time a message is published to that topic.
+
+> Pulsar features automatic message deduplication
+
+### Goals
+
+The core goal of Pulsar Functions is to enable Pulsar to do real heavy lifting without needing to deploy a neighboring system (Storm, Heron, Flink, etc.), so that ready-made compute infrastructure is at your disposal.
+
+* Developer productivity (easy troubleshooting and deployment)
+  * "Serverless" philosophy
+* No need for a separate stream processing engine (SPE)
+
+### Inspirations
+
+* AWS Lambda, Google Cloud Functions, etc.
+* FaaS
+* Serverless/NoOps philosophy
+
+### Command-line interface
+
+You can manage Pulsar Functions using the [`pulsar-functions`](../../reference/CliTools#pulsar-functions) CLI tool. Here's an example command that would run the `ExclamationFunction` example in local run mode:
+
+```bash
+$ bin/pulsar-functions localrun \
+  --inputs persistent://sample/standalone/ns1/test_src \
+  --output persistent://sample/standalone/ns1/test_result \
+  --jar examples/api-examples.jar \
+  --className org.apache.pulsar.functions.api.examples.ExclamationFunction
+```
+
+### Supported languages
+
+Pulsar Functions can currently be written in [Java](../../functions/api#java) and [Python](../../functions/api#python). Support for additional languages is coming soon.
+
+### Runtime
+
+### Deployment modes
+
+* Local run
+* Cluster run
+
+### Delivery semantics
+
+* At most once
+* At least once
+* Effectively once
+
+### State storage
+
+### Metrics
+
+Here's an example function that publishes a value of 1 to the `my-metric` metric.
+
+```java
+public class MetricsFunction implements PulsarFunction<String, Void> {
+    @Override
+    public Void process(String input, Context context) {
+        context.recordMetric("my-metric", 1);
+        return null;
+    }
+}
+```
+
+### Logging
+
+### Data types
+
+* Strongly typed
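The consume, process, publish cycle described at the top of overview.md above can be tried without a Pulsar cluster using the following sketch. The `runOnce` helper is a hypothetical stand-in for the runtime, the in-memory queue and list stand in for the input and output topics, and treating a `null` result as "publish nothing" is an assumption made for this example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Queue;
import java.util.function.Function;

public class InMemoryPipeline {
    // Same function as the example in overview.md.
    static class ExclamationFunction implements Function<String, String> {
        @Override
        public String apply(String input) { return String.format("%s!", input); }
    }

    // Hypothetical stand-in for the runtime: drains the input "topic",
    // applies the function to each message, and collects the results
    // in a list that plays the role of the output "topic".
    static <I, O> List<O> runOnce(Queue<I> inputTopic, Function<I, O> function) {
        List<O> outputTopic = new ArrayList<>();
        I message;
        while ((message = inputTopic.poll()) != null) {
            O result = function.apply(message);
            if (result != null) { // assumption: null means "publish nothing"
                outputTopic.add(result);
            }
        }
        return outputTopic;
    }

    public static void main(String[] args) {
        Queue<String> input = new ArrayDeque<>(Arrays.asList("hello", "world"));
        System.out.println(runOnce(input, new ExclamationFunction())); // prints [hello!, world!]
    }
}
```

Because the function is a plain `java.util.function.Function`, it can be unit-tested this way before it is deployed with the CLI.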
diff --git a/site/docs/latest/functions/quickstart.md b/site/docs/latest/functions/quickstart.md
new file mode 100644
index 000000000..8c04c6bd4
--- /dev/null
+++ b/site/docs/latest/functions/quickstart.md
@@ -0,0 +1,29 @@
+---
+title: Getting started with Pulsar Functions
+---
+
+## The `pulsar-functions` CLI tool
+
+[`pulsar-functions`](../../reference/CliTools#pulsar-functions)
+
+```bash
+$ alias pulsar-functions='/path/to/pulsar/bin/pulsar-functions'
+```
+
+## Querying state
+
+```bash
+$ bin/pulsar-functions querystate \
+  --tenant sample \
+  --namespace my-functions \
+  --name my-function \
+  --key "some-key"
+```
+
+## Running functions locally
+
+[`localrun`](../../reference/CliTools#pulsar-functions-localrun)
+
+```bash
+$ bin/pulsar-functions localrun
+```
\ No newline at end of file
diff --git a/site/docs/latest/getting-started/ConceptsAndArchitecture.md b/site/docs/latest/getting-started/ConceptsAndArchitecture.md
index 045af3ff9..798934496 100644
--- a/site/docs/latest/getting-started/ConceptsAndArchitecture.md
+++ b/site/docs/latest/getting-started/ConceptsAndArchitecture.md
@@ -288,6 +288,10 @@ With message retention, shown at the top, a <span style="color: #89b557;">retent
 
 With message expiry, shown at the bottom, some messages are <span style="color: #bb3b3e;">deleted</span>, even though they <span style="color: #337db6;">haven't been acknowledged</span>, because they've expired according to the <span style="color: #e39441;">TTL applied to the namespace</span> (for example because a TTL of 5 minutes has been applied and the messages haven't been acknowledged but are 10 minutes old).
 
+## Pulsar Functions
+
+For an in-depth look at Pulsar Functions, see the [Pulsar Functions overview](../../functions/overview).
+
 ## Replication
 
 Pulsar enables messages to be produced and consumed in different geo-locations. For instance, your application may be publishing data in one region or market and you would like to process it for consumption in other regions or markets. [Geo-replication](../../admin/GeoReplication) in Pulsar enables you to do that.