Posted to commits@heron.apache.org by ka...@apache.org on 2018/06/25 14:06:40 UTC
[incubator-heron] branch master updated: Clean up website gen + website python docs (#2928)
This is an automated email from the ASF dual-hosted git repository.
karthikz pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/incubator-heron.git
The following commit(s) were added to refs/heads/master by this push:
new f9d6552 Clean up website gen + website python docs (#2928)
f9d6552 is described below
commit f9d655259bd69b7492af6ee0847f5bc34196f88f
Author: Oliver Bristow <ev...@gmail.com>
AuthorDate: Mon Jun 25 15:06:28 2018 +0100
Clean up website gen + website python docs (#2928)
* Clean up website gen + website python docs
* don't use sudo during `make site` to install pip packages
* update website/config.yaml to reflect latest release
- included newer bazel version as it seems to work for me
* fix python topologies docs
- fixed imports
- tried to make code style more consistent
- changed PEXes to PEXs
* Fix heronpy example links
* Use latest scala rules to fix tests
---
WORKSPACE | 8 +-
website/README.md | 9 +-
website/config.yaml | 4 +-
website/content/docs/developers/python/bolts.md | 48 ++++---
website/content/docs/developers/python/spouts.md | 29 +++--
.../content/docs/developers/python/topologies.md | 140 ++++++++++++---------
website/scripts/python-doc-gen.sh | 11 +-
7 files changed, 136 insertions(+), 113 deletions(-)
diff --git a/WORKSPACE b/WORKSPACE
index 02cd7ef..0e8f20b 100644
--- a/WORKSPACE
+++ b/WORKSPACE
@@ -957,15 +957,17 @@ new_http_archive(
)
# scala integration
-rules_scala_version="5cdae2f034581a05e23c3473613b409de5978833" # update this as needed
+rules_scala_version="669ed8750b77fd99b758298f6467aaa5e6a9dabb" # update this as needed
http_archive(
name = "io_bazel_rules_scala",
- url = "https://github.com/bazelbuild/rules_scala/archive/%s.zip"%rules_scala_version,
+ url = "https://github.com/bazelbuild/rules_scala/archive/%s.zip" % rules_scala_version,
type = "zip",
strip_prefix= "rules_scala-%s" % rules_scala_version,
- sha256 = "bd66b178da5b9b6845f677bdfb2594de8f1050f831a8d69527c6737969376065",
+ sha256 = "1b0f0d7d0cb815116216b0349de0a7d12187dd0d1f4a538f1e7b657d1033a298",
)
load("@io_bazel_rules_scala//scala:scala.bzl", "scala_repositories")
scala_repositories()
+load("@io_bazel_rules_scala//scala:toolchains.bzl", "scala_register_toolchains")
+scala_register_toolchains()
diff --git a/website/README.md b/website/README.md
index 2ee8b12..a5d3ff0 100644
--- a/website/README.md
+++ b/website/README.md
@@ -14,7 +14,7 @@ installed:
* [Make](https://www.gnu.org/software/make/)
* [Node.js](https://nodejs.org/en/)
* [npm](https://www.npmjs.com/)
-* [pip](https://pypi.python.org/pypi/pip)
+* [pip](https://pypi.python.org/pypi/pip) - install `PyYAML>=3.12`
* [Go](https://golang.org) (make sure that your `GOPATH` and `GOROOT` are set)
### macOS setup
@@ -39,8 +39,7 @@ are installed:
1. Navigate to the `website` folder
2. Run `npm install`
-3. Run `make build-static-assets` (this will build all of the necessary static
- assets, i.e. CSS, Javascript, etc.)
+3. Run `make site`
## Building the Docs Locally
@@ -68,9 +67,11 @@ This will run the docs locally on `localhost:1313`. Navigate to
open the browser from the command line:
```bash
-$ open http://localhost:1313/heron
+$ open http://localhost:1313/incubator-heron/
```
+You can edit `.md` files and they will be automatically updated in your browser.
+
## Working on Static Assets
If you'd like to work on the site's static assets (Sass/CSS, JavaScript, etc.),
diff --git a/website/config.yaml b/website/config.yaml
index 600874f..3cd3500 100755
--- a/website/config.yaml
+++ b/website/config.yaml
@@ -36,9 +36,9 @@ params:
author: Twitter, Inc.
description: A realtime, distributed, fault-tolerant stream processing engine from Twitter
versions:
- heron: 0.17.5
+ heron: 0.17.8
bazel: 0.14.1
- heronpy: 0.17.5
+ heronpy: 0.17.8
assets:
favicon:
small: /img/favicon-16x16.png
diff --git a/website/content/docs/developers/python/bolts.md b/website/content/docs/developers/python/bolts.md
index beb220c..f6ba6c3 100644
--- a/website/content/docs/developers/python/bolts.md
+++ b/website/content/docs/developers/python/bolts.md
@@ -8,10 +8,9 @@ title: Implementing Python bolts
Bolts must implement the `Bolt` interface, which has the following methods.
```python
-class Bolt(BaseBolt):
- def initialize(self, config, context)
-
- def process(self, tup)
+class MyBolt(Bolt):
+ def initialize(self, config, context): pass
+ def process(self, tup): pass
```
* The `initialize()` method is called when the bolt is first initialized and
@@ -28,18 +27,15 @@ is equivalent to `execute()` method of `IBolt` interface in Java. You can use
In addition, `BaseBolt` class provides you with the following methods.
```python
-class BaseBolt:
- def emit(self, tup, stream="default", anchors=None, direct_task=None, need_task_ids=False)
- def ack(self, tup)
- def fail(self, tup)
-
- @staticmethod
- def is_tick(tup)
-
- def log(self, message, level=None)
-
- @classmethod
- def spec(cls, name=None, inputs=None, par=1, config=None)
+class BaseBolt(BaseComponent):
+ def emit(self, tup, stream="default", anchors=None, direct_task=None, need_task_ids=False): ...
+ def ack(self, tup): ...
+ def fail(self, tup): ...
+ def log(self, message, level=None): ...
+ @staticmethod
+ def is_tick(tup)
+ @classmethod
+ def spec(cls, name=None, inputs=None, par=1, config=None): ...
```
* The `emit()` method is used to emit a given `tup`, which can be a `list` or `tuple` of
@@ -74,15 +70,17 @@ The following is an example implementation of a bolt in Python.
```python
from collections import Counter
-from heronpy import Bolt
+from heronpy.api.bolt.bolt import Bolt
+
class CountBolt(Bolt):
- outputs = ["word", "count"]
- def initialize(self, config, context):
- self.counter = Counter()
-
- def process(self, tup):
- word = tup.values[0]
- self.counter[word] += 1
- self.emit([word, self.counter[word]])
+ outputs = ["word", "count"]
+
+ def initialize(self, config, context):
+ self.counter = Counter()
+
+ def process(self, tup):
+ word = tup.values[0]
+ self.counter[word] += 1
+ self.emit([word, self.counter[word]])
```
diff --git a/website/content/docs/developers/python/spouts.md b/website/content/docs/developers/python/spouts.md
index 7ef0b61..8e7198a 100644
--- a/website/content/docs/developers/python/spouts.md
+++ b/website/content/docs/developers/python/spouts.md
@@ -8,14 +8,14 @@ title: Implementing Python Spouts
To create a spout for a Heron topology, you need to subclass the [`Spout`](/api/python/spout/spout.m.html#heronpy.spout.spout.Spout) class, which has the following methods.
```python
-class Spout(BaseSpout):
- def initialize(self, config, context)
- def next_tuple(self)
- def ack(self, tup_id)
- def fail(self, tup_id)
- def activate(self)
- def deactivate(self)
- def close(self)
+class MySpout(Spout):
+ def initialize(self, config, context): pass
+ def next_tuple(self): pass
+ def ack(self, tup_id): pass
+ def fail(self, tup_id): pass
+ def activate(self): pass
+ def deactivate(self): pass
+ def close(self): pass
```
## `Spout` class methods
@@ -52,13 +52,11 @@ guarantee that this method is called due to how the instance is killed.
The `Spout` class inherits from the [`BaseSpout`](/api/python/spout/base_spout.m.html#heronpy.spout.base_spout.BaseSpout) class, which also provides you methods you can use in your spouts.
```python
-class BaseSpout:
- def emit(self, tup, tup_id=None, stream="default", direct_task=None, need_task_ids=False)
-
- def log(self, message, level=None)
-
+class BaseSpout(BaseComponent):
+ def log(self, message, level=None): ...
+ def emit(self, tup, tup_id=None, stream="default", direct_task=None, need_task_ids=False): ...
@classmethod
- def spec(cls, name=None, par=1, config=None)
+ def spec(cls, name=None, par=1, config=None): ...
```
* The `emit()` method is used to emit a given tuple, which can be a `list` or `tuple` of any Python objects. Unlike in the Java implementation, there is no `OutputCollector` in the Python implementation.
@@ -84,7 +82,8 @@ The following is an example implementation of a spout in Python.
```python
from itertools import cycle
-from heronpy import Spout
+from heronpy.api.spout.spout import Spout
+
class WordSpout(Spout):
outputs = ['word']
diff --git a/website/content/docs/developers/python/topologies.md b/website/content/docs/developers/python/topologies.md
index c1d781a..6c37406 100644
--- a/website/content/docs/developers/python/topologies.md
+++ b/website/content/docs/developers/python/topologies.md
@@ -2,7 +2,7 @@
title: Python Topologies
---
-> The current version of `py_heron` is [{{% heronpyVersion %}}](https://pypi.python.org/pypi/heronpy/{{% heronpyVersion %}}).
+> The current version of `heronpy` is [{{% heronpyVersion %}}](https://pypi.python.org/pypi/heronpy/{{% heronpyVersion %}}).
Support for developing Heron topologies in Python is provided by a Python library called [`heronpy`](https://pypi.python.org/pypi/heronpy).
@@ -21,7 +21,9 @@ $ easy_install heronpy
Then you can include `heronpy` in your project files. Here's an example:
```python
-from heronpy import Bolt, Spout, Topology
+from heronpy.api.bolt.bolt import Bolt
+from heronpy.api.spout.spout import Spout
+from heronpy.api.topology import Topology
```
## Writing topologies in Python
@@ -37,9 +39,11 @@ Once you've defined spouts and bolts for a topology, you can then compose the to
Here's an example:
```python
- from heronpy import TopologyBuilder
+ #!/usr/bin/env python
+ from heronpy.api.topology import TopologyBuilder
- if __name__ == '__main__':
+
+ if __name__ == "__main__":
builder = TopologyBuilder("MyTopology")
# Add spouts and bolts
builder.build_and_submit()
@@ -50,12 +54,13 @@ Once you've defined spouts and bolts for a topology, you can then compose the to
Here's an example:
```python
+ from heronpy.api.stream import Grouping
+ from heronpy.api.topology import Topology
+
+
class MyTopology(Topology):
- my_spout = MySpout.spec(par=2)
- my_bolt = MyBolt.spec(par=3,
- inputs={
- spout: Grouping.fields('some-input-field')
- })
+ my_spout = WordSpout.spec(par=2)
+ my_bolt = CountBolt.spec(par=3, inputs={spout: Grouping.fields("word")})
```
## Defining topologies using the [`TopologyBuilder`](/api/python/topology.m.html#heronpy.topology.TopologyBuilder) class
@@ -63,7 +68,10 @@ Once you've defined spouts and bolts for a topology, you can then compose the to
If you create a Python topology using a [`TopologyBuilder`](/api/python/topology.m.html#heronpy.topology.TopologyBuilder), you need to instantiate a `TopologyBuilder` inside of a standard Python main function, like this:
```python
-if __name__ == '__main__':
+from heronpy.api.topology import TopologyBuilder
+
+
+if __name__ == "__main__":
builder = TopologyBuilder("MyTopology")
```
@@ -71,8 +79,8 @@ Once you've created a `TopologyBuilder` object, you can add [bolts](../bolts) us
```python
builder = TopologyBuilder("MyTopology")
-builder.add_bolt("my_bolt", MyBolt, par=3)
-builder.add_spout("my_spout", MySpout, par=2)
+builder.add_bolt("my_bolt", CountBolt, par=3)
+builder.add_spout("my_spout", WordSpout, par=2)
```
Both the `add_bolt` and `add_spout` methods return the corresponding [`HeronComponentSpec`](/api/python/component/component_spec.m.html#heronpy.component.component_spec.HeronComponentSpec) object.
@@ -101,17 +109,19 @@ Argument | Data type | Description | Default
The following is an example implementation of a word count topology in Python that subclasses [`TopologyBuilder`](/api/python/topology.m.html#heronpy.topology.TopologyBuilder).
```python
-from heronpy import TopologyBuilder
from your_spout import WordSpout
from your_bolt import CountBolt
+from heronpy.api.stream import Grouping
+from heronpy.api.topology import TopologyBuilder
+
+
if __name__ == "__main__":
builder = TopologyBuilder("WordCountTopology")
+ # piece together the topology
word_spout = builder.add_spout("word_spout", WordSpout, par=2)
-
- count_bolt_input =
- count_bolt = builder.add_bolt("count_bolt", CountBolt, par=2,
- inputs={word_spout: Grouping.fields('word')})
+ count_bolt = builder.add_bolt("count_bolt", CountBolt, par=2, inputs={word_spout: Grouping.fields("word")})
+ # submit the toplogy
builder.build_and_submit()
```
@@ -125,11 +135,12 @@ If you're building a Python topology using a `TopologyBuilder`, you can specify
Here's an example:
```python
-from heronpy import api_constants, TopologyBuilder
+from heronpy.api import api_constants
+from heronpy.api.topology import TopologyBuilder
+
-if __name__ == '__main__':
+if __name__ == "__main__":
topology_config = {
- api_constants.TOPOLOGY_ENABLE_ACKING: True,
api_constants.TOPOLOGY_ENABLE_MESSAGE_TIMEOUTS: True
}
builder = TopologyBuilder("MyTopology")
@@ -151,7 +162,7 @@ $ heron submit local \
Note the `-` in this submission command. If you define a topology by subclassing `TopologyBuilder` you do not need to instruct Heron where your main method is located.
-> #### Example topologies buildable as PEXes
+> #### Example topologies buildable as PEXs
> * See [this repo](https://github.com/streamlio/pants-dev-environment) for an example of a Heron topology written in Python and deployable as a Pants-packaged PEX.
> * See [this repo](https://github.com/streamlio/bazel-dev-environment) for an example of a Heron topology written in Python and deployable as a Bazel-packaged PEX.
@@ -160,16 +171,17 @@ Note the `-` in this submission command. If you define a topology by subclassing
If you create a Python topology by subclassing the [`Topology`](/api/python/topology.m.html#heronpy.topology.Topology) class, you need to create a new topology class, like this:
```python
-from heronpy import Grouping, Topology
-from my_spout import MySpout
-from my_bolt import MyBolt
+from my_spout import WordSpout
+from my_bolt import CountBolt
+
+from heronpy.api.stream import Grouping
+from heronpy.api.topology import Topology
+
class MyTopology(Topology):
- my_spout = MySpout.spec(par=2)
- my_bolt_inputs = {
- my_spout: Grouping.fields('some-input-field')
- }
- my_bolt = MyBolt.spec(par=3, inputs=my_bolt_inputs)
+ my_spout = WordSpout.spec(par=2)
+ my_bolt_inputs = {my_spout: Grouping.fields("word")}
+ my_bolt = CountBolt.spec(par=3, inputs=my_bolt_inputs)
```
All you need to do is place [`HeronComponentSpec`](/api/python/component/component_spec.m.html#heronpy.component.component_spec.HeronComponentSpec)s as the class attributes
@@ -201,13 +213,16 @@ Argument | Data type | Description | Default
Here's an example topology definition with one spout and one bolt:
```python
-from heronpy import Topology
-from your_spout import WordSpout
-from your_bolt import CountBolt
+from my_spout import WordSpout
+from my_bolt import CountBolt
+
+from heronpy.api.stream import Grouping
+from heronpy.api.topology import Topology
+
class WordCount(Topology):
word_spout = WordSpout.spec(par=2)
- count_bolt = CountBolt.spec(par=2, inputs={word_spout: Grouping.fields('word')})
+ count_bolt = CountBolt.spec(par=2, inputs={word_spout: Grouping.fields("word")})
```
### Launching
@@ -231,11 +246,12 @@ If you're building a Python topology by subclassing `Topology`, you can specify
Here's an example:
```python
-from heronpy import api_constants, Topology
+from heronpy.api.topology import Topology
+from heronpy.api import api_constants
+
class MyTopology(Topology):
config = {
- api_constants.TOPOLOGY_ENABLE_ACKING: True,
api_constants.TOPOLOGY_ENABLE_MESSAGE_TIMEOUTS: True
}
# Add bolts and spouts, etc.
@@ -248,8 +264,10 @@ strings for `outputs`, you can specify a list of `Stream` objects, in the follow
```python
class MultiStreamSpout(Spout):
- outputs = [Stream(fields=['normal', 'fields'], name='default'),
- Stream(fields=['error_message'], name='error_stream')]
+ outputs = [
+ Stream(fields=["normal", "fields"], name="default"),
+ Stream(fields=["error_message"], name="error_stream"),
+ ]
```
To select one of these streams as the input for your bolt, you can simply
@@ -258,9 +276,9 @@ stream will be used.
```python
class MultiStreamTopology(Topology):
- spout = MultiStreamSpout.spec()
- error_bolt = ErrorBolt.spec(inputs={spout['error_stream']: Grouping.LOWEST})
- consume_bolt = ConsumeBolt.spec(inputs={spout: Grouping.SHUFFLE})
+ spout = MultiStreamSpout.spec()
+ error_bolt = ErrorBolt.spec(inputs={spout["error_stream"]: Grouping.LOWEST})
+ consume_bolt = ConsumeBolt.spec(inputs={spout: Grouping.SHUFFLE})
```
## Declaring output fields using the `spec()` method
@@ -274,14 +292,14 @@ This is useful in a situation like below.
```python
class IdentityBolt(Bolt):
- # Statically declaring output fields is not allowed
- class process(self, tup):
- emit([tup.values])
+ # Statically declaring output fields is not allowed
+ class process(self, tup):
+ emit([tup.values])
+
class DynamicOutputField(Topology):
- spout = WordSpout.spec()
- bolt = IdentityBolt.spec(inputs={spout: Grouping.ALL},
- optional_outputs=['word'])
+ spout = WordSpout.spec()
+ bolt = IdentityBolt.spec(inputs={spout: Grouping.ALL}, optional_outputs=["word"])
```
You can also declare outputs in the `add_spout()` and the `add_bolt()`
@@ -289,38 +307,38 @@ method for the `TopologyBuilder` in the same way.
## Example topologies
-There are a number of example topologies that you can peruse in the [`heron/examples/src/python`]({{% githubMaster %}}/heron/examples/src/python) directory of the [Heron repo]({{% githubMaster %}}):
+There are a number of example topologies that you can peruse in the [`examples/src/python`]({{% githubMaster %}}/examples/src/python) directory of the [Heron repo]({{% githubMaster %}}):
Topology | File | Description
:--------|:-----|:-----------
-Word count | [`word_count_topology.py`]({{% githubMaster %}}/heron/examples/src/python/word_count_topology.py) | The [`WordSpout`]({{% githubMaster %}}/heron/examples/src/python/spout/word_spout.py) spout emits random words from a list, while the [`CountBolt`]({{% githubMaster %}}/heron/examples/src/python/bolt/count_bolt.py) bolt counts the number of words that have been emitted.
-Multiple streams | [`multi_stream_topology.py`]({{% githubMaster %}}/heron/examples/src/python/multi_stream_topology.py) | The [`MultiStreamSpout`]({{% githubMaster %}}/heron/examples/src/python/spout/multi_stream_spout.py) emits multiple streams to downstream bolts.
-Half acking | [`half_acking_topology.py`]({{% githubMaster %}}/heron/examples/src/python/half_acking_topology.py) | The [`HalfAckBolt`]({{% githubMaster %}}/heron/examples/src/python/bolt/half_ack_bolt.py) acks only half of all received tuples.
-Custom grouping | [`custom_grouping_topology.py`]({{% githubMaster %}}/heron/examples/src/python/custom_grouping_topology.py) | The [`SampleCustomGrouping`]({{% githubMaster %}}/heron/examples/src/python/custom_grouping_topology.py#L26) class provides a custom field grouping.
+Word count | [`word_count_topology.py`]({{% githubMaster %}}/examples/src/python/word_count_topology.py) | The [`WordSpout`]({{% githubMaster %}}/examples/src/python/spout/word_spout.py) spout emits random words from a list, while the [`CountBolt`]({{% githubMaster %}}/examples/src/python/bolt/count_bolt.py) bolt counts the number of words that have been emitted.
+Multiple streams | [`multi_stream_topology.py`]({{% githubMaster %}}/examples/src/python/multi_stream_topology.py) | The [`MultiStreamSpout`]({{% githubMaster %}}/examples/src/python/spout/multi_stream_spout.py) emits multiple streams to downstream bolts.
+Half acking | [`half_acking_topology.py`]({{% githubMaster %}}/examples/src/python/half_acking_topology.py) | The [`HalfAckBolt`]({{% githubMaster %}}/examples/src/python/bolt/half_ack_bolt.py) acks only half of all received tuples.
+Custom grouping | [`custom_grouping_topology.py`]({{% githubMaster %}}/examples/src/python/custom_grouping_topology.py) | The [`SampleCustomGrouping`]({{% githubMaster %}}/examples/src/python/custom_grouping_topology.py#L26) class provides a custom field grouping.
-You can build the respective PEXes for these topologies using the following commands:
+You can build the respective PEXs for these topologies using the following commands:
```shell
-$ bazel build heron/examples/src/python:word_count
-$ bazel build heron/examples/src/python:multi_stream
-$ bazel build heron/examples/src/python:half_acking
-$ bazel build heron/examples/src/python:custom_grouping
+$ bazel build examples/src/python:word_count
+$ bazel build examples/src/python:multi_stream
+$ bazel build examples/src/python:half_acking
+$ bazel build examples/src/python:custom_grouping
```
-All built PEXes will be stored in `bazel-bin/heron/examples/src/python`. You can submit them to Heron like so:
+All built PEXs will be stored in `bazel-bin/examples/src/python`. You can submit them to Heron like so:
```shell
$ heron submit local \
- bazel-bin/heron/examples/src/python/word_count.pex - \
+ bazel-bin/examples/src/python/word_count.pex - \
WordCount
$ heron submit local \
- bazel-bin/heron/examples/src/python/multi_stream.pex \
+ bazel-bin/examples/src/python/multi_stream.pex \
heron.examples.src.python.multi_stream_topology.MultiStream
$ heron submit local \
- bazel-bin/heron/examples/src/python/half_acking.pex - \
+ bazel-bin/examples/src/python/half_acking.pex - \
HalfAcking
$ heron submit local \
- bazel-bin/heron/examples/src/python/custom_grouping.pex \
+ bazel-bin/examples/src/python/custom_grouping.pex \
heron.examples.src.python.custom_grouping_topology.CustomGrouping
```
diff --git a/website/scripts/python-doc-gen.sh b/website/scripts/python-doc-gen.sh
index ba50a43..10ab111 100755
--- a/website/scripts/python-doc-gen.sh
+++ b/website/scripts/python-doc-gen.sh
@@ -1,12 +1,17 @@
#!/bin/bash
+set -e
HERONPY_VERSION=$1
HERON_ROOT_DIR=$(git rev-parse --show-toplevel)
INPUT=heronpy
-TMP_DIR=$(mktemp -d)
+TMP_DIR=$(mktemp --directory)
-sudo pip install heronpy==${HERONPY_VERSION}
-sudo pip install --ignore-installed six
+VENV="$(mktemp --directory)"
+virtualenv "$VENV"
+source "$VENV/bin/activate"
+# TODO: make this a virtualenv
+pip install "heronpy==${HERONPY_VERSION}" "pdoc~=0.3.2"
+pip install --ignore-installed six
mkdir -p static/api && rm -rf static/api/python