Posted to commits@heron.apache.org by GitBox <gi...@apache.org> on 2018/06/25 14:06:40 UTC

[GitHub] kramasamy closed pull request #2928: Clean up website gen + website python docs

URL: https://github.com/apache/incubator-heron/pull/2928

This is a PR merged from a forked repository.
As GitHub hides the original diff on merge, it is displayed below for
the sake of provenance:

diff --git a/WORKSPACE b/WORKSPACE
index 02cd7efd1c..0e8f20b95e 100644
--- a/WORKSPACE
+++ b/WORKSPACE
@@ -957,15 +957,17 @@ new_http_archive(
 )
 
 # scala integration
-rules_scala_version="5cdae2f034581a05e23c3473613b409de5978833" # update this as needed
+rules_scala_version="669ed8750b77fd99b758298f6467aaa5e6a9dabb" # update this as needed
 
 http_archive(
     name = "io_bazel_rules_scala",
-    url = "https://github.com/bazelbuild/rules_scala/archive/%s.zip"%rules_scala_version,
+    url = "https://github.com/bazelbuild/rules_scala/archive/%s.zip" % rules_scala_version,
     type = "zip",
     strip_prefix= "rules_scala-%s" % rules_scala_version,
-    sha256 = "bd66b178da5b9b6845f677bdfb2594de8f1050f831a8d69527c6737969376065",
+    sha256 = "1b0f0d7d0cb815116216b0349de0a7d12187dd0d1f4a538f1e7b657d1033a298",
 )
 
 load("@io_bazel_rules_scala//scala:scala.bzl", "scala_repositories")
 scala_repositories()
+load("@io_bazel_rules_scala//scala:toolchains.bzl", "scala_register_toolchains")
+scala_register_toolchains()
diff --git a/website/README.md b/website/README.md
index 2ee8b126bf..a5d3ff0ca3 100644
--- a/website/README.md
+++ b/website/README.md
@@ -14,7 +14,7 @@ installed:
 * [Make](https://www.gnu.org/software/make/)
 * [Node.js](https://nodejs.org/en/)
 * [npm](https://www.npmjs.com/)
-* [pip](https://pypi.python.org/pypi/pip)
+* [pip](https://pypi.python.org/pypi/pip) - install `PyYAML>=3.12`
 * [Go](https://golang.org) (make sure that your `GOPATH` and `GOROOT` are set)
 
 ### macOS setup
@@ -39,8 +39,7 @@ are installed:
 
 1. Navigate to the `website` folder
 2. Run `npm install`
-3. Run `make build-static-assets` (this will build all of the necessary static
-   assets, i.e. CSS, Javascript, etc.)
+3. Run `make site`
 
 ## Building the Docs Locally
 
@@ -68,9 +67,11 @@ This will run the docs locally on `localhost:1313`. Navigate to
 open the browser from the command line:
 
 ```bash
-$ open http://localhost:1313/heron
+$ open http://localhost:1313/incubator-heron/
 ```
 
+You can edit `.md` files and they will be automatically updated in your browser.
+
 ## Working on Static Assets
 
 If you'd like to work on the site's static assets (Sass/CSS, JavaScript, etc.),
diff --git a/website/config.yaml b/website/config.yaml
index 600874fb69..3cd3500ecc 100755
--- a/website/config.yaml
+++ b/website/config.yaml
@@ -36,9 +36,9 @@ params:
   author: Twitter, Inc.
   description: A realtime, distributed, fault-tolerant stream processing engine from Twitter
   versions:
-    heron: 0.17.5
+    heron: 0.17.8
     bazel: 0.14.1
-    heronpy: 0.17.5
+    heronpy: 0.17.8
   assets:
     favicon:
       small: /img/favicon-16x16.png
diff --git a/website/content/docs/developers/python/bolts.md b/website/content/docs/developers/python/bolts.md
index beb220c316..f6ba6c34b1 100644
--- a/website/content/docs/developers/python/bolts.md
+++ b/website/content/docs/developers/python/bolts.md
@@ -8,10 +8,9 @@ title: Implementing Python bolts
 Bolts must implement the `Bolt` interface, which has the following methods.
 
 ```python
-class Bolt(BaseBolt):
-  def initialize(self, config, context)
-
-  def process(self, tup)
+class MyBolt(Bolt):
+    def initialize(self, config, context): pass
+    def process(self, tup): pass
 ```
 
 * The `initialize()` method is called when the bolt is first initialized and
@@ -28,18 +27,15 @@ is equivalent to `execute()` method of `IBolt` interface in Java. You can use
 In addition, `BaseBolt` class provides you with the following methods.
 
 ```python
-class BaseBolt:
-  def emit(self, tup, stream="default", anchors=None, direct_task=None, need_task_ids=False)
-  def ack(self, tup)
-  def fail(self, tup)
-
-  @staticmethod
-  def is_tick(tup)
-
-  def log(self, message, level=None)
-
-  @classmethod
-  def spec(cls, name=None, inputs=None, par=1, config=None)
+class BaseBolt(BaseComponent):
+    def emit(self, tup, stream="default", anchors=None, direct_task=None, need_task_ids=False): ...
+    def ack(self, tup): ...
+    def fail(self, tup): ...
+    def log(self, message, level=None): ...
+    @staticmethod
+    def is_tick(tup): ...
+    @classmethod
+    def spec(cls, name=None, inputs=None, par=1, config=None): ...
 ```
 
 * The `emit()` method is used to emit a given `tup`, which can be a `list` or `tuple` of
@@ -74,15 +70,17 @@ The following is an example implementation of a bolt in Python.
 
 ```python
 from collections import Counter
-from heronpy import Bolt
+from heronpy.api.bolt.bolt import Bolt
+
 
 class CountBolt(Bolt):
-  outputs = ["word", "count"]
-  def initialize(self, config, context):
-    self.counter = Counter()
-
-  def process(self, tup):
-    word = tup.values[0]
-    self.counter[word] += 1
-    self.emit([word, self.counter[word]])
+    outputs = ["word", "count"]
+
+    def initialize(self, config, context):
+        self.counter = Counter()
+
+    def process(self, tup):
+        word = tup.values[0]
+        self.counter[word] += 1
+        self.emit([word, self.counter[word]])
 ```
diff --git a/website/content/docs/developers/python/spouts.md b/website/content/docs/developers/python/spouts.md
index 7ef0b61140..8e7198a1ce 100644
--- a/website/content/docs/developers/python/spouts.md
+++ b/website/content/docs/developers/python/spouts.md
@@ -8,14 +8,14 @@ title: Implementing Python Spouts
 To create a spout for a Heron topology, you need to subclass the [`Spout`](/api/python/spout/spout.m.html#heronpy.spout.spout.Spout) class, which has the following methods.
 
 ```python
-class Spout(BaseSpout):
-    def initialize(self, config, context)
-    def next_tuple(self)
-    def ack(self, tup_id)
-    def fail(self, tup_id)
-    def activate(self)
-    def deactivate(self)
-    def close(self)
+class MySpout(Spout):
+    def initialize(self, config, context): pass
+    def next_tuple(self): pass
+    def ack(self, tup_id): pass
+    def fail(self, tup_id): pass
+    def activate(self): pass
+    def deactivate(self): pass
+    def close(self): pass
 ```
 
 ## `Spout` class methods
@@ -52,13 +52,11 @@ guarantee that this method is called due to how the instance is killed.
 The `Spout` class inherits from the [`BaseSpout`](/api/python/spout/base_spout.m.html#heronpy.spout.base_spout.BaseSpout) class, which also provides you methods you can use in your spouts.
 
 ```python
-class BaseSpout:
-    def emit(self, tup, tup_id=None, stream="default", direct_task=None, need_task_ids=False)
-
-    def log(self, message, level=None)
-
+class BaseSpout(BaseComponent):
+    def log(self, message, level=None): ...
+    def emit(self, tup, tup_id=None, stream="default", direct_task=None, need_task_ids=False): ...
     @classmethod
-    def spec(cls, name=None, par=1, config=None)
+    def spec(cls, name=None, par=1, config=None): ...
 ```
 
 * The `emit()` method is used to emit a given tuple, which can be a `list` or `tuple` of any Python objects. Unlike in the Java implementation, there is no `OutputCollector` in the Python implementation.
@@ -84,7 +82,8 @@ The following is an example implementation of a spout in Python.
 
 ```python
 from itertools import cycle
-from heronpy import Spout
+from heronpy.api.spout.spout import Spout
+
 
 class WordSpout(Spout):
     outputs = ['word']
diff --git a/website/content/docs/developers/python/topologies.md b/website/content/docs/developers/python/topologies.md
index c1d781a983..6c37406aa9 100644
--- a/website/content/docs/developers/python/topologies.md
+++ b/website/content/docs/developers/python/topologies.md
@@ -2,7 +2,7 @@
 title: Python Topologies
 ---
 
-> The current version of `py_heron` is [{{% heronpyVersion %}}](https://pypi.python.org/pypi/heronpy/{{% heronpyVersion %}}).
+> The current version of `heronpy` is [{{% heronpyVersion %}}](https://pypi.python.org/pypi/heronpy/{{% heronpyVersion %}}).
 
 Support for developing Heron topologies in Python is provided by a Python library called [`heronpy`](https://pypi.python.org/pypi/heronpy).
 
@@ -21,7 +21,9 @@ $ easy_install heronpy
 Then you can include `heronpy` in your project files. Here's an example:
 
 ```python
-from heronpy import Bolt, Spout, Topology
+from heronpy.api.bolt.bolt import Bolt
+from heronpy.api.spout.spout import Spout
+from heronpy.api.topology import Topology
 ```
 
 ## Writing topologies in Python
@@ -37,9 +39,11 @@ Once you've defined spouts and bolts for a topology, you can then compose the to
     Here's an example:
 
     ```python
-    from heronpy import TopologyBuilder
+    #!/usr/bin/env python
+    from heronpy.api.topology import TopologyBuilder
 
-    if __name__ == '__main__':
+
+    if __name__ == "__main__":
         builder = TopologyBuilder("MyTopology")
         # Add spouts and bolts
         builder.build_and_submit()
@@ -50,12 +54,13 @@ Once you've defined spouts and bolts for a topology, you can then compose the to
     Here's an example:
 
     ```python
+    from heronpy.api.stream import Grouping
+    from heronpy.api.topology import Topology
+
+
     class MyTopology(Topology):
-        my_spout = MySpout.spec(par=2)
-        my_bolt = MyBolt.spec(par=3,
-                              inputs={
-                                spout: Grouping.fields('some-input-field')
-                              })
+        my_spout = WordSpout.spec(par=2)
+        my_bolt = CountBolt.spec(par=3, inputs={my_spout: Grouping.fields("word")})
     ```
 
 ## Defining topologies using the [`TopologyBuilder`](/api/python/topology.m.html#heronpy.topology.TopologyBuilder) class
@@ -63,7 +68,10 @@ Once you've defined spouts and bolts for a topology, you can then compose the to
 If you create a Python topology using a [`TopologyBuilder`](/api/python/topology.m.html#heronpy.topology.TopologyBuilder), you need to instantiate a `TopologyBuilder` inside of a standard Python main function, like this:
 
 ```python
-if __name__ == '__main__':
+from heronpy.api.topology import TopologyBuilder
+
+
+if __name__ == "__main__":
     builder = TopologyBuilder("MyTopology")
 ```
 
@@ -71,8 +79,8 @@ Once you've created a `TopologyBuilder` object, you can add [bolts](../bolts) us
 
 ```python
 builder = TopologyBuilder("MyTopology")
-builder.add_bolt("my_bolt", MyBolt, par=3)
-builder.add_spout("my_spout", MySpout, par=2)
+builder.add_bolt("my_bolt", CountBolt, par=3)
+builder.add_spout("my_spout", WordSpout, par=2)
 ```
 
 Both the `add_bolt` and `add_spout` methods return the corresponding [`HeronComponentSpec`](/api/python/component/component_spec.m.html#heronpy.component.component_spec.HeronComponentSpec) object.
@@ -101,17 +109,19 @@ Argument | Data type | Description | Default
 The following is an example implementation of a word count topology in Python that subclasses [`TopologyBuilder`](/api/python/topology.m.html#heronpy.topology.TopologyBuilder).
 
 ```python
-from heronpy import TopologyBuilder
 from your_spout import WordSpout
 from your_bolt import CountBolt
 
+from heronpy.api.stream import Grouping
+from heronpy.api.topology import TopologyBuilder
+
+
 if __name__ == "__main__":
     builder = TopologyBuilder("WordCountTopology")
+    # piece together the topology
     word_spout = builder.add_spout("word_spout", WordSpout, par=2)
-
-    count_bolt_input =
-    count_bolt = builder.add_bolt("count_bolt", CountBolt, par=2,
-                                  inputs={word_spout: Grouping.fields('word')})
+    count_bolt = builder.add_bolt("count_bolt", CountBolt, par=2, inputs={word_spout: Grouping.fields("word")})
+    # submit the topology
     builder.build_and_submit()
 ```
 
@@ -125,11 +135,12 @@ If you're building a Python topology using a `TopologyBuilder`, you can specify
 Here's an example:
 
 ```python
-from heronpy import api_constants, TopologyBuilder
+from heronpy.api import api_constants
+from heronpy.api.topology import TopologyBuilder
+
 
-if __name__ == '__main__':
+if __name__ == "__main__":
     topology_config = {
-        api_constants.TOPOLOGY_ENABLE_ACKING: True,
         api_constants.TOPOLOGY_ENABLE_MESSAGE_TIMEOUTS: True
     }
     builder = TopologyBuilder("MyTopology")
@@ -151,7 +162,7 @@ $ heron submit local \
 
 Note the `-` in this submission command. If you define a topology by subclassing `TopologyBuilder` you do not need to instruct Heron where your main method is located.
 
-> #### Example topologies buildable as PEXes
+> #### Example topologies buildable as PEXs
 > * See [this repo](https://github.com/streamlio/pants-dev-environment) for an example of a Heron topology written in Python and deployable as a Pants-packaged PEX.
 > * See [this repo](https://github.com/streamlio/bazel-dev-environment) for an example of a Heron topology written in Python and deployable as a Bazel-packaged PEX.
 
@@ -160,16 +171,17 @@ Note the `-` in this submission command. If you define a topology by subclassing
 If you create a Python topology by subclassing the [`Topology`](/api/python/topology.m.html#heronpy.topology.Topology) class, you need to create a new topology class, like this:
 
 ```python
-from heronpy import Grouping, Topology
-from my_spout import MySpout
-from my_bolt import MyBolt
+from my_spout import WordSpout
+from my_bolt import CountBolt
+
+from heronpy.api.stream import Grouping
+from heronpy.api.topology import Topology
+
 
 class MyTopology(Topology):
-    my_spout = MySpout.spec(par=2)
-    my_bolt_inputs = {
-        my_spout: Grouping.fields('some-input-field')
-    }
-    my_bolt = MyBolt.spec(par=3, inputs=my_bolt_inputs)
+    my_spout = WordSpout.spec(par=2)
+    my_bolt_inputs = {my_spout: Grouping.fields("word")}
+    my_bolt = CountBolt.spec(par=3, inputs=my_bolt_inputs)
 ```
 
 All you need to do is place [`HeronComponentSpec`](/api/python/component/component_spec.m.html#heronpy.component.component_spec.HeronComponentSpec)s as the class attributes
@@ -201,13 +213,16 @@ Argument | Data type | Description | Default
 Here's an example topology definition with one spout and one bolt:
 
 ```python
-from heronpy import Topology
-from your_spout import WordSpout
-from your_bolt import CountBolt
+from my_spout import WordSpout
+from my_bolt import CountBolt
+
+from heronpy.api.stream import Grouping
+from heronpy.api.topology import Topology
+
 
 class WordCount(Topology):
     word_spout = WordSpout.spec(par=2)
-    count_bolt = CountBolt.spec(par=2, inputs={word_spout: Grouping.fields('word')})
+    count_bolt = CountBolt.spec(par=2, inputs={word_spout: Grouping.fields("word")})
 ```
 
 ### Launching
@@ -231,11 +246,12 @@ If you're building a Python topology by subclassing `Topology`, you can specify
 Here's an example:
 
 ```python
-from heronpy import api_constants, Topology
+from heronpy.api.topology import Topology
+from heronpy.api import api_constants
+
 
 class MyTopology(Topology):
     config = {
-        api_constants.TOPOLOGY_ENABLE_ACKING: True,
         api_constants.TOPOLOGY_ENABLE_MESSAGE_TIMEOUTS: True
     }
     # Add bolts and spouts, etc.
@@ -248,8 +264,10 @@ strings for `outputs`, you can specify a list of `Stream` objects, in the follow
 
 ```python
 class MultiStreamSpout(Spout):
-  outputs = [Stream(fields=['normal', 'fields'], name='default'),
-             Stream(fields=['error_message'], name='error_stream')]
+    outputs = [
+        Stream(fields=["normal", "fields"], name="default"),
+        Stream(fields=["error_message"], name="error_stream"),
+    ]
 ```
 
 To select one of these streams as the input for your bolt, you can simply
@@ -258,9 +276,9 @@ stream will be used.
 
 ```python
 class MultiStreamTopology(Topology):
-  spout = MultiStreamSpout.spec()
-  error_bolt = ErrorBolt.spec(inputs={spout['error_stream']: Grouping.LOWEST})
-  consume_bolt = ConsumeBolt.spec(inputs={spout: Grouping.SHUFFLE})
+    spout = MultiStreamSpout.spec()
+    error_bolt = ErrorBolt.spec(inputs={spout["error_stream"]: Grouping.LOWEST})
+    consume_bolt = ConsumeBolt.spec(inputs={spout: Grouping.SHUFFLE})
 ```
 
 ## Declaring output fields using the `spec()` method
@@ -274,14 +292,14 @@ This is useful in a situation like below.
 
 ```python
 class IdentityBolt(Bolt):
-  # Statically declaring output fields is not allowed
-  class process(self, tup):
-    emit([tup.values])
+    # Statically declaring output fields is not allowed
+    def process(self, tup):
+        self.emit([tup.values])
+
 
 class DynamicOutputField(Topology):
-  spout = WordSpout.spec()
-  bolt = IdentityBolt.spec(inputs={spout: Grouping.ALL},
-                           optional_outputs=['word'])
+    spout = WordSpout.spec()
+    bolt = IdentityBolt.spec(inputs={spout: Grouping.ALL}, optional_outputs=["word"])
 ```
 
 You can also declare outputs in the `add_spout()` and the `add_bolt()`
@@ -289,38 +307,38 @@ method for the `TopologyBuilder` in the same way.
 
 ## Example topologies
 
-There are a number of example topologies that you can peruse in the [`heron/examples/src/python`]({{% githubMaster %}}/heron/examples/src/python) directory of the [Heron repo]({{% githubMaster %}}):
+There are a number of example topologies that you can peruse in the [`examples/src/python`]({{% githubMaster %}}/examples/src/python) directory of the [Heron repo]({{% githubMaster %}}):
 
 Topology | File | Description
 :--------|:-----|:-----------
-Word count | [`word_count_topology.py`]({{% githubMaster %}}/heron/examples/src/python/word_count_topology.py) | The [`WordSpout`]({{% githubMaster %}}/heron/examples/src/python/spout/word_spout.py) spout emits random words from a list, while the [`CountBolt`]({{% githubMaster %}}/heron/examples/src/python/bolt/count_bolt.py) bolt counts the number of words that have been emitted.
-Multiple streams | [`multi_stream_topology.py`]({{% githubMaster %}}/heron/examples/src/python/multi_stream_topology.py) | The [`MultiStreamSpout`]({{% githubMaster %}}/heron/examples/src/python/spout/multi_stream_spout.py) emits multiple streams to downstream bolts.
-Half acking | [`half_acking_topology.py`]({{% githubMaster %}}/heron/examples/src/python/half_acking_topology.py) | The [`HalfAckBolt`]({{% githubMaster %}}/heron/examples/src/python/bolt/half_ack_bolt.py) acks only half of all received tuples.
-Custom grouping | [`custom_grouping_topology.py`]({{% githubMaster %}}/heron/examples/src/python/custom_grouping_topology.py) | The [`SampleCustomGrouping`]({{% githubMaster %}}/heron/examples/src/python/custom_grouping_topology.py#L26) class provides a custom field grouping.
+Word count | [`word_count_topology.py`]({{% githubMaster %}}/examples/src/python/word_count_topology.py) | The [`WordSpout`]({{% githubMaster %}}/examples/src/python/spout/word_spout.py) spout emits random words from a list, while the [`CountBolt`]({{% githubMaster %}}/examples/src/python/bolt/count_bolt.py) bolt counts the number of words that have been emitted.
+Multiple streams | [`multi_stream_topology.py`]({{% githubMaster %}}/examples/src/python/multi_stream_topology.py) | The [`MultiStreamSpout`]({{% githubMaster %}}/examples/src/python/spout/multi_stream_spout.py) emits multiple streams to downstream bolts.
+Half acking | [`half_acking_topology.py`]({{% githubMaster %}}/examples/src/python/half_acking_topology.py) | The [`HalfAckBolt`]({{% githubMaster %}}/examples/src/python/bolt/half_ack_bolt.py) acks only half of all received tuples.
+Custom grouping | [`custom_grouping_topology.py`]({{% githubMaster %}}/examples/src/python/custom_grouping_topology.py) | The [`SampleCustomGrouping`]({{% githubMaster %}}/examples/src/python/custom_grouping_topology.py#L26) class provides a custom field grouping.
 
-You can build the respective PEXes for these topologies using the following commands:
+You can build the respective PEXs for these topologies using the following commands:
 
 ```shell
-$ bazel build heron/examples/src/python:word_count
-$ bazel build heron/examples/src/python:multi_stream
-$ bazel build heron/examples/src/python:half_acking
-$ bazel build heron/examples/src/python:custom_grouping
+$ bazel build examples/src/python:word_count
+$ bazel build examples/src/python:multi_stream
+$ bazel build examples/src/python:half_acking
+$ bazel build examples/src/python:custom_grouping
 ```
 
-All built PEXes will be stored in `bazel-bin/heron/examples/src/python`. You can submit them to Heron like so:
+All built PEXs will be stored in `bazel-bin/examples/src/python`. You can submit them to Heron like so:
 
 ```shell
 $ heron submit local \
-  bazel-bin/heron/examples/src/python/word_count.pex - \
+  bazel-bin/examples/src/python/word_count.pex - \
   WordCount
 $ heron submit local \
-  bazel-bin/heron/examples/src/python/multi_stream.pex \
+  bazel-bin/examples/src/python/multi_stream.pex \
   heron.examples.src.python.multi_stream_topology.MultiStream
 $ heron submit local \
-  bazel-bin/heron/examples/src/python/half_acking.pex - \
+  bazel-bin/examples/src/python/half_acking.pex - \
   HalfAcking
 $ heron submit local \
-  bazel-bin/heron/examples/src/python/custom_grouping.pex \
+  bazel-bin/examples/src/python/custom_grouping.pex \
   heron.examples.src.python.custom_grouping_topology.CustomGrouping
 ```
 
diff --git a/website/scripts/python-doc-gen.sh b/website/scripts/python-doc-gen.sh
index ba50a43da7..10ab111d32 100755
--- a/website/scripts/python-doc-gen.sh
+++ b/website/scripts/python-doc-gen.sh
@@ -1,12 +1,17 @@
 #!/bin/bash
+set -e
 
 HERONPY_VERSION=$1
 HERON_ROOT_DIR=$(git rev-parse --show-toplevel)
 INPUT=heronpy
-TMP_DIR=$(mktemp -d)
+TMP_DIR=$(mktemp --directory)
 
-sudo pip install heronpy==${HERONPY_VERSION}
-sudo pip install --ignore-installed six
+VENV="$(mktemp --directory)"
+virtualenv "$VENV"
+source "$VENV/bin/activate"
+# TODO: make this a virtualenv
+pip install "heronpy==${HERONPY_VERSION}" "pdoc~=0.3.2"
+pip install --ignore-installed six
 
 mkdir -p static/api && rm -rf static/api/python
 


 

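Note on the docs examples in this diff: they wire `CountBolt` to `WordSpout` with `Grouping.fields("word")` and a parallelism above 1. As a rough illustration of why per-instance counters can still yield correct per-word counts, the plain-Python sketch below (it deliberately does not import `heronpy`) routes each word to a simulated bolt instance by hashing the grouping field. The function names and the CRC32 hash are assumptions made for illustration; Heron's actual routing logic is not part of this diff.

```python
import zlib
from collections import Counter


def fields_grouping_task(value, parallelism):
    """Map a grouping-field value to a task index.

    Hypothetical stand-in: CRC32 is used only because it is stable across
    runs; it is not claimed to be the hash Heron uses.
    """
    return zlib.crc32(value.encode("utf-8")) % parallelism


def run_word_count(words, parallelism=3):
    """Simulate `parallelism` CountBolt instances fed via a fields grouping."""
    # One Counter per simulated bolt instance, mirroring self.counter in CountBolt.
    counters = [Counter() for _ in range(parallelism)]
    for word in words:
        task = fields_grouping_task(word, parallelism)
        counters[task][word] += 1
    return counters


words = ["hello", "world", "hello", "heron", "world", "hello"]
counters = run_word_count(words)

# Merging the per-instance counters gives the overall word counts.
totals = sum(counters, Counter())
print(totals)
```

Because the task index depends only on the word, every occurrence of a given word lands on the same simulated instance, so no two instances ever hold partial counts for the same word.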