Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2021/09/14 00:00:57 UTC

[GitHub] [beam] robertwb commented on a change in pull request #15499: [BEAM-12876] Adding doc and glossary entry for resource hints

robertwb commented on a change in pull request #15499:
URL: https://github.com/apache/beam/pull/15499#discussion_r707797444



##########
File path: website/www/site/content/en/documentation/runtime/resource-hints.md
##########
@@ -0,0 +1,85 @@
+---
+title: "Resource hints"
+---
+<!--
+Licensed under the Apache License, Version 2.0 (the "License");
+you may not use this file except in compliance with the License.
+You may obtain a copy of the License at
+
+http://www.apache.org/licenses/LICENSE-2.0
+
+Unless required by applicable law or agreed to in writing, software
+distributed under the License is distributed on an "AS IS" BASIS,
+WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+See the License for the specific language governing permissions and
+limitations under the License.
+-->
+
+# Resource hints
+
+Resource hints let pipeline authors provide information to a runner about compute resource requirements. You can use resource hints to define requirements for specific transforms or for an entire pipeline. The runner is responsible for interpreting resource hints, and runners can ignore unsupported hints.
+
+Resource hints can be nested. For example, resource hints can be specified on subtransforms of a composite transform, and that composite transform can also have resource hints applied. By default, the innermost hint takes precedence. However, hints can define custom reconciliation behavior. For example,  `min_ram` takes the maximum value for all `min_ram` values set on a given step in the pipeline.
+
+{{< language-switcher java py >}}
+
+## Available hints
+
+Currently, Beam supports the following resource hints:
+
+* `min_ram="numberXB"`: The minimum amount of RAM to allocate to workers. Beam can parse various byte units, including MB, GB, MiB, and GiB (for example, `min_ram="4GB"`). This hint is intended to provide advisory minimal memory requirements for processing a transform.
+* `accelerator="hint"`: This hint is intended to describe a hardware accelerator to use for processing a transform. For example, the following is valid accelerator syntax for the Dataflow runner: `accelerator="type:<type>;count:<n>;<options>"`
+
+The syntax of resource hints can vary between runners. For an example implementation, see the [Dataflow resource hints](https://cloud.google.com/dataflow/docs/guides/right-fitting#available_resource_hints).

Review comment:
       The syntax for specifying resource hints does not vary between runners. I might say something like "the interpretation and actuation of resource hints can vary between runners."
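
As an aside on the reconciliation rule in the quoted hunk (the innermost hint wins by default, while `min_ram` takes the maximum), here is a minimal Python sketch. It is not Beam's implementation: `parse_bytes`, `reconcile`, the unit table, and the accelerator strings are hypothetical names and values chosen for illustration, and it assumes decimal multipliers for MB/GB and binary multipliers for MiB/GiB.

```python
import re

# Hypothetical unit table for illustration only; Beam's own parser may
# accept a different set of units.
_UNITS = {
    "B": 1,
    "KB": 10**3, "MB": 10**6, "GB": 10**9,
    "KiB": 2**10, "MiB": 2**20, "GiB": 2**30,
}


def parse_bytes(text):
    """Convert a size string such as '4GB' or '512MiB' to a byte count."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]+)\s*", text)
    if match is None or match.group(2) not in _UNITS:
        raise ValueError(f"unrecognized size: {text!r}")
    return int(float(match.group(1)) * _UNITS[match.group(2)])


def reconcile(outer, inner):
    """Merge hints set on an enclosing transform with hints set on a nested one."""
    merged = dict(outer)
    for name, value in inner.items():
        if name == "min_ram" and name in outer:
            # Custom rule for min_ram: keep whichever request is larger.
            merged[name] = max(outer[name], value, key=parse_bytes)
        else:
            # Default rule: the innermost hint takes precedence.
            merged[name] = value
    return merged
```

With a composite transform that sets `{"min_ram": "8GB", "accelerator": "outer"}` and a subtransform that sets `{"min_ram": "4GiB", "accelerator": "inner"}`, `reconcile` keeps the larger memory request from the composite and the accelerator value from the subtransform.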

##########
File path: website/www/site/content/en/documentation/runtime/resource-hints.md
##########
@@ -0,0 +1,85 @@
+## Specifying resource hints for a pipeline
+
+To specify resource hints for an entire pipeline, you can use command line options. The following command shows the basic syntax.
+
+{{< highlight java >}}
+mvn compile exec:java -Dexec.mainClass=com.example.MyPipeline \
+    -Dexec.args="... \
+                 --resourceHints=min_ram=<N>GB \
+                 --resourceHints=accelerator='hint'" \
+    -Pdirect-runner
+{{< /highlight >}}
+{{< highlight py >}}
+python my_pipeline.py \
+    ... \
+    --resource_hints min_ram=<N>GB \
+    --resource_hints accelerator="hint"
+{{< /highlight >}}
+
+{{< paragraph class="language-java" >}}
+With the Java SDK, you can also specify pipeline-scoped hints programmatically using [ResourceHintsOptions](https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/resourcehints/ResourceHintsOptions.java#L30).

Review comment:
       This is not exclusive to Java; it is how pipeline options of all types work. I would drop this paragraph and, above, write "you can use pipeline options" in place of "you can use command line options" (with the code snippets continuing to specify the pipeline options on the command line).
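
To illustrate the point that these are ordinary options rather than something tied to the shell, the sketch below uses Python's standard `argparse` as a stand-in for an SDK's option parser. It is not Beam code: `parse_hints` is a hypothetical helper, and treating the flag as repeatable (`action="append"`) is an assumption based on the quoted snippets, where `--resource_hints` appears twice.

```python
import argparse


def parse_hints(argv):
    """Collect repeated --resource_hints name=value flags into a dict."""
    parser = argparse.ArgumentParser()
    # Stand-in for a pipeline option. The flag name is taken from the quoted
    # snippet; the append behavior is an assumption for this sketch.
    parser.add_argument("--resource_hints", action="append", default=[])
    # Ignore unrelated options (such as a runner selection) instead of failing.
    known, _unknown = parser.parse_known_args(argv)
    hints = {}
    for item in known.resource_hints:
        name, _sep, value = item.partition("=")
        hints[name] = value
    return hints
```

Because the input is just a list of strings, the same list can be built in program code and handed to the parser, which is the sense in which setting options on the command line and setting them programmatically are the same mechanism.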

##########
File path: website/www/site/content/en/documentation/runtime/resource-hints.md
##########
@@ -0,0 +1,85 @@
+{{< paragraph class="language-java" >}}
+With the Java SDK, you can also specify pipeline-scoped hints programmatically using [ResourceHintsOptions](https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/resourcehints/ResourceHintsOptions.java#L30).
+{{< /paragraph >}}
+
+## Specifying resource hints for a transform
+
+{{< paragraph class="language-java" >}}
+You can set resource hints programmatically on pipeline transforms using [ResourceHints](https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/resourcehints/ResourceHints.java#L37).

Review comment:
       It would probably be better to reference https://beam.apache.org/releases/javadoc/2.32.0/org/apache/beam/sdk/transforms/PTransform.html#setResourceHints-org.apache.beam.sdk.transforms.resourcehints.ResourceHints-

##########
File path: website/www/site/content/en/documentation/runtime/resource-hints.md
##########
@@ -0,0 +1,85 @@
+## Specifying resource hints for a transform
+
+{{< paragraph class="language-java" >}}
+You can set resource hints programmatically on pipeline transforms using [ResourceHints](https://github.com/apache/beam/blob/master/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/resourcehints/ResourceHints.java#L37).
+{{< /paragraph >}}
+
+{{< paragraph class="language-py" >}}
+You can set resource hints programmatically on pipeline transforms using [PTransforms.with_resource_hints](https://github.com/apache/beam/blob/dd20b4fd7547d5421eeae7ef0d1d62c3e3d6727a/sdks/python/apache_beam/transforms/ptransform.py#L421) (also see [ResourceHint](https://github.com/apache/beam/blob/master/sdks/python/apache_beam/transforms/resources.py#L51)).

Review comment:
       Similarly, consider referring to the docs rather than the code for the first link.



