Posted to commits@beam.apache.org by da...@apache.org on 2022/11/15 14:10:37 UTC

[beam] branch master updated: Editorial review of the ML notebooks. (#24125)

This is an automated email from the ASF dual-hosted git repository.

damccorm pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/beam.git


The following commit(s) were added to refs/heads/master by this push:
     new 0f4ca6363b3 Editorial review of the ML notebooks. (#24125)
0f4ca6363b3 is described below

commit 0f4ca6363b3ce0e5de3ad36517bb406aa6391a18
Author: Rebecca Szper <98...@users.noreply.github.com>
AuthorDate: Tue Nov 15 06:10:13 2022 -0800

    Editorial review of the ML notebooks. (#24125)
    
    * Editorial review of the ML notebooks.
    
    * Editorial review of the ML notebooks.
    
    * Editorial review of the ML notebooks.
    
    * Update examples/notebooks/beam-ml/custom_remote_inference.ipynb
    
    Co-authored-by: Danny McCormick <da...@google.com>
    
    * Updating based on feedback
    
    * Update examples/notebooks/beam-ml/run_inference_sklearn.ipynb
    
    Co-authored-by: Danny McCormick <da...@google.com>
    
    * Update examples/notebooks/beam-ml/run_inference_tensorflow.ipynb
    
    Co-authored-by: Danny McCormick <da...@google.com>
    
    * Update examples/notebooks/beam-ml/run_inference_tensorflow.ipynb
    
    Co-authored-by: Danny McCormick <da...@google.com>
    
    * Update examples/notebooks/beam-ml/run_inference_tensorflow.ipynb
    
    Co-authored-by: Danny McCormick <da...@google.com>
    
    * Updating based on feedback
    
    Co-authored-by: Danny McCormick <da...@google.com>
---
 examples/notebooks/beam-ml/README.md               |  23 +--
 .../beam-ml/custom_remote_inference.ipynb          | 108 +++++++----
 .../beam-ml/dataframe_api_preprocessing.ipynb      | 211 ++++++++++-----------
 .../notebooks/beam-ml/run_custom_inference.ipynb   |  73 ++++---
 .../beam-ml/run_inference_multi_model.ipynb        | 120 +++++++-----
 .../notebooks/beam-ml/run_inference_pytorch.ipynb  | 141 +++++++-------
 .../run_inference_pytorch_tensorflow_sklearn.ipynb | 129 ++++++++-----
 .../notebooks/beam-ml/run_inference_sklearn.ipynb  |  80 ++++----
 .../beam-ml/run_inference_tensorflow.ipynb         | 114 +++++------
 9 files changed, 541 insertions(+), 458 deletions(-)

diff --git a/examples/notebooks/beam-ml/README.md b/examples/notebooks/beam-ml/README.md
index d93b5ef8b7f..673537e7663 100644
--- a/examples/notebooks/beam-ml/README.md
+++ b/examples/notebooks/beam-ml/README.md
@@ -18,27 +18,28 @@
 -->
 # ML Sample Notebooks
 
-As of Beam 2.40 users now have access to a
+Starting with the Apache Beam SDK version 2.40, users have access to a
 [RunInference](https://beam.apache.org/releases/pydoc/current/apache_beam.ml.inference.base.html#apache_beam.ml.inference.base.RunInference)
 transform.
 
-This allows inferences or predictions of on data for
-popular ML frameworks like TensorFlow, PyTorch and
-scikit-learn.
+This transform allows you to make predictions and inference on data with machine learning (ML) models.
+The model handler abstracts the user from the configuration needed for
+specific frameworks, such as Tensorflow, PyTorch, and others. For a full list of supported frameworks,
+see the Apache Beam [Machine Learning](https://beam.apache.org/documentation/sdks/python-machine-learning) page.
 
 ## Using The Notebooks
 
-These notebooks illustrate usages of Beam's RunInference, as well as different
-usages of implementations of [ModelHandler](https://beam.apache.org/releases/pydoc/current/apache_beam.ml.inference.base.html#apache_beam.ml.inference.base.ModelHandler).
-Beam comes with various implementations of ModelHandler.
+These notebooks illustrate ways to use Apache Beam's RunInference transforms, as well as different
+use cases for [ModelHandler](https://beam.apache.org/releases/pydoc/current/apache_beam.ml.inference.base.html#apache_beam.ml.inference.base.ModelHandler) implementations.
+Beam comes with [multiple ModelHandler implementations](https://beam.apache.org/documentation/sdks/python-machine-learning/#modify-a-pipeline-to-use-an-ml-model).
 
 ### Loading the Notebooks
 
-1. A quick way to get started is with [Colab](https://colab.sandbox.google.com/).
-2. Load the notebook from github, for example:
+1. To get started quickly with notebooks, use [Colab](https://colab.sandbox.google.com/).
+2. In Colab, open the notebook from GitHub using the notebook URL, for example:
 ```
-https://github.com/apache/beam/blob/master/examples/notebooks/beam-ml/run_inference_tensorflow.ipynb.
+https://github.com/apache/beam/blob/master/examples/notebooks/beam-ml/run_inference_tensorflow.ipynb
 ```
 
-3. To run most notebooks, you will need to change the GCP project and bucket
+3. To run most notebooks, you need to change the Google Cloud project and bucket
 to your project and bucket.
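
A minimal sketch of the RunInference usage this README describes, assuming a scikit-learn model pickled at a hypothetical Cloud Storage path:

```python
import apache_beam as beam
import numpy
from apache_beam.ml.inference.base import RunInference
from apache_beam.ml.inference.sklearn_inference import SklearnModelHandlerNumpy

# Hypothetical path; point this at your own pickled scikit-learn model.
model_handler = SklearnModelHandlerNumpy(model_uri='gs://your-bucket/model.pkl')

with beam.Pipeline() as p:
    (p
     | 'CreateExamples' >> beam.Create([numpy.array([1.0, 2.0])])
     | 'RunInference' >> RunInference(model_handler)
     | 'PrintResults' >> beam.Map(print))
```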
diff --git a/examples/notebooks/beam-ml/custom_remote_inference.ipynb b/examples/notebooks/beam-ml/custom_remote_inference.ipynb
index 713c6559965..036a9d39d4e 100644
--- a/examples/notebooks/beam-ml/custom_remote_inference.ipynb
+++ b/examples/notebooks/beam-ml/custom_remote_inference.ipynb
@@ -34,13 +34,17 @@
         "id": "0UGzzndTBPWQ"
       },
       "source": [
-        "# Remote inference in Beam\n",
+        "# Remote inference in Apache Beam\n",
         "\n",
-        "The prefered way of running inference in Beam is by using the [RunInference API](https://beam.apache.org/documentation/sdks/python-machine-learning/). The RunInference API enables you to run your models as part of your pipeline in a way that is optimized for machine learning inference. It supports features such as batching, so that you do not need to take care of it yourself. For more info on the RunInference API you can check out the [RunInference notebook](https://github.com/a [...]
+        "The prefered way to run inference in Apache Beam is by using the [RunInference API](https://beam.apache.org/documentation/sdks/python-machine-learning/). \n",
+        "The RunInference API enables you to run your models as part of your pipeline in a way that is optimized for machine learning inference. \n",
+        "To reduce the number of steps that you need to take, RunInference supports features like batching. For more infomation about the RunInference API, review the [RunInference API](https://beam.apache.org/releases/pydoc/current/apache_beam.ml.inference.html#apache_beam.ml.inference.RunInference), \n",
+        "which demonstrates how to implement model inference in PyTorch, scikit-learn, and TensorFlow.\n",
         "\n",
-        "As of now, RunInference API doesn't support making remote inference calls (e.g. Natural Language API, Cloud Vision API and others). Therefore, in order to use these remote APIs with Beam, one needs to write custom inference call. \n",
+        "Currently, the RunInference API doesn't support making remote inference calls using the Natural Language API, Cloud Vision API, and so on. \n",
+        "Therefore, to use these remote APIs with Apache Beam, you need to write custom inference calls.\n",
         "\n",
-        "This notebook shows how you can implement such a custom inference call in Beam. We are using Cloud Vision API for demonstration. \n"
+        "This notebook shows how to implement a custom inference call in Apache Beam. This example uses the Google Cloud Vision API."
       ]
     },
     {
@@ -49,10 +53,10 @@
         "id": "GNbarEZsalS1"
       },
       "source": [
-        "## Use case: run Cloud Vision API\n",
+        "## Use case: run the Cloud Vision API\n",
         "\n",
-        "The Cloud Vision API can be used to retrieve labels that describe an image.\n",
-        "For example:"
+        "You can use the Cloud Vision API to retrieve labels that describe an image.\n",
+        "For example, the following image shows a lion with possible labels."
       ]
     },
     {
@@ -70,21 +74,23 @@
         "id": "4io1vzkzF683"
       },
       "source": [
-        "We want to run the Google Cloud Vision API on a large set of images. Beam is the ideal tool to handle this. In this notebook we will show how to retrieve image labels with this API on a small set of images.\n",
+        "We want to run the Google Cloud Vision API on a large set of images, and Apache Beam is the ideal tool to handle this workflow.\n",
+        "This example notebook demonstates how to retrieve image labels with this API on a small set of images.\n",
         "\n",
-        "The steps needed to implement this are shown in the notebook:\n",
-        "* read the images\n",
-        "* batch your images together to optimize your model call\n",
-        "* send your images to an external API to run inference\n",
-        "* post-process the results of your API\n",
+        "The notebook follows these steps to implement this workflow:\n",
+        "* Read the images.\n",
+        "* Batch the images together to optimize the model call.\n",
+        "* Send the images to an external API to run inference.\n",
+        "* Post-process the results of your API.\n",
         "\n",
-        "⚠️  beware of API quotas and the heavy load you might incur on your external API. Make sure you have set up your pipeline and API correctly for your use case.\n",
+        "**Caution:** Be aware of API quotas and the heavy load you might incur on your external API. Verify that your pipeline and API are configured correctly for your use case.\n",
         "\n",
-        "For optimizing the calls to external API, you can confgure [PipelineOptions](https://beam.apache.org/documentation/programming-guide/#configuring-pipeline-options) to limit the parallel calls to the external remote API. Different Runners in Beam provide options to handle the parallelism, for example:\n",
-        "* [DirectRunner](https://beam.apache.org/documentation/runners/direct/) provides `direct_num_workers`.\n",
-        "* [DataflowRunner](https://beam.apache.org/documentation/runners/dataflow/) provides `max_num_workers`.\n",
+        "To optimize the calls to the external API, limit the parallel calls to the external remote API by configuring [PipelineOptions](https://beam.apache.org/documentation/programming-guide/#configuring-pipeline-options).\n",
+        "In Apache Beam, different runners provide options to handle the parallelism, for example:\n",
+        "* With the [Direct Runner](https://beam.apache.org/documentation/runners/direct/), use `direct_num_workers`.\n",
+        "* With the [Google Cloud Dataflow Runner](https://beam.apache.org/documentation/runners/dataflow/), use `max_num_workers`.\n",
         "\n",
-        "You can find details about other runners here: [Link](https://beam.apache.org/documentation/runners/capability-matrix/) "
+        "For information about other runners, see the [Beam capability matrix](https://beam.apache.org/documentation/runners/capability-matrix/) "
       ]
     },
     {
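
A minimal sketch of capping parallelism through pipeline options, assuming a local run on the Direct Runner (the Dataflow equivalent is shown as a comment):

```python
from apache_beam.options.pipeline_options import PipelineOptions

# Limit the Direct Runner to two workers to throttle calls to the remote API.
options = PipelineOptions(
    direct_num_workers=2,
    direct_running_mode='multi_processing')

# On the Dataflow runner, cap autoscaling instead:
# options = PipelineOptions(max_num_workers=10)
```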
@@ -93,7 +99,9 @@
         "id": "FAawWOaiIYaS"
       },
       "source": [
-        "## Installation"
+        "## Installation\n",
+        "\n",
+        "This section provides installation steps."
       ]
     },
     {
@@ -102,7 +110,7 @@
         "id": "XhpKOxINrIqz"
       },
       "source": [
-        "Install dependencies"
+        "First, download and install the dependencies."
       ]
     },
     {
@@ -119,7 +127,7 @@
         "!pip install google-cloud-vision==3.1.1\n",
         "!pip install requests\n",
         "\n",
-        "# restart the runtime in order to use newly installed versions\n",
+        "# To use the newly installed version, restart the runtime.\n",
         "exit() "
       ]
     },
@@ -129,7 +137,7 @@
         "id": "C-RVR2eprc0r"
       },
       "source": [
-        "Authenticate with Google so that you will be able to use the Cloud Vision API."
+        "To use the Cloud Vision API, authenticate with Google Cloud."
       ]
     },
     {
@@ -140,7 +148,7 @@
       },
       "outputs": [],
       "source": [
-        "# Follow the steps to configure your GCP setup\n",
+        "# Follow the steps to configure your Google Cloup setup.\n",
         "!gcloud init --console-only"
       ]
     },
@@ -162,7 +170,9 @@
         "id": "mL4MaHm_XOVd"
       },
       "source": [
-        "## Remote inference on Google Cloud vision API"
+        "## Remote inference on Cloud Vision API\n",
+        "\n",
+        "This section demonstates the steps to run remote inference on the Cloud Vision API."
       ]
     },
     {
@@ -189,7 +199,8 @@
         "id": "09k08IYlLmON"
       },
       "source": [
-        "For this use case we have selected some images part of the [MSCoco dataset](https://cocodataset.org/#explore), as a list of image urls. This is what we will use as input for our pipeline."
+        "For this example, we use images from the [MSCoco dataset](https://cocodataset.org/#explore) as a list of image urls.\n",
+        "This data is used as the pipeline input."
       ]
     },
     {
@@ -225,17 +236,18 @@
       "source": [
         "### Custom DoFn\n",
         "\n",
-        "In order to implement remote inference, we must create our own DoFn class. This class will be responsible to send a batch of images to the Cloud vision API.\n",
+        "In order to implement remote inference, create a DoFn class. This class sends a batch of images to the Cloud vision API.\n",
         "\n",
-        "The custom DoFn allows us to initialize our API, or in case of a custom model, a model can also be loaded in the `setup` function. \n",
+        "The custom DoFn makes it possible to initialize the API. In case of a custom model, a model can also be loaded in the `setup` function. \n",
         "\n",
-        "The `process` function is the most interesting part. In this function we need to implement the actual model call and return its results.\n",
+        "The `process` function is the most interesting part. In this function we implement the model call and return its results.\n",
         "\n",
-        "⚠️ When running remote inference, you must be prepared to encounter, identify, and handle failure as gracefully as possible. We recommend using the following techniques: \n",
+        "**Caution:** When running remote inference, prepare to encounter, identify, and handle failure as gracefully as possible. We recommend using the following techniques: \n",
         "\n",
-        "* Exponential backoff: Retrying failed remote calls with exponentially growing pauses between retries. Using exponential backoff ensures that failures don't lead to an overwhelming number of retries in quick succession. \n",
+        "* **Exponential backoff:** Retry failed remote calls with exponentially growing pauses between retries. Using exponential backoff ensures that failures don't lead to an overwhelming number of retries in quick succession. \n",
         "\n",
-        "* Dead letter queues: Routing failed inferences to a separate PCollection without failing the whole transform. This allows you to continue execution without failing the job (batch jobs' default behavior) or retrying indefinitely (streaming jobs' default behavior). You can then run custom pipeline logic on the deadletter queue to log the failure, alert, and push the failed message to temporary storage so that it can eventually be reprocessed. "
+        "* **Dead letter queues:** Route failed inferences to a separate `PCollection` without failing the whole transform. You can continue execution without failing the job (batch jobs' default behavior) or retrying indefinitely (streaming jobs' default behavior).\n",
+        "You can then run custom pipeline logic on the deadletter queue to log the failure, alert, and push the failed message to temporary storage so that it can eventually be reprocessed. "
       ]
     },
     {
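
A sketch of both techniques, assuming a hypothetical `call_remote_api` function and an `images` PCollection; failed elements are routed to a tagged output instead of failing the pipeline:

```python
import time

import apache_beam as beam
from apache_beam import pvalue


class RemoteInferenceDoFn(beam.DoFn):
    MAX_RETRIES = 3

    def process(self, element):
        for attempt in range(self.MAX_RETRIES):
            try:
                # call_remote_api is a hypothetical stand-in for the batched API call.
                yield call_remote_api(element)
                return
            except Exception:
                # Exponential backoff: pause 1s, 2s, 4s, ... between retries.
                time.sleep(2 ** attempt)
        # After exhausting retries, emit the element to the dead-letter output.
        yield pvalue.TaggedOutput('failed', element)


results = images | beam.ParDo(RemoteInferenceDoFn()).with_outputs(
    'failed', main='inferences')
inferences, failed = results.inferences, results.failed
```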
@@ -257,15 +269,15 @@
         "    feature = Feature()\n",
         "    feature.type_ = Feature.Type.LABEL_DETECTION\n",
         "\n",
-        "    # list of image_urls\n",
+        "    # The list of image_urls\n",
         "    image_urls = [image_url for (image_url, image_bytes) in images_batch]\n",
         "\n",
-        "    # create a batch request for all images in the batch\n",
+        "    # Create a batch request for all images in the batch.\n",
         "    images = [vision.Image(content=image_bytes) for (image_url, image_bytes) in images_batch]\n",
         "    image_requests = [vision.AnnotateImageRequest(image=image, features=[feature]) for image in images]\n",
         "    batch_image_request = vision.BatchAnnotateImagesRequest(requests=image_requests)\n",
         "\n",
-        "    # send batch request to remote endpoint\n",
+        "    # Send batch request to the remote endpoint.\n",
         "    responses = self._client.batch_annotate_images(request=batch_image_request).responses\n",
         "    \n",
         "    return list(zip(image_urls, responses))\n"
@@ -279,15 +291,18 @@
       "source": [
         "### Batching\n",
         "\n",
-        "Before we can chain all the different steps together in a pipeline, there is one more thing we need to understand: batching. When running inference with your model (both in Beam itself or in an external API), you can batch your input together to allow for more efficient execution of your model. When using a custom DoFn, you need to take care of the batching yourself, in contrast with the RunInference API which takes care of this for you.\n",
+        "Before we can chain together the pipeline steps, we need to understand batching.\n",
+        "When running inference with your model, either in Apache Beam or in an external API, you can batch your input to increase the efficiency of the model execution.\n",
+        "When using a custom DoFn, as in this example, you need to manage the batching.\n",
         "\n",
-        "In order to achieve this in our pipeline: we will introduce one more step in our pipeline, a `BatchElements` transform that will group elements together to form a batch of the desired size.\n",
+        "To manage the batching in this pipeline, include a `BatchElements` transform to group elements together and form a batch of the desired size.\n",
         "\n",
-        "⚠️ If you have a streaming pipeline, you may considering using [GroupIntoBatches](https://beam.apache.org/documentation/transforms/python/aggregation/groupintobatches/) as `BatchElements` doesn't batch things across bundles. `GroupIntoBatches` requires choosing a key within which things are batched.\n",
+        "* If you have a streaming pipeline, consider using [GroupIntoBatches](https://beam.apache.org/documentation/transforms/python/aggregation/groupintobatches/)\n",
+        "because `BatchElements` doesn't batch items across bundles. `GroupIntoBatches` requires choosing a key within which items are batched.\n",
         "\n",
-        "⚠️ When batching make sure that the input batch matches the max payload of the external API.  \n",
+        "* When batching, make sure that the input batch matches the maximum payload of the external API.  \n",
         "\n",
-        "⚠️ If you are designing your own API endpoint, then make sure that it can handle batches. \n",
+        "* If you are designing your own API endpoint, make sure that it can handle batches. \n",
         "\n",
         "  "
       ]
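
A sketch of both batching options; the batch sizes and the key function are illustrative:

```python
import apache_beam as beam

# Batch pipelines: BatchElements picks batch sizes within the given bounds.
batched = images | beam.BatchElements(min_batch_size=4, max_batch_size=16)

# Streaming pipelines: key the elements first, then batch within each key.
keyed_batches = (
    images
    | beam.Map(lambda image: (hash(image[0]) % 10, image))  # illustrative key
    | beam.GroupIntoBatches(batch_size=16))
```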
@@ -298,9 +313,17 @@
         "id": "4sXHwZk9Url2"
       },
       "source": [
-        "### Create pipeline\n",
+        "### Create the pipeline\n",
+        "\n",
+        "This section demonstrates how to chain the steps together to do the following:\n",
+        "\n",
+        "* Read data.\n",
+        "\n",
+        "* Transform the data to fit the model input.\n",
+        "\n",
+        "* Run remote inference.\n",
         "\n",
-        "Now we can chain the different steps all together to read data, transform it to fit the model input, run remote inference and finally process and display the results."
+        "* Process and display the results."
       ]
     },
     {
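
A sketch of how these steps might chain together. The `read_image_bytes` and `format_labels` helpers are hypothetical, and `CloudVisionDoFn` stands for the custom DoFn defined earlier:

```python
import apache_beam as beam

with beam.Pipeline() as p:
    (p
     | 'ReadImageUrls' >> beam.Create(image_urls)
     | 'FetchImages' >> beam.Map(read_image_bytes)  # url -> (url, image_bytes)
     | 'BatchImages' >> beam.BatchElements(min_batch_size=4, max_batch_size=16)
     | 'RemoteInference' >> beam.ParDo(CloudVisionDoFn())
     | 'PostProcess' >> beam.FlatMap(format_labels)  # (url, response) -> rows
     | 'PrintResults' >> beam.Map(print))
```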
@@ -598,7 +621,8 @@
       "source": [
         "### Metrics\n",
         "\n",
-        "You should consider monitoring and measuring performance of a pipeline when deploying since monitoring can provide insight into the status and health of the application. See [RunInference Metrics](https://beam.apache.org/documentation/ml/runinference-metrics/) for an example of the types of metrics you may want to consider tracking."
+        "Because monitoring can provide insight into the status and health of the application, consider monitoring and measuring pipeline performance.\n",
+        "For information about the available tracking metrics, see [RunInference Metrics](https://beam.apache.org/documentation/ml/runinference-metrics/)."
       ]
     },
     {
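
A small sketch of tracking a custom metric inside a DoFn; the counter namespace and name are your own choices:

```python
import apache_beam as beam
from apache_beam.metrics import Metrics


class CountingDoFn(beam.DoFn):
    def __init__(self):
        # Counts how many API requests the pipeline issues.
        self.request_counter = Metrics.counter('remote_inference', 'api_requests')

    def process(self, element):
        self.request_counter.inc()
        yield element
```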
diff --git a/examples/notebooks/beam-ml/dataframe_api_preprocessing.ipynb b/examples/notebooks/beam-ml/dataframe_api_preprocessing.ipynb
index 0dbd0e66ddf..b77a6356982 100644
--- a/examples/notebooks/beam-ml/dataframe_api_preprocessing.ipynb
+++ b/examples/notebooks/beam-ml/dataframe_api_preprocessing.ipynb
@@ -3,7 +3,7 @@
     {
       "cell_type": "code",
       "source": [
-        "#@title ###### Licensed to the Apache Software Foundation (ASF), Version 2.0 (the \"License\")\n",
+        "# @title ###### Licensed to the Apache Software Foundation (ASF), Version 2.0 (the \"License\")\n",
         "\n",
         "# Licensed to the Apache Software Foundation (ASF) under one\n",
         "# or more contributor license agreements. See the NOTICE file\n",
@@ -20,11 +20,11 @@
         "# \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY\n",
         "# KIND, either express or implied. See the License for the\n",
         "# specific language governing permissions and limitations\n",
-        "# under the License."
+        "# under the License"
       ],
       "metadata": {
-        "id": "sARMhsXz8yR1",
-        "cellView": "form"
+        "cellView": "form",
+        "id": "sARMhsXz8yR1"
       },
       "execution_count": null,
       "outputs": []
@@ -32,29 +32,29 @@
     {
       "cell_type": "markdown",
       "source": [
-        "# Overview\n",
+        "# Preprocessing with the Apache Beam DataFrames API\n",
         "\n",
-        "One of the most common tools used for data exploration and pre-processing is [pandas DataFrames](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html). Pandas has become very popular for its ease of use. It has very intuitive methods to perform common analytical tasks and data pre-processing. \n",
+        "[Pandas DataFrames](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html) is one of the most common tools used for data exploration and preprocessing. Pandas is popular because of its ease of use. It has intuitive methods to perform common analytical tasks and data preprocessing. \n",
         "\n",
-        "Pandas loads all of the data into memory on a single machine (one node) for rapid execution. This works well when dealing with small-scale datasets. However, many projects involve datasets that can grow too big to fit in memory. These use cases generally require the usage of parallel data processing frameworks such as Apache Beam.\n",
+        "For rapid execution, Pandas loads all of the data into memory on a single machine (one node). This configuration works well when dealing with small-scale datasets. However, many projects involve datasets that are too big to fit in memory. These use cases generally require parallel data processing frameworks, such as Apache Beam.\n",
         "\n",
         "\n",
-        "## Beam DataFrames\n",
+        "## Apache Beam DataFrames\n",
         "\n",
         "\n",
         "Beam DataFrames provide a pandas-like\n",
         "API to declare and define Beam processing pipelines. It provides a familiar interface for machine learning practioners to build complex data-processing pipelines by only invoking standard pandas commands.\n",
         "\n",
-        "> ℹ️ To learn more about Beam DataFrames, take a look at the\n",
+        "To learn more about Apache Beam DataFrames, see the\n",
         "[Beam DataFrames overview](https://beam.apache.org/documentation/dsls/dataframes/overview) page.\n",
         "\n",
         "## Goal\n",
-        "The goal of this notebook is to explore a dataset preprocessed it for machine learning model training using the Beam DataFrames API.\n",
+        "The goal of this notebook is to explore a dataset preprocessed with the Beam DataFrame API for machine learning model training.\n",
         "\n",
         "\n",
         "## Tutorial outline\n",
         "\n",
-        "In this notebook, we walk through the use of the Beam DataFrames API to perform common data exploration as well as pre-processing steps that are necessary to prepare your dataset for machine learning model training and inference, such as:  \n",
+        "This notebook demonstrates the use of the Apache Beam DataFrames API to perform common data exploration as well as the preprocessing steps that are necessary to prepare your dataset for machine learning model training and inference. These steps include the following:  \n",
         "\n",
         "*   Removing unwanted columns.\n",
         "*   One-hot encoding categorical columns.\n",
@@ -69,9 +69,9 @@
     {
       "cell_type": "markdown",
       "source": [
-        "# Installation\n",
+        "## Installation\n",
         "\n",
-        "As we want to explore the elements within a `PCollection`, we can make use of the the Interactive runner by installing Apache Beam with the `interactive` component. The latest implemented DataFrames API methods invoked in this notebook are available in Beam <b>2.43</b> or later.\n"
+        "To explore the elements within a `PCollection`, install Apache Beam with the `interactive` component to use the Interactive runner. The latest implemented DataFrames API methods invoked in this notebook are available in Apache Beam SDK versions 2.43 and later.\n"
       ],
       "metadata": {
         "id": "A0f2HJ22D4lt"
@@ -83,7 +83,7 @@
         "id": "pCjwrwNWnuqI"
       },
       "source": [
-        "Install latest version"
+        "Install the latest Apache Beam SDK version."
       ]
     },
     {
@@ -105,12 +105,12 @@
     {
       "cell_type": "markdown",
       "source": [
-        "# Part I : Local exploration with the Interactive Beam runner\n",
-        "We first use the [Interactive Beam](https://beam.apache.org/releases/pydoc/2.20.0/apache_beam.runners.interactive.interactive_beam.html) to explore and develop our pipeline.\n",
-        "This allows us to test our code interactively, building out the pipeline as we go before deploying it on a distributed runner. \n",
+        "## Part I : Local exploration with the Interactive Beam runner\n",
+        "Start by using the [Interactive Beam](https://beam.apache.org/releases/pydoc/2.20.0/apache_beam.runners.interactive.interactive_beam.html) to explore and develop your pipeline.\n",
+        "This runner allows you to test the code interactively, progressively building out the pipeline before deploying it on a distributed runner. \n",
         "\n",
         "\n",
-        "> ℹ️ In this section, we will only be working with a subset of the original dataset since we're only using the the compute resources of the notebook instance.\n"
+        "This section uses a subset of the original dataset, because the notebook instance has limited compute resources.\n"
       ],
       "metadata": {
         "id": "3NO6RgB7GkkE"
@@ -122,14 +122,14 @@
         "id": "5I3G094hoB1P"
       },
       "source": [
-        "# Loading the data\n",
+        "### Load the data\n",
         "\n",
-        "Pandas has the\n",
+        "To read CSV files into Dataframes, Pandas has the\n",
         "[`pandas.read_csv`](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.read_csv.html)\n",
-        "function to easily read CSV files into DataFrames.\n",
-        "We're using the beam\n",
+        "function.\n",
+        "This notebook uses the Beam\n",
         "[`beam.dataframe.io.read_csv`](https://beam.apache.org/releases/pydoc/current/apache_beam.dataframe.io.html#apache_beam.dataframe.io.read_csv)\n",
-        "function that emulates `pandas.read_csv`. The main difference between them is that the beam method returns a deferred Beam DataFrame while pandas return a standard DataFrame.\n"
+        "function, which emulates `pandas.read_csv`. The main difference is that the Beam function returns a deferred Beam DataFrame whereas the Pandas function returns a standard DataFrame.\n"
       ]
     },
     {
@@ -150,12 +150,12 @@
         "from apache_beam.runners.interactive.interactive_runner import InteractiveRunner\n",
         "from apache_beam.runners.dataflow import DataflowRunner\n",
         "\n",
-        "# Available options: [sample_1000, sample_10000, sample_100000, full] where\n",
-        "# sample contains all of the dataset (around 1000000 samples)\n",
+        "# Available options: [sample_1000, sample_10000, sample_100000, full], where\n",
+        "# sample contains the entire dataset (around 1000000 samples).\n",
         "\n",
         "source_csv_file = 'gs://apache-beam-samples/nasa_jpl_asteroid/sample_10000.csv'\n",
         "\n",
-        "# Initialize pipline\n",
+        "# Initialize the pipeline.\n",
         "p = beam.Pipeline(InteractiveRunner())\n",
         "\n",
         "beam_df = p | beam.dataframe.io.read_csv(source_csv_file)\n"
@@ -167,21 +167,18 @@
         "id": "paf7yf3YpCh8"
       },
       "source": [
-        "# Data pre-processing\n",
+        "### Preprocess the data\n",
         "\n",
-        "## Dataset description \n",
-        "\n",
-        "### [NASA - Nearest Earth Objects dataset](https://cneos.jpl.nasa.gov/ca/)\n",
-        "There are an innumerable number of objects in the outer space. Some of them are closer than we think. Even though we might think that a distance of 70,000 Km can not potentially harm us, but at an astronomical scale, this is a very small distance and can disrupt many natural phenomena. \n",
-        "\n",
-        "These objects/asteroids can thus prove to be harmful. Hence, it is wise to know what is surrounding us and what can harm us amongst those. Thus, this dataset compiles the list of NASA certified asteroids that are classified as the nearest earth object."
+        "This example uses the [NASA - Nearest Earth Objects dataset](https://cneos.jpl.nasa.gov/ca/).\n",
+        "This dataset includes information about objects in the outer space. Some objects are close enough to Earth to cause harm.\n",
+        "Therefore, this dataset compiles the list of NASA certified asteroids that are classified as the nearest earth objects to understand which objects pose a risk."
       ]
     },
     {
       "cell_type": "markdown",
       "source": [
         "\n",
-        "Let's first inspect the columns of our dataset and their types"
+        "Inspect the dataset columns and their types."
       ],
       "metadata": {
         "id": "cvAu5T0ENjuQ"
@@ -229,7 +226,7 @@
     {
       "cell_type": "markdown",
       "source": [
-        "When using Interactive Beam, we can use `ib.collect()` to bring a Beam DataFrame into local memory as a Pandas DataFrame."
+        "When using Interactive Beam, to bring a Beam DataFrame into local memory as a Pandas DataFrame, use `ib.collect()`."
       ],
       "metadata": {
         "id": "1Wa6fpbyQige"
@@ -663,11 +660,11 @@
     {
       "cell_type": "markdown",
       "source": [
-        "We can see that our datasets consists of both:\n",
+        "The datasets contain the following two types of columns:\n",
         "\n",
-        "* **Numerical columns:** These columns need to be transformed through [normalization](https://developers.google.com/machine-learning/data-prep/transform/normalization) before they can be used for training a machine learning model.\n",
+        "* **Numerical columns:** Use [normalization](https://developers.google.com/machine-learning/data-prep/transform/normalization) to transform these columns so that they can be used to train a machine learning model.\n",
         "\n",
-        "* **Categorical columns:** We need to transform those columns with [one-hot encoding](https://developers.google.com/machine-learning/data-prep/transform/transform-categorical) to use them during training. \n"
+        "* **Categorical columns:** Transform those columns with [one-hot encoding](https://developers.google.com/machine-learning/data-prep/transform/transform-categorical) to use them during training. \n"
       ],
       "metadata": {
         "id": "8jV9odKhNyF2"
@@ -676,7 +673,7 @@
     {
       "cell_type": "markdown",
       "source": [
-        "We can also explore use the standard pandas command `DataFrame.describe()` to generate descriptive statistics for the numerical columns like percentile, mean, std, etc. "
+        "Use the standard pandas command `DataFrame.describe()` to generate descriptive statistics for the numerical columns like percentile, mean, std, and so on. "
       ],
       "metadata": {
         "id": "MGAErO0lAYws"
@@ -1007,21 +1004,21 @@
         "id": "D9uJtHLSSAMC"
       },
       "source": [
-        "Before executing any transformations, we need to check if all the columns need to be used for model training. Let's first have a look at the column description as provided by the [JPL website](https://ssd.jpl.nasa.gov/sbdb_query.cgi):\n",
+        "Before running any transformations, verify that all of the columns need to be used for model training. Start by looking at the column description provided by the [JPL website](https://ssd.jpl.nasa.gov/sbdb_query.cgi):\n",
         "\n",
         "* **spk_id:** Object primary SPK-ID\n",
         "* **full_name:** Asteroid name\n",
         "* **near_earth_object:** Near-earth object flag\n",
-        "* **absolute_magnitude:** the apparent magnitude an object would have if it were located at a distance of 10 parsecs.\n",
-        "* **diameter:** object diameter (from equivalent sphere) km Unit\n",
-        "* **albedo:** a measure of the diffuse reflection of solar radiation out of the total solar radiation and measured on a scale from 0 to 1.\n",
-        "* **diameter_sigma:** 1-sigma uncertainty in object diameter km Unit.\n",
-        "* **eccentricity:** value between 0 and 1 that referes to how flat or round the shape of the asteroid is  \n",
-        "* **inclination:** angle with respect to x-y ecliptic plane\n",
-        "* **moid_ld:** Earth Minimum Orbit Intersection Distance au Unit\n",
-        "* **object_class:** the classification of the asteroid. Checkout this [link](https://pdssbn.astro.umd.edu/data_other/objclass.shtml) for a more detailed description.\n",
-        "* **Semi-major axis au Unit:** the length of half of the long axis in AU unit\n",
-        "* **hazardous_flag:** Hazardous Asteroid Flag"
+        "* **absolute_magnitude:** The apparent magnitude an object would have if it were located at a distance of 10 parsecs.\n",
+        "* **diameter:** Object diameter (from equivalent sphere) km unit.\n",
+        "* **albedo:** A measure of the diffuse reflection of solar radiation out of the total solar radiation and measured on a scale from 0 to 1.\n",
+        "* **diameter_sigma:** 1-sigma uncertainty in object diameter km unit.\n",
+        "* **eccentricity:** A value between 0 and 1 that refers to how flat or round the asteroid is  \n",
+        "* **inclination:** The angle with respect to the x-y ecliptic plane\n",
+        "* **moid_ld:** Earth Minimum Orbit Intersection Distance au unit\n",
+        "* **object_class:** The classification of the asteroid. For a more detailed description, see [NASA object classifications](https://pdssbn.astro.umd.edu/data_other/objclass.shtml).\n",
+        "* **Semi-major axis au Unit:** The length of half of the long axis in AU unit.\n",
+        "* **hazardous_flag:** Identifies hazardous asteroids."
       ]
     },
     {
@@ -1030,7 +1027,7 @@
         "id": "DzYVKbwTp72d"
       },
       "source": [
-        "Columns **'spk_id'** and **'full_name'** are unique for each row.  These columns can be removed since they are not needed for model training."
+        "The **'spk_id'** and **'full_name'** columns are unique for each row. You can remove these columns, because they are not needed for model training."
       ]
     },
     {
@@ -1050,7 +1047,7 @@
         "id": "fRvNyahSuX_y"
       },
       "source": [
-        "Let's have a look at the number of missing values"
+        "Review the number of missing values."
       ]
     },
     {
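
One way to count the missing values per column, assuming the deferred DataFrame supports these pandas operations:

```python
# Count the missing values in each column.
ib.collect(beam_df.isnull().sum())
```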
@@ -1156,7 +1153,7 @@
         "id": "00MRdFGLwQiD"
       },
       "source": [
-        "It can be observed that most of the columns do not have missing values. However, columns **'diameter'**, **'albedo'** and **'diameter_sigma'** have many missing values. Since these values cannot be measured or derived, we can remove them since they will not be required for training the machine learning model."
+        "Most of the columns do not have missing values. However, the columns **'diameter'**, **'albedo'** and **'diameter_sigma'** have many missing values. Because these values cannot be measured or derived and aren't needed for training the ML model, remove the columns."
       ]
     },
     {
@@ -1499,13 +1496,23 @@
         "ib.collect(beam_df)"
       ]
     },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "v03ABuXJKEmv"
+      },
+      "source": [
+        "### Normalize the data"
+      ]
+    },
     {
       "cell_type": "markdown",
       "metadata": {
         "id": "a3PojL3WBqgE"
       },
       "source": [
-        "Next, we need to normalize the numerical columns before using them to train a model. A common method of standarization is to subtract the mean and divide by standard deviation (a.k.a [z-score](https://developers.google.com/machine-learning/data-prep/transform/normalization#z-score)). This improves the performance and training stability of the model during training and inference.\n"
+        "Next, normalize the numerical columns so that they can be used to train a model. To standarize the data, you can subtract the mean and divide by the standard deviation. This process is also known as finding the [z-score](https://developers.google.com/machine-learning/data-prep/transform/normalization#z-score).\n",
+        "This step improves the performance and training stability of the model during training and inference.\n"
       ]
     },
     {
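
The z-score is the same deferred pandas expression that the full pipeline below uses:

```python
# z-score: subtract the mean and divide by the standard deviation.
normalized = (beam_df_numericals - beam_df_numericals.mean()) / beam_df_numericals.std()
```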
@@ -1514,7 +1521,7 @@
         "id": "sZ2_gB8wENF1"
       },
       "source": [
-        "Let's first get both the the numerical columns and categorical columns"
+        "First, retrieve both the numerical columns and the categorical columns."
       ]
     },
     {
@@ -1528,16 +1535,7 @@
         "numerical_cols = beam_df.select_dtypes(include=np.number).columns.tolist()\n",
         "categorical_cols = list(set(beam_df.columns) - set(numerical_cols))"
       ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {
-        "id": "v03ABuXJKEmv"
-      },
-      "source": [
-        "Normalizing the data"
-      ]
-    },
+    },
     {
       "cell_type": "code",
       "execution_count": null,
@@ -1846,10 +1844,10 @@
         }
       ],
       "source": [
-        "# Get numerical columns\n",
+        "# Get the numerical columns.\n",
         "beam_df_numericals = beam_df.filter(items=numerical_cols)\n",
         "\n",
-        "# Standarize dataframes only with numerical columns\n",
+        "# Standarize DataFrames with only the numerical columns.\n",
         "beam_df_numericals = (beam_df_numericals - beam_df_numericals.mean())/beam_df_numericals.std()\n",
         "\n",
         "ib.collect(beam_df_numericals)"
@@ -1861,7 +1859,7 @@
         "id": "qdNILsajFvex"
       },
       "source": [
-        "Next, we need to convert the categorical columns into one-hot encoded variables to use them during training. \n"
+        "Convert the categorical columns into one-hot encoded variables to use them during training.\n"
       ]
     },
     {
@@ -1874,12 +1872,12 @@
       "source": [
         "def get_one_hot_encoding(df: pd.DataFrame, categorical_col:list) -> pd.DataFrame:\n",
         "  beam_df_categorical= beam_df[categorical_col]\n",
-        "  # Get unique values\n",
+        "  # Get unique values.\n",
         "  with dataframe.allow_non_parallel_operations():\n",
         "    unique_classes = pd.CategoricalDtype(ib.collect(beam_df_categorical.unique(as_series=True)))\n",
-        "  # Use `str.get_dummies()` to get the one-hot encoded representation of the categorical columns\n",
+        "  # Use `str.get_dummies()` to get the one-hot encoded representation of the categorical columns.\n",
         "  beam_df_categorical = beam_df_categorical.astype(unique_classes).str.get_dummies()\n",
-        "  # Add column name prefix to the newly created categorical columns\n",
+        "  # Add a column name prefix to the newly created categorical columns.\n",
         "  beam_df_categorical = beam_df_categorical.add_prefix(f'{categorical_col}_')\n",
         "\n",
         "  return beam_df_categorical"
@@ -2594,11 +2592,11 @@
         "id": "rVdSIyCB0spw"
       },
       "source": [
-        "# Putting it all together\n",
+        "### Run the pipeline\n",
         "\n",
-        "Let's now try to summarize all the steps that we've executed above into a full pipeline implementation and visualize our pre-processed data.\n",
+        "This section combines the previous steps into a full pipeline implementation, and then visualizes the preprocessed data.\n",
         "\n",
-        "> ℹ️ Note that the only standard Beam method invoked here is the `pipeline` instance. The rest of the pre-processing commands are all based on native pandas methods that have been integrated with the Beam DataFrame API. "
+        "Note that the only standard Apache Beam method invoked here is the `pipeline` instance. The rest of the preprocessing commands are based on native Pandas methods that are integrated with the Apache Beam DataFrame API."
       ]
     },
     {
@@ -3306,28 +3304,28 @@
         }
       ],
       "source": [
-        "# Specify the location of source csv file to be processed\n",
+        "# Specify the location of the source CSV file.\n",
         "source_csv_file = 'gs://apache-beam-samples/nasa_jpl_asteroid/sample_10000.csv'\n",
         "\n",
-        "# Initialize pipline\n",
+        "# Initialize the pipeline.\n",
         "p = beam.Pipeline(InteractiveRunner())\n",
         "\n",
-        "# Create a deferred Beam DataFrame with the contents of our csv file.\n",
+        "# Create a deferred Apache Beam DataFrame with the contents of the CSV file.\n",
         "beam_df = p | beam.dataframe.io.read_csv(source_csv_file)\n",
         "\n",
-        "# Drop irrelavant columns/columns with missing values\n",
+        "# Drop irrelevant columns and columns with missing values.\n",
         "beam_df = beam_df.drop(['spk_id', 'full_name','diameter', 'albedo', 'diameter_sigma'], axis='columns', inplace=False)\n",
         "\n",
-        "# Get numerical columns/columns with categorical variables\n",
+        "# Get numerical columns and columns with categorical values.\n",
         "numerical_cols = beam_df.select_dtypes(include=np.number).columns.tolist()\n",
         "categorical_cols = list(set(beam_df.columns) - set(numerical_cols))\n",
         "\n",
-        "# Normalize the numerical variables \n",
+        "# Normalize the numerical values.\n",
         "beam_df_numericals = beam_df.filter(items=numerical_cols)\n",
         "beam_df_numericals = (beam_df_numericals - beam_df_numericals.mean())/beam_df_numericals.std()\n",
         "\n",
         "\n",
-        "# One-hot encode the categorical variables \n",
+        "# One-hot encode the categorical values.\n",
         "for categorical_col in categorical_cols:\n",
         "  beam_df_categorical= get_one_hot_encoding(df=beam_df, categorical_col=categorical_col)\n",
         "  beam_df_numericals = beam_df_numericals.merge(beam_df_categorical, left_index = True, right_index = True)\n",
@@ -3341,8 +3339,9 @@
         "id": "xZvJTqa3XKI_"
       },
       "source": [
-        "# Part II : Process the full dataset with the Distributed Runner\n",
-        "Now that we've showcased how to build and execute the pipeline locally using the Interactive Runner. It's time to execute our pipeline on our full dataset by switching to a distributed runner. For this example, we will exectue our pipeline on [Dataflow](https://cloud.google.com/dataflow/docs/guides/deploying-a-pipeline)."
+        "## Part II : Process the full dataset with the distributed runner\n",
+        "The previous section demonstrates how to build and execute the pipeline locally using the interactive runner.\n",
+        "This section demonstrates how to run the pipeline on the full dataset by switching to a distributed runner. For this example, the pipeline runs on [Dataflow](https://cloud.google.com/dataflow/docs/guides/deploying-a-pipeline)."
       ]
     },
     {
@@ -3362,9 +3361,9 @@
     {
       "cell_type": "markdown",
       "source": [
-        "> ℹ️ Note that we are now processing the full dataset `full.csv` that containts approximately 1 million rows. We're also writing the results to a `csv` file instead of using `ib.collect()` to materialize the deferred dataframe.\n",
+        "These steps process the full dataset, `full.csv`, which contains approximately one million rows. To materialize the deferred dataframe, these steps also write the results to a CSV file instead of using `ib.collect()`.\n",
         "\n",
-        "> ℹ️ The only things we need to change to switch from an interactive runner towards a distributed one are the pipeline options. The rest of the pipeline steps are exactly identical."
+        "To switch from an interactive runner to a distributed runner, update the pipeline options. The rest of the pipeline steps don't change."
       ],
       "metadata": {
         "id": "Qk1GaYoSc9-1"
@@ -3373,40 +3372,40 @@
     {
       "cell_type": "code",
       "source": [
-        "# Specify the location of source csv file to be processed (full dataset)\n",
+        "# Specify the location of the source CSV file (the full dataset).\n",
         "source_csv_file = 'gs://apache-beam-samples/nasa_jpl_asteroid/full.csv'\n",
         "\n",
-        "# Build a new pipeline that will execute on Dataflow.\n",
+        "# Build a new pipeline that runs on Dataflow.\n",
         "p = beam.Pipeline(DataflowRunner(),\n",
         "                  options=beam.options.pipeline_options.PipelineOptions(\n",
         "                      project=PROJECT_ID,\n",
         "                      region=REGION,\n",
         "                      temp_location=TEMP_DIR,\n",
-        "                      # Disable autoscaling for a quicker demo\n",
+        "                      # To speed up the demo, disable autoscaling.\n",
         "                      autoscaling_algorithm='NONE',\n",
         "                      num_workers=10))\n",
         "\n",
-        "# Create a deferred Beam DataFrame with the contents of our csv file.\n",
+        "# Create a deferred Apache Beam DataFrame with the contents of the CSV file.\n",
         "beam_df = p | beam.dataframe.io.read_csv(source_csv_file)\n",
         "\n",
-        "# Drop irrelavant columns/columns with missing values\n",
+        "# Drop irrelevant columns and columns with missing values.\n",
         "beam_df = beam_df.drop(['spk_id', 'full_name','diameter', 'albedo', 'diameter_sigma'], axis='columns', inplace=False)\n",
         "\n",
-        "# Get numerical columns/columns with categorical variables\n",
+        "# Get numerical columns and columns with categorical values.\n",
         "numerical_cols = beam_df.select_dtypes(include=np.number).columns.tolist()\n",
         "categorical_cols = list(set(beam_df.columns) - set(numerical_cols))\n",
         "\n",
-        "# Normalize the numerical variables \n",
+        "# Normalize the numerical values. \n",
         "beam_df_numericals = beam_df.filter(items=numerical_cols)\n",
         "beam_df_numericals = (beam_df_numericals - beam_df_numericals.mean())/beam_df_numericals.std()\n",
         "\n",
         "\n",
-        "# One-hot encode the categorical variables \n",
+        "# One-hot encode the categorical values. \n",
         "for categorical_col in categorical_cols:\n",
         "  beam_df_categorical= get_one_hot_encoding(df=beam_df, categorical_col=categorical_col)\n",
         "  beam_df_numericals = beam_df_numericals.merge(beam_df_categorical, left_index = True, right_index = True\n",
         "\n",
-        "# Write the pre-processed dataset to csv\n",
+        "# Write the preprocessed dataset to a CSV file.\n",
         "beam_df_numericals.to_csv(os.path.join(OUTPUT_DIR, \"preprocessed_data.csv\"))"
       ],
       "metadata": {
@@ -3418,7 +3417,7 @@
     {
       "cell_type": "markdown",
       "source": [
-        "Let's now submit and execute our pipeline."
+        "Submit and run the pipeline."
       ],
       "metadata": {
         "id": "a789u4Yecs_g"
@@ -3438,7 +3437,7 @@
     {
       "cell_type": "markdown",
       "source": [
-        "The execution of the pipeline job will take some time until it finishes."
+        "Wait while the the pipeline job runs."
       ],
       "metadata": {
         "id": "dzdqmzKzTOng"
@@ -3447,15 +3446,15 @@
     {
       "cell_type": "markdown",
       "source": [
-        "# What's next \n",
+        "## What's next \n",
         "\n",
-        "Now that we've seen how we can analyze and preprocess a large-scale dataset with the Beam DataFrames API, we can now train a model on a classification task on our preprocessed dataset.  \n",
+        "This tutorial demonstrated how to analyze and preprocess a large-scale dataset with the Apache Beam DataFrames API. You can now train a model on a classification task using the preprocessed dataset.\n",
         "\n",
-        "To learn more on how to get started with classifying structured data, refer to:\n",
+        "To learn more about how to get started with classifying structured data, see:\n",
         "\n",
         "*   [Structred data classification from scratch](https://keras.io/examples/structured_data/structured_data_classification_from_scratch/)\n",
         "\n",
-        "We suggest finding another dataset to try out the Beam DataFrames API processing with. Make sure think carefully about which features to include in your model and how they should be represented.\n",
+        "To continue learning, find another dataset to use with the Apache Beam DataFrames API processing. Think carefully about which features to include in your model and how to represent them.\n",
         "\n"
       ],
       "metadata": {
@@ -3465,13 +3464,13 @@
     {
       "cell_type": "markdown",
       "source": [
-        "# References\n",
+        "## Resources\n",
         "\n",
-        "* [Beam DataFrames overview](https://beam.apache.org/documentation/dsls/dataframes/overview) -- an overview of the Beam DataFrames API.\n",
-        "* [Differences from pandas](https://beam.apache.org/documentation/dsls/dataframes/differences-from-pandas) -- goes through some of the differences between Beam DataFrames and Pandas DataFrames, as well as some of the   workarounds for unsupported operations.\n",
-        "* [10 minutes to Pandas](https://pandas.pydata.org/pandas-docs/stable/user_guide/10min.html) -- a quickstart guide to Pandas DataFrames.\n",
-        "* [Pandas DataFrame API](https://pandas.pydata.org/pandas-docs/stable/reference/frame.html) -- the API reference for Pandas DataFrames.\n",
-        "* [Data preparation and feature training in ML](https://developers.google.com/machine-learning/data-prep) -- A guideline on data transformation for ML training."
+        "* [Beam DataFrames overview](https://beam.apache.org/documentation/dsls/dataframes/overview) -- An overview of the Apache Beam DataFrames API.\n",
+        "* [Differences from pandas](https://beam.apache.org/documentation/dsls/dataframes/differences-from-pandas) -- Reviews the differences between Apache Beam DataFrames and Pandas DataFrames, as well as some of the workarounds for unsupported operations.\n",
+        "* [10 minutes to Pandas](https://pandas.pydata.org/pandas-docs/stable/user_guide/10min.html) -- A quickstart guide to the Pandas DataFrames.\n",
+        "* [Pandas DataFrame API](https://pandas.pydata.org/pandas-docs/stable/reference/frame.html) -- The API reference for the Pandas DataFrames.\n",
+        "* [Data preparation and feature training in ML](https://developers.google.com/machine-learning/data-prep) -- A guideline about data transformation for ML training."
       ],
       "metadata": {
         "id": "nG9WXXVcMCe_"
diff --git a/examples/notebooks/beam-ml/run_custom_inference.ipynb b/examples/notebooks/beam-ml/run_custom_inference.ipynb
index b2fa0aced49..9d57bf9f475 100644
--- a/examples/notebooks/beam-ml/run_custom_inference.ipynb
+++ b/examples/notebooks/beam-ml/run_custom_inference.ipynb
@@ -5,7 +5,6 @@
       "execution_count": 1,
       "id": "C1rAsD2L-hSO",
       "metadata": {
-        "cellView": "form",
         "id": "C1rAsD2L-hSO"
       },
       "outputs": [],
@@ -27,7 +26,7 @@
         "# \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY\n",
         "# KIND, either express or implied. See the License for the\n",
         "# specific language governing permissions and limitations\n",
-        "# under the License.\n"
+        "# under the License"
       ]
     },
     {
@@ -37,21 +36,15 @@
         "id": "b6f8f3af-744e-4eaa-8a30-6d03e8e4d21e"
       },
       "source": [
-        "# Bring your own Machine Learning (ML) model to Beam RunInference\n",
+        "# Bring your own machine learning (ML) model to Beam RunInference\n",
         "\n",
-        "<button>\n",
-        "  <a href=\"https://beam.apache.org/documentation/sdks/python-machine-learning/\">\n",
-        "    <img src=\"https://beam.apache.org/images/favicon.ico\" alt=\"Open the docs\" height=\"16\"/>\n",
-        "    Beam RunInference\n",
-        "  </a>\n",
-        "</button>\n",
+        "This notebook demonstrates how to run inference on your custom framework using the\n",
+        "[ModelHandler](https://beam.apache.org/releases/pydoc/current/apache_beam.ml.inference.base.html#apache_beam.ml.inference.base.ModelHandler) class.\n",
         "\n",
-        "In this notebook, we walk through a simple example to show how to build your own ML model handler using\n",
-        "[ModelHandler](https://beam.apache.org/releases/pydoc/current/apache_beam.ml.inference.base.html#apache_beam.ml.inference.base.ModelHandler).\n",
-        "\n",
-        "Named-Entity Recognition (NER) is one of the most common tasks for Natural Language Processing (NLP), \n",
-        "which locates and classifies named entities in unstructured text into pre-defined labels such as person name, organization, date, etc. \n",
-        "In this example, we illustrate how to use the popular spaCy package to load a ML model and perform inference in a Beam pipeline using RunInference PTransform.\n"
+        "Named-Entity Recognition (NER) is one of the most common tasks for natural language processing (NLP). \n",
+        "NLP locates and named entities in unstructured text and classifies the entities using pre-defined labels, such as person name, organization, date, and so on.\n",
+        "This example illustrates how to use the popular `spaCy` package to load an ML model and perform inference in an Apache Beam pipeline using the RunInference `PTransform`.\n",
+        "For more information about the RunInference API, see [Machine Learning](https://beam.apache.org/documentation/sdks/python-machine-learning) in the Apache Beam documentation."
       ]
     },
     {
@@ -61,11 +54,11 @@
         "id": "299af9bb-b2fc-405c-96e7-ee0a6ae24bdd"
       },
       "source": [
-        "## Package Dependencies\n",
+        "## Install package dependencies\n",
         "\n",
-        "The RunInference library is available in Apache Beam version <b>2.40</b> or later.\n",
+        "The RunInference library is available in Apache Beam versions 2.40 and later.\n",
         "\n",
-        "`spaCy` and `pandas` need to be installed. Here, a small NER model (`en_core_web_sm`) is also installed but any valid spaCy model could be used."
+        "For this example, you need to install `spaCy` and `pandas`. A small NER model (`en_core_web_sm`) is also installed, but you can use any valid `spaCy` model."
       ]
     },
     {
@@ -81,7 +74,7 @@
       },
       "outputs": [],
       "source": [
-        "# uncomment these to install the required packages\n",
+        "# Uncomment the following lines to install the required packages.\n",
         "# %pip install spacy pandas\n",
         "# %pip install \"apache-beam[gcp, dataframe, interactive]\"\n",
         "# !python -m spacy download en_core_web_sm"
@@ -91,7 +84,11 @@
       "cell_type": "markdown",
       "metadata": {},
       "source": [
-        "## Let us play with spaCy first"
+        "## Learn more about `spaCy`\n",
+        "\n",
+        "To learn more about `spaCy`, create a `spaCy` language object in memory using `spaCy`'s trained models.\n",
+        "You can install these models as Python packages.\n",
+        "For more inforamtion, see spaCy's [Models and Languages](https://spacy.io/usage/models) documentation."
       ]
     },
     {
@@ -100,10 +97,6 @@
       "metadata": {},
       "outputs": [],
       "source": [
-        "# create a spaCy language object in memory using spaCy's trained models,\n",
-        "# which can be installed as Python packages.\n",
-        "# more information can be found at https://spacy.io/usage/models.\n",
-        "\n",
         "import spacy\n",
         "\n",
         "nlp = spacy.load(\"en_core_web_sm\")\n"
@@ -115,7 +108,7 @@
       "metadata": {},
       "outputs": [],
       "source": [
-        "# some text strings for fun\n",
+        "# Add text strings.\n",
         "text_strings = [\n",
         "    \"The New York Times is an American daily newspaper based in New York City with a worldwide readership.\",\n",
         "    \"It was founded in 1851 by Henry Jarvis Raymond and George Jones, and was initially published by Raymond, Jones & Company.\"\n",
@@ -128,7 +121,7 @@
       "metadata": {},
       "outputs": [],
       "source": [
-        "# check what entities spaCy can recognize\n",
+        "# Check which entities spaCy can recognize.\n",
         "doc = nlp(text_strings[0])\n"
       ]
     },
@@ -192,7 +185,7 @@
         }
       ],
       "source": [
-        "# visualize the results\n",
+        "# Visualize the results.\n",
         "from spacy import displacy\n",
         "displacy.render(doc, style=\"ent\")\n"
       ]
@@ -241,7 +234,7 @@
         }
       ],
       "source": [
-        "# another example\n",
+        "# Visualize another example.\n",
         "displacy.render(nlp(text_strings[1]), style=\"ent\")"
       ]
     },
@@ -249,7 +242,9 @@
       "cell_type": "markdown",
       "metadata": {},
       "source": [
-        "## Now time to create our own `ModelHandler` to use spaCy for inference"
+        "## Create a`ModelHandler` to use `spaCy` for inference\n",
+        "\n",
+        "This section demonstrates how to create your own `ModelHandler`."
       ]
     },
     {
@@ -284,7 +279,7 @@
         "\n",
         "pipeline = beam.Pipeline()\n",
         "\n",
-        "# only print the results to check\n",
+        "# Print the results for verification.\n",
         "with pipeline as p:\n",
         "    (p \n",
         "    | \"CreateSentences\" >> beam.Create(text_strings)\n",
@@ -298,7 +293,7 @@
       "metadata": {},
       "outputs": [],
       "source": [
-        "# Now define SpacyModelHandler to load the model and perform the inference\n",
+        "# Define `SpacyModelHandler` to load the model and perform the inference.\n",
         "\n",
         "from apache_beam.ml.inference.base import RunInference\n",
         "from apache_beam.ml.inference.base import ModelHandler\n",
@@ -348,7 +343,7 @@
         "        Returns:\n",
         "          An Iterable of type PredictionResult.\n",
         "        \"\"\"\n",
-        "        # loop each text string and use tuple to store the inference results\n",
+        "        # Loop each text string, and use a tuple to store the inference results.\n",
         "        predictions = []\n",
         "        for one_text in batch:\n",
         "            doc = model(one_text)\n",
@@ -374,7 +369,7 @@
         }
       ],
       "source": [
-        "# quick check to show the inference results are correct\n",
+        "# Verify that the inference results are correct.\n",
         "with pipeline as p:\n",
         "    (p \n",
         "    | \"CreateSentences\" >> beam.Create(text_strings)\n",
@@ -387,7 +382,9 @@
       "cell_type": "markdown",
       "metadata": {},
       "source": [
-        "## Use `KeyedModelHandler` to handle keyed data"
+        "## Use `KeyedModelHandler` to handle keyed data\n",
+        "\n",
+        "If you have keyed data, use `KeyedModelHandler`."
       ]
     },
     {
@@ -396,7 +393,7 @@
       "metadata": {},
       "outputs": [],
       "source": [
-        "# some text strings with keys, which are useful to distinguish examples\n",
+        "# You can use these text strings with keys to distinguish examples.\n",
         "text_strings_with_keys = [\n",
         "    (\"example_0\", \"The New York Times is an American daily newspaper based in New York City with a worldwide readership.\"),\n",
         "    (\"example_1\", \"It was founded in 1851 by Henry Jarvis Raymond and George Jones, and was initially published by Raymond, Jones & Company.\")\n",
@@ -417,12 +414,12 @@
         "\n",
         "keyed_spacy_model_handler = KeyedModelHandler(SpacyModelHandler(\"en_core_web_sm\"))\n",
         "\n",
-        "# quick check to show the inference results are correct\n",
+        "# Verify that the inference results are correct.\n",
         "with pipeline as p:\n",
         "    results = (p \n",
         "    | \"CreateSentences\" >> beam.Create(text_strings_with_keys)\n",
         "    | \"RunInferenceSpacy\" >> RunInference(keyed_spacy_model_handler)\n",
-        "    # Map to Row objects to generate a schema suitable for conversion\n",
+        "    # Generate a schema suitable for conversion to a dataframe using Map to Row objects.\n",
         "    # to a dataframe.\n",
         "    | 'ToRows' >> beam.Map(lambda row: beam.Row(key=row[0], text=row[1][0], predictions=row[1][1]))\n",
         "    )"
@@ -460,7 +457,7 @@
         }
       ],
       "source": [
-        "# convert results to a pandas dataframe\n",
+        "# Convert the results to a pandas dataframe.\n",
         "import apache_beam.runners.interactive.interactive_beam as ib\n",
         "\n",
         "beam_df = to_dataframe(results)\n",
diff --git a/examples/notebooks/beam-ml/run_inference_multi_model.ipynb b/examples/notebooks/beam-ml/run_inference_multi_model.ipynb
index 65624eb74b1..a1e52b23546 100644
--- a/examples/notebooks/beam-ml/run_inference_multi_model.ipynb
+++ b/examples/notebooks/beam-ml/run_inference_multi_model.ipynb
@@ -57,12 +57,12 @@
     {
       "cell_type": "markdown",
       "source": [
-        "A single machine learning  model may not always be the perfect solution for a give task. Oftentimes, machine learning model tasks involve aggregating mutliple models together to produce one optimal predictive model and boost performance. \n",
+        "A single machine learning model might not be the right solution for your task. Often, machine learning model tasks involve aggregating mutliple models together to produce one optimal predictive model and to boost performance. \n",
         " \n",
         "\n",
-        "In this notebook, we will shows you an example on how to implement a cascade model in Beam using the [RunInference API](https://beam.apache.org/documentation/sdks/python-machine-learning/). The RunInference API enables you to run your Beam transfroms as part of your pipeline for optimal machine learning inference in beam.     \n",
+        "This notebook shows how to implement a cascade model in Apache Beam using the [RunInference API](https://beam.apache.org/documentation/sdks/python-machine-learning/). The RunInference API enables you to run your Beam transforms as part of your pipeline for optimal machine learning inference.\n",
         "\n",
-        "Make sure to checkout this [notebook](https://colab.research.google.com/drive/111USL4VhUa0xt_mKJxl5nC1YLOC8_yF4?usp=sharing#scrollTo=746b67a7-3562-467f-bea3-d8cd18c14927) to get familiar with the RunInference API."
+        "For more information about the RunInference API, review the [RunInference notebook](https://colab.research.google.com/drive/111USL4VhUa0xt_mKJxl5nC1YLOC8_yF4?usp=sharing#scrollTo=746b67a7-3562-467f-bea3-d8cd18c14927)."
       ],
       "metadata": {
         "id": "6vZWSLyuM_P4"
@@ -80,12 +80,12 @@
     {
       "cell_type": "markdown",
       "source": [
-        "Image captioning has various different applications such as image indexing for information retreival, usage in virtual assistants and many other natural language processing applications.\n",
+        "Image captioning has various applications, such as image indexing for information retreival, virtual assistant training, and various natural language processing applications.\n",
         "\n",
-        "We want to use beam to generate captions on a a large set of images. Beam is the ideal tool to handle this. We will use two models for this task:\n",
+        "This example shows how to generate captions on a a large set of images. Apache Beam is the ideal tool to handle this workflow. We use two models for this task:\n",
         "\n",
         "* [BLIP](https://github.com/salesforce/BLIP): Used to generate a set of candidate captions for a given image. \n",
-        "* [CLIP](https://github.com/openai/CLIP): Used to rank the generated captions by the order in which they better represent the the given image."
+        "* [CLIP](https://github.com/openai/CLIP): Used to rank the generated captions based on accuracy."
       ],
       "metadata": {
         "id": "cP1sBhNacS8b"
@@ -103,17 +103,17 @@
     {
       "cell_type": "markdown",
       "source": [
-        "The steps needed to build this pipeline can be summarized as follows:\n",
+        "The steps to build this pipeline are as follows:\n",
         "* Read the images.\n",
         "* Preprocess the images for caption generation for inference with the BLIP model.\n",
-        "* Inference with BLIP to generate a list of caption candidates . \n",
+        "* Inference with BLIP to generate a list of caption candidates.\n",
         "* Aggregate the generated captions with their source image.\n",
         "* Preprocess the aggregated image-caption pair to rank them with CLIP.\n",
-        "* Inference wih CLIP to generated the caption ranking. \n",
-        "* Print the image names and the captions sorted according to their ranking\n",
+        "* Inference with CLIP to generate the caption ranking. \n",
+        "* Print the image names and the captions sorted according to their ranking.\n",
         "\n",
         "\n",
-        "The following image illustrates the steps that will be followed in the inference pipelines in more details:"
+        "The following diagram illustrates the steps in the inference pipelines used in this notebook:"
       ],
       "metadata": {
         "id": "lBPfy-bYgLuD"
@@ -164,7 +164,9 @@
     {
       "cell_type": "markdown",
       "source": [
-        "The RunInference library is available in Apache Beam version **2.40** or later. "
+        "This section shows how to install the dependencies for this example.\n",
+        "\n",
+        "The RunInference library is available in the Apache Beam SDK versions 2.40 and later."
       ],
       "metadata": {
         "id": "E0uy4-nWNdBa"
@@ -233,7 +235,7 @@
         "!pip install fairscale==0.4.4 --quiet\n",
         "!pip install apache_beam[gcp]>=2.40.0  \n",
         "\n",
-        "# restart the runtime in order to use newly installed versions\n",
+        "# To use the newly installed versions, restart the runtime.\n",
         "exit() "
       ]
     },
@@ -282,7 +284,9 @@
     {
       "cell_type": "markdown",
       "source": [
-        "### CLIP"
+        "### CLIP\n",
+        "\n",
+        "Download and install the CLIP dependencies."
       ],
       "metadata": {
         "id": "iMsN4vUXilTg"
@@ -339,7 +343,9 @@
     {
       "cell_type": "markdown",
       "source": [
-        "### BLIP"
+        "### BLIP\n",
+        "\n",
+        "Download and install the BLIP dependencies."
       ],
       "metadata": {
         "id": "Rg9mKAWnR8s4"
@@ -411,7 +417,9 @@
     {
       "cell_type": "markdown",
       "source": [
-        "## I/O helper functions "
+        "### I/O helper functions\n",
+        "\n",
+        "Download and install the dependencies for the I/O helper functions."
       ],
       "metadata": {
         "id": "FGHgvycOyicj"
@@ -466,9 +474,10 @@
     {
       "cell_type": "markdown",
       "source": [
-        "Here we define the preprocessing and postprocessing function for each of the models.\n",
+        "Define the preprocessing and postprocessing function for each of the models.\n",
         "\n",
-        "> ℹ️ We use `DoFn.setup()` to prepare the instance for processing bundles of elements by initializing and cache the processing transform resources. As such, we avoid unnecessary re-initializations on every invocation to the processing method."
+        "To prepare the instance for processing bundles of elements by initializing and to cache the processing transform resources, use `DoFn.setup()`.\n",
+        "This step avoids unnecessary re-initializations on every invocation to the processing method."
       ],
       "metadata": {
         "id": "wEViP715fes4"
@@ -477,7 +486,8 @@
     {
       "cell_type": "markdown",
       "source": [
-        "### BLIP"
+        "### BLIP\n",
+        "Define the preprocessing and postprocessing function for BLIP."
       ],
       "metadata": {
         "id": "X1UGv6bbyNxY"
@@ -501,7 +511,7 @@
         "\n",
         "  def setup(self):\n",
         "    \n",
-        "    # Initialize image transformer\n",
+        "    # Initialize the image transformer.\n",
         "    self._transform = transforms.Compose([\n",
         "      transforms.Resize((384, 384),interpolation=InterpolationMode.BICUBIC),\n",
         "      transforms.ToTensor(),\n",
@@ -510,10 +520,10 @@
         "\n",
         "  def process(self, element):\n",
         "    image_url, image = element \n",
-        "    # This should be changed when this ticket is resolved https://github.com/apache/beam/issues/21863\n",
+        "    # Update this step when this ticket is resolved: https://github.com/apache/beam/issues/21863\n",
         "    preprocessed_img = self._transform(image).unsqueeze(0)\n",
         "    preprocessed_img = preprocessed_img.repeat(self._captions_per_image, 1, 1, 1)\n",
-        "    # Parse the processed input to a dictionary to a format suitable for RunInference\n",
+        "    # Parse the processed input to a dictionary to a format suitable for RunInference.\n",
         "    preprocessed_dict = {'inputs': preprocessed_img}\n",
         "\n",
         "    return [(image_url, preprocessed_dict)]\n",
@@ -536,7 +546,9 @@
     {
       "cell_type": "markdown",
       "source": [
-        "###CLIP"
+        "### CLIP \n",
+        "\n",
+        "Define the preprocessing and postprocessing function for CLIP."
       ],
       "metadata": {
         "id": "EZHfa1KzWWDI"
@@ -566,22 +578,22 @@
         "\n",
         "  def setup(self):\n",
         "    \n",
-        "    # Initialize the CLIP feature extractor \n",
+        "    # Initialize the CLIP feature extractor.\n",
         "    feature_extractor_config = CLIPConfig.from_pretrained(self._feature_extractor_config_path)\n",
         "    feature_extractor = CLIPFeatureExtractor(feature_extractor_config)\n",
         "\n",
-        "    # Initialize the CLIP tokenizer\n",
+        "    # Initialize the CLIP tokenizer.\n",
         "    tokenizer = CLIPTokenizer(self._tokenizer_vocab_config_path,\n",
         "                              self._merges_file_config_path)\n",
         "    \n",
-        "    # Initialize the CLIP processor used to process the image-caption pair \n",
+        "    # Initialize the CLIP processor used to process the image-caption pair.\n",
         "    self._processor = CLIPProcessor(feature_extractor=feature_extractor,\n",
         "                                    tokenizer=tokenizer)\n",
         "\n",
         "  def process(self, element: Tuple[str, Dict[str, List[Any]]]):\n",
         "\n",
         "    image_url, image_captions_pair = element \n",
-        "    # Unpack the image and captions after grouping them with 'CoGroupByKey()' \n",
+        "    # Unpack the image and captions after grouping them with 'CoGroupByKey()'.\n",
         "    image = image_captions_pair['image'][0]\n",
         "    captions = image_captions_pair['captions'][0]\n",
         "    preprocessed_clip_input = self._processor(images = image,\n",
@@ -630,7 +642,9 @@
     {
       "cell_type": "markdown",
       "source": [
-        "> ℹ️ Note that we will use a `KeyedModelHandler` for both models to attach a key to the general `ModelHandler`. The key is used to keep a reference of which image the inference is assoicated with, and it used in our post processing steps. In our case, we're using the `image_url` as the key."
+        "Note that we use a `KeyedModelHandler` for both models to attach a key to the general `ModelHandler`.\n",
+        "The key is used to keep a reference to the image that the inference is associated with and is used in the postprocessing steps.\n",
+        "In this example, we use the `image_url` as the key."
       ],
       "metadata": {
         "id": "BTmSPnjj8M2m"
@@ -647,7 +661,7 @@
         "    Restricting max_batch_size to 1 means there is only 1 example per `batch`\n",
         "    in the run_inference() call.\n",
         "    \"\"\"\n",
-        "    # This should be changed when this ticket is resolved https://github.com/apache/beam/issues/21863\n",
+        "    # Update this step when this ticket is resolved: https://github.com/apache/beam/issues/21863\n",
         "      def batch_elements_kwargs(self):\n",
         "          return {'max_batch_size': 1}"
       ],
@@ -660,7 +674,7 @@
     {
       "cell_type": "markdown",
       "source": [
-        "> ℹ️ Note that we will use a `KeyedModelHandler` for both models to attach a key to the general `ModelHandler`. The key will be used for aggregation transforms of different inputs. "
+        "Note that we use a `KeyedModelHandler` for both models to attach a key to the general `ModelHandler`. The key is used for aggregation transforms of different inputs."
       ],
       "metadata": {
         "id": "gNLRO0EwvcGP"
@@ -669,7 +683,9 @@
     {
       "cell_type": "markdown",
       "source": [
-        "## BLIP"
+        "## BLIP\n",
+        "\n",
+        "Use BLIP to generate a set of candidate captions for a given image."
       ],
       "metadata": {
         "id": "OXz7TuK4W_ZN"
@@ -680,8 +696,8 @@
       "source": [
         "MAX_CAPTION_LENGTH = 80\n",
         "MIN_CAPTION_LENGTH = 10\n",
-        "# Increasing beam search can improve the quality of the captions but results in\n",
-        "# more compute time\n",
+        "# Increasing Beam search might improve the quality of the captions,\n",
+        "# but also results in more compute time\n",
         "NUM_BEAMS = 1\n"
       ],
       "metadata": {
@@ -708,8 +724,8 @@
         "    self._min_length = min_length\n",
         "\n",
         "  def forward(self, inputs: torch.Tensor):\n",
-        "    # squeeze because RunInference adds an extra dimension, which is empty\n",
-        "    # This should be changed when this ticket is resolved https://github.com/apache/beam/issues/21863\n",
+        "    # Squeeze because RunInference adds an extra dimension, which is empty.\n",
+        "    # Update this step when this ticket is resolved: https://github.com/apache/beam/issues/21863\n",
         "    inputs = inputs.squeeze(0)\n",
         "    captions = self._model.generate(inputs,\n",
         "                                    sample=True,\n",
@@ -740,7 +756,9 @@
     {
       "cell_type": "markdown",
       "source": [
-        "## CLIP"
+        "## CLIP\n",
+        "\n",
+        "Use CLIP to rank the generated captions based on the accuracy with which they represent the image."
       ],
       "metadata": {
         "id": "-8PG_0txMiYA"
@@ -752,8 +770,8 @@
         "class CLIPWrapper(CLIPModel):\n",
         "\n",
         "  def forward(self, **kwargs: Dict[str, torch.Tensor]):\n",
-        "    # squeeze because RunInference adds an extra dimension, which is empty\n",
-        "    # This should be changed when this ticket is resolved https://github.com/apache/beam/issues/21863\n",
+        "    # Squeeze because RunInference adds an extra dimension, which is empty.\n",
+        "    # Update this step when this ticket is resolved: https://github.com/apache/beam/issues/21863.\n",
         "    kwargs = {key: tensor.squeeze(0) for key, tensor in kwargs.items()}\n",
         "    output = super().forward(**kwargs)\n",
         "    logits = output.logits_per_image\n",
@@ -777,7 +795,9 @@
     {
       "cell_type": "markdown",
       "source": [
-        "## Specify the images to display"
+        "## Specify the images to display\n",
+        "\n",
+        "This section demonstrates how to specify the images to display for captioning."
       ],
       "metadata": {
         "id": "azC12uqDn0bq"
@@ -799,7 +819,7 @@
     {
       "cell_type": "markdown",
       "source": [
-        "Let's visualize the images that we will use for captioning "
+        "Visualize the images to use for captioning."
       ],
       "metadata": {
         "id": "c3fpgc15hzcq"
@@ -868,7 +888,9 @@
     {
       "cell_type": "markdown",
       "source": [
-        "## Initialize pipeline run parameters "
+        "## Initialize pipeline run parameters\n",
+        "\n",
+        "Specify the number of captions generated per image and the number of captions to display with each image."
       ],
       "metadata": {
         "id": "m8S8VQHvoEZf"
@@ -877,10 +899,10 @@
     {
       "cell_type": "code",
       "source": [
-        "# Number of Generated captions per image\n",
+        "# Number of captions generated per image.\n",
         "NUM_CAPTIONS_PER_IMAGE = 10\n",
         "\n",
-        "# Top captions to display\n",
+        "# Number of top captions to display.\n",
         "NUM_TOP_CAPTIONS_TO_DISPLAY = 3\n"
       ],
       "metadata": {
@@ -901,9 +923,9 @@
     {
       "cell_type": "markdown",
       "source": [
-        "> ℹ️ Note that we are using raw images from the `read_images` pipeline as input to both models. This is done because each model needs to preprocess the raw images differently (i.e. they require a different embedding representation for image captioning and image/captions pair ranking resp.).\n",
+        "This example uses raw images from the `read_images` pipeline as inputs for both models, because each model needs to preprocess the raw images differently. They require a different embedding representation for image captioning and image-captions pair ranking.\n",
         "\n",
-        "> ℹ️ We use `CoGroupByKey` to aggregate the raw images with the generated captions by their key (i.e. the image url). This process produces a tuple of image-captions pairs that is then passed to the CLIP transform and used for ranking."
+        "To aggregate the raw images with the generated caption by their key (the image URL), this example uses `CoGroupByKey`. This process produces a tuple of image-captions pairs that is then passed to the CLIP transform and used for ranking."
       ],
       "metadata": {
         "id": "G4a2ACIYeJyj"
@@ -980,11 +1002,11 @@
     {
       "cell_type": "markdown",
       "source": [
-        "# References\n",
+        "# Resources\n",
         "\n",
-        "* [RunInference API](https://beam.apache.org/documentation/sdks/python-machine-learning/) -- an official guide to the RunInference API.\n",
-        "* [RunInference Demo](https://colab.research.google.com/drive/10iPQTCmaLJL4_OohS00R9Wmor6d57JkS#scrollTo=ZVtBsKDgW1dl) -- a demo on ensemble model in colab\n",
-        "* [The advantages of having a DAG and what it unlocks for you](https://beam.apache.org/documentation/dsls/dataframes/differences-from-pandas) -- A guide on the advantages of using a Beam DAG for ML workflow orchestration and inference. "
+        "* [RunInference API](https://beam.apache.org/documentation/sdks/python-machine-learning/): an official guide to the RunInference API.\n",
+        "* [RunInference Demo](https://colab.research.google.com/drive/10iPQTCmaLJL4_OohS00R9Wmor6d57JkS#scrollTo=ZVtBsKDgW1dl): an ensemble model demo in Colab.\n",
+        "* [The advantages of having a DAG and what it unlocks for you](https://beam.apache.org/documentation/dsls/dataframes/differences-from-pandas): a guide on the advantages of using a Beam DAG for ML workflow orchestration and inference."
       ],
       "metadata": {
         "id": "HMH_ldJsrJoz"
diff --git a/examples/notebooks/beam-ml/run_inference_pytorch.ipynb b/examples/notebooks/beam-ml/run_inference_pytorch.ipynb
index 0909b4e0b5b..90948709a3a 100644
--- a/examples/notebooks/beam-ml/run_inference_pytorch.ipynb
+++ b/examples/notebooks/beam-ml/run_inference_pytorch.ipynb
@@ -4,8 +4,7 @@
   "metadata": {
     "colab": {
       "provenance": [],
-      "collapsed_sections": [],
-      "toc_visible": true
+      "collapsed_sections": []
     },
     "kernelspec": {
       "name": "python3",
@@ -19,7 +18,7 @@
     {
       "cell_type": "code",
       "source": [
-        "#@title ###### Licensed to the Apache Software Foundation (ASF), Version 2.0 (the \"License\")\n",
+        "# @title ###### Licensed to the Apache Software Foundation (ASF), Version 2.0 (the \"License\")\n",
         "\n",
         "# Licensed to the Apache Software Foundation (ASF) under one\n",
         "# or more contributor license agreements. See the NOTICE file\n",
@@ -36,11 +35,11 @@
         "# \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY\n",
         "# KIND, either express or implied. See the License for the\n",
         "# specific language governing permissions and limitations\n",
-        "# under the License."
+        "# under the License"
       ],
       "metadata": {
-        "id": "C1rAsD2L-hSO",
-        "cellView": "form"
+        "cellView": "form",
+        "id": "C1rAsD2L-hSO"
       },
       "execution_count": 3,
       "outputs": []
@@ -52,24 +51,16 @@
       },
       "source": [
         "# Apache Beam RunInference for PyTorch\n",
+        "This notebook demonstrates the use of the RunInference transform for PyTorch. Apache Beam includes implementations of the [ModelHandler](https://beam.apache.org/releases/pydoc/current/apache_beam.ml.inference.base.html#apache_beam.ml.inference.base.ModelHandler) class for [users of PyTorch](https://beam.apache.org/releases/pydoc/current/apache_beam.ml.inference.pytorch_inference.html). For more information about the RunInference API, see [Machine Learning](https://beam.apache.or [...]
         "\n",
-        "<button>\n",
-        "  <a href=\"https://beam.apache.org/documentation/sdks/python-machine-learning/\">\n",
-        "    <img src=\"https://beam.apache.org/images/favicon.ico\" alt=\"Open the docs\" height=\"16\"/>\n",
-        "    Beam RunInference\n",
-        "  </a>\n",
-        "</button>\n",
         "\n",
-        "In this notebook, we walk through the use of the RunInference transform for PyTorch. Apache Beam includes implementations of the [ModelHandler](https://beam.apache.org/releases/pydoc/current/apache_beam.ml.inference.base.html#apache_beam.ml.inference.base.ModelHandler) class for [users of PyTorch](https://beam.apache.org/releases/pydoc/current/apache_beam.ml.inference.pytorch_inference.html).\n",
-        "\n",
-        "\n",
-        "This notebook illustrates common RunInference patterns such as the following:\n",
+        "This notebook illustrates common RunInference patterns,such as:\n",
         "*   Using a database with RunInference.\n",
-        "*   Post processing results after RunInference.\n",
+        "*   Postprocessing results after using RunInference.\n",
         "*   Inference with multiple models in the same pipeline.\n",
         "\n",
         "\n",
-        "The linear regression models used in these samples are trained on data that correspondes to the 5 and 10 times table; that is,`y = 5x` and `y = 10x` respectively."
+        "The linear regression models used in these samples are trained on data that correspondes to the 5 and 10 times tables; that is, `y = 5x` and `y = 10x` respectively."
       ]
     },
     {
@@ -78,18 +69,18 @@
         "id": "299af9bb-b2fc-405c-96e7-ee0a6ae24bdd"
       },
       "source": [
-        "### Dependencies\n",
+        "## Dependencies\n",
         "\n",
-        "The RunInference library is available in Apache Beam version <b>2.40</b> or later.\n",
+        "The RunInference library is available in Apache Beam versions <b>2.40</b> and later.\n",
         "\n",
-        "Pytorch module is needed to use Pytorch RunInference API. use `pip` to install Pytorch."
+        "To use Pytorch RunInference API, you need to install the PyTorch module. To install PyTorch, use `pip`:"
       ]
     },
     {
       "cell_type": "code",
       "source": [
-        "# issue: https://github.com/apache/beam/issues/22218. Because of the updates to the Google cloud APIs, Beam SDK from 2.34.0 till 2.40.0 has some dependency conflicts. See the issue for more details.\n",
-        "# Workaround to install the apache beam without getting stuck for long time. Runtime might need to restart after this step.\n",
+        "# Because of updates to the Google Cloud APIs, Apache Beam SDK versions 2.34.0 to 2.40.0 have some dependency conflicts. For more details, see the following Beam issue: https://github.com/apache/beam/issues/22218.\n",
+        "# This workaround installs the the Apache Beam SDK without getting stuck for long time. After this step, you might need to restart the runtime.\n",
         "!pip install google-api-core --quiet\n",
         "!pip install google-cloud-pubsub google-cloud-bigquery-storage --quiet\n",
         "!pip install apache-beam[gcp,dataframe] --quiet"
@@ -179,7 +170,7 @@
         "project = \"<your GCP project>\"\n",
         "bucket = \"<your GCP bucket>\"\n",
         "\n",
-        "# set the project to avoid warnings.\n",
+        "# To avoid warnings, set the project.\n",
         "os.environ['GOOGLE_CLOUD_PROJECT'] = project\n",
         "\n",
         "save_model_dir_multiply_five = 'five_times_table_torch.pt'\n",
@@ -193,7 +184,8 @@
         "id": "b2b7cedc-79f5-4599-8178-e5da35dba032"
       },
       "source": [
-        "## Create data and Pytorch models for RunInference transform"
+        "## Create data and PyTorch models for the RunInference transform\n",
+        "Create linear regression models, prepare train and test data, and train models."
       ]
     },
     {
@@ -202,7 +194,8 @@
         "id": "202e5a3e-4ccd-4ae3-9852-e47de0721839"
       },
       "source": [
-        "### Linear regression model in Pytorch."
+        "### Create a linear regression model in PyTorch\n",
+        "Use the following code to create a linear regression model."
       ]
     },
     {
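The model class itself sits outside this hunk; a minimal sketch of what such a linear regression module might look like (the constructor arguments are assumptions):

    import torch

    class LinearRegression(torch.nn.Module):
        def __init__(self, input_dim=1, output_dim=1):
            super().__init__()
            # A single linear layer learns y = w * x + b.
            self.linear = torch.nn.Linear(input_dim, output_dim)

        def forward(self, x):
            return self.linear(x)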
@@ -228,7 +221,7 @@
         "id": "1918435c-0029-4eb6-8eee-bda5470eb2ff"
       },
       "source": [
-        "### Prepare train and test data for an example model.\n",
+        "### Prepare train and test data for an example model\n",
         "This example model is a 5 times table.\n",
         "\n",
         "* `x` contains values in the range from 0 to 99.\n",
@@ -255,7 +248,8 @@
         "id": "9dc22aec-08c3-43ab-a5ce-451cb63c485a"
       },
       "source": [
-        "### Train the linear regression mode on 5 times data."
+        "### Train the linear regression mode on 5 times data\n",
+        "Use the following to train your linear regression model on the 5 times table."
       ]
     },
     {
@@ -290,7 +284,7 @@
         "id": "bd106b29-6187-42c1-9743-1666c147b5e3"
       },
       "source": [
-        "Save the model using `torch.save()` and verify if the saved model file exists."
+        "Save the model using `torch.save()` and then confirm that the saved model file exists."
       ]
     },
     {
@@ -314,7 +308,7 @@
       ],
       "source": [
         "torch.save(five_times_model.state_dict(), save_model_dir_multiply_five)\n",
-        "print(os.path.exists(save_model_dir_multiply_five)) # verify if the model is saved"
+        "print(os.path.exists(save_model_dir_multiply_five)) # Verify that the model is saved."
       ]
     },
     {
@@ -323,7 +317,7 @@
         "id": "fa84cfca-83c6-4a91-aea1-3dd034c42ae0"
       },
       "source": [
-        "### Prepare train and test data to train a 10 times model.\n",
+        "### Prepare train and test data for a 10 times model\n",
         "* `x` contains values in the range from 0 to 99.\n",
         "* `y` is a list of 10 * `x`. "
       ]
@@ -346,7 +340,8 @@
         "id": "24d946dc-4fe0-4030-8f6a-aa8d27fd353d"
       },
       "source": [
-        "### Train the linear regression model on 10 times data."
+        "### Train the linear regression model on 10 times data\n",
+        "Use the following to train your linear regression model on the 10 times table."
       ]
     },
     {
@@ -378,7 +373,7 @@
         "id": "6f959e3b-230b-45e2-9df3-dd1f11acacd7"
       },
       "source": [
-        "Save the model using `torch.save()`"
+        "Save the model using `torch.save()`."
       ]
     },
     {
@@ -411,7 +406,8 @@
         "id": "2e20efc4-13e8-46e2-9848-c0347deaa5af"
       },
       "source": [
-        "# Pattern 1: RunInference for predictions."
+        "## Pattern 1: RunInference for predictions\n",
+        "This pattern demonstrates how to use RunInference for predictions."
       ]
     },
     {
@@ -420,10 +416,10 @@
         "id": "1099fe94-d4cf-422e-a0d3-0cfba8af64d5"
       },
       "source": [
-        "### Step 1 - Use RunInference within the pipeline.\n",
+        "### Use RunInference within the pipeline\n",
         "\n",
-        "1. Create pytorch model handler object by passing required arguments such as `state_dict_path`, `model_class`, `model_params` to the `PytorchModelHandlerTensor` class.\n",
-        "2. Pass the `PytorchModelHandlerTensor` object to the RunInference transform to peform prediction on unkeyed data."
+        "1. Create a PyTorch model handler object by passing required arguments such as `state_dict_path`, `model_class`, `model_params` to the `PytorchModelHandlerTensor` class.\n",
+        "2. Pass the `PytorchModelHandlerTensor` object to the RunInference transform to perform predictions on unkeyed data."
       ]
     },
     {
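A minimal sketch of those two steps, assuming the `LinearRegression` class above and the saved state dict path defined earlier in the notebook:

    import apache_beam as beam
    import torch
    from apache_beam.ml.inference.base import RunInference
    from apache_beam.ml.inference.pytorch_inference import PytorchModelHandlerTensor

    model_handler = PytorchModelHandlerTensor(
        state_dict_path=save_model_dir_multiply_five,
        model_class=LinearRegression,
        model_params={'input_dim': 1, 'output_dim': 1})

    with beam.Pipeline() as p:
        (p
         | beam.Create([torch.Tensor([10.0])])
         | RunInference(model_handler)
         | beam.Map(print))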
@@ -473,7 +469,9 @@
         "id": "9d95e69b-203f-4abb-9abb-360bdf4d769a"
       },
       "source": [
-        "# Pattern 2: Post-process RunInference results.\n",
+        "## Pattern 2: Post-process RunInference results.\n",
+        "This pattern demonstrates how to post-process the RunInference results.\n",
+        "\n",
         "Add a `PredictionProcessor` to the pipeline after `RunInference`. `PredictionProcessor` processes the output of the `RunInference` transform."
       ]
     },
@@ -530,16 +528,9 @@
         "id": "2be80463-cf79-481c-9d6a-81e500f1707b"
       },
       "source": [
-        "# Pattern 3: Attach a key"
-      ]
-    },
-    {
-      "cell_type": "markdown",
-      "metadata": {
-        "id": "f22da313-5bf8-4334-865b-bbfafc374e63"
-      },
-      "source": [
-        "## Step 1 - Create a source with attached key.\n"
+        "## Pattern 3: Attach a key\n",
+        "\n",
+        "This pattern demonstrates how attach a key to allow your model to handle keyed data."
       ]
     },
     {
@@ -548,15 +539,16 @@
         "id": "746b67a7-3562-467f-bea3-d8cd18c14927"
       },
       "source": [
-        "## Step 2 - Modify model handler and post processor.\n",
-        "* Modify the pipeline to read from sources like CSV files and BigQuery.\n",
+        "### Modify the model handler and post processor\n",
+        "\n",
+        "Modify the pipeline to read from sources like CSV files and BigQuery.\n",
         "\n",
-        "In this step we:\n",
+        "In this step we do the following:\n",
         "\n",
-        "* Wrap the `PytorchModelHandlerTensor` object around `KeyedModelHandler` to handle keyed data.\n",
-        "* Add a map transform, which converts a table row into `Tuple[str, float]`.\n",
-        "* Add a map transform which converts `Tuple[str, float]` from  to `Tuple[str, torch.Tensor]`.\n",
-        "* Modify the post inference processor to output results along with the key."
+        "* To handle keyed data, wrap the `PytorchModelHandlerTensor` object around `KeyedModelHandler`.\n",
+        "* Add a map transform that converts a table row into `Tuple[str, float]`.\n",
+        "* Add a map transform that converts `Tuple[str, float]` from  to `Tuple[str, torch.Tensor]`.\n",
+        "* Modify the post-inference processor to output results with the key."
       ]
     },
     {
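A minimal sketch of the wrapping step, assuming the unkeyed `model_handler` created for Pattern 1:

    from apache_beam.ml.inference.base import KeyedModelHandler

    keyed_model_handler = KeyedModelHandler(model_handler)
    # Inputs become ('some_key', torch.Tensor([...])) tuples, and RunInference
    # emits ('some_key', PredictionResult(...)) tuples in the same shape.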
@@ -579,6 +571,15 @@
         "        output_value = element[1].inference\n",
         "        yield (f\"key: {key}, input: {input_value.item()} output: {output_value.item()}\" )"
       ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "f22da313-5bf8-4334-865b-bbfafc374e63"
+      },
+      "source": [
+        "### Create a source with attached key\n"
+      ]
     },
     {
       "cell_type": "markdown",
@@ -586,7 +587,7 @@
         "id": "c9b0fb49-d605-4f26-931a-57f42b0ad253"
       },
       "source": [
-        "#### Use BigQuery as the source."
+        "#### Use BigQuery as the source"
       ]
     },
     {
@@ -595,7 +596,7 @@
         "id": "45ce4330-7d33-4c53-8033-f4fa02383894"
       },
       "source": [
-        "Install Google Cloud BigQuery API using `pip`."
+        "To install the Google Cloud BigQuery API, use `pip`:"
       ]
     },
     {
@@ -615,7 +616,7 @@
         "id": "6e869347-dd49-40be-b1e5-749699dc0d83"
       },
       "source": [
-        "Create a table in the BigQuery using the snippet below, which has two columns: One holds the key and the second holds the test value. To use BiqQuery, a Google Cloud account with the BigQuery API enabled is required."
+        "Create a table in BigQuery using the following snippet, which has two columns. The first column holds the key and the second column holds the test value. To use BiqQuery, you need a Google Cloud account with the BigQuery API enabled."
       ]
     },
     {
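A minimal sketch of what such a snippet can look like, assuming a hypothetical dataset named `maths`; the notebook's own snippet follows this cell:

    from google.cloud import bigquery

    client = bigquery.Client(project=project)
    # Two columns: the key and the test value.
    client.query(
        'CREATE TABLE IF NOT EXISTS maths.keyed_values '
        '(key STRING, value FLOAT64)').result()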
@@ -701,7 +702,7 @@
         "id": "479c9319-3295-4288-971c-dd0f0adfdd1e"
       },
       "source": [
-        "Use `BigQuery` as the source in the pipeline to read keyed data."
+        "To read keyed data, use BigQuery as the pipeline source."
       ]
     },
     {
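A minimal sketch of reading the keyed rows, assuming a `table_name` that points to the table created above and the keyed model handler from Pattern 3:

    import apache_beam as beam
    import torch
    from apache_beam.ml.inference.base import RunInference

    with beam.Pipeline() as p:
        (p
         | beam.io.ReadFromBigQuery(table=table_name)
         | beam.Map(lambda row: (row['key'], torch.Tensor([row['value']])))
         | RunInference(keyed_model_handler)
         | beam.Map(print))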
@@ -754,7 +755,7 @@
         "id": "53ee7f24-5625-475a-b8cc-9c031591f304"
       },
       "source": [
-        "### Using CSV file as the source."
+        "#### Use a CSV file as the source"
       ]
     },
     {
@@ -769,7 +770,7 @@
     {
       "cell_type": "code",
       "source": [
-        "# creates a csv file with the below values.\n",
+        "# creates a CSV file with the values.\n",
         "csv_values = [(\"first_question\", 105.00),\n",
         "      (\"second_question\", 108.00),\n",
         "      (\"third_question\", 1000.00),\n",
@@ -836,9 +837,11 @@
         "id": "742abfbb-545e-435b-8833-2410ce29d22c"
       },
       "source": [
-        "# Pattern 4: Inference with multiple models in the same pipeline.\n",
+        "## Pattern 4: Inference with multiple models in the same pipeline\n",
+        "This pattern demonstrates how use inference with multiple models in the same pipeline.\n",
         "\n",
-        "## Inference with multiple models in parallel. "
+        "### Inference with multiple models in parallel\n",
+        "This section demonstrates how use inference with multiple models in parallel."
       ]
     },
     {
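A minimal sketch of the parallel branching, where `five_times_handler` and `ten_times_handler` are hypothetical names for the two model handlers:

    import apache_beam as beam
    import torch
    from apache_beam.ml.inference.base import RunInference

    with beam.Pipeline() as p:
        inputs = p | beam.Create([torch.Tensor([10.0])])
        five = inputs | 'x5' >> RunInference(five_times_handler)
        ten = inputs | 'x10' >> RunInference(ten_times_handler)
        # The branches run independently; Flatten merges their outputs.
        (five, ten) | beam.Flatten() | beam.Map(print)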
@@ -872,8 +875,8 @@
         "id": "70ebed52-4ead-4cae-ac96-8cf206012ce1"
       },
       "source": [
-        "In this, the same data is run through two different models: the one that we’ve been using to multiply by 5 \n",
-        "and a new model, which will learn to multiply by 10."
+        "In this section, the same data is run through two different models: the one that we use to multiply by 5 \n",
+        "and a new model that learns to multiply by 10."
       ]
     },
     {
@@ -937,16 +940,18 @@
         "id": "e71e6706-5d8d-4322-9def-ac7fb20d4a50"
       },
       "source": [
-        "## Inference with multiple models in sequence \n",
+        "### Inference with multiple models in sequence\n",
+        "This section demonstrates how use inference with multiple models in sequence.\n",
         "\n",
         "In a sequential pattern, data is sent to one or more models in sequence, \n",
         "with the output from each model chaining to the next model.\n",
+        "The following list demonstrates the sequence used in this section.\n",
         "\n",
         "1. Read the data from BigQuery.\n",
         "2. Map the data.\n",
-        "3. RunInference with multiply by 5 model.\n",
+        "3. Use RunInference with the multiply by 5 model.\n",
         "4. Process the results.\n",
-        "5. RunInference with multiply by 10 model.\n",
+        "5. Use RunInference with the multiply by 10 model.\n",
         "6. Process the results.\n"
       ]
     },
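A minimal sketch of the chaining, with the same hypothetical handler names; the output tensor from the first model becomes the input to the second:

    import apache_beam as beam
    import torch
    from apache_beam.ml.inference.base import RunInference

    with beam.Pipeline() as p:
        (p
         | beam.Create([torch.Tensor([10.0])])
         | 'x5' >> RunInference(five_times_handler)
         | beam.Map(lambda result: result.inference)  # Extract the output tensor.
         | 'x10' >> RunInference(ten_times_handler)
         | beam.Map(print))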
diff --git a/examples/notebooks/beam-ml/run_inference_pytorch_tensorflow_sklearn.ipynb b/examples/notebooks/beam-ml/run_inference_pytorch_tensorflow_sklearn.ipynb
index cbca4a1e896..6c1e765920d 100644
--- a/examples/notebooks/beam-ml/run_inference_pytorch_tensorflow_sklearn.ipynb
+++ b/examples/notebooks/beam-ml/run_inference_pytorch_tensorflow_sklearn.ipynb
@@ -1,4 +1,19 @@
 {
+  "nbformat": 4,
+  "nbformat_minor": 0,
+  "metadata": {
+    "colab": {
+      "provenance": [],
+      "collapsed_sections": []
+    },
+    "kernelspec": {
+      "name": "python3",
+      "display_name": "Python 3"
+    },
+    "language_info": {
+      "name": "python"
+    }
+  },
   "cells": [
     {
       "cell_type": "code",
@@ -34,7 +49,7 @@
         "id": "faayYQYrQzY3"
       },
       "source": [
-        "## RunInference in Beam"
+        "## Use RunInference in Apache Beam"
       ]
     },
     {
@@ -43,9 +58,10 @@
         "id": "JjAt1GesQ9sg"
       },
       "source": [
-        "Starting with Apache Beam 2.40.0, a new API called RunInference can be used for using machine learning (ML) models to do local and remote inference with batch and streaming pipelines. RunInference API leverages Apache Beam concepts such as the BatchElements transform and the Shared class, to enable you to use models in your pipelines to create transforms optimized for machine learning inferences.\n",
+        "Starting with Apache Beam 2.40.0, you can use Apache Beam with the RunInference API to use machine learning (ML) models for local and remote inference with batch and streaming pipelines.\n",
+        "The RunInference API leverages Apache Beam concepts, such as the BatchElements transform and the Shared class, to support models in your pipelines that create transforms optimized for machine learning inferences.\n",
         "\n",
-        "One can find more details about RunInference API, here:https://beam.apache.org/documentation/sdks/python-machine-learning/"
+        "For more information about the RunInference API, see [Machine Learning](https://beam.apache.org/documentation/sdks/python-machine-learning) in the Apache Beam documentation."
       ]
     },
     {
@@ -54,13 +70,13 @@
         "id": "A8xNRyZMW1yK"
       },
       "source": [
-        "In this notebook, we show how to use RunInference with three different popular ML frameworks: PyTorch, TensorFlow and Scikit-learn. We showcase three pipelines that uses a text classification model for generating prediction.\n",
+        "This notebook demonstrates how to use the RunInference API with three popular ML frameworks: PyTorch, TensorFlow, and scikit-learn. The three pipelines use a text classification model for generating predictions.\n",
         "\n",
-        "The different steps needed to build this pipeline can be summarized as follows:\n",
+        "Follow these steps to build a pipeline:\n",
         "* Read the images.\n",
-        "* Preprocess the text if needed\n",
-        "* Inference with PyTorch/TensorFlow/Scikit-learn Model\n",
-        "* PostProcess the output from RunInference if needed "
+        "* If needed, preprocess the text.\n",
+        "* Inference with the PyTorch, TensorFlow, or Scikit-learn model.\n",
+        "* If needed, postprocess the output from RunInference."
       ]
     },
     {
@@ -69,9 +85,9 @@
         "id": "CTtBTpsHZFCk"
       },
       "source": [
-        "### RunInference with a PyTorch Model\n",
+        "## RunInference with a PyTorch model\n",
         "\n",
-        "\n"
+        "This section demonstrates how to use the RunInference API with a PyTorch model."
       ]
     },
     {
@@ -80,7 +96,9 @@
         "id": "5kkjbcIzZIf6"
       },
       "source": [
-        "#### Install Dependency"
+        "### Install dependencies\n",
+        "\n",
+        "First, download and install the dependencies."
       ]
     },
     {
@@ -180,9 +198,9 @@
         "id": "ObRPUrlEbjHj"
       },
       "source": [
-        "#### Model\n",
+        "### Model\n",
         "\n",
-        "We are using a pretrained text classification model, [distilbert-base-uncased-finetuned-sst-2-english](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english?text=I+like+you.+I+love+you). This model is a fine-tune checkpoint of DistilBERT-base-uncased, fine-tuned on SST-2 dataset.\n"
+        "This example uses a pretrained text classification model, [distilbert-base-uncased-finetuned-sst-2-english](https://huggingface.co/distilbert-base-uncased-finetuned-sst-2-english?text=I+like+you.+I+love+you). This model is a checkpoint of DistilBERT-base-uncased, fine-tuned on the SST-2 dataset.\n"
       ]
     },
     {
@@ -219,7 +237,9 @@
         "id": "vA1UmbFRb5C-"
       },
       "source": [
-        "#### Helper Functions"
+        "### Helper functions\n",
+        "\n",
+        "The model also uses helper functions."
       ]
     },
     {
@@ -260,13 +280,13 @@
         "    self._tokenizer = DistilBertTokenizer.from_pretrained(self._model_name)\n",
         "  \n",
         "  def process(self, text_input: str):\n",
-        "    # We need to pad the tokens tensors to max length to make sure that all the tensors\n",
-        "    # are of the same length and hence stack-able by the RunInference API, normally you would batch first\n",
-        "    # and tokenize the batch after and pad each tensor the the max length in the batch.\n",
-        "    # see: https://beam.apache.org/documentation/sdks/python-machine-learning/#unable-to-batch-tensor-elements\n",
+        "    # Pad the token tensors to max length to make sure that all of the tensors\n",
+        "    # are of the same length and stack-able by the RunInference API. Normally, you would batch first\n",
+        "    # then tokenize the batch, padding each tensor the max length in the batch.\n",
+        "    # See: https://beam.apache.org/documentation/sdks/python-machine-learning/#unable-to-batch-tensor-elements\n",
         "    tokens = self._tokenizer(text_input, return_tensors='pt', padding='max_length', max_length=512)\n",
-        "    # squeeze because tokenization adds an extra dimension, which is empty\n",
-        "    # in this case because we're tokenizing one element at a time.\n",
+        "    # Squeeze because tokenization adds an extra dimension, which is empty,\n",
+        "    # in this case because we tokenize one element at a time.\n",
         "    tokens = {key: torch.squeeze(val) for key, val in tokens.items()}\n",
         "    return [(text_input, tokens)]\n",
         "\n",
@@ -283,7 +303,9 @@
         "id": "WYYbQTMWctkW"
       },
       "source": [
-        "#### RunInference Pipeline"
+        "### RunInference pipeline\n",
+        "\n",
+        "This section demonstrates how to use create and run the RunInference pipeline."
       ]
     },
     {
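A minimal sketch of a handler this pipeline could use; the tokenizer above emits dictionaries of tensors, which suit `PytorchModelHandlerKeyedTensor` (the state dict path is hypothetical):

    from apache_beam.ml.inference.base import KeyedModelHandler
    from apache_beam.ml.inference.pytorch_inference import PytorchModelHandlerKeyedTensor
    from transformers import DistilBertConfig, DistilBertForSequenceClassification

    model_handler = PytorchModelHandlerKeyedTensor(
        state_dict_path='distilbert-base-uncased-finetuned-sst-2-english.pt',
        model_class=DistilBertForSequenceClassification,
        model_params={'config': DistilBertConfig.from_pretrained(
            'distilbert-base-uncased-finetuned-sst-2-english')})
    keyed_model_handler = KeyedModelHandler(model_handler)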
@@ -379,7 +401,9 @@
         "id": "7KXeaQg3eCcp"
       },
       "source": [
-        "### RunInference with a TensorFlow Model\n"
+        "## RunInference with a TensorFlow model\n",
+        "\n",
+        "This section demonstrates how to use the RunInference API with a TensorFlow model."
       ]
     },
     {
@@ -388,7 +412,7 @@
         "id": "hEHxNka4eOhC"
       },
       "source": [
-        "Note: Tensorflow models are supported through tfx-bsl."
+        "Note: Tensorflow models are supported through `tfx-bsl`."
       ]
     },
     {
@@ -397,7 +421,9 @@
         "id": "8KyXULYbeYlD"
       },
       "source": [
-        "#### Install Dependency"
+        "### Install dependencies\n",
+        "\n",
+        "First, download and install the dependencies."
       ]
     },
     {
@@ -843,7 +869,7 @@
         "id": "h2JP7zsqerCT"
       },
       "source": [
-        "#### Model"
+        "### Model"
       ]
     },
     {
@@ -852,7 +878,8 @@
         "id": "ydYQ_5EyfeEM"
       },
       "source": [
-        "Download a pretrained binary classifier to perform sentiment analysis on an IMDB dataset from GCS. This model was trained by following this [tutorial](https://www.tensorflow.org/tutorials/keras/text_classification)"
+        "Download a pretrained binary classifier to perform sentiment analysis on an IMDB dataset from Google Cloud Storage.\n",
+        "This model was trained by following this [TensorFlow text classification tutorial](https://www.tensorflow.org/tutorials/keras/text_classification)."
       ]
     },
     {
@@ -872,7 +899,9 @@
         "id": "GZ-Ioc8ZfyIT"
       },
       "source": [
-        "#### Helper Functions"
+        "### Helper functions\n",
+        "\n",
+        "The model also uses helper functions."
       ]
     },
     {
@@ -917,7 +946,9 @@
         "id": "PZVwI4BbgaAI"
       },
       "source": [
-        "#### Prepare the Input"
+        "### Prepare the Input\n",
+        "\n",
+        "This section demonstrates how to prepare the input for your model."
       ]
     },
     {
@@ -946,9 +977,9 @@
       "source": [
         "input_strings_file = 'input_strings.tfrecord'\n",
         "\n",
-        "# Preprocess the input as RunInference is expecting a serialized tf.example as an input\n",
-        "# Write the processed input to a file \n",
-        "# One can also do it as a pipeline step by using beam.Map() \n",
+        "# Because RunInference is expecting a serialized tf.example as an input, preprocess the input.\n",
+        "# Write the processed input to a file. \n",
+        "# You can also do this preprocessing as a pipeline step by using beam.Map().\n",
         "\n",
         "with tf.io.TFRecordWriter(input_strings_file) as writer:\n",
         " for i in inputs:\n",
@@ -962,7 +993,9 @@
         "id": "BYkQl_l8gRgo"
       },
       "source": [
-        "#### RunInference Pipeline"
+        "### RunInference Pipeline\n",
+        "\n",
+        "This section demonstrates how to use create and run the RunInference pipeline."
       ]
     },
     {
@@ -1001,7 +1034,7 @@
         "saved_model_spec = model_spec_pb2.SavedModelSpec(model_path=model_dir)\n",
         "inference_spec_type = model_spec_pb2.InferenceSpecType(saved_model_spec=saved_model_spec)\n",
         "\n",
-        "#A Beam IO that reads a file of serialized tf.Examples\n",
+        "# A Beam I/O that reads a file of serialized tf.Examples\n",
         "tfexample_beam_record = TFExampleRecord(file_pattern='input_strings.tfrecord')\n",
         "\n",
         "with beam.Pipeline() as pipeline:\n",
@@ -1019,7 +1052,9 @@
         "id": "8wBUckzHjGV6"
       },
       "source": [
-        "### RunInference with Scikit-Learn\n"
+        "## RunInference with scikit-learn\n",
+        "\n",
+        "This section demonstrates how to use the RunInference API with scikit-learn."
       ]
     },
     {
@@ -1028,7 +1063,9 @@
         "id": "6ArL_55kjxkO"
       },
       "source": [
-        "#### Install Dependency"
+        "### Install Dependencies\n",
+        "\n",
+        "First, download and install the dependencies."
       ]
     },
     {
@@ -1065,9 +1102,9 @@
         "id": "-7ABKlZvkFHy"
       },
       "source": [
-        "#### Model\n",
+        "### Model\n",
         "\n",
-        "Train and save a sentiment analysis pipeline on movie reviews to classify movie reviews as either positive or negative"
+        "To classify movie reviews as either positive or negative, train and save a sentiment analysis pipeline about movie reviews."
       ]
     },
     {
@@ -1076,7 +1113,7 @@
         "id": "WI_UXluPkRYq"
       },
       "source": [
-        "This model was trained by following this [tutorial](https://scikit-learn.org/stable/tutorial/text_analytics/working_with_text_data.html#exercise-2-sentiment-analysis-on-movie-reviews)"
+        "This model was trained by following this [scikit-learn tutorial](https://scikit-learn.org/stable/tutorial/text_analytics/working_with_text_data.html#exercise-2-sentiment-analysis-on-movie-reviews)"
       ]
     },
     {
@@ -1094,7 +1131,9 @@
         "id": "KL4Cx8s0mBqn"
       },
       "source": [
-        "#### RunInference Pipeline"
+        "### RunInference Pipeline\n",
+        "\n",
+        "This section demonstrates how to use create and run the RunInference pipeline."
       ]
     },
     {
@@ -1138,13 +1177,13 @@
         }
       ],
       "source": [
-        "# One can choose a Sklearn model handler based on their input data type:\n",
-        "# 1. SklearnModelHandlerNumpy: For using numpy arrays as an input\n",
-        "# 2. SklearnModelHandlerPandas: For using pandas dataframes as an input\n",
+        "# Choose an sklearn model handler based on the input data type:\n",
+        "# 1. SklearnModelHandlerNumpy: For using numpy arrays as input.\n",
+        "# 2. SklearnModelHandlerPandas: For using pandas dataframes as input.\n",
         "\n",
-        "# Sklearn model handler supports loading of two serialized format: \n",
-        "# 1. ModelFileType.PICKLE: For models saved using pickle\n",
-        "# 2. ModelFileType.JOBLIB: For models saved using Joblib\n",
+        "# The sklearn model handler supports loading two serialized formats:\n",
+        "# 1. ModelFileType.PICKLE: For models saved using pickle.\n",
+        "# 2. ModelFileType.JOBLIB: For models saved using Joblib.\n",
         "\n",
         "model_handler = SklearnModelHandlerNumpy(model_uri=model_dir, model_file_type=ModelFileType.PICKLE)\n",
         "\n",
diff --git a/examples/notebooks/beam-ml/run_inference_sklearn.ipynb b/examples/notebooks/beam-ml/run_inference_sklearn.ipynb
index 116d79b64ef..1e97bfe48d3 100644
--- a/examples/notebooks/beam-ml/run_inference_sklearn.ipynb
+++ b/examples/notebooks/beam-ml/run_inference_sklearn.ipynb
@@ -4,8 +4,7 @@
   "metadata": {
     "colab": {
       "provenance": [],
-      "collapsed_sections": [],
-      "toc_visible": true
+      "collapsed_sections": []
     },
     "kernelspec": {
       "name": "python3",
@@ -19,7 +18,7 @@
     {
       "cell_type": "code",
       "source": [
-        "#@title ###### Licensed to the Apache Software Foundation (ASF), Version 2.0 (the \"License\")\n",
+        "# @title ###### Licensed to the Apache Software Foundation (ASF), Version 2.0 (the \"License\")\n",
         "\n",
         "# Licensed to the Apache Software Foundation (ASF) under one\n",
         "# or more contributor license agreements. See the NOTICE file\n",
@@ -36,11 +35,11 @@
         "# \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY\n",
         "# KIND, either express or implied. See the License for the\n",
         "# specific language governing permissions and limitations\n",
-        "# under the License."
+        "# under the License"
       ],
       "metadata": {
-        "id": "C1rAsD2L-hSO",
-        "cellView": "form"
+        "cellView": "form",
+        "id": "C1rAsD2L-hSO"
       },
       "execution_count": null,
       "outputs": []
@@ -52,26 +51,18 @@
       },
       "source": [
         "# Apache Beam RunInference for scikit-learn\n",
-        "\n",
-        "<button>\n",
-        "  <a href=\"https://beam.apache.org/documentation/sdks/python-machine-learning/\">\n",
-        "    <img src=\"https://beam.apache.org/images/favicon.ico\" alt=\"Open the docs\" height=\"16\"/>\n",
-        "    Beam RunInference\n",
-        "  </a>\n",
-        "</button>\n",
-        "\n",
-        "In this notebook, we walk through the use of the RunInference transform for [scikit-learn](https://scikit-learn.org/) also called sklearn.\n",
-        "Beam [RunInference](https://beam.apache.org/releases/pydoc/current/apache_beam.ml.inference.base.html#apache_beam.ml.inference.base.RunInference) has implementations of [ModelHandler](https://beam.apache.org/releases/pydoc/current/apache_beam.ml.inference.base.html#apache_beam.ml.inference.base.ModelHandler) prebuilt for scikit-learn.\n",
+        "This notebook demonstrates the use of the RunInference transform for [scikit-learn](https://scikit-learn.org/) also called sklearn.\n",
+        "Apache Beam [RunInference](https://beam.apache.org/releases/pydoc/current/apache_beam.ml.inference.base.html#apache_beam.ml.inference.base.RunInference) has implementations of the [ModelHandler](https://beam.apache.org/releases/pydoc/current/apache_beam.ml.inference.base.html#apache_beam.ml.inference.base.ModelHandler) class prebuilt for scikit-learn. For more information about the RunInference API, see [Machine Learning](https://beam.apache.org/documentation/sdks/python-machine [...]
         "\n",
         "Users can choose a model handler for their input data type:\n",
         "* The [numpy model handler](https://beam.apache.org/releases/pydoc/current/apache_beam.ml.inference.sklearn_inference.html#apache_beam.ml.inference.sklearn_inference.SklearnModelHandlerNumpy)\n",
         "* The [pandas dataframes model handler](https://beam.apache.org/releases/pydoc/current/apache_beam.ml.inference.sklearn_inference.html#apache_beam.ml.inference.sklearn_inference.SklearnModelHandlerNumpy)\n",
         "\n",
-        "These ModelHandlers coupled with RunInference take care of batching, vectorization and optomizing predictions for your scikit-learn pipeline or model.\n",
+        "With RunInference, these ModelHandlers manage batching, vectorization, and prediction optimization for your scikit-learn pipeline or model.\n",
         "\n",
-        "This notebook illustrates common RunInference patterns such as the following:\n",
-        "*   Generating predictions.\n",
-        "*   Post processing results after RunInference.\n",
+        "This notebook demonstrates the following common RunInference patterns:\n",
+        "*   Generate predictions.\n",
+        "*   Postprocess results after RunInference.\n",
         "*   Inference with multiple models in the same pipeline.\n",
         "\n",
         "The linear regression models used in these samples are trained on data that correspondes to the 5 and 10 times table; that is,`y = 5x` and `y = 10x` respectively."
@@ -80,11 +71,11 @@
     {
       "cell_type": "markdown",
       "source": [
-        "## Setup\n",
-        "\n",
-        "1. Install dependencies for beam.\n",
+        "## Before you begin\n",
+        "Complete the following setup steps:\n",
+        "1. Install dependencies for Apache Beam.\n",
         "1. Authenticate with Google Cloud.\n",
-        "1. Specify your project and bucket.  This will be needed to save and load models."
+        "1. Specify your project and bucket. You need the project and bucket to save and load models."
       ],
       "metadata": {
         "id": "zzwnMzzgdyPB"
@@ -161,9 +152,9 @@
         "id": "32c9ba40-9396-48f4-9e7f-a2acced98bb2"
       },
       "source": [
-        "## About scikit-learn Versions\n",
+        "## About scikit-learn versions\n",
         "\n",
-        "scikit-learn is a build-dependency of Apache Beam. If a different version of sklearn needs to be installed, use `%pip install scikit-learn==<version>`"
+        "`scikit-learn` is a build-dependency of Apache Beam. If you need to install a different version of sklearn , use `%pip install scikit-learn==<version>`"
       ]
     },
     {
@@ -200,7 +191,7 @@
         "from apache_beam.ml.inference.base import RunInference\n",
         "from apache_beam.options.pipeline_options import PipelineOptions\n",
         "\n",
-        "# NOTE: If you get an error - restart your runtime.\n"
+        "# NOTE: If an error occurs, restart your runtime.\n"
       ]
     },
     {
@@ -214,10 +205,10 @@
         "import os\n",
         "\n",
         "# Constants\n",
-        "project = \"<Replace With Your Project>\"\n",
-        "bucket = \"<Replace With Your Bucket>\" \n",
+        "project = \"<PROJECT_ID>\"\n",
+        "bucket = \"<BUCKET_NAME>\" \n",
         "\n",
-        "# set the project to avoid warnings.\n",
+        "# To avoid warnings, set the project.\n",
         "os.environ['GOOGLE_CLOUD_PROJECT'] = project\n"
       ]
     },
@@ -227,13 +218,13 @@
         "id": "6695cd22-e0bf-438f-8223-4a93392e6616"
       },
       "source": [
-        "## Create the Data and the scikit-learn Model.\n",
-        "This cell demonstrates:\n",
-        "1. Creating the data to train the scikit-learn linear regression model.\n",
-        "2. Training the linear regression model.\n",
-        "3. Saving the scikit-learn model using `pickle`.\n",
+        "## Create the data and the scikit-learn model\n",
+        "This section demonstrates the following steps:\n",
+        "1. Create the data to train the scikit-learn linear regression model.\n",
+        "2. Train the linear regression model.\n",
+        "3. Save the scikit-learn model using `pickle`.\n",
         "\n",
-        "For this example we create two models, one with 5 times and one with 10 times."
+        "In this example, we create two models, one with the 5 times model and a section with the 10 times model."
       ]
     },
     {
@@ -258,7 +249,7 @@
         "five_times_model_filename = 'sklearn_5x_model.pkl'\n",
         "train_and_save_model(x, y, five_times_model_filename)\n",
         "\n",
-        "# change y to be 10 times and output a 10 times table.\n",
+        "# Change y to be 10 times, and output a 10 times table.\n",
         "ten_times_model_filename = 'sklearn_10x_model.pkl'\n",
         "train_and_save_model(x, y, ten_times_model_filename)\n",
         "y = (x * 10).reshape(-1, 1)\n",
@@ -271,9 +262,9 @@
         "id": "69008a3d-3d15-4643-828c-b0419b347d01"
       },
       "source": [
-        "### scikit-learn RunInference pipeline.\n",
-        "\n",
-        "1. Define the scikit-learn model handler that accepts array_like object as input.\n",
+        "### scikit-learn RunInference pipeline\n",
+        "This section demonstrates the following steps:\n",
+        "1. Define the scikit-learn model handler that accepts an `array_like` object as input.\n",
         "2. Read the data from BigQuery.\n",
         "3. Use the scikit-learn trained model and the scikit-learn RunInference transform on unkeyed data."
       ]
@@ -315,7 +306,7 @@
     {
       "cell_type": "code",
       "source": [
-        "# Populated BigQuery Table\n",
+        "# Populated BigQuery table\n",
         "\n",
         "from google.cloud import bigquery\n",
         "\n",
@@ -397,7 +388,7 @@
         "pipeline_options = PipelineOptions().from_dictionary(\n",
         "                                      {'temp_location':f'gs://{bucket}/tmp'})\n",
         "\n",
-        "# Define BigQuery table specification.\n",
+        "# Define the BigQuery table specification.\n",
         "table_name = 'maths_problems_1'\n",
         "table_spec = f'{project}:maths.{table_name}'\n",
         "\n",
@@ -418,9 +409,10 @@
       },
       "source": [
         "### Sklearn RunInference on keyed inputs.\n",
+        "This section demonstrates the following steps:\n",
         "1. Wrap the `SklearnModelHandlerNumpy` object around `KeyedModelHandler` to handle keyed data.\n",
         "2. Read the data from BigQuery.\n",
-        "3. Use the Sklearn trained model and the Sklearn RunInference transform on a keyed data."
+        "3. Use the sklearn trained model and the sklearn RunInference transform on a keyed data."
       ]
     },
     {
@@ -464,7 +456,7 @@
     {
       "cell_type": "markdown",
       "source": [
-        "## How to Run Multiple Models\n",
+        "## Run multiple models\n",
         "\n",
         "This pipeline takes two RunInference transforms with different models and then combines the output."
       ],
diff --git a/examples/notebooks/beam-ml/run_inference_tensorflow.ipynb b/examples/notebooks/beam-ml/run_inference_tensorflow.ipynb
index 457531cfa74..3f4fa1e39e2 100644
--- a/examples/notebooks/beam-ml/run_inference_tensorflow.ipynb
+++ b/examples/notebooks/beam-ml/run_inference_tensorflow.ipynb
@@ -10,13 +10,16 @@
       "display_name": "Python 3",
       "name": "python3"
     },
+    "language_info": {
+      "name": "python"
+    },
     "accelerator": "GPU"
   },
   "cells": [
     {
       "cell_type": "code",
       "source": [
-        "#@title ###### Licensed to the Apache Software Foundation (ASF), Version 2.0 (the \"License\")\n",
+        "# @title ###### Licensed to the Apache Software Foundation (ASF), Version 2.0 (the \"License\")\n",
         "\n",
         "# Licensed to the Apache Software Foundation (ASF) under one\n",
         "# or more contributor license agreements. See the NOTICE file\n",
@@ -33,7 +36,7 @@
         "# \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY\n",
         "# KIND, either express or implied. See the License for the\n",
         "# specific language governing permissions and limitations\n",
-        "# under the License."
+        "# under the License"
       ],
       "metadata": {
         "id": "fFjof1NgAJwu"
@@ -45,33 +48,29 @@
       "cell_type": "markdown",
       "source": [
         "# Apache Beam RunInference with TensorFlow\n",
-        "\n",
-        "<button>\n",
-        "  <a href=\"https://beam.apache.org/documentation/sdks/python-machine-learning/\">\n",
-        "    <img src=\"https://beam.apache.org/images/favicon.ico\" alt=\"Open the docs\" height=\"16\"/>\n",
-        "    Beam RunInference\n",
-        "  </a>\n",
-        "</button>\n",
+        "This notebook demonstrates the use of the RunInference transform for [TensorFlow](https://www.tensorflow.org/).\n",
+        "Beam [RunInference](https://beam.apache.org/releases/pydoc/current/apache_beam.ml.inference.base.html#apache_beam.ml.inference.base.RunInference) accepts a ModelHandler generated from [`tfx-bsl`](https://github.com/tensorflow/tfx-bsl) via CreateModelHandler.\n",
         "\n",
         "The Apache Beam RunInference transform is used for making predictions for\n",
-        "a variety of machine learning models. From version 1.10.0 of tfx-bsl you can\n",
-        "create a TensorFlow ModelHandler for use with Apache Beam.\n",
-        "\n",
-        "In this notebook, we walk through the use of the RunInference transform for [TensorFlow](https://www.tensorflow.org/).\n",
-        "Beam [RunInference](https://beam.apache.org/releases/pydoc/current/apache_beam.ml.inference.base.html#apache_beam.ml.inference.base.RunInference) accepts ModelHandler generated from [tfx-bsl](https://github.com/tensorflow/tfx-bsl) via CreateModelHandler.\n",
-        "\n",
-        "\n",
-        "\n",
-        "In this notebook we walk through:\n",
-        "- Importing [tfx-bsl](https://github.com/tensorflow/tfx-bsl)\n",
-        "- Building a simple TensorFlow model\n",
-        "- Setting up example data into TensorFlow protos.\n",
-        "- Running those examples and getting a prediction inside an Apache Beam pipeline."
+        "a variety of machine learning models. In versions 1.10.0 and later of `tfx-bsl`, you can\n",
+        "create a TensorFlow ModelHandler for use with Apache Beam. For more information about the RunInference API, see [Machine Learning](https://beam.apache.org/documentation/sdks/python-machine-learning) in the Apache Beam documentation.\n",
+        "\n",
+        "This notebook demonstrates the following steps:\n",
+        "- Import [`tfx-bsl`](https://github.com/tensorflow/tfx-bsl).\n",
+        "- Build a simple TensorFlow model.\n",
+        "- Set up example data in TensorFlow protos.\n",
+        "- Run those examples and get a prediction inside an Apache Beam pipeline."
       ],
       "metadata": {
         "id": "HrCtxslBGK8Z"
       }
     },
+    {
+      "cell_type": "markdown",
+      "source": [
+        "First, import `tfx-bsl`."
+      ]
+    },
     {
       "cell_type": "code",
       "metadata": {
@@ -87,7 +86,7 @@
     {
       "cell_type": "markdown",
       "source": [
-        "In order to use RunInference you will need beam 2.40 or higher. Creation of a ModelHandler is supported in tfx-bsl 1.10 or higher."
+        "To use RunInference, install Apache Beam version 2.40 or later. Creation of a ModelHandler is supported in `tfx-bsl` versions 1.10 and later."
       ],
       "metadata": {
         "id": "gVCtGOKTHMm4"
@@ -121,8 +120,8 @@
     {
       "cell_type": "markdown",
       "source": [
-        "## Authenticate With Cloud\n",
-        "This notebook relies on saving your model into Google Cloud. First authenticate this notebook to use your Google Cloud account."
+        "## Authenticate with Google Cloud\n",
+        "This notebook relies on saving your model to Google Cloud. To use your Google Cloud account, authenticate this notebook."
       ],
       "metadata": {
         "id": "X80jy3FqHjK4"
@@ -143,10 +142,10 @@
     {
       "cell_type": "markdown",
       "source": [
-        "## Import Your Dependencies and Setup Your Bucket\n",
-        "Replace project and bucket with your variables with your project.\n",
+        "## Import dependencies and set up your bucket\n",
+        "Replace `PROJECT_ID` and `BUCKET_NAME` with the ID of your project and the name of your bucket.\n",
         "\n",
-        "**Important**: If you get an error, restart your runtime."
+        "**Important**: If an error occurs, restart your runtime."
       ],
       "metadata": {
         "id": "40qtP6zJuMXm"
@@ -178,8 +177,8 @@
         "\n",
         "from apache_beam.options.pipeline_options import PipelineOptions\n",
         "\n",
-        "project = \"<Replace With Your Project>\"\n",
-        "bucket = \"<Replace With Your Bucket>\"\n",
+        "project = \"<PROJECT_ID>\"\n",
+        "bucket = \"<BUCKET_NAME>\"\n",
         "\n",
         "save_model_dir_multiply = f'gs://{bucket}/tfx-inference/model/multiply_five/v1/'\n"
       ],
@@ -189,9 +188,9 @@
     {
       "cell_type": "markdown",
       "source": [
-        "## Create and Test a Simple Model\n",
+        "## Create and test a simple model\n",
         "\n",
-        "This creates a model that predicts the 5 times table."
+        "This step creates a model that predicts the 5 times table."
       ],
       "metadata": {
         "id": "YzvZWEv-1oiK"
@@ -207,13 +206,13 @@
         "outputId": "e2b7bc2c-a3a4-44a8-86be-962bf505540a"
       },
       "source": [
-        "# Create training data which represents the 5 times multiplication table for 0 to 99. \n",
-        "# x is the data and y the labels. \n",
+        "# Create training data that represents the 5 times multiplication table for the numbers 0 to 99. \n",
+        "# x is the data and y is the labels. \n",
         "x = numpy.arange(0, 100)   # Examples\n",
         "y = x * 5                  # Labels\n",
         "\n",
         "# Build a simple linear regression model.\n",
-        "# Note the model has a shape of (1) for its input layer, it will expect a single int64 value.\n",
+        "# Note that the model has a shape of (1) for its input layer and expects a single int64 value.\n",
         "input_layer = keras.layers.Input(shape=(1), dtype=tf.float32, name='x')\n",
         "output_layer= keras.layers.Dense(1)(input_layer)\n",
         "\n",
@@ -247,7 +246,9 @@
     {
       "cell_type": "markdown",
       "source": [
-        "### Test the Model\n"
+        "### Test the model\n",
+        "\n",
+        "This step tests the model that you created."
       ],
       "metadata": {
         "id": "O_a0-4Gb19cy"
@@ -290,9 +291,9 @@
     {
       "cell_type": "markdown",
       "source": [
-        "### Populate the Data into a TensorFlow Proto\n",
+        "### Populate the data in a TensorFlow proto\n",
         "\n",
-        "Tensorflow data uses protos. If you are loading from a file there are helpers for this. Since we are using generated data, this code populates a proto."
+        "Tensorflow data uses protos. If you are loading from a file, helpers exist for this step. Because we are using generated data, this code populates a proto."
       ],
       "metadata": {
         "id": "dEmleqiH3t71"
@@ -304,8 +305,9 @@
         "id": "XvKc9kQilPjx"
       },
       "source": [
-        "# This is an example of a proto that converts the samples and labels into\n",
-        "# tensors usable by tensorflow.\n",
+        "# This example shows a proto that converts the samples and labels into\n",
+        "# tensors usable by TensorFlow.\n",
+        "\n",
         "class ExampleProcessor:\n",
         "    def create_example_with_label(self, feature: numpy.float32,\n",
         "                             label: numpy.float32)-> tf.train.Example:\n",
@@ -324,7 +326,7 @@
         "    def create_feature(self, element: numpy.float32):\n",
         "        return tf.train.Feature(float_list=tf.train.FloatList(value=[element]))\n",
         "\n",
-        "# Create a labeled example file for 5 times table.\n",
+        "# Create a labeled example file for the 5 times table.\n",
         "\n",
         "example_five_times_table = 'example_five_times_table.tfrecord'\n",
         "\n",
@@ -351,7 +353,7 @@
       "source": [
         "### Fit The Model\n",
         "\n",
-        "This example builds a model. Since RunInference requires pretrained models, this segment builds a usable model."
+        "This example builds a model. Because RunInference requires pretrained models, this segment builds a usable model."
       ],
       "metadata": {
         "id": "G-sAu3cf31f3"
@@ -397,7 +399,9 @@
     {
       "cell_type": "markdown",
       "source": [
-        "### Save the Model"
+        "### Save the model\n",
+        "\n",
+        "This step shows how to save your model."
       ],
       "metadata": {
         "id": "r4dpR6dQ4JwX"
@@ -439,9 +443,9 @@
       "source": [
         "## Run the Pipeline\n",
         "\n",
-        "FormatOutput demonstrates how to extract values from the output protos.\n",
+        "`FormatOutput` demonstrates how to extract values from the output protos.\n",
         "\n",
-        "CreateModelHandler demonstrates the model handler that needs to be passed into beams RunInference API."
+        "`CreateModelHandler` demonstrates the model handler that needs to be passed into the Apache Beam RunInference API."
       ],
       "metadata": {
         "id": "P2UMmbNW4YQV"
@@ -470,8 +474,8 @@
         "\n",
         "tfexample_beam_record = tfx_bsl.public.tfxio.TFExampleRecord(file_pattern=predict_values_five_times_table)\n",
         "saved_model_spec = model_spec_pb2.SavedModelSpec(model_path=save_model_dir_multiply)\n",
-        "inferece_spec_type = model_spec_pb2.InferenceSpecType(saved_model_spec=saved_model_spec)\n",
-        "model_handler = CreateModelHandler(inferece_spec_type)\n",
+        "inference_spec_type = model_spec_pb2.InferenceSpecType(saved_model_spec=saved_model_spec)\n",
+        "model_handler = CreateModelHandler(inference_spec_type)\n",
         "with beam.Pipeline() as p:\n",
         "    _ = (p | tfexample_beam_record.RawRecordBeamSource() \n",
         "           | RunInference(model_handler)\n",
@@ -498,13 +502,13 @@
       "source": [
         "## KeyedModelHandler with TensorFlow\n",
         "\n",
-        "By defuault the ModelHandler does not expect a key.\n",
+        "By default, the `ModelHandler` does not expect a key.\n",
         "\n",
-        "If you know you will have keys associated with your examples then you should wrap the model handler with beam.KeyedModelHandler\n",
+        "If you know that keys are associated with your examples, wrap the model handler with `beam.KeyedModelHandler`.\n",
         "\n",
-        "There is also a beam.MaybeKeyedModelHandler if you are unsure if you have keys or not.\n",
+        "If you don't know whether keys are associated with your examples, use `beam.MaybeKeyedModelHandler`.\n",
         "\n",
-        "This also illustrates how to use tfx-bsl examples."
+        "This step also illustrates how to use `tfx-bsl` examples."
       ],
       "metadata": {
         "id": "IXikjkGdHm9n"
@@ -518,14 +522,14 @@
         "import tensorflow as tf\n",
         "\n",
         "class FormatOutputKeyed(FormatOutput):\n",
-        "  # inherit from FormatOutput above to simplify\n",
+        "  # To simplify, inherit from FormatOutput.\n",
         "  def process(self, tuple_in: Tuple):\n",
         "    key, element = tuple_in\n",
         "    output = super().process(element)\n",
         "    yield ' : '.join([key, next(output)])\n",
         "\n",
         "def make_example(num):\n",
-        "  # Return keyed values in the form of (key num, example)\n",
+        "  # Return keyed values in the form of (key num, example).\n",
         "  key = f'key {num}'\n",
         "  tf_proto = text_format.Parse(\n",
         "    \"\"\"\n",
@@ -545,8 +549,8 @@
         "\n",
         "tfexample_beam_record = tfx_bsl.public.tfxio.TFExampleRecord(file_pattern=predict_values_five_times_table)\n",
         "saved_model_spec = model_spec_pb2.SavedModelSpec(model_path=save_model_dir_multiply)\n",
-        "inferece_spec_type = model_spec_pb2.InferenceSpecType(saved_model_spec=saved_model_spec)\n",
-        "keyed_model_handler = KeyedModelHandler(CreateModelHandler(inferece_spec_type))\n",
+        "inference_spec_type = model_spec_pb2.InferenceSpecType(saved_model_spec=saved_model_spec)\n",
+        "keyed_model_handler = KeyedModelHandler(CreateModelHandler(inference_spec_type))\n",
         "with beam.Pipeline() as p:\n",
         "    _ = (p | 'CreateExamples' >> beam.Create(examples)\n",
         "           | RunInference(keyed_model_handler)\n",