Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/07/08 16:51:55 UTC

[GitHub] [beam] TheNeuralBit commented on a diff in pull request #22069: Reviewing the RunInference ReadMe file for clarity.

TheNeuralBit commented on code in PR #22069:
URL: https://github.com/apache/beam/pull/22069#discussion_r916989262


##########
sdks/python/apache_beam/examples/inference/README.md:
##########
@@ -60,27 +61,28 @@ for details."
 ---
 ## Image classification
 
-[`pytorch_image_classification.py`](./pytorch_image_classification.py) contains an implementation for a RunInference pipeline that performs image classification using the mobilenet_v2 architecture.
+[`pytorch_image_classification.py`](./pytorch_image_classification.py) contains an implementation for a RunInference pipeline that performs image classification using the `mobilenet_v2` architecture.
 
-The pipeline reads the images, performs basic preprocessing, passes them to the PyTorch implementation of RunInference, and then writes the predictions to a text file.
+The pipeline reads the images, performs basic preprocessing, passes the images to the PyTorch implementation of RunInference, and then writes the predictions to a text file.
 
 ### Dataset and model for image classification
 
-You will need to create or download images, and place them into your `IMAGES_DIR` directory. One popular dataset is from [ImageNet](https://www.image-net.org/). Please follow their instructions to download the images.
-- **Required**: A path to a file called `IMAGE_FILE_NAMES` that contains the absolute paths of each of the images in `IMAGES_DIR` on which you want to run image segmentation. Paths can be different types of URIs such as your local file system, a AWS S3 bucket or GCP Cloud Storage bucket. For example:
+Create a directory named `IMAGES_DIR`. Create or download images and put them in this directory. The directory is not required if image names in the input file `IMAGE_FILE_NAMES` (see below) have absolute paths.
+One popular dataset is from [ImageNet](https://www.image-net.org/). Follow their instructions to download the images.
+- **Required**: A path to a file named `IMAGE_FILE_NAMES` that contains the absolute paths of each of the images in `IMAGES_DIR` that you want to use to run image classification. Paths can be different types of URIs such as your local file system, an AWS S3 bucket, or a GCP Cloud Storage bucket. For example:
 ```
 /absolute/path/to/image1.jpg
 /absolute/path/to/image2.jpg
 ```
-- **Required**: A path to a file called `MODEL_STATE_DICT` that contains the saved parameters of the maskrcnn_resnet50_fpn model. You will need to download the [mobilenet_v2](https://pytorch.org/vision/stable/_modules/torchvision/models/mobilenetv2.html) model from Pytorch's repository of pretrained models. Note that this requires `torchvision` library.
+- **Required**: Download the [mobilenet_v2](https://pytorch.org/vision/stable/_modules/torchvision/models/mobilenetv2.html) model from Pytorch's repository of pretrained models. This model requires the torchvision library. To download this model, run the following commands:
 ```
 import torch
 from torchvision.models.detection import mobilenet_v2
 model = mobilenet_v2(pretrained=True)
 torch.save(model.state_dict(), 'mobilenet_v2.pth')
 ```
-- **Required**: A path to a file called `OUTPUT`, to which the pipeline will write the predictions.
-- **Optional**: `IMAGES_DIR`, which is the path to the directory where images are stored. Not required if image names in the input file `IMAGE_FILE_NAMES` have absolute paths.
+- **Required**: A path to a file named `MODEL_STATE_DICT` that contains the saved parameters of the `mobilenet_v2` model.

Review Comment:
   That makes sense, but to me this looks like a bulleted list defining the four inputs to the script, not a list of directions/actions. For each input, it identifies whether the input is Required or Optional and explains how to populate it. It seems odd to then have one separate bullet that represents an instruction.
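
   For context on the first input in that list: the hunk describes `IMAGE_FILE_NAMES` as a file holding one absolute image path per line. A minimal sketch of one way to populate it from a local directory follows. The helper name `write_image_file_names` and the extension filter are assumptions made for illustration; they are not part of the PR or of Beam, and the sketch only covers local paths, not S3 or Cloud Storage URIs.

```python
from pathlib import Path


def write_image_file_names(images_dir, output_file,
                           extensions=(".jpg", ".jpeg", ".png")):
    """Write one absolute image path per line (hypothetical helper).

    The extension filter is an assumption; the README hunk does not
    say which image formats the pipeline accepts.
    """
    root = Path(images_dir).resolve()
    # Collect regular files whose suffix matches, in a stable order.
    paths = sorted(
        str(p) for p in root.iterdir()
        if p.is_file() and p.suffix.lower() in extensions
    )
    # One path per line, matching the format shown in the README hunk.
    Path(output_file).write_text("".join(f"{p}\n" for p in paths))
    return paths
```

   The resulting file would then be supplied as the `IMAGE_FILE_NAMES` input described in the hunk.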





