Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/09/02 20:49:51 UTC

[GitHub] [beam] damccorm opened a new pull request, #23018: Clarify inference example docs

damccorm opened a new pull request, #23018:
URL: https://github.com/apache/beam/pull/23018

   I walked through the inference examples - they're great, but I wanted to clean up some points of friction, confusion, and inconsistency that I found, and to streamline them a bit. Note: this is a little opinionated, so definitely feel free to disagree if you think this makes things less clear in places.
   
   This also removes the skip_header_lines arg when we're reading from txt files. I didn't understand why we had that, but if there's a good reason, we should document it in the README.
   
   ------------------------
   
   Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
   
    - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`).
    - [ ] Mention the appropriate issue in your description (for example: `addresses #123`), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment `fixes #<ISSUE NUMBER>` instead.
    - [ ] Update `CHANGES.md` with noteworthy changes.
    - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make the review process smoother](https://beam.apache.org/contribute/get-started-contributing/#make-the-reviewers-job-easier).
   
   To check the build health, please visit [https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md](https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md)
   
   GitHub Actions Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   [![Build python source distribution and wheels](https://github.com/apache/beam/workflows/Build%20python%20source%20distribution%20and%20wheels/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Build+python+source+distribution+and+wheels%22+branch%3Amaster+event%3Aschedule)
   [![Python tests](https://github.com/apache/beam/workflows/Python%20tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Python+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Java tests](https://github.com/apache/beam/workflows/Java%20Tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Java+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Go tests](https://github.com/apache/beam/workflows/Go%20tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Go+tests%22+branch%3Amaster+event%3Aschedule)
   
   See [CI.md](https://github.com/apache/beam/blob/master/CI.md) for more information about GitHub Actions CI.
   




[GitHub] [beam] AnandInguva commented on pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
AnandInguva commented on PR #23018:
URL: https://github.com/apache/beam/pull/23018#issuecomment-1237586117

   Run Python 3.7 PostCommit




[GitHub] [beam] bvolpato commented on a diff in pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
bvolpato commented on code in PR #23018:
URL: https://github.com/apache/beam/pull/23018#discussion_r962443439


##########
sdks/python/apache_beam/examples/inference/README.md:
##########
@@ -67,37 +67,35 @@ The pipeline reads the images, performs basic preprocessing, passes the images t
 
 To use this transform, you need a dataset and model for image classification.
 
-1. Create a directory named `IMAGES_DIR`. Create or download images and put them in this directory. The directory is not required if image names in the input file `IMAGE_FILE_NAMES` have absolute paths.
+1. Create a directory named `IMAGES_DIR`. Create or download images and put them in this directory. The directory is not required if image names in the input file `IMAGE_FILE_NAMES.txt` you create in step 2 have absolute paths.
 One popular dataset is from [ImageNet](https://www.image-net.org/). Follow their instructions to download the images.
-2. Create a file named `IMAGE_FILE_NAMES` that contains the absolute paths of each of the images in `IMAGES_DIR` that you want to use to run image classification. The path to the file can be different types of URIs such as your local file system, an AWS S3 bucket, or a GCP Cloud Storage bucket. For example:
+2. Create a file named `IMAGE_FILE_NAMES.txt` that contains the absolute paths of each of the images in `IMAGES_DIR` that you want to use to run image classification. The path to the file can be different types of URIs such as your local file system, an AWS S3 bucket, or a GCP Cloud Storage bucket. For example:
 ```
 /absolute/path/to/image1.jpg
 /absolute/path/to/image2.jpg
 ```
-3. Download the [mobilenet_v2](https://pytorch.org/vision/stable/_modules/torchvision/models/mobilenetv2.html) model from Pytorch's repository of pretrained models. This model requires the torchvision library. To download this model, run the following commands:
+3. Download the [mobilenet_v2](https://pytorch.org/vision/stable/_modules/torchvision/models/mobilenetv2.html) model from Pytorch's repository of pretrained models. This model requires the torchvision library. To download this model, run the following commands from a python shell:
 ```
 import torch
 from torchvision.models import mobilenet_v2
 model = mobilenet_v2(pretrained=True)
-torch.save(model.state_dict(), 'mobilenet_v2.pth')
+torch.save(model.state_dict(), 'mobilenet_v2.pth') # You can replace mobilenet_v2.pth with your preferred file name for your model state dictionary.
 ```
-4. Create a file named `MODEL_STATE_DICT` that contains the saved parameters of the `mobilenet_v2` model.
-5. Note the path to the `OUTPUT` file. This file is used by the pipeline to write the predictions.
 
 ### Running `pytorch_image_classification.py`
 
 To run the image classification pipeline locally, use the following command:
 ```sh
 python -m apache_beam.examples.inference.pytorch_image_classification \
   --input IMAGE_FILE_NAMES \

Review Comment:
   ```suggestion
     --input IMAGE_FILE_NAMES.txt \
   ```





[GitHub] [beam] damccorm commented on pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
damccorm commented on PR #23018:
URL: https://github.com/apache/beam/pull/23018#issuecomment-1235886096

   R: @yeandy @rszper 




[GitHub] [beam] damccorm commented on pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
damccorm commented on PR #23018:
URL: https://github.com/apache/beam/pull/23018#issuecomment-1238049944

   Run Python 3.9 PostCommit




[GitHub] [beam] AnandInguva commented on pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
AnandInguva commented on PR #23018:
URL: https://github.com/apache/beam/pull/23018#issuecomment-1236483165

   Run Python 3.9 PostCommit




[GitHub] [beam] codecov[bot] commented on pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
codecov[bot] commented on PR #23018:
URL: https://github.com/apache/beam/pull/23018#issuecomment-1235906879

   # [Codecov](https://codecov.io/gh/apache/beam/pull/23018?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#23018](https://codecov.io/gh/apache/beam/pull/23018?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (315446d) into [master](https://codecov.io/gh/apache/beam/commit/fe297c365356c4b4f7dc294206d64300c6a09c3a?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (fe297c3) will **decrease** coverage by `0.00%`.
   > The diff coverage is `n/a`.
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #23018      +/-   ##
   ==========================================
   - Coverage   73.69%   73.69%   -0.01%     
   ==========================================
     Files         714      714              
     Lines       95240    95240              
   ==========================================
   - Hits        70191    70184       -7     
   - Misses      23752    23759       +7     
     Partials     1297     1297              
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | python | `83.50% <ø> (-0.02%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/23018?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...examples/inference/pytorch\_image\_classification.py](https://codecov.io/gh/apache/beam/pull/23018/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvaW5mZXJlbmNlL3B5dG9yY2hfaW1hZ2VfY2xhc3NpZmljYXRpb24ucHk=) | `0.00% <ø> (ø)` | |
   | [...m/examples/inference/pytorch\_image\_segmentation.py](https://codecov.io/gh/apache/beam/pull/23018/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvaW5mZXJlbmNlL3B5dG9yY2hfaW1hZ2Vfc2VnbWVudGF0aW9uLnB5) | `0.00% <ø> (ø)` | |
   | [...examples/inference/sklearn\_mnist\_classification.py](https://codecov.io/gh/apache/beam/pull/23018/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvaW5mZXJlbmNlL3NrbGVhcm5fbW5pc3RfY2xhc3NpZmljYXRpb24ucHk=) | `43.75% <ø> (ø)` | |
   | [sdks/python/apache\_beam/io/source\_test\_utils.py](https://codecov.io/gh/apache/beam/pull/23018/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vc291cmNlX3Rlc3RfdXRpbHMucHk=) | `88.01% <0.00%> (-1.39%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/23018/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `93.28% <0.00%> (-0.75%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/23018/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `93.30% <0.00%> (-0.38%)` | :arrow_down: |
   | [...on/apache\_beam/runners/dataflow/dataflow\_runner.py](https://codecov.io/gh/apache/beam/pull/23018/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kYXRhZmxvdy9kYXRhZmxvd19ydW5uZXIucHk=) | `82.87% <0.00%> (-0.14%)` | :arrow_down: |
   | [sdks/python/apache\_beam/io/localfilesystem.py](https://codecov.io/gh/apache/beam/pull/23018/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vbG9jYWxmaWxlc3lzdGVtLnB5) | `91.72% <0.00%> (+0.75%)` | :arrow_up: |
   
   :mega: We’re building smart automated test selection to slash your CI/CD build times. [Learn more](https://about.codecov.io/iterative-testing/?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   




[GitHub] [beam] bvolpato commented on a diff in pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
bvolpato commented on code in PR #23018:
URL: https://github.com/apache/beam/pull/23018#discussion_r962446569


##########
sdks/python/apache_beam/examples/inference/README.md:
##########
@@ -192,14 +190,14 @@ He looked up and saw the sun and stars .
 To run the language modeling pipeline locally, use the following command:
 ```sh
 python -m apache_beam.examples.inference.pytorch_language_modeling \
-  --input SENTENCES \
+  --input SENTENCES \ # optional
   --output OUTPUT \
   --model_state_dict_path MODEL_STATE_DICT
 ```
-For example:
+For example, if you've followed the naming conventions recommended above:
 ```sh
 python -m apache_beam.examples.inference.pytorch_language_modeling \
-  --input sentences.txt \
+  --input SENTENCES.txt \ # optional

Review Comment:
   Move the comment or add `\` to the end?
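   For reference, a `#` comment placed after the trailing `\` swallows the line continuation, so the remaining flags run as a separate (broken) command. A working variant keeps comments on their own lines (a sketch):
   ```sh
   # --input is optional for this example; omit it rather than marking it inline.
   python -m apache_beam.examples.inference.pytorch_language_modeling \
     --input SENTENCES.txt \
     --output OUTPUT \
     --model_state_dict_path MODEL_STATE_DICT
   ```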





[GitHub] [beam] damccorm commented on pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
damccorm commented on PR #23018:
URL: https://github.com/apache/beam/pull/23018#issuecomment-1238337137

   Run Python PreCommit




[GitHub] [beam] bvolpato commented on a diff in pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
bvolpato commented on code in PR #23018:
URL: https://github.com/apache/beam/pull/23018#discussion_r962448219


##########
sdks/python/apache_beam/examples/inference/README.md:
##########
@@ -218,8 +216,8 @@ He looked up and saw the sun and stars .;moon
 ...
 ```
 Each line has data separated by a semicolon ";".
-The first item is the sentence with the last word masked. The second item
-is the word that the model predicts for the mask.
+The first item is the input sentence. The model masks the last word and tries to predict it;
+the second item is the word that the model predicts for the mask.
 
 ---
 ## MNITST digit classification

Review Comment:
   ```suggestion
   ## MNIST digit classification
   ```





[GitHub] [beam] damccorm commented on a diff in pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
damccorm commented on code in PR #23018:
URL: https://github.com/apache/beam/pull/23018#discussion_r963153511


##########
sdks/python/apache_beam/examples/inference/README.md:
##########
@@ -67,37 +67,35 @@ The pipeline reads the images, performs basic preprocessing, passes the images t
 
 To use this transform, you need a dataset and model for image classification.
 
-1. Create a directory named `IMAGES_DIR`. Create or download images and put them in this directory. The directory is not required if image names in the input file `IMAGE_FILE_NAMES` have absolute paths.
+1. Create a directory named `IMAGES_DIR`. Create or download images and put them in this directory. The directory is not required if image names in the input file `IMAGE_FILE_NAMES.txt` you create in step 2 have absolute paths.
 One popular dataset is from [ImageNet](https://www.image-net.org/). Follow their instructions to download the images.
-2. Create a file named `IMAGE_FILE_NAMES` that contains the absolute paths of each of the images in `IMAGES_DIR` that you want to use to run image classification. The path to the file can be different types of URIs such as your local file system, an AWS S3 bucket, or a GCP Cloud Storage bucket. For example:
+2. Create a file named `IMAGE_FILE_NAMES.txt` that contains the absolute paths of each of the images in `IMAGES_DIR` that you want to use to run image classification. The path to the file can be different types of URIs such as your local file system, an AWS S3 bucket, or a GCP Cloud Storage bucket. For example:
 ```
 /absolute/path/to/image1.jpg
 /absolute/path/to/image2.jpg
 ```
-3. Download the [mobilenet_v2](https://pytorch.org/vision/stable/_modules/torchvision/models/mobilenetv2.html) model from Pytorch's repository of pretrained models. This model requires the torchvision library. To download this model, run the following commands:
+3. Download the [mobilenet_v2](https://pytorch.org/vision/stable/_modules/torchvision/models/mobilenetv2.html) model from Pytorch's repository of pretrained models. This model requires the torchvision library. To download this model, run the following commands from a python shell:
 ```
 import torch
 from torchvision.models import mobilenet_v2
 model = mobilenet_v2(pretrained=True)
-torch.save(model.state_dict(), 'mobilenet_v2.pth')
+torch.save(model.state_dict(), 'mobilenet_v2.pth') # You can replace mobilenet_v2.pth with your preferred file name for your model state dictionary.
 ```
-4. Create a file named `MODEL_STATE_DICT` that contains the saved parameters of the `mobilenet_v2` model.
-5. Note the path to the `OUTPUT` file. This file is used by the pipeline to write the predictions.
 
 ### Running `pytorch_image_classification.py`
 
 To run the image classification pipeline locally, use the following command:
 ```sh
 python -m apache_beam.examples.inference.pytorch_image_classification \
   --input IMAGE_FILE_NAMES \
-  --images_dir IMAGES_DIR \
+  --images_dir IMAGES_DIR \ # Only needed if your IMAGE_FILE_NAMES file contains relative paths (they will be relative from IMAGES_DIR).

Review Comment:
   Good call - I moved all the comments out of the script blocks (here and elsewhere).





[GitHub] [beam] bvolpato commented on pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
bvolpato commented on PR #23018:
URL: https://github.com/apache/beam/pull/23018#issuecomment-1237573924

   > I didn't run into this and I don't have Rust installed, so this is surprising - maybe it's dependent on the version that gets installed? It's surprising, but I'm inclined to provide the minimal set that needs to be installed so that we don't need to keep it up to date (though if you convince me to pin to a specific version, that changes 😄 )
   
   Do you get something on `which rustc`? Just doing `pip install transformers` was failing for me with the above error.
   Apparently it comes from https://github.com/huggingface/tokenizers, which is built on Rust.
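   (If anyone else hits this: when pip can't find a prebuilt `tokenizers` wheel for the platform, it falls back to building from source, which needs a Rust toolchain. The standard rustup one-liner is probably the quickest workaround:)
   ```sh
   # Official rustup installer; provides the rustc/cargo toolchain that a
   # source build of huggingface/tokenizers requires.
   curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
   ```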
   
   
   > Are you using my branch? I made a change to fix this
   
   My bad. I followed the README from your branch but was using the Beam module straight from the `pip install`.
   
   > I think I disagree - Beam should work with a range of versions of these deps, and if it doesn't (beyond what pip enforces), it's probably concerning (especially because we're not doing anything with much complexity here). I like the idea of making that clear to users as well. I'm open to being convinced on this point.
   
   I don't have a strong opinion here. I mentioned it so we can be consistent with the installation of torch, and it feels like not specifying a version may put the burden on us if those libraries break something.
   I'm not familiar with the whole test stack here yet, but could we have some automation that goes through the steps of this tutorial from time to time? That way we'd at least find out before anyone reports a problem.




[GitHub] [beam] damccorm commented on pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
damccorm commented on PR #23018:
URL: https://github.com/apache/beam/pull/23018#issuecomment-1238049161

   @bvolpato 
   
   > Do you get something on which rustc?
   
   Nope - weird that you're seeing this and I'm not, though. Which version of transformers is pip giving you (`pip show transformers`)? I'm on 4.21.2.
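   A quick way to compare environments (a sketch; the 4.21.2 pin just mirrors the version that worked for me):
   ```sh
   # Check which transformers version pip resolved:
   pip show transformers | grep Version
   # Reproduce a known-good environment by pinning it explicitly:
   pip install 'transformers==4.21.2'
   ```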
   
   Regardless, I'm probably not inclined to do too much here since this is out of our control (and other dependencies could add additional requirements).
   
   > I'm not familiar with the whole test stack here yet, but could we have some automation that goes through the steps of the examples/tutorials from time to time?
   
   I think the postcommits will at least make sure they compile correctly. Doing a good dependency test here is pretty tough, though, because it's a little environment-dependent (for example, you're hitting the error and I'm not :) ). We'll also probably get different results depending on the platform.
   
   ---------
   
   Also, I updated the examples to filter out empty lines.
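   A minimal sketch of that filtering step (assuming a small generator-style helper named `filter_empty_lines`, matching the stage names suggested in review below):
   ```python
   import apache_beam as beam

   def filter_empty_lines(text: str):
     # Yield the line only if it contains non-whitespace characters;
     # blank lines are silently dropped from the PCollection.
     if len(text.strip()) > 0:
       yield text

   # Wired in right after the read step of each example, e.g.:
   # | 'ReadImageNames' >> beam.io.ReadFromText(known_args.input)
   # | 'FilterEmptyLines' >> beam.ParDo(filter_empty_lines)
   ```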




[GitHub] [beam] AnandInguva commented on a diff in pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
AnandInguva commented on code in PR #23018:
URL: https://github.com/apache/beam/pull/23018#discussion_r962428847


##########
sdks/python/apache_beam/examples/inference/sklearn_mnist_classification.py:
##########
@@ -104,8 +104,7 @@ def run(
 
   label_pixel_tuple = (
       pipeline
-      | "ReadFromInput" >> beam.io.ReadFromText(
-          known_args.input, skip_header_lines=1)

Review Comment:
   skip_header_lines is used in case the CSV file has column names.
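   For illustration, the examples were using it roughly like this (a sketch; `INPUT.csv` is a stand-in for the real input path):
   ```python
   import apache_beam as beam

   with beam.Pipeline() as pipeline:
     label_pixel_tuple = (
         pipeline
         # skip_header_lines=1 drops the first line of the file, e.g. a CSV
         # row of column names, before the pipeline parses any records.
         | 'ReadFromInput' >> beam.io.ReadFromText(
             'INPUT.csv', skip_header_lines=1))
   ```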





[GitHub] [beam] bvolpato commented on a diff in pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
bvolpato commented on code in PR #23018:
URL: https://github.com/apache/beam/pull/23018#discussion_r962443951


##########
sdks/python/apache_beam/examples/inference/README.md:
##########
@@ -67,37 +67,35 @@ The pipeline reads the images, performs basic preprocessing, passes the images t
 
 To use this transform, you need a dataset and model for image classification.
 
-1. Create a directory named `IMAGES_DIR`. Create or download images and put them in this directory. The directory is not required if image names in the input file `IMAGE_FILE_NAMES` have absolute paths.
+1. Create a directory named `IMAGES_DIR`. Create or download images and put them in this directory. The directory is not required if image names in the input file `IMAGE_FILE_NAMES.txt` you create in step 2 have absolute paths.
 One popular dataset is from [ImageNet](https://www.image-net.org/). Follow their instructions to download the images.
-2. Create a file named `IMAGE_FILE_NAMES` that contains the absolute paths of each of the images in `IMAGES_DIR` that you want to use to run image classification. The path to the file can be different types of URIs such as your local file system, an AWS S3 bucket, or a GCP Cloud Storage bucket. For example:
+2. Create a file named `IMAGE_FILE_NAMES.txt` that contains the absolute paths of each of the images in `IMAGES_DIR` that you want to use to run image classification. The path to the file can be different types of URIs such as your local file system, an AWS S3 bucket, or a GCP Cloud Storage bucket. For example:
 ```
 /absolute/path/to/image1.jpg
 /absolute/path/to/image2.jpg
 ```
-3. Download the [mobilenet_v2](https://pytorch.org/vision/stable/_modules/torchvision/models/mobilenetv2.html) model from Pytorch's repository of pretrained models. This model requires the torchvision library. To download this model, run the following commands:
+3. Download the [mobilenet_v2](https://pytorch.org/vision/stable/_modules/torchvision/models/mobilenetv2.html) model from Pytorch's repository of pretrained models. This model requires the torchvision library. To download this model, run the following commands from a python shell:
 ```
 import torch
 from torchvision.models import mobilenet_v2
 model = mobilenet_v2(pretrained=True)
-torch.save(model.state_dict(), 'mobilenet_v2.pth')
+torch.save(model.state_dict(), 'mobilenet_v2.pth') # You can replace mobilenet_v2.pth with your preferred file name for your model state dictionary.
 ```
-4. Create a file named `MODEL_STATE_DICT` that contains the saved parameters of the `mobilenet_v2` model.
-5. Note the path to the `OUTPUT` file. This file is used by the pipeline to write the predictions.
 
 ### Running `pytorch_image_classification.py`
 
 To run the image classification pipeline locally, use the following command:
 ```sh
 python -m apache_beam.examples.inference.pytorch_image_classification \
   --input IMAGE_FILE_NAMES \
-  --images_dir IMAGES_DIR \
+  --images_dir IMAGES_DIR \ # Only needed if your IMAGE_FILE_NAMES file contains relative paths (they will be relative from IMAGES_DIR).

Review Comment:
   Move the comment elsewhere or add `\` after it. The shell command can't run as-is now.



##########
sdks/python/apache_beam/examples/inference/README.md:
##########
@@ -67,37 +67,35 @@ The pipeline reads the images, performs basic preprocessing, passes the images t
 
 To use this transform, you need a dataset and model for image classification.
 
-1. Create a directory named `IMAGES_DIR`. Create or download images and put them in this directory. The directory is not required if image names in the input file `IMAGE_FILE_NAMES` have absolute paths.
+1. Create a directory named `IMAGES_DIR`. Create or download images and put them in this directory. The directory is not required if image names in the input file `IMAGE_FILE_NAMES.txt` you create in step 2 have absolute paths.
 One popular dataset is from [ImageNet](https://www.image-net.org/). Follow their instructions to download the images.
-2. Create a file named `IMAGE_FILE_NAMES` that contains the absolute paths of each of the images in `IMAGES_DIR` that you want to use to run image classification. The path to the file can be different types of URIs such as your local file system, an AWS S3 bucket, or a GCP Cloud Storage bucket. For example:
+2. Create a file named `IMAGE_FILE_NAMES.txt` that contains the absolute paths of each of the images in `IMAGES_DIR` that you want to use to run image classification. The path to the file can be different types of URIs such as your local file system, an AWS S3 bucket, or a GCP Cloud Storage bucket. For example:
 ```
 /absolute/path/to/image1.jpg
 /absolute/path/to/image2.jpg
 ```
-3. Download the [mobilenet_v2](https://pytorch.org/vision/stable/_modules/torchvision/models/mobilenetv2.html) model from Pytorch's repository of pretrained models. This model requires the torchvision library. To download this model, run the following commands:
+3. Download the [mobilenet_v2](https://pytorch.org/vision/stable/_modules/torchvision/models/mobilenetv2.html) model from Pytorch's repository of pretrained models. This model requires the torchvision library. To download this model, run the following commands from a python shell:
 ```
 import torch
 from torchvision.models import mobilenet_v2
 model = mobilenet_v2(pretrained=True)
-torch.save(model.state_dict(), 'mobilenet_v2.pth')
+torch.save(model.state_dict(), 'mobilenet_v2.pth') # You can replace mobilenet_v2.pth with your preferred file name for your model state dictionary.
 ```
-4. Create a file named `MODEL_STATE_DICT` that contains the saved parameters of the `mobilenet_v2` model.
-5. Note the path to the `OUTPUT` file. This file is used by the pipeline to write the predictions.
 
 ### Running `pytorch_image_classification.py`
 
 To run the image classification pipeline locally, use the following command:
 ```sh
 python -m apache_beam.examples.inference.pytorch_image_classification \
   --input IMAGE_FILE_NAMES \
-  --images_dir IMAGES_DIR \
+  --images_dir IMAGES_DIR \ # Only needed if your IMAGE_FILE_NAMES file contains relative paths (they will be relative from IMAGES_DIR).

Review Comment:
   Move the comment elsewhere or add `\` after it? The shell command can't run as-is now.





[GitHub] [beam] yeandy commented on a diff in pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
yeandy commented on code in PR #23018:
URL: https://github.com/apache/beam/pull/23018#discussion_r963621805


##########
sdks/python/apache_beam/examples/inference/README.md:
##########
@@ -67,22 +67,20 @@ The pipeline reads the images, performs basic preprocessing, passes the images t
 
 To use this transform, you need a dataset and model for image classification.
 
-1. Create a directory named `IMAGES_DIR`. Create or download images and put them in this directory. The directory is not required if image names in the input file `IMAGE_FILE_NAMES` have absolute paths.
+1. Create a directory named `IMAGES_DIR`. Create or download images and put them in this directory. The directory is not required if image names in the input file `IMAGE_FILE_NAMES.txt` you create in step 2 have absolute paths.
 One popular dataset is from [ImageNet](https://www.image-net.org/). Follow their instructions to download the images.
-2. Create a file named `IMAGE_FILE_NAMES` that contains the absolute paths of each of the images in `IMAGES_DIR` that you want to use to run image classification. The path to the file can be different types of URIs such as your local file system, an AWS S3 bucket, or a GCP Cloud Storage bucket. For example:
+2. Create a file named `IMAGE_FILE_NAMES.txt` that contains the absolute paths of each of the images in `IMAGES_DIR` that you want to use to run image classification. The path to the file can be different types of URIs such as your local file system, an AWS S3 bucket, or a GCP Cloud Storage bucket. For example:
 ```
 /absolute/path/to/image1.jpg
 /absolute/path/to/image2.jpg
 ```
-3. Download the [mobilenet_v2](https://pytorch.org/vision/stable/_modules/torchvision/models/mobilenetv2.html) model from Pytorch's repository of pretrained models. This model requires the torchvision library. To download this model, run the following commands:
+3. Download the [mobilenet_v2](https://pytorch.org/vision/stable/_modules/torchvision/models/mobilenetv2.html) model from Pytorch's repository of pretrained models. This model requires the torchvision library. To download this model, run the following commands from a python shell:

Review Comment:
   ```suggestion
   3. Download the [mobilenet_v2](https://pytorch.org/vision/stable/_modules/torchvision/models/mobilenetv2.html) model from Pytorch's repository of pretrained models. This model requires the torchvision library. To download this model, run the following commands from a Python shell:
   ```



##########
sdks/python/apache_beam/examples/inference/README.md:
##########
@@ -118,22 +121,21 @@ The pipeline reads images, performs basic preprocessing, passes the images to th
 
 To use this transform, you need a dataset and model for image segmentation.
 
-1. Create a directory named `IMAGES_DIR`. Create or download images and put them in this directory. The directory is not required if image names in the input file `IMAGE_FILE_NAMES` have absolute paths.
+1. Create a directory named `IMAGES_DIR`. Create or download images and put them in this directory. The directory is not required if image names in the input file `IMAGE_FILE_NAMES.txt` you create in step 2 have absolute paths.
 A popular dataset is from [Coco](https://cocodataset.org/#home). Follow their instructions to download the images.
-2. Create a file named `IMAGE_FILE_NAMES` that contains the absolute paths of each of the images in `IMAGES_DIR` that you want to use to run image segmentation. The path to the file can be different types of URIs such as your local file system, an AWS S3 bucket, or a GCP Cloud Storage bucket. For example:
+2. Create a file named `IMAGE_FILE_NAMES.txt` that contains the absolute paths of each of the images in `IMAGES_DIR` that you want to use to run image segmentation. The path to the file can be different types of URIs such as your local file system, an AWS S3 bucket, or a GCP Cloud Storage bucket. For example:
 ```
 /absolute/path/to/image1.jpg
 /absolute/path/to/image2.jpg
 ```
-3. Download the [maskrcnn_resnet50_fpn](https://pytorch.org/vision/0.12/models.html#id70) model from Pytorch's repository of pretrained models. This model requires the torchvision library. To download this model, run the following commands:
+3. Download the [maskrcnn_resnet50_fpn](https://pytorch.org/vision/0.12/models.html#id70) model from Pytorch's repository of pretrained models. This model requires the torchvision library. To download this model, run the following commands from a python shell:

Review Comment:
   ```suggestion
   3. Download the [maskrcnn_resnet50_fpn](https://pytorch.org/vision/0.12/models.html#id70) model from Pytorch's repository of pretrained models. This model requires the torchvision library. To download this model, run the following commands from a Python shell:
   ```



##########
sdks/python/apache_beam/examples/inference/README.md:
##########
@@ -118,22 +121,21 @@ The pipeline reads images, performs basic preprocessing, passes the images to th
 
 To use this transform, you need a dataset and model for image segmentation.
 
-1. Create a directory named `IMAGES_DIR`. Create or download images and put them in this directory. The directory is not required if image names in the input file `IMAGE_FILE_NAMES` have absolute paths.
+1. Create a directory named `IMAGES_DIR`. Create or download images and put them in this directory. The directory is not required if image names in the input file `IMAGE_FILE_NAMES.txt` you create in step 2 have absolute paths.
 A popular dataset is from [Coco](https://cocodataset.org/#home). Follow their instructions to download the images.
-2. Create a file named `IMAGE_FILE_NAMES` that contains the absolute paths of each of the images in `IMAGES_DIR` that you want to use to run image segmentation. The path to the file can be different types of URIs such as your local file system, an AWS S3 bucket, or a GCP Cloud Storage bucket. For example:
+2. Create a file named `IMAGE_FILE_NAMES.txt` that contains the absolute paths of each of the images in `IMAGES_DIR` that you want to use to run image segmentation. The path to the file can be different types of URIs such as your local file system, an AWS S3 bucket, or a GCP Cloud Storage bucket. For example:
 ```
 /absolute/path/to/image1.jpg
 /absolute/path/to/image2.jpg
 ```
-3. Download the [maskrcnn_resnet50_fpn](https://pytorch.org/vision/0.12/models.html#id70) model from Pytorch's repository of pretrained models. This model requires the torchvision library. To download this model, run the following commands:
+3. Download the [maskrcnn_resnet50_fpn](https://pytorch.org/vision/0.12/models.html#id70) model from Pytorch's repository of pretrained models. This model requires the torchvision library. To download this model, run the following commands from a python shell:
 ```
 import torch
 from torchvision.models.detection import maskrcnn_resnet50_fpn
 model = maskrcnn_resnet50_fpn(pretrained=True)
-torch.save(model.state_dict(), 'maskrcnn_resnet50_fpn.pth')
+torch.save(model.state_dict(), 'maskrcnn_resnet50_fpn.pth') # You can replace maskrcnn_resnet50_fpn.pth with your preferred file name for your model state dictionary.
 ```
-4. Create a path to a file named `MODEL_STATE_DICT` that contains the saved parameters of the `maskrcnn_resnet50_fpn` model.

Review Comment:
   Why did you remove comments about `MODEL_STATE_DICT`?



##########
sdks/python/apache_beam/examples/inference/README.md:
##########
@@ -67,22 +67,20 @@ The pipeline reads the images, performs basic preprocessing, passes the images t
 
 To use this transform, you need a dataset and model for image classification.
 
-1. Create a directory named `IMAGES_DIR`. Create or download images and put them in this directory. The directory is not required if image names in the input file `IMAGE_FILE_NAMES` have absolute paths.
+1. Create a directory named `IMAGES_DIR`. Create or download images and put them in this directory. The directory is not required if image names in the input file `IMAGE_FILE_NAMES.txt` you create in step 2 have absolute paths.
 One popular dataset is from [ImageNet](https://www.image-net.org/). Follow their instructions to download the images.
-2. Create a file named `IMAGE_FILE_NAMES` that contains the absolute paths of each of the images in `IMAGES_DIR` that you want to use to run image classification. The path to the file can be different types of URIs such as your local file system, an AWS S3 bucket, or a GCP Cloud Storage bucket. For example:
+2. Create a file named `IMAGE_FILE_NAMES.txt` that contains the absolute paths of each of the images in `IMAGES_DIR` that you want to use to run image classification. The path to the file can be different types of URIs such as your local file system, an AWS S3 bucket, or a GCP Cloud Storage bucket. For example:
 ```
 /absolute/path/to/image1.jpg
 /absolute/path/to/image2.jpg
 ```
-3. Download the [mobilenet_v2](https://pytorch.org/vision/stable/_modules/torchvision/models/mobilenetv2.html) model from Pytorch's repository of pretrained models. This model requires the torchvision library. To download this model, run the following commands:
+3. Download the [mobilenet_v2](https://pytorch.org/vision/stable/_modules/torchvision/models/mobilenetv2.html) model from Pytorch's repository of pretrained models. This model requires the torchvision library. To download this model, run the following commands from a python shell:
 ```
 import torch
 from torchvision.models import mobilenet_v2
 model = mobilenet_v2(pretrained=True)
-torch.save(model.state_dict(), 'mobilenet_v2.pth')
+torch.save(model.state_dict(), 'mobilenet_v2.pth') # You can replace mobilenet_v2.pth with your preferred file name for your model state dictionary.
 ```
-4. Create a file named `MODEL_STATE_DICT` that contains the saved parameters of the `mobilenet_v2` model.
-5. Note the path to the `OUTPUT` file. This file is used by the pipeline to write the predictions.

Review Comment:
   Why did you remove comments about `MODEL_STATE_DICT` and `OUTPUT`?



##########
sdks/python/apache_beam/examples/inference/pytorch_image_classification.py:
##########
@@ -136,8 +142,8 @@ def run(
 
   filename_value_pair = (
       pipeline
-      | 'ReadImageNames' >> beam.io.ReadFromText(
-          known_args.input, skip_header_lines=1)
+      | 'ReadImageNames' >> beam.io.ReadFromText(known_args.input)
+      | 'RemoveEmptyLines' >> beam.ParDo(filter_empty_lines)

Review Comment:
   ```suggestion
         | 'FilterEmptyLines' >> beam.ParDo(filter_empty_lines)
   ```



##########
sdks/python/apache_beam/examples/inference/pytorch_image_segmentation.py:
##########
@@ -225,8 +231,8 @@ def run(
 
   filename_value_pair = (
       pipeline
-      | 'ReadImageNames' >> beam.io.ReadFromText(
-          known_args.input, skip_header_lines=1)
+      | 'ReadImageNames' >> beam.io.ReadFromText(known_args.input)
+      | 'RemoveEmptyLines' >> beam.ParDo(filter_empty_lines)

Review Comment:
   ```suggestion
         | 'FilterEmptyLines' >> beam.ParDo(filter_empty_lines)
   ```



##########
sdks/python/apache_beam/examples/inference/README.md:
##########
@@ -171,16 +175,14 @@ The pipeline reads sentences, performs basic preprocessing to convert the last w
 
 To use this transform, you need a dataset and model for language modeling.
 
-1. Download the [BertForMaskedLM](https://huggingface.co/docs/transformers/model_doc/bert#transformers.BertForMaskedLM) model from Hugging Face's repository of pretrained models. You must already have `transformers` installed.
+1. Download the [BertForMaskedLM](https://huggingface.co/docs/transformers/model_doc/bert#transformers.BertForMaskedLM) model from Hugging Face's repository of pretrained models. You must already have `transformers` installed, then from a python shell run:

Review Comment:
   ```suggestion
   1. Download the [BertForMaskedLM](https://huggingface.co/docs/transformers/model_doc/bert#transformers.BertForMaskedLM) model from Hugging Face's repository of pretrained models. You must already have `transformers` installed, then from a Python shell run:
   ```



##########
sdks/python/apache_beam/examples/inference/README.md:
##########
@@ -94,10 +92,12 @@ python -m apache_beam.examples.inference.pytorch_image_classification \
   --output OUTPUT \
   --model_state_dict_path MODEL_STATE_DICT
 ```
-For example:
+`images_dir` is only needed if your `IMAGE_FILE_NAMES` file contains relative paths (they will be relative from `IMAGES_DIR`).

Review Comment:
   ```suggestion
   `images_dir` is only needed if your `IMAGE_FILE_NAMES.txt` file contains relative paths (they will be relative from `IMAGES_DIR`).
   ```



##########
sdks/python/apache_beam/examples/inference/README.md:
##########
@@ -145,10 +147,12 @@ python -m apache_beam.examples.inference.pytorch_image_segmentation \
   --output OUTPUT \
   --model_state_dict_path MODEL_STATE_DICT
 ```
-For example:
+`images_dir` is only needed if your `IMAGE_FILE_NAMES` file contains relative paths (they will be relative from `IMAGES_DIR`).

Review Comment:
   ```suggestion
   `images_dir` is only needed if your `IMAGE_FILE_NAMES.txt` file contains relative paths (they will be relative from `IMAGES_DIR`).
   ```



##########
sdks/python/apache_beam/examples/inference/README.md:
##########
@@ -171,16 +175,14 @@ The pipeline reads sentences, performs basic preprocessing to convert the last w
 
 To use this transform, you need a dataset and model for language modeling.
 
-1. Download the [BertForMaskedLM](https://huggingface.co/docs/transformers/model_doc/bert#transformers.BertForMaskedLM) model from Hugging Face's repository of pretrained models. You must already have `transformers` installed.
+1. Download the [BertForMaskedLM](https://huggingface.co/docs/transformers/model_doc/bert#transformers.BertForMaskedLM) model from Hugging Face's repository of pretrained models. You must already have `transformers` installed, then from a python shell run:
 ```
 import torch
 from transformers import BertForMaskedLM
 model = BertForMaskedLM.from_pretrained('bert-base-uncased', return_dict=True)
-torch.save(model.state_dict(), 'BertForMaskedLM.pth')
+torch.save(model.state_dict(), 'BertForMaskedLM.pth') # You can replace BertForMaskedLM.pth with your preferred file name for your model state dictionary.
 ```
-2. Create a file named `MODEL_STATE_DICT` that contains the saved parameters of the `BertForMaskedLM` model.

Review Comment:
   Why did you remove comments about `MODEL_STATE_DICT` and `OUTPUT`?





[GitHub] [beam] AnandInguva commented on pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
AnandInguva commented on PR #23018:
URL: https://github.com/apache/beam/pull/23018#issuecomment-1237580464

   I updated the CSV files and removed the column names, since the examples won't use them. The IT tests pass now.




[GitHub] [beam] rszper commented on a diff in pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
rszper commented on code in PR #23018:
URL: https://github.com/apache/beam/pull/23018#discussion_r962051775


##########
sdks/python/apache_beam/examples/inference/README.md:
##########
@@ -24,7 +24,7 @@ API. <!---TODO: Add link to full documentation on Beam website when it's publish
 
 ## Prerequisites
 
-You must have `apache-beam>=2.40.0` installed in order to run these pipelines,
+You must have `apache-beam>=2.40.0` or greater installed in order to run these pipelines,

Review Comment:
   greater -> later





[GitHub] [beam] bvolpato commented on a diff in pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
bvolpato commented on code in PR #23018:
URL: https://github.com/apache/beam/pull/23018#discussion_r962320324


##########
sdks/python/apache_beam/examples/inference/README.md:
##########
@@ -24,7 +24,7 @@ API. <!---TODO: Add link to full documentation on Beam website when it's publish
 
 ## Prerequisites
 
-You must have `apache-beam>=2.40.0` installed in order to run these pipelines,
+You must have `apache-beam>=2.40.0` or greater installed in order to run these pipelines,
 because the `apache_beam.examples.inference` module was added in that release.
 ```
 pip install apache-beam==2.40.0

Review Comment:
   Should we have `pip install 'apache-beam>=2.40.0'` to be consistent with the above?
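   (The quotes matter in most shells: without them, `>` is parsed as output redirection, so the unquoted form installs the latest `apache-beam` and writes pip's output to a file named `=2.40.0`.)
   ```sh
   pip install 'apache-beam>=2.40.0'  # quoted so the shell passes the version specifier through to pip
   ```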





[GitHub] [beam] AnandInguva commented on pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
AnandInguva commented on PR #23018:
URL: https://github.com/apache/beam/pull/23018#issuecomment-1237590475

   > > I didn't run into this and I don't have Rust installed, so this is surprising - maybe it's dependent on the version that gets installed? It's surprising, but I'm inclined to provide the minimal set that needs to be installed so that we don't need to keep it up to date (though if you convince me to pin to a specific version, that changes 😄 )
   > 
   > Do you get something on `which rustc`? Just doing `pip install transformers` was failing for me with the above error (Mac M1, unsure if it matters). Apparently it comes from https://github.com/huggingface/tokenizers, which is built on Rust.
   > 
   > > Are you using my branch? I made a change to fix this
   > 
   > My bad. I followed the README from your branch but was using the Beam module straight from the `pip install`.
   > 
   > > I think I disagree - Beam should work with a range of versions of these deps, and if it doesn't (beyond what pip enforces), it's probably concerning (especially because we're not doing anything with much complexity here). I like the idea of making that clear to users as well. I'm open to being convinced on this point.
   > 
   > I don't have a strong opinion here. I mentioned it so we can be consistent with the installation of torch, and it feels like not specifying a version may put the burden on us if those libraries break something. I'm not familiar with the whole test stack here yet, but could we have some automation that goes through the steps of the examples/tutorials from time to time? That way we'd at least find out before anyone reports, or, even worse, avoid the case where someone can't get it working and never reports at all.
   > 
   > To add to the last sentence, can we somehow encourage people to file an issue or report any problems or friction they have while going over the examples?
   
   @BjornPrime also encountered something related to Rust. Maybe he can provide some context on whether it's related to Beam or Hugging Face.




[GitHub] [beam] AnandInguva commented on pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
AnandInguva commented on PR #23018:
URL: https://github.com/apache/beam/pull/23018#issuecomment-1238208299

   > > I updated the CSV files and removed the column names, since the examples won't use them. The IT tests pass now
   > 
   > @AnandInguva thanks - where do those files live for future reference?
   
   They live at gs://apache-beam-ml/testing/inputs.
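   For future reference, they can be listed with (assuming `gsutil` is installed and you have read access to the bucket):
   ```sh
   gsutil ls gs://apache-beam-ml/testing/inputs
   ```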




[GitHub] [beam] AnandInguva commented on pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
AnandInguva commented on PR #23018:
URL: https://github.com/apache/beam/pull/23018#issuecomment-1239452167

   This PR also fixes https://github.com/apache/beam/issues/23056




[GitHub] [beam] damccorm commented on a diff in pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
damccorm commented on code in PR #23018:
URL: https://github.com/apache/beam/pull/23018#discussion_r963153362


##########
sdks/python/apache_beam/examples/inference/sklearn_mnist_classification.py:
##########
@@ -104,8 +104,7 @@ def run(
 
   label_pixel_tuple = (
       pipeline
-      | "ReadFromInput" >> beam.io.ReadFromText(
-          known_args.input, skip_header_lines=1)

Review Comment:
   I think it's a pretty surprising experience, especially since we don't have headers in our example CSV. I did call out that the created CSV shouldn't have headers, though.





[GitHub] [beam] yeandy commented on a diff in pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
yeandy commented on code in PR #23018:
URL: https://github.com/apache/beam/pull/23018#discussion_r963736235


##########
sdks/python/apache_beam/examples/inference/README.md:
##########
@@ -67,22 +67,20 @@ The pipeline reads the images, performs basic preprocessing, passes the images t
 
 To use this transform, you need a dataset and model for image classification.
 
-1. Create a directory named `IMAGES_DIR`. Create or download images and put them in this directory. The directory is not required if image names in the input file `IMAGE_FILE_NAMES` have absolute paths.
+1. Create a directory named `IMAGES_DIR`. Create or download images and put them in this directory. The directory is not required if image names in the input file `IMAGE_FILE_NAMES.txt` you create in step 2 have absolute paths.
 One popular dataset is from [ImageNet](https://www.image-net.org/). Follow their instructions to download the images.
-2. Create a file named `IMAGE_FILE_NAMES` that contains the absolute paths of each of the images in `IMAGES_DIR` that you want to use to run image classification. The path to the file can be different types of URIs such as your local file system, an AWS S3 bucket, or a GCP Cloud Storage bucket. For example:
+2. Create a file named `IMAGE_FILE_NAMES.txt` that contains the absolute paths of each of the images in `IMAGES_DIR` that you want to use to run image classification. The path to the file can be different types of URIs such as your local file system, an AWS S3 bucket, or a GCP Cloud Storage bucket. For example:
 ```
 /absolute/path/to/image1.jpg
 /absolute/path/to/image2.jpg
 ```
-3. Download the [mobilenet_v2](https://pytorch.org/vision/stable/_modules/torchvision/models/mobilenetv2.html) model from Pytorch's repository of pretrained models. This model requires the torchvision library. To download this model, run the following commands:
+3. Download the [mobilenet_v2](https://pytorch.org/vision/stable/_modules/torchvision/models/mobilenetv2.html) model from Pytorch's repository of pretrained models. This model requires the torchvision library. To download this model, run the following commands from a python shell:
 ```
 import torch
 from torchvision.models import mobilenet_v2
 model = mobilenet_v2(pretrained=True)
-torch.save(model.state_dict(), 'mobilenet_v2.pth')
+torch.save(model.state_dict(), 'mobilenet_v2.pth') # You can replace mobilenet_v2.pth with your preferred file name for your model state dictionary.
 ```
-4. Create a file named `MODEL_STATE_DICT` that contains the saved parameters of the `mobilenet_v2` model.
-5. Note the path to the `OUTPUT` file. This file is used by the pipeline to write the predictions.

Review Comment:
   👍 





[GitHub] [beam] damccorm commented on pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
damccorm commented on PR #23018:
URL: https://github.com/apache/beam/pull/23018#issuecomment-1238049599

   > I updated the csv files and removed the column names from the csv since the examples won't use them. The IT tests pass now.
   
   @AnandInguva thanks - where do those files live, for future reference?




[GitHub] [beam] damccorm commented on pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
damccorm commented on PR #23018:
URL: https://github.com/apache/beam/pull/23018#issuecomment-1238049803

   Run Python 3.7 PostCommit




[GitHub] [beam] damccorm commented on pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
damccorm commented on PR #23018:
URL: https://github.com/apache/beam/pull/23018#issuecomment-1237529280

   > pip install transformers required Rust to be installed:
   > error: can't find Rust compiler
   >
   > Probably worth mentioning as a prerequisite: https://www.rust-lang.org/tools/install
   
   I didn't run into this, and I don't have Rust installed, so this is surprising - maybe it's dependent on the version that gets installed? Still, I'm inclined to document the minimal set that needs to be installed so that we don't need to keep that list up to date (though if you convince me to pin to a specific version, that changes 😄).
   
   > Apparently, the first line of my IMAGE_FILE_NAMES.txt is being ignored all the time.
   
   Are you using my branch? I made a change to fix this.
   
   > I had an extra blank line on my SENTENCES.txt file, and there were some weird failures. It could likely be ignored:
   
   I agree this should be fixed; I'll get to it tomorrow morning.
   
   > IMHO pip install torchvision/transformers could have fixed versions as well, so users don't run into problems if those dependencies make incompatible changes.
   
   I think I disagree - Beam should work with a range of versions of these deps, and if it doesn't (beyond what pip enforces), that's probably concerning (especially because we're not doing anything with much complexity here). I like the idea of making that clear to users as well. I'm open to being convinced on this point.




[GitHub] [beam] AnandInguva commented on pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
AnandInguva commented on PR #23018:
URL: https://github.com/apache/beam/pull/23018#issuecomment-1236489543

   Run Python 3.7 PostCommit




[GitHub] [beam] AnandInguva commented on pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
AnandInguva commented on PR #23018:
URL: https://github.com/apache/beam/pull/23018#issuecomment-1237589994

   Run Python 3.9 PostCommit




[GitHub] [beam] AnandInguva commented on pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
AnandInguva commented on PR #23018:
URL: https://github.com/apache/beam/pull/23018#issuecomment-1239450001

   LGTM




[GitHub] [beam] rszper commented on pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
rszper commented on PR #23018:
URL: https://github.com/apache/beam/pull/23018#issuecomment-1235952987

   LGTM, one minor comment




[GitHub] [beam] github-actions[bot] commented on pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
github-actions[bot] commented on PR #23018:
URL: https://github.com/apache/beam/pull/23018#issuecomment-1235886751

   Stopping reviewer notifications for this pull request: review requested by someone other than the bot, ceding control




[GitHub] [beam] damccorm commented on a diff in pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
damccorm commented on code in PR #23018:
URL: https://github.com/apache/beam/pull/23018#discussion_r963705183


##########
sdks/python/apache_beam/examples/inference/README.md:
##########
@@ -67,22 +67,20 @@ The pipeline reads the images, performs basic preprocessing, passes the images t
 
 To use this transform, you need a dataset and model for image classification.
 
-1. Create a directory named `IMAGES_DIR`. Create or download images and put them in this directory. The directory is not required if image names in the input file `IMAGE_FILE_NAMES` have absolute paths.
+1. Create a directory named `IMAGES_DIR`. Create or download images and put them in this directory. The directory is not required if image names in the input file `IMAGE_FILE_NAMES.txt` you create in step 2 have absolute paths.
 One popular dataset is from [ImageNet](https://www.image-net.org/). Follow their instructions to download the images.
-2. Create a file named `IMAGE_FILE_NAMES` that contains the absolute paths of each of the images in `IMAGES_DIR` that you want to use to run image classification. The path to the file can be different types of URIs such as your local file system, an AWS S3 bucket, or a GCP Cloud Storage bucket. For example:
+2. Create a file named `IMAGE_FILE_NAMES.txt` that contains the absolute paths of each of the images in `IMAGES_DIR` that you want to use to run image classification. The path to the file can be different types of URIs such as your local file system, an AWS S3 bucket, or a GCP Cloud Storage bucket. For example:
 ```
 /absolute/path/to/image1.jpg
 /absolute/path/to/image2.jpg
 ```
-3. Download the [mobilenet_v2](https://pytorch.org/vision/stable/_modules/torchvision/models/mobilenetv2.html) model from Pytorch's repository of pretrained models. This model requires the torchvision library. To download this model, run the following commands:
+3. Download the [mobilenet_v2](https://pytorch.org/vision/stable/_modules/torchvision/models/mobilenetv2.html) model from Pytorch's repository of pretrained models. This model requires the torchvision library. To download this model, run the following commands from a python shell:
 ```
 import torch
 from torchvision.models import mobilenet_v2
 model = mobilenet_v2(pretrained=True)
-torch.save(model.state_dict(), 'mobilenet_v2.pth')
+torch.save(model.state_dict(), 'mobilenet_v2.pth') # You can replace mobilenet_v2.pth with your preferred file name for your model state dictionary.
 ```
-4. Create a file named `MODEL_STATE_DICT` that contains the saved parameters of the `mobilenet_v2` model.
-5. Note the path to the `OUTPUT` file. This file is used by the pipeline to write the predictions.

Review Comment:
   `MODEL_STATE_DICT` has been created as part of a previous step; the `OUTPUT` line just confused me, since that file doesn't exist yet. The example command below is clearer, IMO.
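   
   For completeness, a sketch of the loading side (standard torch/torchvision calls; assumes the `mobilenet_v2.pth` file from the save step above):
   ```
   # Sketch: load the saved parameters back into a fresh model instance.
   import torch
   from torchvision.models import mobilenet_v2

   model = mobilenet_v2()  # architecture only, no pretrained weights
   model.load_state_dict(torch.load('mobilenet_v2.pth'))
   model.eval()  # switch to inference mode
   ```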



##########
sdks/python/apache_beam/examples/inference/README.md:
##########
@@ -118,22 +121,21 @@ The pipeline reads images, performs basic preprocessing, passes the images to th
 
 To use this transform, you need a dataset and model for image segmentation.
 
-1. Create a directory named `IMAGES_DIR`. Create or download images and put them in this directory. The directory is not required if image names in the input file `IMAGE_FILE_NAMES` have absolute paths.
+1. Create a directory named `IMAGES_DIR`. Create or download images and put them in this directory. The directory is not required if image names in the input file `IMAGE_FILE_NAMES.txt` you create in step 2 have absolute paths.
 A popular dataset is from [Coco](https://cocodataset.org/#home). Follow their instructions to download the images.
-2. Create a file named `IMAGE_FILE_NAMES` that contains the absolute paths of each of the images in `IMAGES_DIR` that you want to use to run image segmentation. The path to the file can be different types of URIs such as your local file system, an AWS S3 bucket, or a GCP Cloud Storage bucket. For example:
+2. Create a file named `IMAGE_FILE_NAMES.txt` that contains the absolute paths of each of the images in `IMAGES_DIR` that you want to use to run image segmentation. The path to the file can be different types of URIs such as your local file system, an AWS S3 bucket, or a GCP Cloud Storage bucket. For example:
 ```
 /absolute/path/to/image1.jpg
 /absolute/path/to/image2.jpg
 ```
-3. Download the [maskrcnn_resnet50_fpn](https://pytorch.org/vision/0.12/models.html#id70) model from Pytorch's repository of pretrained models. This model requires the torchvision library. To download this model, run the following commands:
+3. Download the [maskrcnn_resnet50_fpn](https://pytorch.org/vision/0.12/models.html#id70) model from Pytorch's repository of pretrained models. This model requires the torchvision library. To download this model, run the following commands from a python shell:
 ```
 import torch
 from torchvision.models.detection import maskrcnn_resnet50_fpn
 model = maskrcnn_resnet50_fpn(pretrained=True)
-torch.save(model.state_dict(), 'maskrcnn_resnet50_fpn.pth')
+torch.save(model.state_dict(), 'maskrcnn_resnet50_fpn.pth') # You can replace maskrcnn_resnet50_fpn.pth with your preferred file name for your model state dictionary.
 ```
-4. Create a path to a file named `MODEL_STATE_DICT` that contains the saved parameters of the `maskrcnn_resnet50_fpn` model.

Review Comment:
   Because this happens as part of a previous step



##########
sdks/python/apache_beam/examples/inference/README.md:
##########
@@ -171,16 +175,14 @@ The pipeline reads sentences, performs basic preprocessing to convert the last w
 
 To use this transform, you need a dataset and model for language modeling.
 
-1. Download the [BertForMaskedLM](https://huggingface.co/docs/transformers/model_doc/bert#transformers.BertForMaskedLM) model from Hugging Face's repository of pretrained models. You must already have `transformers` installed.
+1. Download the [BertForMaskedLM](https://huggingface.co/docs/transformers/model_doc/bert#transformers.BertForMaskedLM) model from Hugging Face's repository of pretrained models. You must already have `transformers` installed, then from a python shell run:
 ```
 import torch
 from transformers import BertForMaskedLM
 model = BertForMaskedLM.from_pretrained('bert-base-uncased', return_dict=True)
-torch.save(model.state_dict(), 'BertForMaskedLM.pth')
+torch.save(model.state_dict(), 'BertForMaskedLM.pth') # You can replace BertForMaskedLM.pth with your preferred file name for your model state dictionary.
 ```
-2. Create a file named `MODEL_STATE_DICT` that contains the saved parameters of the `BertForMaskedLM` model.

Review Comment:
   Same as above
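   
   The same loading pattern applies here, as a sketch (standard transformers/torch calls; assumes the `BertForMaskedLM.pth` file from the save step above):
   ```
   # Sketch: reload the saved BertForMaskedLM parameters.
   import torch
   from transformers import BertForMaskedLM

   model = BertForMaskedLM.from_pretrained('bert-base-uncased', return_dict=True)
   model.load_state_dict(torch.load('BertForMaskedLM.pth'))
   model.eval()
   ```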





[GitHub] [beam] bvolpato commented on a diff in pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
bvolpato commented on code in PR #23018:
URL: https://github.com/apache/beam/pull/23018#discussion_r962319728


##########
sdks/python/apache_beam/examples/inference/README.md:
##########
@@ -67,37 +67,35 @@ The pipeline reads the images, performs basic preprocessing, passes the images t
 
 To use this transform, you need a dataset and model for image classification.
 
-1. Create a directory named `IMAGES_DIR`. Create or download images and put them in this directory. The directory is not required if image names in the input file `IMAGE_FILE_NAMES` have absolute paths.
+1. Create a directory named `IMAGES_DIR`. Create or download images and put them in this directory. The directory is not required if image names in the input file `IMAGE_FILE_NAMES.txt` you create in step 2 have absolute paths.
 One popular dataset is from [ImageNet](https://www.image-net.org/). Follow their instructions to download the images.
-2. Create a file named `IMAGE_FILE_NAMES` that contains the absolute paths of each of the images in `IMAGES_DIR` that you want to use to run image classification. The path to the file can be different types of URIs such as your local file system, an AWS S3 bucket, or a GCP Cloud Storage bucket. For example:
+2. Create a file named `IMAGE_FILE_NAMES.txt` that contains the absolute paths of each of the images in `IMAGES_DIR` that you want to use to run image classification. The path to the file can be different types of URIs such as your local file system, an AWS S3 bucket, or a GCP Cloud Storage bucket. For example:
 ```
 /absolute/path/to/image1.jpg
 /absolute/path/to/image2.jpg
 ```

Review Comment:
   nit: Having a `gs://bucket/path/to/image3.jpg` would emphasize the ability to use images directly from Cloud Storage, as well as show the proper format to do so.
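   
   For illustration, the listing might then look like this (bucket name hypothetical):
   ```
   /absolute/path/to/image1.jpg
   /absolute/path/to/image2.jpg
   gs://example-bucket/path/to/image3.jpg
   ```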



##########
sdks/python/apache_beam/examples/inference/README.md:
##########
@@ -24,7 +24,7 @@ API. <!---TODO: Add link to full documentation on Beam website when it's publish
 
 ## Prerequisites
 
-You must have `apache-beam>=2.40.0` installed in order to run these pipelines,
+You must have `apache-beam>=2.40.0` or greater installed in order to run these pipelines,
 because the `apache_beam.examples.inference` module was added in that release.
 ```
 pip install apache-beam==2.40.0

Review Comment:
   Should we have `apache-beam>=2.40.0` to be consistent with the above?



##########
sdks/python/apache_beam/examples/inference/README.md:
##########
@@ -231,16 +229,15 @@ The pipeline reads rows of pixels corresponding to a digit, performs basic prepr
 
 To use this transform, you need a dataset and model for language modeling.
 
-1. Create a file named `INPUT` that contains labels and pixels to feed into the model. Each row should have comma-separated elements. The first element is the label. All other elements are pixel values. The content of the file should be similar to the following example:
+1. Create a file named `INPUT.csv` that contains labels and pixels to feed into the model. Each row should have comma-separated elements. The first element is the label. All other elements are pixel values. The content of the file should be similar to the following example:

Review Comment:
   +1, adding the extension can go a long way toward showing the expected format and improving tooling support when following the example.
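   
   As a toy illustration of how such a row is structured (values here are made up; a real MNIST row has 784 pixel elements after the label):
   ```
   # Toy sketch: the first comma-separated element is the label,
   # the rest are pixel values.
   line = '5,0,0,128,255,7'
   values = line.split(',')
   label = int(values[0])
   pixels = [int(v) for v in values[1:]]
   print(label, pixels)  # 5 [0, 0, 128, 255, 7]
   ```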





[GitHub] [beam] damccorm commented on a diff in pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
damccorm commented on code in PR #23018:
URL: https://github.com/apache/beam/pull/23018#discussion_r963150978


##########
sdks/python/apache_beam/examples/inference/README.md:
##########
@@ -67,37 +67,35 @@ The pipeline reads the images, performs basic preprocessing, passes the images t
 
 To use this transform, you need a dataset and model for image classification.
 
-1. Create a directory named `IMAGES_DIR`. Create or download images and put them in this directory. The directory is not required if image names in the input file `IMAGE_FILE_NAMES` have absolute paths.
+1. Create a directory named `IMAGES_DIR`. Create or download images and put them in this directory. The directory is not required if image names in the input file `IMAGE_FILE_NAMES.txt` you create in step 2 have absolute paths.
 One popular dataset is from [ImageNet](https://www.image-net.org/). Follow their instructions to download the images.
-2. Create a file named `IMAGE_FILE_NAMES` that contains the absolute paths of each of the images in `IMAGES_DIR` that you want to use to run image classification. The path to the file can be different types of URIs such as your local file system, an AWS S3 bucket, or a GCP Cloud Storage bucket. For example:
+2. Create a file named `IMAGE_FILE_NAMES.txt` that contains the absolute paths of each of the images in `IMAGES_DIR` that you want to use to run image classification. The path to the file can be different types of URIs such as your local file system, an AWS S3 bucket, or a GCP Cloud Storage bucket. For example:
 ```
 /absolute/path/to/image1.jpg
 /absolute/path/to/image2.jpg
 ```

Review Comment:
   I actually think that leading with local is a better experience here. Most people are going to want the simplest example path.





[GitHub] [beam] damccorm merged pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
damccorm merged PR #23018:
URL: https://github.com/apache/beam/pull/23018




[GitHub] [beam] damccorm commented on a diff in pull request #23018: Clarify inference example docs

Posted by GitBox <gi...@apache.org>.
damccorm commented on code in PR #23018:
URL: https://github.com/apache/beam/pull/23018#discussion_r963153412


##########
sdks/python/apache_beam/examples/inference/README.md:
##########
@@ -24,7 +24,7 @@ API. <!---TODO: Add link to full documentation on Beam website when it's publish
 
 ## Prerequisites
 
-You must have `apache-beam>=2.40.0` installed in order to run these pipelines,
+You must have `apache-beam>=2.40.0` or greater installed in order to run these pipelines,
 because the `apache_beam.examples.inference` module was added in that release.
 ```
 pip install apache-beam==2.40.0

Review Comment:
   Good call, I updated to do this


