Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/15 15:20:30 UTC

[GitHub] [beam] AnandInguva commented on a diff in pull request #21887: Add README documentation for scikit-learn MNIST example

AnandInguva commented on code in PR #21887:
URL: https://github.com/apache/beam/pull/21887#discussion_r898117591


##########
sdks/python/apache_beam/examples/inference/README.md:
##########
@@ -235,3 +235,54 @@ He looked up and saw the sun and stars .;moon
 Each line has data separated by a semicolon ";".
 The first item is the sentence with the last word masked. The second item
 is the word that the model predicts for the mask.
+
+---
+## MNITST digit classification
+[`sklearn_mnist_classification.py`](./sklearn_mnist_classification.py) contains
+an implementation for a RunInference pipeline that performs image classification on handwritten digits from the [MNIST](https://en.wikipedia.org/wiki/MNIST_database) database.
+
+The pipeline reads rows of pixels corresponding to a digit, performs basic preprocessing, passes the pixels to the Scikit-learn implementation of RunInference, and then writes the predictions to a text file.
+
+### Dataset and model for language modeling
+- **Required**: A path to a file called `INPUT` that contains label and pixels to
+feed into the model. Each row should have elements that are comma-separated. The first element is the label. All subsuequent values are pixels from pixel0 to pixel784. It should look something like this:

Review Comment:
   The row can be of any length. The one you mentioned is a flattened 28 x 28 image, but it could just as well be a flattened 32 x 32 or 64 x 64 image.
   
   We can say:
   
   `All subsequent elements would be pixel values.`
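
To illustrate the point, here is a minimal sketch of parsing such a row without assuming a fixed number of pixels. The helper names `parse_row` and `infer_side` are hypothetical and are not taken from `sklearn_mnist_classification.py`; the sketch also assumes the pixel values are integers.

```python
import math
from typing import List, Optional, Tuple


def parse_row(row: str) -> Tuple[int, List[int]]:
    """Split 'label,p0,p1,...' into (label, pixels).

    Hypothetical helper: the first element is the label and every
    subsequent element is treated as a pixel value, whatever the count.
    """
    label, *pixels = row.strip().split(",")
    return int(label), [int(p) for p in pixels]


def infer_side(pixels: List[int]) -> Optional[int]:
    """Return the side length if the pixels form a flattened square image."""
    side = math.isqrt(len(pixels))
    return side if side * side == len(pixels) else None


# A 28 x 28 image flattens to 784 values and a 32 x 32 image to 1024;
# the same parser handles both.
label, pixels = parse_row("7," + ",".join(["0"] * 1024))
side = infer_side(pixels)
```

With this shape, the README only needs to say that every element after the label is a pixel value; the model itself determines how many it expects.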


