You are viewing a plain text version of this content. The canonical link for it is here.

Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/09 22:11:39 UTC

[GitHub] [beam] AnandInguva opened a new pull request, #21781: Sklearn Mnist example and IT test

AnandInguva opened a new pull request, #21781:
URL: https://github.com/apache/beam/pull/21781

   **Please** add a meaningful description for your change here
   
   ------------------------
   
   Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
   
    - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`).
    - [ ] Mention the appropriate issue in your description (for example: "addresses #123), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment "fixes #<ISSUE NUMBER>" instead.
    - [ ] Update `CHANGES.md` with noteworthy changes.
    - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   To check the build health, please visit [https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md](https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md)
   
   GitHub Actions Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   [![Build python source distribution and wheels](https://github.com/apache/beam/workflows/Build%20python%20source%20distribution%20and%20wheels/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Build+python+source+distribution+and+wheels%22+branch%3Amaster+event%3Aschedule)
   [![Python tests](https://github.com/apache/beam/workflows/Python%20tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Python+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Java tests](https://github.com/apache/beam/workflows/Java%20Tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Java+Tests%22+branch%3Amaster+event%3Aschedule)
   
   See [CI.md](https://github.com/apache/beam/blob/master/CI.md) for more information about GitHub Actions CI.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] AnandInguva commented on pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

AnandInguva commented on PR #21781:
URL: https://github.com/apache/beam/pull/21781#issuecomment-1156806826

   PTAL @tvalentyn 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] yeandy commented on a diff in pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

yeandy commented on code in PR #21781:
URL: https://github.com/apache/beam/pull/21781#discussion_r898368757


##########
sdks/python/apache_beam/ml/inference/sklearn_inference_it_test.py:
##########
@@ -0,0 +1,76 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""End-to-End test for Sklearn Inference"""
+
+import logging
+import unittest
+import uuid
+
+import pytest
+
+from apache_beam.examples.inference import sklearn_mnist_classification
+from apache_beam.io.filesystems import FileSystems
+from apache_beam.testing.test_pipeline import TestPipeline
+
+
+def process_outputs(filepath):
+  with FileSystems().open(filepath) as f:
+    lines = f.readlines()
+  lines = [l.decode('utf-8').strip('\n') for l in lines]
+  return lines
+
+
+@pytest.mark.skip
+@pytest.mark.uses_sklearn
+@pytest.mark.it_postcommit
+class SklearnInference(unittest.TestCase):
+  def test_sklearn_mnist_classification(self):

Review Comment:
   optional, but it matches our pytorch pattern
   ```suggestion
   class SklearnInference(unittest.TestCase):
     @pytest.mark.uses_sklearn
     @pytest.mark.it_postcommit
     def test_sklearn_mnist_classification(self):
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] AnandInguva commented on a diff in pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

AnandInguva commented on code in PR #21781:
URL: https://github.com/apache/beam/pull/21781#discussion_r898198034


##########
sdks/python/test-suites/direct/common.gradle:
##########
@@ -213,8 +213,30 @@ task torchTests {
     }
 }
 
+task sklearnInferenceTest {
+  dependsOn 'installGcpTest'
+  dependsOn ':sdks:python:sdist'
+  doLast {
+      def testOpts = basicTestOpts
+      def argMap = [
+          "test_opts": testOpts,
+          "suite": "postCommitIT-direct-py${pythonVersionSuffix}",
+          "collect": "it_run_inference",

Review Comment:
   Yes, it needs to be added to the torch tests. As of now, its not needed because you already have a marker `use_pytorch` which we use in tox environment. we make use of the `use_pytorch`  to collect torch it test as well. `it_run_inference` replaces the condition `use_pytorch` and `it_postcommit`
   
   We are going to create tags for frameworks that needs extra dependencies to run unit tests in the tox environment. 
   
   `it_run_inference ` would be sufficient for all the RunInference IT tests and I think `it_postcommit` is not required anymore. 
   
   Once this change goes in, I will make a change for other torch IT tests but its not a priority as of now



##########
sdks/python/test-suites/direct/common.gradle:
##########
@@ -213,8 +213,30 @@ task torchTests {
     }
 }
 
+task sklearnInferenceTest {
+  dependsOn 'installGcpTest'
+  dependsOn ':sdks:python:sdist'
+  doLast {
+      def testOpts = basicTestOpts
+      def argMap = [
+          "test_opts": testOpts,
+          "suite": "postCommitIT-direct-py${pythonVersionSuffix}",
+          "collect": "it_run_inference",

Review Comment:
   Yes, it needs to be added to the torch tests. As of now, its not needed because you already have a marker `use_pytorch` which we use in tox environment. we make use of the `use_pytorch`  to collect torch it test as well. `it_run_inference` replaces the condition `use_pytorch` and `it_postcommit` for pytest markers.
   
   We are going to create tags for frameworks that needs extra dependencies to run unit tests in the tox environment. 
   
   `it_run_inference ` would be sufficient for all the RunInference IT tests and I think `it_postcommit` is not required anymore. 
   
   Once this change goes in, I will make a change for other torch IT tests but its not a priority as of now



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] yeandy commented on a diff in pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

yeandy commented on code in PR #21781:
URL: https://github.com/apache/beam/pull/21781#discussion_r897973071


##########
sdks/python/apache_beam/ml/inference/sklearn_inference_it_test.py:
##########
@@ -0,0 +1,51 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""End-to-End test for Sklearn Inference"""
+
+import logging
+import pytest
+import unittest
+import uuid
+
+from apache_beam.io.filesystems import FileSystems
+from apache_beam.examples.inference import sklearn_mnist_classification
+from apache_beam.testing.test_pipeline import TestPipeline
+
+
+class SklearnInference(unittest.TestCase):
+  @pytest.mark.it_postcommit
+  def test_predictions_output_file(self):
+    test_pipeline = TestPipeline(is_integration_test=True)
+    input_file = 'gs://apache-beam-ml/testing/inputs/it_mnist_data.csv'

Review Comment:
   Shouldn't this go under `apache-beam-ml/datasets/`?



##########
sdks/python/apache_beam/examples/inference/sklearn_mnist_classification.py:
##########
@@ -0,0 +1,95 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""A pipeline that uses RunInference API to perform image classification."""
+
+import argparse
+from typing import Dict
+from typing import Iterable
+from typing import List
+from typing import Tuple
+
+import apache_beam as beam
+from apache_beam.ml.inference.api import PredictionResult
+from apache_beam.ml.inference.api import RunInference
+from apache_beam.ml.inference.sklearn_inference import ModelFileType
+from apache_beam.ml.inference.sklearn_inference import SklearnModelLoader
+from apache_beam.options.pipeline_options import PipelineOptions
+from apache_beam.options.pipeline_options import SetupOptions
+
+
+def process_input(row: str) -> Tuple[int, List[int]]:
+  data = row.split(',')
+  label, pixels = int(data[0]), data[1:]
+  pixels = [int(pixel) for pixel in pixels]
+  return label, pixels
+
+
+class PostProcessor(beam.DoFn):
+  def process(self, element: Tuple[int, PredictionResult]) -> Iterable[Dict]:
+    label, prediction_result = element
+    prediction = prediction_result.inference
+    yield {label: prediction}
+
+
+def parse_known_args(argv):
+  """Parses args for the workflow."""
+  parser = argparse.ArgumentParser()
+  parser.add_argument(
+      '--input_file',
+      dest='input',
+      help='CSV file with row containing label and pixel values.')
+  parser.add_argument(
+      '--output', dest='output', help='Path to save output predictions.')
+  parser.add_argument(
+      '--model_path',
+      dest='model_path',
+      help='Path to load the Sklearn model for Inference.')
+  return parser.parse_known_args(argv)
+
+
+def run(argv=None, save_main_session=True):
+  """Entry point. Defines and runs the pipeline."""
+  known_args, pipeline_args = parse_known_args(argv)
+  pipeline_options = PipelineOptions(pipeline_args)
+  pipeline_options.view_as(SetupOptions).save_main_session = save_main_session
+
+  model_loader = SklearnModelLoader(
+      model_file_type=ModelFileType.PICKLE, model_uri=known_args.model_path)
+
+  with beam.Pipeline(options=pipeline_options) as p:
+    label_pixel_tuple = (
+        p
+        | "ReadFromInput" >> beam.io.ReadFromText(
+            known_args.input, skip_header_lines=1)
+        | "Process inputs" >> beam.Map(process_input))
+
+    predictions = (
+        label_pixel_tuple
+        | "RunInference" >> RunInference(model_loader).with_output_types(
+            Tuple[int, PredictionResult])
+        | "PostProcessor" >> beam.ParDo(PostProcessor()))
+
+    if known_args.output:

Review Comment:
   Make output required. Remove if statement.



##########
sdks/python/apache_beam/examples/inference/sklearn_mnist_classification.py:
##########
@@ -0,0 +1,95 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""A pipeline that uses RunInference API to perform image classification."""
+
+import argparse
+from typing import Dict
+from typing import Iterable
+from typing import List
+from typing import Tuple
+
+import apache_beam as beam
+from apache_beam.ml.inference.api import PredictionResult
+from apache_beam.ml.inference.api import RunInference
+from apache_beam.ml.inference.sklearn_inference import ModelFileType
+from apache_beam.ml.inference.sklearn_inference import SklearnModelLoader
+from apache_beam.options.pipeline_options import PipelineOptions
+from apache_beam.options.pipeline_options import SetupOptions
+
+
+def process_input(row: str) -> Tuple[int, List[int]]:
+  data = row.split(',')
+  label, pixels = int(data[0]), data[1:]
+  pixels = [int(pixel) for pixel in pixels]
+  return label, pixels
+
+
+class PostProcessor(beam.DoFn):
+  def process(self, element: Tuple[int, PredictionResult]) -> Iterable[Dict]:
+    label, prediction_result = element
+    prediction = prediction_result.inference
+    yield {label: prediction}
+
+
+def parse_known_args(argv):
+  """Parses args for the workflow."""
+  parser = argparse.ArgumentParser()
+  parser.add_argument(
+      '--input_file',
+      dest='input',
+      help='CSV file with row containing label and pixel values.')
+  parser.add_argument(
+      '--output', dest='output', help='Path to save output predictions.')

Review Comment:
   Add `required=True`



##########
sdks/python/apache_beam/examples/inference/sklearn_mnist_classification.py:
##########
@@ -0,0 +1,95 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""A pipeline that uses RunInference API to perform image classification."""
+
+import argparse
+from typing import Dict
+from typing import Iterable
+from typing import List
+from typing import Tuple
+
+import apache_beam as beam
+from apache_beam.ml.inference.api import PredictionResult
+from apache_beam.ml.inference.api import RunInference
+from apache_beam.ml.inference.sklearn_inference import ModelFileType
+from apache_beam.ml.inference.sklearn_inference import SklearnModelLoader
+from apache_beam.options.pipeline_options import PipelineOptions
+from apache_beam.options.pipeline_options import SetupOptions
+
+
+def process_input(row: str) -> Tuple[int, List[int]]:
+  data = row.split(',')
+  label, pixels = int(data[0]), data[1:]
+  pixels = [int(pixel) for pixel in pixels]
+  return label, pixels
+
+
+class PostProcessor(beam.DoFn):
+  def process(self, element: Tuple[int, PredictionResult]) -> Iterable[Dict]:
+    label, prediction_result = element
+    prediction = prediction_result.inference
+    yield {label: prediction}
+
+
+def parse_known_args(argv):
+  """Parses args for the workflow."""
+  parser = argparse.ArgumentParser()
+  parser.add_argument(
+      '--input_file',
+      dest='input',

Review Comment:
   Add `required=True`



##########
sdks/python/apache_beam/examples/inference/sklearn_mnist_classification.py:
##########
@@ -0,0 +1,95 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""A pipeline that uses RunInference API to perform image classification."""
+
+import argparse
+from typing import Dict
+from typing import Iterable
+from typing import List
+from typing import Tuple
+
+import apache_beam as beam
+from apache_beam.ml.inference.api import PredictionResult
+from apache_beam.ml.inference.api import RunInference
+from apache_beam.ml.inference.sklearn_inference import ModelFileType
+from apache_beam.ml.inference.sklearn_inference import SklearnModelLoader
+from apache_beam.options.pipeline_options import PipelineOptions
+from apache_beam.options.pipeline_options import SetupOptions
+
+
+def process_input(row: str) -> Tuple[int, List[int]]:
+  data = row.split(',')
+  label, pixels = int(data[0]), data[1:]
+  pixels = [int(pixel) for pixel in pixels]
+  return label, pixels
+
+
+class PostProcessor(beam.DoFn):
+  def process(self, element: Tuple[int, PredictionResult]) -> Iterable[Dict]:
+    label, prediction_result = element
+    prediction = prediction_result.inference
+    yield {label: prediction}
+
+
+def parse_known_args(argv):
+  """Parses args for the workflow."""
+  parser = argparse.ArgumentParser()
+  parser.add_argument(
+      '--input_file',
+      dest='input',
+      help='CSV file with row containing label and pixel values.')
+  parser.add_argument(
+      '--output', dest='output', help='Path to save output predictions.')
+  parser.add_argument(
+      '--model_path',
+      dest='model_path',

Review Comment:
   Add `required=True`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] asf-ci commented on pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

asf-ci commented on PR #21781:
URL: https://github.com/apache/beam/pull/21781#issuecomment-1151676878

   Can one of the admins verify this patch?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] asf-ci commented on pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

asf-ci commented on PR #21781:
URL: https://github.com/apache/beam/pull/21781#issuecomment-1151676872

   Can one of the admins verify this patch?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] yeandy commented on a diff in pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

yeandy commented on code in PR #21781:
URL: https://github.com/apache/beam/pull/21781#discussion_r898056281


##########
sdks/python/apache_beam/examples/inference/sklearn_mnist_classification.py:
##########
@@ -0,0 +1,112 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""A pipeline that uses RunInference API to classify MNIST data.
+
+This pipeline takes a text file in which data is comma separated ints. The first
+column would be the true label and the rest would be the pixel values. The data
+is processed and then a model trained on the MNIST data would be used to perform
+the inference. The pipeline writes the prediction to an output file in which
+users can then compare against the true label.
+"""
+
+import argparse
+from typing import Iterable
+from typing import List
+from typing import Tuple
+
+import apache_beam as beam
+from apache_beam.ml.inference.base import KeyedModelHandler
+from apache_beam.ml.inference.base import PredictionResult
+from apache_beam.ml.inference.base import RunInference
+from apache_beam.ml.inference.sklearn_inference import ModelFileType
+from apache_beam.ml.inference.sklearn_inference import SklearnModelHandlerNumpy
+from apache_beam.options.pipeline_options import PipelineOptions
+from apache_beam.options.pipeline_options import SetupOptions
+
+
+def process_input(row: str) -> Tuple[int, List[int]]:
+  data = row.split(',')
+  label, pixels = int(data[0]), data[1:]
+  pixels = [int(pixel) for pixel in pixels]
+  return label, pixels
+
+
+class PostProcessor(beam.DoFn):
+  """Process the PredictionResult to get the predicted label.
+  Returns a comma separated string with true label and predicted label.
+  """
+  def process(self, element: Tuple[int, PredictionResult]) -> Iterable[str]:
+    label, prediction_result = element
+    prediction = prediction_result.inference
+    yield '{},{}'.format(label, prediction)
+
+
+def parse_known_args(argv):
+  """Parses args for the workflow."""
+  parser = argparse.ArgumentParser()
+  parser.add_argument(
+      '--input_file',
+      dest='input',
+      required=True,
+      help='CSV file with row containing label and pixel values.')
+  parser.add_argument(
+      '--output',
+      dest='output',
+      required=True,
+      help='Path to save output predictions.')
+  parser.add_argument(
+      '--model_path',
+      dest='model_path',
+      required=True,
+      help='Path to load the Sklearn model for Inference.')
+  return parser.parse_known_args(argv)
+
+
+def run(argv=None, save_main_session=True):
+  """Entry point. Defines and runs the pipeline."""
+  known_args, pipeline_args = parse_known_args(argv)
+  pipeline_options = PipelineOptions(pipeline_args)
+  pipeline_options.view_as(SetupOptions).save_main_session = save_main_session
+
+  # In this example we pass keyed inputs to RunInference transform.
+  # Therefore, we use KeyedModelHandler wrapper over SklearnModelHandlerNumpy.
+  model_loader = KeyedModelHandler(
+      SklearnModelHandlerNumpy(
+          model_file_type=ModelFileType.PICKLE,
+          model_uri=known_args.model_path))
+
+  with beam.Pipeline(options=pipeline_options) as p:
+    label_pixel_tuple = (
+        p
+        | "ReadFromInput" >> beam.io.ReadFromText(
+            known_args.input, skip_header_lines=1)
+        | "PreProcessInputs" >> beam.Map(process_input))
+
+    predictions = (
+        label_pixel_tuple
+        | "RunInference" >> RunInference(model_loader)
+        | "PostProcessOutputs" >> beam.ParDo(PostProcessor()))
+
+    _ = predictions | "WriteOutputToGCS" >> beam.io.WriteToText(

Review Comment:
   ```suggestion
       _ = predictions | "WriteOutput" >> beam.io.WriteToText(
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] asf-ci commented on pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

asf-ci commented on PR #21781:
URL: https://github.com/apache/beam/pull/21781#issuecomment-1151676879

   Can one of the admins verify this patch?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] tvalentyn commented on a diff in pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

tvalentyn commented on code in PR #21781:
URL: https://github.com/apache/beam/pull/21781#discussion_r897928438


##########
sdks/python/apache_beam/examples/inference/sklearn_mnist_classification.py:
##########
@@ -0,0 +1,95 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""A pipeline that uses RunInference API to perform image classification."""
+
+import argparse
+from typing import Dict
+from typing import Iterable
+from typing import List
+from typing import Tuple
+
+import apache_beam as beam
+from apache_beam.ml.inference.api import PredictionResult
+from apache_beam.ml.inference.api import RunInference
+from apache_beam.ml.inference.sklearn_inference import ModelFileType
+from apache_beam.ml.inference.sklearn_inference import SklearnModelLoader
+from apache_beam.options.pipeline_options import PipelineOptions
+from apache_beam.options.pipeline_options import SetupOptions
+
+
+def process_input(row: str) -> Tuple[int, List[int]]:
+  data = row.split(',')
+  label, pixels = int(data[0]), data[1:]
+  pixels = [int(pixel) for pixel in pixels]
+  return label, pixels
+
+
+class PostProcessor(beam.DoFn):
+  def process(self, element: Tuple[int, PredictionResult]) -> Iterable[Dict]:
+    label, prediction_result = element
+    prediction = prediction_result.inference
+    yield {label: prediction}
+
+
+def parse_known_args(argv):
+  """Parses args for the workflow."""
+  parser = argparse.ArgumentParser()
+  parser.add_argument(
+      '--input_file',
+      dest='input',
+      help='CSV file with row containing label and pixel values.')
+  parser.add_argument(
+      '--output', dest='output', help='Path to save output predictions.')
+  parser.add_argument(
+      '--model_path',
+      dest='model_path',
+      help='Path to load the Sklearn model for Inference.')
+  return parser.parse_known_args(argv)
+
+
+def run(argv=None, save_main_session=True):
+  """Entry point. Defines and runs the pipeline."""
+  known_args, pipeline_args = parse_known_args(argv)
+  pipeline_options = PipelineOptions(pipeline_args)
+  pipeline_options.view_as(SetupOptions).save_main_session = save_main_session
+
+  model_loader = SklearnModelLoader(
+      model_file_type=ModelFileType.PICKLE, model_uri=known_args.model_path)
+
+  with beam.Pipeline(options=pipeline_options) as p:
+    label_pixel_tuple = (
+        p
+        | "ReadFromInput" >> beam.io.ReadFromText(
+            known_args.input, skip_header_lines=1)
+        | "Process inputs" >> beam.Map(process_input))
+
+    predictions = (
+        label_pixel_tuple
+        | "RunInference" >> RunInference(model_loader).with_output_types(
+            Tuple[int, PredictionResult])

Review Comment:
   not needed



##########
sdks/python/apache_beam/ml/inference/sklearn_inference_it_test.py:
##########
@@ -0,0 +1,51 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""End-to-End test for Sklearn Inference"""
+
+import logging
+import pytest
+import unittest
+import uuid
+
+from apache_beam.io.filesystems import FileSystems
+from apache_beam.examples.inference import sklearn_mnist_classification
+from apache_beam.testing.test_pipeline import TestPipeline
+
+
+class SklearnInference(unittest.TestCase):
+  @pytest.mark.it_postcommit
+  def test_predictions_output_file(self):
+    test_pipeline = TestPipeline(is_integration_test=True)
+    input_file = 'gs://apache-beam-ml/testing/inputs/it_mnist_data.csv'

Review Comment:
   if this will have to be downloaded separately, we should mention necessary instructions. You probably want to add a section in https://github.com/apache/beam/blob/master/sdks/python/apache_beam/examples/inference/README.md. 
   



##########
sdks/python/apache_beam/ml/inference/sklearn_inference_it_test.py:
##########
@@ -0,0 +1,51 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""End-to-End test for Sklearn Inference"""
+
+import logging
+import pytest
+import unittest
+import uuid
+
+from apache_beam.io.filesystems import FileSystems
+from apache_beam.examples.inference import sklearn_mnist_classification
+from apache_beam.testing.test_pipeline import TestPipeline
+
+
+class SklearnInference(unittest.TestCase):
+  @pytest.mark.it_postcommit
+  def test_predictions_output_file(self):
+    test_pipeline = TestPipeline(is_integration_test=True)
+    input_file = 'gs://apache-beam-ml/testing/inputs/it_mnist_data.csv'
+    output_file_dir = 'gs://apache-beam-ml/testing/predictions'  # pylint: disable=line-too-long
+    output_file = '/'.join([output_file_dir, str(uuid.uuid4()), 'result.txt'])
+    model_path = 'gs://apache-beam-ml/models/mnist_model_svm.pickle'  # pylint: disable=line-too-long
+    extra_opts = {
+        'input': input_file,
+        'output': output_file,
+        'model_path': model_path,
+    }
+    sklearn_mnist_classification.run(
+        test_pipeline.get_full_options_as_args(**extra_opts),
+        save_main_session=False)
+    self.assertEqual(FileSystems().exists(output_file), True)

Review Comment:
   Any plans to assert a specific result here?



##########
sdks/python/apache_beam/ml/inference/sklearn_inference_it_test.py:
##########
@@ -0,0 +1,51 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""End-to-End test for Sklearn Inference"""
+
+import logging
+import pytest
+import unittest
+import uuid
+
+from apache_beam.io.filesystems import FileSystems
+from apache_beam.examples.inference import sklearn_mnist_classification
+from apache_beam.testing.test_pipeline import TestPipeline
+
+
+class SklearnInference(unittest.TestCase):
+  @pytest.mark.it_postcommit
+  def test_predictions_output_file(self):
+    test_pipeline = TestPipeline(is_integration_test=True)
+    input_file = 'gs://apache-beam-ml/testing/inputs/it_mnist_data.csv'
+    output_file_dir = 'gs://apache-beam-ml/testing/predictions'  # pylint: disable=line-too-long
+    output_file = '/'.join([output_file_dir, str(uuid.uuid4()), 'result.txt'])
+    model_path = 'gs://apache-beam-ml/models/mnist_model_svm.pickle'  # pylint: disable=line-too-long

Review Comment:
   As mentioned on another PR, it's better to use gs://temp-storage-for-end-to-end-tests/ for output results so they are automatically cleaned up.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] AnandInguva commented on a diff in pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

AnandInguva commented on code in PR #21781:
URL: https://github.com/apache/beam/pull/21781#discussion_r898113659


##########
sdks/python/apache_beam/ml/inference/sklearn_inference_it_test.py:
##########
@@ -0,0 +1,51 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one or more
+# contributor license agreements.  See the NOTICE file distributed with
+# this work for additional information regarding copyright ownership.
+# The ASF licenses this file to You under the Apache License, Version 2.0
+# (the "License"); you may not use this file except in compliance with
+# the License.  You may obtain a copy of the License at
+#
+#    http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+
+"""End-to-End test for Sklearn Inference"""
+
+import logging
+import pytest
+import unittest
+import uuid
+
+from apache_beam.io.filesystems import FileSystems
+from apache_beam.examples.inference import sklearn_mnist_classification
+from apache_beam.testing.test_pipeline import TestPipeline
+
+
+class SklearnInference(unittest.TestCase):
+  @pytest.mark.it_postcommit
+  def test_predictions_output_file(self):
+    test_pipeline = TestPipeline(is_integration_test=True)
+    input_file = 'gs://apache-beam-ml/testing/inputs/it_mnist_data.csv'

Review Comment:
   This test would be internal test. Also sickbayed it for now. There are is PR in work #21887 on how to run the sample



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] AnandInguva commented on pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

AnandInguva commented on PR #21781:
URL: https://github.com/apache/beam/pull/21781#issuecomment-1151671764

   Run Python 3.8 PostCommit


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] yeandy commented on a diff in pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

yeandy commented on code in PR #21781:
URL: https://github.com/apache/beam/pull/21781#discussion_r898188462


##########
sdks/python/test-suites/direct/common.gradle:
##########
@@ -213,8 +213,30 @@ task torchTests {
     }
 }
 
+task sklearnInferenceTest {
+  dependsOn 'installGcpTest'
+  dependsOn ':sdks:python:sdist'
+  doLast {
+      def testOpts = basicTestOpts
+      def argMap = [
+          "test_opts": testOpts,
+          "suite": "postCommitIT-direct-py${pythonVersionSuffix}",
+          "collect": "it_run_inference",

Review Comment:
   Do we want to add `it_postcommit`?
   
   Will `it_run_inference` also be needed to be added to the pytorch examples? unless it's specifically for non-pytorch inference tests? Are there concerns with us creating too many of these tags to keep track which ones to use?



##########
sdks/python/test-suites/direct/common.gradle:
##########
@@ -213,8 +213,30 @@ task torchTests {
     }
 }
 

Review Comment:
   ```suggestion
   // Scikit-learn RunInference IT tests
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] AnandInguva commented on a diff in pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

AnandInguva commented on code in PR #21781:
URL: https://github.com/apache/beam/pull/21781#discussion_r898198034


##########
sdks/python/test-suites/direct/common.gradle:
##########
@@ -213,8 +213,30 @@ task torchTests {
     }
 }
 
+task sklearnInferenceTest {
+  dependsOn 'installGcpTest'
+  dependsOn ':sdks:python:sdist'
+  doLast {
+      def testOpts = basicTestOpts
+      def argMap = [
+          "test_opts": testOpts,
+          "suite": "postCommitIT-direct-py${pythonVersionSuffix}",
+          "collect": "it_run_inference",

Review Comment:
   Yes, it needs to be added to the torch tests. As of now, its not needed because you already have a marker `use_pytorch` which we use in tox environment. we make use of the `use_pytorch`  to collect torch it test as well. `it_run_inference` replaces the condition `use_pytorch` and `it_postcommit` for pytest markers.
   
   Are there concerns with us creating too many of these tags to keep track which ones to use?
   
   -  We are going to create tags for frameworks that needs extra dependencies to run unit tests in the tox environment. 
   
   `it_run_inference ` would be sufficient for all the RunInference IT tests and I think `it_postcommit` is not required anymore. 
   
   Once this change goes in, I will make a change for other torch IT tests but its not a priority as of now



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] AnandInguva commented on pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

AnandInguva commented on PR #21781:
URL: https://github.com/apache/beam/pull/21781#issuecomment-1156859628

   PTAL @yeandy @tvalentyn 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] asf-ci commented on pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

asf-ci commented on PR #21781:
URL: https://github.com/apache/beam/pull/21781#issuecomment-1151676877

   Can one of the admins verify this patch?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] yeandy commented on a diff in pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

yeandy commented on code in PR #21781:
URL: https://github.com/apache/beam/pull/21781#discussion_r898353971


##########
sdks/python/pytest.ini:
##########
@@ -48,7 +48,7 @@ markers =
     # We run these tests with multiple major pyarrow versions (BEAM-11211)
     uses_pyarrow: tests that utilize pyarrow in some way
     uses_pytorch: tests that utilize pytorch in some way
-    it_run_inference: collects for RunInference integration test runs.
+    uses_sklearn: collects for Sklearn RunInference integration test runs.

Review Comment:
   To match the above pattern
   ```suggestion
       uses_sklearn: tests that utilize scikit-learn in some way
   ```



##########
sdks/python/test-suites/direct/common.gradle:
##########
@@ -212,9 +212,30 @@ task torchTests {
       }
     }
 }
+// Scikit-learn RunInference IT tests

Review Comment:
   nit: add extra line
   ```suggestion
   
   }
   // Scikit-learn RunInference IT tests
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] AnandInguva commented on pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

AnandInguva commented on PR #21781:
URL: https://github.com/apache/beam/pull/21781#issuecomment-1156671006

   cc: @yeandy Added the gradle task


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] AnandInguva commented on pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

AnandInguva commented on PR #21781:
URL: https://github.com/apache/beam/pull/21781#issuecomment-1156674244

   Run Python 3.7 PostCommit


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] AnandInguva commented on pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

AnandInguva commented on PR #21781:
URL: https://github.com/apache/beam/pull/21781#issuecomment-1156600043

   refactored the code with recent changes. Added the test but sickbayed it for now.  Adding the issue here: https://github.com/apache/beam/issues/21859 to unskip the tests.
   
   PTAL @ryanthompson591 @tvalentyn 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] AnandInguva commented on a diff in pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

AnandInguva commented on code in PR #21781:
URL: https://github.com/apache/beam/pull/21781#discussion_r898198034


##########
sdks/python/test-suites/direct/common.gradle:
##########
@@ -213,8 +213,30 @@ task torchTests {
     }
 }
 
+task sklearnInferenceTest {
+  dependsOn 'installGcpTest'
+  dependsOn ':sdks:python:sdist'
+  doLast {
+      def testOpts = basicTestOpts
+      def argMap = [
+          "test_opts": testOpts,
+          "suite": "postCommitIT-direct-py${pythonVersionSuffix}",
+          "collect": "it_run_inference",

Review Comment:
   Yes, it needs to be added to the torch tests. As of now, its not needed because you already have a marker `use_pytorch` which we use in tox environment. 
   
   We are going to create tags for frameworks that needs extra dependencies to run unit tests in the tox environment. 
   
   `it_run_inference ` would be sufficient for all the RunInference IT tests and I think `it_postcommit` is not required anymore. 
   
   Once this change goes in, I will make a change for other torch IT tests but its not a priority as of now



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] AnandInguva commented on a diff in pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

AnandInguva commented on code in PR #21781:
URL: https://github.com/apache/beam/pull/21781#discussion_r898350105


##########
sdks/python/test-suites/direct/common.gradle:
##########
@@ -213,8 +213,30 @@ task torchTests {
     }
 }
 
+task sklearnInferenceTest {
+  dependsOn 'installGcpTest'
+  dependsOn ':sdks:python:sdist'
+  doLast {
+      def testOpts = basicTestOpts
+      def argMap = [
+          "test_opts": testOpts,
+          "suite": "postCommitIT-direct-py${pythonVersionSuffix}",
+          "collect": "it_run_inference",

Review Comment:
   made a change to use `uses_sklearn and it_postcommit`



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] AnandInguva commented on pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

AnandInguva commented on PR #21781:
URL: https://github.com/apache/beam/pull/21781#issuecomment-1156968327

   @pabloem test failure unrelated to the change


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] pabloem commented on pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

pabloem commented on PR #21781:
URL: https://github.com/apache/beam/pull/21781#issuecomment-1156968374

   lgtm thanks folks


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] asf-ci commented on pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

asf-ci commented on PR #21781:
URL: https://github.com/apache/beam/pull/21781#issuecomment-1151676882

   Can one of the admins verify this patch?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] AnandInguva commented on pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

AnandInguva commented on PR #21781:
URL: https://github.com/apache/beam/pull/21781#issuecomment-1151680188

   Run Python 3.9 PostCommit


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] tvalentyn commented on a diff in pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

tvalentyn commented on code in PR #21781:
URL: https://github.com/apache/beam/pull/21781#discussion_r898313808


##########
sdks/python/test-suites/direct/common.gradle:
##########
@@ -213,8 +213,30 @@ task torchTests {
     }
 }
 
+task sklearnInferenceTest {
+  dependsOn 'installGcpTest'
+  dependsOn ':sdks:python:sdist'
+  doLast {
+      def testOpts = basicTestOpts
+      def argMap = [
+          "test_opts": testOpts,
+          "suite": "postCommitIT-direct-py${pythonVersionSuffix}",
+          "collect": "it_run_inference",

Review Comment:
   I also got confused by this. 
   
   IMO it would be easier to keep consistency with current structure (uses_scikit_learn + it_postcommit), and then replace the with new structure at once/if you  to join pytest and sklearn into one suite



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] AnandInguva commented on pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

AnandInguva commented on PR #21781:
URL: https://github.com/apache/beam/pull/21781#issuecomment-1156877239

   > Can you also uncomment pytest.skip and confirm that `gradlew :sdks:python:test-suites:direct:py37:inferencePostCommitIT` runs successfully? @AnandInguva
   
   I checked and it runs. I was able to collect all the Inference IT tests


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] pabloem merged pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

pabloem merged PR #21781:
URL: https://github.com/apache/beam/pull/21781


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] codecov[bot] commented on pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

codecov[bot] commented on PR #21781:
URL: https://github.com/apache/beam/pull/21781#issuecomment-1156647117

   # [Codecov](https://codecov.io/gh/apache/beam/pull/21781?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#21781](https://codecov.io/gh/apache/beam/pull/21781?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (0375d74) into [master](https://codecov.io/gh/apache/beam/commit/b1a313e03807882e1bb178c689e4f935246d5534?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (b1a313e) will **decrease** coverage by `0.01%`.
   > The diff coverage is `47.50%`.
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #21781      +/-   ##
   ==========================================
   - Coverage   74.01%   73.99%   -0.02%     
   ==========================================
     Files         699      700       +1     
     Lines       92675    92715      +40     
   ==========================================
   + Hits        68592    68608      +16     
   - Misses      22828    22852      +24     
     Partials     1255     1255              
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | python | `83.63% <47.50%> (-0.03%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/21781?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...examples/inference/sklearn\_mnist\_classification.py](https://codecov.io/gh/apache/beam/pull/21781/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvaW5mZXJlbmNlL3NrbGVhcm5fbW5pc3RfY2xhc3NpZmljYXRpb24ucHk=) | `47.50% <47.50%> (ø)` | |
   | [sdks/python/apache\_beam/io/localfilesystem.py](https://codecov.io/gh/apache/beam/pull/21781/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vbG9jYWxmaWxlc3lzdGVtLnB5) | `90.97% <0.00%> (-0.76%)` | :arrow_down: |
   | [...eam/runners/portability/fn\_api\_runner/execution.py](https://codecov.io/gh/apache/beam/pull/21781/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9mbl9hcGlfcnVubmVyL2V4ZWN1dGlvbi5weQ==) | `92.44% <0.00%> (-0.65%)` | :arrow_down: |
   | [sdks/python/apache\_beam/transforms/util.py](https://codecov.io/gh/apache/beam/pull/21781/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdHJhbnNmb3Jtcy91dGlsLnB5) | `96.06% <0.00%> (-0.16%)` | :arrow_down: |
   | [...examples/inference/pytorch\_image\_classification.py](https://codecov.io/gh/apache/beam/pull/21781/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvaW5mZXJlbmNlL3B5dG9yY2hfaW1hZ2VfY2xhc3NpZmljYXRpb24ucHk=) | `0.00% <0.00%> (ø)` | |
   | [...hon/apache\_beam/runners/direct/test\_stream\_impl.py](https://codecov.io/gh/apache/beam/pull/21781/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9kaXJlY3QvdGVzdF9zdHJlYW1faW1wbC5weQ==) | `94.02% <0.00%> (+0.74%)` | :arrow_up: |
   | [...python/apache\_beam/runners/worker/worker\_status.py](https://codecov.io/gh/apache/beam/pull/21781/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvd29ya2VyX3N0YXR1cy5weQ==) | `79.71% <0.00%> (+1.44%)` | :arrow_up: |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/21781?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/21781?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [b1a313e...0375d74](https://codecov.io/gh/apache/beam/pull/21781?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] tvalentyn commented on a diff in pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

tvalentyn commented on code in PR #21781:
URL: https://github.com/apache/beam/pull/21781#discussion_r898313808


##########
sdks/python/test-suites/direct/common.gradle:
##########
@@ -213,8 +213,30 @@ task torchTests {
     }
 }
 
+task sklearnInferenceTest {
+  dependsOn 'installGcpTest'
+  dependsOn ':sdks:python:sdist'
+  doLast {
+      def testOpts = basicTestOpts
+      def argMap = [
+          "test_opts": testOpts,
+          "suite": "postCommitIT-direct-py${pythonVersionSuffix}",
+          "collect": "it_run_inference",

Review Comment:
   I also got confused by this. 
   
   IMO it would be easier to keep consistency with current structure and then replace the with new structure at once if you plan to join pytest and sklearn into one suite



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org

[GitHub] [beam] yeandy commented on pull request #21781: Sklearn Mnist example and IT test

Posted by GitBox <gi...@apache.org>.

yeandy commented on PR #21781:
URL: https://github.com/apache/beam/pull/21781#issuecomment-1156876405

   Can you also uncomment pytest.skip and confirm that `gradlew :sdks:python:test-suites:direct:py37:inferencePostCommitIT` runs successfully?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org