Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/06/10 21:12:49 UTC

[GitHub] [beam] yeandy opened a new pull request, #21810: Split PytorchModelHandler into PytorchModelHandlerTensor and PytorchModelHandlerKeyedTensor

yeandy opened a new pull request, #21810:
URL: https://github.com/apache/beam/pull/21810

   Split `PytorchModelHandler` into experimental `PytorchModelHandlerTensor` and `PytorchModelHandlerKeyedTensor` model handlers.
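   
   For illustration, usage of the new handlers might look roughly like this (a sketch under assumptions: `LinearRegression` and the `gs://` path are hypothetical placeholders, and the exact `RunInference` signature may differ):
   
   ```python
   import apache_beam as beam
   import torch
   from apache_beam.ml.inference.base import RunInference
   from apache_beam.ml.inference.pytorch_inference import (
       PytorchModelHandlerTensor)

   class LinearRegression(torch.nn.Module):
     def __init__(self, input_dim=1, output_dim=1):
       super().__init__()
       self.linear = torch.nn.Linear(input_dim, output_dim)

     def forward(self, x):
       return self.linear(x)

   # PytorchModelHandlerTensor consumes plain Tensors;
   # PytorchModelHandlerKeyedTensor would consume keyed examples
   # (name -> Tensor) instead, with the same constructor arguments.
   handler = PytorchModelHandlerTensor(
       state_dict_path='gs://my-bucket/linear_regression.pt',  # placeholder
       model_class=LinearRegression,
       model_params={'input_dim': 1, 'output_dim': 1})

   with beam.Pipeline() as pipeline:
     _ = (
         pipeline
         | beam.Create([torch.Tensor([1.0]), torch.Tensor([2.0])])
         | RunInference(handler))
   ```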
   
   ------------------------
   
   Thank you for your contribution! Follow this checklist to help us incorporate your contribution quickly and easily:
   
    - [ ] [**Choose reviewer(s)**](https://beam.apache.org/contribute/#make-your-change) and mention them in a comment (`R: @username`).
    - [ ] Mention the appropriate issue in your description (for example: `addresses #123`), if applicable. This will automatically add a link to the pull request in the issue. If you would like the issue to automatically close on merging the pull request, comment `fixes #<ISSUE NUMBER>` instead.
    - [ ] Update `CHANGES.md` with noteworthy changes.
    - [ ] If this contribution is large, please file an Apache [Individual Contributor License Agreement](https://www.apache.org/licenses/icla.pdf).
   
   See the [Contributor Guide](https://beam.apache.org/contribute) for more tips on [how to make the review process smoother](https://beam.apache.org/contribute/#make-reviewers-job-easier).
   
   To check the build health, please visit [https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md](https://github.com/apache/beam/blob/master/.test-infra/BUILD_STATUS.md)
   
   GitHub Actions Tests Status (on master branch)
   ------------------------------------------------------------------------------------------------
   [![Build python source distribution and wheels](https://github.com/apache/beam/workflows/Build%20python%20source%20distribution%20and%20wheels/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Build+python+source+distribution+and+wheels%22+branch%3Amaster+event%3Aschedule)
   [![Python tests](https://github.com/apache/beam/workflows/Python%20tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Python+Tests%22+branch%3Amaster+event%3Aschedule)
   [![Java tests](https://github.com/apache/beam/workflows/Java%20Tests/badge.svg?branch=master&event=schedule)](https://github.com/apache/beam/actions?query=workflow%3A%22Java+Tests%22+branch%3Amaster+event%3Aschedule)
   
   See [CI.md](https://github.com/apache/beam/blob/master/CI.md) for more information about GitHub Actions CI.
   




[GitHub] [beam] yeandy commented on pull request #21810: Split PytorchModelHandler into PytorchModelHandlerTensor and PytorchModelHandlerKeyedTensor

Posted by GitBox <gi...@apache.org>.
yeandy commented on PR #21810:
URL: https://github.com/apache/beam/pull/21810#issuecomment-1152744646

   R: @TheNeuralBit @ryanthompson591 @robertwb 




[GitHub] [beam] yeandy commented on a diff in pull request #21810: Split PytorchModelHandler into PytorchModelHandlerTensor and PytorchModelHandlerKeyedTensor

Posted by GitBox <gi...@apache.org>.
yeandy commented on code in PR #21810:
URL: https://github.com/apache/beam/pull/21810#discussion_r896925697


##########
sdks/python/apache_beam/ml/inference/pytorch_inference.py:
##########
@@ -67,67 +84,114 @@ def __init__(
 
   def load_model(self) -> torch.nn.Module:
     """Loads and initializes a Pytorch model for processing."""
-    model = self._model_class(**self._model_params)
-    model.to(self._device)
-    file = FileSystems.open(self._state_dict_path, 'rb')
-    model.load_state_dict(torch.load(file))
-    model.eval()
-    return model
-
-  def _convert_to_device(self, examples: torch.Tensor) -> torch.Tensor:
+    return _load_model(
+        self._model_class,
+        self._state_dict_path,
+        self._device,
+        **self._model_params)

Review Comment:
   are you referring to something like this? https://github.com/apache/beam/pull/21806
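   
   (For context, a helper matching that call site would look roughly like this; the body is inferred from the removed method above, and the PR's actual definition may differ:)
   
   ```python
   def _load_model(model_class, state_dict_path, device, **model_params):
     # Mirrors the method body removed in this diff; assumes the module's
     # existing torch and FileSystems imports.
     model = model_class(**model_params)
     model.to(device)
     file = FileSystems.open(state_dict_path, 'rb')
     model.load_state_dict(torch.load(file))
     model.eval()
     return model
   ```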





[GitHub] [beam] asf-ci commented on pull request #21810: Split PytorchModelHandler into PytorchModelHandlerTensor and PytorchModelHandlerKeyedTensor

Posted by GitBox <gi...@apache.org>.
asf-ci commented on PR #21810:
URL: https://github.com/apache/beam/pull/21810#issuecomment-1152748511

   Can one of the admins verify this patch?




[GitHub] [beam] codecov[bot] commented on pull request #21810: Split PytorchModelHandler into PytorchModelHandlerTensor and PytorchModelHandlerKeyedTensor

Posted by GitBox <gi...@apache.org>.
codecov[bot] commented on PR #21810:
URL: https://github.com/apache/beam/pull/21810#issuecomment-1154280342

   # [Codecov](https://codecov.io/gh/apache/beam/pull/21810?src=pr&el=h1&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) Report
   > Merging [#21810](https://codecov.io/gh/apache/beam/pull/21810?src=pr&el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (e632ad4) into [master](https://codecov.io/gh/apache/beam/commit/edf9b7906cd188cc7d16737d9a779a5cf585a6a2?el=desc&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) (edf9b79) will **decrease** coverage by `0.03%`.
   > The diff coverage is `0.00%`.
   
   ```diff
   @@            Coverage Diff             @@
   ##           master   #21810      +/-   ##
   ==========================================
   - Coverage   74.15%   74.12%   -0.04%     
   ==========================================
     Files         698      698              
     Lines       92417    92420       +3     
   ==========================================
   - Hits        68530    68502      -28     
   - Misses      22636    22667      +31     
     Partials     1251     1251              
   ```
   
   | Flag | Coverage Δ | |
   |---|---|---|
   | python | `83.72% <0.00%> (-0.05%)` | :arrow_down: |
   
   Flags with carried forward coverage won't be shown. [Click here](https://docs.codecov.io/docs/carryforward-flags?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#carryforward-flags-in-the-pull-request-comment) to find out more.
   
   | [Impacted Files](https://codecov.io/gh/apache/beam/pull/21810?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | Coverage Δ | |
   |---|---|---|
   | [...thon/apache\_beam/ml/inference/pytorch\_inference.py](https://codecov.io/gh/apache/beam/pull/21810/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vbWwvaW5mZXJlbmNlL3B5dG9yY2hfaW5mZXJlbmNlLnB5) | `0.00% <0.00%> (ø)` | |
   | [.../python/apache\_beam/testing/test\_stream\_service.py](https://codecov.io/gh/apache/beam/pull/21810/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdGVzdGluZy90ZXN0X3N0cmVhbV9zZXJ2aWNlLnB5) | `88.09% <0.00%> (-4.77%)` | :arrow_down: |
   | [sdks/python/apache\_beam/utils/interactive\_utils.py](https://codecov.io/gh/apache/beam/pull/21810/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vdXRpbHMvaW50ZXJhY3RpdmVfdXRpbHMucHk=) | `95.12% <0.00%> (-2.44%)` | :arrow_down: |
   | [...n/apache\_beam/ml/gcp/recommendations\_ai\_test\_it.py](https://codecov.io/gh/apache/beam/pull/21810/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vbWwvZ2NwL3JlY29tbWVuZGF0aW9uc19haV90ZXN0X2l0LnB5) | `73.46% <0.00%> (-2.05%)` | :arrow_down: |
   | [sdks/python/apache\_beam/io/source\_test\_utils.py](https://codecov.io/gh/apache/beam/pull/21810/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vaW8vc291cmNlX3Rlc3RfdXRpbHMucHk=) | `88.01% <0.00%> (-1.39%)` | :arrow_down: |
   | [...che\_beam/runners/interactive/interactive\_runner.py](https://codecov.io/gh/apache/beam/pull/21810/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9pbnRlcmFjdGl2ZS9pbnRlcmFjdGl2ZV9ydW5uZXIucHk=) | `90.06% <0.00%> (-1.33%)` | :arrow_down: |
   | [...eam/runners/portability/fn\_api\_runner/execution.py](https://codecov.io/gh/apache/beam/pull/21810/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy9wb3J0YWJpbGl0eS9mbl9hcGlfcnVubmVyL2V4ZWN1dGlvbi5weQ==) | `92.44% <0.00%> (-0.65%)` | :arrow_down: |
   | [...hon/apache\_beam/runners/worker/bundle\_processor.py](https://codecov.io/gh/apache/beam/pull/21810/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vcnVubmVycy93b3JrZXIvYnVuZGxlX3Byb2Nlc3Nvci5weQ==) | `93.30% <0.00%> (-0.38%)` | :arrow_down: |
   | [...examples/inference/pytorch\_image\_classification.py](https://codecov.io/gh/apache/beam/pull/21810/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vZXhhbXBsZXMvaW5mZXJlbmNlL3B5dG9yY2hfaW1hZ2VfY2xhc3NpZmljYXRpb24ucHk=) | `0.00% <0.00%> (ø)` | |
   | [sdks/python/apache\_beam/ml/inference/api.py](https://codecov.io/gh/apache/beam/pull/21810/diff?src=pr&el=tree&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation#diff-c2Rrcy9weXRob24vYXBhY2hlX2JlYW0vbWwvaW5mZXJlbmNlL2FwaS5weQ==) | | |
   | ... and [2 more](https://codecov.io/gh/apache/beam/pull/21810/diff?src=pr&el=tree-more&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation) | |
   
   ------
   
   [Continue to review full report at Codecov](https://codecov.io/gh/apache/beam/pull/21810?src=pr&el=continue&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   > **Legend** - [Click here to learn more](https://docs.codecov.io/docs/codecov-delta?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation)
   > `Δ = absolute <relative> (impact)`, `ø = not affected`, `? = missing data`
   > Powered by [Codecov](https://codecov.io/gh/apache/beam/pull/21810?src=pr&el=footer&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Last update [edf9b79...e632ad4](https://codecov.io/gh/apache/beam/pull/21810?src=pr&el=lastupdated&utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation). Read the [comment docs](https://docs.codecov.io/docs/pull-request-comments?utm_medium=referral&utm_source=github&utm_content=comment&utm_campaign=pr+comments&utm_term=The+Apache+Software+Foundation).
   




[GitHub] [beam] yeandy commented on a diff in pull request #21810: Split PytorchModelHandler into PytorchModelHandlerTensor and PytorchModelHandlerKeyedTensor

Posted by GitBox <gi...@apache.org>.
yeandy commented on code in PR #21810:
URL: https://github.com/apache/beam/pull/21810#discussion_r894896292


##########
sdks/python/apache_beam/ml/inference/pytorch_inference.py:
##########
@@ -29,6 +29,7 @@
 from apache_beam.io.filesystems import FileSystems
 from apache_beam.ml.inference.api import PredictionResult
 from apache_beam.ml.inference.base import ModelHandler
+from apache_beam.utils.annotations import experimental
 
 
 class PytorchModelHandler(ModelHandler[torch.Tensor,

Review Comment:
   Is the intention to still keep `PytorchModelHandler`?



##########
sdks/python/apache_beam/ml/inference/pytorch_inference_test.py:
##########
@@ -38,6 +38,8 @@
   from apache_beam.ml.inference.api import PredictionResult
   from apache_beam.ml.inference.base import RunInference
   from apache_beam.ml.inference.pytorch_inference import PytorchModelHandler

Review Comment:
   For the tests, I changed them to use `PytorchModelHandlerTensor` and `PytorchModelHandlerKeyedTensor`. Should the tests instead default to `PytorchModelHandler`, since `PytorchModelHandlerTensor` and `PytorchModelHandlerKeyedTensor` are experimental? How should we handle this?
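   
   For illustration, a test of the keyed handler might look roughly like this (a sketch: `FakeKeyedModel`, the fixture path, and how the handler invokes the model are hypothetical, not the PR's actual test code):
   
   ```python
   import unittest

   import apache_beam as beam
   import torch
   from apache_beam.ml.inference.base import RunInference
   from apache_beam.ml.inference.pytorch_inference import (
       PytorchModelHandlerKeyedTensor)

   class FakeKeyedModel(torch.nn.Module):
     """Parameter-free model, so an empty state dict loads cleanly."""
     def forward(self, k1):
       return k1 + 1

   class PytorchKeyedTensorTest(unittest.TestCase):
     def test_run_inference_keyed_tensor(self):
       state_dict_path = '/tmp/fake_state_dict.pt'  # hypothetical fixture path
       torch.save(FakeKeyedModel().state_dict(), state_dict_path)
       examples = [{'k1': torch.Tensor([1.0])}, {'k1': torch.Tensor([2.0])}]
       handler = PytorchModelHandlerKeyedTensor(
           state_dict_path=state_dict_path,
           model_class=FakeKeyedModel,
           model_params={})
       with beam.Pipeline() as pipeline:
         _ = (
             pipeline
             | beam.Create(examples)
             | RunInference(handler))
   ```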





[GitHub] [beam] yeandy commented on pull request #21810: Split PytorchModelHandler into PytorchModelHandlerTensor and PytorchModelHandlerKeyedTensor

Posted by GitBox <gi...@apache.org>.
yeandy commented on PR #21810:
URL: https://github.com/apache/beam/pull/21810#issuecomment-1154224554

   PTAL @TheNeuralBit 




[GitHub] [beam] yeandy commented on a diff in pull request #21810: Split PytorchModelHandler into PytorchModelHandlerTensor and PytorchModelHandlerKeyedTensor

Posted by GitBox <gi...@apache.org>.
yeandy commented on code in PR #21810:
URL: https://github.com/apache/beam/pull/21810#discussion_r895621888


##########
sdks/python/apache_beam/ml/inference/pytorch_inference.py:
##########
@@ -134,3 +135,185 @@ def get_metrics_namespace(self) -> str:
     Returns a namespace for metrics collected by the RunInference transform.
     """
     return 'RunInferencePytorch'
+
+
+@experimental()
+class PytorchModelHandlerTensor(ModelHandler[torch.Tensor,
+                                             PredictionResult,
+                                             torch.nn.Module]):
+  """ Implementation of the ModelHandler interface for PyTorch.
+
+      NOTE: This API and its implementation are under development and
+      do not provide backward compatibility guarantees.
+  """
+  def __init__(
+      self,
+      state_dict_path: str,
+      model_class: Callable[..., torch.nn.Module],
+      model_params: Dict[str, Any],
+      device: str = 'CPU'):
+    """
+    Initializes a PytorchModelHandler
+    :param state_dict_path: path to the saved dictionary of the model state.
+    :param model_class: class of the Pytorch model that defines the model
+    structure.
+    :param device: the device on which you wish to run the model. If
+    ``device = GPU`` then a GPU device will be used if it is available.
+    Otherwise, it will be CPU.
+
+    See https://pytorch.org/tutorials/beginner/saving_loading_models.html
+    for details
+    """
+    self._state_dict_path = state_dict_path
+    if device == 'GPU' and torch.cuda.is_available():
+      self._device = torch.device('cuda')
+    else:
+      self._device = torch.device('cpu')
+    self._model_class = model_class
+    self._model_params = model_params
+
+  def load_model(self) -> torch.nn.Module:
+    """Loads and initializes a Pytorch model for processing."""
+    model = self._model_class(**self._model_params)
+    model.to(self._device)
+    file = FileSystems.open(self._state_dict_path, 'rb')
+    model.load_state_dict(torch.load(file))
+    model.eval()
+    return model
+
+  def _convert_to_device(self, examples: torch.Tensor) -> torch.Tensor:
+    """
+    Converts samples to a style matching given device.
+
+    Note: A user may pass in device='GPU' but if GPU is not detected in the
+    environment it must be converted back to CPU.
+    """
+    if examples.device != self._device:
+      examples = examples.to(self._device)
+    return examples
+
+  def run_inference(
+      self, batch: List[torch.Tensor], model: torch.nn.Module,
+      **kwargs) -> Iterable[PredictionResult]:
+    """
+    Runs inferences on a batch of Tensors and returns an Iterable of
+    Tensor Predictions.
+
+    This method stacks the list of Tensors in a vectorized format to optimize
+    the inference call.
+    """
+    prediction_params = kwargs.get('prediction_params', {})
+    batched_tensors = torch.stack(batch)
+    batched_tensors = self._convert_to_device(batched_tensors)
+    predictions = model(batched_tensors, **prediction_params)
+    return [PredictionResult(x, y) for x, y in zip(batch, predictions)]
+
+  def get_num_bytes(self, batch: List[torch.Tensor]) -> int:
+    """Returns the number of bytes of data for a batch of Tensors."""
+    return sum((el.element_size() for tensor in batch for el in tensor))
+
+  def get_metrics_namespace(self) -> str:
+    """
+    Returns a namespace for metrics collected by the RunInference transform.
+    """
+    return 'RunInferencePytorch'
+
+
+@experimental()
+class PytorchModelHandlerKeyedTensor(ModelHandler[torch.Tensor,
+                                                  PredictionResult,
+                                                  torch.nn.Module]):
+  """ Implementation of the ModelHandler interface for PyTorch.
+
+      NOTE: This API and its implementation are under development and
+      do not provide backward compatibility guarantees.
+  """
+  def __init__(
+      self,
+      state_dict_path: str,
+      model_class: Callable[..., torch.nn.Module],
+      model_params: Dict[str, Any],
+      device: str = 'CPU'):
+    """
+    Initializes a PytorchModelHandler
+    :param state_dict_path: path to the saved dictionary of the model state.
+    :param model_class: class of the Pytorch model that defines the model
+    structure.
+    :param device: the device on which you wish to run the model. If
+    ``device = GPU`` then a GPU device will be used if it is available.
+    Otherwise, it will be CPU.
+
+    See https://pytorch.org/tutorials/beginner/saving_loading_models.html
+    for details
+    """
+    self._state_dict_path = state_dict_path
+    if device == 'GPU' and torch.cuda.is_available():
+      self._device = torch.device('cuda')
+    else:
+      self._device = torch.device('cpu')
+    self._model_class = model_class
+    self._model_params = model_params
+
+  def load_model(self) -> torch.nn.Module:
+    """Loads and initializes a Pytorch model for processing."""
+    model = self._model_class(**self._model_params)
+    model.to(self._device)
+    file = FileSystems.open(self._state_dict_path, 'rb')
+    model.load_state_dict(torch.load(file))
+    model.eval()
+    return model

Review Comment:
   Done.



##########
sdks/python/apache_beam/ml/inference/pytorch_inference.py:
##########
@@ -134,3 +135,185 @@ def get_metrics_namespace(self) -> str:
     Returns a namespace for metrics collected by the RunInference transform.
     """
     return 'RunInferencePytorch'
+
+
+@experimental()
+class PytorchModelHandlerTensor(ModelHandler[torch.Tensor,
+                                             PredictionResult,
+                                             torch.nn.Module]):
+  """ Implementation of the ModelHandler interface for PyTorch.
+
+      NOTE: This API and its implementation are under development and
+      do not provide backward compatibility guarantees.
+  """
+  def __init__(
+      self,
+      state_dict_path: str,
+      model_class: Callable[..., torch.nn.Module],
+      model_params: Dict[str, Any],
+      device: str = 'CPU'):
+    """
+    Initializes a PytorchModelHandler
+    :param state_dict_path: path to the saved dictionary of the model state.
+    :param model_class: class of the Pytorch model that defines the model
+    structure.
+    :param device: the device on which you wish to run the model. If
+    ``device = GPU`` then a GPU device will be used if it is available.
+    Otherwise, it will be CPU.
+
+    See https://pytorch.org/tutorials/beginner/saving_loading_models.html
+    for details
+    """
+    self._state_dict_path = state_dict_path
+    if device == 'GPU' and torch.cuda.is_available():
+      self._device = torch.device('cuda')
+    else:
+      self._device = torch.device('cpu')
+    self._model_class = model_class
+    self._model_params = model_params
+
+  def load_model(self) -> torch.nn.Module:
+    """Loads and initializes a Pytorch model for processing."""
+    model = self._model_class(**self._model_params)
+    model.to(self._device)
+    file = FileSystems.open(self._state_dict_path, 'rb')
+    model.load_state_dict(torch.load(file))
+    model.eval()
+    return model
+
+  def _convert_to_device(self, examples: torch.Tensor) -> torch.Tensor:
+    """
+    Converts samples to a style matching given device.
+
+    Note: A user may pass in device='GPU' but if GPU is not detected in the
+    environment it must be converted back to CPU.
+    """
+    if examples.device != self._device:
+      examples = examples.to(self._device)
+    return examples
+
+  def run_inference(
+      self, batch: List[torch.Tensor], model: torch.nn.Module,
+      **kwargs) -> Iterable[PredictionResult]:
+    """
+    Runs inferences on a batch of Tensors and returns an Iterable of
+    Tensor Predictions.
+
+    This method stacks the list of Tensors in a vectorized format to optimize
+    the inference call.
+    """
+    prediction_params = kwargs.get('prediction_params', {})
+    batched_tensors = torch.stack(batch)
+    batched_tensors = self._convert_to_device(batched_tensors)
+    predictions = model(batched_tensors, **prediction_params)
+    return [PredictionResult(x, y) for x, y in zip(batch, predictions)]
+
+  def get_num_bytes(self, batch: List[torch.Tensor]) -> int:
+    """Returns the number of bytes of data for a batch of Tensors."""
+    return sum((el.element_size() for tensor in batch for el in tensor))
+
+  def get_metrics_namespace(self) -> str:
+    """
+    Returns a namespace for metrics collected by the RunInference transform.
+    """
+    return 'RunInferencePytorch'
+
+
+@experimental()
+class PytorchModelHandlerKeyedTensor(ModelHandler[torch.Tensor,

Review Comment:
   Done







[GitHub] [beam] TheNeuralBit commented on a diff in pull request #21810: Split PytorchModelHandler into PytorchModelHandlerTensor and PytorchModelHandlerKeyedTensor

Posted by GitBox <gi...@apache.org>.
TheNeuralBit commented on code in PR #21810:
URL: https://github.com/apache/beam/pull/21810#discussion_r894928240


##########
sdks/python/apache_beam/ml/inference/pytorch_inference.py:
##########
@@ -134,3 +135,185 @@ def get_metrics_namespace(self) -> str:
     Returns a namespace for metrics collected by the RunInference transform.
     """
     return 'RunInferencePytorch'
+
+
+@experimental()
+class PytorchModelHandlerTensor(ModelHandler[torch.Tensor,
+                                             PredictionResult,
+                                             torch.nn.Module]):
+  """ Implementation of the ModelHandler interface for PyTorch.
+
+      NOTE: This API and its implementation are under development and
+      do not provide backward compatibility guarantees.
+  """
+  def __init__(
+      self,
+      state_dict_path: str,
+      model_class: Callable[..., torch.nn.Module],
+      model_params: Dict[str, Any],
+      device: str = 'CPU'):
+    """
+    Initializes a PytorchModelHandler
+    :param state_dict_path: path to the saved dictionary of the model state.
+    :param model_class: class of the Pytorch model that defines the model
+    structure.
+    :param device: the device on which you wish to run the model. If
+    ``device = GPU`` then a GPU device will be used if it is available.
+    Otherwise, it will be CPU.
+
+    See https://pytorch.org/tutorials/beginner/saving_loading_models.html
+    for details
+    """
+    self._state_dict_path = state_dict_path
+    if device == 'GPU' and torch.cuda.is_available():
+      self._device = torch.device('cuda')
+    else:
+      self._device = torch.device('cpu')
+    self._model_class = model_class
+    self._model_params = model_params
+
+  def load_model(self) -> torch.nn.Module:
+    """Loads and initializes a Pytorch model for processing."""
+    model = self._model_class(**self._model_params)
+    model.to(self._device)
+    file = FileSystems.open(self._state_dict_path, 'rb')
+    model.load_state_dict(torch.load(file))
+    model.eval()
+    return model
+
+  def _convert_to_device(self, examples: torch.Tensor) -> torch.Tensor:
+    """
+    Converts samples to a style matching given device.
+
+    Note: A user may pass in device='GPU' but if GPU is not detected in the
+    environment it must be converted back to CPU.
+    """
+    if examples.device != self._device:
+      examples = examples.to(self._device)
+    return examples
+
+  def run_inference(
+      self, batch: List[torch.Tensor], model: torch.nn.Module,
+      **kwargs) -> Iterable[PredictionResult]:
+    """
+    Runs inferences on a batch of Tensors and returns an Iterable of
+    Tensor Predictions.
+
+    This method stacks the list of Tensors in a vectorized format to optimize
+    the inference call.
+    """
+    prediction_params = kwargs.get('prediction_params', {})
+    batched_tensors = torch.stack(batch)
+    batched_tensors = self._convert_to_device(batched_tensors)
+    predictions = model(batched_tensors, **prediction_params)
+    return [PredictionResult(x, y) for x, y in zip(batch, predictions)]
+
+  def get_num_bytes(self, batch: List[torch.Tensor]) -> int:
+    """Returns the number of bytes of data for a batch of Tensors."""
+    return sum((el.element_size() for tensor in batch for el in tensor))
+
+  def get_metrics_namespace(self) -> str:
+    """
+    Returns a namespace for metrics collected by the RunInference transform.
+    """
+    return 'RunInferencePytorch'
+
+
+@experimental()
+class PytorchModelHandlerKeyedTensor(ModelHandler[torch.Tensor,

Review Comment:
   I think the `torch.Tensor` hint needs to be updated
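   
   Presumably along these lines (a sketch of the suggested fix; `Dict[str, torch.Tensor]` is an inference from the class name, not the merged signature, and it assumes the module's existing imports):
   
   ```python
   class PytorchModelHandlerKeyedTensor(ModelHandler[Dict[str, torch.Tensor],
                                                     PredictionResult,
                                                     torch.nn.Module]):
     """Consumes keyed examples mapping input names to Tensors."""
   ```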



##########
sdks/python/apache_beam/ml/inference/pytorch_inference.py:
##########
@@ -29,6 +29,7 @@
 from apache_beam.io.filesystems import FileSystems
 from apache_beam.ml.inference.api import PredictionResult
 from apache_beam.ml.inference.base import ModelHandler
+from apache_beam.utils.annotations import experimental
 
 
 class PytorchModelHandler(ModelHandler[torch.Tensor,

Review Comment:
   No I think we want to drop `PytorchModelHandler`



##########
sdks/python/apache_beam/ml/inference/pytorch_inference_test.py:
##########
@@ -38,6 +38,8 @@
   from apache_beam.ml.inference.api import PredictionResult
   from apache_beam.ml.inference.base import RunInference
   from apache_beam.ml.inference.pytorch_inference import PytorchModelHandler

Review Comment:
   Let's drop `PytorchModelHandler` and exercise the type-specific ones.



##########
sdks/python/apache_beam/ml/inference/pytorch_inference.py:
##########
@@ -134,3 +135,185 @@ def get_metrics_namespace(self) -> str:
     Returns a namespace for metrics collected by the RunInference transform.
     """
     return 'RunInferencePytorch'
+
+
+@experimental()
+class PytorchModelHandlerTensor(ModelHandler[torch.Tensor,
+                                             PredictionResult,
+                                             torch.nn.Module]):
+  """ Implementation of the ModelHandler interface for PyTorch.
+
+      NOTE: This API and its implementation are under development and
+      do not provide backward compatibility guarantees.

Review Comment:
   IIUC we want to drop the experimental tag and NOTE for this one, but we'll keep the named input one (KeyedTensor) marked experimental. See here: https://github.com/apache/beam/pull/21803#discussion_r894872155



##########
sdks/python/apache_beam/ml/inference/pytorch_inference.py:
##########
@@ -134,3 +135,185 @@ def get_metrics_namespace(self) -> str:
     Returns a namespace for metrics collected by the RunInference transform.
     """
     return 'RunInferencePytorch'
+
+
+@experimental()
+class PytorchModelHandlerTensor(ModelHandler[torch.Tensor,
+                                             PredictionResult,
+                                             torch.nn.Module]):
+  """ Implementation of the ModelHandler interface for PyTorch.
+
+      NOTE: This API and its implementation are under development and
+      do not provide backward compatibility guarantees.
+  """
+  def __init__(
+      self,
+      state_dict_path: str,
+      model_class: Callable[..., torch.nn.Module],
+      model_params: Dict[str, Any],
+      device: str = 'CPU'):
+    """
+    Initializes a PytorchModelHandler
+    :param state_dict_path: path to the saved dictionary of the model state.
+    :param model_class: class of the Pytorch model that defines the model
+    structure.
+    :param device: the device on which you wish to run the model. If
+    ``device = GPU`` then a GPU device will be used if it is available.
+    Otherwise, it will be CPU.
+
+    See https://pytorch.org/tutorials/beginner/saving_loading_models.html
+    for details
+    """
+    self._state_dict_path = state_dict_path
+    if device == 'GPU' and torch.cuda.is_available():
+      self._device = torch.device('cuda')
+    else:
+      self._device = torch.device('cpu')
+    self._model_class = model_class
+    self._model_params = model_params
+
+  def load_model(self) -> torch.nn.Module:
+    """Loads and initializes a Pytorch model for processing."""
+    model = self._model_class(**self._model_params)
+    model.to(self._device)
+    file = FileSystems.open(self._state_dict_path, 'rb')
+    model.load_state_dict(torch.load(file))
+    model.eval()
+    return model
+
+  def _convert_to_device(self, examples: torch.Tensor) -> torch.Tensor:
+    """
+    Converts samples to a style matching given device.
+
+    Note: A user may pass in device='GPU' but if GPU is not detected in the
+    environment it must be converted back to CPU.
+    """
+    if examples.device != self._device:
+      examples = examples.to(self._device)
+    return examples
+
+  def run_inference(
+      self, batch: List[torch.Tensor], model: torch.nn.Module,
+      **kwargs) -> Iterable[PredictionResult]:
+    """
+    Runs inferences on a batch of Tensors and returns an Iterable of
+    Tensor Predictions.
+
+    This method stacks the list of Tensors in a vectorized format to optimize
+    the inference call.
+    """
+    prediction_params = kwargs.get('prediction_params', {})
+    batched_tensors = torch.stack(batch)
+    batched_tensors = self._convert_to_device(batched_tensors)
+    predictions = model(batched_tensors, **prediction_params)
+    return [PredictionResult(x, y) for x, y in zip(batch, predictions)]
+
+  def get_num_bytes(self, batch: List[torch.Tensor]) -> int:
+    """Returns the number of bytes of data for a batch of Tensors."""
+    return sum((el.element_size() for tensor in batch for el in tensor))
+
+  def get_metrics_namespace(self) -> str:
+    """
+    Returns a namespace for metrics collected by the RunInference transform.
+    """
+    return 'RunInferencePytorch'
+
+
+@experimental()
+class PytorchModelHandlerKeyedTensor(ModelHandler[torch.Tensor,
+                                                  PredictionResult,
+                                                  torch.nn.Module]):
+  """ Implementation of the ModelHandler interface for PyTorch.
+
+      NOTE: This API and its implementation are under development and
+      do not provide backward compatibility guarantees.
+  """
+  def __init__(
+      self,
+      state_dict_path: str,
+      model_class: Callable[..., torch.nn.Module],
+      model_params: Dict[str, Any],
+      device: str = 'CPU'):
+    """
+    Initializes a PytorchModelHandler
+    :param state_dict_path: path to the saved dictionary of the model state.
+    :param model_class: class of the Pytorch model that defines the model
+    structure.
+    :param device: the device on which you wish to run the model. If
+    ``device = GPU`` then a GPU device will be used if it is available.
+    Otherwise, it will be CPU.
+
+    See https://pytorch.org/tutorials/beginner/saving_loading_models.html
+    for details
+    """
+    self._state_dict_path = state_dict_path
+    if device == 'GPU' and torch.cuda.is_available():
+      self._device = torch.device('cuda')
+    else:
+      self._device = torch.device('cpu')
+    self._model_class = model_class
+    self._model_params = model_params
+
+  def load_model(self) -> torch.nn.Module:
+    """Loads and initializes a Pytorch model for processing."""
+    model = self._model_class(**self._model_params)
+    model.to(self._device)
+    file = FileSystems.open(self._state_dict_path, 'rb')
+    model.load_state_dict(torch.load(file))
+    model.eval()
+    return model

Review Comment:
   Please extract common logic to a base class, or a private free function as in #21803.
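   
   For reference, the base-class option might look roughly like this (a sketch; `_PytorchModelHandlerBase` is a hypothetical name, and the PR ultimately took the free-function route, as a later diff in this thread shows):
   
   ```python
   class _PytorchModelHandlerBase:
     """Holds the constructor and load_model logic duplicated above."""
     def __init__(self, state_dict_path, model_class, model_params,
                  device='CPU'):
       self._state_dict_path = state_dict_path
       if device == 'GPU' and torch.cuda.is_available():
         self._device = torch.device('cuda')
       else:
         self._device = torch.device('cpu')
       self._model_class = model_class
       self._model_params = model_params

     def load_model(self) -> torch.nn.Module:
       model = self._model_class(**self._model_params)
       model.to(self._device)
       file = FileSystems.open(self._state_dict_path, 'rb')
       model.load_state_dict(torch.load(file))
       model.eval()
       return model
   ```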





[GitHub] [beam] TheNeuralBit merged pull request #21810: Split PytorchModelHandler into PytorchModelHandlerTensor and PytorchModelHandlerKeyedTensor

Posted by GitBox <gi...@apache.org>.
TheNeuralBit merged PR #21810:
URL: https://github.com/apache/beam/pull/21810








[GitHub] [beam] yeandy commented on a diff in pull request #21810: Split PytorchModelHandler into PytorchModelHandlerTensor and PytorchModelHandlerKeyedTensor

Posted by GitBox <gi...@apache.org>.
yeandy commented on code in PR #21810:
URL: https://github.com/apache/beam/pull/21810#discussion_r895622147


##########
sdks/python/apache_beam/ml/inference/pytorch_inference.py:
##########
@@ -134,3 +135,185 @@ def get_metrics_namespace(self) -> str:
     Returns a namespace for metrics collected by the RunInference transform.
     """
     return 'RunInferencePytorch'
+
+
+@experimental()
+class PytorchModelHandlerTensor(ModelHandler[torch.Tensor,
+                                             PredictionResult,
+                                             torch.nn.Module]):
+  """ Implementation of the ModelHandler interface for PyTorch.
+
+      NOTE: This API and its implementation are under development and
+      do not provide backward compatibility guarantees.

Review Comment:
   SG





[GitHub] [beam] ryanthompson591 commented on a diff in pull request #21810: Split PytorchModelHandler into PytorchModelHandlerTensor and PytorchModelHandlerKeyedTensor

Posted by GitBox <gi...@apache.org>.
ryanthompson591 commented on code in PR #21810:
URL: https://github.com/apache/beam/pull/21810#discussion_r896870386


##########
sdks/python/apache_beam/ml/inference/pytorch_inference.py:
##########
@@ -67,67 +84,114 @@ def __init__(
 
   def load_model(self) -> torch.nn.Module:
     """Loads and initializes a Pytorch model for processing."""
-    model = self._model_class(**self._model_params)
-    model.to(self._device)
-    file = FileSystems.open(self._state_dict_path, 'rb')
-    model.load_state_dict(torch.load(file))
-    model.eval()
-    return model
-
-  def _convert_to_device(self, examples: torch.Tensor) -> torch.Tensor:
+    return _load_model(
+        self._model_class,
+        self._state_dict_path,
+        self._device,
+        **self._model_params)

Review Comment:
   What do you think about passing the model parameters as a dictionary?
   
   The advantage is that users can specify exactly which parameters they intend to set. They would specify them like this:
   
   model_parameters = {
     'key_1': 'parameter_1'
   }
   
   Then, if optional parameters are added in the future, they won't collide.
   
   Feel free to make that change in another PR if you think it's a good idea.
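   
   A sketch of what that could look like (hypothetical; the helper and parameter names mirror the diff above, but this is not the PR's code):
   
   ```python
   def _load_model(model_class, state_dict_path, device, model_params):
     """Variant that keeps model_params as a dict argument, so future
     optional keyword arguments on this helper cannot collide with model
     parameter names. Assumes the module's torch/FileSystems imports."""
     model = model_class(**model_params)  # expand only at construction time
     model.to(device)
     file = FileSystems.open(state_dict_path, 'rb')
     model.load_state_dict(torch.load(file))
     model.eval()
     return model
   ```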



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org