Posted to github@beam.apache.org by GitBox <gi...@apache.org> on 2022/11/23 17:59:37 UTC

[GitHub] [beam] AnandInguva opened a new issue, #24340: [Feature Request][Tracking]: Use accelerate from Hugging Face to optimize loading Pytorch models

AnandInguva opened a new issue, #24340:
URL: https://github.com/apache/beam/issues/24340

   ### What would you like to happen?
   
   The [Accelerate](https://huggingface.co/blog/accelerate-large-models) module from Hugging Face is used to optimize model loading for PyTorch and TensorFlow.
   
   The model loading pipeline in PyTorch typically happens as below (see the sketch after this list):
   * Create the model
   * Load its weights into memory (in an object usually called a state_dict)
   * Load those weights into the created model
   * Move the model to the device for inference
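   
   A minimal sketch of these four steps, assuming a hypothetical `MyModel` class and checkpoint path `model.pt` (both placeholders, not from this issue):
   
   ```python
   import torch
   
   # Placeholder model class for illustration only.
   class MyModel(torch.nn.Module):
       def __init__(self):
           super().__init__()
           self.linear = torch.nn.Linear(10, 2)
   
   model = MyModel()                      # 1. create the model (random weights)
   state_dict = torch.load("model.pt")    # 2. load the weights into memory
   model.load_state_dict(state_dict)      # 3. load those weights into the model
   device = "cuda" if torch.cuda.is_available() else "cpu"
   model.to(device)                       # 4. move the model to the device
   ```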
   
   This is not an efficient way to load a large model: steps 1 and 2 each require a full copy of the model in memory, and Hugging Face Accelerate helps mitigate this issue.
   As per the [documentation](https://huggingface.co/docs/accelerate/usage_guides/big_modeling#limits-and-further-development), this API is still in development and currently needs at least one GPU to run, though the docs note this limitation could be fixed in the future.
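   
   As a sketch of how Accelerate avoids the redundant copy, using the `init_empty_weights` and `load_checkpoint_and_dispatch` APIs described in the blog post linked above (reusing the hypothetical `MyModel` and checkpoint from the sketch earlier):
   
   ```python
   from accelerate import init_empty_weights, load_checkpoint_and_dispatch
   
   # Instantiate the model on the "meta" device: no memory is allocated
   # for the randomly initialized weights.
   with init_empty_weights():
       model = MyModel()
   
   # Load the checkpoint and place each weight directly on the right
   # device(s), skipping the second full in-memory copy of the model.
   model = load_checkpoint_and_dispatch(model, "model.pt", device_map="auto")
   ```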
   
   ### Issue Priority
   
   Priority: 3
   
   ### Issue Component
   
   Component: run-inference


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscribe@beam.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org