Posted to commits@mxnet.apache.org by GitBox <gi...@apache.org> on 2018/03/26 18:37:05 UTC

[GitHub] GSanchis opened a new issue #10250: Model too big?

URL: https://github.com/apache/incubator-mxnet/issues/10250
 
 
   Hi all,
   
   I'm attempting to train a recommender system with MXNet 1.0.0 in Python 3, but I'm running into the following problem: the dataset has roughly 5M items and 200k users, which means I can't use an embedding size larger than 100, since the model would no longer fit into memory:
    ```
    import mxnet as mx

    # Batch inputs: integer IDs plus the target rating
    user = mx.symbol.Variable("user")
    item = mx.symbol.Variable("item")
    score = mx.symbol.Variable("score")

    user_embed = mx.symbol.Embedding(name="user_embed", data=user,
                                     input_dim=200000, output_dim=100)
    item_embed = mx.symbol.Embedding(name="item_embed", data=item,
                                     input_dim=5000000, output_dim=100)
    pred = user_embed * item_embed  # elementwise product, summed below = dot product
    pred = mx.symbol.sum_axis(pred, axis=1)
    pred = mx.symbol.Flatten(pred)
    pred = mx.symbol.LinearRegressionOutput(data=pred, label=score)
    ```
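   For context, here are my back-of-the-envelope numbers (assuming float32 parameters; gradients and any optimizer state would come on top of this):
    ```
    # Weight storage alone for the two embedding tables, 4 bytes per float32
    bytes_per_float = 4
    item_table = 5000000 * 100 * bytes_per_float  # ~2.0 GB
    user_table = 200000 * 100 * bytes_per_float   # ~0.08 GB
    print((item_table + user_table) / 1024**3)    # ~1.94 GiB, weights only
    ```
   With gradients of the same shape during training, that is already close to 4 GB before counting activations.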
   I know it's an overly simplistic model, but if even this one doesn't fit into GPU memory, a more complex model certainly won't fit either.
   
   Results with embedding size 100 are OK on a smaller subset of items/users, but on the full dataset they are not anymore, so I assume the embeddings no longer have enough expressive power.
   
   Is there a way to reduce that memory footprint? Perhaps by loading the embeddings "on demand", i.e., only those rows that are actually required for a specific batch?
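   
   To make the idea concrete, here is a rough sketch of what I mean (my own illustration, not an existing MXNet feature; the names `item_table_cpu` and `gather_batch_rows` are made up). It keeps the big table in host memory and uses `mx.nd.take` to copy only a batch's rows to the GPU:
    ```
    import mxnet as mx

    # The full item table stays in host memory, where 2 GB is affordable
    # (initialized to zeros here only for the sketch).
    item_table_cpu = mx.nd.zeros((5000000, 100), ctx=mx.cpu())

    def gather_batch_rows(table_cpu, row_ids, ctx):
        """Copy only the embedding rows this batch touches onto the device."""
        idx = mx.nd.array(row_ids, ctx=mx.cpu())
        rows = mx.nd.take(table_cpu, idx)   # (batch_size, 100), still on CPU
        return rows.as_in_context(ctx)      # transfers a few KB instead of ~2 GB

    batch_item_ids = [12, 4711, 4999999]    # hypothetical mini-batch
    batch_embed = gather_batch_rows(item_table_cpu, batch_item_ids, mx.gpu(0))
    ```
   The missing piece is obviously the backward pass: the gradient updates would have to be scattered back into the CPU table after each step. I also noticed there is a `row_sparse` storage type in `mxnet.ndarray.sparse` that looks like it is aimed at exactly this access pattern; is that the recommended route?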
   
   Thanks in advance!!

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
users@infra.apache.org


With regards,
Apache Git Services