Posted to dev@bluemarlin.apache.org by GitBox <gi...@apache.org> on 2022/02/24 14:56:46 UTC

[GitHub] [incubator-bluemarlin] jimmylao edited a comment on issue #52: [BLUEMARLIN-29] : For DIN-Lookalike model, training is very slow.

jimmylao edited a comment on issue #52:
URL: https://github.com/apache/incubator-bluemarlin/issues/52#issuecomment-1049939996


   @Bimlesh759-AI 
   More data does not always produce a better model; high-quality data usually does.
   Real-world data is usually noisy and contains a lot of redundancy, so you may need more insight into the data.
   1. Can you share a breakdown of your training/testing data? For example:
     - the total number of positive and negative training/testing samples across all 19 items
     - the number of positive and negative training/testing samples for each of the 19 items

   You may want to balance the number of training/testing samples across these 19 items.
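
A rough sketch of how such a breakdown could be computed is shown here. It assumes each sample can be reduced to a `(split, item_id, label)` tuple with label 1 for positive and 0 for negative; the record layout, the item names, and the helper functions `label_breakdown` and `positive_ratio` are hypothetical and not taken from the BlueMarlin code.

```python
from collections import Counter

# Hypothetical records: (split, item_id, label), label 1 = positive, 0 = negative.
samples = [
    ("train", "item_01", 1),
    ("train", "item_01", 0),
    ("train", "item_01", 0),
    ("train", "item_02", 1),
    ("test", "item_01", 1),
    ("test", "item_02", 0),
]


def label_breakdown(records):
    """Count samples per (split, item, label) and per (split, label) overall."""
    per_item = Counter()
    overall = Counter()
    for split, item_id, label in records:
        per_item[(split, item_id, label)] += 1
        overall[(split, label)] += 1
    return per_item, overall


def positive_ratio(per_item, split, item_id):
    """Fraction of positive samples for one item in one split (0.0 if no samples)."""
    pos = per_item[(split, item_id, 1)]
    neg = per_item[(split, item_id, 0)]
    total = pos + neg
    return pos / total if total else 0.0


per_item, overall = label_breakdown(samples)

# Print the per-item counts so imbalanced items stand out.
for (split, item_id, label), count in sorted(per_item.items()):
    print(f"{split:5s} {item_id} label={label} count={count}")
```

Items whose positive ratio is far from the others are candidates for down-sampling or re-weighting before training.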
   
   2. Since your dataset is huge, parallel training is essential. I think you may already have tried TF 2 for multi-GPU training; how is your progress on that? Another option is TF 1 + Horovod. To my understanding, you tried this approach too. Can you share the current status of parallel training with multiple GPUs or machines?
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscribe@bluemarlin.apache.org

For queries about this service, please contact Infrastructure at:
users@infra.apache.org