Posted to dev@madlib.apache.org by GitBox <gi...@apache.org> on 2019/09/17 20:33:56 UTC

[GitHub] [madlib] khannaekta opened a new pull request #443: DL: Add training for multiple models

URL: https://github.com/apache/madlib/pull/443
 
 
   This PR adds a new function that trains multiple models in parallel
   using the model hopping method, which is supported on Greenplum DB only.
   
   The model hopping method involves the following steps:
   
   - Train models in parallel for a single epoch using the data local
   to each segment.
   - Move the models to the next segment in round-robin fashion.
   
   Repeating these steps until each model has been on every segment
   ensures that all of the models visit the entire dataset, which
   eliminates the need to average the models at the end.
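
   The steps above can be sketched as a small pure-Python simulation. This
   is only an illustration of the round-robin idea, not the code in this
   PR: the function names `model_hopping_schedule` and `run_model_hopping`
   are hypothetical, "training" is reduced to recording which rows a model
   has seen, and the sketch assumes there are no more models than segments.

```python
def model_hopping_schedule(num_models, num_segments):
    """Return one placement per hop as a dict {model_id: segment_id}.

    Assumption of this sketch (not stated in the PR): num_models is at
    most num_segments, so no two models share a segment within a hop.
    """
    if num_models > num_segments:
        raise ValueError("sketch assumes at most one model per segment")
    schedule = []
    for hop in range(num_segments):
        # Model m starts on segment m and moves one segment forward per hop.
        placement = {m: (m + hop) % num_segments for m in range(num_models)}
        schedule.append(placement)
    return schedule


def run_model_hopping(num_models, segment_data):
    """Simulate training; each 'model' is just the list of rows it has seen."""
    num_segments = len(segment_data)
    seen = {m: [] for m in range(num_models)}
    for placement in model_hopping_schedule(num_models, num_segments):
        # Within a hop, a model trains only on its current segment's local rows.
        for model_id, segment_id in placement.items():
            seen[model_id].extend(segment_data[segment_id])
    return seen


# Three segments, each holding its own local rows; two models hop across them.
segments = [[1, 2], [3, 4], [5, 6]]
seen = run_model_hopping(2, segments)
```

   After as many hops as there are segments, every model in the simulation
   has seen every row exactly once, so no averaging step is needed.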
   
   This PR also fixes the following issues:
   - In the regular fit function, an excessive number of threads was being
   created and left behind by Keras sessions. This is fixed by reusing
   the same session and the same computational graph throughout the
   process.
   - While deserializing weights, if the model shape expected fewer elements
   than were present in the weights, the extra weights were truncated.
   This is fixed by adding an explicit check that the number of
   elements in the model weights matches the number the model expects.
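
   The weight-count check in the second fix can be illustrated with a
   minimal pure-Python sketch. It is not the implementation in this PR:
   the names `element_count` and `deserialize_weights` are hypothetical,
   and the weights are assumed to arrive as one flat sequence that is
   split according to a list of layer shapes.

```python
def element_count(shape):
    """Number of elements in a tensor of the given shape."""
    count = 1
    for dim in shape:
        count *= dim
    return count


def deserialize_weights(flat_weights, layer_shapes):
    """Split a flat weight sequence into one flat list per layer.

    Raises ValueError when the total number of elements does not match
    what the shapes expect, instead of dropping the extra values.
    """
    expected = sum(element_count(shape) for shape in layer_shapes)
    if len(flat_weights) != expected:
        raise ValueError(
            "weights contain %d elements but the model expects %d"
            % (len(flat_weights), expected)
        )
    layers = []
    offset = 0
    for shape in layer_shapes:
        count = element_count(shape)
        layers.append(list(flat_weights[offset:offset + count]))
        offset += count
    return layers


# A 2x2 layer followed by a 2-element layer consumes exactly six values.
layers = deserialize_weights([1, 2, 3, 4, 5, 6], [(2, 2), (2,)])
```

   Checking the total before slicing means a mismatch in either direction
   (too many or too few elements) is reported up front.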
