Posted to issues@ignite.apache.org by "Artem Malykh (JIRA)" <ji...@apache.org> on 2017/10/27 16:44:00 UTC

[jira] [Created] (IGNITE-6783) Create common mechanism for group training.

Artem Malykh created IGNITE-6783:
------------------------------------

             Summary: Create common mechanism for group training.
                 Key: IGNITE-6783
                 URL: https://issues.apache.org/jira/browse/IGNITE-6783
             Project: Ignite
          Issue Type: Task
      Security Level: Public (Viewable by anyone)
            Reporter: Artem Malykh
            Assignee: Artem Malykh


In distributed ML it is a common task to train several models in parallel, with the ability for them to communicate with each other during training. A simple example is training a neural network with SGD on chunks of data located on several nodes. Such training runs in a loop: on each node we perform one or several SGD steps, then send the resulting gradient to a central node, which averages the gradients from all worker nodes and sends the averaged gradient back. There is a pattern in this procedure that can be applied to other ML algorithms, and it would be useful to extract it into a common mechanism.
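
For illustration, here is a minimal, Ignite-agnostic sketch of that gradient-averaging loop using plain java.util.concurrent. All names in it (GroupTrainingSketch, Worker, localGradient, etc.) are hypothetical and are not part of any Ignite API; distributed nodes are simulated with threads, and each worker's data chunk is stubbed out by a synthetic quadratic-loss gradient.

import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

/**
 * Hypothetical sketch of the group-training pattern: several workers each
 * compute a local gradient on their own data chunk, a coordinator averages
 * the gradients and "broadcasts" the averaged gradient back, and the loop
 * repeats. Not an Ignite API; threads stand in for cluster nodes.
 */
public class GroupTrainingSketch {
    /** A worker computes a local gradient from its data chunk and the current model. */
    interface Worker {
        double[] localGradient(double[] model);
    }

    public static void main(String[] args) throws Exception {
        int dim = 4;
        int workersCnt = 3;
        int iterations = 10;
        double learningRate = 0.1;

        // Each simulated "node" owns a data chunk; here a worker's gradient
        // simply pulls the model toward a chunk-specific target value, i.e.
        // the gradient of 0.5 * (w - target)^2 per coordinate.
        List<Worker> workers = IntStream.range(0, workersCnt)
            .mapToObj(i -> (Worker) model -> {
                double[] grad = new double[dim];
                for (int j = 0; j < dim; j++)
                    grad[j] = model[j] - (i + 1);
                return grad;
            })
            .collect(Collectors.toList());

        double[] model = new double[dim];
        ExecutorService pool = Executors.newFixedThreadPool(workersCnt);

        for (int it = 0; it < iterations; it++) {
            double[] curModel = model.clone();

            // 1. Each worker computes its local gradient in parallel.
            List<Future<double[]>> futures = workers.stream()
                .map(w -> pool.submit(() -> w.localGradient(curModel)))
                .collect(Collectors.toList());

            // 2. The coordinator ("central node") averages the gradients.
            double[] avgGrad = new double[dim];
            for (Future<double[]> f : futures) {
                double[] g = f.get();
                for (int j = 0; j < dim; j++)
                    avgGrad[j] += g[j] / workersCnt;
            }

            // 3. The averaged gradient is sent back: every worker applies the
            // same update, so all replicas of the model stay in sync.
            for (int j = 0; j < dim; j++)
                model[j] -= learningRate * avgGrad[j];
        }

        pool.shutdown();
        System.out.println("Trained model: " + Arrays.toString(model));
    }
}

The pattern worth extracting is steps 1-3: compute locally, reduce centrally, broadcast back. A common mechanism would let an algorithm plug in its own "local step" and "aggregation" functions while the framework handles distribution and synchronization.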



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)