Posted to dev@singa.apache.org by "wangwei (JIRA)" <ji...@apache.org> on 2015/12/16 13:45:46 UTC

[jira] [Commented] (SINGA-113) Model/Hybrid Partition Support

    [ https://issues.apache.org/jira/browse/SINGA-113?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15059939#comment-15059939 ] 

wangwei commented on SINGA-113:
-------------------------------

We may also need `partition_dim=-1`.
Take AlexNet as an example: the lower layers take most of the training time, while the upper layers hold most of the model parameters.
It is therefore necessary to parallelize the training of the lower layers using data parallelism (i.e., with partition_dim = 0).
For the upper layers, we can either
1. do model partitioning (i.e., with partition_dim = 1), which parallelizes the computation for the upper layers but also requires transferring features among workers; or
2. put them on a single GPU card (i.e., with partition_dim = -1), which gives no parallelism but also incurs no communication overhead from the upper layers.

Sometimes, we can also set the partition_dim to -1 for input layers.
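
To make the three values concrete, here is a minimal sketch in plain Python of how a 2-D feature blob (samples x features) might be divided among workers under each setting. This is not SINGA code: the `partition` helper and the list-of-rows blob are assumptions made for illustration, and `partition_dim = -1` follows the proposal in this comment rather than an existing option.

```python
def partition(blob, partition_dim, num_workers):
    """Split a 2-D blob (a list of rows) into per-worker pieces.

    Hypothetical helper for illustration only, not a SINGA API.
      partition_dim = 0  -> split along the batch dimension (data partition)
      partition_dim = 1  -> split along the feature dimension (model partition)
      partition_dim = -1 -> no split, the whole blob stays on one worker
    """
    if partition_dim == -1:
        return [blob]
    if partition_dim == 0:
        n = len(blob)
        bounds = [n * w // num_workers for w in range(num_workers + 1)]
        return [blob[bounds[w]:bounds[w + 1]] for w in range(num_workers)]
    if partition_dim == 1:
        n = len(blob[0])
        bounds = [n * w // num_workers for w in range(num_workers + 1)]
        return [[row[bounds[w]:bounds[w + 1]] for row in blob]
                for w in range(num_workers)]
    raise ValueError("partition_dim must be -1, 0 or 1")


# 4 samples with 6 features each, divided between 2 workers.
blob = [[r * 10 + c for c in range(6)] for r in range(4)]
by_data = partition(blob, 0, 2)    # each worker gets 2 samples x 6 features
by_model = partition(blob, 1, 2)   # each worker gets 4 samples x 3 features
unsplit = partition(blob, -1, 2)   # one worker gets all 4 samples x 6 features
```

With partition_dim = 1 every worker sees only part of each feature vector, which is why features have to be exchanged among workers; with partition_dim = -1 nothing is exchanged because nothing is split.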

> Model/Hybrid Partition Support
> ------------------------------
>
>                 Key: SINGA-113
>                 URL: https://issues.apache.org/jira/browse/SINGA-113
>             Project: Singa
>          Issue Type: Bug
>            Reporter: Sheng Wang
>            Assignee: Sheng Wang
>
> This ticket is to add model partition and hybrid partition.
> Model/hybrid partitions are achieved by adding assistant connection layers to the original neuralnet model to handle data/gradient messages between two connected layers that reside on different machines.
> The partition is transparent to users.
> Users just need to configure the partition_dim field in NetProto:
> partition_dim = 0 : data partition
> partition_dim = 1 : model partition
> Users can override partition_dim for a specific layer in LayerProto in the same manner. This results in a hybrid partition.
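
The net-level default with a per-layer override described in the ticket can be sketched as follows. This is only an illustration in plain Python: the dictionaries stand in for NetProto and LayerProto, the `effective_partition_dim` helper is hypothetical, the layer names are illustrative AlexNet-style names, and the value -1 is the one proposed in the comment above rather than part of the ticket.

```python
def effective_partition_dim(net_conf, layer_conf):
    """Return the layer's own partition_dim if set, else the net default.

    Hypothetical helper; the dicts stand in for NetProto / LayerProto.
    """
    layer_dim = layer_conf.get("partition_dim")
    return net_conf["partition_dim"] if layer_dim is None else layer_dim


# Net default is data partition; two upper layers override it.
net = {
    "partition_dim": 0,
    "layers": [
        {"name": "conv1"},                       # inherits 0 (data partition)
        {"name": "fc6", "partition_dim": 1},     # model partition
        {"name": "fc8", "partition_dim": -1},    # kept on a single worker
    ],
}

plan = {layer["name"]: effective_partition_dim(net, layer)
        for layer in net["layers"]}
```

A net whose layers resolve to different values, as in `plan` here, is what the ticket calls a hybrid partition.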



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)