Posted to commits@singa.apache.org by wa...@apache.org on 2016/01/13 04:46:20 UTC

svn commit: r1724348 [5/6] - in /incubator/singa/site/trunk/content/markdown/docs: ./ jp/ kr/

Added: incubator/singa/site/trunk/content/markdown/docs/kr/model-config.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/kr/model-config.md?rev=1724348&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/kr/model-config.md (added)
+++ incubator/singa/site/trunk/content/markdown/docs/kr/model-config.md Wed Jan 13 03:46:19 2016
@@ -0,0 +1,294 @@
+# Model Configuration
+
+---
+
+SINGA uses the stochastic gradient descent (SGD) algorithm to train the parameters
+of deep learning models. For each SGD iteration, a
+[Worker](architecture.html) computes
+gradients of parameters from the NeuralNet and an [Updater]() updates parameter
+values based on those gradients. Hence the model configuration mainly consists of these
+three parts. We introduce the NeuralNet, Worker and Updater in the
+following sections and describe their configurations. All model
+configuration is specified in the model.conf file in the user-provided
+workspace folder. E.g., the [cifar10 example folder](https://github.com/apache/incubator-singa/tree/master/examples/cifar10)
+has a model.conf file.
+
+
+## NeuralNet
+
+### Uniform model (neuralnet) representation
+
+<img src = "../images/model-categorization.png" style = "width: 400px"> Fig. 1:
+Deep learning model categorization</img>
+
+Many deep learning models have been proposed. Fig. 1 categorizes
+popular deep learning models based on their layer connections. The
+[NeuralNet](https://github.com/apache/incubator-singa/blob/master/include/neuralnet/neuralnet.h)
+abstraction of SINGA consists of multiple layers connected by directed edges. This
+abstraction is able to represent models from all three categories.
+
+  * For the feed-forward models, their connections are already directed.
+
+  * For the RNN models, we unroll them into directed connections, as shown in
+  Fig. 3.
+
+  * For the undirected connections in RBM, DBM, etc., we replace each undirected
+  connection with two directed connections, as shown in Fig. 2.
+
+<div style = "height: 200px">
+<div style = "float:left; text-align: center">
+<img src = "../images/unroll-rbm.png" style = "width: 280px"> <br/>Fig. 2: Unroll RBM </img>
+</div>
+<div style = "float:left; text-align: center; margin-left: 40px">
+<img src = "../images/unroll-rnn.png" style = "width: 550px"> <br/>Fig. 3: Unroll RNN </img>
+</div>
+</div>
+
+Specifically, the NeuralNet class is defined in
+[neuralnet.h](https://github.com/apache/incubator-singa/blob/master/include/neuralnet/neuralnet.h) :
+
+    ...
+    vector<Layer*> layers_;
+    ...
+
+The Layer class is defined in
+[base_layer.h](https://github.com/apache/incubator-singa/blob/master/include/neuralnet/base_layer.h):
+
+    vector<Layer*> srclayers_, dstlayers_;
+    LayerProto layer_proto_;  // layer configuration, including meta info, e.g., name
+    ...
+
+
+The connections with other layers are kept in `srclayers_` and `dstlayers_`.
+Since there are many different feature transformations, there are
+correspondingly many different Layer implementations. Layers whose
+feature transformation functions have parameters hold Param
+instances in the layer class, e.g.,
+
+    Param weight;
+
+
+### Configure the structure of a NeuralNet instance
+
+To train a deep learning model, the first step is to write the configurations
+for the model structure, i.e., the layers and connections for the NeuralNet.
+Like [Caffe](http://caffe.berkeleyvision.org/), we use the [Google Protocol
+Buffer](https://developers.google.com/protocol-buffers/) to define the
+configuration protocol. The
+[NetProto](https://github.com/apache/incubator-singa/blob/master/src/proto/model.proto)
+specifies the configuration fields for a NeuralNet instance,
+
+    message NetProto {
+      repeated LayerProto layer = 1;
+      ...
+    }
+
+The configuration is then
+
+    layer {
+      // layer configuration
+    }
+    layer {
+      // layer configuration
+    }
+    ...
+
+To configure the model structure, we just configure each layer involved in the model.
+
+    message LayerProto {
+      // the layer name used for identification
+      required string name = 1;
+      // source layer names
+      repeated string srclayers = 3;
+      // parameters, e.g., weight matrix or bias vector
+      repeated ParamProto param = 12;
+      // the layer type from the enum above
+      required LayerType type = 20;
+      // configuration for convolution layer
+      optional ConvolutionProto convolution_conf = 30;
+      // configuration for concatenation layer
+      optional ConcateProto concate_conf = 31;
+      // configuration for dropout layer
+      optional DropoutProto dropout_conf = 33;
+      ...
+    }
+
+A sample configuration for a feed-forward model is like
+
+    layer {
+      name : "input"
+      type : kRecordInput
+    }
+    layer {
+      name : "conv"
+      type : kInnerProduct
+      srclayers : "input"
+      param {
+        // configuration for parameter
+      }
+      innerproduct_conf {
+        // configuration for this specific layer
+      }
+      ...
+    }
+
+The layer type list is defined in
+[LayerType](https://github.com/apache/incubator-singa/blob/master/src/proto/model.proto).
+One type (kFoo) corresponds to one child class of Layer (FooLayer) and one
+configuration field (foo_conf). All built-in layers are introduced in the [layer page](layer.html).
+
+## Worker
+
+At the beginning, the Worker will initialize the values of the Param instances of
+each layer either randomly (according to the user-configured distribution) or
+by loading from a [checkpoint file](). For each training iteration, the worker
+visits the layers of the neural network to compute gradients of the Param instances of
+each layer. Corresponding to the three categories of models, there are three
+different algorithms to compute the gradients of a neural network.
+
+  1. Back-propagation (BP) for feed-forward models
+  2. Back-propagation through time (BPTT) for recurrent neural networks
+  3. Contrastive divergence (CD) for RBM, DBM, etc. models.
+
+SINGA provides these three algorithms as three Worker implementations.
+Users only need to specify in the model.conf file which algorithm
+should be used. The configuration protocol is
+
+    message ModelProto {
+      ...
+      enum GradCalcAlg {
+        // BP algorithm for feed-forward models, e.g., CNN, MLP, RNN
+        kBP = 1;
+        // BPTT for recurrent neural networks
+        kBPTT = 2;
+        // CD algorithm for RBM, DBM, etc. models
+        kCd = 3;
+      }
+      // gradient calculation algorithm
+      required GradCalcAlg alg = 8 [default = kBP];
+      ...
+    }
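+
+For example, to train a feed-forward model with back-propagation, the algorithm could be selected in model.conf as follows (a minimal sketch with all other fields omitted):
+
+    # in model.conf
+    alg: kBP
+    ...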
+
+These algorithms override the TrainOneBatch function of the Worker. E.g., the
+BPWorker implements it as
+
+    void BPWorker::TrainOneBatch(int step, Metric* perf) {
+      Forward(step, kTrain, train_net_, perf);
+      Backward(step, train_net_);
+    }
+
+The Forward function passes the raw input features of one mini-batch through
+all layers, and the Backward function visits the layers in reverse order to
+compute the gradients of the loss w.r.t. each layer's feature and each layer's
+Param objects. Different algorithms visit the layers in different orders.
+Some may traverse the neural network multiple times, e.g., the CDWorker's
+TrainOneBatch function is:
+
+    void CDWorker::TrainOneBatch(int step, Metric* perf) {
+      PositivePhase(step, kTrain, train_net_, perf);
+      NegativePhase(step, kTrain, train_net_, perf);
+      GradientPhase(step, train_net_);
+    }
+
+Each `*Phase` function would visit all layers one or multiple times.
+All algorithms will finally call two functions of the Layer class:
+
+     /**
+      * Transform features from connected layers into features of this layer.
+      *
+      * @param phase kTrain, kTest, kPositive, etc.
+      */
+     virtual void ComputeFeature(Phase phase, Metric* perf) = 0;
+     /**
+      * Compute gradients for parameters (and connected layers).
+      *
+      * @param phase kTrain, kTest, kPositive, etc.
+      */
+     virtual void ComputeGradient(Phase phase) = 0;
+
+All [Layer implementations]() must implement the above two functions.
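+
+As an illustration, a hypothetical layer that simply scales the feature of its single source layer could implement the two functions roughly as sketched below. This is not a built-in SINGA layer; the Blob/Layer accessors (`data`, `mutable_grad`, `cpu_data`, `mutable_cpu_data`, `count`) are assumed names following the conventions used elsewhere in this documentation.
+
+    // sketch of a user-defined layer (assumed accessor names, see note above)
+    class ScaleLayer : public Layer {
+     public:
+      void ComputeFeature(Phase phase, Metric* perf) override {
+        const float* src = srclayers_[0]->data(this).cpu_data();  // source feature
+        float* dst = data_.mutable_cpu_data();                    // this layer's feature
+        for (int i = 0; i < data_.count(); ++i)
+          dst[i] = 2.0f * src[i];                                 // transformation: scale by 2
+      }
+      void ComputeGradient(Phase phase) override {
+        const float* gdst = grad_.cpu_data();                     // gradient w.r.t. this layer's feature
+        float* gsrc = srclayers_[0]->mutable_grad(this)->mutable_cpu_data();
+        for (int i = 0; i < grad_.count(); ++i)
+          gsrc[i] = 2.0f * gdst[i];                               // chain rule through the scale
+      }
+    };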
+
+
+## Updater
+
+Once the gradients of parameters are computed, the Updater will update
+parameter values.  There are many SGD variants for updating parameters, like
+[AdaDelta](http://arxiv.org/pdf/1212.5701v1.pdf),
+[AdaGrad](http://www.magicbroom.info/Papers/DuchiHaSi10.pdf),
+[RMSProp](http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf),
+[Nesterov](http://scholar.google.com/citations?view_op=view_citation&amp;hl=en&amp;user=DJ8Ep8YAAAAJ&amp;citation_for_view=DJ8Ep8YAAAAJ:hkOj_22Ku90C)
+and SGD with momentum. The core functions of the Updater are
+
+    /**
+     * Update parameter values based on gradients
+     * @param step training step
+     * @param param pointer to the Param object
+     * @param grad_scale scaling factor for the gradients
+     */
+    void Update(int step, Param* param, float grad_scale=1.0f);
+    /**
+     * @param step training step
+     * @return the learning rate for this step
+     */
+    float GetLearningRate(int step);
+
+SINGA provides several built-in updaters and learning rate change methods.
+Users can configure them according to the UpdaterProto
+
+    message UpdaterProto {
+      enum UpdaterType{
+        // normal SGD with momentum and weight decay
+        kSGD = 1;
+        // adaptive subgradient, http://www.magicbroom.info/Papers/DuchiHaSi10.pdf
+        kAdaGrad = 2;
+        // http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf
+        kRMSProp = 3;
+        // Nesterov first optimal gradient method
+        kNesterov = 4;
+      }
+      // updater type
+      required UpdaterType type = 1 [default=kSGD];
+      // configuration for RMSProp algorithm
+      optional RMSPropProto rmsprop_conf = 50;
+
+      enum ChangeMethod {
+        kFixed = 0;
+        kInverseT = 1;
+        kInverse = 2;
+        kExponential = 3;
+        kLinear = 4;
+        kStep = 5;
+        kFixedStep = 6;
+      }
+      // change method for learning rate
+      required ChangeMethod lr_change = 2 [default = kFixed];
+
+      optional FixedStepProto fixedstep_conf = 40;
+      ...
+      optional float momentum = 31 [default = 0];
+      optional float weight_decay = 32 [default = 0];
+      // base learning rate
+      optional float base_lr = 34 [default = 0];
+    }
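+
+Following the fields above, a configuration using plain SGD with momentum and a fixed learning rate might look like this (values are illustrative):
+
+    updater {
+      type: kSGD
+      momentum: 0.9
+      weight_decay: 0.0005
+      lr_change: kFixed
+      base_lr: 0.01
+    }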
+
+
+## Other model configuration fields
+
+Some other important configuration fields for training a deep learning model are
+listed below:
+
+    // model name, e.g., "cifar10-dcnn", "mnist-mlp"
+    string name;
+    // displaying training info for every this number of iterations, default is 0
+    int32 display_freq;
+    // total num of steps/iterations for training
+    int32 train_steps;
+    // do test for every this number of training iterations, default is 0
+    int32 test_freq;
+    // run test for this number of steps/iterations, default is 0.
+    // The test dataset has test_steps * batchsize instances.
+    int32 test_steps;
+    // do checkpoint for every this number of training steps, default is 0
+    int32 checkpoint_freq;
+
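+A sketch of how these fields might be filled in for a small job (values are illustrative):
+
+    name: "cifar10-dcnn"
+    train_steps: 1000
+    display_freq: 50
+    test_freq: 300
+    test_steps: 100
+    checkpoint_freq: 500
+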
+The [checkpoint and restore](checkpoint.html) page has details on checkpoint-related fields.

Added: incubator/singa/site/trunk/content/markdown/docs/kr/neural-net.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/kr/neural-net.md?rev=1724348&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/kr/neural-net.md (added)
+++ incubator/singa/site/trunk/content/markdown/docs/kr/neural-net.md Wed Jan 13 03:46:19 2016
@@ -0,0 +1,327 @@
+# Neural Net
+
+---
+
+`NeuralNet` in SINGA represents an instance of a user's neural net model. As a
+neural net typically consists of a set of layers, `NeuralNet` comprises
+a set of unidirectionally connected [Layer](layer.html)s.
+This page describes how to convert a user's neural net into
+the configuration of `NeuralNet`.
+
+<img src="../images/model-category.png" align="center" width="200px"/>
+<span><strong>Figure 1 - Categorization of popular deep learning models.</strong></span>
+
+## Net structure configuration
+
+Users configure the `NeuralNet` by listing all layers of the neural net and
+specifying each layer's source layer names. Popular deep learning models can be
+categorized as shown in Figure 1. The subsequent sections give details for each
+category.
+
+### Feed-forward models
+
+<div align = "left">
+<img src="../images/mlp-net.png" align="center" width="200px"/>
+<span><strong>Figure 2 - Net structure of a MLP model.</strong></span>
+</div>
+
+Feed-forward models, e.g., CNN and MLP, can easily be configured as their layer
+connections are directed without cycles. The
+configuration for the MLP model shown in Figure 2 is as follows,
+
+    net {
+      layer {
+        name : "data"
+        type : kData
+      }
+      layer {
+        name : "image"
+        type : kImage
+        srclayer: "data"
+      }
+      layer {
+        name : "label"
+        type : kLabel
+        srclayer: "data"
+      }
+      layer {
+        name : "hidden"
+        type : kHidden
+        srclayer: "image"
+      }
+      layer {
+        name : "softmax"
+        type : kSoftmaxLoss
+        srclayer: "hidden"
+        srclayer: "label"
+      }
+    }
+
+### Energy models
+
+<img src="../images/rbm-rnn.png" align="center" width="500px"/>
+<span><strong>Figure 3 - Convert connections in RBM and RNN.</strong></span>
+
+
+For energy models including RBM, DBM,
+etc., their connections are undirected (i.e., Category B). To represent these models using
+`NeuralNet`, users can simply replace each connection with two directed
+connections, as shown in Figure 3a. In other words, for each pair of connected layers, their source
+layer field should include each other's name.
+The full [RBM example](rbm.html) has
+a detailed neural net configuration for an RBM model, which looks like
+
+    net {
+      layer {
+        name : "vis"
+        type : kVisLayer
+        param {
+          name : "w1"
+        }
+        srclayer: "hid"
+      }
+      layer {
+        name : "hid"
+        type : kHidLayer
+        param {
+          name : "w2"
+          share_from: "w1"
+        }
+        srclayer: "vis"
+      }
+    }
+
+### RNN models
+
+For recurrent neural networks (RNN), users can remove the recurrent connections
+by unrolling the recurrent layer. For example, in Figure 3b, the original
+layer is unrolled into a new layer with 4 internal layers. In this way, the
+model becomes like a normal feed-forward model and can thus be configured similarly.
+The [RNN example](rnn.html) has a full neural net
+configuration for an RNN model.
+
+
+## Configuration for multiple nets
+
+Typically, a training job includes three neural nets for
+the training, validation and test phases respectively. The three neural nets share most
+layers except the data layer, loss layer, output layer, etc. To avoid
+redundant configurations for the shared layers, users can use the `exclude`
+field to filter a layer out of a neural net, e.g., the following layer will be
+filtered out when creating the test `NeuralNet`.
+
+
+    layer {
+      ...
+      exclude : kTest # filter this layer for creating test net
+    }
+
+
+
+## Neural net partitioning
+
+A neural net can be partitioned in different ways to distribute the training
+over multiple workers.
+
+### Batch and feature dimension
+
+<img src="../images/partition_fc.png" align="center" width="400px"/>
+<span><strong>Figure 4 - Partitioning of a fully connected layer.</strong></span>
+
+
+Every layer's feature blob is considered a matrix whose rows are feature
+vectors. Thus, one layer can be split on two dimensions. Partitioning on
+dimension 0 (also called batch dimension) slices the feature matrix by rows.
+For instance, if the mini-batch size is 256 and the layer is partitioned into 2
+sub-layers, each sub-layer would have 128 feature vectors in its feature blob.
+Partitioning on this dimension has no effect on the parameters, as every
+[Param](param.html) object is replicated in the sub-layers. Partitioning on dimension
+1 (also called feature dimension) slices the feature matrix by columns. For
+example, suppose the original feature vector has 50 units, after partitioning
+into 2 sub-layers, each sub-layer would have 25 units. This partitioning may
+result in [Param](param.html) objects being split, as shown in
+Figure 4. Both the bias vector and the weight matrix are
+partitioned into the two sub-layers.
+
+
+### Partitioning configuration
+
+There are 4 partitioning schemes, whose configurations are given below,
+
+  1. Partitioning each single layer into sub-layers on the batch dimension (see
+  below). It is enabled by configuring the partition dimension of the layer to
+  0, e.g.,
+
+          # with other fields omitted
+          layer {
+            partition_dim: 0
+          }
+
+  2. Partitioning each single layer into sub-layers on the feature dimension (see
+  below).  It is enabled by configuring the partition dimension of the layer to
+  1, e.g.,
+
+          # with other fields omitted
+          layer {
+            partition_dim: 1
+          }
+
+  3. Partitioning all layers into different subsets. It is enabled by
+  configuring the location ID of a layer, e.g.,
+
+          # with other fields omitted
+          layer {
+            location: 1
+          }
+          layer {
+            location: 0
+          }
+
+
+  4. Hybrid partitioning of strategies 1, 2 and 3. The hybrid partitioning is
+  useful for large models. An example application is to implement the
+  [idea proposed by Alex](http://arxiv.org/abs/1404.5997).
+  Hybrid partitioning is configured like,
+
+          # with other fields omitted
+          layer {
+            location: 1
+          }
+          layer {
+            location: 0
+          }
+          layer {
+            partition_dim: 0
+            location: 0
+          }
+          layer {
+            partition_dim: 1
+            location: 0
+          }
+
+Currently SINGA supports strategy-2 well. Other partitioning strategies
+are under test and will be released in a later version.
+
+## Parameter sharing
+
+Parameters can be shared in the following cases,
+
+  * sharing parameters among layers via user configuration. For example, the
+  visible layer and hidden layer of an RBM share the weight matrix, which is configured through
+  the `share_from` field as shown in the above RBM configuration. The
+  configurations must be the same (except name) for shared parameters.
+
+  * due to neural net partitioning, some `Param` objects are replicated into
+  different workers, e.g., partitioning one layer on batch dimension. These
+  workers share parameter values. SINGA controls this kind of parameter
+  sharing automatically, users do not need to do any configuration.
+
+  * the `NeuralNet` for training and testing (and validation) share most layers
+  , thus share `Param` values.
+
+If the shared `Param` instances reside in the same process (possibly in different
+threads), they use the same chunk of memory space for their values. But they
+have different memory spaces for their gradients. In fact, their
+gradients will be averaged by the stub or server.
+
+## Advanced user guide
+
+### Creation
+
+    static NeuralNet* NeuralNet::Create(const NetProto& np, Phase phase, int num);
+
+The above function creates a `NeuralNet` for a given phase, and returns a
+pointer to the `NeuralNet` instance. The phase is in {kTrain,
+kValidation, kTest}. `num` is used for net partitioning which indicates the
+number of partitions.  Typically, a training job includes three neural nets for
+the training, validation and test phases respectively. The three neural nets share most
+layers except the data layer, loss layer, output layer, etc. The `Create`
+function takes in the full net configuration including layers for training,
+validation and test.  It removes layers for phases other than the specified
+phase based on the `exclude` field in
+[layer configuration](layer.html):
+
+    layer {
+      ...
+      exclude : kTest # filter this layer for creating test net
+    }
+
+The filtered net configuration is passed to the constructor of `NeuralNet`:
+
+    NeuralNet::NeuralNet(NetProto netproto, int npartitions);
+
+The constructor first creates a graph representing the net structure in
+
+    Graph* NeuralNet::CreateGraph(const NetProto& netproto, int npartitions);
+
+Next, it creates a layer for each node and connects layers if their nodes are
+connected.
+
+    void NeuralNet::CreateNetFromGraph(Graph* graph, int npartitions);
+
+Since the `NeuralNet` instance may be shared among multiple workers, the
+`Create` function returns a pointer to the `NeuralNet` instance.
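+
+For instance, a driver could create and connect the nets for two phases roughly as follows (a sketch only; `net_conf` is assumed to hold the full NetProto read from the job configuration):
+
+    // sketch: build nets for two phases from the same (full) NetProto
+    NeuralNet* train_net = NeuralNet::Create(net_conf, kTrain, 1);
+    NeuralNet* test_net = NeuralNet::Create(net_conf, kTest, 1);
+    // share Param values between the two nets (see the following subsections)
+    test_net->ShareParamsFrom(train_net);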
+
+### Parameter sharing
+
+`Param` sharing
+is enabled by first sharing the Param configuration (in `NeuralNet::Create`)
+to create two similar (e.g., the same shape) Param objects, and then calling
+(in `NeuralNet::CreateNetFromGraph`),
+
+    void Param::ShareFrom(const Param& from);
+
+It is also possible to share `Param`s of two nets, e.g., sharing parameters of
+the training net and the test net,
+
+    void NeuralNet::ShareParamsFrom(NeuralNet* other);
+
+It will call `Param::ShareFrom` for each Param object.
+
+### Access functions
+`NeuralNet` provides a couple of access functions to get the layers and params
+of the net:
+
+    const std::vector<Layer*>& layers() const;
+    const std::vector<Param*>& params() const;
+    Layer* name2layer(string name) const;
+    Param* paramid2param(int id) const;
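+
+A brief usage sketch of these accessors (hypothetical example; `net` is assumed to point to a created `NeuralNet`):
+
+    const std::vector<Layer*>& layers = net->layers();
+    const std::vector<Param*>& params = net->params();
+    Layer* hidden = net->name2layer("hidden");   // look up a layer by its configured name
+    Param* p = net->paramid2param(0);            // look up a Param by its id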
+
+
+### Partitioning
+
+
+#### Implementation
+
+SINGA partitions the neural net in the `CreateGraph` function, which creates one
+node for each (partitioned) layer. For example, if one layer's partition
+dimension is 0 or 1, then it creates `npartitions` nodes for it; if the
+partition dimension is -1, a single node is created, i.e., no partitioning.
+Each node is assigned a partition (or location) ID. If the original layer is
+configured with a location ID, then that ID is assigned to each newly created node.
+These nodes are connected according to the connections of the original layers.
+Some connection layers are added automatically.
+For instance, if two connected sub-layers are located at two
+different workers, then a pair of bridge layers is inserted to transfer the
+feature (and gradient) blob between them. When two layers are partitioned on
+different dimensions, a concatenation layer which concatenates feature rows (or
+columns) and a slice layer which slices feature rows (or columns) would be
+inserted. These connection layers help make the network communication and
+synchronization transparent to the users.
+
+#### Dispatching partitions to workers
+
+Each (partitioned) layer is assigned a location ID, based on which it is dispatched to one
+worker. Particularly, the pointer to the `NeuralNet` instance is passed
+to every worker within the same group, but each worker only computes over the
+layers that have the same partition (or location) ID as the worker's ID. When
+every worker computes the gradients of the entire set of model parameters
+(strategy-1), we refer to this process as data parallelism. When different
+workers compute the gradients of different parameters (strategy-2 or
+strategy-3), we call this process model parallelism. The hybrid partitioning
+leads to hybrid parallelism, where some workers compute the gradients of the
+same subset of model parameters while other workers compute on different model
+parameters. For example, to implement the hybrid parallelism for the
+[DCNN model](http://arxiv.org/abs/1404.5997), we set `partition_dim = 0` for
+lower layers and `partition_dim = 1` for higher layers.
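+
+Under that assumption about which layers are "lower" and "higher" (the layer names below are illustrative), the configuration would look roughly like:
+
+    # lower (e.g., convolutional) layers: data parallelism
+    layer {
+      name: "conv1"
+      partition_dim: 0
+      ...
+    }
+    # higher (e.g., fully connected) layers: model parallelism
+    layer {
+      name: "fc1"
+      partition_dim: 1
+      ...
+    }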
+

Added: incubator/singa/site/trunk/content/markdown/docs/kr/neuralnet-partition.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/kr/neuralnet-partition.md?rev=1724348&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/kr/neuralnet-partition.md (added)
+++ incubator/singa/site/trunk/content/markdown/docs/kr/neuralnet-partition.md Wed Jan 13 03:46:19 2016
@@ -0,0 +1,54 @@
+# Neural Net Partition
+
+---
+
+The purpose of partitioning a neural network is to distribute the partitions onto
+different working units (e.g., threads or nodes, called workers in this article)
+and parallelize the processing.
+Another reason for partitioning is to handle large neural networks which cannot be
+held in a single node. For instance, to train models against images with high
+resolution we need large neural networks (in terms of training parameters).
+
+Since *Layer* is the first-class citizen in SINGA, we do the partitioning against
+layers. Specifically, we support partitioning at two levels. First, users can configure
+the location (i.e., worker ID) of each layer. In this way, users assign one worker
+to each layer. Secondly, for one layer, we can partition its neurons or partition
+the instances (e.g., images). These are called layer partition and data partition
+respectively. We illustrate the two types of partitions using a simple convolutional neural network.
+
+<img src="../images/conv-mnist.png" style="width: 220px"/>
+
+The above figure shows a convolutional neural network without any partitioning. It
+has 8 layers in total (one rectangle represents one layer). The first layer is a
+DataLayer (data) which reads data from local disk files/databases (or HDFS). The second layer
+is a MnistLayer which parses the records from the MNIST data to get the pixels of a batch
+of 8 images (each image is of size 28x28). The LabelLayer (label) parses the records to get the label
+of each image in the batch. The ConvolutionalLayer (conv1) transforms the input image to the
+shape of 8x27x27. The ReLULayer (relu1) conducts elementwise transformations. The PoolingLayer (pool1)
+sub-samples the images. The fc1 layer is fully connected with the pool1 layer. It
+multiplies each image with a weight matrix to generate a 10-dimensional hidden feature which
+is then normalized by a SoftmaxLossLayer to get the prediction.
+
+<img src="../images/conv-mnist-datap.png" style="width: 1000px"/>
+
+The above figure shows the convolutional neural network after partitioning all layers,
+except the DataLayer and ParserLayers, into 3 partitions using data partition.
+The red layers process 4 images of the batch, while the black and blue layers process 2 images
+each. Some helper layers, i.e., SliceLayer, ConcateLayer, BridgeSrcLayer,
+BridgeDstLayer and SplitLayer, are added automatically by our partition algorithm.
+Layers of the same color reside in the same worker. Data is transferred
+across different workers at the boundary layers (i.e., BridgeSrcLayer and BridgeDstLayer),
+e.g., between s-slice-mnist-conv1 and d-slice-mnist-conv1.
+
+<img src="../images/conv-mnist-layerp.png" style="width: 1000px"/>
+
+The above figure shows the convolutional neural network after partitioning all layers,
+except the DataLayer and ParserLayers, into 2 partitions using layer partition. We can
+see that each layer processes all 8 images from the batch, but different partitions process
+different parts of each image. For instance, the layer conv1-00 processes only 4 channels. The other
+4 channels are processed by conv1-01, which resides in another worker.
+
+
+Since the partitioning is done at the layer level, we can apply different partitions to
+different layers to get a hybrid partition for the whole neural network. Moreover,
+we can also specify layer locations to assign different layers to different workers.
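+
+As a rough illustration (using the `location` and `partition_dim` layer fields described on the [neural net](neural-net.html) page; the layer names are hypothetical), such a hybrid configuration could look like:
+
+    layer {
+      name: "conv1"
+      partition_dim: 0   # data partition: split the batch
+      location: 0        # dispatch to worker 0
+    }
+    layer {
+      name: "fc1"
+      partition_dim: 1   # layer partition: split the neurons
+      location: 1        # dispatch to worker 1
+    }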

Added: incubator/singa/site/trunk/content/markdown/docs/kr/overview.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/kr/overview.md?rev=1724348&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/kr/overview.md (added)
+++ incubator/singa/site/trunk/content/markdown/docs/kr/overview.md Wed Jan 13 03:46:19 2016
@@ -0,0 +1,67 @@
+# Overview
+
+---
+
+SINGA is a distributed deep learning platform designed for training deep learning models over large-scale data analytics workloads.
+It is designed so that programming follows intuitively from the "layer" abstraction of the neural network being modeled.
+
+* It supports a variety of models: feed-forward networks such as convolutional neural networks, energy models such as restricted Boltzmann machines, and recurrent neural network models.
+
+* A variety of layers are provided as built-in layers.
+
+* The SINGA architecture is designed to support synchronous, asynchronous and hybrid training.
+
+* Several partitioning schemes (batch and feature partitioning) are supported to parallelize the training of large models.
+
+
+## Goals
+
+Scalability: as a distributed system, use more resources to speed up training until a target accuracy is reached.
+
+Usability: simplify the programmer's work, e.g., the data and model partitioning and network communication required for efficient training of large distributed models, and make it easy to build complex models and algorithms.
+
+
+## Design principles
+
+Scalability is an important research problem in distributed deep learning.
+SINGA is designed to preserve the scalability of various training frameworks:
+* Synchronous: improves the efficiency of one training iteration.
+* Asynchronous: improves the convergence rate of training.
+* Hybrid: balances efficiency and convergence rate according to cost and resources (e.g., cluster size), improving scalability.
+
+SINGA is designed so that programming follows intuitively from the layer abstraction of deep learning models. A variety of models can be easily built and trained.
+
+## System overview
+
+<img src="../images/sgd.png" align="center" width="400px"/>
+<span><strong>Figure 1 - SGD flow.</strong></span>
+
+"Training a deep learning model" means finding the optimal parameters of the transformation functions that generate the features used to accomplish a specific task (classification, prediction, etc.).
+The quality of the parameters is measured by a loss function such as [Cross-Entropy Loss](https://en.wikipedia.org/wiki/Cross_entropy). Since this function is usually non-linear or non-convex, it is difficult to find a closed-form solution.
+
+Therefore, stochastic gradient descent (SGD) is used.
+As shown in Figure 1, randomly initialized parameter values are iteratively updated so that the loss function decreases.
+
+<img src="../images/overview.png" align="center" width="400px"/>
+<span><strong>Figure 2 - SINGA overview.</strong></span>
+
+The training workload is distributed over workers and servers. As shown in Figure 2, in each iteration the workers call the *TrainOneBatch* function to compute parameter gradients.
+*TrainOneBatch* visits the layers one by one according to the *NeuralNet* object that describes the neural net structure.
+The computed gradients are sent to the stub of the local node, aggregated, and sent to the corresponding servers. The servers send the updated parameters back to the workers for the next iteration.
+
+
+## Job
+
+In SINGA, a "job" refers to a "job configuration" that describes the neural net model, the data, the training method, the cluster topology, and so on.
+The job configuration has the following four components drawn in Figure 2 (see the sketch after this list).
+
+  * [NeuralNet](neural-net.html): describes the neural net structure and the settings of each layer.
+  * [TrainOneBatch](train-one-batch.html): describes the algorithm suited to the model category.
+  * [Updater](updater.html): describes how parameters are updated on the server side.
+  * [Cluster Topology](distributed-training.html): describes the distributed topology of workers and servers.
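+
+As a rough sketch only (field names and their placement follow the configuration examples elsewhere in this documentation and are partly assumed; values are illustrative), a job configuration combines these components like:
+
+    # job.conf (illustrative skeleton)
+    neuralnet {
+      layer { ... }        # NeuralNet: layers and their connections
+      # more layers ...
+    }
+    alg: kBP               # TrainOneBatch algorithm (assumed placement, see the model configuration page)
+    updater {
+      type: kSGD           # Updater: how servers update parameters
+    }
+    cluster {
+      nworker_groups: 1    # Cluster Topology: workers and servers
+      workspace: "examples/cifar10/"
+    }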
+
+The job is passed to the SINGA driver in the [main function](programming-guide.html).
+
+This process is similar to job submission in Hadoop.
+Users configure the job in the main function.
+Hadoop users configure their own mappers and reducers, whereas SINGA users configure their own layers, updaters, and so on.

Added: incubator/singa/site/trunk/content/markdown/docs/kr/param.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/kr/param.md?rev=1724348&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/kr/param.md (added)
+++ incubator/singa/site/trunk/content/markdown/docs/kr/param.md Wed Jan 13 03:46:19 2016
@@ -0,0 +1,226 @@
+# Parameters
+
+---
+
+A `Param` object in SINGA represents a set of parameters, e.g., a weight matrix
+or a bias vector. *Basic user guide* describes how to configure for a `Param`
+object, and *Advanced user guide* provides details on implementing users'
+parameter initialization methods.
+
+## Basic user guide
+
+The configuration of a Param object is inside a layer configuration, as
+`Param` objects are associated with layers. An example configuration looks like
+
+    layer {
+      ...
+      param {
+        name : "p1"
+        init {
+          type : kConstant
+          value: 1
+        }
+      }
+    }
+
+The [SGD algorithm](overview.html) starts by initializing all
+parameters according to the user-specified initialization method (the `init` field).
+For the above example,
+all parameters in `Param` "p1" will be initialized to the constant value 1. The
+configuration fields of a Param object are defined in [ParamProto](../api/classsinga_1_1ParamProto.html):
+
+  * name, an identifier string. It is an optional field. If not provided, SINGA
+  will generate one based on layer name and its order in the layer.
+  * init, field for setting initialization methods.
+  * share_from, name of another `Param` object, from which this `Param` will share
+  configurations and values.
+  * lr_scale, float value to be multiplied with the learning rate when
+  [updating the parameters](updater.html)
+  * wd_scale, float value to be multiplied with the weight decay when
+  [updating the parameters](updater.html)
+
+There are some other fields that are specific to initialization methods.
+
+### Initialization methods
+
+Users can set the `type` of `init` using the following built-in initialization
+methods,
+
+  * `kConstant`, set all parameters of the Param object to a constant value
+
+        type: kConstant
+        value: float  # default is 1
+
+  * `kGaussian`, initialize the parameters following a Gaussian distribution.
+
+        type: kGaussian
+        mean: float # mean of the Gaussian distribution, default is 0
+        std: float # standard deviation, default is 1
+        value: float # default 0
+
+  * `kUniform`, initialize the parameters following a uniform distribution
+
+        type: kUniform
+        low: float # lower boundary, default is -1
+        high: float # upper boundary, default is 1
+        value: float # default 0
+
+  * `kGaussianSqrtFanIn`, initialize `Param` objects with two dimensions (i.e.,
+  matrices) using `kGaussian` and then
+  multiply each parameter by `1/sqrt(fan_in)`, where `fan_in` is the number of
+  columns of the matrix.
+
+  * `kUniformSqrtFanIn`, the same as `kGaussianSqrtFanIn` except that the
+  distribution is a uniform distribution.
+
+  * `kUniformFanInOut`, initialize matrix `Param` objects using `kUniform` and then
+  multiply each parameter by `sqrt(6/(fan_in + fan_out))`, where `fan_in +
+  fan_out` sums up the number of columns and rows of the matrix.
+
+For all the above initialization methods except `kConstant`, if `value` is not
+1, every parameter will be multiplied by `value`. Users can also implement
+their own initialization methods following the *Advanced user guide*.
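+
+For example, a weight matrix initialized from a Gaussian distribution could be configured as follows (values are illustrative):
+
+    param {
+      name: "w1"
+      init {
+        type: kGaussian
+        mean: 0
+        std: 0.1
+      }
+      lr_scale: 1.0
+    }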
+
+
+## Advanced user guide
+
+This section describes the details of implementing new parameter
+initialization methods.
+
+### Base ParamGenerator
+All initialization methods are implemented as
+subclasses of the base `ParamGenerator` class.
+
+    class ParamGenerator {
+     public:
+      virtual void Init(const ParamGenProto&);
+      void Fill(Param*);
+
+     protected:
+      ParamGenProto proto_;
+    };
+
+The configuration of the initialization method is in `ParamGenProto`. The `Fill`
+function fills the `Param` object (passed in as an argument).
+
+### New ParamGenerator subclass
+
+Similar to implementing a new Layer subclass, users can define a configuration
+protocol message,
+
+    # in user.proto
+    message FooParamProto {
+      optional int32 x = 1;
+    }
+    extend ParamGenProto {
+      optional FooParamProto fooparam_conf = 101;
+    }
+
+The configuration of `Param` would be
+
+    param {
+      ...
+      init {
+        user_type: 'FooParam" # must use user_type for user defined methods
+        [fooparam_conf] { # must use brackets for configuring user defined messages
+          x: 10
+        }
+      }
+    }
+
+The subclass could be declared as,
+
+    class FooParamGen : public ParamGenerator {
+     public:
+      void Fill(Param*) override;
+    };
+
+Users can access the configuration fields in `Fill` by
+
+    int x = proto_.GetExtension(fooparam_conf).x();
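+
+Putting these pieces together, a minimal `Fill` implementation might look like the sketch below; `mutable_data`, `count` and `mutable_cpu_data` are assumed Param/Blob accessor names, not necessarily the exact API.
+
+    // sketch only: fill every element of the Param with the configured x
+    void FooParamGen::Fill(Param* param) {
+      int x = proto_.GetExtension(fooparam_conf).x();
+      Blob<float>* data = param->mutable_data();      // assumed accessor
+      float* v = data->mutable_cpu_data();
+      for (int i = 0; i < data->count(); ++i)
+        v[i] = static_cast<float>(x);
+    }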
+
+To use the new initialization method, users need to register it in the
+[main function](programming-guide.html).
+
+    driver.RegisterParamGenerator<FooParamGen>("FooParam");  // must be consistent with the user_type in the configuration
+
+{% comment %}
+### Base Param class
+
+### Members
+
+    int local_version_;
+    int slice_start_;
+    vector<int> slice_offset_, slice_size_;
+
+    shared_ptr<Blob<float>> data_;
+    Blob<float> grad_;
+    ParamProto proto_;
+
+Each Param object has a local version and a global version (inside the data
+Blob). These two versions are used for synchronization. If multiple Param
+objects share the same values, they would have the same `data_` field.
+Consequently, their global version is the same. The global version is updated
+by [the stub thread](communication.html). The local version is
+updated in `Worker::Update` function which assigns the global version to the
+local version. The `Worker::Collect` function is blocked until the global
+version is larger than the local version, i.e., when `data_` is updated. In
+this way, we synchronize workers sharing parameters.
+
+In Deep learning models, some Param objects are 100 times larger than others.
+To ensure the load-balance among servers, SINGA slices large Param objects. The
+slicing information is recorded by `slice_*`. Each slice is assigned a unique
+ID starting from 0. `slice_start_` is the ID of the first slice of this Param
+object. `slice_offset_[i]` is the offset of the i-th slice in this Param
+object. `slice_size_[i]` is the size of the i-th slice. This slice information
+is used to create messages for transferring parameter values or gradients to
+different servers.
+
+Each Param object has a `grad_` field for gradients. Param objects do not share
+this Blob although they may share `data_`, because each layer containing a
+Param object contributes gradients. E.g., in RNN, the recurrent layers
+share parameter values, and the gradients used for updating are averaged over all
+these recurrent layers. In SINGA, the stub thread will aggregate local
+gradients for the same Param object. The server will do a global aggregation
+of gradients for the same Param object.
+
+The `proto_` field has some meta information, e.g., name and ID. It also has a
+field called `owner` which is the ID of the Param object that shares parameter
+values with others.
+
+### Functions
+The base Param class implements two sets of functions,
+
+    virtual void InitValues(int version = 0);  // initialize values according to `init_method`
+    void ShareFrom(const Param& other);  // share `data_` from `other` Param
+    --------------
+    virtual Msg* GenGetMsg(bool copy, int slice_idx);
+    virtual Msg* GenPutMsg(bool copy, int slice_idx);
+    ... // other message related functions.
+
+Besides the functions for processing the parameter values, there is a set of
+functions for generating and parsing messages. These messages are for
+transferring parameter values or gradients between workers and servers. Each
+message corresponds to one Param slice. If `copy` is false, it means the
+receiver of this message is in the same process as the sender. In such case,
+only pointers to the memory of parameter value (or gradient) are wrapped in
+the message; otherwise, the parameter values (or gradients) should be copied
+into the message.
+
+
+## Implementing a Param subclass
+Users can extend the base Param class to implement their own parameter
+initialization methods and message transferring protocols. Similar to
+implementing a new Layer subclass, users can create Google protocol buffer
+messages for configuring the Param subclass. The subclass, denoted as FooParam,
+should be registered in main.cc,
+
+    driver.RegisterParam<FooParam>(kFooParam);  // kFooParam should be different from 0, which is for the base Param type
+
+
+  * type, an integer representing the `Param` type. Currently SINGA provides one
+    `Param` implementation with type 0 (the default type). If users want
+    to use their own Param implementation, they should extend the base Param
+    class and configure this field with `kUserParam`
+
+{% endcomment %}

Added: incubator/singa/site/trunk/content/markdown/docs/kr/programmer-guide.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/kr/programmer-guide.md?rev=1724348&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/kr/programmer-guide.md (added)
+++ incubator/singa/site/trunk/content/markdown/docs/kr/programmer-guide.md Wed Jan 13 03:46:19 2016
@@ -0,0 +1,98 @@
+# Programmer Guide
+
+---
+
+To submit a training job, users must provide the configuration of the
+four components shown in Figure 1:
+
+  * a [NeuralNet](neural-net.html) describing the neural net structure with the detailed layer setting and their connections;
+  * a [TrainOneBatch](train-one-batch.html) algorithm which is tailored for different model categories;
+  * an [Updater](updater.html) defining the protocol for updating parameters at the server side;
+  * a [Cluster Topology](distributed-training.html) specifying the distributed architecture of workers and servers.
+
+The *Basic user guide* section describes how to submit a training job using
+built-in components; while the *Advanced user guide* section presents details
+on writing user's own main function to register components implemented by
+themselves. In addition, the training data must be prepared, which has the same
+[process](data.html) for both advanced users and basic users.
+
+<img src="../images/overview.png" align="center" width="400px"/>
+<span><strong>Figure 1 - SINGA overview.</strong></span>
+
+
+
+## Basic user guide
+
+Users can use the default main function provided by SINGA to submit the training
+job. In this case, a job configuration file written as a Google protocol
+buffer message for the [JobProto](../api/classsinga_1_1JobProto.html) must be provided on the command line,
+
+    ./bin/singa-run.sh -conf <path to job conf> [-resume] [-test]
+
+* `-resume` is for continuing the training from the last [checkpoint](checkpoint.html).
+* `-test` is for testing the performance of a previously trained model and extracting features for new data;
+more details are available [here](test.html).
+
+The [MLP](mlp.html) and [CNN](cnn.html)
+examples use built-in components. Please read the corresponding pages for their
+job configuration files. The subsequent pages will illustrate the details on
+each component of the configuration.
+
+## Advanced user guide
+
+If a user's model contains some user-defined components, e.g., an
+[Updater](updater.html), the user has to write a main function to
+register these components. This is similar to Hadoop's main function. Generally,
+the main function should
+
+  * initialize SINGA, e.g., setup logging.
+
+  * register user-defined components.
+
+  * create and pass the job configuration to SINGA driver
+
+An example main function is like
+
+    #include <string>
+    #include "singa.h"
+    #include "user.h"  // header for user code
+
+    int main(int argc, char** argv) {
+      singa::Driver driver;
+      driver.Init(argc, argv);
+      bool resume;
+      // parse resume option from argv.
+
+      // register user defined layers
+      driver.RegisterLayer<FooLayer, std::string>("kFooLayer");
+      // register user defined updater
+      driver.RegisterUpdater<FooUpdater, std::string>("kFooUpdater");
+      ...
+      auto jobConf = driver.job_conf();
+      //  update jobConf
+
+      driver.Submit(resume, jobConf);
+      return 0;
+    }
+
+The Driver class' `Init` method will load a job configuration file provided by
+users as a command line argument (`-conf <job conf>`). It contains at least the
+cluster topology and returns the `jobConf` for users to update or fill in
+configurations of neural net, updater, etc. If users define subclasses of
+Layer, Updater, Worker and Param, they should register them through the driver.
+Finally, the job configuration is submitted to the driver which starts the
+training.
+
+We will provide helper functions to make the configuration easier in the
+future, like [keras](https://github.com/fchollet/keras).
+
+Users need to compile and link their code (e.g., layer implementations and the main
+file) with the SINGA library (*.libs/libsinga.so*) to generate an
+executable file, e.g., with the name *mysinga*. To launch the program, users just pass the
+path of *mysinga* and the base job configuration to *./bin/singa-run.sh*.
+
+    ./bin/singa-run.sh -conf <path to job conf> -exec <path to mysinga> [other arguments]
+
+The [RNN application](rnn.html) provides a full example of
+implementing the main function for training a specific RNN model.
+

Added: incubator/singa/site/trunk/content/markdown/docs/kr/programming-guide.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/kr/programming-guide.md?rev=1724348&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/kr/programming-guide.md (added)
+++ incubator/singa/site/trunk/content/markdown/docs/kr/programming-guide.md Wed Jan 13 03:46:19 2016
@@ -0,0 +1,95 @@
+# Programming Guide
+
+---
+
+To submit a training job, users must provide the configuration of the
+four components shown in Figure 1:
+
+  * a [NeuralNet](neural-net.html) describing the neural net structure with the detailed layer setting and their connections;
+  * a [TrainOneBatch](train-one-batch.html) algorithm which is tailored for different model categories;
+  * an [Updater](updater.html) defining the protocol for updating parameters at the server side;
+  * a [Cluster Topology](distributed-training.html) specifying the distributed architecture of workers and servers.
+
+The *Basic user guide* section describes how to submit a training job using
+built-in components; while the *Advanced user guide* section presents details
+on writing user's own main function to register components implemented by
+themselves. In addition, the training data must be prepared, which has the same
+[process](data.html) for both advanced users and basic users.
+
+<img src="../images/overview.png" align="center" width="400px"/>
+<span><strong>Figure 1 - SINGA overview.</strong></span>
+
+
+
+## Basic user guide
+
+Users can use the default main function provided by SINGA to submit the training
+job. In this case, a job configuration file written as a Google protocol
+buffer message for the [JobProto](../api/classsinga_1_1JobProto.html) must be provided on the command line,
+
+    ./bin/singa-run.sh -conf <path to job conf> [-resume]
+
+`-resume` is for continuing the training from the last
+[checkpoint](checkpoint.html).
+The [MLP](mlp.html) and [CNN](cnn.html)
+examples use built-in components. Please read the corresponding pages for their
+job configuration files. The subsequent pages will illustrate the details on
+each component of the configuration.
+
+## Advanced user guide
+
+If a user's model contains some user-defined components, e.g., an
+[Updater](updater.html), the user has to write a main function to
+register these components. This is similar to Hadoop's main function. Generally,
+the main function should
+
+  * initialize SINGA, e.g., setup logging.
+
+  * register user-defined components.
+
+  * create and pass the job configuration to SINGA driver
+
+
+An example main function is like
+
+    #include "singa.h"
+    #include "user.h"  // header for user code
+
+    int main(int argc, char** argv) {
+      singa::Driver driver;
+      driver.Init(argc, argv);
+      bool resume;
+      // parse resume option from argv.
+
+      // register user defined layers
+      driver.RegisterLayer<FooLayer>(kFooLayer);
+      // register user defined updater
+      driver.RegisterUpdater<FooUpdater>(kFooUpdater);
+      ...
+      auto jobConf = driver.job_conf();
+      //  update jobConf
+
+      driver.Train(resume, jobConf);
+      return 0;
+    }
+
+The Driver class' `Init` method will load a job configuration file provided by
+users as a command line argument (`-conf <job conf>`). It contains at least the
+cluster topology and returns the `jobConf` for users to update or fill in
+configurations of neural net, updater, etc. If users define subclasses of
+Layer, Updater, Worker and Param, they should register them through the driver.
+Finally, the job configuration is submitted to the driver which starts the
+training.
+
+We will provide helper functions to make the configuration easier in the
+future, like [keras](https://github.com/fchollet/keras).
+
+Users need to compile and link their code (e.g., layer implementations and the main
+file) with the SINGA library (*.libs/libsinga.so*) to generate an
+executable file, e.g., with the name *mysinga*. To launch the program, users just pass the
+path of *mysinga* and the base job configuration to *./bin/singa-run.sh*.
+
+    ./bin/singa-run.sh -conf <path to job conf> -exec <path to mysinga> [other arguments]
+
+The [RNN application](rnn.html) provides a full example of
+implementing the main function for training a specific RNN model.

Added: incubator/singa/site/trunk/content/markdown/docs/kr/quick-start.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/kr/quick-start.md?rev=1724348&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/kr/quick-start.md (added)
+++ incubator/singa/site/trunk/content/markdown/docs/kr/quick-start.md Wed Jan 13 03:46:19 2016
@@ -0,0 +1,177 @@
+# Quick Start
+
+---
+
+## Installing SINGA
+
+See the [installation page](installation.html) for installing SINGA.
+
+### Starting Zookeeper
+
+SINGA training uses [zookeeper](https://zookeeper.apache.org/). First make sure the zookeeper service has been started.
+
+If you installed zookeeper using the provided thirdparty scripts, run the following commands.
+
+    #goto top level folder
+    cd SINGA_ROOT
+    ./bin/zk-service.sh start
+
+(`./bin/zk-service.sh stop` stops the zookeeper service.)
+
+To start zookeeper on a port other than the default one, edit `conf/singa.conf`:
+
+    zookeeper_host : "localhost:YOUR_PORT"
+
+## Running in standalone mode
+
+Running SINGA in standalone mode means running it without a cluster manager such as [Mesos](http://mesos.apache.org/) or [YARN](http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html).
+
+### Training on a single node
+
+A single process is launched.
+As an example, we train the
+[CNN model](http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks)
+over the [CIFAR-10](http://www.cs.toronto.edu/~kriz/cifar.html) dataset.
+The hyper-parameters follow [cuda-convnet](https://code.google.com/p/cuda-convnet/).
+See the [CNN example](cnn.html) page for details.
+
+
+#### Data and job configuration
+
+Download the dataset and create the data shards for training and testing as follows.
+
+    cd examples/cifar10/
+    cp Makefile.example Makefile
+    make download
+    make create
+
+The training and test datasets are created in the *cifar10-train-shard*
+and *cifar10-test-shard* folders respectively. An *image_mean.bin* file, which describes the feature mean of all images, is also generated.
+
+All source code needed for training the CNN model is included in SINGA; no extra code has to be added.
+The script (*../../bin/singa-run.sh*) is run with the job configuration file (*job.conf*).
+To change or add code in SINGA, see the [programming guide](programming-guide.html).
+
+#### Training without parallelism
+
+By default, the cluster topology has one worker and one server.
+Data and neural net are not parallelized.
+
+To start training, run the following script:
+
+    # goto top level folder
+    cd ../../
+    ./bin/singa-run.sh -conf examples/cifar10/job.conf
+
+
+To list the currently running jobs:
+
+    ./bin/singa-console.sh list
+
+    JOB ID | NUM PROCS
+    ---------- | -----------
+    24 | 1
+
+To kill a job:
+
+    ./bin/singa-console.sh kill JOB_ID
+
+
+Logs and job information are stored in the */tmp/singa-log* folder,
+which can be changed via the `log-dir` field in the *conf/singa.conf* file.
+
+
+#### Asynchronous parallel training
+
+    # job.conf
+    ...
+    cluster {
+      nworker_groups : 2
+      nworkers_per_procs : 2
+      workspace : "examples/cifar10/"
+    }
+
+Asynchronous [training](architecture.html) can be performed in SINGA
+by launching multiple worker groups.
+For example, change *job.conf* as shown above.
+By default, one worker group is configured to have one worker.
+Since the configuration above assigns two workers to one process, two worker groups will run in the same process.
+As a result, this runs as the in-memory [Downpour](frameworks.html) training framework.
+
+Users do not need to worry about distributing the data.
+Data is dispatched to each worker group according to a random offset.
+Each worker handles a different data partition.
+
+    # job.conf
+    ...
+    neuralnet {
+      layer {
+        ...
+        sharddata_conf {
+          random_skip : 5000
+        }
+      }
+      ...
+    }
+
+Run the script:
+
+    ./bin/singa-run.sh -conf examples/cifar10/job.conf
+
+#### Synchronous parallel training
+
+    # job.conf
+    ...
+    cluster {
+      nworkers_per_group : 2
+      nworkers_per_procs : 2
+      workspace : "examples/cifar10/"
+    }
+
+Synchronous [training](architecture.html) can be performed by launching multiple workers within one worker group.
+For example, change the *job.conf* file as shown above.
+The configuration above assigns two workers to one worker group.
+The workers synchronize with each other within the group.
+This runs as the in-memory [sandblaster](frameworks.html) framework.
+The model is partitioned among the two workers; each layer is dispatched over the two workers.
+Each dispatched layer has the same functionality as the original layer, but handles only `B/g` feature instances,
+where `B` is the number of instances in a mini-batch and `g` is the number of workers in the group.
+Other [partitioning schemes](neural-net.html) for the layers (neural net) are also available.
+
+All other settings are the same as in the "without parallelism" case.
+
+    ./bin/singa-run.sh -conf examples/cifar10/job.conf
+
+### Training on a cluster
+
+Extend the above training frameworks to a cluster by changing the cluster configuration:
+
+    nworkers_per_procs : 1
+
+Every process then creates only one worker thread.
+As a result, the workers are created in different processes (nodes).
+To specify the nodes of the cluster, configure the *hostfile* in *SINGA_ROOT/conf/*,
+
+e.g.,
+
+    logbase-a01
+    logbase-a02
+
+The zookeeper location must also be configured,
+
+e.g.,
+
+    # conf/singa.conf
+    zookeeper_host : "logbase-a01"
+
+The script is run in the same way as for single-node training:
+
+    ./bin/singa-run.sh -conf examples/cifar10/job.conf
+
+## Running with Mesos
+
+*working*...
+
+## Next
+
+See the [programming guide](programming-guide.html) for details on changing or adding code in SINGA.

Added: incubator/singa/site/trunk/content/markdown/docs/kr/rbm.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/kr/rbm.md?rev=1724348&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/kr/rbm.md (added)
+++ incubator/singa/site/trunk/content/markdown/docs/kr/rbm.md Wed Jan 13 03:46:19 2016
@@ -0,0 +1,365 @@
+# RBM Example
+
+---
+
+This example uses SINGA to train 4 RBM models and one auto-encoder model over the
+[MNIST dataset](http://yann.lecun.com/exdb/mnist/). The auto-encoder model is trained
+to reduce the dimensionality of the MNIST image feature. The RBM models are trained
+to initialize parameters of the auto-encoder model. This example application is
+from [Hinton's science paper](http://www.cs.toronto.edu/~hinton/science.pdf).
+
+## Running instructions
+
+Running scripts are provided in the *SINGA_ROOT/examples/rbm* folder.
+
+The MNIST dataset has 70,000 handwritten digit images. The
+[data preparation](data.html) page
+has details on converting this dataset into a SINGA-recognizable format. Users can
+simply run the following commands to download and convert the dataset.
+
+    # at SINGA_ROOT/examples/mnist/
+    $ cp Makefile.example Makefile
+    $ make download
+    $ make create
+
+The training is separated into two phases, namely pre-training and fine-tuning.
+The pre-training phase trains 4 RBMs in sequence,
+
+    # at SINGA_ROOT/
+    $ ./bin/singa-run.sh -conf examples/rbm/rbm1.conf
+    $ ./bin/singa-run.sh -conf examples/rbm/rbm2.conf
+    $ ./bin/singa-run.sh -conf examples/rbm/rbm3.conf
+    $ ./bin/singa-run.sh -conf examples/rbm/rbm4.conf
+
+The fine-tuning phase trains the auto-encoder by,
+
+    $ ./bin/singa-run.sh -conf examples/rbm/autoencoder.conf
+
+
+## Training details
+
+### RBM1
+
+<img src="../images/example-rbm1.png" align="center" width="200px"/>
+<span><strong>Figure 1 - RBM1.</strong></span>
+
+The neural net structure for training RBM1 is shown in Figure 1.
+The data layer and parser layer provides features for training RBM1.
+The visible layer (connected with parser layer) of RBM1 accepts the image feature
+(784 dimension). The hidden layer is set to have 1000 neurons (units).
+These two layers are configured as,
+
+    layer{
+      name: "RBMVis"
+      type: kRBMVis
+      srclayers:"mnist"
+      srclayers:"RBMHid"
+      rbm_conf{
+        hdim: 1000
+      }
+      param{
+        name: "w1"
+        init{
+          type: kGaussian
+          mean: 0.0
+          std: 0.1
+        }
+      }
+      param{
+        name: "b11"
+        init{
+          type: kConstant
+          value: 0.0
+        }
+      }
+    }
+
+    layer{
+      name: "RBMHid"
+      type: kRBMHid
+      srclayers:"RBMVis"
+      rbm_conf{
+        hdim: 1000
+      }
+      param{
+        name: "w1_"
+        share_from: "w1"
+      }
+      param{
+        name: "b12"
+        init{
+          type: kConstant
+          value: 0.0
+        }
+      }
+    }
+
+
+
+For an RBM, the weight matrix is shared by the visible and hidden layers. For instance,
+`w1` is shared by the `vis` and `hid` layers shown in Figure 1. In SINGA, we can configure
+the `share_from` field to enable [parameter sharing](param.html),
+as shown above for the params `w1` and `w1_`.
+
+[Contrastive Divergence](train-one-batch.html#contrastive-divergence)
+is configured as the algorithm for [TrainOneBatch](train-one-batch.html).
+Following Hinton's paper, we configure the [updating protocol](updater.html)
+as follows,
+
+    # Updater Configuration
+    updater{
+      type: kSGD
+      momentum: 0.2
+      weight_decay: 0.0002
+      learning_rate{
+        base_lr: 0.1
+        type: kFixed
+      }
+    }
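+
+For reference, the update implied by the configuration above can be sketched as follows,
+assuming the textbook CD-1 gradient estimate and the standard momentum/weight-decay SGD
+rule (the exact sign and scaling conventions of SINGA's kCD and kSGD implementations may
+differ slightly):
+
+    \Delta W \approx \langle v h^{\top}\rangle_{\text{data}} - \langle v h^{\top}\rangle_{\text{recon}}
+    u_{t+1} = \mu\, u_t + \eta\,(\Delta W - \lambda\, W_t), \qquad W_{t+1} = W_t + u_{t+1}
+
+where \mu = 0.2 (momentum), \eta = 0.1 (base_lr) and \lambda = 0.0002 (weight_decay).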
+
+Since the parameters of RBM1 will be used to initialize the auto-encoder, we should
+configure the `workspace` field to specify a path for the checkpoint folder.
+For example, if we configure it as,
+
+    cluster {
+      workspace: "examples/rbm/rbm1/"
+    }
+
+Then SINGA will [checkpoint the parameters](checkpoint.html) into *examples/rbm/rbm1/*.
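+
+For instance, after the RBM1 job finishes (assuming it runs for 6000 steps, which matches
+the checkpoint file name used later in this example), the checkpoint folder would contain
+something like:
+
+    $ ls examples/rbm/rbm1/checkpoint/
+    step6000-worker0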
+
+### RBM2
+<img src="../images/example-rbm2.png" align="center" width="200px"/>
+<span><strong>Figure 2 - RBM2.</strong></span>
+
+Figure 2 shows the net structure for training RBM2.
+The visible units of RBM2 accept the output from the Sigmoid1 layer. The Inner1 layer
+is an `InnerProductLayer` whose parameters are set to the `w1` and `b12` learned
+from RBM1.
+The neural net configuration is (with the data layer and parser layer omitted),
+
+    layer{
+      name: "Inner1"
+      type: kInnerProduct
+      srclayers:"mnist"
+      innerproduct_conf{
+        num_output: 1000
+      }
+      param{ name: "w1" }
+      param{ name: "b12"}
+    }
+
+    layer{
+      name: "Sigmoid1"
+      type: kSigmoid
+      srclayers:"Inner1"
+    }
+
+    layer{
+      name: "RBMVis"
+      type: kRBMVis
+      srclayers:"Sigmoid1"
+      srclayers:"RBMHid"
+      rbm_conf{
+        hdim: 500
+      }
+      param{
+        name: "w2"
+        ...
+      }
+      param{
+        name: "b21"
+        ...
+      }
+    }
+
+    layer{
+      name: "RBMHid"
+      type: kRBMHid
+      srclayers:"RBMVis"
+      rbm_conf{
+        hdim: 500
+      }
+      param{
+        name: "w2_"
+        share_from: "w2"
+      }
+      param{
+        name: "b22"
+        ...
+      }
+    }
+
+To load `w1` and `b12` from RBM1's checkpoint file, we configure the `checkpoint_path` as,
+
+    checkpoint_path: "examples/rbm/rbm1/checkpoint/step6000-worker0"
+    cluster{
+      workspace: "examples/rbm/rbm2"
+    }
+
+The workspace is changed for checkpointing `w2`, `b21` and `b22` into
+*examples/rbm/rbm2/*.
+
+### RBM3
+
+<img src="../images/example-rbm3.png" align="center" width="200px"/>
+<span><strong>Figure 3 - RBM3.</strong></span>
+
+Figure 3 shows the net structure for training RBM3. In this model, a layer with
+250 units is added as the hidden layer of RBM3. The visible units of RBM3
+accept the output from the Sigmoid2 layer. The parameters of Inner1 and Inner2 are set to
+`w1, b12, w2, b22`, which can be loaded from the checkpoint file of RBM2 under
+*examples/rbm/rbm2/*, as sketched below.
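+
+A minimal sketch of the corresponding settings for rbm3.conf, assuming RBM2 was also
+checkpointed at step 6000 (the actual step count depends on how long rbm2.conf trains):
+
+    checkpoint_path: "examples/rbm/rbm2/checkpoint/step6000-worker0"
+    cluster{
+      workspace: "examples/rbm/rbm3"
+    }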
+
+### RBM4
+
+
+<img src="../images/example-rbm4.png" align="center" width="200px"/>
+<span><strong>Figure 4 - RBM4.</strong></span>
+
+Figure 4 shows the net structure for training RBM4. It is similar to Figure 3,
+but according to [Hinton's science paper](http://www.cs.toronto.edu/~hinton/science.pdf), the hidden units of the
+top RBM (RBM4) have stochastic real-valued states drawn from a unit variance
+Gaussian whose mean is determined by the input from the RBM's logistic visible
+units. So we add a `gaussian` field in the RBMHid layer to control the
+sampling distribution (Gaussian or Bernoulli). In addition, this
+RBM has a much smaller learning rate (0.001). The neural net configuration for
+RBM4 and the updating protocol are (with the data layer and parser layer
+omitted),
+
+    # Updater Configuration
+    updater{
+      type: kSGD
+      momentum: 0.9
+      weight_decay: 0.0002
+      learning_rate{
+        base_lr: 0.001
+        type: kFixed
+      }
+    }
+
+    layer{
+      name: "RBMVis"
+      type: kRBMVis
+      srclayers:"Sigmoid3"
+      srclayers:"RBMHid"
+      rbm_conf{
+        hdim: 30
+      }
+      param{
+        name: "w4"
+        ...
+      }
+      param{
+        name: "b41"
+        ...
+      }
+    }
+
+    layer{
+      name: "RBMHid"
+      type: kRBMHid
+      srclayers:"RBMVis"
+      rbm_conf{
+        hdim: 30
+        gaussian: true
+      }
+      param{
+        name: "w4_"
+        share_from: "w4"
+      }
+      param{
+        name: "b42"
+        ...
+      }
+    }
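+
+A sketch of what the `gaussian` flag changes, written in standard RBM notation (this
+follows Hinton's description; the exact sampling code is in SINGA's RBMHid layer
+implementation):
+
+    % gaussian: false -- binary hidden units sampled from a Bernoulli distribution
+    p(h_j = 1 \mid v) = \sigma\bigl(b_j + \textstyle\sum_i v_i w_{ij}\bigr)
+    % gaussian: true  -- real-valued hidden units sampled from a unit-variance Gaussian
+    h_j \mid v \sim \mathcal{N}\bigl(b_j + \textstyle\sum_i v_i w_{ij},\; 1\bigr)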
+
+### Auto-encoder
+In the fine-tuning stage, the 4 RBMs are "unfolded" to form encoder and decoder
+networks that are initialized using the parameters from the previous 4 RBMs.
+
+<img src="../images/example-autoencoder.png" align="center" width="500px"/>
+<span><strong>Figure 5 - Auto-Encoders.</strong></span>
+
+
+Figure 5 shows the neural net structure for training the auto-encoder.
+[Back propagation (kBP)](train-one-batch.html) is
+configured as the algorithm for `TrainOneBatch`. We use the same cluster
+configuration as for the RBM models. For the updater, we use the
+[AdaGrad](updater.html#adagradupdater) algorithm with a fixed learning rate.
+
+    # Updater Configuration
+    updater{
+      type: kAdaGrad
+      learning_rate{
+        base_lr: 0.01
+        type: kFixed
+      }
+    }
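+
+For reference, a sketch of the update rule an AdaGrad updater typically implements, with
+g_t the gradient and \eta the fixed base_lr of 0.01 (the exact \epsilon constant and any
+weight-decay handling depend on SINGA's kAdaGrad implementation):
+
+    G_t = G_{t-1} + g_t \odot g_t
+    w_{t+1} = w_t - \frac{\eta}{\sqrt{G_t} + \epsilon} \odot g_t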
+
+
+
+According to [Hinton's science paper](http://www.cs.toronto.edu/~hinton/science.pdf),
+we configure an EuclideanLoss layer to compute the reconstruction error. The neural net
+configuration is (with some of the middle layers omitted),
+
+    layer{ name: "data" }
+    layer{ name:"mnist" }
+    layer{
+      name: "Inner1"
+      param{ name: "w1" }
+      param{ name: "b12" }
+    }
+    layer{ name: "Sigmoid1" }
+    ...
+    layer{
+      name: "Inner8"
+      innerproduct_conf{
+        num_output: 784
+        transpose: true
+      }
+      param{
+        name: "w8"
+        share_from: "w1"
+      }
+      param{ name: "b11" }
+    }
+    layer{ name: "Sigmoid8" }
+
+    # Euclidean Loss Layer Configuration
+    layer{
+      name: "loss"
+      type:kEuclideanLoss
+      srclayers:"Sigmoid8"
+      srclayers:"mnist"
+    }
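+
+The loss layer above measures the squared Euclidean distance between the reconstruction
+(the Sigmoid8 output \hat{x}_i) and the original image feature (the mnist output x_i); a
+sketch, assuming the common 1/(2n) mini-batch averaging (the exact scaling is defined by
+SINGA's EuclideanLoss implementation):
+
+    L = \frac{1}{2n} \sum_{i=1}^{n} \lVert \hat{x}_i - x_i \rVert_2^2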
+
+To load the pre-trained parameters from the 4 RBMs' checkpoint files, we configure `checkpoint_path` as
+
+    # Checkpoint Configuration
+    checkpoint_path: "examples/rbm/rbm1/checkpoint/step6000-worker0"
+    checkpoint_path: "examples/rbm/rbm2/checkpoint/step6000-worker0"
+    checkpoint_path: "examples/rbm/rbm3/checkpoint/step6000-worker0"
+    checkpoint_path: "examples/rbm/rbm4/checkpoint/step6000-worker0"
+
+
+## Visualization Results
+
+<div>
+<img src="../images/rbm-weight.PNG" align="center" width="300px"/>
+
+<img src="../images/rbm-feature.PNG" align="center" width="300px"/>
+<br/>
+<span><strong>Figure 6 - Bottom RBM weight matrix.</strong></span>
+&nbsp;
+&nbsp;
+&nbsp;
+&nbsp;
+
+<span><strong>Figure 7 - Top layer features.</strong></span>
+</div>
+
+Figure 6 visualizes sample columns of the weight matrix of RBM1. We can see that
+Gabor-like filters are learned. Figure 7 depicts the features extracted from
+the top-layer of the auto-encoder, wherein one point represents one image.
+Different colors represent different digits. We can see that most images are
+well clustered according to the ground truth.