Posted to commits@singa.apache.org by wa...@apache.org on 2016/01/13 04:46:20 UTC

svn commit: r1724348 [5/6] - in /incubator/singa/site/trunk/content/markdown/docs: ./ jp/ kr/

Added: incubator/singa/site/trunk/content/markdown/docs/kr/model-config.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/kr/model-config.md?rev=1724348&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/kr/model-config.md (added)
+++ incubator/singa/site/trunk/content/markdown/docs/kr/model-config.md Wed Jan 13 03:46:19 2016
@@ -0,0 +1,294 @@
+# Model Configuration
+
+---
+
+SINGA uses the stochastic gradient descent (SGD) algorithm to train the parameters
+of deep learning models. For each SGD iteration, a
+[Worker](architecture.html) computes
+gradients of parameters from the NeuralNet and an [Updater]() updates parameter
+values based on those gradients. Hence the model configuration mainly consists of these
+three parts. We introduce the NeuralNet, Worker and Updater in the
+following sections and describe their configurations. All model
+configuration is specified in the model.conf file in the user-provided
+workspace folder. E.g., the [cifar10 example folder](https://github.com/apache/incubator-singa/tree/master/examples/cifar10)
+has a model.conf file.
+
+
+## NeuralNet
+
+### Uniform model (neuralnet) representation
+
+<img src = "../images/model-categorization.png" style = "width: 400px"> Fig. 1:
+Deep learning model categorization</img>
+
+Many deep learning models have been proposed. Fig. 1 categorizes
+popular deep learning models based on their layer connections. The
+[NeuralNet](https://github.com/apache/incubator-singa/blob/master/include/neuralnet/neuralnet.h)
+abstraction of SINGA consists of multiple layers connected by directed edges. This
+abstraction is able to represent models from all three categories.
+
+  * For the feed-forward models, their connections are already directed.
+
+  * For the RNN models, we unroll them into directed connections, as shown in
+  Fig. 3.
+
+  * For the undirected connections in RBM, DBM, etc., we replace each undirected
+  connection with two directed connections, as shown in Fig. 2.
+
+<div style = "height: 200px">
+<div style = "float:left; text-align: center">
+<img src = "../images/unroll-rbm.png" style = "width: 280px"> <br/>Fig. 2: Unroll RBM </img>
+</div>
+<div style = "float:left; text-align: center; margin-left: 40px">
+<img src = "../images/unroll-rnn.png" style = "width: 550px"> <br/>Fig. 3: Unroll RNN </img>
+</div>
+</div>
+
+Specifically, the NeuralNet class is defined in
+[neuralnet.h](https://github.com/apache/incubator-singa/blob/master/include/neuralnet/neuralnet.h) :
+
+    ...
+    vector<Layer*> layers_;
+    ...
+
+The Layer class is defined in
+[base_layer.h](https://github.com/apache/incubator-singa/blob/master/include/neuralnet/base_layer.h):
+
+    vector<Layer*> srclayers_, dstlayers_;
+    LayerProto layer_proto_;  // layer configuration, including meta info, e.g., name
+    ...
+
+
+The connections with other layers are kept in `srclayers_` and `dstlayers_`.
+Since there are many different feature transformations, there are
+correspondingly many different Layer implementations. Layers whose
+feature transformation functions have parameters hold Param
+instances in the layer class, e.g.,
+
+    Param weight;
+
+
+### Configure the structure of a NeuralNet instance
+
+To train a deep learning model, the first step is to write the configurations
+for the model structure, i.e., the layers and connections for the NeuralNet.
+Like [Caffe](http://caffe.berkeleyvision.org/), we use the [Google Protocol
+Buffer](https://developers.google.com/protocol-buffers/) to define the
+configuration protocol. The
+[NetProto](https://github.com/apache/incubator-singa/blob/master/src/proto/model.proto)
+specifies the configuration fields for a NeuralNet instance,
+
+    message NetProto {
+      repeated LayerProto layer = 1;
+      ...
+    }
+
+The configuration is then
+
+    layer {
+      // layer configuration
+    }
+    layer {
+      // layer configuration
+    }
+    ...
+
+To configure the model structure, we just configure each layer involved in the model.
+
+    message LayerProto {
+      // the layer name used for identification
+      required string name = 1;
+      // source layer names
+      repeated string srclayers = 3;
+      // parameters, e.g., weight matrix or bias vector
+      repeated ParamProto param = 12;
+      // the layer type from the enum above
+      required LayerType type = 20;
+      // configuration for convolution layer
+      optional ConvolutionProto convolution_conf = 30;
+      // configuration for concatenation layer
+      optional ConcateProto concate_conf = 31;
+      // configuration for dropout layer
+      optional DropoutProto dropout_conf = 33;
+      ...
+    }
+
+A sample configuration for a feed-forward model is like
+
+    layer {
+      name : "input"
+      type : kRecordInput
+    }
+    layer {
+      name : "conv"
+      type : kInnerProduct
+      srclayers : "input"
+      param {
+        // configuration for parameter
+      }
+      innerproduct_conf {
+        // configuration for this specific layer
+      }
+      ...
+    }
+
+The layer type list is defined in
+[LayerType](https://github.com/apache/incubator-singa/blob/master/src/proto/model.proto).
+One type (kFoo) corresponds to one child class of Layer (FooLayer) and one
+configuration field (foo_conf). All built-in layers are introduced in the [layer page](layer.html).
+
+## Worker
+
+At the beginning, the Worker will initialize the values of the Param instances of
+each layer either randomly (according to the user-configured distribution) or
+by loading from a [checkpoint file](). For each training iteration, the worker
+visits the layers of the neural network to compute gradients of the Param instances of
+each layer. Corresponding to the three categories of models, there are three
+different algorithms to compute the gradients of a neural network.
+
+  1. Back-propagation (BP) for feed-forward models
+  2. Back-propagation through time (BPTT) for recurrent neural networks
+  3. Contrastive divergence (CD) for RBM, DBM, etc. models.
+
+SINGA provides these three algorithms as three Worker implementations.
+Users only need to specify in the model.conf file which algorithm
+should be used. The configuration protocol is
+
+    message ModelProto {
+      ...
+      enum GradCalcAlg {
+        // BP algorithm for feed-forward models, e.g., CNN, MLP, RNN
+        kBP = 1;
+        // BPTT for recurrent neural networks
+        kBPTT = 2;
+        // CD algorithm for RBM, DBM, etc. models
+        kCd = 3;
+      }
+      // gradient calculation algorithm
+      required GradCalcAlg alg = 8 [default = kBP];
+      ...
+    }
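+
+For example, to train a feed-forward model with back-propagation, the algorithm could be selected in model.conf as follows (a minimal sketch with all other fields omitted):
+
+    # in model.conf
+    alg: kBP
+    ...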
+
+These algorithms override the TrainOneBatch function of the Worker. E.g., the
+BPWorker implements it as
+
+    void BPWorker::TrainOneBatch(int step, Metric* perf) {
+      Forward(step, kTrain, train_net_, perf);
+      Backward(step, train_net_);
+    }
+
+The Forward function passes the raw input features of one mini-batch through
+all layers, and the Backward function visits the layers in reverse order to
+compute the gradients of the loss w.r.t. each layer's feature and each layer's
+Param objects. Different algorithms visit the layers in different orders.
+Some may traverse the neural network multiple times, e.g., the CDWorker's
+TrainOneBatch function is:
+
+    void CDWorker::TrainOneBatch(int step, Metric* perf) {
+      PositivePhase(step, kTrain, train_net_, perf);
+      NegativePhase(step, kTrain, train_net_, perf);
+      GradientPhase(step, train_net_);
+    }
+
+Each `*Phase` function would visit all layers one or multiple times.
+All algorithms will finally call two functions of the Layer class:
+
+     /**
+      * Transform features from connected layers into features of this layer.
+      *
+      * @param phase kTrain, kTest, kPositive, etc.
+      */
+     virtual void ComputeFeature(Phase phase, Metric* perf) = 0;
+     /**
+      * Compute gradients for parameters (and connected layers).
+      *
+      * @param phase kTrain, kTest, kPositive, etc.
+      */
+     virtual void ComputeGradient(Phase phase) = 0;
+
+All [Layer implementations]() must implement the above two functions.
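+
+As an illustration, a hypothetical layer that simply scales the feature of its single source layer could implement the two functions roughly as sketched below. This is not a built-in SINGA layer; the Blob/Layer accessors (`data`, `mutable_grad`, `cpu_data`, `mutable_cpu_data`, `count`) are assumed names following the conventions used elsewhere in this documentation.
+
+    // sketch of a user-defined layer (assumed accessor names, see note above)
+    class ScaleLayer : public Layer {
+     public:
+      void ComputeFeature(Phase phase, Metric* perf) override {
+        const float* src = srclayers_[0]->data(this).cpu_data();  // source feature
+        float* dst = data_.mutable_cpu_data();                    // this layer's feature
+        for (int i = 0; i < data_.count(); ++i)
+          dst[i] = 2.0f * src[i];                                 // transformation: scale by 2
+      }
+      void ComputeGradient(Phase phase) override {
+        const float* gdst = grad_.cpu_data();                     // gradient w.r.t. this layer's feature
+        float* gsrc = srclayers_[0]->mutable_grad(this)->mutable_cpu_data();
+        for (int i = 0; i < grad_.count(); ++i)
+          gsrc[i] = 2.0f * gdst[i];                               // chain rule through the scale
+      }
+    };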
+
+
+## Updater
+
+Once the gradients of parameters are computed, the Updater will update
+parameter values.  There are many SGD variants for updating parameters, like
+[AdaDelta](http://arxiv.org/pdf/1212.5701v1.pdf),
+[AdaGrad](http://www.magicbroom.info/Papers/DuchiHaSi10.pdf),
+[RMSProp](http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf),
+[Nesterov](http://scholar.google.com/citations?view_op=view_citation&amp;hl=en&amp;user=DJ8Ep8YAAAAJ&amp;citation_for_view=DJ8Ep8YAAAAJ:hkOj_22Ku90C)
+and SGD with momentum. The core functions of the Updater are
+
+    /**
+     * Update parameter values based on gradients
+     * @param step training step
+     * @param param pointer to the Param object
+     * @param grad_scale scaling factor for the gradients
+     */
+    void Update(int step, Param* param, float grad_scale=1.0f);
+    /**
+     * @param step training step
+     * @return the learning rate for this step
+     */
+    float GetLearningRate(int step);
+
+SINGA provides several built-in updaters and learning rate change methods.
+Users can configure them according to the UpdaterProto
+
+    message UpdaterProto {
+      enum UpdaterType{
+        // normal SGD with momentum and weight decay
+        kSGD = 1;
+        // adaptive subgradient, http://www.magicbroom.info/Papers/DuchiHaSi10.pdf
+        kAdaGrad = 2;
+        // http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf
+        kRMSProp = 3;
+        // Nesterov first optimal gradient method
+        kNesterov = 4;
+      }
+      // updater type
+      required UpdaterType type = 1 [default=kSGD];
+      // configuration for RMSProp algorithm
+      optional RMSPropProto rmsprop_conf = 50;
+
+      enum ChangeMethod {
+        kFixed = 0;
+        kInverseT = 1;
+        kInverse = 2;
+        kExponential = 3;
+        kLinear = 4;
+        kStep = 5;
+        kFixedStep = 6;
+      }
+      // change method for learning rate
+      required ChangeMethod lr_change = 2 [default = kFixed];
+
+      optional FixedStepProto fixedstep_conf = 40;
+      ...
+      optional float momentum = 31 [default = 0];
+      optional float weight_decay = 32 [default = 0];
+      // base learning rate
+      optional float base_lr = 34 [default = 0];
+    }
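+
+Following the fields above, a configuration using plain SGD with momentum and a fixed learning rate might look like this (values are illustrative):
+
+    updater {
+      type: kSGD
+      momentum: 0.9
+      weight_decay: 0.0005
+      lr_change: kFixed
+      base_lr: 0.01
+    }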
+
+
+## Other model configuration fields
+
+Some other important configuration fields for training a deep learning model are
+listed below:
+
+    // model name, e.g., "cifar10-dcnn", "mnist-mlp"
+    string name;
+    // displaying training info for every this number of iterations, default is 0
+    int32 display_freq;
+    // total num of steps/iterations for training
+    int32 train_steps;
+    // do test for every this number of training iterations, default is 0
+    int32 test_freq;
+    // run test for this number of steps/iterations, default is 0.
+    // The test dataset has test_steps * batchsize instances.
+    int32 test_steps;
+    // do checkpoint for every this number of training steps, default is 0
+    int32 checkpoint_freq;
+
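+A sketch of how these fields might be filled in for a small job (values are illustrative):
+
+    name: "cifar10-dcnn"
+    train_steps: 1000
+    display_freq: 50
+    test_freq: 300
+    test_steps: 100
+    checkpoint_freq: 500
+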
+The [checkpoint and restore](checkpoint.html) page has details on checkpoint-related fields.

Added: incubator/singa/site/trunk/content/markdown/docs/kr/neural-net.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/kr/neural-net.md?rev=1724348&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/kr/neural-net.md (added)
+++ incubator/singa/site/trunk/content/markdown/docs/kr/neural-net.md Wed Jan 13 03:46:19 2016
@@ -0,0 +1,327 @@
+# Neural Net
+
+---
+
+`NeuralNet` in SINGA represents an instance of a user's neural net model. As a
+neural net typically consists of a set of layers, `NeuralNet` comprises
+a set of unidirectionally connected [Layer](layer.html)s.
+This page describes how to convert a user's neural net into
+the configuration of `NeuralNet`.
+
+<img src="../images/model-category.png" align="center" width="200px"/>
+<span><strong>Figure 1 - Categorization of popular deep learning models.</strong></span>
+
+## Net structure configuration
+
+Users configure the `NeuralNet` by listing all layers of the neural net and
+specifying each layer's source layer names. Popular deep learning models can be
+categorized as shown in Figure 1. The subsequent sections give details for each
+category.
+
+### Feed-forward models
+
+<div align = "left">
+<img src="../images/mlp-net.png" align="center" width="200px"/>
+<span><strong>Figure 2 - Net structure of a MLP model.</strong></span>
+</div>
+
+Feed-forward models, e.g., CNN and MLP, can easily be configured as their layer
+connections are directed without cycles. The
+configuration for the MLP model shown in Figure 2 is as follows,
+
+    net {
+      layer {
+        name : "data"
+        type : kData
+      }
+      layer {
+        name : "image"
+        type : kImage
+        srclayer: "data"
+      }
+      layer {
+        name : "label"
+        type : kLabel
+        srclayer: "data"
+      }
+      layer {
+        name : "hidden"
+        type : kHidden
+        srclayer: "image"
+      }
+      layer {
+        name : "softmax"
+        type : kSoftmaxLoss
+        srclayer: "hidden"
+        srclayer: "label"
+      }
+    }
+
+### Energy models
+
+<img src="../images/rbm-rnn.png" align="center" width="500px"/>
+<span><strong>Figure 3 - Convert connections in RBM and RNN.</strong></span>
+
+
+For energy models including RBM, DBM,
+etc., their connections are undirected (i.e., Category B). To represent these models using
+`NeuralNet`, users can simply replace each connection with two directed
+connections, as shown in Figure 3a. In other words, for each pair of connected layers, their source
+layer field should include each other's name.
+The full [RBM example](rbm.html) has
+a detailed neural net configuration for an RBM model, which looks like
+
+    net {
+      layer {
+        name : "vis"
+        type : kVisLayer
+        param {
+          name : "w1"
+        }
+        srclayer: "hid"
+      }
+      layer {
+        name : "hid"
+        type : kHidLayer
+        param {
+          name : "w2"
+          share_from: "w1"
+        }
+        srclayer: "vis"
+      }
+    }
+
+### RNN models
+
+For recurrent neural networks (RNN), users can remove the recurrent connections
+by unrolling the recurrent layer. For example, in Figure 3b, the original
+layer is unrolled into a new layer with 4 internal layers. In this way, the
+model becomes like a normal feed-forward model and can thus be configured similarly.
+The [RNN example](rnn.html) has a full neural net
+configuration for an RNN model.
+
+
+## Configuration for multiple nets
+
+Typically, a training job includes three neural nets for
+the training, validation and test phases respectively. The three neural nets share most
+layers except the data layer, loss layer, output layer, etc. To avoid
+redundant configurations for the shared layers, users can use the `exclude`
+field to filter a layer out of a neural net, e.g., the following layer will be
+filtered out when creating the test `NeuralNet`.
+
+
+    layer {
+      ...
+      exclude : kTest # filter this layer for creating test net
+    }
+
+
+
+## Neural net partitioning
+
+A neural net can be partitioned in different ways to distribute the training
+over multiple workers.
+
+### Batch and feature dimension
+
+<img src="../images/partition_fc.png" align="center" width="400px"/>
+<span><strong>Figure 4 - Partitioning of a fully connected layer.</strong></span>
+
+
+Every layer's feature blob is considered a matrix whose rows are feature
+vectors. Thus, one layer can be split on two dimensions. Partitioning on
+dimension 0 (also called batch dimension) slices the feature matrix by rows.
+For instance, if the mini-batch size is 256 and the layer is partitioned into 2
+sub-layers, each sub-layer would have 128 feature vectors in its feature blob.
+Partitioning on this dimension has no effect on the parameters, as every
+[Param](param.html) object is replicated in the sub-layers. Partitioning on dimension
+1 (also called feature dimension) slices the feature matrix by columns. For
+example, suppose the original feature vector has 50 units, after partitioning
+into 2 sub-layers, each sub-layer would have 25 units. This partitioning may
+result in [Param](param.html) objects being split, as shown in
+Figure 4. Both the bias vector and the weight matrix are
+partitioned into the two sub-layers.
+
+
+### Partitioning configuration
+
+There are 4 partitioning schemes, whose configurations are given below,
+
+  1. Partitioning each single layer into sub-layers on the batch dimension (see
+  below). It is enabled by configuring the partition dimension of the layer to
+  0, e.g.,
+
+          # with other fields omitted
+          layer {
+            partition_dim: 0
+          }
+
+  2. Partitioning each single layer into sub-layers on the feature dimension (see
+  below).  It is enabled by configuring the partition dimension of the layer to
+  1, e.g.,
+
+          # with other fields omitted
+          layer {
+            partition_dim: 1
+          }
+
+  3. Partitioning all layers into different subsets. It is enabled by
+  configuring the location ID of a layer, e.g.,
+
+          # with other fields omitted
+          layer {
+            location: 1
+          }
+          layer {
+            location: 0
+          }
+
+
+  4. Hybrid partitioning of strategies 1, 2 and 3. The hybrid partitioning is
+  useful for large models. An example application is to implement the
+  [idea proposed by Alex](http://arxiv.org/abs/1404.5997).
+  Hybrid partitioning is configured like,
+
+          # with other fields omitted
+          layer {
+            location: 1
+          }
+          layer {
+            location: 0
+          }
+          layer {
+            partition_dim: 0
+            location: 0
+          }
+          layer {
+            partition_dim: 1
+            location: 0
+          }
+
+Currently SINGA supports strategy-2 well. Other partitioning strategies
+are under test and will be released in a later version.
+
+## Parameter sharing
+
+Parameters can be shared in the following cases,
+
+  * sharing parameters among layers via user configuration. For example, the
+  visible layer and hidden layer of an RBM share the weight matrix, which is configured through
+  the `share_from` field as shown in the above RBM configuration. The
+  configurations must be the same (except name) for shared parameters.
+
+  * due to neural net partitioning, some `Param` objects are replicated into
+  different workers, e.g., partitioning one layer on batch dimension. These
+  workers share parameter values. SINGA controls this kind of parameter
+  sharing automatically, users do not need to do any configuration.
+
+  * the `NeuralNet` for training and testing (and validation) share most layers
+  , thus share `Param` values.
+
+If the shared `Param` instances reside in the same process (possibly in different
+threads), they use the same chunk of memory space for their values. But they
+have different memory spaces for their gradients. In fact, their
+gradients will be averaged by the stub or server.
+
+## Advanced user guide
+
+### Creation
+
+    static NeuralNet* NeuralNet::Create(const NetProto& np, Phase phase, int num);
+
+The above function creates a `NeuralNet` for a given phase, and returns a
+pointer to the `NeuralNet` instance. The phase is in {kTrain,
+kValidation, kTest}. `num` is used for net partitioning which indicates the
+number of partitions.  Typically, a training job includes three neural nets for
+the training, validation and test phases respectively. The three neural nets share most
+layers except the data layer, loss layer, output layer, etc. The `Create`
+function takes in the full net configuration including layers for training,
+validation and test.  It removes layers for phases other than the specified
+phase based on the `exclude` field in
+[layer configuration](layer.html):
+
+    layer {
+      ...
+      exclude : kTest # filter this layer for creating test net
+    }
+
+The filtered net configuration is passed to the constructor of `NeuralNet`:
+
+    NeuralNet::NeuralNet(NetProto netproto, int npartitions);
+
+The constructor first creates a graph representing the net structure in
+
+    Graph* NeuralNet::CreateGraph(const NetProto& netproto, int npartitions);
+
+Next, it creates a layer for each node and connects layers if their nodes are
+connected.
+
+    void NeuralNet::CreateNetFromGraph(Graph* graph, int npartitions);
+
+Since the `NeuralNet` instance may be shared among multiple workers, the
+`Create` function returns a pointer to the `NeuralNet` instance.
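+
+For instance, a driver could create and connect the nets for two phases roughly as follows (a sketch only; `net_conf` is assumed to hold the full NetProto read from the job configuration):
+
+    // sketch: build nets for two phases from the same (full) NetProto
+    NeuralNet* train_net = NeuralNet::Create(net_conf, kTrain, 1);
+    NeuralNet* test_net = NeuralNet::Create(net_conf, kTest, 1);
+    // share Param values between the two nets (see the following subsections)
+    test_net->ShareParamsFrom(train_net);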
+
+### Parameter sharing
+
+`Param` sharing
+is enabled by first sharing the Param configuration (in `NeuralNet::Create`)
+to create two similar (e.g., the same shape) Param objects, and then calling
+(in `NeuralNet::CreateNetFromGraph`),
+
+    void Param::ShareFrom(const Param& from);
+
+It is also possible to share `Param`s of two nets, e.g., sharing parameters of
+the training net and the test net,
+
+    void NeuralNet::ShareParamsFrom(NeuralNet* other);
+
+It will call `Param::ShareFrom` for each Param object.
+
+### Access functions
+`NeuralNet` provides a couple of access functions to get the layers and params
+of the net:
+
+    const std::vector<Layer*>& layers() const;
+    const std::vector<Param*>& params() const;
+    Layer* name2layer(string name) const;
+    Param* paramid2param(int id) const;
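+
+A brief usage sketch of these accessors (hypothetical example; `net` is assumed to point to a created `NeuralNet`):
+
+    const std::vector<Layer*>& layers = net->layers();
+    const std::vector<Param*>& params = net->params();
+    Layer* hidden = net->name2layer("hidden");   // look up a layer by its configured name
+    Param* p = net->paramid2param(0);            // look up a Param by its id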
+
+
+### Partitioning
+
+
+#### Implementation
+
+SINGA partitions the neural net in the `CreateGraph` function, which creates one
+node for each (partitioned) layer. For example, if one layer's partition
+dimension is 0 or 1, then it creates `npartitions` nodes for it; if the
+partition dimension is -1, a single node is created, i.e., no partitioning.
+Each node is assigned a partition (or location) ID. If the original layer is
+configured with a location ID, then that ID is assigned to each newly created node.
+These nodes are connected according to the connections of the original layers.
+Some connection layers are added automatically.
+For instance, if two connected sub-layers are located at two
+different workers, then a pair of bridge layers is inserted to transfer the
+feature (and gradient) blob between them. When two layers are partitioned on
+different dimensions, a concatenation layer which concatenates feature rows (or
+columns) and a slice layer which slices feature rows (or columns) would be
+inserted. These connection layers help make the network communication and
+synchronization transparent to the users.
+
+#### Dispatching partitions to workers
+
+Each (partitioned) layer is assigned a location ID, based on which it is dispatched to one
+worker. Particularly, the pointer to the `NeuralNet` instance is passed
+to every worker within the same group, but each worker only computes over the
+layers that have the same partition (or location) ID as the worker's ID. When
+every worker computes the gradients of the entire set of model parameters
+(strategy-1), we refer to this process as data parallelism. When different
+workers compute the gradients of different parameters (strategy-2 or
+strategy-3), we call this process model parallelism. The hybrid partitioning
+leads to hybrid parallelism, where some workers compute the gradients of the
+same subset of model parameters while other workers compute on different model
+parameters. For example, to implement the hybrid parallelism for the
+[DCNN model](http://arxiv.org/abs/1404.5997), we set `partition_dim = 0` for
+lower layers and `partition_dim = 1` for higher layers.
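+
+Under that assumption about which layers are "lower" and "higher" (the layer names below are illustrative), the configuration would look roughly like:
+
+    # lower (e.g., convolutional) layers: data parallelism
+    layer {
+      name: "conv1"
+      partition_dim: 0
+      ...
+    }
+    # higher (e.g., fully connected) layers: model parallelism
+    layer {
+      name: "fc1"
+      partition_dim: 1
+      ...
+    }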
+

Added: incubator/singa/site/trunk/content/markdown/docs/kr/neuralnet-partition.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/kr/neuralnet-partition.md?rev=1724348&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/kr/neuralnet-partition.md (added)
+++ incubator/singa/site/trunk/content/markdown/docs/kr/neuralnet-partition.md Wed Jan 13 03:46:19 2016
@@ -0,0 +1,54 @@
+# Neural Net Partition
+
+---
+
+The purpose of partitioning a neural network is to distribute the partitions onto
+different working units (e.g., threads or nodes, called workers in this article)
+and parallelize the processing.
+Another reason for partitioning is to handle large neural networks which cannot be
+held in a single node. For instance, to train models against images with high
+resolution we need large neural networks (in terms of training parameters).
+
+Since *Layer* is the first-class citizen in SINGA, we do the partitioning against
+layers. Specifically, we support partitioning at two levels. First, users can configure
+the location (i.e., worker ID) of each layer. In this way, users assign one worker
+to each layer. Secondly, for one layer, we can partition its neurons or partition
+the instances (e.g., images). These are called layer partition and data partition
+respectively. We illustrate the two types of partitions using a simple convolutional neural network.
+
+<img src="../images/conv-mnist.png" style="width: 220px"/>
+
+The above figure shows a convolutional neural network without any partitioning. It
+has 8 layers in total (one rectangle represents one layer). The first layer is a
+DataLayer (data) which reads data from local disk files/databases (or HDFS). The second layer
+is a MnistLayer which parses the records from the MNIST data to get the pixels of a batch
+of 8 images (each image is of size 28x28). The LabelLayer (label) parses the records to get the label
+of each image in the batch. The ConvolutionalLayer (conv1) transforms the input image to the
+shape of 8x27x27. The ReLULayer (relu1) conducts elementwise transformations. The PoolingLayer (pool1)
+sub-samples the images. The fc1 layer is fully connected with the pool1 layer. It
+multiplies each image with a weight matrix to generate a 10-dimensional hidden feature which
+is then normalized by a SoftmaxLossLayer to get the prediction.
+
+<img src="../images/conv-mnist-datap.png" style="width: 1000px"/>
+
+The above figure shows the convolutional neural network after partitioning all layers,
+except the DataLayer and ParserLayers, into 3 partitions using data partition.
+The red layers process 4 images of the batch, while the black and blue layers process 2 images
+each. Some helper layers, i.e., SliceLayer, ConcateLayer, BridgeSrcLayer,
+BridgeDstLayer and SplitLayer, are added automatically by our partition algorithm.
+Layers of the same color reside in the same worker. Data is transferred
+across different workers at the boundary layers (i.e., BridgeSrcLayer and BridgeDstLayer),
+e.g., between s-slice-mnist-conv1 and d-slice-mnist-conv1.
+
+<img src="../images/conv-mnist-layerp.png" style="width: 1000px"/>
+
+The above figure shows the convolutional neural network after partitioning all layers,
+except the DataLayer and ParserLayers, into 2 partitions using layer partition. We can
+see that each layer processes all 8 images from the batch, but different partitions process
+different parts of each image. For instance, the layer conv1-00 processes only 4 channels. The other
+4 channels are processed by conv1-01, which resides in another worker.
+
+
+Since the partitioning is done at the layer level, we can apply different partitions to
+different layers to get a hybrid partition for the whole neural network. Moreover,
+we can also specify layer locations to assign different layers to different workers.
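+
+As a rough illustration (using the `location` and `partition_dim` layer fields described on the [neural net](neural-net.html) page; the layer names are hypothetical), such a hybrid configuration could look like:
+
+    layer {
+      name: "conv1"
+      partition_dim: 0   # data partition: split the batch
+      location: 0        # dispatch to worker 0
+    }
+    layer {
+      name: "fc1"
+      partition_dim: 1   # layer partition: split the neurons
+      location: 1        # dispatch to worker 1
+    }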

Added: incubator/singa/site/trunk/content/markdown/docs/kr/overview.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/kr/overview.md?rev=1724348&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/kr/overview.md (added)
+++ incubator/singa/site/trunk/content/markdown/docs/kr/overview.md Wed Jan 13 03:46:19 2016
@@ -0,0 +1,67 @@
+# Overview
+
+---
+
+SINGA is a distributed deep learning platform designed for training deep learning models over large-scale data analytics workloads.
+It is designed so that programming follows intuitively from the "layer" abstraction of the neural network being modeled.
+
+* It supports a variety of models: feed-forward networks such as convolutional neural networks, energy models such as restricted Boltzmann machines, and recurrent neural network models.
+
+* A variety of layers are provided as built-in layers.
+
+* The SINGA architecture is designed to support synchronous, asynchronous and hybrid training.
+
+* Several partitioning schemes (batch and feature partitioning) are supported to parallelize the training of large models.
+
+
+## Goals
+
+Scalability: as a distributed system, use more resources to speed up training until a target accuracy is reached.
+
+Usability: simplify the programmer's work, e.g., the data and model partitioning and network communication required for efficient training of large distributed models, and make it easy to build complex models and algorithms.
+
+
+## Design principles
+
+Scalability is an important research problem in distributed deep learning.
+SINGA is designed to preserve the scalability of various training frameworks:
+* Synchronous: improves the efficiency of one training iteration.
+* Asynchronous: improves the convergence rate of training.
+* Hybrid: balances efficiency and convergence rate according to cost and resources (e.g., cluster size), improving scalability.
+
+SINGA is designed so that programming follows intuitively from the layer abstraction of deep learning models. A variety of models can be easily built and trained.
+
+## System overview
+
+<img src="../images/sgd.png" align="center" width="400px"/>
+<span><strong>Figure 1 - SGD flow.</strong></span>
+
+"Training a deep learning model" means finding the optimal parameters of the transformation functions that generate the features used to accomplish a specific task (classification, prediction, etc.).
+The quality of the parameters is measured by a loss function such as [Cross-Entropy Loss](https://en.wikipedia.org/wiki/Cross_entropy). Since this function is usually non-linear or non-convex, it is difficult to find a closed-form solution.
+
+Therefore, stochastic gradient descent (SGD) is used.
+As shown in Figure 1, randomly initialized parameter values are iteratively updated so that the loss function decreases.
+
+<img src="../images/overview.png" align="center" width="400px"/>
+<span><strong>Figure 2 - SINGA overview.</strong></span>
+
+The training workload is distributed over workers and servers. As shown in Figure 2, in each iteration the workers call the *TrainOneBatch* function to compute parameter gradients.
+*TrainOneBatch* visits the layers one by one according to the *NeuralNet* object that describes the neural net structure.
+The computed gradients are sent to the stub of the local node, aggregated, and sent to the corresponding servers. The servers send the updated parameters back to the workers for the next iteration.
+
+
+## Job
+
+In SINGA, a "job" refers to a "job configuration" that describes the neural net model, the data, the training method, the cluster topology, and so on.
+The job configuration has the following four components drawn in Figure 2 (see the sketch after this list).
+
+  * [NeuralNet](neural-net.html): describes the neural net structure and the settings of each layer.
+  * [TrainOneBatch](train-one-batch.html): describes the algorithm suited to the model category.
+  * [Updater](updater.html): describes how parameters are updated on the server side.
+  * [Cluster Topology](distributed-training.html): describes the distributed topology of workers and servers.
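+
+As a rough sketch only (field names and their placement follow the configuration examples elsewhere in this documentation and are partly assumed; values are illustrative), a job configuration combines these components like:
+
+    # job.conf (illustrative skeleton)
+    neuralnet {
+      layer { ... }        # NeuralNet: layers and their connections
+      # more layers ...
+    }
+    alg: kBP               # TrainOneBatch algorithm (assumed placement, see the model configuration page)
+    updater {
+      type: kSGD           # Updater: how servers update parameters
+    }
+    cluster {
+      nworker_groups: 1    # Cluster Topology: workers and servers
+      workspace: "examples/cifar10/"
+    }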
+
+The job is passed to the SINGA driver in the [main function](programming-guide.html).
+
+This process is similar to job submission in Hadoop.
+Users configure the job in the main function.
+Hadoop users configure their own mappers and reducers, whereas SINGA users configure their own layers, updaters, and so on.

Added: incubator/singa/site/trunk/content/markdown/docs/kr/param.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/kr/param.md?rev=1724348&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/kr/param.md (added)
+++ incubator/singa/site/trunk/content/markdown/docs/kr/param.md Wed Jan 13 03:46:19 2016
@@ -0,0 +1,226 @@
+# Parameters
+
+---
+
+A `Param` object in SINGA represents a set of parameters, e.g., a weight matrix
+or a bias vector. *Basic user guide* describes how to configure for a `Param`
+object, and *Advanced user guide* provides details on implementing users'
+parameter initialization methods.
+
+## Basic user guide
+
+The configuration of a Param object is inside a layer configuration, as
+`Param` objects are associated with layers. An example configuration looks like
+
+    layer {
+      ...
+      param {
+        name : "p1"
+        init {
+          type : kConstant
+          value: 1
+        }
+      }
+    }
+
+The [SGD algorithm](overview.html) starts by initializing all
+parameters according to the user-specified initialization method (the `init` field).
+For the above example,
+all parameters in `Param` "p1" will be initialized to the constant value 1. The
+configuration fields of a Param object are defined in [ParamProto](../api/classsinga_1_1ParamProto.html):
+
+  * name, an identifier string. It is an optional field. If not provided, SINGA
+  will generate one based on layer name and its order in the layer.
+  * init, field for setting initialization methods.
+  * share_from, name of another `Param` object, from which this `Param` will share
+  configurations and values.
+  * lr_scale, float value to be multiplied with the learning rate when
+  [updating the parameters](updater.html)
+  * wd_scale, float value to be multiplied with the weight decay when
+  [updating the parameters](updater.html)
+
+There are some other fields that are specific to initialization methods.
+
+### Initialization methods
+
+Users can set the `type` of `init` using the following built-in initialization
+methods,
+
+  * `kConstant`, set all parameters of the Param object to a constant value
+
+        type: kConstant
+        value: float  # default is 1
+
+  * `kGaussian`, initialize the parameters following a Gaussian distribution.
+
+        type: kGaussian
+        mean: float # mean of the Gaussian distribution, default is 0
+        std: float # standard deviation, default is 1
+        value: float # default 0
+
+  * `kUniform`, initialize the parameters following a uniform distribution
+
+        type: kUniform
+        low: float # lower boundary, default is -1
+        high: float # upper boundary, default is 1
+        value: float # default 0
+
+  * `kGaussianSqrtFanIn`, initialize `Param` objects with two dimensions (i.e.,
+  matrices) using `kGaussian` and then
+  multiply each parameter by `1/sqrt(fan_in)`, where `fan_in` is the number of
+  columns of the matrix.
+
+  * `kUniformSqrtFanIn`, the same as `kGaussianSqrtFanIn` except that the
+  distribution is a uniform distribution.
+
+  * `kUniformFanInOut`, initialize matrix `Param` objects using `kUniform` and then
+  multiply each parameter by `sqrt(6/(fan_in + fan_out))`, where `fan_in +
+  fan_out` sums up the number of columns and rows of the matrix.
+
+For all the above initialization methods except `kConstant`, if `value` is not
+1, every parameter will be multiplied by `value`. Users can also implement
+their own initialization methods following the *Advanced user guide*.
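+
+For example, a weight matrix initialized from a Gaussian distribution could be configured as follows (values are illustrative):
+
+    param {
+      name: "w1"
+      init {
+        type: kGaussian
+        mean: 0
+        std: 0.1
+      }
+      lr_scale: 1.0
+    }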
+
+
+## Advanced user guide
+
+This section describes the details of implementing new parameter
+initialization methods.
+
+### Base ParamGenerator
+All initialization methods are implemented as
+subclasses of the base `ParamGenerator` class.
+
+    class ParamGenerator {
+     public:
+      virtual void Init(const ParamGenProto&);
+      void Fill(Param*);
+
+     protected:
+      ParamGenProto proto_;
+    };
+
+The configuration of the initialization method is in `ParamGenProto`. The `Fill`
+function fills the `Param` object (passed in as an argument).
+
+### New ParamGenerator subclass
+
+Similar to implementing a new Layer subclass, users can define a configuration
+protocol message,
+
+    # in user.proto
+    message FooParamProto {
+      optional int32 x = 1;
+    }
+    extend ParamGenProto {
+      optional FooParamProto fooparam_conf = 101;
+    }
+
+The configuration of `Param` would be
+
+    param {
+      ...
+      init {
+        user_type: 'FooParam" # must use user_type for user defined methods
+        [fooparam_conf] { # must use brackets for configuring user defined messages
+          x: 10
+        }
+      }
+    }
+
+The subclass could be declared as,
+
+    class FooParamGen : public ParamGenerator {
+     public:
+      void Fill(Param*) override;
+    };
+
+Users can access the configuration fields in `Fill` by
+
+    int x = proto_.GetExtension(fooparam_conf).x();
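+
+Putting these pieces together, a minimal `Fill` implementation might look like the sketch below; `mutable_data`, `count` and `mutable_cpu_data` are assumed Param/Blob accessor names, not necessarily the exact API.
+
+    // sketch only: fill every element of the Param with the configured x
+    void FooParamGen::Fill(Param* param) {
+      int x = proto_.GetExtension(fooparam_conf).x();
+      Blob<float>* data = param->mutable_data();      // assumed accessor
+      float* v = data->mutable_cpu_data();
+      for (int i = 0; i < data->count(); ++i)
+        v[i] = static_cast<float>(x);
+    }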
+
+To use the new initialization method, users need to register it in the
+[main function](programming-guide.html).
+
+    driver.RegisterParamGenerator<FooParamGen>("FooParam");  // must be consistent with the user_type in the configuration
+
+{% comment %}
+### Base Param class
+
+### Members
+
+    int local_version_;
+    int slice_start_;
+    vector<int> slice_offset_, slice_size_;
+
+    shared_ptr<Blob<float>> data_;
+    Blob<float> grad_;
+    ParamProto proto_;
+
+Each Param object has a local version and a global version (inside the data
+Blob). These two versions are used for synchronization. If multiple Param
+objects share the same values, they would have the same `data_` field.
+Consequently, their global version is the same. The global version is updated
+by [the stub thread](communication.html). The local version is
+updated in `Worker::Update` function which assigns the global version to the
+local version. The `Worker::Collect` function is blocked until the global
+version is larger than the local version, i.e., when `data_` is updated. In
+this way, we synchronize workers sharing parameters.
+
+In Deep learning models, some Param objects are 100 times larger than others.
+To ensure the load-balance among servers, SINGA slices large Param objects. The
+slicing information is recorded by `slice_*`. Each slice is assigned a unique
+ID starting from 0. `slice_start_` is the ID of the first slice of this Param
+object. `slice_offset_[i]` is the offset of the i-th slice in this Param
+object. `slice_size_[i]` is the size of the i-th slice. This slice information
+is used to create messages for transferring parameter values or gradients to
+different servers.
+
+Each Param object has a `grad_` field for gradients. Param objects do not share
+this Blob although they may share `data_`, because each layer containing a
+Param object contributes gradients. E.g., in RNN, the recurrent layers
+share parameter values, and the gradients used for updating are averaged over all
+these recurrent layers. In SINGA, the stub thread will aggregate local
+gradients for the same Param object. The server will do a global aggregation
+of gradients for the same Param object.
+
+The `proto_` field has some meta information, e.g., name and ID. It also has a
+field called `owner` which is the ID of the Param object that shares parameter
+values with others.
+
+### Functions
+The base Param class implements two sets of functions,
+
+    virtual void InitValues(int version = 0);  // initialize values according to `init_method`
+    void ShareFrom(const Param& other);  // share `data_` from `other` Param
+    --------------
+    virtual Msg* GenGetMsg(bool copy, int slice_idx);
+    virtual Msg* GenPutMsg(bool copy, int slice_idx);
+    ... // other message related functions.
+
+Besides the functions for processing the parameter values, there is a set of
+functions for generating and parsing messages. These messages are for
+transferring parameter values or gradients between workers and servers. Each
+message corresponds to one Param slice. If `copy` is false, it means the
+receiver of this message is in the same process as the sender. In such case,
+only pointers to the memory of parameter value (or gradient) are wrapped in
+the message; otherwise, the parameter values (or gradients) should be copied
+into the message.
+
+
+## Implementing a Param subclass
+Users can extend the base Param class to implement their own parameter
+initialization methods and message transferring protocols. Similar to
+implementing a new Layer subclass, users can create Google protocol buffer
+messages for configuring the Param subclass. The subclass, denoted as FooParam,
+should be registered in main.cc,
+
+    driver.RegisterParam<FooParam>(kFooParam);  // kFooParam should be different from 0, which is for the base Param type
+
+
+  * type, an integer representing the `Param` type. Currently SINGA provides one
+    `Param` implementation with type 0 (the default type). If users want
+    to use their own Param implementation, they should extend the base Param
+    class and configure this field with `kUserParam`
+
+{% endcomment %}

Added: incubator/singa/site/trunk/content/markdown/docs/kr/programmer-guide.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/kr/programmer-guide.md?rev=1724348&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/kr/programmer-guide.md (added)
+++ incubator/singa/site/trunk/content/markdown/docs/kr/programmer-guide.md Wed Jan 13 03:46:19 2016
@@ -0,0 +1,98 @@
+# Programmer Guide
+
+---
+
+To submit a training job, users must provide the configuration of the
+four components shown in Figure 1:
+
+  * a [NeuralNet](neural-net.html) describing the neural net structure with the detailed layer setting and their connections;
+  * a [TrainOneBatch](train-one-batch.html) algorithm which is tailored for different model categories;
+  * an [Updater](updater.html) defining the protocol for updating parameters at the server side;
+  * a [Cluster Topology](distributed-training.html) specifying the distributed architecture of workers and servers.
+
+The *Basic user guide* section describes how to submit a training job using
+built-in components; while the *Advanced user guide* section presents details
+on writing user's own main function to register components implemented by
+themselves. In addition, the training data must be prepared, which has the same
+[process](data.html) for both advanced users and basic users.
+
+<img src="../images/overview.png" align="center" width="400px"/>
+<span><strong>Figure 1 - SINGA overview.</strong></span>
+
+
+
+## Basic user guide
+
+Users can use the default main function provided by SINGA to submit the training
+job. In this case, a job configuration file written as a Google protocol
+buffer message for the [JobProto](../api/classsinga_1_1JobProto.html) must be provided on the command line,
+
+    ./bin/singa-run.sh -conf <path to job conf> [-resume] [-test]
+
+* `-resume` is for continuing the training from the last [checkpoint](checkpoint.html).
+* `-test` is for testing the performance of a previously trained model and extracting features for new data;
+more details are available [here](test.html).
+
+The [MLP](mlp.html) and [CNN](cnn.html)
+examples use built-in components. Please read the corresponding pages for their
+job configuration files. The subsequent pages will illustrate the details on
+each component of the configuration.
+
+## Advanced user guide
+
+If a user's model contains some user-defined components, e.g., an
+[Updater](updater.html), the user has to write a main function to
+register these components. This is similar to Hadoop's main function. Generally,
+the main function should
+
+  * initialize SINGA, e.g., setup logging.
+
+  * register user-defined components.
+
+  * create and pass the job configuration to SINGA driver
+
+An example main function is like
+
+    #include <string>
+    #include "singa.h"
+    #include "user.h"  // header for user code
+
+    int main(int argc, char** argv) {
+      singa::Driver driver;
+      driver.Init(argc, argv);
+      bool resume;
+      // parse resume option from argv.
+
+      // register user defined layers
+      driver.RegisterLayer<FooLayer, std::string>("kFooLayer");
+      // register user defined updater
+      driver.RegisterUpdater<FooUpdater, std::string>("kFooUpdater");
+      ...
+      auto jobConf = driver.job_conf();
+      //  update jobConf
+
+      driver.Submit(resume, jobConf);
+      return 0;
+    }
+
+The Driver class' `Init` method will load a job configuration file provided by
+users as a command line argument (`-conf <job conf>`). It contains at least the
+cluster topology and returns the `jobConf` for users to update or fill in
+configurations of neural net, updater, etc. If users define subclasses of
+Layer, Updater, Worker and Param, they should register them through the driver.
+Finally, the job configuration is submitted to the driver which starts the
+training.
+
+We will provide helper functions to make the configuration easier in the
+future, like [keras](https://github.com/fchollet/keras).
+
+Users need to compile and link their code (e.g., layer implementations and the main
+file) with the SINGA library (*.libs/libsinga.so*) to generate an
+executable file, e.g., with the name *mysinga*. To launch the program, users just pass the
+path of *mysinga* and the base job configuration to *./bin/singa-run.sh*.
+
+    ./bin/singa-run.sh -conf <path to job conf> -exec <path to mysinga> [other arguments]
+
+The [RNN application](rnn.html) provides a full example of
+implementing the main function for training a specific RNN model.
+

Added: incubator/singa/site/trunk/content/markdown/docs/kr/programming-guide.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/kr/programming-guide.md?rev=1724348&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/kr/programming-guide.md (added)
+++ incubator/singa/site/trunk/content/markdown/docs/kr/programming-guide.md Wed Jan 13 03:46:19 2016
@@ -0,0 +1,95 @@
+# Programming Guide
+
+---
+
+To submit a training job, users must provide the configuration of the
+four components shown in Figure 1:
+
+  * a [NeuralNet](neural-net.html) describing the neural net structure with the detailed layer setting and their connections;
+  * a [TrainOneBatch](train-one-batch.html) algorithm which is tailored for different model categories;
+  * an [Updater](updater.html) defining the protocol for updating parameters at the server side;
+  * a [Cluster Topology](distributed-training.html) specifying the distributed architecture of workers and servers.
+
+The *Basic user guide* section describes how to submit a training job using
+built-in components; while the *Advanced user guide* section presents details
+on writing user's own main function to register components implemented by
+themselves. In addition, the training data must be prepared, which has the same
+[process](data.html) for both advanced users and basic users.
+
+<img src="../images/overview.png" align="center" width="400px"/>
+<span><strong>Figure 1 - SINGA overview.</strong></span>
+
+
+
+## Basic user guide
+
+Users can use the default main function provided by SINGA to submit the training
+job. In this case, a job configuration file written as a Google protocol
+buffer message for the [JobProto](../api/classsinga_1_1JobProto.html) must be provided on the command line,
+
+    ./bin/singa-run.sh -conf <path to job conf> [-resume]
+
+`-resume` is for continuing the training from the last
+[checkpoint](checkpoint.html).
+The [MLP](mlp.html) and [CNN](cnn.html)
+examples use built-in components. Please read the corresponding pages for their
+job configuration files. The subsequent pages will illustrate the details on
+each component of the configuration.
+
+## Advanced user guide
+
+If a user's model contains some user-defined components, e.g., an
+[Updater](updater.html), the user has to write a main function to
+register these components. This is similar to Hadoop's main function. Generally,
+the main function should
+
+  * initialize SINGA, e.g., setup logging.
+
+  * register user-defined components.
+
+  * create and pass the job configuration to SINGA driver
+
+
+An example main function is like
+
+    #include "singa.h"
+    #include "user.h"  // header for user code
+
+    int main(int argc, char** argv) {
+      singa::Driver driver;
+      driver.Init(argc, argv);
+      bool resume;
+      // parse resume option from argv.
+
+      // register user defined layers
+      driver.RegisterLayer<FooLayer>(kFooLayer);
+      // register user defined updater
+      driver.RegisterUpdater<FooUpdater>(kFooUpdater);
+      ...
+      auto jobConf = driver.job_conf();
+      //  update jobConf
+
+      driver.Train(resume, jobConf);
+      return 0;
+    }
+
+The Driver class' `Init` method will load a job configuration file provided by
+users as a command line argument (`-conf <job conf>`). It contains at least the
+cluster topology and returns the `jobConf` for users to update or fill in
+configurations of neural net, updater, etc. If users define subclasses of
+Layer, Updater, Worker and Param, they should register them through the driver.
+Finally, the job configuration is submitted to the driver which starts the
+training.
+
+We will provide helper functions to make the configuration easier in the
+future, like [keras](https://github.com/fchollet/keras).
+
+Users need to compile and link their code (e.g., layer implementations and the main
+file) with the SINGA library (*.libs/libsinga.so*) to generate an
+executable file, e.g., with the name *mysinga*. To launch the program, users just pass the
+path of *mysinga* and the base job configuration to *./bin/singa-run.sh*.
+
+    ./bin/singa-run.sh -conf <path to job conf> -exec <path to mysinga> [other arguments]
+
+The [RNN application](rnn.html) provides a full example of
+implementing the main function for training a specific RNN model.

Added: incubator/singa/site/trunk/content/markdown/docs/kr/quick-start.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/kr/quick-start.md?rev=1724348&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/kr/quick-start.md (added)
+++ incubator/singa/site/trunk/content/markdown/docs/kr/quick-start.md Wed Jan 13 03:46:19 2016
@@ -0,0 +1,177 @@
+# Quick Start
+
+---
+
+## Installing SINGA
+
+See the [installation page](installation.html) for installing SINGA.
+
+### Starting Zookeeper
+
+SINGA training uses [zookeeper](https://zookeeper.apache.org/). First make sure the zookeeper service has been started.
+
+If you installed zookeeper using the provided thirdparty scripts, run the following commands.
+
+    #goto top level folder
+    cd SINGA_ROOT
+    ./bin/zk-service.sh start
+
+(`./bin/zk-service.sh stop` stops the zookeeper service.)
+
+To start zookeeper on a port other than the default one, edit `conf/singa.conf`:
+
+    zookeeper_host : "localhost:YOUR_PORT"
+
+## Running in standalone mode
+
+Running SINGA in standalone mode means running it without a cluster manager such as [Mesos](http://mesos.apache.org/) or [YARN](http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html).
+
+### Training on a single node
+
+A single process is launched.
+As an example, we train the
+[CNN model](http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks)
+over the [CIFAR-10](http://www.cs.toronto.edu/~kriz/cifar.html) dataset.
+The hyper-parameters follow [cuda-convnet](https://code.google.com/p/cuda-convnet/).
+See the [CNN example](cnn.html) page for details.
+
+
+#### Data and job configuration
+
+Download the dataset and create the data shards for training and testing as follows.
+
+    cd examples/cifar10/
+    cp Makefile.example Makefile
+    make download
+    make create
+
+The training and test datasets are created in the *cifar10-train-shard*
+and *cifar10-test-shard* folders respectively. An *image_mean.bin* file, which describes the feature mean of all images, is also generated.
+
+All source code needed for training the CNN model is included in SINGA; no extra code has to be added.
+The script (*../../bin/singa-run.sh*) is run with the job configuration file (*job.conf*).
+To change or add code in SINGA, see the [programming guide](programming-guide.html).
+
+#### Training without parallelism
+
+By default, the cluster topology has one worker and one server.
+Data and neural net are not parallelized.
+
+To start training, run the following script:
+
+    # goto top level folder
+    cd ../../
+    ./bin/singa-run.sh -conf examples/cifar10/job.conf
+
+
+To list the currently running jobs:
+
+    ./bin/singa-console.sh list
+
+    JOB ID | NUM PROCS
+    ---------- | -----------
+    24 | 1
+
+To kill a job:
+
+    ./bin/singa-console.sh kill JOB_ID
+
+
+Logs and job information are stored in the */tmp/singa-log* folder,
+which can be changed via the `log-dir` field in the *conf/singa.conf* file.
+
+
+#### Asynchronous parallel training
+
+    # job.conf
+    ...
+    cluster {
+      nworker_groups : 2
+      nworkers_per_procs : 2
+      workspace : "examples/cifar10/"
+    }
+
+Asynchronous [training](architecture.html) can be performed in SINGA
+by launching multiple worker groups.
+For example, change *job.conf* as shown above.
+By default, one worker group is configured to have one worker.
+Since the configuration above assigns two workers to one process, two worker groups will run in the same process.
+As a result, this runs as the in-memory [Downpour](frameworks.html) training framework.
+
+Users do not need to worry about distributing the data.
+Data is dispatched to each worker group according to a random offset.
+Each worker handles a different data partition.
+
+    # job.conf
+    ...
+    neuralnet {
+      layer {
+        ...
+        sharddata_conf {
+          random_skip : 5000
+        }
+      }
+      ...
+    }
+
+Run the script:
+
+    ./bin/singa-run.sh -conf examples/cifar10/job.conf
+
+#### Synchronous parallel training
+
+    # job.conf
+    ...
+    cluster {
+      nworkers_per_group : 2
+      nworkers_per_procs : 2
+      workspace : "examples/cifar10/"
+    }
+
+Synchronous [training](architecture.html) can be performed by launching multiple workers within one worker group.
+For example, change the *job.conf* file as shown above.
+The configuration above assigns two workers to one worker group.
+The workers synchronize with each other within the group.
+This runs as the in-memory [sandblaster](frameworks.html) framework.
+The model is partitioned among the two workers; each layer is dispatched over the two workers.
+Each dispatched layer has the same functionality as the original layer, but handles only `B/g` feature instances,
+where `B` is the number of instances in a mini-batch and `g` is the number of workers in the group.
+Other [partitioning schemes](neural-net.html) for the layers (neural net) are also available.
+
+All other settings are the same as in the "without parallelism" case.
+
+    ./bin/singa-run.sh -conf examples/cifar10/job.conf
+
+### Training on a cluster
+
+Extend the above training frameworks to a cluster by changing the cluster configuration:
+
+    nworkers_per_procs : 1
+
+Every process then creates only one worker thread.
+As a result, the workers are created in different processes (nodes).
+To specify the nodes of the cluster, configure the *hostfile* in *SINGA_ROOT/conf/*,
+
+e.g.,
+
+    logbase-a01
+    logbase-a02
+
+The zookeeper location must also be configured,
+
+e.g.,
+
+    # conf/singa.conf
+    zookeeper_host : "logbase-a01"
+
+The script is run in the same way as for single-node training:
+
+    ./bin/singa-run.sh -conf examples/cifar10/job.conf
+
+## Running with Mesos
+
+*working*...
+
+## Next
+
+See the [programming guide](programming-guide.html) for details on changing or adding code in SINGA.

Added: incubator/singa/site/trunk/content/markdown/docs/kr/rbm.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/kr/rbm.md?rev=1724348&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/kr/rbm.md (added)
+++ incubator/singa/site/trunk/content/markdown/docs/kr/rbm.md Wed Jan 13 03:46:19 2016
@@ -0,0 +1,365 @@
+# RBM Example
+
+---
+
+This example uses SINGA to train 4 RBM models and one auto-encoder model over the
+[MNIST dataset](http://yann.lecun.com/exdb/mnist/). The auto-encoder model is trained
+to reduce the dimensionality of the MNIST image feature. The RBM models are trained
+to initialize parameters of the auto-encoder model. This example application is
+from [Hinton's science paper](http://www.cs.toronto.edu/~hinton/science.pdf).
+
+## Running instructions
+
+Running scripts are provided in the *SINGA_ROOT/examples/rbm* folder.
+
+The MNIST dataset has 70,000 handwritten digit images. The
+[data preparation](data.html) page
+has details on converting this dataset into a SINGA-recognizable format. Users can
+simply run the following commands to download and convert the dataset.
+
+    # at SINGA_ROOT/examples/mnist/
+    $ cp Makefile.example Makefile
+    $ make download
+    $ make create
+
+The training is separated into two phases, namely pre-training and fine-tuning.
+The pre-training phase trains 4 RBMs in sequence,
+
+    # at SINGA_ROOT/
+    $ ./bin/singa-run.sh -conf examples/rbm/rbm1.conf
+    $ ./bin/singa-run.sh -conf examples/rbm/rbm2.conf
+    $ ./bin/singa-run.sh -conf examples/rbm/rbm3.conf
+    $ ./bin/singa-run.sh -conf examples/rbm/rbm4.conf
+
+The fine-tuning phase trains the auto-encoder by,
+
+    $ ./bin/singa-run.sh -conf examples/rbm/autoencoder.conf
+
+
+## Training details
+
+### RBM1
+
+<img src="../images/example-rbm1.png" align="center" width="200px"/>
+<span><strong>Figure 1 - RBM1.</strong></span>
+
+The neural net structure for training RBM1 is shown in Figure 1.
+The data layer and parser layer provides features for training RBM1.
+The visible layer (connected with parser layer) of RBM1 accepts the image feature
+(784 dimension). The hidden layer is set to have 1000 neurons (units).
+These two layers are configured as,
+
+    layer{
+      name: "RBMVis"
+      type: kRBMVis
+      srclayers:"mnist"
+      srclayers:"RBMHid"
+      rbm_conf{
+        hdim: 1000
+      }
+      param{
+        name: "w1"
+        init{
+          type: kGaussian
+          mean: 0.0
+          std: 0.1
+        }
+      }
+      param{
+        name: "b11"
+        init{
+          type: kConstant
+          value: 0.0
+        }
+      }
+    }
+
+    layer{
+      name: "RBMHid"
+      type: kRBMHid
+      srclayers:"RBMVis"
+      rbm_conf{
+        hdim: 1000
+      }
+      param{
+        name: "w1_"
+        share_from: "w1"
+      }
+      param{
+        name: "b12"
+        init{
+          type: kConstant
+          value: 0.0
+        }
+      }
+    }
+
+
+
+For an RBM, the weight matrix is shared by the visible and hidden layers. For instance,
+`w1` is shared by the `vis` and `hid` layers shown in Figure 1. In SINGA, we can configure
+the `share_from` field to enable [parameter sharing](param.html),
+as shown above for the params `w1` and `w1_`.
+
+[Contrastive Divergence](train-one-batch.html#contrastive-divergence)
+is configured as the algorithm for [TrainOneBatch](train-one-batch.html).
+Following Hinton's paper, we configure the [updating protocol](updater.html)
+as follows,
+
+    # Updater Configuration
+    updater{
+      type: kSGD
+      momentum: 0.2
+      weight_decay: 0.0002
+      learning_rate{
+        base_lr: 0.1
+        type: kFixed
+      }
+    }
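+
+For reference, the update implied by the configuration above can be sketched as follows,
+assuming the textbook CD-1 gradient estimate and the standard momentum/weight-decay SGD
+rule (the exact sign and scaling conventions of SINGA's kCD and kSGD implementations may
+differ slightly):
+
+    \Delta W \approx \langle v h^{\top}\rangle_{\text{data}} - \langle v h^{\top}\rangle_{\text{recon}}
+    u_{t+1} = \mu\, u_t + \eta\,(\Delta W - \lambda\, W_t), \qquad W_{t+1} = W_t + u_{t+1}
+
+where \mu = 0.2 (momentum), \eta = 0.1 (base_lr) and \lambda = 0.0002 (weight_decay).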
+
+Since the parameters of RBM1 will be used to initialize the auto-encoder, we should
+configure the `workspace` field to specify a path for the checkpoint folder.
+For example, if we configure it as,
+
+    cluster {
+      workspace: "examples/rbm/rbm1/"
+    }
+
+Then SINGA will [checkpoint the parameters](checkpoint.html) into *examples/rbm/rbm1/*.
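+
+For instance, after the RBM1 job finishes (assuming it runs for 6000 steps, which matches
+the checkpoint file name used later in this example), the checkpoint folder would contain
+something like:
+
+    $ ls examples/rbm/rbm1/checkpoint/
+    step6000-worker0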
+
+### RBM2
+<img src="../images/example-rbm2.png" align="center" width="200px"/>
+<span><strong>Figure 2 - RBM2.</strong></span>
+
+Figure 2 shows the net structure for training RBM2.
+The visible units of RBM2 accept the output from the Sigmoid1 layer. The Inner1 layer
+is an `InnerProductLayer` whose parameters are set to the `w1` and `b12` learned
+from RBM1.
+The neural net configuration is (with the data layer and parser layer omitted),
+
+    layer{
+      name: "Inner1"
+      type: kInnerProduct
+      srclayers:"mnist"
+      innerproduct_conf{
+        num_output: 1000
+      }
+      param{ name: "w1" }
+      param{ name: "b12"}
+    }
+
+    layer{
+      name: "Sigmoid1"
+      type: kSigmoid
+      srclayers:"Inner1"
+    }
+
+    layer{
+      name: "RBMVis"
+      type: kRBMVis
+      srclayers:"Sigmoid1"
+      srclayers:"RBMHid"
+      rbm_conf{
+        hdim: 500
+      }
+      param{
+        name: "w2"
+        ...
+      }
+      param{
+        name: "b21"
+        ...
+      }
+    }
+
+    layer{
+      name: "RBMHid"
+      type: kRBMHid
+      srclayers:"RBMVis"
+      rbm_conf{
+        hdim: 500
+      }
+      param{
+        name: "w2_"
+        share_from: "w2"
+      }
+      param{
+        name: "b22"
+        ...
+      }
+    }
+
+To load `w1` and `b12` from RBM1's checkpoint file, we configure the `checkpoint_path` as,
+
+    checkpoint_path: "examples/rbm/rbm1/checkpoint/step6000-worker0"
+    cluster{
+      workspace: "examples/rbm/rbm2"
+    }
+
+The workspace is changed for checkpointing `w2`, `b21` and `b22` into
+*examples/rbm/rbm2/*.
+
+### RBM3
+
+<img src="../images/example-rbm3.png" align="center" width="200px"/>
+<span><strong>Figure 3 - RBM3.</strong></span>
+
+Figure 3 shows the net structure for training RBM3. In this model, a layer with
+250 units is added as the hidden layer of RBM3. The visible units of RBM3
+accept the output from the Sigmoid2 layer. The parameters of Inner1 and Inner2 are set to
+`w1, b12, w2, b22`, which can be loaded from the checkpoint file of RBM2 under
+*examples/rbm/rbm2/*, as sketched below.
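+
+A minimal sketch of the corresponding settings for rbm3.conf, assuming RBM2 was also
+checkpointed at step 6000 (the actual step count depends on how long rbm2.conf trains):
+
+    checkpoint_path: "examples/rbm/rbm2/checkpoint/step6000-worker0"
+    cluster{
+      workspace: "examples/rbm/rbm3"
+    }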
+
+### RBM4
+
+
+<img src="../images/example-rbm4.png" align="center" width="200px"/>
+<span><strong>Figure 4 - RBM4.</strong></span>
+
+Figure 4 shows the net structure for training RBM4. It is similar to Figure 3,
+but according to [Hinton's science paper](http://www.cs.toronto.edu/~hinton/science.pdf), the hidden units of the
+top RBM (RBM4) have stochastic real-valued states drawn from a unit variance
+Gaussian whose mean is determined by the input from the RBM's logistic visible
+units. So we add a `gaussian` field in the RBMHid layer to control the
+sampling distribution (Gaussian or Bernoulli). In addition, this
+RBM has a much smaller learning rate (0.001). The neural net configuration for
+RBM4 and the updating protocol are (with the data layer and parser layer
+omitted),
+
+    # Updater Configuration
+    updater{
+      type: kSGD
+      momentum: 0.9
+      weight_decay: 0.0002
+      learning_rate{
+        base_lr: 0.001
+        type: kFixed
+      }
+    }
+
+    layer{
+      name: "RBMVis"
+      type: kRBMVis
+      srclayers:"Sigmoid3"
+      srclayers:"RBMHid"
+      rbm_conf{
+        hdim: 30
+      }
+      param{
+        name: "w4"
+        ...
+      }
+      param{
+        name: "b41"
+        ...
+      }
+    }
+
+    layer{
+      name: "RBMHid"
+      type: kRBMHid
+      srclayers:"RBMVis"
+      rbm_conf{
+        hdim: 30
+        gaussian: true
+      }
+      param{
+        name: "w4_"
+        share_from: "w4"
+      }
+      param{
+        name: "b42"
+        ...
+      }
+    }
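+
+A sketch of what the `gaussian` flag changes, written in standard RBM notation (this
+follows Hinton's description; the exact sampling code is in SINGA's RBMHid layer
+implementation):
+
+    % gaussian: false -- binary hidden units sampled from a Bernoulli distribution
+    p(h_j = 1 \mid v) = \sigma\bigl(b_j + \textstyle\sum_i v_i w_{ij}\bigr)
+    % gaussian: true  -- real-valued hidden units sampled from a unit-variance Gaussian
+    h_j \mid v \sim \mathcal{N}\bigl(b_j + \textstyle\sum_i v_i w_{ij},\; 1\bigr)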
+
+### Auto-encoder
+In the fine-tuning stage, the 4 RBMs are "unfolded" to form encoder and decoder
+networks that are initialized using the parameters from the previous 4 RBMs.
+
+<img src="../images/example-autoencoder.png" align="center" width="500px"/>
+<span><strong>Figure 5 - Auto-Encoders.</strong></span>
+
+
+Figure 5 shows the neural net structure for training the auto-encoder.
+[Back propagation (kBP)](train-one-batch.html) is
+configured as the algorithm for `TrainOneBatch`. We use the same cluster
+configuration as for the RBM models. For the updater, we use the
+[AdaGrad](updater.html#adagradupdater) algorithm with a fixed learning rate.
+
+    # Updater Configuration
+    updater{
+      type: kAdaGrad
+      learning_rate{
+        base_lr: 0.01
+        type: kFixed
+      }
+    }
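+
+For reference, a sketch of the update rule an AdaGrad updater typically implements, with
+g_t the gradient and \eta the fixed base_lr of 0.01 (the exact \epsilon constant and any
+weight-decay handling depend on SINGA's kAdaGrad implementation):
+
+    G_t = G_{t-1} + g_t \odot g_t
+    w_{t+1} = w_t - \frac{\eta}{\sqrt{G_t} + \epsilon} \odot g_t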
+
+
+
+According to [Hinton's science paper](http://www.cs.toronto.edu/~hinton/science.pdf),
+we configure an EuclideanLoss layer to compute the reconstruction error. The neural net
+configuration is (with some of the middle layers omitted),
+
+    layer{ name: "data" }
+    layer{ name:"mnist" }
+    layer{
+      name: "Inner1"
+      param{ name: "w1" }
+      param{ name: "b12" }
+    }
+    layer{ name: "Sigmoid1" }
+    ...
+    layer{
+      name: "Inner8"
+      innerproduct_conf{
+        num_output: 784
+        transpose: true
+      }
+      param{
+        name: "w8"
+        share_from: "w1"
+      }
+      param{ name: "b11" }
+    }
+    layer{ name: "Sigmoid8" }
+
+    # Euclidean Loss Layer Configuration
+    layer{
+      name: "loss"
+      type:kEuclideanLoss
+      srclayers:"Sigmoid8"
+      srclayers:"mnist"
+    }
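+
+The loss layer above measures the squared Euclidean distance between the reconstruction
+(the Sigmoid8 output \hat{x}_i) and the original image feature (the mnist output x_i); a
+sketch, assuming the common 1/(2n) mini-batch averaging (the exact scaling is defined by
+SINGA's EuclideanLoss implementation):
+
+    L = \frac{1}{2n} \sum_{i=1}^{n} \lVert \hat{x}_i - x_i \rVert_2^2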
+
+To load the pre-trained parameters from the 4 RBMs' checkpoint files, we configure `checkpoint_path` as
+
+    # Checkpoint Configuration
+    checkpoint_path: "examples/rbm/rbm1/checkpoint/step6000-worker0"
+    checkpoint_path: "examples/rbm/rbm2/checkpoint/step6000-worker0"
+    checkpoint_path: "examples/rbm/rbm3/checkpoint/step6000-worker0"
+    checkpoint_path: "examples/rbm/rbm4/checkpoint/step6000-worker0"
+
+
+## Visualization Results
+
+<div>
+<img src="../images/rbm-weight.PNG" align="center" width="300px"/>
+
+<img src="../images/rbm-feature.PNG" align="center" width="300px"/>
+<br/>
+<span><strong>Figure 6 - Bottom RBM weight matrix.</strong></span>
+&nbsp;
+&nbsp;
+&nbsp;
+&nbsp;
+
+<span><strong>Figure 7 - Top layer features.</strong></span>
+</div>
+
+Figure 6 visualizes sample columns of the weight matrix of RBM1. We can see that
+Gabor-like filters are learned. Figure 7 depicts the features extracted from
+the top-layer of the auto-encoder, wherein one point represents one image.
+Different colors represent different digits. We can see that most images are
+well clustered according to the ground truth.