Posted to commits@singa.apache.org by wa...@apache.org on 2016/01/13 04:46:20 UTC
svn commit: r1724348 [2/6] - in /incubator/singa/site/trunk/content/markdown/docs: ./ jp/ kr/

Added: incubator/singa/site/trunk/content/markdown/docs/jp/layer.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/jp/layer.md?rev=1724348&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/jp/layer.md (added)
+++ incubator/singa/site/trunk/content/markdown/docs/jp/layer.md Wed Jan 13 03:46:19 2016
@@ -0,0 +1,614 @@
+# Layers
+
+---
+
+Layer is a core abstraction in SINGA. It performs a variety of feature
+transformations for extracting high-level features, e.g., loading raw features,
+parsing RGB values, doing convolution transformation, etc.
+
+The *Basic user guide* section introduces the configuration of a built-in
+layer. *Advanced user guide* explains how to extend the base Layer class to
+implement users' functions.
+
+## Basic user guide
+
+### Layer configuration
+
+The configurations of two example layers are shown below:
+
+    layer {
+      name: "data"
+      type: kCSVInput
+      store_conf { }
+    }
+    layer{
+      name: "fc1"
+      type: kInnerProduct
+      srclayers: "data"
+      innerproduct_conf{ }
+      param{ }
+    }
+
+There are some common fields for all kinds of layers:
+
+  * `name`: a string used to differentiate two layers in a neural net.
+  * `type`: an integer used for identifying a specific Layer subclass. The types of built-in
+  layers are listed in LayerType (defined in job.proto).
+  For user-defined layer subclasses, `user_type` should be used instead of `type`.
+  * `srclayers`: names of the source layers.
+  In SINGA, all connections are [converted](neural-net.html) to directed connections.
+  * `param`: configuration for a [Param](param.html) instance.
+  There can be multiple Param objects in one layer.
+
+Different layers may have different configurations. These configurations
+are defined in `<type>_conf`.  E.g., "fc1" layer has
+`innerproduct_conf`. The subsequent sections
+explain the functionality of each built-in layer and how to configure it.
+
+### Built-in Layer subclasses
+SINGA provides many built-in layers, which can be used directly to create neural nets.
+These layers are categorized according to their functionality:
+
+  * Input layers for loading records (e.g., images) from disk files, HDFS or network into memory.
+  * Neuron layers for feature transformation, e.g., [convolution](../api/classsinga_1_1ConvolutionLayer.html), [pooling](../api/classsinga_1_1PoolingLayer.html), dropout, etc.
+  * Loss layers for measuring the training objective loss, e.g., Cross Entropy loss or Euclidean loss.
+  * Output layers for outputting the prediction results (e.g., probabilities of each category) or features into persistent storage, e.g., disk or HDFS.
+  * Connection layers for connecting layers when the neural net is partitioned.
+
+#### Input layers
+
+Input layers load training/test data from disk or other places (e.g., HDFS or network)
+into memory.
+
+##### StoreInputLayer
+
+[StoreInputLayer](../api/classsinga_1_1StoreInputLayer.html) is a base layer for
+loading data from data store. The data store can be a KVFile or TextFile (LMDB,
+LevelDB, HDFS, etc., will be supported later). Its `ComputeFeature` function reads
+batchsize (string:key, string:value) tuples. Each tuple is parsed by a `Parse` function
+implemented by its subclasses.
+
+The configuration for this layer is in `store_conf`,
+
+    store_conf {
+      backend: # "kvfile" or "textfile"
+      path: # path to the data store
+      batchsize :
+      ...
+    }
+
+##### SingleLabelRecordLayer
+
+It is a subclass of StoreInputLayer. It assumes the (key, value) tuple loaded
+from a data store contains a feature vector (and a label) for one data instance.
+All feature vectors are of the same fixed length. The shape of one instance
+is configured through the `shape` field, e.g., the following configuration
+specifies the shape for the CIFAR10 images.
+
+    store_conf {
+      shape: 3  #channels
+      shape: 32 #height
+      shape: 32 #width
+    }
+
+It may do some preprocessing like [standardization](http://ufldl.stanford.edu/wiki/index.php/Data_Preprocessing).
+The data needed for preprocessing is loaded and parsed in a virtual function, which is implemented by
+its subclasses.
+
+##### RecordInputLayer
+
+It is a subclass of SingleLabelRecordLayer. It parses the value field from one
+tuple into a RecordProto, which is generated by Google Protobuf according
+to common.proto.  It can be used to store features for images (e.g., using the pixel field)
+or other objects (using the data field). The key field is not parsed.
+
+    type: kRecordInput
+    store_conf {
+      has_label: # default is true
+      ...
+    }
+
+##### CSVInputLayer
+
+It is a subclass of SingleLabelRecordLayer. The value field from one tuple is parsed
+as a CSV line (separated by comma). The first number would be parsed as a label if
+`has_label` is configured in `store_conf`. Otherwise, all numbers would be parsed
+into one row of the `data_` Blob.
+
+    type: kCSVInput
+    store_conf {
+      has_label: # default is true
+      ...
+    }
+
+##### ImagePreprocessLayer
+
+This layer does image preprocessing, e.g., cropping, mirroring and scaling, on
+the data Blob from its source layer. It replaces the deprecated RGBImageLayer, which
+works on the Record from ShardDataLayer, but it still uses the same configuration as
+RGBImageLayer:
+
+    type: kImagePreprocess
+    rgbimage_conf {
+      scale: float
+      cropsize: int  # cropping each image to keep the central part with this size
+      mirror: bool  # mirror the image by setting image[i,j]=image[i,len-j]
+      meanfile: "Image_Mean_File_Path"
+    }
+
+##### ShardDataLayer (Deprecated)
+Deprecated! Please use ProtoRecordInputLayer or CSVRecordInputLayer.
+
+[ShardDataLayer](../api/classsinga_1_1ShardDataLayer.html) is a subclass of DataLayer,
+which reads Records from disk file. The file should be created using
+[DataShard](../api/classsinga_1_1DataShard.html)
+class. With the data file prepared, users configure the layer as
+
+    type: kShardData
+    sharddata_conf {
+      path: "path to data shard folder"
+      batchsize: int
+      random_skip: int
+    }
+
+`batchsize` specifies the number of records to be trained for one mini-batch.
+The first `rand() % random_skip` `Record`s will be skipped at the first
+iteration. This is to enforce that different workers work on different Records.
+
+##### LMDBDataLayer (Deprecated)
+Deprecated! Please use ProtoRecordInputLayer or CSVRecordInputLayer.
+
+[LMDBDataLayer] is similar to ShardDataLayer, except that the Records are
+loaded from LMDB.
+
+    type: kLMDBData
+    lmdbdata_conf {
+      path: "path to LMDB folder"
+      batchsize: int
+      random_skip: int
+    }
+
+##### ParserLayer (Deprecated)
+Deprecated! Please use ProtoRecordInputLayer or CSVRecordInputLayer.
+
+It gets a vector of Records from DataLayer and parses their features into
+a Blob.
+
+    virtual void ParseRecords(Phase phase, const vector<Record>& records, Blob<float>* blob) = 0;
+
+
+##### LabelLayer (Deprecated)
+Deprecated! Please use ProtoRecordInputLayer or CSVRecordInputLayer.
+
+[LabelLayer](../api/classsinga_1_1LabelLayer.html) is a subclass of ParserLayer.
+It parses a single label from each Record. Consequently, it
+will put $b$ (mini-batch size) values into the Blob. It has no specific configuration fields.
+
+
+##### MnistImageLayer (Deprecated)
+Deprecated! Please use ProtoRecordInputLayer or CSVRecordInputLayer.
+
+[MnistImageLayer] is a subclass of ParserLayer. It parses the pixel values of
+each image from the MNIST dataset. The pixel
+values may be normalized as `x/norm_a - norm_b`. For example, if `norm_a` is
+set to 255 and `norm_b` is set to 0, then every pixel will be normalized into
+[0, 1].
+
+    type: kMnistImage
+    mnistimage_conf {
+      norm_a: float
+      norm_b: float
+    }
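+
+As a quick check of the normalization formula above, the following worked
+values simply evaluate `x/norm_a - norm_b` for the endpoints of the 8-bit pixel
+range. The second setting (`norm_a = 127.5`, `norm_b = 1`) is an illustrative
+choice that does not appear in the original text.
+
+```latex
+% f(x) = x / norm_a - norm_b, for a raw pixel value x in [0, 255]
+\begin{align*}
+\text{norm\_a} = 255,\ \text{norm\_b} = 0:   &\quad f(0) = 0,  & f(255) &= 1 \\
+\text{norm\_a} = 127.5,\ \text{norm\_b} = 1: &\quad f(0) = -1, & f(255) &= 1
+\end{align*}
+```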
+
+##### RGBImageLayer (Deprecated)
+Deprecated! Please use the ImagePreprocessLayer.
+
+[RGBImageLayer](../api/classsinga_1_1RGBImageLayer.html) is a subclass of ParserLayer.
+It parses the RGB values of one image from each Record. It may also
+apply some transformations, e.g., cropping, mirroring operations. If the
+`meanfile` is specified, it should point to a path that contains one Record for
+the mean of each pixel over all training images.
+
+    type: kRGBImage
+    rgbimage_conf {
+      scale: float
+      cropsize: int  # cropping each image to keep the central part with this size
+      mirror: bool  # mirror the image by setting image[i,j]=image[i,len-j]
+      meanfile: "Image_Mean_File_Path"
+    }
+
+##### PrefetchLayer
+
+[PrefetchLayer](../api/classsinga_1_1PrefetchLayer.html) embeds other input layers
+to do data prefetching. It launches a thread that calls the embedded layers to load and extract features,
+so that the I/O task and the computation task can run simultaneously.
+One example PrefetchLayer configuration is:
+
+    layer {
+      name: "prefetch"
+      type: kPrefetch
+      sublayers {
+        name: "data"
+        type: kShardData
+        sharddata_conf { }
+      }
+      sublayers {
+        name: "rgb"
+        type: kRGBImage
+        srclayers:"data"
+        rgbimage_conf { }
+      }
+      sublayers {
+        name: "label"
+        type: kLabel
+        srclayers: "data"
+      }
+      exclude:kTest
+    }
+
+The layers on top of the PrefetchLayer should use the names of the embedded
+layers as their source layers. For example, "rgb" and "label" (rather than
+"prefetch") should be listed in the `srclayers` field of other layers.
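+
+A minimal sketch of how layers on top of the PrefetchLayer might refer to the
+embedded layers by name. The layer names "conv1", "fc" and "loss" are
+hypothetical, and the assumption that the loss layer takes the prediction and
+the label as its two source layers is not stated in the text above.
+
+```
+layer {
+  name: "conv1"            # hypothetical layer consuming the images
+  type: kConvolution
+  srclayers: "rgb"         # name of the embedded layer, not "prefetch"
+  convolution_conf { }
+  param { }
+  param { }
+}
+layer {
+  name: "loss"             # hypothetical loss layer
+  type: kSoftmaxLoss
+  srclayers: "fc"          # hypothetical layer producing the predictions
+  srclayers: "label"       # name of the embedded label layer
+}
+```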
+
+
+#### Output Layers
+
+Output layers get data from their source layers and write them to persistent storage,
+e.g., disk files or HDFS (to be supported).
+
+##### RecordOutputLayer
+
+This layer gets data (and label if it is available) from its source layer and converts it into records of type
+RecordProto. Records are written as (key = instance No., value = serialized record) tuples into Store, e.g., KVFile. The configuration of this layer
+should include the specifics of the Store backend via `store_conf`.
+
+    layer {
+      name: "output"
+      type: kRecordOutput
+      srclayers:
+      store_conf {
+        backend: "kvfile"
+        path:
+      }
+    }
+
+##### CSVOutputLayer
+This layer gets data (and label if it is available) from its source layer and converts it into
+a string per instance with fields separated by commas (i.e., CSV format). The shape information
+is not kept in the string. All strings are written into
+Store, e.g., text file. The configuration of this layer should include the specifics of the Store backend via `store_conf`.
+
+    layer {
+      name: "output"
+      type: kCSVOutput
+      srclayers:
+      store_conf {
+        backend: "textfile"
+        path:
+      }
+    }
+
+#### Neuron Layers
+
+Neuron layers conduct feature transformations.
+
+##### ConvolutionLayer
+
+[ConvolutionLayer](../api/classsinga_1_1ConvolutionLayer.html) conducts convolution transformation.
+
+    type: kConvolution
+    convolution_conf {
+      num_filters: int
+      kernel: int
+      stride: int
+      pad: int
+    }
+    param { } # weight/filter matrix
+    param { } # bias vector
+
+All four fields are integers: `num_filters` is the number of filters applied;
+`kernel` is the size of the convolution kernel (equal width and height);
+`stride` is the distance between successive applications of a filter;
+and `pad` is the width, in pixels, of the border of zeros added around the
+input.
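+
+To see how these fields interact, the output size along one spatial dimension
+can be estimated with the standard convolution arithmetic below. This is a
+sketch under the assumption that SINGA rounds down as most frameworks do; the
+symbols $H_{in}$ and $H_{out}$ are introduced here for illustration only.
+
+```latex
+H_{out} = \left\lfloor \frac{H_{in} + 2\,\text{pad} - \text{kernel}}{\text{stride}} \right\rfloor + 1
+
+% Example: a 32x32 CIFAR10 image with kernel = 5, pad = 2, stride = 1
+% H_{out} = \lfloor (32 + 4 - 5) / 1 \rfloor + 1 = 32
+% Example: the same image with kernel = 3, pad = 0, stride = 2
+% H_{out} = \lfloor (32 + 0 - 3) / 2 \rfloor + 1 = 15
+```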
+
+##### InnerProductLayer
+
+[InnerProductLayer](../api/classsinga_1_1InnerProductLayer.html) is fully connected with its (single) source layer.
+Typically, it has two parameter fields, one for the weight matrix and the other
+for the bias vector. It applies an affine transformation to the feature of the source layer:
+it multiplies the feature by the weight matrix and then adds the bias vector.
+
+    type: kInnerProduct
+    innerproduct_conf {
+      num_output: int
+    }
+    param { } # weight matrix
+    param { } # bias vector
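+
+Written out, the transformation described above takes the following form. The
+row-major layout (one feature vector per row) and the symbols $X$, $W$,
+$\mathbf{v}$ and $d$ are assumptions made for illustration; they are not taken
+from the SINGA source.
+
+```latex
+% X: b x d input features (b = mini-batch size, d = input dimension)
+% W: d x num_output weight matrix, v: bias vector of length num_output
+Y = XW + \mathbf{1}\mathbf{v}^{\top}, \qquad
+X \in \mathbb{R}^{b \times d},\;
+W \in \mathbb{R}^{d \times \text{num\_output}},\;
+\mathbf{v} \in \mathbb{R}^{\text{num\_output}}
+```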
+
+
+##### PoolingLayer
+
+[PoolingLayer](../api/classsinga_1_1PoolingLayer.html) summarizes (by averaging or sub-sampling) the
+feature vectors from the source layer.
+
+    type: kPooling
+    pooling_conf {
+      pool: AVE|MAX # choose between Average Pooling and Max Pooling
+      kernel: int   # size of the kernel filter
+      pad: int      # the padding size
+      stride: int   # the step length of the filter
+    }
+
+The pooling layer has two methods: Average Pooling and Max Pooling.
+Use the enum AVE and MAX to choose the method.
+
+  * Max Pooling selects the maximum value of each filtering area as one point of the
+    result feature blob.
+  * Average Pooling takes the average of all values in each filtering area as one point of the
+    result feature blob.
+
+##### ReLULayer
+
+[ReLULayer](../api/classsinga_1_1ReLULayer.html) has rectified linear neurons, which conduct the following
+transformation: `f(x) = max(0, x)`. It has no specific configuration fields.
+
+##### STanhLayer
+
+[STanhLayer](../api/classsinga_1_1TanhLayer.html) uses the scaled tanh as activation function, i.e., `f(x)=1.7159047* tanh(0.6666667 * x)`.
+It has no specific configuration fields.
+
+##### SigmoidLayer
+
+[SigmoidLayer] uses the sigmoid (or logistic) as activation function, i.e.,
+`f(x)=sigmoid(x)`.  It has no specific configuration fields.
+
+
+##### Dropout Layer
+[DropoutLayer](../api/classsinga_1_1DropoutLayer.html) randomly drops out some of its inputs.
+This scheme helps prevent deep learning models from over-fitting.
+
+    type: kDropout
+    dropout_conf {
+      dropout_ratio: float # dropout probability
+    }
+
+##### LRNLayer
+[LRNLayer](../api/classsinga_1_1LRNLayer.html), (Local Response Normalization), normalizes over the channels.
+
+    type: kLRN
+    lrn_conf {
+      local_size: int
+      alpha: float  # scaling parameter
+      beta: float   # exponent
+    }
+
+`local_size` specifies the number of adjacent channels that are summed up.
+For `WITHIN_CHANNEL`, it is the side length of the spatial region that is summed up.
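+
+For reference, the cross-channel normalization is commonly written as below,
+following the usual Local Response Normalization formulation. This is a sketch,
+not SINGA's confirmed implementation: the additive constant $k$ and the exact
+scaling of $\alpha$ (some frameworks divide it by `local_size`) are assumptions,
+and $a$, $z$, $N$ and $n$ are symbols introduced here.
+
+```latex
+% a^i_{x,y}: input activation of channel i at position (x, y)
+% z^i_{x,y}: normalized output, N: number of channels, n = local_size
+z^{i}_{x,y} = \frac{a^{i}_{x,y}}
+  {\left( k + \alpha \sum_{j=\max(0,\, i-\lfloor n/2 \rfloor)}^{\min(N-1,\, i+\lfloor n/2 \rfloor)}
+  \left( a^{j}_{x,y} \right)^{2} \right)^{\beta}}
+```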
+
+
+#### Loss Layers
+
+Loss layers measure the objective training loss.
+
+##### SoftmaxLossLayer
+
+[SoftmaxLossLayer](../api/classsinga_1_1SoftmaxLossLayer.html) is a combination of the Softmax transformation and
+Cross-Entropy loss. It first applies Softmax to get a prediction probability
+for each output unit (neuron) and then computes the cross-entropy against the ground truth.
+It is generally used as the final layer to generate labels for classification tasks.
+
+    type: kSoftmaxLoss
+    softmaxloss_conf {
+      topk: int
+    }
+
+The configuration field `topk` selects the `topk` labels with the highest
+probabilities as the prediction results, since it is tedious for users to view the
+prediction probability of every label.
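+
+The two steps combined by this layer can be written as follows. These are the
+textbook definitions of Softmax and cross-entropy; the symbols $z$, $p$, $y$
+and $K$, and the averaging over the mini-batch, are assumptions made for
+illustration rather than details taken from the SINGA source.
+
+```latex
+% z_k: input of output unit k, K: number of classes, y: ground-truth label
+p_k = \frac{\exp(z_k)}{\sum_{j=1}^{K} \exp(z_j)}, \qquad
+\ell = -\log p_{y}
+
+% averaged over a mini-batch of b instances
+L = -\frac{1}{b} \sum_{i=1}^{b} \log p^{(i)}_{y^{(i)}}
+```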
+
+#### ConnectionLayer
+
+Subclasses of ConnectionLayer are utility layers that connect other layers due
+to neural net partitioning or other cases.
+
+##### ConcateLayer
+
+[ConcateLayer](../api/classsinga_1_1ConcateLayer.html) connects to more than one source layer and concatenates their feature
+blobs along a given dimension.
+
+    type: kConcate
+    concate_conf {
+      concate_dim: int  # the dimension to concatenate along
+    }
+
+##### SliceLayer
+
+[SliceLayer](../api/classsinga_1_1SliceLayer.html) connects to more than one destination layer and slices its feature
+blob along a given dimension.
+
+    type: kSlice
+    slice_conf {
+      slice_dim: int
+    }
+
+##### SplitLayer
+
+[SplitLayer](../api/classsinga_1_1SplitLayer.html) connects to more than one destination layer and replicates its
+feature blob.
+
+    type: kSplit
+    split_conf {
+      num_splits: int
+    }
+
+##### BridgeSrcLayer & BridgeDstLayer
+
+[BridgeSrcLayer](../api/classsinga_1_1BridgeSrcLayer.html) &
+[BridgeDstLayer](../api/classsinga_1_1BridgeDstLayer.html) are utility layers assisting data (e.g., feature or
+gradient) transferring due to neural net partitioning. These two layers are
+added implicitly. Users typically do not need to configure them in their neural
+net configuration.
+
+### OutputLayer
+
+It writes the prediction results or the extracted features into a file, an HTTP stream
+or other places. Currently SINGA has not implemented any specific output layer.
+
+## Advanced user guide
+
+The base Layer class is introduced in this section, followed by how to
+implement a new Layer subclass.
+
+### Base Layer class
+
+#### Members
+
+    LayerProto layer_conf_;
+    Blob<float> data_, grad_;
+    vector<AuxType> aux_data_;
+
+The base layer class keeps the user configuration in `layer_conf_`.
+Almost all layers have $b$ (mini-batch size) feature vectors, which are stored
+in the `data_` [Blob](../api/classsinga_1_1Blob.html) (A Blob is a chunk of memory space, proposed in
+[Caffe](http://caffe.berkeleyvision.org/)).
+There are layers without feature vectors; instead, they share the data from
+source layers.
+The `grad_` Blob is for storing the gradients of the
+objective loss w.r.t. the `data_` Blob. It is necessary in [BP algorithm](../api/classsinga_1_1BPWorker.html),
+hence we put it as a member of the base class. For [CD algorithm](../api/classsinga_1_1CDWorker.html), the `grad_`
+field is not used; instead, the layers for the RBM model may have a Blob for the positive
+phase feature and a Blob for the negative phase feature. For a recurrent layer
+in RNN, one row of the feature blob corresponds to the feature of one internal layer.
+The `aux_data_` stores the auxiliary data, e.g., image label (set `AuxType` to int).
+If images have a variable number of labels, `AuxType` can be defined as `vector<int>`.
+Currently, `AuxType` is hard-coded to int. It will be added as a template argument of the Layer class later.
+
+If a layer has parameters, these parameters are declared using type
+[Param](param.html). Since some layers do not have
+parameters, we do not declare any `Param` in the base layer class.
+
+#### Functions
+
+    virtual void Setup(const LayerProto& conf, const vector<Layer*>& srclayers);
+    virtual void ComputeFeature(int flag, const vector<Layer*>& srclayers) = 0;
+    virtual void ComputeGradient(int flag, const vector<Layer*>& srclayers) = 0;
+
+The `Setup` function reads user configuration, i.e. `conf`, and information
+from source layers, e.g., mini-batch size,  to set the
+shape of the `data_` (and `grad_`) field as well
+as some other layer specific fields.
+<!---
+If `npartitions` is larger than 1, then
+users need to reduce the sizes of `data_`, `grad_` Blobs or Param objects. For
+example, if the `partition_dim=0` and there is no source layer, e.g., this
+layer is a (bottom) data layer, then its `data_` and `grad_` Blob should have
+`b/npartitions` feature vectors; If the source layer is also partitioned on
+dimension 0, then this layer should have the same number of feature vectors as
+the source layer. More complex partition cases are discussed in
+[Neural net partitioning](neural-net.html#neural-net-partitioning). Typically, the
+Setup function just set the shapes of `data_` Blobs and Param objects.
+-->
+Memory will not be allocated until computation over the data structure happens.
+
+The `ComputeFeature` function evaluates the feature blob by transforming (e.g.
+convolution and pooling) features from the source layers.  `ComputeGradient`
+computes the gradients of parameters associated with this layer.  These two
+functions are invoked by the [TrainOneBatch](train-one-batch.html)
+function during training. Hence, they should be consistent with the
+`TrainOneBatch` function. Particularly, for feed-forward and RNN models, they are
+trained using [BP algorithm](train-one-batch.html#back-propagation),
+which requires each layer's `ComputeFeature`
+function to compute `data_` based on source layers, and requires each layer's
+`ComputeGradient` to compute gradients of parameters and source layers'
+`grad_`. For energy models, e.g., RBM, they are trained by
+[CD algorithm](train-one-batch.html#contrastive-divergence), which
+requires each layer's `ComputeFeature` function to compute the feature vectors
+for the positive phase or negative phase depending on the `phase` argument, and
+requires the `ComputeGradient` function to only compute parameter gradients.
+For some layers, e.g., loss layer or output layer, they can put the loss or
+prediction result into the `metric` argument, which will be averaged and
+displayed periodically.
+
+### Implementing a new Layer subclass
+
+Users can extend the Layer class or other subclasses to implement their own feature transformation
+logic as long as the two virtual functions are overridden to be consistent with
+the `TrainOneBatch` function. The `Setup` function may also be overridden to
+read specific layer configuration.
+
+The [RNNLM](rnn.html) provides a couple of user-defined layers. You can refer to them as examples.
+
+#### Layer specific protocol message
+
+To implement a new layer, the first step is to define the layer-specific
+configuration. Suppose the new layer is `FooLayer`; its layer-specific
+Google protocol buffer message `FooLayerProto` should be defined as
+
+    // in user.proto
+    package singa;
+    import "job.proto";
+    message FooLayerProto {
+      optional int32 a = 1;  // specific fields to the FooLayer
+    }
+
+In addition, users need to extend the original `LayerProto` (defined in job.proto of SINGA)
+to include the `foo_conf` as follows.
+
+    extend LayerProto {
+      optional FooLayerProto foo_conf = 101;  // unique field id, reserved for extensions
+    }
+
+If there are multiple new layers, then each layer that has specific
+configurations would have a `<type>_conf` field and takes one unique extension number.
+SINGA has reserved enough extension numbers, e.g., starting from 101 to 1000.
+
+    // job.proto of SINGA
+    message LayerProto {
+      ...
+      extensions 101 to 1000;
+    }
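+
+As a sketch of how a second user-defined layer would take its own extension
+number, the fragment below adds another message next to `FooLayerProto`. The
+names `BarLayerProto`, `bar_conf` and the field `ratio` are hypothetical and
+only illustrate the numbering rule described above.
+
+```
+// in user.proto, alongside FooLayerProto (all Bar* names are hypothetical)
+message BarLayerProto {
+  optional float ratio = 1;  // a field specific to the hypothetical BarLayer
+}
+
+extend LayerProto {
+  // each extension takes a distinct number in the reserved range 101 to 1000;
+  // 101 is already used by foo_conf
+  optional BarLayerProto bar_conf = 102;
+}
+```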
+
+With user.proto defined, users can use
+[protoc](https://developers.google.com/protocol-buffers/) to generate the `user.pb.cc`
+and `user.pb.h` files.  In users' code, the extension fields can be accessed via,
+
+    auto conf = layer_conf_.GetExtension(foo_conf);
+    int a = conf.a();
+
+When defining configurations of the new layer (in job.conf), users should use
+`user_type` for its layer type instead of `type`. In addition, `foo_conf`
+should be enclosed in brackets.
+
+    layer {
+      name: "foo"
+      user_type: "kFooLayer"  # Note user_type of user-defined layers is string
+      [foo_conf] {      # Note there is a pair of [] for extension fields
+        a: 10
+      }
+    }
+
+#### New Layer subclass declaration
+
+The new layer subclass can be implemented like the built-in layer subclasses.
+
+    class FooLayer : public singa::Layer {
+     public:
+      void Setup(const LayerProto& conf, const vector<Layer*>& srclayers) override;
+      void ComputeFeature(int flag, const vector<Layer*>& srclayers) override;
+      void ComputeGradient(int flag, const vector<Layer*>& srclayers) override;
+
+     private:
+      //  members
+    };
+
+Users must override the two virtual functions to be called by the
+`TrainOneBatch` for either BP or CD algorithm. Typically, the `Setup` function
+will also be overridden to initialize some members. The user configured fields
+can be accessed through `layer_conf_` as shown in the above paragraphs.
+
+#### New Layer subclass registration
+
+The newly defined layer should be registered in [main.cc](http://singa.incubator.apache.org/docs/programming-guide) by adding
+
+    driver.RegisterLayer<FooLayer, std::string>("kFooLayer"); // "kFooLayer" should be matched to layer configurations in job.conf.
+
+After that, the [NeuralNet](neural-net.html) can create instances of the new Layer subclass.

Added: incubator/singa/site/trunk/content/markdown/docs/jp/lmdb.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/jp/lmdb.md?rev=1724348&view=auto
==============================================================================
    (empty)

Added: incubator/singa/site/trunk/content/markdown/docs/jp/mesos.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/jp/mesos.md?rev=1724348&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/jp/mesos.md (added)
+++ incubator/singa/site/trunk/content/markdown/docs/jp/mesos.md Wed Jan 13 03:46:19 2016
@@ -0,0 +1,84 @@
+# Distributed Training on Mesos
+
+This guide explains how to start SINGA distributed training on a Mesos cluster. It assumes that both Mesos and HDFS are already running, and every node has SINGA installed.
+We assume the architecture depicted below, in which the cluster nodes are Docker containers. Refer to the [Docker guide](docker.html) for details of how to start individual nodes and set up network connections between them (make sure [weave](http://weave.works/guides/weave-docker-ubuntu-simple.html) is running at each node, and the cluster's headnode is running in container `node0`).
+
+![Nothing](http://www.comp.nus.edu.sg/~dinhtta/files/singa_mesos.png)
+
+---
+
+## Start HDFS and Mesos
+Go inside each container, using:
+````
+docker exec -it nodeX /bin/bash
+````
+and configure it as follows:
+
+* On container `node0`
+
+        hadoop namenode -format
+        hadoop-daemon.sh start namenode
+        /opt/mesos-0.22.0/build/bin/mesos-master.sh --work_dir=/opt --log_dir=/opt --quiet > /dev/null &
+        zk-service.sh start
+
+* On container `node1, node2, ...`
+
+        hadoop-daemon.sh start datanode
+        /opt/mesos-0.22.0/build/bin/mesos-slave.sh --master=node0:5050 --log_dir=/opt --quiet > /dev/null &
+
+To check if the setup has been successful, check that HDFS namenode has registered `N` datanodes, via:
+
+````
+hadoop dfsadmin -report
+````
+
+#### Mesos logs
+Mesos logs are stored at `/opt/lt-mesos-master.INFO` on `node0` and `/opt/lt-mesos-slave.INFO` at other nodes.
+
+---
+
+## Starting SINGA training on Mesos
+Assuming that Mesos and HDFS are already running, a SINGA job can be launched from **any** container.
+
+#### Launching job
+
+1. Log in to any container, then
+        cd incubator-singa/tool/mesos
+<a name="job_start"></a>
+2. Check that configuration files are correct:
+    + `scheduler.conf` contains information about the master nodes
+    + `singa.conf` contains information about Zookeeper node0
+    + Job configuration file `job.conf` **contains full path to the examples directories (NO RELATIVE PATH!).**
+3. Start the job:
+    + If starting for the first time:
+
+	          ./scheduler <job config file> -scheduler_conf <scheduler config file> -singa_conf <SINGA config file>
+    + If not the first time:
+
+	          ./scheduler <job config file>
+
+**Notes.** Each running job is given a `frameworkID`. Look for the log message of the form:
+
+             Framework registered with XXX-XXX-XXX-XXX-XXX-XXX
+
+#### Monitoring and Debugging
+
+Each Mesos job is given a `frameworkID`, and a *sandbox* directory is created for each job.
+The directory is located under the specified `work_dir` (`/tmp/mesos` by default). For example, the errors
+during SINGA execution can be found at:
+
+            /tmp/mesos/slaves/xxxxx-Sx/frameworks/xxxxx/executors/SINGA_x/runs/latest/stderr
+
+Other artifacts, like files downloaded from HDFS (`job.conf`) and `stdout` can be found in the same
+directory.
+
+#### Stopping
+
+There are two ways to kill the running job:
+
+1. If the scheduler is running in the foreground, simply kill it (using `Ctrl-C`, for example).
+
+2. If the scheduler is running in the background, kill it using Mesos's REST API:
+
+          curl -d "frameworkId=XXX-XXX-XXX-XXX-XXX-XXX" -X POST http://<master>/master/shutdown
+

Added: incubator/singa/site/trunk/content/markdown/docs/jp/mlp.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/jp/mlp.md?rev=1724348&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/jp/mlp.md (added)
+++ incubator/singa/site/trunk/content/markdown/docs/jp/mlp.md Wed Jan 13 03:46:19 2016
@@ -0,0 +1,195 @@
+# MLP Example
+
+---
+
+Multilayer perceptron (MLP) is a subclass of feed-forward neural networks.
+An MLP typically consists of multiple directly connected layers, with each layer fully
+connected to the next one. In this example, we will use SINGA to train a
+[simple MLP model proposed by Ciresan](http://arxiv.org/abs/1003.0358)
+for classifying handwritten digits from the [MNIST dataset](http://yann.lecun.com/exdb/mnist/).
+
+## Running instructions
+
+Please refer to the [installation](installation.html) page for
+instructions on building SINGA, and the [quick start](quick-start.html)
+for instructions on starting zookeeper.
+
+We have provided scripts for preparing the training and test datasets in *examples/mnist/*.
+
+    # in examples/mnist
+    $ cp Makefile.example Makefile
+    $ make download
+    $ make create
+
+After the datasets are prepared, we start the training by
+
+    ./bin/singa-run.sh -conf examples/mnist/job.conf
+
+After it is started, you should see output like
+
+    Record job information to /tmp/singa-log/job-info/job-1-20150817-055231
+    Executing : ./singa -conf /xxx/incubator-singa/examples/mnist/job.conf -singa_conf /xxx/incubator-singa/conf/singa.conf -singa_job 1
+    E0817 07:15:09.211885 34073 cluster.cc:51] proc #0 -> 192.168.5.128:49152 (pid = 34073)
+    E0817 07:15:14.972231 34114 server.cc:36] Server (group = 0, id = 0) start
+    E0817 07:15:14.972520 34115 worker.cc:134] Worker (group = 0, id = 0) start
+    E0817 07:15:24.462602 34073 trainer.cc:373] Test step-0, loss : 2.341021, accuracy : 0.109100
+    E0817 07:15:47.341076 34073 trainer.cc:373] Train step-0, loss : 2.357269, accuracy : 0.099000
+    E0817 07:16:07.173364 34073 trainer.cc:373] Train step-10, loss : 2.222740, accuracy : 0.201800
+    E0817 07:16:26.714855 34073 trainer.cc:373] Train step-20, loss : 2.091030, accuracy : 0.327200
+    E0817 07:16:46.590946 34073 trainer.cc:373] Train step-30, loss : 1.969412, accuracy : 0.442100
+    E0817 07:17:06.207080 34073 trainer.cc:373] Train step-40, loss : 1.865466, accuracy : 0.514800
+    E0817 07:17:25.890033 34073 trainer.cc:373] Train step-50, loss : 1.773849, accuracy : 0.569100
+    E0817 07:17:51.208935 34073 trainer.cc:373] Test step-60, loss : 1.613709, accuracy : 0.662100
+    E0817 07:17:53.176766 34073 trainer.cc:373] Train step-60, loss : 1.659150, accuracy : 0.652600
+    E0817 07:18:12.783370 34073 trainer.cc:373] Train step-70, loss : 1.574024, accuracy : 0.666000
+    E0817 07:18:32.904942 34073 trainer.cc:373] Train step-80, loss : 1.529380, accuracy : 0.670500
+    E0817 07:18:52.608111 34073 trainer.cc:373] Train step-90, loss : 1.443911, accuracy : 0.703500
+    E0817 07:19:12.168465 34073 trainer.cc:373] Train step-100, loss : 1.387759, accuracy : 0.721000
+    E0817 07:19:31.855865 34073 trainer.cc:373] Train step-110, loss : 1.335246, accuracy : 0.736500
+    E0817 07:19:57.327133 34073 trainer.cc:373] Test step-120, loss : 1.216652, accuracy : 0.769900
+
+After training for a certain number of steps (depending on the configuration) or when the job is
+finished, SINGA will [checkpoint](checkpoint.html) the model parameters.
+
+## Details
+
+To train a model in SINGA, you need to prepare the datasets,
+and a job configuration which specifies the neural net structure, training
+algorithm (BP or CD), SGD update algorithm (e.g. Adagrad),
+number of training/test steps, etc.
+
+### Data preparation
+
+Before using SINGA, you need to write a program that converts your dataset
+into a format that SINGA can read. Please refer to the
+[Data Preparation](data.html) page for details about preparing
+the MNIST dataset.
+
+
+### Neural net
+
+<div style = "text-align: center">
+<img src = "../images/example-mlp.png" style = "width: 230px">
+<br/><strong>Figure 1 - Net structure of the MLP example. </strong></img>
+</div>
+
+
+Figure 1 shows the structure of the simple MLP model, which is constructed following
+[Ciresan's paper](http://arxiv.org/abs/1003.0358). The dashed circle contains
+two layers which represent one feature transformation stage. There are 6 such
+stages in total. The sizes of the [InnerProductLayer](layer.html#innerproductlayer)s in these stages decrease as
+2500->2000->1500->1000->500->10.
+
+Next we follow the guides in the [neural net page](neural-net.html)
+and the [layer page](layer.html) to write the neural net configuration.
+
+* We configure an input layer to read the training/testing records from a disk file.
+
+        layer {
+            name: "data"
+            type: kRecordInput
+            store_conf {
+              backend: "kvfile"
+              path: "examples/mnist/train_data.bin"
+              random_skip: 5000
+              batchsize: 64
+              shape: 784
+              std_value: 127.5
+              mean_value: 127.5
+             }
+             exclude: kTest
+          }
+
+        layer {
+            name: "data"
+            type: kRecordInput
+            store_conf {
+              backend: "kvfile"
+              path: "examples/mnist/test_data.bin"
+              batchsize: 100
+              shape: 784
+              std_value: 127.5
+              mean_value: 127.5
+             }
+             exclude: kTrain
+          }
+
+
+* All [InnerProductLayer](layer.html#innerproductlayer)s are configured similarly, e.g.,
+
+        layer{
+          name: "fc1"
+          type: kInnerProduct
+          srclayers:"data"
+          innerproduct_conf{
+            num_output: 2500
+          }
+          param{
+            name: "w1"
+            ...
+          }
+          param{
+            name: "b1"
+            ..
+          }
+        }
+
+    with the `num_output` decreasing from 2500 to 10.
+
+* A [STanhLayer](layer.html#stanhlayer) is connected to every InnerProductLayer
+except the last one. It transforms the feature via a scaled tanh function.
+
+        layer{
+          name: "tanh1"
+          type: kSTanh
+          srclayers:"fc1"
+        }
+
+* The final [Softmax loss layer](layer.html#softmaxloss) connects
+to the last InnerProductLayer (`fc6`) and to the `data` layer, which supplies the labels.
+
+        layer{
+          name: "loss"
+          type:kSoftmaxLoss
+          softmaxloss_conf{ topk:1 }
+          srclayers:"fc6"
+          srclayers:"data"
+        }
+
+### Updater
+
+The [normal SGD updater](updater.html#updater) is selected.
+The learning rate is multiplied by 0.997 every 60 steps (i.e., one epoch).
+
+    updater{
+      type: kSGD
+      learning_rate{
+        base_lr: 0.001
+        type : kStep
+        step_conf{
+          change_freq: 60
+          gamma: 0.997
+        }
+      }
+    }
+
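+As a rough illustration of the schedule configured above, the sketch below
+computes the learning rate at a given step. It assumes that `kStep` multiplies
+`base_lr` by `gamma` once every `change_freq` steps; `StepLearningRate` is a
+hypothetical helper written for this page, not a SINGA function.
+
```cpp
#include <cmath>

// Hypothetical helper (not SINGA's API): learning rate under a step schedule.
// Assumption: lr = base_lr * gamma^(step / change_freq), with integer division.
inline float StepLearningRate(float base_lr, float gamma, int change_freq,
                              int step) {
  const int num_changes = step / change_freq;  // completed decay intervals
  return base_lr * std::pow(gamma, static_cast<float>(num_changes));
}
```
+
+Under that assumption and with the values configured above, the rate stays at
+0.001 for steps 0 to 59 and drops to about 0.000997 at step 60.
+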
+### TrainOneBatch algorithm
+
+The MLP model is a feed-forward model, hence the
+[back-propagation algorithm](train-one-batch.html#back-propagation)
+is selected.
+
+    train_one_batch {
+      alg: kBP
+    }
+
+### Cluster setting
+
+The following configuration sets a single worker and server for training.
+The [Training frameworks](frameworks.html) page introduces configurations of a couple of distributed
+training frameworks.
+
+    cluster {
+      nworker_groups: 1
+      nserver_groups: 1
+    }

Added: incubator/singa/site/trunk/content/markdown/docs/jp/model-config.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/jp/model-config.md?rev=1724348&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/jp/model-config.md (added)
+++ incubator/singa/site/trunk/content/markdown/docs/jp/model-config.md Wed Jan 13 03:46:19 2016
@@ -0,0 +1,294 @@
+# Model Configuration
+
+---
+
+SINGA uses the stochastic gradient descent (SGD) algorithm to train parameters
+of deep learning models.  For each SGD iteration, there is a
+[Worker](architecture.html) computing
+gradients of parameters from the NeuralNet and an [Updater](updater.html) updating parameter
+values based on gradients. Hence the model configuration mainly consists of these
+three parts. We will introduce the NeuralNet, Worker and Updater in the
+following paragraphs and describe the configurations for them. All model
+configuration is specified in the model.conf file in the user-provided
+workspace folder. E.g., the [cifar10 example folder](https://github.com/apache/incubator-singa/tree/master/examples/cifar10)
+has a model.conf file.
+
+
+## NeuralNet
+
+### Uniform model (neuralnet) representation
+
+<img src = "../images/model-categorization.png" style = "width: 400px"> Fig. 1:
+Deep learning model categorization</img>
+
+Many deep learning models have been proposed. Fig. 1 is a categorization of
+popular deep learning models based on the layer connections. The
+[NeuralNet](https://github.com/apache/incubator-singa/blob/master/include/neuralnet/neuralnet.h)
+abstraction of SINGA consists of multiple layers joined by directed connections. This
+abstraction is able to represent models from all three categories.
+
+  * For the feed-forward models, their connections are already directed.
+
+  * For the RNN models, we unroll them into directed connections, as shown in
+  Fig. 3.
+
+  * For the undirected connections in RBM, DBM, etc., we replace each undirected
+  connection with two directed connections, as shown in Fig. 2.
+
+<div style = "height: 200px">
+<div style = "float:left; text-align: center">
+<img src = "../images/unroll-rbm.png" style = "width: 280px"> <br/>Fig. 2: Unroll RBM </img>
+</div>
+<div style = "float:left; text-align: center; margin-left: 40px">
+<img src = "../images/unroll-rnn.png" style = "width: 550px"> <br/>Fig. 3: Unroll RNN </img>
+</div>
+</div>
+
+Specifically, the NeuralNet class is defined in
+[neuralnet.h](https://github.com/apache/incubator-singa/blob/master/include/neuralnet/neuralnet.h) :
+
+    ...
+    vector<Layer*> layers_;
+    ...
+
+The Layer class is defined in
+[base_layer.h](https://github.com/apache/incubator-singa/blob/master/include/neuralnet/base_layer.h):
+
+    vector<Layer*> srclayers_, dstlayers_;
+    LayerProto layer_proto_;  // layer configuration, including meta info, e.g., name
+    ...
+
+
+The connections with other layers are kept in `srclayers_` and `dstlayers_`.
+Since there are many different feature transformations, there are many
+different Layer implementations correspondingly. Layers that have
+parameters in their feature transformation functions hold Param
+instances in the layer class, e.g.,
+
+    Param weight;
+
+
+### Configure the structure of a NeuralNet instance
+
+To train a deep learning model, the first step is to write the configurations
+for the model structure, i.e., the layers and connections for the NeuralNet.
+Like [Caffe](http://caffe.berkeleyvision.org/), we use the [Google Protocol
+Buffer](https://developers.google.com/protocol-buffers/) to define the
+configuration protocol. The
+[NetProto](https://github.com/apache/incubator-singa/blob/master/src/proto/model.proto)
+specifies the configuration fields for a NeuralNet instance,
+
+    message NetProto {
+      repeated LayerProto layer = 1;
+      ...
+    }
+
+The configuration is then
+
+    layer {
+      // layer configuration
+    }
+    layer {
+      // layer configuration
+    }
+    ...
+
+To configure the model structure, we just configure each layer involved in the model.
+
+    message LayerProto {
+      // the layer name used for identification
+      required string name = 1;
+      // source layer names
+      repeated string srclayers = 3;
+      // parameters, e.g., weight matrix or bias vector
+      repeated ParamProto param = 12;
+      // the layer type from the enum above
+      required LayerType type = 20;
+      // configuration for convolution layer
+      optional ConvolutionProto convolution_conf = 30;
+      // configuration for concatenation layer
+      optional ConcateProto concate_conf = 31;
+      // configuration for dropout layer
+      optional DropoutProto dropout_conf = 33;
+      ...
+    }
+
+A sample configuration for a feed-forward model looks like
+
+    layer {
+      name : "input"
+      type : kRecordInput
+    }
+    layer {
+      name : "conv"
+      type : kInnerProduct
+      srclayers : "input"
+      param {
+        // configuration for parameter
+      }
+      innerproduct_conf {
+        // configuration for this specific layer
+      }
+      ...
+    }
+
+The layer type list is defined in
+[LayerType](https://github.com/apache/incubator-singa/blob/master/src/proto/model.proto).
+One type (kFoo) corresponds to one child class of Layer (FooLayer) and one
+configuration field (foo_conf). All built-in layers are introduced in the [layer page](layer.html).
+
+## Worker
+
+At the beginning, the Worker will initialize the values of Param instances of
+each layer either randomly (according to a user-configured distribution) or
+by loading them from a [checkpoint file](checkpoint.html).  For each training iteration, the worker
+visits layers of the neural network to compute gradients of Param instances of
+each layer. Corresponding to the three categories of models, there are three
+different algorithms to compute the gradients of a neural network.
+
+  1. Back-propagation (BP) for feed-forward models
+  2. Back-propagation through time (BPTT) for recurrent neural networks
+  3. Contrastive divergence (CD) for RBM, DBM, etc.
+
+SINGA provides these three algorithms as three Worker implementations.
+Users only need to specify in the model.conf file which algorithm
+should be used. The configuration protocol is
+
+    message ModelProto {
+      ...
+      enum GradCalcAlg {
+        // BP algorithm for feed-forward models, e.g., CNN, MLP
+        kBP = 1;
+        // BPTT for recurrent neural networks
+        kBPTT = 2;
+        // CD algorithm for RBM, DBM, etc.
+        kCD = 3;
+      }
+      // gradient calculation algorithm
+      required GradCalcAlg alg = 8 [default = kBP];
+      ...
+    }
+
+These algorithms override the TrainOneBatch function of the Worker. E.g., the
+BPWorker implements it as
+
+    void BPWorker::TrainOneBatch(int step, Metric* perf) {
+      Forward(step, kTrain, train_net_, perf);
+      Backward(step, train_net_);
+    }
+
+The Forward function passes the raw input features of one mini-batch through
+all layers, and the Backward function visits the layers in reverse order to
+compute the gradients of the loss w.r.t. each layer's feature and each layer's
+Param objects. Different algorithms would visit the layers in different orders.
+Some may traverse the neural network multiple times, e.g., the CDWorker's
+TrainOneBatch function is:
+
+    void CDWorker::TrainOneBatch(int step, Metric* perf) {
+      PositivePhase(step, kTrain, train_net_, perf);
+      NegativePhase(step, kTrain, train_net_, perf);
+      GradientPhase(step, train_net_);
+    }
+
+Each `*Phase` function would visit all layers one or multiple times.
+All algorithms will finally call two functions of the Layer class:
+
+     /**
+      * Transform features from connected layers into features of this layer.
+      *
+      * @param phase kTrain, kTest, kPositive, etc.
+      */
+     virtual void ComputeFeature(Phase phase, Metric* perf) = 0;
+     /**
+      * Compute gradients for parameters (and connected layers).
+      *
+      * @param phase kTrain, kTest, kPositive, etc.
+      */
+     virtual void ComputeGradient(Phase phase) = 0;
+
+All [Layer implementations](layer.html) must implement the above two functions.
+
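+The visiting order used by the BP algorithm can be sketched with toy types.
+`ToyLayer`, `ToyPhase` and `TrainOneBatchBP` below are hypothetical stand-ins
+written for this page; SINGA's real `Layer`, `Phase` and `Metric` types carry
+much more state, and the real functions take more arguments.
+
```cpp
#include <string>
#include <vector>

// Toy stand-ins for illustration only (not SINGA's classes).
enum ToyPhase { kToyTrain, kToyTest };

class ToyLayer {
 public:
  ToyLayer(const std::string& name, std::vector<std::string>* log)
      : name_(name), log_(log) {}
  // Forward pass: record that this layer computed its features.
  void ComputeFeature(ToyPhase /*phase*/) { log_->push_back("F:" + name_); }
  // Backward pass: record that this layer computed its gradients.
  void ComputeGradient(ToyPhase /*phase*/) { log_->push_back("B:" + name_); }

 private:
  std::string name_;
  std::vector<std::string>* log_;  // shared visit log, owned by the caller
};

// BP-style step: visit the layers in order, then again in reverse order.
inline void TrainOneBatchBP(std::vector<ToyLayer>* layers) {
  for (auto& layer : *layers) layer.ComputeFeature(kToyTrain);
  for (auto it = layers->rbegin(); it != layers->rend(); ++it)
    it->ComputeGradient(kToyTrain);
}
```
+
+For a net listed as data, fc1, loss, the log shows the forward calls in that
+order followed by the gradient calls for loss, fc1, data.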
+
+## Updater
+
+Once the gradients of parameters are computed, the Updater will update
+parameter values.  There are many SGD variants for updating parameters, like
+[AdaDelta](http://arxiv.org/pdf/1212.5701v1.pdf),
+[AdaGrad](http://www.magicbroom.info/Papers/DuchiHaSi10.pdf),
+[RMSProp](http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf),
+[Nesterov](http://scholar.google.com/citations?view_op=view_citation&amp;hl=en&amp;user=DJ8Ep8YAAAAJ&amp;citation_for_view=DJ8Ep8YAAAAJ:hkOj_22Ku90C)
+and SGD with momentum. The core functions of the Updater are
+
+    /**
+     * Update parameter values based on gradients
+     * @param step training step
+     * @param param pointer to the Param object
+     * @param grad_scale scaling factor for the gradients
+     */
+    void Update(int step, Param* param, float grad_scale=1.0f);
+    /**
+     * @param step training step
+     * @return the learning rate for this step
+     */
+    float GetLearningRate(int step);
+
+SINGA provides several built-in updaters and learning rate change methods.
+Users can configure them according to the UpdaterProto
+
+    message UpdaterProto {
+      enum UpdaterType{
+        // normal SGD with momentum and weight decay
+        kSGD = 1;
+        // adaptive subgradient, http://www.magicbroom.info/Papers/DuchiHaSi10.pdf
+        kAdaGrad = 2;
+        // http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf
+        kRMSProp = 3;
+        // Nesterov first optimal gradient method
+        kNesterov = 4;
+      }
+      // updater type
+      required UpdaterType type = 1 [default=kSGD];
+      // configuration for RMSProp algorithm
+      optional RMSPropProto rmsprop_conf = 50;
+
+      enum ChangeMethod {
+        kFixed = 0;
+        kInverseT = 1;
+        kInverse = 2;
+        kExponential = 3;
+        kLinear = 4;
+        kStep = 5;
+        kFixedStep = 6;
+      }
+      // change method for learning rate
+      required ChangeMethod lr_change= 2 [default = kFixed];
+
+      optional FixedStepProto fixedstep_conf=40;
+      ...
+      optional float momentum = 31 [default = 0];
+      optional float weight_decay = 32 [default = 0];
+      // base learning rate
+      optional float base_lr = 34 [default = 0];
+    }
+
+
+## Other model configuration fields
+
+Some other important configuration fields for training a deep learning model are
+listed below:
+
+    // model name, e.g., "cifar10-dcnn", "mnist-mlp"
+    string name;
+    // displaying training info for every this number of iterations, default is 0
+    int32 display_freq;
+    // total num of steps/iterations for training
+    int32 train_steps;
+    // do test for every this number of training iterations, default is 0
+    int32 test_freq;
+    // run test for this number of steps/iterations, default is 0.
+    // The test dataset has test_steps * batchsize instances.
+    int32 test_steps;
+    // do checkpoint for every this number of training steps, default is 0
+    int32 checkpoint_freq;
+
+The [checkpoint and restore](checkpoint.html) page has details on checkpoint-related fields.

Added: incubator/singa/site/trunk/content/markdown/docs/jp/neural-net.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/jp/neural-net.md?rev=1724348&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/jp/neural-net.md (added)
+++ incubator/singa/site/trunk/content/markdown/docs/jp/neural-net.md Wed Jan 13 03:46:19 2016
@@ -0,0 +1,327 @@
+# Neural Net
+
+---
+
+`NeuralNet` in SINGA represents an instance of a user's neural net model. As the
+neural net typically consists of a set of layers, `NeuralNet` comprises
+a set of unidirectionally connected [Layer](layer.html)s.
+This page describes how to convert a user's neural net into
+the configuration of `NeuralNet`.
+
+<img src="../images/model-category.png" align="center" width="200px"/>
+<span><strong>Figure 1 - Categorization of popular deep learning models.</strong></span>
+
+## Net structure configuration
+
+Users configure the `NeuralNet` by listing all layers of the neural net and
+specifying each layer's source layer names. Popular deep learning models can be
+categorized as shown in Figure 1. The subsequent sections give details for each
+category.
+
+### Feed-forward models
+
+<div align = "left">
+<img src="../images/mlp-net.png" align="center" width="200px"/>
+<span><strong>Figure 2 - Net structure of a MLP model.</strong></span>
+</div>
+
+Feed-forward models, e.g., CNN and MLP, can easily get configured as their layer
+connections are directed without cycles. The
+configuration for the MLP model shown in Figure 2 is as follows,
+
+    net {
+      layer {
+        name : "data"
+        type : kData
+      }
+      layer {
+        name : "image"
+        type : kImage
+        srclayers: "data"
+      }
+      layer {
+        name : "label"
+        type : kLabel
+        srclayers: "data"
+      }
+      layer {
+        name : "hidden"
+        type : kHidden
+        srclayers: "image"
+      }
+      layer {
+        name : "softmax"
+        type : kSoftmaxLoss
+        srclayers: "hidden"
+        srclayers: "label"
+      }
+    }
+
+### Energy models
+
+<img src="../images/rbm-rnn.png" align="center" width="500px"/>
+<span><strong>Figure 3 - Convert connections in RBM and RNN.</strong></span>
+
+
+For energy models including RBM, DBM,
+etc., their connections are undirected (i.e., Category B). To represent these models using
+`NeuralNet`, users can simply replace each connection with two directed
+connections, as shown in Figure 3a. In other words, for each pair of connected layers, their source
+layer field should include each other's name.
+The full [RBM example](rbm.html) has a
+detailed neural net configuration for an RBM model, which looks like
+
+    net {
+      layer {
+        name : "vis"
+        type : kVisLayer
+        param {
+          name : "w1"
+        }
+        srclayers: "hid"
+      }
+      layer {
+        name : "hid"
+        type : kHidLayer
+        param {
+          name : "w2"
+          share_from: "w1"
+        }
+        srclayers: "vis"
+      }
+    }
+
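+The conversion described above, replacing each undirected connection with two
+directed ones, can be sketched as follows. `ToSrcLayers` is a hypothetical
+helper written for this page; it only derives the source-layer lists that a
+user would then write into the configuration by hand.
+
```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper (not part of SINGA): for every undirected connection
// (a, b), list a as a source of b and b as a source of a.
inline std::map<std::string, std::vector<std::string>> ToSrcLayers(
    const std::vector<std::pair<std::string, std::string>>& undirected) {
  std::map<std::string, std::vector<std::string>> srclayers;
  for (const auto& edge : undirected) {
    srclayers[edge.second].push_back(edge.first);  // directed edge a -> b
    srclayers[edge.first].push_back(edge.second);  // directed edge b -> a
  }
  return srclayers;
}
```
+
+For the single connection between `vis` and `hid`, the result lists `hid` as the
+source of `vis` and `vis` as the source of `hid`, matching the configuration above.
+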
+### RNN models
+
+For recurrent neural networks (RNN), users can remove the recurrent connections
+by unrolling the recurrent layer.  For example, in Figure 3b, the original
+layer is unrolled into a new layer with 4 internal layers. In this way, the
+model is like a normal feed-forward model, and thus can be configured similarly.
+The [RNN example](rnn.html) has a full neural net
+configuration for an RNN model.
+
+
+## Configuration for multiple nets
+
+Typically, a training job includes three neural nets for the
+training, validation and test phases respectively. The three neural nets share most
+layers except the data layer, loss layer or output layer, etc.  To avoid
+redundant configurations for the shared layers, users can use the `exclude`
+field to filter a layer in the neural net, e.g., the following layer will be
+filtered when creating the testing `NeuralNet`.
+
+
+    layer {
+      ...
+      exclude : kTest # filter this layer for creating test net
+    }
+
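+The effect of the `exclude` field can be sketched as a simple filter over the
+layer list. The types and names below (`NetPhase`, `LayerConf`,
+`LayersForPhase`) are hypothetical and only mirror the idea; they are not
+SINGA's configuration classes.
+
```cpp
#include <algorithm>
#include <string>
#include <vector>

// Illustrative phase names (assumed, not SINGA's enum).
enum NetPhase { kTrainPhase, kValidationPhase, kTestPhase };

// Hypothetical, simplified layer configuration.
struct LayerConf {
  std::string name;
  std::vector<NetPhase> exclude;  // phases in which this layer is dropped
};

// Return the names of the layers that are kept when building the net for
// `phase`, i.e., those that do not list `phase` in their exclude field.
inline std::vector<std::string> LayersForPhase(
    const std::vector<LayerConf>& confs, NetPhase phase) {
  std::vector<std::string> names;
  for (const auto& conf : confs) {
    const bool excluded =
        std::find(conf.exclude.begin(), conf.exclude.end(), phase) !=
        conf.exclude.end();
    if (!excluded) names.push_back(conf.name);
  }
  return names;
}
```
+
+A layer with an empty `exclude` list is kept for every phase, which is how the
+shared layers need to be written only once.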
+
+
+## Neural net partitioning
+
+A neural net can be partitioned in different ways to distribute the training
+over multiple workers.
+
+### Batch and feature dimension
+
+<img src="../images/partition_fc.png" align="center" width="400px"/>
+<span><strong>Figure 4 - Partitioning of a fully connected layer.</strong></span>
+
+
+Every layer's feature blob is considered a matrix whose rows are feature
+vectors. Thus, one layer can be split on two dimensions. Partitioning on
+dimension 0 (also called batch dimension) slices the feature matrix by rows.
+For instance, if the mini-batch size is 256 and the layer is partitioned into 2
+sub-layers, each sub-layer would have 128 feature vectors in its feature blob.
+Partitioning on this dimension has no effect on the parameters, as every
+[Param](param.html) object is replicated in the sub-layers. Partitioning on dimension
+1 (also called feature dimension) slices the feature matrix by columns. For
+example, suppose the original feature vector has 50 units, after partitioning
+into 2 sub-layers, each sub-layer would have 25 units. This partitioning may
+result in [Param](param.html) objects being split, as shown in
+Figure 4. Both the bias vector and weight matrix are
+partitioned into two sub-layers.
+
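+The arithmetic in the paragraph above can be written as a small helper. This is
+a sketch under the assumption that the dimension being split divides evenly by
+the number of sub-layers; `SubLayerShape` is a hypothetical name, not a SINGA
+function.
+
```cpp
#include <stdexcept>
#include <utility>

// Hypothetical helper: shape (rows, cols) of each sub-layer's feature blob
// after splitting a rows x cols feature matrix into n equal parts.
// partition_dim 0 slices rows (batch); partition_dim 1 slices columns (feature).
inline std::pair<int, int> SubLayerShape(int rows, int cols, int partition_dim,
                                         int n) {
  if (n <= 0) throw std::invalid_argument("n must be positive");
  if (partition_dim == 0) return std::make_pair(rows / n, cols);
  if (partition_dim == 1) return std::make_pair(rows, cols / n);
  return std::make_pair(rows, cols);  // any other value: left unpartitioned
}
```
+
+For a mini-batch of 256 feature vectors with 50 units each, splitting into 2
+sub-layers gives 128 x 50 blobs on the batch dimension and 256 x 25 blobs on the
+feature dimension.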
+
+### Partitioning configuration
+
+There are 4 partitioning schemes, whose configurations are given below,
+
+  1. Partitioning each single layer into sub-layers on batch dimension (see
+  below). It is enabled by configuring the partition dimension of the layer to
+  0, e.g.,
+
+          # with other fields omitted
+          layer {
+            partition_dim: 0
+          }
+
+  2. Partitioning each single layer into sub-layers on feature dimension (see
+  below).  It is enabled by configuring the partition dimension of the layer to
+  1, e.g.,
+
+          # with other fields omitted
+          layer {
+            partition_dim: 1
+          }
+
+  3. Partitioning all layers into different subsets. It is enabled by
+  configuring the location ID of a layer, e.g.,
+
+          # with other fields omitted
+          layer {
+            location: 1
+          }
+          layer {
+            location: 0
+          }
+
+
+  4. Hybrid partitioning of strategies 1, 2 and 3. The hybrid partitioning is
+  useful for large models. An example application is to implement the
+  [idea proposed by Alex](http://arxiv.org/abs/1404.5997).
+  Hybrid partitioning is configured like,
+
+          # with other fields omitted
+          layer {
+            location: 1
+          }
+          layer {
+            location: 0
+          }
+          layer {
+            partition_dim: 0
+            location: 0
+          }
+          layer {
+            partition_dim: 1
+            location: 0
+          }
+
+Currently SINGA supports strategy-2 well. Other partitioning strategies
+are under test and will be released in a later version.
+
+## Parameter sharing
+
+Parameters can be shared in three cases,
+
+  * sharing parameters among layers via user configuration. For example, the
+  visible layer and hidden layer of an RBM share the weight matrix, which is configured through
+  the `share_from` field as shown in the above RBM configuration. The
+  configurations must be the same (except name) for shared parameters.
+
+  * due to neural net partitioning, some `Param` objects are replicated into
+  different workers, e.g., partitioning one layer on batch dimension. These
+  workers share parameter values. SINGA controls this kind of parameter
+  sharing automatically; users do not need to do any configuration.
+
+  * the `NeuralNet`s for training and testing (and validation) share most layers,
+  and thus share `Param` values.
+
+If the shared `Param` instances reside in the same process (possibly in different
+threads), they use the same chunk of memory space for their values. But they
+would have different memory spaces for their gradients. In fact, their
+gradients will be averaged by the stub or server.
+
+## Advanced user guide
+
+### Creation
+
+    static NeuralNet* NeuralNet::Create(const NetProto& np, Phase phase, int num);
+
+The above function creates a `NeuralNet` for a given phase, and returns a
+pointer to the `NeuralNet` instance. The phase is in {kTrain,
+kValidation, kTest}. `num` is used for net partitioning which indicates the
+number of partitions.  Typically, a training job includes three neural nets for
+training, validation and test phase respectively. The three neural nets share most
+layers except the data layer, loss layer or output layer, etc. The `Create`
+function takes in the full net configuration including layers for training,
+validation and test.  It removes layers for phases other than the specified
+phase based on the `exclude` field in
+[layer configuration](layer.html):
+
+    layer {
+      ...
+      exclude : kTest # filter this layer for creating test net
+    }
+
+The filtered net configuration is passed to the constructor of `NeuralNet`:
+
+    NeuralNet::NeuralNet(NetProto netproto, int npartitions);
+
+The constructor first creates a graph representing the net structure in
+
+    Graph* NeuralNet::CreateGraph(const NetProto& netproto, int npartitions);
+
+Next, it creates a layer for each node and connects layers if their nodes are
+connected.
+
+    void NeuralNet::CreateNetFromGraph(Graph* graph, int npartitions);
+
+Since the `NeuralNet` instance may be shared among multiple workers, the
+`Create` function returns a pointer to the `NeuralNet` instance.
+
+### Parameter sharing
+
+`Param` sharing
+is enabled by first sharing the Param configuration (in `NeuralNet::Create`)
+to create two similar (e.g., the same shape) Param objects, and then calling
+(in `NeuralNet::CreateNetFromGraph`),
+
+    void Param::ShareFrom(const Param& from);
+
+It is also possible to share `Param`s of two nets, e.g., sharing parameters of
+the training net and the test net,
+
+    void NeuralNet::ShareParamsFrom(NeuralNet* other);
+
+It will call `Param::ShareFrom` for each Param object.
+
+### Access functions
+
+`NeuralNet` provides a couple of access functions to get the layers and params
+of the net:
+
+    const std::vector<Layer*>& layers() const;
+    const std::vector<Param*>& params() const ;
+    Layer* name2layer(string name) const;
+    Param* paramid2param(int id) const;
+
+
+### Partitioning
+
+
+#### Implementation
+
+SINGA partitions the neural net in `CreateGraph` function, which creates one
+node for each (partitioned) layer. For example, if one layer's partition
+dimension is 0 or 1, then it creates `npartitions` nodes for it; if the
+partition dimension is -1, a single node is created, i.e., no partitioning.
+Each node is assigned a partition (or location) ID. If the original layer is
+configured with a location ID, then the ID is assigned to each newly created node.
+These nodes are connected according to the connections of the original layers.
+Some connection layers will be added automatically.
+For instance, if two connected sub-layers are located at two
+different workers, then a pair of bridge layers is inserted to transfer the
+feature (and gradient) blob between them. When two layers are partitioned on
+different dimensions, a concatenation layer which concatenates feature rows (or
+columns) and a slice layer which slices feature rows (or columns) would be
+inserted. These connection layers help make the network communication and
+synchronization transparent to the users.
+
+#### Dispatching partitions to workers
+
+Each (partitioned) layer is assigned a location ID, based on which it is dispatched to one
+worker. Particularly, the pointer to the `NeuralNet` instance is passed
+to every worker within the same group, but each worker only computes over the
+layers that have the same partition (or location) ID as the worker's ID.  When
+every worker computes the gradients of the entire model parameters
+(strategy-1), we refer to this process as data parallelism.  When different
+workers compute the gradients of different parameters (strategy-2 or
+strategy-3), we call this process model parallelism.  The hybrid partitioning
+leads to hybrid parallelism where some workers compute the gradients of the
+same subset of model parameters while other workers compute on different model
+parameters.  For example, to implement the hybrid parallelism for the
+[DCNN model](http://arxiv.org/abs/1404.5997), we set `partition_dim = 0` for
+lower layers and `partition_dim = 1` for higher layers.
+
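+The dispatching rule above, where each worker only computes over the layers
+whose location ID equals its own ID, can be sketched as a filter. `PlacedLayer`
+and `LayersForWorker` are hypothetical names used for illustration; in SINGA the
+workers share one `NeuralNet` instance rather than separate lists.
+
```cpp
#include <string>
#include <vector>

// Hypothetical record of a (partitioned) layer and where it is placed.
struct PlacedLayer {
  std::string name;
  int location;  // partition (or location) ID assigned to the layer
};

// Names of the layers a given worker computes over: those whose location ID
// matches the worker's ID. All other layers are skipped by that worker.
inline std::vector<std::string> LayersForWorker(
    const std::vector<PlacedLayer>& net, int worker_id) {
  std::vector<std::string> names;
  for (const auto& layer : net) {
    if (layer.location == worker_id) names.push_back(layer.name);
  }
  return names;
}
```
+
+With one layer split into two nodes placed at locations 0 and 1, worker 0 and
+worker 1 each pick up exactly one of the two sub-layers.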

Added: incubator/singa/site/trunk/content/markdown/docs/jp/neuralnet-partition.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/jp/neuralnet-partition.md?rev=1724348&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/jp/neuralnet-partition.md (added)
+++ incubator/singa/site/trunk/content/markdown/docs/jp/neuralnet-partition.md Wed Jan 13 03:46:19 2016
@@ -0,0 +1,54 @@
+# Neural Net Partition
+
+---
+
+The purpose of partitioning a neural network is to distribute the partitions onto
+different working units (e.g., threads or nodes, called workers in this article)
+and parallelize the processing.
+Another reason for partitioning is to handle a large neural network which cannot be
+held in a single node. For instance, to train models against images with high
+resolution we need large neural networks (in terms of training parameters).
+
+Since *Layer* is the first-class citizen in SINGA, we partition the network at the
+layer level. Specifically, we support partitions at two levels. First, users can configure
+the location (i.e., worker ID) of each layer. In this way, users assign one worker
+for each layer. Secondly, for one layer, we can partition its neurons or partition
+the instances (e.g., images). They are called layer partition and data partition
+respectively. We illustrate the two types of partitions using a simple convolutional neural network.
+
+<img src="../images/conv-mnist.png" style="width: 220px"/>
+
+The above figure shows a convolutional neural network without any partition. It
+has 8 layers in total (one rectangle represents one layer). The first layer is
+DataLayer (data) which reads data from local disk files/databases (or HDFS). The second layer
+is a MnistLayer which parses the records from MNIST data to get the pixels of a batch
+of 8 images (each image is of size 28x28). The LabelLayer (label) parses the records to get the label
+of each image in the batch. The ConvolutionalLayer (conv1) transforms the input image to the
+shape of 8x27x27. The ReLULayer (relu1) conducts elementwise transformations. The PoolingLayer (pool1)
+sub-samples the images. The fc1 layer is fully connected with pool1 layer. It
+multiplies each image with a weight matrix to generate a 10-dimensional hidden feature which
+is then normalized by a SoftmaxLossLayer to get the prediction.
+
+<img src="../images/conv-mnist-datap.png" style="width: 1000px"/>
+
+The above figure shows the convolutional neural network after partitioning all layers
+except the DataLayer and ParserLayers, into 3 partitions using data partition.
+The red layers process 4 images of the batch; the black and blue layers process 2 images
+each. Some helper layers, i.e., SliceLayer, ConcateLayer, BridgeSrcLayer,
+BridgeDstLayer and SplitLayer, are added automatically by our partition algorithm.
+Layers of the same color reside in the same worker. Data is transferred
+across different workers at the boundary layers (i.e., BridgeSrcLayer and BridgeDstLayer),
+e.g., between s-slice-mnist-conv1 and d-slice-mnist-conv1.
+
+<img src="../images/conv-mnist-layerp.png" style="width: 1000px"/>
+
+The above figure shows the convolutional neural network after partitioning all layers
+except the DataLayer and ParserLayers, into 2 partitions using layer partition. We can
+see that each layer processes all 8 images from the batch. But different partitions process
+different parts of one image. For instance, the layer conv1-00 processes only 4 channels. The other
+4 channels are processed by conv1-01 which resides in another worker.
+
+
+Since the partition is done at the layer level, we can apply different partitions for
+different layers to get a hybrid partition for the whole neural network. Moreover,
+we can also specify the layer locations to locate different layers to different workers.

Added: incubator/singa/site/trunk/content/markdown/docs/jp/overview.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/jp/overview.md?rev=1724348&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/jp/overview.md (added)
+++ incubator/singa/site/trunk/content/markdown/docs/jp/overview.md Wed Jan 13 03:46:19 2016
@@ -0,0 +1,68 @@
+# Introduction
+
+---
+
+SINGA is a distributed deep learning platform for training deep learning models for large-scale data analysis.
+It is designed for intuitive programming based on the "layer" abstraction of the network that makes up a model.
+
+* It supports a variety of models, including feed-forward networks such as the Convolutional Neural Network, energy models such as the Restricted Boltzmann Machine, and Recurrent Neural Network models.
+
+* A variety of layers are implemented as built-in layers.
+
+* The SINGA architecture is designed to run synchronous, asynchronous, and hybrid training.
+
+* It supports neural network partitioning schemes (partitioning on the batch and on the features) for parallelizing the training of large models.
+
+
+## Goals
+
+Scalability: as a distributed system, use more resources to speed up training until a given accuracy is reached.
+
+Ease of use: simplify the tasks that are tedious for programmers but necessary for training large distributed models efficiently, such as data and model partitioning and network communication, and make deep, complex models and algorithms easy to implement.
+
+
+## Design principles
+
+Scalability is an important research problem in distributed deep learning.
+SINGA is designed to preserve the scalability of different training frameworks.
+* Synchronous: improves the effectiveness of a single training step.
+* Asynchronous: improves the convergence rate of training.
+* Hybrid: balances effectiveness and convergence rate to fit the available budget and resources (e.g., cluster size), which improves scalability.
+
+SINGA is designed for intuitive programming based on the "layer" abstraction of the networks in deep learning models. A variety of models can be implemented and trained.
+
+## System overview
+
+<img src="../../images/sgd.png" align="center" width="400px"/>
+<span><strong>Figure 1 - SGD flow</strong></span>
+
+Training a deep learning model means
+finding the optimal parameters of the transformation functions that generate the features used to accomplish a specific task (classification, prediction, etc.).
+The quality of the parameters is judged by a loss function such as [Cross-Entropy Loss](https://en.wikipedia.org/wiki/Cross_entropy). Because this function is generally non-linear or non-convex, a closed-form solution is hard to find.
+
+For this reason, Stochastic Gradient Descent (SGD) is used.
+As shown in Figure 1, the randomly initialized parameter values are updated iteratively so that the loss decreases.
+
+<img src="../../images/overview.png" align="center" width="400px"/>
+<span><strong>Figure 2 - SINGA overview</strong></span>
+
+The training workload is distributed over workers and servers. As shown in Figure 2, in every iteration the workers call the *TrainOneBatch* function to compute the parameter gradients.
+*TrainOneBatch* visits the layers in order, based on the *NeuralNet* object that describes the structure of the neural network.
+The computed gradients are sent to the stub on the local node, aggregated, and then sent to the corresponding servers. The servers send the updated parameters back to the workers, which then run the next iteration.
+
+
+## Job
+
+In SINGA, a "job" refers to the job configuration, which describes the neural network model and data, the training method, the cluster topology, and so on.
+The job configuration specifies the following four components shown in Figure 2.
+
+  * [NeuralNet](neural-net.html): describes the structure of the neural network and the settings of each layer.
+  * [TrainOneBatch](train-one-batch.html): describes the algorithm suited to each model category.
+  * [Updater](updater.html): describes how parameters are updated at the server.
+  * [Cluster Topology](distributed-training.html): describes the distributed topology of workers and servers.
+
+The job is passed to the SINGA driver in the [main function](programming-guide.html).
+
+This process resembles job submission in Hadoop.
+Users configure the job in the main function.
+Hadoop users configure their own mapper and reducer, whereas SINGA users configure their own layers, Updater, and so on.

Added: incubator/singa/site/trunk/content/markdown/docs/jp/param.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/jp/param.md?rev=1724348&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/jp/param.md (added)
+++ incubator/singa/site/trunk/content/markdown/docs/jp/param.md Wed Jan 13 03:46:19 2016
@@ -0,0 +1,226 @@
+# Parameters
+
+---
+
+A `Param` object in SINGA represents a set of parameters, e.g., a weight matrix
+or a bias vector. *Basic user guide* describes how to configure a `Param`
+object, and *Advanced user guide* provides details on implementing users' own
+parameter initialization methods.
+
+## Basic user guide
+
+The configuration of a `Param` object is placed inside a layer configuration, because
+`Param` objects are associated with layers. An example configuration:
+
+    layer {
+      ...
+      param {
+        name : "p1"
+        init {
+          type : kConstant
+          value: 1
+        }
+      }
+    }
+
+The [SGD algorithm](overview.html) starts by initializing all
+parameters according to the user-specified initialization method (the `init` field).
+In the above example,
+all parameters in `Param` "p1" will be initialized to the constant value 1. The
+configuration fields of a Param object are defined in [ParamProto](../api/classsinga_1_1ParamProto.html):
+
+  * name, an identifier string. It is an optional field. If not provided, SINGA
+  will generate one based on layer name and its order in the layer.
+  * init, field for setting initialization methods.
+  * share_from, name of another `Param` object, from which this `Param` will share
+  configurations and values.
+  * lr_scale, float value to be multiplied with the learning rate when
+  [updating the parameters](updater.html)
+  * wd_scale, float value to be multiplied with the weight decay when
+  [updating the parameters](updater.html)
+
+There are some other fields that are specific to initialization methods.
+
+### Initialization methods
+
+Users can set the `type` of `init` to one of the following built-in initialization
+methods:
+
+  * `kConst`, set all parameters of the Param object to a constant value
+
+        type: kConst
+        value: float  # default is 1
+
+  * `kGaussian`, initialize the parameters following a Gaussian distribution.
+
+        type: kGaussian
+        mean: float # mean of the Gaussian distribution, default is 0
+        std: float # standard deviation, default is 1
+        value: float # default 0
+
+  * `kUniform`, initialize the parameters following a uniform distribution.
+
+        type: kUniform
+        low: float # lower boundary, default is -1
+        high: float # upper boundary, default is 1
+        value: float # default 0
+
+  * `kGaussianSqrtFanIn`, initialize `Param` objects with two dimensions (i.e.,
+  matrix) using `kGaussian` and then
+  multiply each parameter by `1/sqrt(fan_in)`, where `fan_in` is the number of
+  columns of the matrix.
+
+  * `kUniformSqrtFanIn`, the same as `kGaussianSqrtFanIn` except that the
+  distribution is a uniform distribution.
+
+  * `kUniformFanInOut`, initialize matrix `Param` objects using `kUniform` and then
+  multiply each parameter by `sqrt(6/(fan_in + fan_out))`, where `fan_in +
+  fan_out` is the sum of the number of columns and rows of the matrix.
+
+For all of the above initialization methods except `kConst`, if `value` is not
+1, every parameter is multiplied by `value`. Users can also implement
+their own initialization method following the *Advanced user guide*.
+
+
+## Advanced user guide
+
+This section describes the details of implementing new parameter
+initialization methods.
+
+### Base ParamGenerator
+All initialization methods are implemented as
+subclasses of the base `ParamGenerator` class.
+
+    class ParamGenerator {
+     public:
+      virtual void Init(const ParamGenProto&);
+      void Fill(Param*);
+
+     protected:
+      ParamGenProto proto_;
+    };
+
+The configuration of the initialization method is stored in `ParamGenProto`. The `Fill`
+function fills the `Param` object (passed in as an argument).
+
+### New ParamGenerator subclass
+
+As with implementing a new Layer subclass, users can define a configuration
+protocol message,
+
+    # in user.proto
+    message FooParamProto {
+      optional int32 x = 1;
+    }
+    extend ParamGenProto {
+      optional FooParamProto fooparam_conf =101;
+    }
+
+The configuration of `Param` would be
+
+    param {
+      ...
+      init {
+        user_type: "FooParam" # must use user_type for user defined methods
+        [fooparam_conf] { # must use brackets for configuring user defined messages
+          x: 10
+        }
+      }
+    }
+
+The subclass could be declared as,
+
+    class FooParamGen : public ParamGenerator {
+     public:
+      void Fill(Param*) override;
+    };
+
+Users can access the configuration fields in `Fill` by
+
+    int x = proto_.GetExtension(fooparam_conf).x();
+
+To use the new initialization method, users need to register it in the
+[main function](programming-guide.html).
+
+    driver.RegisterParamGenerator<FooParamGen>("FooParam");  // must be consistent with the user_type in configuration
+
+{% comment %}
+### Base Param class
+
+### Members
+
+    int local_version_;
+    int slice_start_;
+    vector<int> slice_offset_, slice_size_;
+
+    shared_ptr<Blob<float>> data_;
+    Blob<float> grad_;
+    ParamProto proto_;
+
+Each Param object has a local version and a global version (inside the data
+Blob). These two versions are used for synchronization. If multiple Param
+objects share the same values, they would have the same `data_` field.
+Consequently, their global version is the same. The global version is updated
+by [the stub thread](communication.html). The local version is
+updated in `Worker::Update` function which assigns the global version to the
+local version. The `Worker::Collect` function is blocked until the global
+version is larger than the local version, i.e., when `data_` is updated. In
+this way, we synchronize workers sharing parameters.
+
+In deep learning models, some Param objects are 100 times larger than others.
+To balance the load among servers, SINGA slices large Param objects. The
+slicing information is recorded by `slice_*`. Each slice is assigned a unique
+ID starting from 0. `slice_start_` is the ID of the first slice of this Param
+object. `slice_offset_[i]` is the offset of the i-th slice in this Param
+object. `slice_size_[i]` is the size of the i-th slice. This slice information
+is used to create messages for transferring parameter values or gradients to
+different servers.
+
+Each Param object has a `grad_` field for gradients. Param objects do not share
+this Blob, although they may share `data_`, because each layer containing a
+Param object contributes its own gradients. E.g., in an RNN, the recurrent layers
+share parameter values, and the gradients used for updating are averaged over all
+these recurrent layers. In SINGA, the stub thread aggregates local
+gradients for the same Param object. The server then does a global aggregation
+of gradients for the same Param object.
+
+The `proto_` field has some meta information, e.g., name and ID. It also has a
+field called `owner` which is the ID of the Param object that shares parameter
+values with others.
+
+### Functions
+The base Param class implements two sets of functions,
+
+    virtual void InitValues(int version = 0);  // initialize values according to `init_method`
+    void ShareFrom(const Param& other);  // share `data_` from `other` Param
+    --------------
+    virtual Msg* GenGetMsg(bool copy, int slice_idx);
+    virtual Msg* GenPutMsg(bool copy, int slice_idx);
+    ... // other message related functions.
+
+Besides the functions for processing the parameter values, there is a set of
+functions for generating and parsing messages. These messages are for
+transferring parameter values or gradients between workers and servers. Each
+message corresponds to one Param slice. If `copy` is false, it means the
+receiver of this message is in the same process as the sender. In such case,
+only pointers to the memory of parameter value (or gradient) are wrapped in
+the message; otherwise, the parameter values (or gradients) should be copied
+into the message.
+
+
+## Implementing Param subclass
+Users can extend the base Param class to implement their own parameter
+initialization methods and message transferring protocols. As with
+implementing a new Layer subclass, users can create Google Protocol Buffer
+messages for configuring the Param subclass. The subclass, denoted as FooParam,
+should be registered in main.cc,
+
+    driver.RegisterParam<FooParam>(kFooParam);  // kFooParam should be different from 0, which is for the base Param type
+
+
+  * type, an integer representing the `Param` type. Currently SINGA provides one
+    `Param` implementation with type 0 (the default type). If users want
+    to use their own Param implementation, they should extend the base Param
+    class and configure this field with `kUserParam`
+
+{% endcomment %}

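The scaling rules listed under "Initialization methods" in param.md can be written out as a short, self-contained C++ sketch. This is not SINGA's `ParamGenerator` code: `Matrix`, `FillUniform`, `FillGaussianSqrtFanIn` and `FillUniformFanInOut` are hypothetical names, and the only facts taken from the text are that `fan_in` is the number of columns, `fan_in + fan_out` is columns plus rows, and the scale factors are `1/sqrt(fan_in)` and `sqrt(6/(fan_in + fan_out))`.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Row-major parameter matrix; a stand-in for illustration, not SINGA's Param.
struct Matrix {
  int rows;
  int cols;
  std::vector<float> data;
  Matrix(int r, int c)
      : rows(r), cols(c), data(static_cast<std::size_t>(r) * c) {}
};

// kUniform-style fill: draw each entry from [low, high).
void FillUniform(Matrix* m, float low, float high, std::mt19937* rng) {
  std::uniform_real_distribution<float> dist(low, high);
  for (float& x : m->data) x = dist(*rng);
}

// kGaussianSqrtFanIn-style fill: Gaussian sample scaled by 1/sqrt(fan_in),
// where fan_in is the number of columns.
void FillGaussianSqrtFanIn(Matrix* m, float mean, float std_dev,
                           std::mt19937* rng) {
  assert(m->cols > 0);
  std::normal_distribution<float> dist(mean, std_dev);
  const float scale = 1.0f / std::sqrt(static_cast<float>(m->cols));
  for (float& x : m->data) x = dist(*rng) * scale;
}

// kUniformFanInOut-style fill: uniform sample scaled by
// sqrt(6 / (fan_in + fan_out)), i.e. columns plus rows.
void FillUniformFanInOut(Matrix* m, float low, float high, std::mt19937* rng) {
  assert(m->rows + m->cols > 0);
  FillUniform(m, low, high, rng);
  const float scale =
      std::sqrt(6.0f / static_cast<float>(m->rows + m->cols));
  for (float& x : m->data) x *= scale;
}
```

With `low = -1` and `high = 1`, every value produced by `FillUniformFanInOut` has magnitude at most `sqrt(6/(fan_in + fan_out))`, which is the bound the scale factor is meant to enforce.
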
Added: incubator/singa/site/trunk/content/markdown/docs/jp/programmer-guide.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/jp/programmer-guide.md?rev=1724348&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/jp/programmer-guide.md (added)
+++ incubator/singa/site/trunk/content/markdown/docs/jp/programmer-guide.md Wed Jan 13 03:46:19 2016
@@ -0,0 +1,98 @@
+# Programmer Guide
+
+---
+
+To submit a training job, users must provide the configuration of the
+four components shown in Figure 1:
+
+  * a [NeuralNet](neural-net.html) describing the neural net structure with the detailed layer setting and their connections;
+  * a [TrainOneBatch](train-one-batch.html) algorithm which is tailored for different model categories;
+  * an [Updater](updater.html) defining the protocol for updating parameters at the server side;
+  * a [Cluster Topology](distributed-training.html) specifying the distributed architecture of workers and servers.
+
+The *Basic user guide* section describes how to submit a training job using
+built-in components, while the *Advanced user guide* section explains how users
+can write their own main function to register components they have implemented
+themselves. In addition, the training data must be prepared; the
+[process](data.html) is the same for both advanced users and basic users.
+
+<img src="../images/overview.png" align="center" width="400px"/>
+<span><strong>Figure 1 - SINGA overview.</strong></span>
+
+
+
+## Basic user guide
+
+Users can use the default main function provided by SINGA to submit the training
+job. In this case, a job configuration file written as a Google Protocol
+Buffer message for the [JobProto](../api/classsinga_1_1JobProto.html) must be provided on the command line,
+
+    ./bin/singa-run.sh -conf <path to job conf> [-resume] [-test]
+
+* `-resume` continues the training from the last [checkpoint](checkpoint.html).
+* `-test` tests the performance of a previously trained model and extracts features for new data;
+more details are available [here](test.html).
+
+The [MLP](mlp.html) and [CNN](cnn.html)
+examples use built-in components. Please read the corresponding pages for their
+job configuration files. The subsequent pages will illustrate the details on
+each component of the configuration.
+
+## Advanced user guide
+
+If a user's model contains user-defined components, e.g., an
+[Updater](updater.html), the user has to write a main function to
+register these components. It is similar to Hadoop's main function. Generally,
+the main function should
+
+  * initialize SINGA, e.g., set up logging;
+
+  * register user-defined components;
+
+  * create the job configuration and pass it to the SINGA driver.
+
+An example main function:
+
+    #include <string>
+    #include "singa.h"
+    #include "user.h"  // header for user code
+
+    int main(int argc, char** argv) {
+      singa::Driver driver;
+      driver.Init(argc, argv);
+      bool resume = false;  // default: start a new training run
+      // parse resume option from argv.
+
+      // register user defined layers
+      driver.RegisterLayer<FooLayer, std::string>("kFooLayer");
+      // register user defined updater
+      driver.RegisterUpdater<FooUpdater, std::string>("kFooUpdater");
+      ...
+      auto jobConf = driver.job_conf();
+      //  update jobConf
+
+      driver.Submit(resume, jobConf);
+      return 0;
+    }
+
+The `Init` method of the Driver class loads the job configuration file provided by
+users as a command line argument (`-conf <job conf>`). This file contains at least the
+cluster topology. The loaded configuration is then returned as `jobConf` for users to update or fill in
+the configurations of the neural net, updater, etc. If users define subclasses of
+Layer, Updater, Worker and Param, they should register them through the driver.
+Finally, the job configuration is submitted to the driver, which starts the
+training.
+
+We will provide helper functions to make the configuration easier in the
+future, like [keras](https://github.com/fchollet/keras).
+
+Users need to compile and link their code (e.g., layer implementations and the main
+file) with the SINGA library (*.libs/libsinga.so*) to generate an
+executable file, e.g., named *mysinga*.  To launch the program, users pass the
+path of *mysinga* and the base job configuration to *./bin/singa-run.sh*.
+
+    ./bin/singa-run.sh -conf <path to job conf> -exec <path to mysinga> [other arguments]
+
+The [RNN application](rnn.html) provides a full example of
+implementing the main function for training a specific RNN model.
+
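The registration calls in the example main function (`RegisterLayer`, `RegisterUpdater`) associate a string identifier with a user-defined class. The C++ sketch below shows one way a string-keyed factory of this kind could work. It assumes nothing about the internals of SINGA's `Driver`: `Component`, `FooComponent` and `Registry` are hypothetical stand-ins introduced only for illustration.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>

// Stand-in base class; SINGA's real Layer/Updater interfaces are not
// reproduced here.
class Component {
 public:
  virtual ~Component() = default;
  virtual std::string Name() const = 0;
};

// A user-defined subclass, playing the role of FooLayer or FooUpdater.
class FooComponent : public Component {
 public:
  std::string Name() const override { return "foo"; }
};

// Hypothetical registry mapping an identifier string to a constructor.
class Registry {
 public:
  // Remember how to build a T whenever `id` is requested later.
  template <typename T>
  void Register(const std::string& id) {
    creators_[id] = [] { return std::unique_ptr<Component>(new T()); };
  }

  // Build a new instance for `id`; returns nullptr if `id` was never
  // registered.
  std::unique_ptr<Component> Create(const std::string& id) const {
    auto it = creators_.find(id);
    if (it == creators_.end()) return nullptr;
    return it->second();
  }

 private:
  std::map<std::string, std::function<std::unique_ptr<Component>()>> creators_;
};
```

In this sketch, a job configuration that names `"kFoo"` can only be instantiated if `Register<FooComponent>("kFoo")` was called first, which is why the identifier used at registration has to match the one used in the configuration.
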