Posted to commits@singa.apache.org by wa...@apache.org on 2016/04/20 07:09:07 UTC
svn commit: r1740048 [7/10] - in
/incubator/singa/site/trunk/content/markdown: ./ develop/ docs/ docs/kr/
v0.3.0/ v0.3.0/jp/ v0.3.0/kr/ v0.3.0/zh/
Added: incubator/singa/site/trunk/content/markdown/v0.3.0/kr/programming-guide.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/v0.3.0/kr/programming-guide.md?rev=1740048&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/v0.3.0/kr/programming-guide.md (added)
+++ incubator/singa/site/trunk/content/markdown/v0.3.0/kr/programming-guide.md Wed Apr 20 05:09:06 2016
@@ -0,0 +1,77 @@
+# Programming Guide
+
+---
+
+To start a training job, users configure the four components shown in Figure 1:
+
+ * [NeuralNet](neural-net.html): describes the neural net structure and the configuration of each layer.
+ * [TrainOneBatch](train-one-batch.html): describes the training algorithm suitable for the model category.
+ * [Updater](updater.html): describes how the server updates model parameters.
+ * [Cluster Topology](distributed-training.html): describes the distribution topology of workers and servers.
+
+The *Basic user guide* explains how to start training using built-in components. The *Advanced user guide* explains how to start training with models, functions and algorithms implemented by users. To prepare the training data, refer to the [data preparation](data.html) guide.
+
+<img src="../../images/overview.png" align="center" width="400px"/>
+<span><strong>Figure 1 - SINGA Overview </strong></span>
+
+
+
+## Basic user guide
+
+Users can start a training job easily using the main function provided by SINGA.
+In this case, prepare a job configuration file written as a google protocol buffer message per [JobProto](../api/classsinga_1_1JobProto.html), then run the following command:
+
+ ./bin/singa-run.sh -conf <path to job conf> [-resume]
+
+`-resume` is the argument for continuing training from the last [checkpoint](checkpoint.html).
+The [MLP](mlp.html) and [CNN](cnn.html) examples use built-in components.
+Please read the corresponding pages for their job configuration files. The subsequent pages will illustrate the details on each component of the configuration.
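As a hedged sketch of how the four components appear together in one job configuration file (the `name` and `train_steps` values here are illustrative assumptions, and the component message bodies are elided; the field names follow the examples on the linked pages):

```
name: "cnn-example"                        # illustrative job name
train_steps: 1000                          # illustrative number of iterations
neuralnet { ... }                          # NeuralNet: layers and their configurations
train_one_batch { alg: kBackPropagation }  # TrainOneBatch algorithm
updater { type: kSGD ... }                 # Updater: how servers update parameters
cluster { nworkers_per_group: 1 ... }      # Cluster Topology
```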
+
+## Advanced user guide
+
+If a user's model contains some user-defined components, e.g.,
+[Updater](updater.html), he has to write a main function to
+register these components. It is similar to Hadoop's main function. Generally,
+the main function should
+
+* initialize SINGA, e.g., setup logging;
+
+* register the user-defined components;
+
+* create the job configuration and pass it to the SINGA driver.
+
+An example main function is like
+
+ #include "singa.h"
+ #include "user.h" // header for user code
+
+ int main(int argc, char** argv) {
+ singa::Driver driver;
+ driver.Init(argc, argv);
+ bool resume;
+ // parse resume option from argv.
+
+ // register user defined layers
+ driver.RegisterLayer<FooLayer>(kFooLayer);
+ // register user defined updater
+ driver.RegisterUpdater<FooUpdater>(kFooUpdater);
+ ...
+ auto jobConf = driver.job_conf();
+ // update jobConf
+
+ driver.Train(resume, jobConf);
+ return 0;
+ }
+
+The Driver class' `Init` method loads the job configuration file given through the command line argument `-conf <job conf>`. That file contains the cluster topology. `Init` returns `jobConf` for users to update or set the configurations of the neural net, updater, etc.
+If users define subclasses of Layer, Updater, Worker or Param, they should register them with the driver.
+To start training, the job configuration `jobConf` is passed to `driver.Train`.
+
+<!--We will provide helper functions to make the configuration easier in the
+future, like [keras](https://github.com/fchollet/keras).-->
+
+Users compile their code and link it with the SINGA library (*.libs/libsinga.so*) to generate an executable, e.g., *mysinga*. The program is launched as follows:
+
+ ./bin/singa-run.sh -conf <path to job conf> -exec <path to mysinga> [other arguments]
+
+The [RNN application](rnn.html) provides an example main function and program for training an RNN model.
Added: incubator/singa/site/trunk/content/markdown/v0.3.0/kr/quick-start.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/v0.3.0/kr/quick-start.md?rev=1740048&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/v0.3.0/kr/quick-start.md (added)
+++ incubator/singa/site/trunk/content/markdown/v0.3.0/kr/quick-start.md Wed Apr 20 05:09:06 2016
@@ -0,0 +1,176 @@
+# Quick Start
+
+---
+
+## SINGA installation
+
+Refer to the [installation](installation.html) page for installing SINGA.
+
+### Starting Zookeeper
+
+SINGA training uses [zookeeper](https://zookeeper.apache.org/). First make sure the zookeeper service is running.
+
+If you installed zookeeper using the provided thirdparty script, start it by:
+
+ #goto top level folder
+ cd SINGA_ROOT
+ ./bin/zk-service.sh start
+
+(Use `./bin/zk-service.sh stop` to stop zookeeper.)
+
+To start zookeeper on a port other than the default, edit `conf/singa.conf`:
+
+ zookeeper_host: "localhost:YOUR_PORT"
+
+## Running in stand-alone mode
+
+Running SINGA in stand-alone mode means running it without cluster managers like [Mesos](http://mesos.apache.org/) or [YARN](http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html).
+
+### Training on a single node
+
+A single process is launched for training.
+For example, we train a
+[CNN model](http://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks)
+over the [CIFAR-10](http://www.cs.toronto.edu/~kriz/cifar.html) dataset.
+The hyper-parameters are set following [cuda-convnet](https://code.google.com/p/cuda-convnet/).
+Refer to the [CNN example](cnn.html) page for details.
+
+
+#### Data and job configuration
+
+Download the dataset and create the data shards for training and testing as follows:
+
+ cd examples/cifar10/
+ cp Makefile.example Makefile
+ make download
+ make create
+
+The training and test datasets are created in the *cifar10-train-shard*
+and *cifar10-test-shard* folders respectively. An *image_mean.bin* file, which records the feature mean of all images, is also generated.
+
+All source code for training the CNN model is included in SINGA; there is no need to add any code.
+Specify the job configuration file (*job.conf*) and run the script (*../../bin/singa-run.sh*).
+To change or add SINGA code, refer to the [programming guide](programming-guide.html).
+
+#### Training without parallelism
+
+By default, the cluster topology has a single worker and a single server.
+Data and model are not parallelized.
+
+To start training, run the following script:
+
+ # goto top level folder
+ cd ../../
+ ./bin/singa-run.sh -conf examples/cifar10/job.conf
+
+
+To list the currently running jobs:
+
+ ./bin/singa-console.sh list
+
+ JOB ID | NUM PROCS
+ ---------- | -----------
+ 24 | 1
+
+To kill a job:
+
+ ./bin/singa-console.sh kill JOB_ID
+
+
+Logs and job information are saved in the */tmp/singa-log* folder, which
+can be changed via `log-dir` in *conf/singa.conf*.
+
+
+#### Asynchronous parallel training
+
+ # job.conf
+ ...
+ cluster {
+ nworker_groups : 2
+ nworkers_per_procs : 2
+ workspace : "examples/cifar10/"
+ }
+
+[Asynchronous training](architecture.html) is enabled by launching multiple worker groups.
+For example, change *job.conf* as shown above.
+By default, one worker group is configured to have one worker.
+The above configuration sets two workers per process, so the two worker groups run in the same process.
+As a result, training runs as the in-memory [Downpour](frameworks.html) framework.
+
+Users do not need to worry about distributing the data.
+Data is dispatched to each worker group based on a random skip,
+so each worker handles a different partition of the dataset.
+
+ # job.conf
+ ...
+ neuralnet {
+ layer {
+ ...
+ sharddata_conf {
+ random_skip : 5000
+ }
+ }
+ ...
+ }
+
+Run the script:
+
+ ./bin/singa-run.sh -conf examples/cifar10/job.conf
+
+#### Synchronous parallel training
+
+ # job.conf
+ ...
+ cluster {
+ nworkers_per_group : 2
+ nworkers_per_procs : 2
+ workspace : "examples/cifar10/"
+ }
+
+[Synchronous training](architecture.html) is enabled by launching multiple workers in one worker group.
+For example, change the *job.conf* file as shown above.
+The above configuration sets two workers in one worker group.
+The workers synchronize with each other within the group.
+Training runs as the in-memory [sandblaster](frameworks.html) framework.
+The model is partitioned among the two workers: each layer is sliced over the two workers.
+The sliced layer performs the same function as the original layer, but the number of feature instances becomes `B / g`,
+where `B` is the number of instances in a mini-batch and `g` is the number of workers in the group.
+There is also [another scheme](neural-net.html) for partitioning the layer (neural net).
+
+All other settings are the same as in the "without parallelism" case:
+
+ ./bin/singa-run.sh -conf examples/cifar10/job.conf
+
+### Training in a cluster
+
+We extend the above training frameworks by changing the cluster configuration with:
+
+ nworkers_per_procs : 1
+
+Every process then creates only one worker thread.
+As a result, workers are created in different processes (nodes).
+To specify the nodes in the cluster, prepare a *hostfile* in *SINGA_ROOT/conf/*,
+
+e.g.,
+
+ logbase-a01
+ logbase-a02
+
+The zookeeper location must also be configured,
+
+e.g.,
+
+ # conf/singa.conf
+ zookeeper_host : "logbase-a01"
+
+Running the script is the same as in "Training on a single node":
+
+ ./bin/singa-run.sh -conf examples/cifar10/job.conf
+
+## Running with Mesos
+
+*in progress* ...
+
+## Next steps
+
+Refer to the [programming guide](programming-guide.html) for details on changing or adding SINGA code.
Added: incubator/singa/site/trunk/content/markdown/v0.3.0/kr/rbm.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/v0.3.0/kr/rbm.md?rev=1740048&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/v0.3.0/kr/rbm.md (added)
+++ incubator/singa/site/trunk/content/markdown/v0.3.0/kr/rbm.md Wed Apr 20 05:09:06 2016
@@ -0,0 +1,365 @@
+# RBM Example
+
+---
+
+This example uses SINGA to train 4 RBM models and one auto-encoder model over the
+[MNIST dataset](http://yann.lecun.com/exdb/mnist/). The auto-encoder model is trained
+to reduce the dimensionality of the MNIST image feature. The RBM models are trained
+to initialize parameters of the auto-encoder model. This example application is
+from [Hinton's science paper](http://www.cs.toronto.edu/~hinton/science.pdf).
+
+## Running instructions
+
+Running scripts are provided in *SINGA_ROOT/examples/rbm* folder.
+
+The MNIST dataset has 70,000 handwritten digit images. The
+[data preparation](data.html) page
+has details on converting this dataset into SINGA recognizable format. Users can
+simply run the following commands to download and convert the dataset.
+
+ # at SINGA_ROOT/examples/mnist/
+ $ cp Makefile.example Makefile
+ $ make download
+ $ make create
+
+The training is separated into two phases, namely pre-training and fine-tuning.
+The pre-training phase trains 4 RBMs in sequence,
+
+ # at SINGA_ROOT/
+ $ ./bin/singa-run.sh -conf examples/rbm/rbm1.conf
+ $ ./bin/singa-run.sh -conf examples/rbm/rbm2.conf
+ $ ./bin/singa-run.sh -conf examples/rbm/rbm3.conf
+ $ ./bin/singa-run.sh -conf examples/rbm/rbm4.conf
+
+The fine-tuning phase trains the auto-encoder by,
+
+ $ ./bin/singa-run.sh -conf examples/rbm/autoencoder.conf
+
+
+## Training details
+
+### RBM1
+
+<img src="../images/example-rbm1.png" align="center" width="200px"/>
+<span><strong>Figure 1 - RBM1.</strong></span>
+
+The neural net structure for training RBM1 is shown in Figure 1.
+The data layer and parser layer provide features for training RBM1.
+The visible layer (connected with the parser layer) of RBM1 accepts the image feature
+(784 dimensions). The hidden layer is set to have 1000 neurons (units).
+These two layers are configured as,
+
+ layer{
+ name: "RBMVis"
+ type: kRBMVis
+ srclayers:"mnist"
+ srclayers:"RBMHid"
+ rbm_conf{
+ hdim: 1000
+ }
+ param{
+ name: "w1"
+ init{
+ type: kGaussian
+ mean: 0.0
+ std: 0.1
+ }
+ }
+ param{
+ name: "b11"
+ init{
+ type: kConstant
+ value: 0.0
+ }
+ }
+ }
+
+ layer{
+ name: "RBMHid"
+ type: kRBMHid
+ srclayers:"RBMVis"
+ rbm_conf{
+ hdim: 1000
+ }
+ param{
+ name: "w1_"
+ share_from: "w1"
+ }
+ param{
+ name: "b12"
+ init{
+ type: kConstant
+ value: 0.0
+ }
+ }
+ }
+
+
+
+For RBM, the weight matrix is shared by the visible and hidden layers. For instance,
+`w1` is shared by `vis` and `hid` layers shown in Figure 1. In SINGA, we can configure
+the `share_from` field to enable [parameter sharing](param.html)
+as shown above for the param `w1` and `w1_`.
+
+[Contrastive Divergence](train-one-batch.html#contrastive-divergence)
+is configured as the algorithm for [TrainOneBatch](train-one-batch.html).
+Following Hinton's paper, we configure the [updating protocol](updater.html)
+as follows,
+
+ # Updater Configuration
+ updater{
+ type: kSGD
+ momentum: 0.2
+ weight_decay: 0.0002
+ learning_rate{
+ base_lr: 0.1
+ type: kFixed
+ }
+ }
+
+Since the parameters of RBM1 will be used to initialize the auto-encoder, we should
+configure the `workspace` field to specify a path for the checkpoint folder.
+For example, if we configure it as,
+
+ cluster {
+ workspace: "examples/rbm/rbm1/"
+ }
+
+Then SINGA will [checkpoint the parameters](checkpoint.html) into *examples/rbm/rbm1/*.
+
+### RBM2
+<img src="../images/example-rbm2.png" align="center" width="200px"/>
+<span><strong>Figure 2 - RBM2.</strong></span>
+
+Figure 2 shows the net structure of training RBM2.
+The visible units of RBM2 accept the output from the Sigmoid1 layer. The Inner1 layer
+is an `InnerProductLayer` whose parameters are set to the `w1` and `b12` learned
+from RBM1.
+The neural net configuration is (with layers for data layer and parser layer omitted).
+
+ layer{
+ name: "Inner1"
+ type: kInnerProduct
+ srclayers:"mnist"
+ innerproduct_conf{
+ num_output: 1000
+ }
+ param{ name: "w1" }
+ param{ name: "b12"}
+ }
+
+ layer{
+ name: "Sigmoid1"
+ type: kSigmoid
+ srclayers:"Inner1"
+ }
+
+ layer{
+ name: "RBMVis"
+ type: kRBMVis
+ srclayers:"Sigmoid1"
+ srclayers:"RBMHid"
+ rbm_conf{
+ hdim: 500
+ }
+ param{
+ name: "w2"
+ ...
+ }
+ param{
+ name: "b21"
+ ...
+ }
+ }
+
+ layer{
+ name: "RBMHid"
+ type: kRBMHid
+ srclayers:"RBMVis"
+ rbm_conf{
+ hdim: 500
+ }
+ param{
+ name: "w2_"
+ share_from: "w2"
+ }
+ param{
+ name: "b22"
+ ...
+ }
+ }
+
+To load `w1` and `b12` from RBM1's checkpoint file, we configure the `checkpoint_path` as,
+
+ checkpoint_path: "examples/rbm/rbm1/checkpoint/step6000-worker0"
+ cluster{
+ workspace: "examples/rbm/rbm2"
+ }
+
+The workspace is changed for checkpointing `w2`, `b21` and `b22` into
+*examples/rbm/rbm2/*.
+
+### RBM3
+
+<img src="../images/example-rbm3.png" align="center" width="200px"/>
+<span><strong>Figure 3 - RBM3.</strong></span>
+
+Figure 3 shows the net structure of training RBM3. In this model, a layer with
+250 units is added as the hidden layer of RBM3. The visible units of RBM3
+accept output from the Sigmoid2 layer. Parameters of Inner1 and Inner2 are set to
+`w1,b12,w2,b22`, which can be loaded from the checkpoint file of RBM2,
+i.e., under *examples/rbm/rbm2/*.
+
+### RBM4
+
+
+<img src="../images/example-rbm4.png" align="center" width="200px"/>
+<span><strong>Figure 4 - RBM4.</strong></span>
+
+Figure 4 shows the net structure of training RBM4. It is similar to Figure 3,
+but according to [Hinton's science paper](http://www.cs.toronto.edu/~hinton/science.pdf), the hidden units of the
+top RBM (RBM4) have stochastic real-valued states drawn from a unit variance
+Gaussian whose mean is determined by the input from the RBM's logistic visible
+units. So we add a `gaussian` field in the RBMHid layer to control the
+sampling distribution (Gaussian or Bernoulli). In addition, this
+RBM has a much smaller learning rate (0.001). The neural net configuration for
+the RBM4 and the updating protocol is (with layers for data layer and parser
+layer omitted),
+
+ # Updater Configuration
+ updater{
+ type: kSGD
+ momentum: 0.9
+ weight_decay: 0.0002
+ learning_rate{
+ base_lr: 0.001
+ type: kFixed
+ }
+ }
+
+ layer{
+ name: "RBMVis"
+ type: kRBMVis
+ srclayers:"Sigmoid3"
+ srclayers:"RBMHid"
+ rbm_conf{
+ hdim: 30
+ }
+ param{
+ name: "w4"
+ ...
+ }
+ param{
+ name: "b41"
+ ...
+ }
+ }
+
+ layer{
+ name: "RBMHid"
+ type: kRBMHid
+ srclayers:"RBMVis"
+ rbm_conf{
+ hdim: 30
+ gaussian: true
+ }
+ param{
+ name: "w4_"
+ share_from: "w4"
+ }
+ param{
+ name: "b42"
+ ...
+ }
+ }
+
+### Auto-encoder
+In the fine-tuning stage, the 4 RBMs are "unfolded" to form encoder and decoder
+networks that are initialized using the parameters from the previous 4 RBMs.
+
+<img src="../images/example-autoencoder.png" align="center" width="500px"/>
+<span><strong>Figure 5 - Auto-Encoders.</strong></span>
+
+
+Figure 5 shows the neural net structure for training the auto-encoder.
+[Back propagation (kBP)](train-one-batch.html) is
+configured as the algorithm for `TrainOneBatch`. We use the same cluster
+configuration as for the RBM models. For the updater, we use the [AdaGrad](updater.html#adagradupdater) algorithm with a
+fixed learning rate.
+
+ ### Updater Configuration
+ updater{
+ type: kAdaGrad
+ learning_rate{
+ base_lr: 0.01
+ type: kFixed
+ }
+ }
+
+
+
+According to [Hinton's science paper](http://www.cs.toronto.edu/~hinton/science.pdf),
+we configure a EuclideanLoss layer to compute the reconstruction error. The neural net
+configuration is (with some of the middle layers omitted),
+
+ layer{ name: "data" }
+ layer{ name:"mnist" }
+ layer{
+ name: "Inner1"
+ param{ name: "w1" }
+ param{ name: "b12" }
+ }
+ layer{ name: "Sigmoid1" }
+ ...
+ layer{
+ name: "Inner8"
+ innerproduct_conf{
+ num_output: 784
+ transpose: true
+ }
+ param{
+ name: "w8"
+ share_from: "w1"
+ }
+ param{ name: "b11" }
+ }
+ layer{ name: "Sigmoid8" }
+
+ # Euclidean Loss Layer Configuration
+ layer{
+ name: "loss"
+ type:kEuclideanLoss
+ srclayers:"Sigmoid8"
+ srclayers:"mnist"
+ }
+
+To load the pre-trained parameters from the 4 RBMs' checkpoint files, we configure `checkpoint_path` as
+
+ ### Checkpoint Configuration
+ checkpoint_path: "examples/rbm/checkpoint/rbm1/checkpoint/step6000-worker0"
+ checkpoint_path: "examples/rbm/checkpoint/rbm2/checkpoint/step6000-worker0"
+ checkpoint_path: "examples/rbm/checkpoint/rbm3/checkpoint/step6000-worker0"
+ checkpoint_path: "examples/rbm/checkpoint/rbm4/checkpoint/step6000-worker0"
+
+
+## Visualization Results
+
+<div>
+<img src="../images/rbm-weight.PNG" align="center" width="300px"/>
+
+<img src="../images/rbm-feature.PNG" align="center" width="300px"/>
+<br/>
+<span><strong>Figure 6 - Bottom RBM weight matrix.</strong></span>
+
+
+
+
+
+<span><strong>Figure 7 - Top layer features.</strong></span>
+</div>
+
+Figure 6 visualizes sample columns of the weight matrix of RBM1. We can see that
+Gabor-like filters are learned. Figure 7 depicts the features extracted from
+the top layer of the auto-encoder, wherein one point represents one image.
+Different colors represent different digits. We can see that most images are
+well clustered according to the ground truth.
Added: incubator/singa/site/trunk/content/markdown/v0.3.0/kr/rnn.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/v0.3.0/kr/rnn.md?rev=1740048&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/v0.3.0/kr/rnn.md (added)
+++ incubator/singa/site/trunk/content/markdown/v0.3.0/kr/rnn.md Wed Apr 20 05:09:06 2016
@@ -0,0 +1,420 @@
+# Recurrent Neural Networks for Language Modelling
+
+---
+
+Recurrent Neural Networks (RNN) are widely used for modelling sequential data,
+such as music and sentences. In this example, we use SINGA to train a
+[RNN model](http://www.fit.vutbr.cz/research/groups/speech/publi/2010/mikolov_interspeech2010_IS100722.pdf)
+proposed by Tomas Mikolov for [language modeling](https://en.wikipedia.org/wiki/Language_model).
+The training objective (loss) is
+to minimize the [perplexity per word](https://en.wikipedia.org/wiki/Perplexity), which
+is equivalent to maximize the probability of predicting the next word given the current word in
+a sentence.
+
+Different from the [CNN](cnn.html), [MLP](mlp.html)
+and [RBM](rbm.html) examples, which use built-in
+[layers](layer.html) and [records](data.html),
+none of the layers in this example are built-in. Hence users can learn to
+implement their own layers and data records through this example.
+
+## Running instructions
+
+In *SINGA_ROOT/examples/rnnlm/*, scripts are provided to run the training job.
+First, the data is prepared by
+
+ $ cp Makefile.example Makefile
+ $ make download
+ $ make create
+
+Second, to compile the source code under *examples/rnnlm/*, run
+
+ $ make rnnlm
+
+An executable file *rnnlm.bin* will be generated.
+
+Third, the training is started by passing *rnnlm.bin* and the job configuration
+to *singa-run.sh*,
+
+ # at SINGA_ROOT/
+ # export LD_LIBRARY_PATH=.libs:$LD_LIBRARY_PATH
+ $ ./bin/singa-run.sh -exec examples/rnnlm/rnnlm.bin -conf examples/rnnlm/job.conf
+
+## Implementations
+
+<img src="../images/rnnlm.png" align="center" width="400px"/>
+<span><strong>Figure 1 - Net structure of the RNN model.</strong></span>
+
+The neural net structure is shown in Figure 1. Word records are loaded by
+`DataLayer`. For every iteration, at most `max_window` word records are
+processed. If a sentence ending character is read, the `DataLayer` stops
+loading immediately. `EmbeddingLayer` looks up a word embedding matrix to extract
+feature vectors for words loaded by the `DataLayer`. These features are transformed by the
+`HiddenLayer` which propagates the features from left to right. The
+output feature for word at position k is influenced by words from position 0 to
+k-1. Finally, `LossLayer` computes the cross-entropy loss (see below)
+by predicting the next word of each word.
+The cross-entropy loss is computed as
+
+`$$L(w_t)=-log P(w_{t+1}|w_t)$$`
+
+Given `$w_t$`, the above equation is computed over all words in the vocabulary,
+which is time consuming.
+[RNNLM Toolkit](https://f25ea9ccb7d3346ce6891573d543960492b92c30.googledrive.com/host/0ByxdPXuxLPS5RFM5dVNvWVhTd0U/rnnlm-0.4b.tgz)
+accelerates the computation as
+
+`$$P(w_{t+1}|w_t) = P(C_{w_{t+1}}|w_t) * P(w_{t+1}|C_{w_{t+1}})$$`
+
+Words from the vocabulary are partitioned into a user-defined number of classes.
+The first term on the right side predicts the class of the next word, and
+the second term predicts the next word given its class. Both the number of classes and
+the words from one class are much smaller than the vocabulary size. The probabilities
+can be calculated much faster.
+
+The perplexity per word is computed by,
+
+`$$PPL = 10^{- avg_t log_{10} P(w_{t+1}|w_t)}$$`
+
+### Data preparation
+
+We use a small dataset provided by the [RNNLM Toolkit](https://f25ea9ccb7d3346ce6891573d543960492b92c30.googledrive.com/host/0ByxdPXuxLPS5RFM5dVNvWVhTd0U/rnnlm-0.4b.tgz).
+It has 10,000 training sentences, with 71350 words in total and 3720 unique words.
+The subsequent steps follow the instructions in
+[Data Preparation](data.html) to convert the
+raw data into records and insert them into data stores.
+
+#### Download source data
+
+ # in SINGA_ROOT/examples/rnnlm/
+ cp Makefile.example Makefile
+ make download
+
+#### Define record format
+
+We define the word record as follows,
+
+ # in SINGA_ROOT/examples/rnnlm/rnnlm.proto
+ message WordRecord {
+ optional string word = 1;
+ optional int32 word_index = 2;
+ optional int32 class_index = 3;
+ optional int32 class_start = 4;
+ optional int32 class_end = 5;
+ }
+
+It includes the word string and its index in the vocabulary.
+Words in the vocabulary are sorted based on their frequency in the training dataset.
+The sorted list is cut into 100 sublists such that each sublist has 1/100 total
+word frequency. Each sublist is called a class.
+Hence each word has a `class_index` ([0,100)). The `class_start` is the index
+of the first word in the same class as `word`. The `class_end` is the index of
+the first word in the next class.
+
+#### Create data stores
+
+We use code from RNNLM Toolkit to read words, and sort them into classes.
+The main function in *create_store.cc* first creates word classes based on the training
+dataset. Second it calls the following function to create data store for the
+training, validation and test dataset.
+
+ int create_data(const char *input_file, const char *output_file);
+
+`input_file` is the path to a training/validation/test text file from the RNNLM Toolkit; `output_file` is the output store file.
+This function starts with
+
+ singa::io::KVFile store;
+ store.Open(output, singa::io::kCreate);
+
+Then it reads the words one by one. For each word it creates a `WordRecord` instance,
+and inserts it into the store,
+
+ int wcnt = 0; // word count
+ WordRecord wordRecord;
+ while(1) {
+ readWord(wordstr, fin);
+ if (feof(fin)) break;
+ ...// fill in the wordRecord;
+ string val;
+ wordRecord.SerializeToString(&val);
+ int length = snprintf(key, BUFFER_LEN, "%05d", wcnt++);
+ store.Write(string(key, length), val);
+ }
+
+Compilation and running commands are provided in the *Makefile.example*.
+After executing
+
+ make create
+
+*train_data.bin*, *test_data.bin* and *valid_data.bin* will be created.
+
+
+### Layer implementation
+
+4 user-defined layers are implemented for this application.
+Following the guide for implementing [new Layer subclasses](layer#implementing-a-new-layer-subclass),
+we extend the [LayerProto](../api/classsinga_1_1LayerProto.html)
+to include the configuration messages of user-defined layers as shown below
+(3 out of the 7 layers have specific configurations),
+
+
+ import "job.proto"; // Layer message for SINGA is defined
+
+ //For implementation of RNNLM application
+ extend singa.LayerProto {
+ optional EmbeddingProto embedding_conf = 101;
+ optional LossProto loss_conf = 102;
+ optional DataProto data_conf = 103;
+ }
+
+In the subsequent sections, we describe the implementation of each layer,
+including its configuration message.
+
+#### RNNLayer
+
+This is the base layer of all other layers in this application. It is defined
+as follows,
+
+ class RNNLayer : virtual public Layer {
+ public:
+ inline int window() { return window_; }
+ protected:
+ int window_;
+ };
+
+For this application, two iterations may process different numbers of words,
+because sentences have different lengths.
+The `DataLayer` decides the effective window size. All other layers query their source layers for the
+effective window size and reset `window_` in the `ComputeFeature` function.
+
+#### DataLayer
+
+DataLayer is for loading Records.
+
+ class DataLayer : public RNNLayer, singa::InputLayer {
+ public:
+ void Setup(const LayerProto& proto, const vector<Layer*>& srclayers) override;
+ void ComputeFeature(int flag, const vector<Layer*>& srclayers) override;
+ int max_window() const {
+ return max_window_;
+ }
+ private:
+ int max_window_;
+ singa::io::Store* store_;
+ };
+
+The `Setup` function gets the user-configured max window size.
+
+ max_window_ = proto.GetExtension(input_conf).max_window();
+
+The `ComputeFeature` function loads at most `max_window` records. It stops
+early if the sentence-ending character is encountered.
+
+ ...// shift the last record to the first
+ window_ = max_window_;
+ for (int i = 1; i <= max_window_; i++) {
+ // load record; break if it is the ending character
+ }
+
+The configuration of `DataLayer` is like
+
+ name: "data"
+ user_type: "kData"
+ [data_conf] {
+ path: "examples/rnnlm/train_data.bin"
+ max_window: 10
+ }
+
+#### EmbeddingLayer
+
+This layer gets records from `DataLayer`. For each record, the word index is
+parsed and used to get the corresponding word feature vector from the embedding
+matrix.
+
+The class is declared as follows,
+
+ class EmbeddingLayer : public RNNLayer {
+ ...
+ const std::vector<Param*> GetParams() const override {
+ std::vector<Param*> params{embed_};
+ return params;
+ }
+ private:
+ int word_dim_, vocab_size_;
+ Param* embed_;
+ }
+
+The `embed_` field is a matrix whose values are parameters to be learned.
+The matrix size is `vocab_size_` x `word_dim_`.
+
+The `Setup` function reads configurations for `word_dim_` and `vocab_size_`. Then
+it allocates the feature Blob for `max_window` words and sets up `embed_`.
+
+ int max_window = srclayers[0]->data(this).shape()[0];
+ word_dim_ = proto.GetExtension(embedding_conf).word_dim();
+ data_.Reshape(vector<int>{max_window, word_dim_});
+ ...
+ embed_->Setup(vector<int>{vocab_size_, word_dim_});
+
+The `ComputeFeature` function simply copies the feature vector from the `embed_`
+matrix into the feature Blob.
+
+ // reset effective window size
+ window_ = datalayer->window();
+ auto records = datalayer->records();
+ ...
+ for (int t = 0; t < window_; t++) {
+ int idx <- word index
+ Copy(words[t], embed[idx]);
+ }
+
+The `ComputeGradient` function copies back the gradients to the `embed_` matrix.
+
+The configuration for `EmbeddingLayer` is like,
+
+ user_type: "kEmbedding"
+ [embedding_conf] {
+ word_dim: 15
+ vocab_size: 3720
+ }
+ srclayers: "data"
+ param {
+ name: "w1"
+ init {
+ type: kUniform
+ low:-0.3
+ high:0.3
+ }
+ }
+
+#### HiddenLayer
+
+This layer unrolls the recurrent connections for at most max_window times.
+The feature for position t is computed from the feature at position t of the embedding layer
+and the feature at position t-1 of this layer. The formula is
+
+`$$f[t]=\sigma (f[t-1]*W+src[t])$$`
+
+where `$W$` is a matrix with `word_dim_` x `word_dim_` parameters.
+
+This layer is a useful reference for implementing other recurrent neural
+networks following our design.
+
+ class HiddenLayer : public RNNLayer {
+ ...
+ const std::vector<Param*> GetParams() const override {
+ std::vector<Param*> params{weight_};
+ return params;
+ }
+ private:
+ Param* weight_;
+ };
+
+The `Setup` function sets up the weight matrix as
+
+ weight_->Setup(std::vector<int>{word_dim, word_dim});
+
+The `ComputeFeature` function gets the effective window size (`window_`) from its source layer,
+i.e., the embedding layer. Then it propagates the feature from position 0 to position
+`window_ - 1`, as illustrated below.
+
+ void HiddenLayer::ComputeFeature() {
+ for (int t = 0; t < window_; t++) {
+ if(t == 0)
+ Copy(data[t], src[t]);
+ else
+ data[t]=sigmoid(data[t-1]*W + src[t]);
+ }
+ }
+
+The `ComputeGradient` function computes the gradient of the loss w.r.t. W and the source layer.
+In particular, for each position k, since data[k] contributes to data[k+1] and to the feature
+at position k in its destination layer (the loss layer), grad[k] should contain the gradients
+from both parts. The destination layer has already accumulated the gradient from the loss layer into
+grad[k]; in the `ComputeGradient` function, we add the gradient from position k+1.
+
+ void HiddenLayer::ComputeGradient(){
+ ...
+ for (int k = window_ - 1; k >= 0; k--) {
+ if (k < window_ - 1) {
+ grad[k] += dot(grad[k + 1], weight.T()); // add gradient from position t+1.
+ }
+ grad[k] =... // compute gL/gy[t], y[t]=data[t-1]*W+src[t]
+ }
+ gweight = dot(data.Slice(0, window_-1).T(), grad.Slice(1, window_));
+ Copy(gsrc, grad);
+ }
+
+After the loop, we get the gradient of the loss w.r.t. y[k], which is used to
+compute the gradients of W and src[k].
+
+#### LossLayer
+
+This layer computes the cross-entropy loss and the `$log_{10}P(w_{t+1}|w_t)$` (which
+could be averaged over all words by users to get the PPL value).
+
+There are two configuration fields to be specified by users.
+
+ message LossProto {
+ optional int32 nclass = 1;
+ optional int32 vocab_size = 2;
+ }
+
+There are two weight matrices to be learned
+
+ class LossLayer : public RNNLayer {
+ ...
+ private:
+ Param* word_weight_, *class_weight_;
+ }
+
+The `ComputeFeature` function computes the two probabilities as follows.
+
+`$$P(C_{w_{t+1}}|w_t) = Softmax(w_t * class\_weight_)$$`
+`$$P(w_{t+1}|C_{w_{t+1}}) = Softmax(w_t * word\_weight[class\_start:class\_end])$$`
+
+`$w_t$` is the feature from the hidden layer for the k-th word, its ground truth
+next word is `$w_{t+1}$`. The first equation computes the probability distribution over all
+classes for the next word. The second equation computes the
+probability distribution over the words in the ground truth class for the next word.
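+
+The class-factorized probability can be sketched in NumPy as follows; the weight matrices, class-to-word ranges, feature vector and indices below are all made up for illustration:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
D, nclass, vocab_size = 8, 4, 20
w_t = rng.normal(size=D)                    # hidden feature for word t
class_weight = rng.normal(size=(D, nclass))
word_weight = rng.normal(size=(D, vocab_size))

# Hypothetical mapping of each class to a contiguous word range.
bounds = [0, 5, 10, 15, 20]

p_class = softmax(w_t @ class_weight)       # P(C_{w_{t+1}} | w_t)
c = 2                                       # ground-truth class of next word
start, end = bounds[c], bounds[c + 1]
p_word = softmax(w_t @ word_weight[:, start:end])  # P(w_{t+1} | C_{w_{t+1}})

# Probability of the next word = class prob * in-class word prob.
w_next = 12                                 # index of ground-truth next word
p = p_class[c] * p_word[w_next - start]
log10p = np.log10(p)                        # averaged over words -> PPL
```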
+
+The ComputeGradient function computes the gradient of the source layer
+(i.e., the hidden layer) and the two weight matrices.
+
+### Updater Configuration
+
+We employ the kFixedStep type of learning rate change method; the
+configuration is as follows. We decay the learning rate once the performance
+stops improving on the validation dataset.
+
+ updater{
+ type: kSGD
+ learning_rate {
+ type: kFixedStep
+ fixedstep_conf:{
+ step:0
+ step:48810
+ step:56945
+ step:65080
+ step:73215
+ step_lr:0.1
+ step_lr:0.05
+ step_lr:0.025
+ step_lr:0.0125
+ step_lr:0.00625
+ }
+ }
+ }
+
+### TrainOneBatch() Function
+
+We use BP (BackPropagation) algorithm to train the RNN model here. The
+corresponding configuration can be seen below.
+
+ # In job.conf file
+ train_one_batch {
+ alg: kBackPropagation
+ }
+
+### Cluster Configuration
+
+The default cluster configuration can be used, i.e., single worker and single server
+in a single process.
Added: incubator/singa/site/trunk/content/markdown/v0.3.0/kr/test.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/v0.3.0/kr/test.md?rev=1740048&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/v0.3.0/kr/test.md (added)
+++ incubator/singa/site/trunk/content/markdown/v0.3.0/kr/test.md Wed Apr 20 05:09:06 2016
@@ -0,0 +1,119 @@
+# Performance Test and Feature Extraction
+
+----
+
+Once SINGA finishes training a model, it checkpoints the model parameters
+into disk files under the [checkpoint folder](checkpoint.html). Model parameters can also be dumped
+into this folder periodically during training if the
+[checkpoint configuration](checkpoint.html) fields are set. With the checkpoint
+files, we can load the model parameters to conduct performance tests, feature extraction and prediction
+against new data.
+
+To load the model parameters from checkpoint files, we need to add the paths of
+checkpoint files in the job configuration file
+
+ checkpoint_path: PATH_TO_CHECKPOINT_FILE1
+ checkpoint_path: PATH_TO_CHECKPOINT_FILE2
+ ...
+
+The new dataset is configured by specifying the `test_steps` field and the data input
+layer, e.g., the following configuration is for a dataset with 100x100 = 10,000 instances.
+
+ test_steps: 100
+ net {
+ layer {
+ name: "input"
+ store_conf {
+ backend: "kvfile"
+ path: PATH_TO_TEST_KVFILE
+ batchsize: 100
+ }
+ }
+ ...
+ }
+
+## Performance Test
+
+This application is to test the performance, e.g., accuracy, of the previously
+trained model. Depending on the application, the test data may have ground truth
+labels or not. For example, if the model is trained for image classification,
+the test images must have ground truth labels to calculate the accuracy; if the
+model is an auto-encoder, the performance could be measured by reconstruction error, which
+does not require extra labels. For both cases, there would be a layer that calculates
+the performance, e.g., the `SoftmaxLossLayer`.
+
+The job configuration file for the cifar10 example can be used directly for testing after
+adding the checkpoint path. The running command is
+
+
+ $ ./bin/singa-run.sh -conf examples/cifar10/job.conf -test
+
+The performance would be output on the screen like,
+
+
+ Load from checkpoint file examples/cifar10/checkpoint/step50000-worker0
+ accuracy = 0.728000, loss = 0.807645
+
+## Feature extraction
+
+Since deep learning models are good at learning features, feature extraction
+is a major functionality of deep learning models, e.g., we can extract features
+from the fully connected layers of [AlexNet](http://www.cs.toronto.edu/~fritz/absps/imagenet.pdf) as image features for image retrieval.
+To extract the features from one layer, we simply add an output layer after that layer.
+For instance, to extract features from the fully connected layer (with name `ip1`) of the cifar10 example model,
+we replace the `SoftmaxLossLayer` with a `CSVOutputLayer`, which writes the features into a CSV file,
+
+ layer {
+ name: "ip1"
+ }
+ layer {
+ name: "output"
+ type: kCSVOutput
+ srclayers: "ip1"
+ store_conf {
+ backend: "textfile"
+ path: OUTPUT_FILE_PATH
+ }
+ }
+
+The input layer and test steps, and the running command are the same as in *Performance Test* section.
+
+## Label Prediction
+
+If the output layer is connected to a layer that predicts labels of images,
+the output layer would then write the prediction results into files.
+SINGA provides two built-in layers for generating prediction results, namely,
+
+* SoftmaxLayer, which generates the probability of each candidate label.
+* ArgSortLayer, which sorts labels by probability in descending order and keeps the top-k labels.
+
+By connecting the two layers with the previous layer and the output layer, we can
+extract the predictions of each instance. For example,
+
+ layer {
+ name: "feature"
+ ...
+ }
+ layer {
+ name: "softmax"
+ type: kSoftmax
+ srclayers: "feature"
+ }
+ layer {
+ name: "prediction"
+ type: kArgSort
+ srclayers: "softmax"
+ argsort_conf {
+ topk: 5
+ }
+ }
+ layer {
+ name: "output"
+ type: kCSVOutput
+ srclayers: "prediction"
+ store_conf {}
+ }
+
+The top-5 labels of each instance will be written as one line of the output CSV file.
+Currently, the above layers cannot co-exist with the loss layers used for training.
+Please comment out the loss layers when extracting prediction results.
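+
+The softmax-then-argsort pipeline can be mimicked outside SINGA as a sanity check; the feature vector below is a made-up example:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

feature = np.array([2.0, 0.5, 1.0, 3.0, -1.0, 0.0])  # hypothetical layer output
probs = softmax(feature)                   # what kSoftmax would produce
topk = 5
# kArgSort: labels sorted by probability, descending, keeping topk.
prediction = np.argsort(-probs)[:topk]
csv_line = ",".join(str(int(i)) for i in prediction)  # one line of the CSV file
```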
Added: incubator/singa/site/trunk/content/markdown/v0.3.0/kr/train-one-batch.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/v0.3.0/kr/train-one-batch.md?rev=1740048&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/v0.3.0/kr/train-one-batch.md (added)
+++ incubator/singa/site/trunk/content/markdown/v0.3.0/kr/train-one-batch.md Wed Apr 20 05:09:06 2016
@@ -0,0 +1,179 @@
+# Train-One-Batch
+
+---
+
+For each SGD iteration, every worker calls the `TrainOneBatch` function to
+compute gradients of parameters associated with local layers (i.e., layers
+dispatched to it). SINGA has implemented two algorithms for the
+`TrainOneBatch` function. Users select the corresponding algorithm for
+their model in the configuration.
+
+## Basic user guide
+
+### Back-propagation
+
+[BP algorithm](http://yann.lecun.com/exdb/publis/pdf/lecun-98b.pdf) is used for
+computing gradients of feed-forward models, e.g., [CNN](cnn.html)
+and [MLP](mlp.html), and [RNN](rnn.html) models in SINGA.
+
+
+ # in job.conf
+ alg: kBP
+
+To use the BP algorithm for the `TrainOneBatch` function, users simply
+configure the `alg` field with `kBP`. If a neural net contains user-defined
+layers, these layers must be implemented properly to be consistent with the
+implementation of the BP algorithm in SINGA (see below).
+
+
+### Contrastive Divergence
+
+[CD algorithm](http://www.cs.toronto.edu/~fritz/absps/nccd.pdf) is used for
+computing gradients of energy models like RBM.
+
+ # job.conf
+ alg: kCD
+ cd_conf {
+ cd_k: 2
+ }
+
+To use the CD algorithm for the `TrainOneBatch` function, users just configure
+the `alg` field to `kCD`. Users can also configure the number of Gibbs sampling steps in
+the CD algorithm through the `cd_k` field. By default, it is set to 1.
+
+
+
+## Advanced user guide
+
+### Implementation of BP
+
+The BP algorithm is implemented in SINGA following the below pseudo code,
+
+ BPTrainOnebatch(step, net) {
+ // forward propagate
+ foreach layer in net.local_layers() {
+ if IsBridgeDstLayer(layer)
+ recv data from the src layer (i.e., BridgeSrcLayer)
+ foreach param in layer.params()
+ Collect(param) // recv response from servers for last update
+
+ layer.ComputeFeature(kForward)
+
+ if IsBridgeSrcLayer(layer)
+ send layer.data_ to dst layer
+ }
+ // backward propagate
+ foreach layer in reverse(net.local_layers) {
+ if IsBridgeSrcLayer(layer)
+ recv gradient from the dst layer (i.e., BridgeDstLayer)
+ recv response from servers for last update
+
+ layer.ComputeGradient()
+ foreach param in layer.params()
+ Update(step, param) // send param.grad_ to servers
+
+ if IsBridgeDstLayer(layer)
+ send layer.grad_ to src layer
+ }
+ }
+
+
+It forwards features through all local layers (which can be checked by layer
+partition ID and worker ID) and propagates gradients backwards in the reverse order.
+[BridgeSrcLayer](layer.html#bridgesrclayer--bridgedstlayer)
+(resp. `BridgeDstLayer`) will be blocked until the feature (resp.
+gradient) from the source (resp. destination) layer comes. Parameter gradients
+are sent to servers via `Update` function. Updated parameters are collected via
+`Collect` function, which will be blocked until the parameter is updated.
+[Param](param.html) objects have versions, which can be used to
+check whether the `Param` objects have been updated or not.
+
+Since RNN models are unrolled into feed-forward models, users need to implement
+the forward propagation in the recurrent layer's `ComputeFeature` function,
+and implement the backward propagation in the recurrent layer's `ComputeGradient`
+function. As a result, the whole `TrainOneBatch` runs
+[back-propagation through time (BPTT)](https://en.wikipedia.org/wiki/Backpropagation_through_time) algorithm.
+
+### Implementation of CD
+
+The CD algorithm is implemented in SINGA following the below pseudo code,
+
+ CDTrainOneBatch(step, net) {
+ # positive phase
+ foreach layer in net.local_layers()
+ if IsBridgeDstLayer(layer)
+ recv positive phase data from the src layer (i.e., BridgeSrcLayer)
+ foreach param in layer.params()
+ Collect(param) // recv response from servers for last update
+ layer.ComputeFeature(kPositive)
+ if IsBridgeSrcLayer(layer)
+ send positive phase data to dst layer
+
+ # negative phase
+ foreach gibbs in [0...layer_proto_.cd_k]
+ foreach layer in net.local_layers()
+ if IsBridgeDstLayer(layer)
+ recv negative phase data from the src layer (i.e., BridgeSrcLayer)
+      layer.ComputeFeature(kNegative)
+ if IsBridgeSrcLayer(layer)
+ send negative phase data to dst layer
+
+ foreach layer in net.local_layers()
+ layer.ComputeGradient()
+ foreach param in layer.params
+ Update(param)
+ }
+
+Parameter gradients are computed after the positive phase and negative phase.
+
+### Implementing a new algorithm
+
+SINGA implements BP and CD by creating two subclasses of
+the [Worker](../api/classsinga_1_1Worker.html) class:
+[BPWorker](../api/classsinga_1_1BPWorker.html)'s `TrainOneBatch` function implements the BP
+algorithm; [CDWorker](../api/classsinga_1_1CDWorker.html)'s `TrainOneBatch` function implements the CD
+algorithm. To implement a new algorithm for the `TrainOneBatch` function, users
+need to create a new subclass of the `Worker`, e.g.,
+
+ class FooWorker : public Worker {
+ void TrainOneBatch(int step, shared_ptr<NeuralNet> net, Metric* perf) override;
+ void TestOneBatch(int step, Phase phase, shared_ptr<NeuralNet> net, Metric* perf) override;
+ };
+
+The `FooWorker` must implement the above two functions for training one
+mini-batch and testing one mini-batch. The `perf` argument is for collecting
+training or testing performance, e.g., the objective loss or accuracy. It is
+passed to the `ComputeFeature` function of each layer.
+
+Users can define configuration fields for the new worker,
+
+ # in user.proto
+ message FooWorkerProto {
+ optional int32 b = 1;
+ }
+
+ extend JobProto {
+ optional FooWorkerProto foo_conf = 101;
+ }
+
+ # in job.proto
+ JobProto {
+ ...
+ extension 101..max;
+ }
+
+It is similar to [adding configuration fields for a new layer](layer.html#implementing-a-new-layer-subclass).
+
+To use `FooWorker`, users need to register it in the [main.cc](programming-guide.html)
+and configure the `alg` and `foo_conf` fields,
+
+ # in main.cc
+    const int kFoo = 3; // worker ID, must be different from that of CDWorker and BPWorker
+ driver.RegisterWorker<FooWorker>(kFoo);
+
+ # in job.conf
+ ...
+ alg: 3
+ [foo_conf] {
+ b = 4;
+ }
Added: incubator/singa/site/trunk/content/markdown/v0.3.0/kr/updater.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/v0.3.0/kr/updater.md?rev=1740048&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/v0.3.0/kr/updater.md (added)
+++ incubator/singa/site/trunk/content/markdown/v0.3.0/kr/updater.md Wed Apr 20 05:09:06 2016
@@ -0,0 +1,284 @@
+# Updater
+
+---
+
+Every server in SINGA has an [Updater](../api/classsinga_1_1Updater.html)
+instance that updates parameters based on gradients.
+In this page, the *Basic user guide* describes the configuration of an updater.
+The *Advanced user guide* presents details on how to implement a new updater and a new
+learning rate changing method.
+
+## Basic user guide
+
+There are many different parameter updating protocols (i.e., subclasses of
+`Updater`). They share some configuration fields like
+
+* `type`, an integer for identifying an updater;
+* `learning_rate`, configuration for the
+[LRGenerator](../api/classsinga_1_1LRGenerator.html) which controls the learning rate.
+* `weight_decay`, the coefficient for [L2 regularization](http://deeplearning.net/tutorial/gettingstarted.html#regularization).
+* [momentum](http://ufldl.stanford.edu/tutorial/supervised/OptimizationStochasticGradientDescent/).
+
+If you are not familiar with the above terms, you can get their meanings in
+[this page provided by Karpathy](http://cs231n.github.io/neural-networks-3/#update).
+
+### Configuration of built-in updater classes
+
+#### Updater
+The base `Updater` implements the [vanilla SGD algorithm](http://cs231n.github.io/neural-networks-3/#sgd).
+Its configuration type is `kSGD`.
+Users need to configure at least the `learning_rate` field.
+`momentum` and `weight_decay` are optional fields.
+
+ updater{
+ type: kSGD
+ momentum: float
+ weight_decay: float
+ learning_rate {
+ ...
+ }
+ }
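+
+The update rule behind this configuration can be sketched as follows (a simplified stand-alone version, not SINGA's actual C++ implementation):

```python
def sgd_update(value, grad, history, lr, momentum=0.9, weight_decay=5e-4):
    """One vanilla-SGD step with momentum and L2 weight decay."""
    g = grad + weight_decay * value        # add the L2 regularization term
    history = momentum * history - lr * g  # momentum accumulation
    return value + history, history

w, h = 1.0, 0.0
w, h = sgd_update(w, grad=0.5, history=h, lr=0.1, momentum=0.0, weight_decay=0.0)
# with momentum and decay disabled this reduces to plain w - lr * grad
```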
+
+#### AdaGradUpdater
+
+It inherits the base `Updater` to implement the
+[AdaGrad](http://www.magicbroom.info/Papers/DuchiHaSi10.pdf) algorithm.
+Its type is `kAdaGrad`.
+`AdaGradUpdater` is configured similarly to `Updater` except
+that `momentum` is not used.
+
+#### NesterovUpdater
+
+It inherits the base `Updater` to implement the
+[Nesterov](http://arxiv.org/pdf/1212.0901v2.pdf) (section 3.5) updating protocol.
+Its type is `kNesterov`.
+`learning_rate` and `momentum` must be configured. `weight_decay` is an
+optional configuration field.
+
+#### RMSPropUpdater
+
+It inherits the base `Updater` to implement the
+[RMSProp algorithm](http://cs231n.github.io/neural-networks-3/#sgd) proposed by
+[Hinton](http://www.cs.toronto.edu/%7Etijmen/csc321/slides/lecture_slides_lec6.pdf) (slide 29).
+Its type is `kRMSProp`.
+
+ updater {
+ type: kRMSProp
+ rmsprop_conf {
+ rho: float # [0,1]
+ }
+ }
+
+
+### Configuration of learning rate
+
+The `learning_rate` field is configured as,
+
+ learning_rate {
+ type: ChangeMethod
+ base_lr: float # base/initial learning rate
+ ... # fields to a specific changing method
+ }
+
+The common fields include `type` and `base_lr`. SINGA provides the following
+`ChangeMethod`s.
+
+#### kFixed
+
+The `base_lr` is used for all steps.
+
+#### kLinear
+
+The updater should be configured like
+
+ learning_rate {
+ base_lr: float
+ linear_conf {
+ freq: int
+ final_lr: float
+ }
+ }
+
+Linear interpolation is used to change the learning rate,
+
+ lr = (1 - step / freq) * base_lr + (step / freq) * final_lr
+
+#### kExponential
+
+The updater should be configured like
+
+ learning_rate {
+ base_lr: float
+ exponential_conf {
+ freq: int
+ }
+ }
+
+The learning rate for `step` is
+
+ lr = base_lr / 2^(step / freq)
+
+#### kInverseT
+
+The updater should be configured like
+
+ learning_rate {
+ base_lr: float
+ inverset_conf {
+ final_lr: float
+ }
+ }
+
+The learning rate for `step` is
+
+ lr = base_lr / (1 + step / final_lr)
+
+#### kInverse
+
+The updater should be configured like
+
+ learning_rate {
+ base_lr: float
+ inverse_conf {
+ gamma: float
+ pow: float
+ }
+ }
+
+
+The learning rate for `step` is
+
+    lr = base_lr * (1 + gamma * step)^(-pow)
+
+
+#### kStep
+
+The updater should be configured like
+
+ learning_rate {
+ base_lr : float
+ step_conf {
+ change_freq: int
+ gamma: float
+ }
+ }
+
+
+The learning rate for `step` is
+
+    lr = base_lr * gamma^(step / change_freq)
+
+#### kFixedStep
+
+The updater should be configured like
+
+ learning_rate {
+ fixedstep_conf {
+ step: int
+ step_lr: float
+
+ step: int
+ step_lr: float
+
+ ...
+ }
+ }
+
+Denote the i-th tuple as (step[i], step_lr[i]); then the learning rate for
+`step` is,
+
+    step_lr[k]
+
+where step[k] is the largest configured step that is not larger than `step`.
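+
+Several of the ChangeMethod formulas above can be checked with a small sketch (hypothetical helper functions, using `bisect` for kFixedStep):

```python
import bisect

def linear_lr(step, base_lr, final_lr, freq):
    # kLinear: interpolate between base_lr and final_lr
    return (1 - step / freq) * base_lr + (step / freq) * final_lr

def exponential_lr(step, base_lr, freq):
    # kExponential: halve the rate every freq steps
    return base_lr / 2 ** (step / freq)

def step_lr(step, base_lr, gamma, change_freq):
    # kStep: multiply by gamma every change_freq steps
    return base_lr * gamma ** (step // change_freq)

def fixed_step_lr(step, steps, step_lrs):
    # kFixedStep: pick step_lr[k] where steps[k] is the largest boundary <= step
    k = bisect.bisect_right(steps, step) - 1
    return step_lrs[k]

lr = fixed_step_lr(50000, [0, 48810, 56945, 65080, 73215],
                   [0.1, 0.05, 0.025, 0.0125, 0.00625])
# step 50000 falls in [48810, 56945), so the rate is 0.05
```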
+
+
+## Advanced user guide
+
+### Implementing a new Updater subclass
+
+The base Updater class has one virtual function,
+
+ class Updater{
+ public:
+ virtual void Update(int step, Param* param, float grad_scale = 1.0f) = 0;
+
+ protected:
+ UpdaterProto proto_;
+ LRGenerator lr_gen_;
+ };
+
+It updates the values of the `param` based on its gradients. The `step`
+argument is for deciding the learning rate which may change through time
+(step). `grad_scale` scales the original gradient values. This function is
+called by servers once it receives all gradients for the same `Param` object.
+
+To implement a new Updater subclass, users must override the `Update` function.
+
+ class FooUpdater : public Updater {
+ void Update(int step, Param* param, float grad_scale = 1.0f) override;
+ };
+
+Configuration of this new updater can be declared similar to that of a new
+layer,
+
+    # in user.proto
+    message FooUpdaterProto {
+      optional int32 c = 1;
+    }
+
+    extend UpdaterProto {
+      optional FooUpdaterProto fooupdater_conf = 101;
+    }
+
+The new updater should be registered in the
+[main function](programming-guide.html)
+
+ driver.RegisterUpdater<FooUpdater>("FooUpdater");
+
+Users can then configure the job as
+
+ # in job.conf
+ updater {
+ user_type: "FooUpdater" # must use user_type with the same string identifier as the one used for registration
+ fooupdater_conf {
+ c : 20;
+ }
+ }
+
+### Implementing a new LRGenerator subclass
+
+The base `LRGenerator` is declared as,
+
+ virtual float Get(int step);
+
+To implement a subclass, e.g., `FooLRGen`, users should declare it like
+
+ class FooLRGen : public LRGenerator {
+ public:
+ float Get(int step) override;
+ };
+
+Configuration of `FooLRGen` can be defined using a protocol message,
+
+ # in user.proto
+ message FooLRProto {
+ ...
+ }
+
+ extend LRGenProto {
+ optional FooLRProto foolr_conf = 101;
+ }
+
+The configuration is then like,
+
+ learning_rate {
+ user_type : "FooLR" # must use user_type with the same string identifier as the one used for registration
+ base_lr: float
+ foolr_conf {
+ ...
+ }
+ }
+
+Users have to register this subclass in the main function,
+
+    driver.RegisterLRGenerator<FooLRGen, std::string>("FooLR");
Added: incubator/singa/site/trunk/content/markdown/v0.3.0/layer.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/v0.3.0/layer.md?rev=1740048&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/v0.3.0/layer.md (added)
+++ incubator/singa/site/trunk/content/markdown/v0.3.0/layer.md Wed Apr 20 05:09:06 2016
@@ -0,0 +1,620 @@
+# Layers
+
+---
+
+Layer is a core abstraction in SINGA. It performs a variety of feature
+transformations for extracting high-level features, e.g., loading raw features,
+parsing RGB values, doing convolution transformation, etc.
+
+The *Basic user guide* section introduces the configuration of a built-in
+layer. *Advanced user guide* explains how to extend the base Layer class to
+implement users' functions.
+
+## Basic user guide
+
+### Layer configuration
+
+Configuration of two example layers are shown below,
+
+ layer {
+ name: "data"
+ type: kCSVRecord
+ store_conf { }
+ }
+ layer{
+ name: "fc1"
+ type: kInnerProduct
+ srclayers: "data"
+ innerproduct_conf{ }
+ param{ }
+ }
+
+There are some common fields for all kinds of layers:
+
+ * `name`: a string used to differentiate two layers in a neural net.
+ * `type`: an integer used for identifying a specific Layer subclass. The types of built-in
+ layers are listed in LayerType (defined in job.proto).
+ For user-defined layer subclasses, `user_type` should be used instead of `type`.
+ * `srclayers`: names of the source layers.
+ In SINGA, all connections are [converted](neural-net.html) to directed connections.
+ * `param`: configuration for a [Param](param.html) instance.
+ There can be multiple Param objects in one layer.
+
+Different layers may have different configurations. These configurations
+are defined in `<type>_conf`. E.g., "fc1" layer has
+`innerproduct_conf`. The subsequent sections
+explain the functionality of each built-in layer and how to configure it.
+
+### Built-in Layer subclasses
+SINGA has provided many built-in layers, which can be used directly to create neural nets.
+These layers are categorized according to their functionalities,
+
+ * Input layers for loading records (e.g., images) from disk files, HDFS or network into memory.
+ * Neuron layers for feature transformation, e.g., [convolution](../api/classsinga_1_1ConvolutionLayer.html), [pooling](../api/classsinga_1_1PoolingLayer.html), dropout, etc.
+ * Loss layers for measuring the training objective loss, e.g., Cross Entropy loss or Euclidean loss.
+ * Output layers for outputting the prediction results (e.g., probabilities of each category) or features into persistent storage, e.g., disk or HDFS.
+ * Connection layers for connecting layers when the neural net is partitioned.
+
+#### Input layers
+
+Input layers load training/test data from disk or other places (e.g., HDFS or network)
+into memory.
+
+##### StoreInputLayer
+
+[StoreInputLayer](../api/classsinga_1_1StoreInputLayer.html) is a base layer for
+loading data from data store. The data store can be a KVFile or TextFile (LMDB,
+LevelDB, HDFS, etc., will be supported later). Its `ComputeFeature` function reads
+batchsize (string:key, string:value) tuples. Each tuple is parsed by a `Parse` function
+implemented by its subclasses.
+
+The configuration for this layer is in `store_conf`,
+
+ store_conf {
+ backend: # "kvfile" or "textfile"
+ path: # path to the data store
+ batchsize : 32
+ prefetching: true #default value is false
+ ...
+ }
+
+##### SingleLabelRecordLayer
+
+It is a subclass of StoreInputLayer. It assumes the (key, value) tuple loaded
+from a data store contains a feature vector (and a label) for one data instance.
+All feature vectors are of the same fixed length. The shape of one instance
+is configured through the `shape` field, e.g., the following configuration
+specifies the shape for the CIFAR10 images.
+
+ store_conf {
+ shape: 3 #channels
+ shape: 32 #height
+ shape: 32 #width
+ }
+
+It may do some preprocessing like [standardization](http://ufldl.stanford.edu/wiki/index.php/Data_Preprocessing).
+The data for preprocessing is loaded by and parsed in a virtual function, which is implemented by
+its subclasses.
+
+##### RecordInputLayer
+
+It is a subclass of SingleLabelRecordLayer. It parses the value field from one
+tuple into a RecordProto, which is generated by Google Protobuf according
+to common.proto. It can be used to store features for images (e.g., using the pixel field)
+or other objects (using the data field). The key field is not parsed.
+
+ type: kRecordInput
+ store_conf {
+ has_label: # default is true
+ ...
+ }
+
+##### CSVInputLayer
+
+It is a subclass of SingleLabelRecordLayer. The value field from one tuple is parsed
+as a CSV line (separated by comma). The first number would be parsed as a label if
+`has_label` is configured in `store_conf`. Otherwise, all numbers would be parsed
+into one row of the `data_` Blob.
+
+ type: kCSVInput
+ store_conf {
+ has_label: # default is true
+ ...
+ }
+
+##### ImagePreprocessLayer
+
+This layer does image preprocessing, e.g., cropping, mirroring and scaling, against
+the data Blob from its source layer. It replaces the deprecated RGBImageLayer, which
+works on the Record from ShardDataLayer, and uses the same configuration as
+RGBImageLayer,
+
+ type: kImagePreprocess
+ rgbimage_conf {
+ scale: float
+ cropsize: int # cropping each image to keep the central part with this size
+ mirror: bool # mirror the image by set image[i,j]=image[i,len-j]
+ meanfile: "Image_Mean_File_Path"
+ }
+
+##### ShardDataLayer (Deprecated)
+Deprecated! Please use ProtoRecordInputLayer or CSVRecordInputLayer.
+
+[ShardDataLayer](../api/classsinga_1_1ShardDataLayer.html) is a subclass of DataLayer,
+which reads Records from disk file. The file should be created using
+[DataShard](../api/classsinga_1_1DataShard.html)
+class. With the data file prepared, users configure the layer as
+
+ type: kShardData
+ sharddata_conf {
+ path: "path to data shard folder"
+ batchsize: int
+ random_skip: int
+ }
+
+`batchsize` specifies the number of records to be trained for one mini-batch.
+The first `rand() % random_skip` `Record`s will be skipped at the first
+iteration. This is to enforce that different workers work on different Records.
+
+##### LMDBDataLayer (Deprecated)
+Deprecated! Please use ProtoRecordInputLayer or CSVRecordInputLayer.
+
+[LMDBDataLayer] is similar to ShardDataLayer, except that the Records are
+loaded from LMDB.
+
+ type: kLMDBData
+ lmdbdata_conf {
+ path: "path to LMDB folder"
+ batchsize: int
+ random_skip: int
+ }
+
+##### ParserLayer (Deprecated)
+Deprecated! Please use ProtoRecordInputLayer or CSVRecordInputLayer.
+
+It gets a vector of Records from DataLayer and parses features into
+a Blob.
+
+ virtual void ParseRecords(Phase phase, const vector<Record>& records, Blob<float>* blob) = 0;
+
+
+##### LabelLayer (Deprecated)
+Deprecated! Please use ProtoRecordInputLayer or CSVRecordInputLayer.
+
+[LabelLayer](../api/classsinga_1_1LabelLayer.html) is a subclass of ParserLayer.
+It parses a single label from each Record. Consequently, it
+will put $b$ (mini-batch size) values into the Blob. It has no specific configuration fields.
+
+
+##### MnistImageLayer (Deprecated)
+Deprecated! Please use ProtoRecordInputLayer or CSVRecordInputLayer.
+
+[MnistImageLayer] is a subclass of ParserLayer. It parses the pixel values of
+each image from the MNIST dataset. The pixel
+values may be normalized as `x/norm_a - norm_b`. For example, if `norm_a` is
+set to 255 and `norm_b` is set to 0, then every pixel will be normalized into
+[0, 1].
+
+ type: kMnistImage
+ mnistimage_conf {
+ norm_a: float
+ norm_b: float
+ }
+
+##### RGBImageLayer (Deprecated)
+Deprecated! Please use the ImagePreprocessLayer.
+
+[RGBImageLayer](../api/classsinga_1_1RGBImageLayer.html) is a subclass of ParserLayer.
+It parses the RGB values of one image from each Record. It may also
+apply some transformations, e.g., cropping, mirroring operations. If the
+`meanfile` is specified, it should point to a path that contains one Record for
+the mean of each pixel over all training images.
+
+ type: kRGBImage
+ rgbimage_conf {
+ scale: float
+ cropsize: int # cropping each image to keep the central part with this size
+ mirror: bool # mirror the image by set image[i,j]=image[i,len-j]
+ meanfile: "Image_Mean_File_Path"
+ }
+
+##### PrefetchLayer
+
+[PrefetchLayer](../api/classsinga_1_1PrefetchLayer.html) embeds other input layers
+to do data prefetching. It launches a thread that calls the embedded layers to load and extract features,
+so that the I/O task and the computation task can run simultaneously.
+One example PrefetchLayer configuration is,
+
+ layer {
+ name: "prefetch"
+ type: kPrefetch
+ sublayers {
+ name: "data"
+ type: kShardData
+ sharddata_conf { }
+ }
+ sublayers {
+ name: "rgb"
+ type: kRGBImage
+ srclayers:"data"
+ rgbimage_conf { }
+ }
+ sublayers {
+ name: "label"
+ type: kLabel
+ srclayers: "data"
+ }
+ exclude:kTest
+ }
+
+The layers on top of the PrefetchLayer should use the name of the embedded
+layers as their source layers. For example, the "rgb" and "label" should be
+configured to the `srclayers` of other layers.
+
+
+#### Output Layers
+
+Output layers get data from their source layers and write them to persistent storage,
+e.g., disk files or HDFS (to be supported).
+
+##### RecordOutputLayer
+
+This layer gets data (and label if it is available) from its source layer and converts it into records of type
+RecordProto. Records are written as (key = instance No., value = serialized record) tuples into Store, e.g., KVFile. The configuration of this layer
+should include the specifics of the Store backend via `store_conf`.
+
+ layer {
+ name: "output"
+ type: kRecordOutput
+ srclayers:
+ store_conf {
+ backend: "kvfile"
+ path:
+ }
+ }
+
+##### CSVOutputLayer
+This layer gets data (and label if it is available) from its source layer and converts it into
+a string per instance with fields separated by commas (i.e., CSV format). The shape information
+is not kept in the string. All strings are written into
+Store, e.g., text file. The configuration of this layer should include the specifics of the Store backend via `store_conf`.
+
+ layer {
+ name: "output"
+ type: kCSVOutput
+ srclayers:
+ store_conf {
+ backend: "textfile"
+ path:
+ }
+ }
+
+#### Neuron Layers
+
+Neuron layers conduct feature transformations.
+
+#### ActivationLayer
+
+ type: kActivation
+ activation_conf {
+ type: {RELU, SIGMOID, TANH, STANH}
+ }
+
+##### ConvolutionLayer
+
+[ConvolutionLayer](../api/classsinga_1_1ConvolutionLayer.html) conducts convolution transformation.
+
+ type: kConvolution
+ convolution_conf {
+ num_filters: int
+ kernel: int
+ stride: int
+ pad: int
+ }
+ param { } # weight/filter matrix
+ param { } # bias vector
+
+The int value `num_filters` is the number of filters to apply; the int
+value `kernel` is the convolution kernel size (equal width and height);
+the int value `stride` is the distance between successive filter applications;
+the int value `pad` pads the border of the input with the given number of
+zero-valued pixels.
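+
+For reference, the spatial output size of such a convolution follows the usual formula, sketched below (SINGA computes this internally):

```python
def conv_out_size(in_size, kernel, stride, pad):
    # standard convolution output dimension
    return (in_size + 2 * pad - kernel) // stride + 1

# e.g., a 32x32 CIFAR10 image with a 5x5 kernel, stride 1, pad 2 stays 32x32
out = conv_out_size(32, kernel=5, stride=1, pad=2)
```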
+
+##### InnerProductLayer
+
+[InnerProductLayer](../api/classsinga_1_1InnerProductLayer.html) is fully connected with its (single) source layer.
+Typically, it has two parameter fields, one for the weight matrix and the other
+for the bias vector. It transforms the feature of the source layer by multiplying it
+with the weight matrix and shifts it by adding the bias vector.
+
+ type: kInnerProduct
+ innerproduct_conf {
+ num_output: int
+ }
+ param { } # weight matrix
+ param { } # bias vector
+
+
+##### PoolingLayer
+
+[PoolingLayer](../api/classsinga_1_1PoolingLayer.html) subsamples (by max or average) the
+feature vectors from the source layer.
+
+ type: kPooling
+ pooling_conf {
+      pool: AVE|MAX // choose Average Pooling or Max Pooling
+ kernel: int // size of the kernel filter
+ pad: int // the padding size
+ stride: int // the step length of the filter
+ }
+
+The pooling layer has two methods: Average Pooling and Max Pooling.
+Use the enum AVE and MAX to choose the method.
+
+ * Max Pooling selects the max value for each filtering area as a point of the
+ result feature blob.
+ * Average Pooling averages all values for each filtering area at a point of the
+ result feature blob.
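+
+The two methods can be illustrated on a tiny 4x4 feature map (a NumPy sketch, not SINGA code):

```python
import numpy as np

def pool2d(x, kernel, stride, mode="MAX"):
    h = (x.shape[0] - kernel) // stride + 1
    w = (x.shape[1] - kernel) // stride + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            win = x[i*stride:i*stride+kernel, j*stride:j*stride+kernel]
            out[i, j] = win.max() if mode == "MAX" else win.mean()
    return out

x = np.arange(16, dtype=float).reshape(4, 4)
max_pooled = pool2d(x, kernel=2, stride=2, mode="MAX")  # [[5, 7], [13, 15]]
ave_pooled = pool2d(x, kernel=2, stride=2, mode="AVE")  # [[2.5, 4.5], [10.5, 12.5]]
```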
+
+##### ReLULayer
+
+[ReLuLayer](../api/classsinga_1_1ReLULayer.html) has rectified linear neurons, which conducts the following
+transformation, `f(x) = Max(0, x)`. It has no specific configuration fields.
+
+##### STanhLayer
+
+[STanhLayer](../api/classsinga_1_1TanhLayer.html) uses the scaled tanh as activation function, i.e., `f(x) = 1.7159047 * tanh(0.6666667 * x)`.
+It has no specific configuration fields.
+
+##### SigmoidLayer
+
+[SigmoidLayer] uses the sigmoid (or logistic) as activation function, i.e.,
+`f(x)=sigmoid(x)`. It has no specific configuration fields.
+
+
+##### Dropout Layer
+[DropoutLayer](../api/classsinga_1_1DropoutLayer.html) is a layer that randomly drops some of its inputs.
+This scheme helps deep learning models avoid over-fitting.
+
+ type: kDropout
+ dropout_conf {
+ dropout_ratio: float # dropout probability
+ }
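+
+For example, to drop each input with probability 0.5 (a value commonly used in practice, shown here only for illustration):
+
+    type: kDropout
+    dropout_conf {
+      dropout_ratio: 0.5
+    }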
+
+##### LRNLayer
+[LRNLayer](../api/classsinga_1_1LRNLayer.html) performs Local Response Normalization, which normalizes features over adjacent channels.
+
+ type: kLRN
+ lrn_conf {
+ local_size: int
+ alpha: float // scaling parameter
+ beta: float // exponential number
+ }
+
+`local_size` specifies the number of adjacent channels that are summed over.
+For `WITHIN_CHANNEL` normalization, it is instead the side length of the spatial region that is summed over.
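+
+A sketch using the values popularized by AlexNet (illustrative defaults, not a recommendation for every model):
+
+    type: kLRN
+    lrn_conf {
+      local_size: 5
+      alpha: 0.0001
+      beta: 0.75
+    }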
+
+
+
+### CuDNN layers
+
+SINGA supports CuDNN v3 and v4, which provide the following layers:
+
+* CudnnActivationLayer (activation functions are SIGMOID, TANH, RELU)
+* CudnnConvLayer
+* CudnnLRNLayer
+* CudnnPoolLayer
+* CudnnSoftmaxLayer
+
+These layers have the same configuration as the corresponding CPU layers.
+CuDNN v4 additionally provides a batch normalization layer, which is named
+`CudnnBMLayer`.
+
+
+#### Loss Layers
+
+Loss layers measure the objective training loss.
+
+##### SoftmaxLossLayer
+
+[SoftmaxLossLayer](../api/classsinga_1_1SoftmaxLossLayer.html) is a combination of the Softmax transformation and
+the Cross-Entropy loss. It first applies Softmax to get a prediction probability
+for each output unit (neuron), and then computes the cross-entropy against the ground truth.
+It is generally used as the final layer to generate labels for classification tasks.
+
+ type: kSoftmaxLoss
+ softmaxloss_conf {
+ topk: int
+ }
+
+The configuration field `topk` selects the labels with the `topk` highest
+probabilities as the prediction results, since it would be tedious for users to
+inspect the prediction probability of every label.
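+
+For instance, to report only the single most probable label (top-1) for each instance:
+
+    type: kSoftmaxLoss
+    softmaxloss_conf {
+      topk: 1
+    }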
+
+#### ConnectionLayer
+
+Subclasses of ConnectionLayer are utility layers that connect other layers when
+the neural net is partitioned, or in other such cases.
+
+##### ConcateLayer
+
+[ConcateLayer](../api/classsinga_1_1ConcateLayer.html) connects to more than one source layer to concatenate their feature
+blobs along a given dimension.
+
+ type: kConcate
+ concate_conf {
+ concate_dim: int // define the dimension
+ }
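+
+As an illustration (the layer names are hypothetical), concatenating the feature blobs of two source layers along dimension 1 could be configured as:
+
+    layer {
+      name: "concate1"
+      type: kConcate
+      srclayers: "branch1"
+      srclayers: "branch2"
+      concate_conf {
+        concate_dim: 1
+      }
+    }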
+
+##### SliceLayer
+
+[SliceLayer](../api/classsinga_1_1SliceLayer.html) connects to more than one destination layer to slice its feature
+blob along a given dimension.
+
+ type: kSlice
+ slice_conf {
+ slice_dim: int
+ }
+
+##### SplitLayer
+
+[SplitLayer](../api/classsinga_1_1SplitLayer.html) connects to more than one destination layer to replicate its
+feature blob.
+
+ type: kSplit
+ split_conf {
+ num_splits: int
+ }
+
+##### BridgeSrcLayer & BridgeDstLayer
+
+[BridgeSrcLayer](../api/classsinga_1_1BridgeSrcLayer.html) &
+[BridgeDstLayer](../api/classsinga_1_1BridgeDstLayer.html) are utility layers assisting data (e.g., feature or
+gradient) transferring due to neural net partitioning. These two layers are
+added implicitly. Users typically do not need to configure them in their neural
+net configuration.
+
+### OutputLayer
+
+Output layers write the prediction results or the extracted features into files, HTTP streams,
+or other places. Currently, SINGA has not implemented any specific output layer.
+
+## Advanced user guide
+
+The base Layer class is introduced in this section, followed by how to
+implement a new Layer subclass.
+
+### Base Layer class
+
+#### Members
+
+ LayerProto layer_conf_;
+ vector<Blob<float>> datavec_, gradvec_;
+ vector<AuxType> aux_data_;
+
+The base layer class keeps the user configuration in `layer_conf_`.
+`datavec_` stores the features associated with this layer.
+There are layers without feature vectors; instead, they share the data from
+source layers.
+The `gradvec_` is for storing the gradients of the
+objective loss w.r.t. the `datavec_`. The `aux_data_` stores auxiliary data, e.g., image labels (with `AuxType` set to int).
+If images have a variable number of labels, `AuxType` can be defined as `vector<int>`.
+Currently, we hard-code `AuxType` to int; it will be added as a template argument of the Layer class later.
+
+If a layer has parameters, these parameters are declared using type
+[Param](param.html). Since some layers do not have
+parameters, we do not declare any `Param` in the base layer class.
+
+#### Functions
+
+ virtual void Setup(const LayerProto& conf, const vector<Layer*>& srclayers);
+ virtual void ComputeFeature(int flag, const vector<Layer*>& srclayers) = 0;
+ virtual void ComputeGradient(int flag, const vector<Layer*>& srclayers) = 0;
+
+The `Setup` function reads the user configuration, i.e., `conf`, and information
+from source layers, e.g., mini-batch size, to set the
+shape of the `datavec_` (and `gradvec_`) fields as well
+as some other layer-specific fields.
+Memory is not allocated until computation over the data structure happens.
+
+The `ComputeFeature` function evaluates the feature blob by transforming (e.g.
+convolution and pooling) features from the source layers. `ComputeGradient`
+computes the gradients of parameters associated with this layer. These two
+functions are invoked by the [TrainOneBatch](train-one-batch.html)
+function during training. Hence, they should be consistent with the
+`TrainOneBatch` function. Particularly, for feed-forward and RNN models, they are
+trained using [BP algorithm](train-one-batch.html#back-propagation),
+which requires each layer's `ComputeFeature`
+function to compute `data_` based on source layers, and requires each layer's
+`ComputeGradient` to compute gradients of parameters and source layers'
+`grad_`. For energy models, e.g., RBM, they are trained by
+[CD algorithm](train-one-batch.html#contrastive-divergence), which
+requires each layer's `ComputeFeature` function to compute the feature vectors
+for the positive phase or negative phase depending on the `flag` argument, and
+requires the `ComputeGradient` function to only compute parameter gradients.
+For some layers, e.g., loss layer or output layer, they can put the loss or
+prediction result into the `metric` argument, which will be averaged and
+displayed periodically.
+
+### Implementing a new Layer subclass
+
+Users can extend the Layer class or other subclasses to implement their own feature transformation
+logic as long as the two virtual functions are overridden to be consistent with
+the `TrainOneBatch` function. The `Setup` function may also be overridden to
+read specific layer configuration.
+
+The [RNNLM](rnn.html) provides a couple of user-defined layers. You can refer to them as examples.
+
+#### Layer specific protocol message
+
+To implement a new layer, the first step is to define the layer specific
+configuration. Suppose the new layer is `FooLayer`, the layer specific
+google protocol message `FooLayerProto` should be defined as
+
+    // in user.proto
+    package singa;
+    import "job.proto";
+    message FooLayerProto {
+      optional int32 a = 1;  // fields specific to FooLayer
+    }
+
+In addition, users need to extend the original `LayerProto` (defined in job.proto of SINGA)
+to include the `foo_conf` as follows.
+
+ extend LayerProto {
+ optional FooLayerProto foo_conf = 101; // unique field id, reserved for extensions
+ }
+
+If there are multiple new layers, then each layer that has a specific
+configuration would have a `<type>_conf` field and take one unique extension number.
+SINGA has reserved enough extension numbers, from 101 to 1000.
+
+    // job.proto of SINGA
+    message LayerProto {
+      ...
+      extensions 101 to 1000;
+    }
+
+With user.proto defined, users can use
+[protoc](https://developers.google.com/protocol-buffers/) to generate the `user.pb.cc`
+and `user.pb.h` files. In users' code, the extension fields can be accessed via,
+
+ auto conf = layer_proto_.GetExtension(foo_conf);
+ int a = conf.a();
+
+When defining configurations of the new layer (in job.conf), users should use
+`user_type` for its layer type instead of `type`. In addition, `foo_conf`
+should be enclosed in brackets.
+
+ layer {
+ name: "foo"
+ user_type: "kFooLayer" # Note user_type of user-defined layers is string
+ [foo_conf] { # Note there is a pair of [] for extension fields
+ a: 10
+ }
+ }
+
+#### New Layer subclass declaration
+
+The new layer subclass can be implemented like the built-in layer subclasses.
+
+ class FooLayer : public singa::Layer {
+ public:
+ void Setup(const LayerProto& conf, const vector<Layer*>& srclayers) override;
+ void ComputeFeature(int flag, const vector<Layer*>& srclayers) override;
+ void ComputeGradient(int flag, const vector<Layer*>& srclayers) override;
+
+ private:
+ // members
+ };
+
+Users must override the two virtual functions to be called by the
+`TrainOneBatch` for either BP or CD algorithm. Typically, the `Setup` function
+will also be overridden to initialize some members. The user configured fields
+can be accessed through `layer_conf_` as shown in the above paragraphs.
+
+#### New Layer subclass registration
+
+The newly defined layer should be registered in [main.cc](http://singa.incubator.apache.org/docs/programming-guide) by adding
+
+    driver.RegisterLayer<FooLayer, std::string>("kFooLayer"); // "kFooLayer" must match the user_type in job.conf
+
+After that, the [NeuralNet](neural-net.html) can create instances of the new Layer subclass.
Added: incubator/singa/site/trunk/content/markdown/v0.3.0/mesos.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/v0.3.0/mesos.md?rev=1740048&view=auto
==============================================================================
--- incubator/singa/site/trunk/content/markdown/v0.3.0/mesos.md (added)
+++ incubator/singa/site/trunk/content/markdown/v0.3.0/mesos.md Wed Apr 20 05:09:06 2016
@@ -0,0 +1,87 @@
+# Distributed Training on Mesos
+
+This guide explains how to start SINGA distributed training on a Mesos cluster. It assumes that both Mesos and HDFS are already running, and that every node has SINGA installed.
+We assume the architecture depicted below, in which the cluster nodes are Docker containers. Refer to the [Docker guide](docker.html) for details on how to start individual nodes and set up network connections between them (make sure [weave](http://weave.works/guides/weave-docker-ubuntu-simple.html) is running at each node, and that the cluster's head node is running in container `node0`).
+
+![Nothing](http://www.comp.nus.edu.sg/~dinhtta/files/singa_mesos.png)
+
+---
+
+## Start HDFS and Mesos
+Go inside each container, using:
+````
+docker exec -it nodeX /bin/bash
+````
+and configure it as follows:
+
+* On container `node0`
+
+ hadoop namenode -format
+ hadoop-daemon.sh start namenode
+ /opt/mesos-0.22.0/build/bin/mesos-master.sh --work_dir=/opt --log_dir=/opt --quiet > /dev/null &
+ zk-service.sh start
+
+* On container `node1, node2, ...`
+
+ hadoop-daemon.sh start datanode
+ /opt/mesos-0.22.0/build/bin/mesos-slave.sh --master=node0:5050 --log_dir=/opt --quiet > /dev/null &
+
+To check that the setup has been successful, verify that the HDFS namenode has registered `N` datanodes, via:
+
+````
+hadoop dfsadmin -report
+````
+
+#### Important
+If the Docker version is 1.9 or newer, make sure [name resolution is set up
+properly](docker.html#launch_pseudo).
+
+#### Mesos logs
+Mesos logs are stored at `/opt/lt-mesos-master.INFO` on `node0` and `/opt/lt-mesos-slave.INFO` on the other nodes.
+
+---
+
+## Starting SINGA training on Mesos
+Assuming that Mesos and HDFS are already started, a SINGA job can be launched from **any** container.
+
+#### Launching job
+
+1. Log in to any container, then
+
+        cd incubator-singa/tool/mesos
+<a name="job_start"></a>
+2. Check that configuration files are correct:
+ + `scheduler.conf` contains information about the master nodes
+ + `singa.conf` contains information about Zookeeper node0
+   + Job configuration file `job.conf` **contains the full paths to the example directories (NO RELATIVE PATHS!).**
+3. Start the job:
+ + If starting for the first time:
+
+ ./scheduler <job config file> -scheduler_conf <scheduler config file> -singa_conf <SINGA config file>
+ + If not the first time:
+
+ ./scheduler <job config file>
+
+**Notes.** Each running job is given a `frameworkID`. Look for the log message of the form:
+
+ Framework registered with XXX-XXX-XXX-XXX-XXX-XXX
+
+#### Monitoring and Debugging
+
+Each Mesos job is given a `frameworkID`, and a *sandbox* directory is created for it.
+The directory is under the specified `work_dir` (`/tmp/mesos` by default). For example, errors
+during SINGA execution can be found at:
+
+ /tmp/mesos/slaves/xxxxx-Sx/frameworks/xxxxx/executors/SINGA_x/runs/latest/stderr
+
+Other artifacts, like files downloaded from HDFS (`job.conf`) and `stdout`, can be found in the same
+directory.
+
+#### Stopping
+
+There are two ways to kill the running job:
+
+1. If the scheduler is running in the foreground, simply kill it (using `Ctrl-C`, for example).
+
+2. If the scheduler is running in the background, kill it using Mesos's REST API:
+
+ curl -d "frameworkId=XXX-XXX-XXX-XXX-XXX-XXX" -X POST http://<master>/master/shutdown
+