Posted to commits@singa.apache.org by wa...@apache.org on 2015/09/28 08:40:15 UTC

svn commit: r1705605 - in /incubator/singa/site/trunk/content/markdown: develop/schedule.md docs/checkpoint.md docs/debug.md docs/installation.md docs/layer.md docs/programming-guide.md docs/quick-start.md docs/rbm.md

Author: wangwei
Date: Mon Sep 28 06:40:14 2015
New Revision: 1705605

URL: http://svn.apache.org/viewvc?rev=1705605&view=rev
Log:
update docs of rbm (.bin), layer and installation to be consistent with the code (README.md)

Modified:
    incubator/singa/site/trunk/content/markdown/develop/schedule.md
    incubator/singa/site/trunk/content/markdown/docs/checkpoint.md
    incubator/singa/site/trunk/content/markdown/docs/debug.md
    incubator/singa/site/trunk/content/markdown/docs/installation.md
    incubator/singa/site/trunk/content/markdown/docs/layer.md
    incubator/singa/site/trunk/content/markdown/docs/programming-guide.md
    incubator/singa/site/trunk/content/markdown/docs/quick-start.md
    incubator/singa/site/trunk/content/markdown/docs/rbm.md

Modified: incubator/singa/site/trunk/content/markdown/develop/schedule.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/develop/schedule.md?rev=1705605&r1=1705604&r2=1705605&view=diff
==============================================================================
--- incubator/singa/site/trunk/content/markdown/develop/schedule.md (original)
+++ incubator/singa/site/trunk/content/markdown/develop/schedule.md Mon Sep 28 06:40:14 2015
@@ -3,7 +3,7 @@
 
 | Release | Module| Feature | Status |
 |---------|---------|-------------|--------|
-| 0.1 September    | Neural Network |1.1. Feed forward neural network, including CNN, MLP | done|
+| 0.1 Sep.     | Neural Network |1.1. Feed forward neural network, including CNN, MLP | done|
 |         |          |1.2. RBM-like model, including RBM | done|
 |         |                |1.3. Recurrent neural network, including standard RNN | done|
 |         | Architecture   |1.4. One worker group on single node (with data partition)| done|
@@ -14,15 +14,11 @@
 |         |                |1.9. Load-balance among servers | done|
 |         | Failure recovery|1.10. Checkpoint and restore |done|
 |         | Tools|1.11. Installation with GNU auto tools| done|
-|0.2 October  | Neural Network |2.1. Feed forward neural network, including auto-encoders, hinge loss layers, HDFS data layers||
-| |                |2.2. RBM-like model, including DBM | |
-|         |                |2.3. Recurrent neural network, including LSTM| |
-|         |                |2.4. Model partition ||
-|         | Communication  |2.5. MPI||
-|         | GPU            |2.6. Single GPU ||
-|         |                |2.7. Multiple GPUs on single node||
-|         | Resource Management |1.9. Integration with Mesos ||
-|         | Architecture   |2.8. Update to support GPUs
-|         | Fault Tolerance|2.9. Node failure detection and recovery||
-|         | Binding        |2.9. Python binding ||
-|         | User Interface |2.10. Web front-end for job submission and performance visualization||
+|0.2 Nov.  | Neural Network |2.1. Feed forward neural network, including VGG model, CSV input layer, HDFS output layer, etc.||
+|         |                |2.2. Recurrent neural network, including GRU and LSTM| |
+|         | |2.3. Model partition and hybrid partition||
+|         | Configuration   |2.4. Configuration helpers for popular models, e.g., CNN, MLP, Auto-encoders||
+|         | Tools |2.5. Integration with Mesos for resource management||
+|         |               |2.6. Prepare Docker images for deployment||
+|         | Binding        |2.7. Python binding for major components ||
+|         | GPU            |2.8. Single node with multiple GPUs ||

Modified: incubator/singa/site/trunk/content/markdown/docs/checkpoint.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/checkpoint.md?rev=1705605&r1=1705604&r2=1705605&view=diff
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/checkpoint.md (original)
+++ incubator/singa/site/trunk/content/markdown/docs/checkpoint.md Mon Sep 28 06:40:14 2015
@@ -27,7 +27,7 @@ For example,
     checkpoint_frequency: 300
     ...
 
-Checkpointing files are located at *WORKSPACE/checkpoint/stepSTEP-workerWORKERID.bin*.
+Checkpointing files are located at *WORKSPACE/checkpoint/stepSTEP-workerWORKERID*.
 *WORKSPACE* is configured in
 
     cluster {
@@ -37,8 +37,8 @@ Checkpointing files are located at *WORK
 For the above configuration, after training for 700 steps, there would be
 two checkpointing files,
 
-    step400-worker0.bin
-    step700-worker0.bin
+    step400-worker0
+    step700-worker0
 
 ## Application - resuming training
 
@@ -54,15 +54,15 @@ We can also use the checkpointing file f
 a new model by configuring the new job as,
 
     # job.conf
-    checkpoint : "WORKSPACE/checkpoint/step400-worker0.bin"
+    checkpoint : "WORKSPACE/checkpoint/step400-worker0"
     ...
 
 If there are multiple checkpointing files for the same snapshot due to model
 partitioning, all the checkpointing files should be added,
 
     # job.conf
-    checkpoint : "WORKSPACE/checkpoint/step400-worker0.bin"
-    checkpoint : "WORKSPACE/checkpoint/step400-worker1.bin"
+    checkpoint : "WORKSPACE/checkpoint/step400-worker0"
+    checkpoint : "WORKSPACE/checkpoint/step400-worker1"
     ...
 
 The training command is the same as starting a new job,

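[Editor's note] The checkpoint naming above follows a regular pattern: a snapshot at step 400 with two model partitions yields one file per worker, and each file becomes one `checkpoint` line in job.conf. A small shell sketch (the workspace path and step number are made-up values for illustration, not SINGA defaults):

```shell
# Sketch: emit job.conf checkpoint lines for a two-partition snapshot.
# The workspace path and step number are illustrative values only.
WORKSPACE=./demo_workspace
mkdir -p "$WORKSPACE/checkpoint"
touch "$WORKSPACE/checkpoint/step400-worker0" \
      "$WORKSPACE/checkpoint/step400-worker1"
# one "checkpoint" config line per partition file
for f in "$WORKSPACE"/checkpoint/step400-worker*; do
  echo "checkpoint : \"$f\""
done
```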
Modified: incubator/singa/site/trunk/content/markdown/docs/debug.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/debug.md?rev=1705605&r1=1705604&r2=1705605&view=diff
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/debug.md (original)
+++ incubator/singa/site/trunk/content/markdown/docs/debug.md Mon Sep 28 06:40:14 2015
@@ -18,7 +18,7 @@ To debug, first start zookeeper if it is
     # do this for only once
     ./bin/zk-service.sh start
     # do this every time
-    gdb ./bin/singa
+    gdb .libs/singa
 
 Then set the command line arguments
 

Modified: incubator/singa/site/trunk/content/markdown/docs/installation.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/installation.md?rev=1705605&r1=1705604&r2=1705605&view=diff
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/installation.md (original)
+++ incubator/singa/site/trunk/content/markdown/docs/installation.md Mon Sep 28 06:40:14 2015
@@ -26,7 +26,19 @@ Optional dependencies include:
   * lmdb version 0.9.10
 
 
-SINGA comes with a script for installing the above libraries (see below).
+You can install all the dependencies into a $PREFIX folder by
+
+    ./thirdparty/install.sh all $PREFIX
+
+If $PREFIX is not a system path (e.g., /usr/local/), please export the following
+variables before continuing with the build instructions,
+
+    export LD_LIBRARY_PATH=$PREFIX/lib:$LD_LIBRARY_PATH
+    export CPLUS_INCLUDE_PATH=$PREFIX/include:$CPLUS_INCLUDE_PATH
+    export LIBRARY_PATH=$PREFIX/lib:$LIBRARY_PATH
+    export PATH=$PREFIX/bin:$PATH
+
+More details on using this script are given below.
 
 ## Building SINGA from source
 
@@ -60,18 +72,16 @@ There are two ways to build SINGA,
 
         $ ./configure --enable-lmdb
 
+<!---
+Zhongle: please update the code to use the following command
 
-The SINGA test is not included by default settings. If you want to run the
-test, please compile with `--enable-test`. You can run:
-
-
-    $ ./configure --enable-test
-    $ make
+    $ make test
 
 After compilation, you will find the binary file singatest. Just run it!
 More details about configure script can be found by running:
 
-		$ ./configure --help
+		$ ./configure -h
+-->
 
 After compiling SINGA successfully, the *libsinga.so* and the executable file
 *singa* will be generated into *.libs/* folder.
@@ -79,8 +89,15 @@ After compiling SINGA successfully, the
 If some dependent libraries are missing (or not detected), you can use the
 following script to download and install them:
 
+<!---
+to be updated after zhongle changes the code to use
+
+    ./install.sh libname \-\-prefix=
+
+-->
+
     $ cd thirdparty
-    $ ./install.sh MISSING_LIBRARY_NAME1 YOUR_INSTALL_PATH1 MISSING_LIBRARY_NAME2 YOUR_INSTALL_PATH2 ...
+    $ ./install.sh LIB_NAME PREFIX
 
 If you do not specify the installation path, the library will be installed in
 the default folder specified by the software itself.  For example, if you want
@@ -90,7 +107,7 @@ to install `zeromq` library in the defau
 
 Or, if you want to install it into another folder,
 
-    $ ./install.sh zeromq --prefix=YOUR_FOLDER
+    $ ./install.sh zeromq PREFIX
 
 You can also install all dependencies in */usr/local* directory:
 
@@ -98,8 +115,7 @@ You can also install all dependencies in
 
 Here is a table showing the first arguments:
 
-    MISSING_LIBRARY_NAME  LIBRARIES
-    cmake                 cmake tools
+    LIB_NAME              LIBRARIES
     czmq*                 czmq lib
     glog                  glog lib
     lmdb                  lmdb lib
@@ -112,50 +128,120 @@ Here is a table showing the first argume
 indicate `zeromq` location.
 The installation commands of `czmq` is:
 
-    $./install.sh czmq /usr/local /usr/local/zeromq
+<!---
+to be updated to
+
+    $./install.sh czmq  \-\-prefix=/usr/local \-\-zeromq=/usr/local/zeromq
+-->
+
+    $./install.sh czmq  /usr/local -f=/usr/local/zeromq
 
 After the execution, `czmq` will be installed in */usr/local*. The last path
 specifies the path to zeromq.
 
 ### FAQ
+* Q1: I get the error `./configure --> cannot find blas_segmm() function` even
+though I have installed OpenBLAS.
+
+  A1: This means the compiler cannot find the `OpenBLAS` library. If you installed
+  it to $PREFIX (e.g., /opt/OpenBLAS), then you need to export it as
+
+      $ export LIBRARY_PATH=$PREFIX/lib:$LIBRARY_PATH
+      # e.g.,
+      $ export LIBRARY_PATH=/opt/OpenBLAS/lib:$LIBRARY_PATH
+
+
+* Q2: I get error `cblas.h no such file or directory exists`.
+
+  A2: You need to add the folder containing cblas.h to CPLUS_INCLUDE_PATH,
+  e.g.,
+
+      $ export CPLUS_INCLUDE_PATH=$PREFIX/include:$CPLUS_INCLUDE_PATH
+      # e.g.,
+      $ export CPLUS_INCLUDE_PATH=/opt/OpenBLAS/include:$CPLUS_INCLUDE_PATH
+      # then reconfigure and make SINGA
+      $ ./configure
+      $ make
+
+
+* Q3: While compiling SINGA, I get the error `SSE2 instruction set not enabled`.
 
-Q1:While compiling SINGA and installing `glog` on max OS X, I get fatal error
+  A3: You can try the following command:
+
+      $ make CFLAGS='-msse2' CXXFLAGS='-msse2'
+
+
+* Q4: I get `ImportError: cannot import name enum_type_wrapper` from
+google.protobuf.internal when I try to import .py files.
+
+  A4: After installing Google protobuf with `make install`, you also need to
+  install the Python runtime libraries. Go to the protobuf source directory and run:
+
+      $ cd /PROTOBUF/SOURCE/FOLDER
+      $ cd python
+      $ python setup.py build
+      $ python setup.py install
+
+  You may need `sudo` when you try to install python runtime libraries in
+  the system folder.
+
+
+* Q5: I get a linking error caused by gflags.
+
+  A5: SINGA does not depend on gflags, but you may have installed glog with
+  gflags enabled. In that case you can reinstall glog using *thirdparty/install.sh* into
+  another folder and export LDFLAGS and CPPFLAGS to include that folder.
+
+
+* Q6: While compiling SINGA and installing `glog` on Mac OS X, I get fatal error
 `'ext/slist' file not found`
 
-A1:Please install `glog` individually and try :
+  A6: Please install `glog` individually and try:
 
-    $ make CFLAGS='-stdlib=libstdc++' CXXFLAGS='stdlib=libstdc++'
+      $ make CFLAGS='-stdlib=libstdc++' CXXFLAGS='-stdlib=libstdc++'
 
+* Q7: When I start a training job, it reports an error related to "ZOO_ERROR...zk retcode=-4...".
 
-Q2:While compiling SINGA, I get error `SSE2 instruction set not enabled`
+  A7: This is because zookeeper is not running. Please start the zookeeper service:
 
-A2:You can try following command:
+      $ ./bin/zk-service.sh start
 
-    $ make CFLAGS='-msse2' CXXFLAGS='-msse2'
+  If the error persists, you probably do not have Java installed. You can simply
+  check it by
 
-Q3:I get error `./configure --> cannot find blas_segmm() function` even I
-run `install.sh OpenBLAS`.
+      $ java -version
 
-A3:Since `OpenBLAS` library is installed in `/opt` folder by default or
-`/other/folder` by your preference, you may edit your environment settings.
-You need add its default installation directories before linking, just
-run:
+* Q8: When I build OpenBLAS from source, I am told that I need a fortran compiler.
 
-    $ export LDFLAGS=-L/opt
+  A8: You can compile OpenBLAS by
 
-Or as an alternative option, you can also edit LIBRARY_PATH to figure it out.
+      $ make ONLY_CBLAS=1
 
+  or install it using
 
-Q4:I get `ImportError: cannot import name enum_type_wrapper` from
-google.protobuf.internal when I try to import .py files.
+	    $ sudo apt-get install libopenblas-dev
+
+  or
+
+	    $ sudo yum install openblas-devel
+
+  It is worth noting that you need root access to run the last two commands.
+  Remember to set the environment variables to include the header and library
+  paths of OpenBLAS after installation (please refer to the Dependencies section).
+
+* Q9: When I build protocol buffer, it reports that GLIBCXX_3.4.20 is not found in /usr/lib64/libstdc++.so.6.
+
+  A9: This means the dynamic linker found a libstdc++.so.6 that belongs to an
+  older version of GCC than the one used to compile and link the program. The
+  program depends on code defined in the newer libstdc++ that belongs to the
+  newer version of GCC, so the dynamic linker must be told where to find the
+  newer libstdc++ shared library.
+  The simplest fix is to locate the correct libstdc++ and add its directory to
+  LD_LIBRARY_PATH. For example, if GLIBCXX_3.4.20 is listed in the output of the
+  following command,
 
-A4:After install google protobuf by `make install`, we should install python
-runtime libraries. Go to protobuf source directory, run:
+      $ strings /usr/local/lib64/libstdc++.so.6 | grep GLIBCXX
 
-    $ cd /PROTOBUF/SOURCE/FOLDER
-    $ cd python
-    $ python setup.py build
-    $ python setup.py install
+  then you just set your environment variable as
 
-You may need `sudo` when you try to install python runtime libraries in
-the system folder.
+      $ export LD_LIBRARY_PATH=/usr/local/lib64:$LD_LIBRARY_PATH

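[Editor's note] A minimal sketch of the A9 fix above, assuming the newer libstdc++ lives in /usr/local/lib64 (a hypothetical location): prepend that directory to LD_LIBRARY_PATH so the dynamic linker searches it first.

```shell
# Sketch of the A9 fix: prepend the directory holding the newer libstdc++
# to LD_LIBRARY_PATH. The directory below is a hypothetical example.
NEWER_LIBDIR=/usr/local/lib64
export LD_LIBRARY_PATH=$NEWER_LIBDIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}
echo "$LD_LIBRARY_PATH"
```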
Modified: incubator/singa/site/trunk/content/markdown/docs/layer.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/layer.md?rev=1705605&r1=1705604&r2=1705605&view=diff
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/layer.md (original)
+++ incubator/singa/site/trunk/content/markdown/docs/layer.md Mon Sep 28 06:40:14 2015
@@ -366,12 +366,10 @@ implement a new Layer subclass.
 
 #### Members
 
-    LayerProto layer_proto_;
+    LayerProto layer_conf_;
     Blob<float> data_, grad_;
-    vector<Layer*> srclayers_, dstlayers_;
 
-The base layer class keeps the user configuration in `layer_proto_`. Source
-layers and destination layers are stored in `srclayers_` and `dstlayers_`, respectively.
+The base layer class keeps the user configuration in `layer_conf_`.
 Almost all layers have $b$ (mini-batch size) feature vectors, which are stored
 in the `data_` [Blob](../api/classsinga_1_1Blob.html) (A Blob is a chunk of memory space, proposed in
 [Caffe](http://caffe.berkeleyvision.org/)).
@@ -390,14 +388,16 @@ parameters, we do not declare any `Param
 
 #### Functions
 
-    virtual void Setup(const LayerProto& proto, int npartitions = 1);
-    virtual void ComputeFeature(Phase phase, Metric* perf) = 0;
-    virtual void ComputeGradient(Phase phase) = 0;
+    virtual void Setup(const LayerProto& conf, const vector<Layer*>& srclayers);
+    virtual void ComputeFeature(int flag, const vector<Layer*>& srclayers) = 0;
+    virtual void ComputeGradient(int flag, const vector<Layer*>& srclayers) = 0;
 
-The `Setup` function reads user configuration, i.e. `proto`, and information
+The `Setup` function reads user configuration, i.e. `conf`, and information
 from source layers, e.g., mini-batch size,  to set the
 shape of the `data_` (and `grad_`) field as well
-as some other layer specific fields. If `npartitions` is larger than 1, then
+as some other layer specific fields.
+<!---
+If `npartitions` is larger than 1, then
 users need to reduce the sizes of `data_`, `grad_` Blobs or Param objects. For
 example, if the `partition_dim=0` and there is no source layer, e.g., this
 layer is a (bottom) data layer, then its `data_` and `grad_` Blob should have
@@ -405,8 +405,9 @@ layer is a (bottom) data layer, then its
 dimension 0, then this layer should have the same number of feature vectors as
 the source layer. More complex partition cases are discussed in
 [Neural net partitioning](neural-net.html#neural-net-partitioning). Typically, the
-Setup function just set the shapes of `data_` Blobs and Param objects. Memory
-will not be allocated until computation over the data structure happens.
+Setup function just sets the shapes of `data_` Blobs and Param objects.
+-->
+Memory will not be allocated until computation over the data structure happens.
 
 The `ComputeFeature` function evaluates the feature blob by transforming (e.g.
 convolution and pooling) features from the source layers.  `ComputeGradient`
@@ -434,6 +435,8 @@ logics as long as the two virtual functi
 the `TrainOneBatch` function. The `Setup` function may also be overridden to
 read specific layer configuration.
 
+The [RNNLM](rnn.html) application implements a couple of user-defined layers, which you can use as examples.
+
 #### Layer specific protocol message
 
 To implement a new layer, the first step is to define the layer specific
@@ -489,9 +492,9 @@ The new layer subclass can be implemente
 
     class FooLayer : public singa::Layer {
      public:
-      void Setup(const LayerProto& proto, int npartitions = 1) override;
-      void ComputeFeature(Phase phase, Metric* perf) override;
-      void ComputeGradient(Phase phase) override;
+      void Setup(const LayerProto& conf, const vector<Layer*>& srclayers) override;
+      void ComputeFeature(int flag, const vector<Layer*>& srclayers) override;
+      void ComputeGradient(int flag, const vector<Layer*>& srclayers) override;
 
      private:
       //  members
@@ -500,7 +503,7 @@ The new layer subclass can be implemente
 Users must override the two virtual functions to be called by the
 `TrainOneBatch` for either BP or CD algorithm. Typically, the `Setup` function
 will also be overridden to initialize some members. The user configured fields
-can be accessed through `layer_proto_` as shown in the above paragraphs.
+can be accessed through `layer_conf_` as shown in the above paragraphs.
 
 #### New Layer subclass registration
 

Modified: incubator/singa/site/trunk/content/markdown/docs/programming-guide.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/programming-guide.md?rev=1705605&r1=1705604&r2=1705605&view=diff
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/programming-guide.md (original)
+++ incubator/singa/site/trunk/content/markdown/docs/programming-guide.md Mon Sep 28 06:40:14 2015
@@ -69,7 +69,7 @@ An example main function is like
       auto jobConf = driver.job_conf();
       //  update jobConf
 
-      driver.Submit(resume, jobConf);
+      driver.Train(resume, jobConf);
       return 0;
     }
 

Modified: incubator/singa/site/trunk/content/markdown/docs/quick-start.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/quick-start.md?rev=1705605&r1=1705604&r2=1705605&view=diff
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/quick-start.md (original)
+++ incubator/singa/site/trunk/content/markdown/docs/quick-start.md Mon Sep 28 06:40:14 2015
@@ -48,6 +48,7 @@ available at [CNN example](cnn.html).
 Download the dataset and create the data shards for training and testing.
 
     cd examples/cifar10/
+    cp Makefile.example Makefile
     make download
     make create
 

Modified: incubator/singa/site/trunk/content/markdown/docs/rbm.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/rbm.md?rev=1705605&r1=1705604&r2=1705605&view=diff
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/rbm.md (original)
+++ incubator/singa/site/trunk/content/markdown/docs/rbm.md Mon Sep 28 06:40:14 2015
@@ -192,7 +192,7 @@ The neural net configuration is (with la
 
 To load w0 and b02 from RBM0's checkpoint file, we configure the `checkpoint_path` as,
 
-    checkpoint_path: "examples/rbm/rbm1/checkpoint/step6000-worker0.bin"
+    checkpoint_path: "examples/rbm/rbm1/checkpoint/step6000-worker0"
     cluster{
       workspace: "examples/rbm/rbm2"
     }
@@ -337,10 +337,10 @@ configuration is (with some of the middl
 To load pre-trained parameters from the 4 RBMs' checkpoint file we configure `checkpoint_path` as
 
     ### Checkpoint Configuration
-    checkpoint_path: "examples/rbm/checkpoint/rbm1/checkpoint/step6000-worker0.bin"
-    checkpoint_path: "examples/rbm/checkpoint/rbm2/checkpoint/step6000-worker0.bin"
-    checkpoint_path: "examples/rbm/checkpoint/rbm3/checkpoint/step6000-worker0.bin"
-    checkpoint_path: "examples/rbm/checkpoint/rbm4/checkpoint/step6000-worker0.bin"
+    checkpoint_path: "examples/rbm/checkpoint/rbm1/checkpoint/step6000-worker0"
+    checkpoint_path: "examples/rbm/checkpoint/rbm2/checkpoint/step6000-worker0"
+    checkpoint_path: "examples/rbm/checkpoint/rbm3/checkpoint/step6000-worker0"
+    checkpoint_path: "examples/rbm/checkpoint/rbm4/checkpoint/step6000-worker0"
 
 
 ## Visualization Results
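[Editor's note] The four checkpoint_path lines above follow a regular pattern, so they can be generated rather than typed by hand. A small sketch, with paths mirroring the example layout in this document:

```shell
# Sketch: generate the checkpoint_path lines for the four pre-trained RBMs.
# Paths follow the example layout shown above.
for i in 1 2 3 4; do
  echo "checkpoint_path: \"examples/rbm/checkpoint/rbm$i/checkpoint/step6000-worker0\""
done
```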