Posted to commits@singa.apache.org by wa...@apache.org on 2015/07/20 15:46:50 UTC
svn commit: r1691945 - in /incubator/singa/site/trunk/content/markdown/docs:
model-config.md program-model.md
Author: wangwei
Date: Mon Jul 20 13:46:49 2015
New Revision: 1691945
URL: http://svn.apache.org/r1691945
Log:
add the programming model page
Modified:
incubator/singa/site/trunk/content/markdown/docs/model-config.md
incubator/singa/site/trunk/content/markdown/docs/program-model.md
Modified: incubator/singa/site/trunk/content/markdown/docs/model-config.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/model-config.md?rev=1691945&r1=1691944&r2=1691945&view=diff
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/model-config.md (original)
+++ incubator/singa/site/trunk/content/markdown/docs/model-config.md Mon Jul 20 13:46:49 2015
@@ -14,22 +14,6 @@ has a model.conf file.
### NeuralNet
-#### Deep learning training
-
-Deep learning is labeled as a feature learning technique, which usually
-consists of multiple layers. Each layer is associated a feature transformation
-function. After going through all layers, the raw input feature (e.g., pixels
-of images) would be converted into a high-level feature that is easier for
-tasks like classification.
-
-Training a deep learning model is to find the optimal parameters involved in
-the transformation functions that generates good features for specific tasks.
-The goodness of a set of parameters is measured by a loss function, e.g.,
-[Cross-Entropy Loss](https://en.wikipedia.org/wiki/Cross_entropy). Since the
-loss functions are usually non-linear and non-convex, it is difficult to get a
-closed form solution. Normally, people uses the SGD algorithm which randomly
-initializes the parameters and then iteratively update them to reduce the loss.
-
#### Uniform model (neuralnet) representation
<img src = "../images/model-categorization.png" style = "width: 400px"> Fig. 1:
Modified: incubator/singa/site/trunk/content/markdown/docs/program-model.md
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/content/markdown/docs/program-model.md?rev=1691945&r1=1691944&r2=1691945&view=diff
==============================================================================
--- incubator/singa/site/trunk/content/markdown/docs/program-model.md (original)
+++ incubator/singa/site/trunk/content/markdown/docs/program-model.md Mon Jul 20 13:46:49 2015
@@ -0,0 +1,101 @@
+## Programming Model
+
+This page describes the programming model of SINGA, i.e., how to implement a
+new model and submit a training job. The programming model is largely
+transparent to the underlying distributed environment, so users do not need to
+worry about the communication and synchronization of nodes, which is discussed
+in detail in [architecture](architecture.html).
+
+### Deep learning training
+
+Deep learning is a feature learning technique that usually involves multiple
+layers, each associated with a feature transformation function. After going
+through all layers, a raw input feature (e.g., the pixels of an image) is
+converted into a high-level feature that is easier to use for tasks like
+classification.
+
+Training a deep learning model means finding the optimal parameters of the
+transformation functions, i.e., those that generate good features for the
+specific task. The goodness of a set of parameters is measured by a loss
+function, e.g., [Cross-Entropy Loss](https://en.wikipedia.org/wiki/Cross_entropy).
+Since loss functions are usually non-linear and non-convex, it is difficult to
+obtain a closed-form solution. Typically, the stochastic gradient descent
+(SGD) algorithm is used: it randomly initializes the parameters and then
+iteratively updates them to reduce the loss.
+
+
+### Steps to submit a training job
+
+SINGA uses the stochastic gradient descent (SGD) algorithm to train the
+parameters of deep learning models. In each SGD iteration, a
+[Worker](architecture.html) computes the gradients of the parameters over the
+NeuralNet, and an [Updater]() updates the parameter values based on those
+gradients. SINGA implements three gradient-calculation algorithms: back
+propagation for feed-forward models, back propagation through time for
+recurrent neural networks, and contrastive divergence for energy models like
+RBM and DBM. Several SGD updater variants are also provided, including
+[AdaDelta](http://arxiv.org/pdf/1212.5701v1.pdf),
+[AdaGrad](http://www.magicbroom.info/Papers/DuchiHaSi10.pdf),
+[RMSProp](http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf), and
+[Nesterov](http://scholar.google.com/citations?view_op=view_citation&hl=en&user=DJ8Ep8YAAAAJ&citation_for_view=DJ8Ep8YAAAAJ:hkOj_22Ku90C).
+
+Consequently, to submit a training job, a user needs to:
+
+ 1. [Prepare the data](data.html) for training, validation and test.
+
+ 2. [Implement any new Layers](layer.html) needed to support the feature
+ transformations required by the new model.
+
+ 3. Configure the training job, including the [cluster setting](architecture.html)
+ and the [model configuration](model-config.html).
+
+### Driver program
+
+Each training job has a driver program that
+
+ * registers the layers implemented by the user and,
+
+ * starts the [Trainer](https://github.com/apache/incubator-singa/blob/master/include/trainer/trainer.h)
+ by providing the job configuration.
+
+An example driver program looks like this:
+
+ #include "singa.h"
+ #include "user-layer.h" // header for user defined layers
+
+ DEFINE_int32(job, -1, "Job ID"); // job ID generated by the SINGA script
+ DEFINE_string(workspace, "examples/mnist/", "workspace of the training job");
+ DEFINE_bool(resume, false, "resume from checkpoint");
+
+ int main(int argc, char** argv) {
+ google::InitGoogleLogging(argv[0]);
+ gflags::ParseCommandLineFlags(&argc, &argv, true);
+
+ // register all user defined layers in user-layer.h
+ Register(kFooLayer, FooLayer);
+ ...
+
+ JobProto jobConf;
+ // read job configuration from text conf file
+ ReadProtoFromTextFile(&jobConf, FLAGS_workspace + "/job.conf");
+ Trainer trainer;
+ trainer.Start(FLAGS_job, jobConf, FLAGS_resume);
+ }
+
+Users can also configure the job in the driver program instead of writing a
+configuration file:
+
+
+ JobProto jobConf;
+ jobConf.set_job_name("my singa job");
+ ... // configure cluster and model
+ Trainer trainer;
+ trainer.Start(FLAGS_job, jobConf, FLAGS_resume);
+
+In the future, we will provide helper functions to make the configuration
+easier, similar to [keras](https://github.com/fchollet/keras).
+
+Compile and link the driver program with the SINGA library to generate an
+executable, e.g., named `mysinga`. To submit the job, pass the path of the
+executable and the workspace to the SINGA job submission script:
+
+ ./bin/singa-run.sh <path to mysinga> -workspace=<my job workspace>