Posted to commits@singa.apache.org by bu...@apache.org on 2015/07/20 15:47:26 UTC

svn commit: r959027 - in /websites/staging/singa/trunk/content: ./ docs/model-config.html docs/program-model.html

Author: buildbot
Date: Mon Jul 20 13:47:26 2015
New Revision: 959027

Log:
Staging update by buildbot for singa

Modified:
    websites/staging/singa/trunk/content/   (props changed)
    websites/staging/singa/trunk/content/docs/model-config.html
    websites/staging/singa/trunk/content/docs/program-model.html

Propchange: websites/staging/singa/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Mon Jul 20 13:47:26 2015
@@ -1 +1 @@
-1691875
+1691945

Modified: websites/staging/singa/trunk/content/docs/model-config.html
==============================================================================
--- websites/staging/singa/trunk/content/docs/model-config.html (original)
+++ websites/staging/singa/trunk/content/docs/model-config.html Mon Jul 20 13:47:26 2015
@@ -451,10 +451,6 @@
 <div class="section">
 <h3><a name="NeuralNet"></a>NeuralNet</h3>
 <div class="section">
-<h4><a name="Deep_learning_training"></a>Deep learning training</h4>
-<p>Deep learning is labeled as a feature learning technique, which usually consists of multiple layers. Each layer is associated a feature transformation function. After going through all layers, the raw input feature (e.g., pixels of images) would be converted into a high-level feature that is easier for tasks like classification.</p>
-<p>Training a deep learning model is to find the optimal parameters involved in the transformation functions that generates good features for specific tasks. The goodness of a set of parameters is measured by a loss function, e.g., <a class="externalLink" href="https://en.wikipedia.org/wiki/Cross_entropy">Cross-Entropy Loss</a>. Since the loss functions are usually non-linear and non-convex, it is difficult to get a closed form solution. Normally, people uses the SGD algorithm which randomly initializes the parameters and then iteratively update them to reduce the loss.</p></div>
-<div class="section">
 <h4><a name="Uniform_model_neuralnet_representation"></a>Uniform model (neuralnet) representation</h4>
 <p><img src="../images/model-categorization.png" style="width: 400px" alt="Deep learning model categorization" /> Fig. 1: Deep learning model categorization</p>
 <p>Many deep learning models have been proposed. Fig. 1 categorizes popular deep learning models based on their layer connections. The <a class="externalLink" href="https://github.com/apache/incubator-singa/blob/master/include/neuralnet/neuralnet.h">NeuralNet</a> abstraction of SINGA consists of multiple directly connected layers, and is able to represent models from all three categories.</p>

Modified: websites/staging/singa/trunk/content/docs/program-model.html
==============================================================================
--- websites/staging/singa/trunk/content/docs/program-model.html (original)
+++ websites/staging/singa/trunk/content/docs/program-model.html Mon Jul 20 13:47:26 2015
@@ -9,7 +9,7 @@
     <meta name="viewport" content="width=device-width, initial-scale=1.0" />
     <meta name="Date-Revision-yyyymmdd" content="20150720" />
     <meta http-equiv="Content-Language" content="en" />
-    <title>Apache SINGA &#x2013; </title>
+    <title>Apache SINGA &#x2013; Programming Model</title>
     <link rel="stylesheet" href="../css/apache-maven-fluido-1.4.min.css" />
     <link rel="stylesheet" href="../css/site.css" />
     <link rel="stylesheet" href="../css/print.css" media="print" />
@@ -195,7 +195,7 @@
         Apache SINGA</a>
                     <span class="divider">/</span>
       </li>
-        <li class="active "></li>
+        <li class="active ">Programming Model</li>
         
                 
                     
@@ -445,7 +445,81 @@
                         
         <div id="bodyColumn"  class="span10" >
                                   
-            
+            <div class="section">
+<h2><a name="Programming_Model"></a>Programming Model</h2>
+<p>We describe the programming model of SINGA to give users instructions for implementing a new model and submitting a training job. The programming model is largely transparent to the underlying distributed environment; hence users do not need to worry much about the communication and synchronization of nodes, which is discussed in detail in <a href="architecture.html">architecture</a>.</p>
+<div class="section">
+<h3><a name="Deep_learning_training"></a>Deep learning training</h3>
+<p>Deep learning is regarded as a feature learning technique that usually involves multiple layers, each associated with a feature transformation function. After going through all layers, the raw input features (e.g., the pixels of an image) are converted into high-level features that are easier to use for tasks like classification.</p>
+<p>Training a deep learning model means finding the optimal parameters of the transformation functions, i.e., those that generate good features for specific tasks. The goodness of a set of parameters is measured by a loss function, e.g., <a class="externalLink" href="https://en.wikipedia.org/wiki/Cross_entropy">Cross-Entropy Loss</a>. Since the loss functions are usually non-linear and non-convex, it is difficult to obtain a closed-form solution. Normally, people use the SGD algorithm, which randomly initializes the parameters and then iteratively updates them to reduce the loss.</p>
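+<p>To make the update rule concrete, below is a minimal sketch (not SINGA's implementation) of one vanilla SGD step over a parameter vector; the learning rate and the gradient are assumed to be given, e.g., with the gradient computed by back-propagation.</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint">#include &lt;vector&gt;
+
+// One vanilla SGD step: param &lt;- param - lr * grad.
+// The gradient is assumed to have been computed already.
+void SGDStep(std::vector&lt;float&gt;* param,
+             const std::vector&lt;float&gt;&amp; grad, float lr) {
+  for (size_t i = 0; i &lt; param-&gt;size(); ++i)
+    (*param)[i] -= lr * grad[i];
+}
+</pre></div></div>
+<p>Each SGD iteration applies such a step using a gradient estimated on a random mini-batch of training data.</p></div>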
+<div class="section">
+<h3><a name="Steps_to_submit_a_training_job"></a>Steps to submit a training job</h3>
+<p>SINGA uses the stochastic gradient descent (SGD) algorithm to train the parameters of deep learning models. For each SGD iteration, a <a href="architecture.html">Worker</a> computes the gradients of parameters from the NeuralNet, and an <a href="">Updater</a> updates the parameter values based on the gradients. SINGA has implemented three algorithms for gradient calculation, namely back-propagation for feed-forward models, back-propagation through time for recurrent neural networks, and contrastive divergence for energy models like RBM and DBM. Several SGD updater variants are also provided, including <a class="externalLink" href="http://arxiv.org/pdf/1212.5701v1.pdf">AdaDelta</a>, <a class="externalLink" href="http://www.magicbroom.info/Papers/DuchiHaSi10.pdf">AdaGrad</a>, <a class="externalLink" href="http://www.cs.toronto.edu/~tijmen/csc321/slides/lecture_slides_lec6.pdf">RMSProp</a>, and <a class="externalLink" href="http://scholar.google.com/citations?view_op=view_citation&amp;hl=en&amp;user=DJ8Ep8YAAAAJ&amp;citation_for_view=DJ8Ep8YAAAAJ:hkOj_22Ku90C">Nesterov</a>.</p>
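+<p>To illustrate what an updater does, here is a minimal AdaGrad-style sketch (following the cited paper, not SINGA's Updater interface): each parameter accumulates its squared gradients and scales the learning rate by the inverse square root of that sum.</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint">#include &lt;cmath&gt;
+#include &lt;vector&gt;
+
+// AdaGrad-style update: per-parameter adaptive learning rates.
+// sq_sum accumulates the squared gradients across iterations.
+void AdaGradStep(std::vector&lt;float&gt;* param, std::vector&lt;float&gt;* sq_sum,
+                 const std::vector&lt;float&gt;&amp; grad, float lr,
+                 float eps = 1e-8f) {
+  for (size_t i = 0; i &lt; param-&gt;size(); ++i) {
+    (*sq_sum)[i] += grad[i] * grad[i];
+    (*param)[i] -= lr * grad[i] / (std::sqrt((*sq_sum)[i]) + eps);
+  }
+}
+</pre></div></div>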
+<p>Consequently, a user needs to do the following to submit a training job:</p>
+
+<ol style="list-style-type: decimal">
+  
+<li>
+<p><a href="data.html">Prepare the data</a> for training, validation and test.</p></li>
+  
+<li>
+<p><a href="layer.html">Implement the new Layers</a> to support specific feature transformations  required in the new model.</p></li>
+  
+<li>
+<p>Configure the training job, including the <a href="architecture.html">cluster setting</a> and the <a href="model-config.html">model configuration</a>.</p></li>
+</ol>
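+<p>As a toy sketch of step 2, the class below implements a ReLU feature transformation. The ToyLayer base class is invented here for illustration; the actual base-class interface is documented on the <a href="layer.html">layer</a> page.</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint">#include &lt;algorithm&gt;
+#include &lt;vector&gt;
+
+// Invented stand-in for the real Layer base class; it only shows the two
+// hooks a new layer provides: a forward transformation and its gradient.
+class ToyLayer {
+ public:
+  virtual ~ToyLayer() {}
+  virtual void ComputeFeature(const std::vector&lt;float&gt;&amp; in,
+                              std::vector&lt;float&gt;* out) = 0;
+  virtual void ComputeGradient(const std::vector&lt;float&gt;&amp; in,
+                               const std::vector&lt;float&gt;&amp; grad_out,
+                               std::vector&lt;float&gt;* grad_in) = 0;
+};
+
+// A ReLU layer: out = max(0, in); gradients pass through where in &gt; 0.
+class ReLULayer : public ToyLayer {
+ public:
+  void ComputeFeature(const std::vector&lt;float&gt;&amp; in,
+                      std::vector&lt;float&gt;* out) override {
+    out-&gt;resize(in.size());
+    for (size_t i = 0; i &lt; in.size(); ++i) (*out)[i] = std::max(0.0f, in[i]);
+  }
+  void ComputeGradient(const std::vector&lt;float&gt;&amp; in,
+                       const std::vector&lt;float&gt;&amp; grad_out,
+                       std::vector&lt;float&gt;* grad_in) override {
+    grad_in-&gt;resize(in.size());
+    for (size_t i = 0; i &lt; in.size(); ++i)
+      (*grad_in)[i] = in[i] &gt; 0.0f ? grad_out[i] : 0.0f;
+  }
+};
+</pre></div></div></div>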
+<div class="section">
+<h3><a name="Driver_program"></a>Driver program</h3>
+<p>Each training job has a driver program that</p>
+
+<ul>
+  
+<li>
+<p>registers the layers implemented by the user, and</p></li>
+  
+<li>
+<p>starts the <a class="externalLink" href="https://github.com/apache/incubator-singa/blob/master/include/trainer/trainer.h">Trainer</a>  by providing the job configuration.</p></li>
+</ul>
+<p>An example driver program looks like this:</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint">#include &quot;singa.h&quot;
+#include &quot;user-layer.h&quot;  // header for user defined layers
+
+DEFINE_int32(job, -1, &quot;Job ID&quot;);  // job ID generated by the SINGA script
+DEFINE_string(workspace, &quot;examples/mnist/&quot;, &quot;workspace of the training job&quot;);
+DEFINE_bool(resume, false, &quot;resume from checkpoint&quot;);
+
+int main(int argc, char** argv) {
+  google::InitGoogleLogging(argv[0]);
+  gflags::ParseCommandLineFlags(&amp;argc, &amp;argv, true);
+
+  // register all user defined layers in user-layer.h
+  Register(kFooLayer, FooLayer);
+  ...
+
+  JobProto jobConf;
+  // read job configuration from text conf file
+  ReadProtoFromTextFile(&amp;jobConf, FLAGS_workspace + &quot;/job.conf&quot;);
+  Trainer trainer;
+  trainer.Start(FLAGS_job, jobConf, FLAGS_resume);
+}
+</pre></div></div>
+<p>Users can also configure the job directly in the driver program instead of writing a configuration file:</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint">  JobProto jobConf;
+  jobConf.set_job_name(&quot;my singa job&quot;);
+  ... // configure cluster and model
+  Trainer trainer;
+  trainer.Start(FLAGS_job, jobConf, FLAGS_resume);
+</pre></div></div>
+<p>In the future, we will provide helper functions to make the configuration easier, similar to <a class="externalLink" href="https://github.com/fchollet/keras">keras</a>.</p>
+<p>Compile and link the driver program with the SINGA library to generate an executable, e.g., one named <tt>mysinga</tt>.</p>
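+<p>For example, assuming the driver program is in <tt>main.cc</tt> and the SINGA headers and library are installed under <tt>$SINGA_HOME</tt> (the file name, paths, and linker flags here are illustrative and depend on your installation), the compile command might look like:</p>
+
+<div class="source">
+<div class="source"><pre class="prettyprint"># hypothetical paths and flags; adjust to your installation
+g++ main.cc -std=c++11 -I$SINGA_HOME/include -L$SINGA_HOME/lib -lsinga -o mysinga
+</pre></div></div>
+<p>To submit the job, pass the path of the executable and the workspace to the SINGA job submission script:</p>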
+
+<div class="source">
+<div class="source"><pre class="prettyprint">./bin/singa-run.sh &lt;path to mysinga&gt; -workspace=&lt;my job workspace&gt;
+</pre></div></div></div></div>
                   </div>
             </div>
           </div>