Posted to commits@singa.apache.org by wa...@apache.org on 2016/08/16 07:30:21 UTC

svn commit: r1756485 [2/4] - in /incubator/singa/site/trunk: en/ en/_static/ en/community/ en/develop/ en/docs/ zh/ zh/_static/

Modified: incubator/singa/site/trunk/en/docs/layer.html
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/en/docs/layer.html?rev=1756485&r1=1756484&r2=1756485&view=diff
==============================================================================
--- incubator/singa/site/trunk/en/docs/layer.html (original)
+++ incubator/singa/site/trunk/en/docs/layer.html Tue Aug 16 07:30:21 2016
@@ -96,7 +96,7 @@
 <li class="toctree-l2"><a class="reference internal" href="device.html">Device</a></li>
 <li class="toctree-l2"><a class="reference internal" href="tensor.html">Tensor</a></li>
 <li class="toctree-l2 current"><a class="current reference internal" href="#">Layer</a><ul>
-<li class="toctree-l3"><a class="reference internal" href="#python-api">Python API</a></li>
+<li class="toctree-l3"><a class="reference internal" href="#module-singa.layer">Python API</a></li>
 <li class="toctree-l3"><a class="reference internal" href="#cpp-api">CPP API</a></li>
 </ul>
 </li>
@@ -104,6 +104,7 @@
 <li class="toctree-l2"><a class="reference internal" href="loss.html">Loss</a></li>
 <li class="toctree-l2"><a class="reference internal" href="metric.html">Metric</a></li>
 <li class="toctree-l2"><a class="reference internal" href="optimizer.html">Optimizer</a></li>
+<li class="toctree-l2"><a class="reference internal" href="examples/index.html">Examples</a></li>
 </ul>
 </li>
 </ul>
@@ -166,8 +167,598 @@
             
   <div class="section" id="layer">
 <h1>Layer<a class="headerlink" href="#layer" title="Permalink to this headline">¶</a></h1>
-<div class="section" id="python-api">
-<h2>Python API<a class="headerlink" href="#python-api" title="Permalink to this headline">¶</a></h2>
+<div class="section" id="module-singa.layer">
+<span id="python-api"></span><h2>Python API<a class="headerlink" href="#module-singa.layer" title="Permalink to this headline">¶</a></h2>
+<p>Python layers wrap the C++ layers to provide simpler construction APIs.</p>
+<p>Example usages:</p>
+<div class="highlight-default"><div class="highlight"><pre><span class="kn">from</span> <span class="nn">singa</span> <span class="k">import</span> <span class="n">layer</span>
+<span class="kn">from</span> <span class="nn">singa</span> <span class="k">import</span> <span class="n">tensor</span>
+<span class="kn">from</span> <span class="nn">singa</span> <span class="k">import</span> <span class="n">device</span>
+<span class="kn">from</span> <span class="nn">singa.model_pb2</span> <span class="k">import</span> <span class="n">kTrain</span>
+
+<span class="n">layer</span><span class="o">.</span><span class="n">engine</span> <span class="o">=</span> <span class="s">&#39;cudnn&#39;</span>  <span class="c"># to use cudnn layers</span>
+<span class="n">dev</span> <span class="o">=</span> <span class="n">device</span><span class="o">.</span><span class="n">create_cuda_gpu</span><span class="p">()</span>
+
+<span class="c"># create a convolution layer</span>
+<span class="n">conv</span> <span class="o">=</span> <span class="n">layer</span><span class="o">.</span><span class="n">Conv2D</span><span class="p">(</span><span class="s">&#39;conv&#39;</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="n">pad</span><span class="o">=</span><span class="mi">1</span><span class="p">,</span> <span class="n">input_sample_shape</span><span class="o">=</span><span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">32</span><span class="p">))</span>
+<span class="n">conv</span><span class="o">.</span><span class="n">to_device</span><span class="p">(</span><span class="n">dev</span><span class="p">)</span>  <span class="c"># move the layer data onto a CudaGPU device</span>
+<span class="n">x</span> <span class="o">=</span> <span class="n">tensor</span><span class="o">.</span><span class="n">Tensor</span><span class="p">((</span><span class="mi">3</span><span class="p">,</span> <span class="mi">32</span><span class="p">,</span> <span class="mi">32</span><span class="p">),</span> <span class="n">dev</span><span class="p">)</span>
+<span class="n">x</span><span class="o">.</span><span class="n">uniform</span><span class="p">(</span><span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>
+<span class="n">y</span> <span class="o">=</span> <span class="n">conv</span><span class="o">.</span><span class="n">foward</span><span class="p">(</span><span class="n">kTrain</span><span class="p">,</span> <span class="n">x</span><span class="p">)</span>
+
+<span class="n">dy</span> <span class="o">=</span> <span class="n">tensor</span><span class="o">.</span><span class="n">Tensor</span><span class="p">()</span>
+<span class="n">dy</span><span class="o">.</span><span class="n">reset_like</span><span class="p">(</span><span class="n">y</span><span class="p">)</span>
+<span class="n">dy</span><span class="o">.</span><span class="n">set_value</span><span class="p">(</span><span class="mf">0.1</span><span class="p">)</span>
+<span class="c"># dp is a list of tensors for parameter gradients</span>
+<span class="n">dx</span><span class="p">,</span> <span class="n">dp</span> <span class="o">=</span> <span class="n">conv</span><span class="o">.</span><span class="n">backward</span><span class="p">(</span><span class="n">kTrain</span><span class="p">,</span> <span class="n">dy</span><span class="p">)</span>
+</pre></div>
+</div>
+<dl class="data">
+<dt id="singa.layer.engine">
+<code class="descclassname">singa.layer.</code><code class="descname">engine</code><em class="property"> = 'cudnn'</em><a class="headerlink" href="#singa.layer.engine" title="Permalink to this definition">¶</a></dt>
+<dd><p>engine is the prefix of the layer identifier.</p>
+<p>The value could be one of [<strong>&#8216;cudnn&#8217;, &#8216;singacpp&#8217;, &#8216;singacuda&#8217;, &#8216;singacl&#8217;</strong>], for
+layers implemented using the cudnn library, C++, CUDA and OpenCL respectively.
+For example, the CudnnConvolution layer is identified by &#8216;cudnn_convolution&#8217;,
+and &#8216;singacpp_convolution&#8217; is for the Convolution layer.
+Some layers&#8217; implementations use only Tensor functions and are therefore
+transparent to the underlying devices. Such layers have
+multiple identifiers, e.g., singacpp_dropout, singacuda_dropout and
+singacl_dropout all stand for the Dropout layer. In addition, they have an extra
+identifier &#8216;singa&#8217;, i.e. &#8216;singa_dropout&#8217; also stands for the Dropout layer.</p>
+<p>engine is case insensitive. Each Python layer creates the correct specific
+layer instance according to the engine attribute.</p>
+</dd></dl>
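+
+<p>For illustration, a minimal sketch of selecting an implementation via the engine
+prefix (an assumption-laden example for a CPU-only build; the layer name and shapes
+are arbitrary):</p>
+<div class="highlight-default"><div class="highlight"><pre>from singa import layer
+
+layer.engine = 'singacpp'
+# internally, the identifier 'singacpp_convolution' is used for this layer
+conv = layer.Conv2D('conv1', 8, 3, 1, input_sample_shape=(3, 32, 32))
+</pre></div>
+</div>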
+
+<dl class="class">
+<dt id="singa.layer.Layer">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">Layer</code><span class="sig-paren">(</span><em>name</em>, <em>**kwargs</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Layer" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <code class="xref py py-class docutils literal"><span class="pre">object</span></code></p>
+<p>Base Python layer class.</p>
+<dl class="docutils">
+<dt>Typically, the life cycle of a layer instance includes:</dt>
+<dd><ol class="first last arabic simple">
+<li>construct the layer without input_sample_shapes, then go to step 2;
+or construct it with input_sample_shapes, then go to step 3;</li>
+<li>call setup to create the parameters and set up other meta fields</li>
+<li>call forward or access layer members</li>
+<li>call backward and get parameters for update</li>
+</ol>
+</dd>
+</dl>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>name</strong> (<em>str</em>) &#8211; layer name</td>
+</tr>
+</tbody>
+</table>
+<dl class="method">
+<dt id="singa.layer.Layer.param_names">
+<code class="descname">param_names</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Layer.param_names" title="Permalink to this definition">¶</a></dt>
+<dd><table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Returns:</th><td class="field-body">a list of strings, one for the name of one parameter Tensor</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.Layer.setup">
+<code class="descname">setup</code><span class="sig-paren">(</span><em>in_shapes</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Layer.setup" title="Permalink to this definition">¶</a></dt>
+<dd><p>Call the C++ setup function to create params and set some meta data.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>in_shapes</strong> &#8211; if the layer accepts a single input Tensor, in_shapes is
+a single tuple specifying the inpute Tensor shape; if the layer
+accepts multiple input Tensor (e.g., the concatenation layer),
+in_shapes is a tuple of tuples, each for one input Tensor</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.Layer.get_output_sample_shape">
+<code class="descname">get_output_sample_shape</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Layer.get_output_sample_shape" title="Permalink to this definition">¶</a></dt>
+<dd><p>Called after setup to get the shape of the output sample(s).</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Returns:</th><td class="field-body">a tuple for a single output Tensor or a list of tuples if this layer
+has multiple outputs</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.Layer.param_values">
+<code class="descname">param_values</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Layer.param_values" title="Permalink to this definition">¶</a></dt>
+<dd><p>Return param value tensors.</p>
+<p>Parameter tensors are not stored as layer members because the C++ Tensors
+could be moved onto different devices when the layer device changes,
+which would result in inconsistency.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Returns:</th><td class="field-body">a list of tensors, one for each paramter</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.Layer.forward">
+<code class="descname">forward</code><span class="sig-paren">(</span><em>flag</em>, <em>x</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Layer.forward" title="Permalink to this definition">¶</a></dt>
+<dd><p>Forward propagate through this layer.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>flag</strong> (<em>int</em>) &#8211; kTrain or kEval</li>
+<li><strong>x</strong> (<em>Tensor or list&lt;Tensor&gt;</em>) &#8211; an input tensor if the layer is
+connected from a single layer; a list of tensors if the layer
+is connected from multiple layers.</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last">a tensor if the layer is connected to a single layer; a list of
+tensors if the layer is connected to multiple layers;</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.Layer.backward">
+<code class="descname">backward</code><span class="sig-paren">(</span><em>flag</em>, <em>dy</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Layer.backward" title="Permalink to this definition">¶</a></dt>
+<dd><p>Backward propagate gradients through this layer.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>flag</strong> (<em>int</em>) &#8211; for future use.</li>
+<li><strong>dy</strong> (<em>Tensor or list&lt;Tensor&gt;</em>) &#8211; the gradient tensor(s) of y w.r.t. the
+objective loss</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last">&lt;dx, &lt;dp1, dp2..&gt;&gt;, dx is a (set of) tensor(s) for the gradient of x
+, dpi is the gradient of the i-th parameter</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.Layer.to_device">
+<code class="descname">to_device</code><span class="sig-paren">(</span><em>device</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Layer.to_device" title="Permalink to this definition">¶</a></dt>
+<dd><p>Move layer state tensors onto the given device.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>device</strong> &#8211; swig converted device, created using singa.device</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.Layer.as_type">
+<code class="descname">as_type</code><span class="sig-paren">(</span><em>dtype</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Layer.as_type" title="Permalink to this definition">¶</a></dt>
+<dd></dd></dl>
+
+</dd></dl>
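+
+<p>A minimal sketch of this life cycle (assuming a CPU build with the
+&#8216;singacpp&#8217; engine; the layer name and shapes are arbitrary):</p>
+<div class="highlight-default"><div class="highlight"><pre>from singa import layer
+from singa import tensor
+from singa.model_pb2 import kTrain
+
+layer.engine = 'singacpp'
+# 1. construct the layer without input_sample_shapes
+fc = layer.Dense('fc', 10)
+# 2. call setup to create the parameters and meta fields
+fc.setup((100,))
+# 3. call forward
+x = tensor.Tensor((4, 100))
+x.uniform(-1, 1)
+y = fc.forward(kTrain, x)
+# 4. call backward and get parameter gradients for update
+dy = tensor.Tensor()
+dy.reset_like(y)
+dy.set_value(0.1)
+dx, dp = fc.backward(kTrain, dy)
+</pre></div>
+</div>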
+
+<dl class="class">
+<dt id="singa.layer.Conv2D">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">Conv2D</code><span class="sig-paren">(</span><em>name</em>, <em>nb_kernels</em>, <em>kernel=3</em>, <em>stride=1</em>, <em>border_mode='same'</em>, <em>cudnn_prefer='fatest'</em>, <em>data_format='NCHW'</em>, <em>use_bias=True</em>, <em>W_specs=None</em>, <em>b_specs=None</em>, <em>pad=None</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Conv2D" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Layer" title="singa.layer.Layer"><code class="xref py py-class docutils literal"><span class="pre">singa.layer.Layer</span></code></a></p>
+<p>Construct a layer for 2D convolution.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>nb_kernels</strong> (<em>int</em>) &#8211; number of kernels, i.e., the number of channels of the output Tensor</li>
+<li><strong>kernel</strong> &#8211; an integer or a pair of integers for kernel height and width</li>
+<li><strong>stride</strong> &#8211; an integer or a pair of integers for stride height and width</li>
+<li><strong>border_mode</strong> (<em>string</em>) &#8211; padding mode, case insensitive,
+&#8216;valid&#8217; -&gt; padding is 0 for height and width;
+&#8216;same&#8217; -&gt; padding is half of the kernel (floor); the kernel size must be
+an odd number.</li>
+<li><strong>cudnn_prefer</strong> (<em>string</em>) &#8211; the preferred algorithm for cudnn convolution
+which could be &#8216;fatest&#8217;, &#8216;autotune&#8217;, &#8216;limited_workspace&#8217; and
+&#8216;no_workspace&#8217;</li>
+<li><strong>data_format</strong> (<em>string</em>) &#8211; either &#8216;NCHW&#8217; or &#8216;NHWC&#8217;</li>
+<li><strong>use_bias</strong> (<em>bool</em>) &#8211; True or False</li>
+<li><strong>pad</strong> &#8211; an integer or a pair of integers for padding height and width</li>
+<li><strong>W_specs</strong> (<em>dict</em>) &#8211; used to specify the weight matrix specs, fields
+include,
+&#8216;name&#8217; for parameter name
+&#8216;lr_mult&#8217; for learning rate multiplier
+&#8216;decay_mult&#8217; for weight decay multiplier
+&#8216;init&#8217; for init method, which could be &#8216;gaussian&#8217;, &#8216;uniform&#8217;,
+&#8216;xavier&#8217; and &#8216;&#8217;
+&#8216;std&#8217;, &#8216;mean&#8217;, &#8216;high&#8217;, &#8216;low&#8217; for corresponding init methods
+TODO(wangwei) &#8216;clamp&#8217; for gradient constraint, value is scalar
+&#8216;regularizer&#8217; for regularization, currently support &#8216;l2&#8217;</li>
+<li><strong>b_specs</strong> (<em>dict</em>) &#8211; hyper-parameters for bias vector, similar as W_specs</li>
+<li><strong>name</strong> (<em>string</em>) &#8211; layer name.</li>
+<li><strong>input_sample_shape</strong> &#8211; 3d tuple for the shape of the input Tensor
+without the batchsize, e.g., (channel, height, width) or
+(height, width, channel)</li>
+</ul>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.layer.Conv1D">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">Conv1D</code><span class="sig-paren">(</span><em>name</em>, <em>nb_kernels</em>, <em>kernel=3</em>, <em>stride=1</em>, <em>border_mode='same'</em>, <em>cudnn_prefer='fatest'</em>, <em>use_bias=True</em>, <em>W_specs={'init': 'Xavier'}</em>, <em>b_specs={'init': 'Constant'</em>, <em>'value': 0}</em>, <em>pad=None</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Conv1D" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Conv2D" title="singa.layer.Conv2D"><code class="xref py py-class docutils literal"><span class="pre">singa.layer.Conv2D</span></code></a></p>
+<p>Construct a layer for 1D convolution.</p>
+<p>Most of the args are the same as those for Conv2D except that kernel,
+stride and pad are scalars instead of tuples.
+input_sample_shape is a tuple with a single value for the input feature
+length.</p>
+<dl class="method">
+<dt id="singa.layer.Conv1D.get_output_sample_shape">
+<code class="descname">get_output_sample_shape</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Conv1D.get_output_sample_shape" title="Permalink to this definition">¶</a></dt>
+<dd></dd></dl>
+
+</dd></dl>
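+
+<p>A sketch of constructing a 1D convolution (assuming the &#8216;singacpp&#8217; engine;
+the kernel count and feature length are arbitrary):</p>
+<div class="highlight-default"><div class="highlight"><pre>from singa import layer
+
+layer.engine = 'singacpp'
+# 16 kernels of width 3, stride 1, over an input feature of length 100
+conv1 = layer.Conv1D('conv1d', 16, 3, 1, input_sample_shape=(100,))
+print(conv1.get_output_sample_shape())
+</pre></div>
+</div>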
+
+<dl class="class">
+<dt id="singa.layer.Pooling2D">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">Pooling2D</code><span class="sig-paren">(</span><em>name</em>, <em>mode</em>, <em>kernel=3</em>, <em>stride=2</em>, <em>border_mode='same'</em>, <em>pad=None</em>, <em>data_format='NCHW'</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Pooling2D" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Layer" title="singa.layer.Layer"><code class="xref py py-class docutils literal"><span class="pre">singa.layer.Layer</span></code></a></p>
+<p>2D pooling layer providing max/avg pooling.</p>
+<p>All args are the same as those for Conv2D, except the following one</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><strong>mode</strong> &#8211; pooling type, model_pb2.PoolingConf.MAX or
+model_pb2.PoolingConf.AVE</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.layer.MaxPooling2D">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">MaxPooling2D</code><span class="sig-paren">(</span><em>name</em>, <em>kernel=3</em>, <em>stride=2</em>, <em>border_mode='same'</em>, <em>pad=None</em>, <em>data_format='NCHW'</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.MaxPooling2D" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Pooling2D" title="singa.layer.Pooling2D"><code class="xref py py-class docutils literal"><span class="pre">singa.layer.Pooling2D</span></code></a></p>
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.layer.AvgPooling2D">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">AvgPooling2D</code><span class="sig-paren">(</span><em>name</em>, <em>kernel=3</em>, <em>stride=2</em>, <em>border_mode='same'</em>, <em>pad=None</em>, <em>data_format='NCHW'</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.AvgPooling2D" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Pooling2D" title="singa.layer.Pooling2D"><code class="xref py py-class docutils literal"><span class="pre">singa.layer.Pooling2D</span></code></a></p>
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.layer.MaxPooling1D">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">MaxPooling1D</code><span class="sig-paren">(</span><em>name</em>, <em>kernel=3</em>, <em>stride=2</em>, <em>border_mode='same'</em>, <em>pad=None</em>, <em>data_format='NCHW'</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.MaxPooling1D" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.MaxPooling2D" title="singa.layer.MaxPooling2D"><code class="xref py py-class docutils literal"><span class="pre">singa.layer.MaxPooling2D</span></code></a></p>
+<dl class="method">
+<dt id="singa.layer.MaxPooling1D.get_output_sample_shape">
+<code class="descname">get_output_sample_shape</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.MaxPooling1D.get_output_sample_shape" title="Permalink to this definition">¶</a></dt>
+<dd></dd></dl>
+
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.layer.AvgPooling1D">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">AvgPooling1D</code><span class="sig-paren">(</span><em>name</em>, <em>kernel=3</em>, <em>stride=2</em>, <em>border_mode='same'</em>, <em>pad=None</em>, <em>data_format='NCHW'</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.AvgPooling1D" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.AvgPooling2D" title="singa.layer.AvgPooling2D"><code class="xref py py-class docutils literal"><span class="pre">singa.layer.AvgPooling2D</span></code></a></p>
+<dl class="method">
+<dt id="singa.layer.AvgPooling1D.get_output_sample_shape">
+<code class="descname">get_output_sample_shape</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.AvgPooling1D.get_output_sample_shape" title="Permalink to this definition">¶</a></dt>
+<dd></dd></dl>
+
+</dd></dl>
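+
+<p>For example, a max-pooling layer could be set up as follows (a sketch, assuming
+the &#8216;singacpp&#8217; engine; the expected output shape follows the &#8216;same&#8217; border mode):</p>
+<div class="highlight-default"><div class="highlight"><pre>from singa import layer
+
+layer.engine = 'singacpp'
+# 3x3 max pooling with stride 2 over a (32, 16, 16) feature map
+pool = layer.MaxPooling2D('pool1', 3, 2, input_sample_shape=(32, 16, 16))
+print(pool.get_output_sample_shape())  # expected (32, 8, 8)
+</pre></div>
+</div>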
+
+<dl class="class">
+<dt id="singa.layer.BatchNormalization">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">BatchNormalization</code><span class="sig-paren">(</span><em>name</em>, <em>momentum=0.9</em>, <em>beta_specs=None</em>, <em>gamma_specs=None</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.BatchNormalization" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Layer" title="singa.layer.Layer"><code class="xref py py-class docutils literal"><span class="pre">singa.layer.Layer</span></code></a></p>
+<p>Batch-normalization.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>momentum</strong> (<em>float</em>) &#8211; momentum for the running average of mean and variance.</li>
+<li><strong>beta_specs</strong> (<em>dict</em>) &#8211; dictionary includes the fields for the beta
+param:
+&#8216;name&#8217; for parameter name
+&#8216;lr_mult&#8217; for learning rate multiplier
+&#8216;decay_mult&#8217; for weight decay multiplier
+&#8216;init&#8217; for init method, which could be &#8216;gaussian&#8217;, &#8216;uniform&#8217;,
+&#8216;xavier&#8217; and &#8216;&#8217;
+&#8216;std&#8217;, &#8216;mean&#8217;, &#8216;high&#8217;, &#8216;low&#8217; for corresponding init methods
+&#8216;clamp&#8217; for gradient constraint, value is scalar
+&#8216;regularizer&#8217; for regularization, currently support &#8216;l2&#8217;</li>
+<li><strong>gamma_specs</strong> (<em>dict</em>) &#8211; similar to beta_specs, but for the gamma param.</li>
+<li><strong>name</strong> (<em>string</em>) &#8211; layer name</li>
+<li><strong>input_sample_shape</strong> (<em>tuple</em>) &#8211; with at least one integer</li>
+</ul>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.layer.LRN">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">LRN</code><span class="sig-paren">(</span><em>name</em>, <em>size=5</em>, <em>alpha=1</em>, <em>beta=0.75</em>, <em>mode='cross_channel'</em>, <em>k=1</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.LRN" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Layer" title="singa.layer.Layer"><code class="xref py py-class docutils literal"><span class="pre">singa.layer.Layer</span></code></a></p>
+<p>Local response normalization.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>size</strong> (<em>int</em>) &#8211; number of channels to be crossed for
+normalization.</li>
+<li><strong>mode</strong> (<em>string</em>) &#8211; &#8216;cross_channel&#8217;</li>
+<li><strong>input_sample_shape</strong> (<em>tuple</em>) &#8211; 3d tuple, (channel, height, width)</li>
+</ul>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.layer.Dense">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">Dense</code><span class="sig-paren">(</span><em>name</em>, <em>num_output</em>, <em>use_bias=True</em>, <em>W_specs=None</em>, <em>b_specs=None</em>, <em>W_transpose=False</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Dense" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Layer" title="singa.layer.Layer"><code class="xref py py-class docutils literal"><span class="pre">singa.layer.Layer</span></code></a></p>
+<p>Apply linear/affine transformation, also called inner-product or
+fully connected layer.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>num_output</strong> (<em>int</em>) &#8211; output feature length.</li>
+<li><strong>use_bias</strong> (<em>bool</em>) &#8211; add a bias vector or not to the transformed feature</li>
+<li><strong>W_specs</strong> (<em>dict</em>) &#8211; specs for the weight matrix
+&#8216;name&#8217; for parameter name
+&#8216;lr_mult&#8217; for learning rate multiplier
+&#8216;decay_mult&#8217; for weight decay multiplier
+&#8216;init&#8217; for init method, which could be &#8216;gaussian&#8217;, &#8216;uniform&#8217;,
+&#8216;xavier&#8217; and &#8216;&#8217;
+&#8216;std&#8217;, &#8216;mean&#8217;, &#8216;high&#8217;, &#8216;low&#8217; for corresponding init methods
+&#8216;clamp&#8217; for gradient constraint, value is scalar
+&#8216;regularizer&#8217; for regularization, currently support &#8216;l2&#8217;</li>
+<li><strong>b_specs</strong> (<em>dict</em>) &#8211; specs for the bias vector, same fields as W_specs.</li>
+<li><strong>W_transpose</strong> (<em>bool</em>) &#8211; if True, output = x*W.T + b</li>
+<li><strong>input_sample_shape</strong> (<em>tuple</em>) &#8211; input feature length</li>
+</ul>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
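+
+<p>As an illustration of the specs fields listed above, a sketch of a Dense layer with
+Gaussian-initialized, L2-regularized weights (the field values are assumptions
+following the description, not a prescription):</p>
+<div class="highlight-default"><div class="highlight"><pre>from singa import layer
+
+layer.engine = 'singacpp'
+W_specs = {'init': 'gaussian', 'mean': 0, 'std': 0.02, 'regularizer': 'l2'}
+fc = layer.Dense('fc1', 10, W_specs=W_specs, input_sample_shape=(100,))
+print(fc.param_names())  # names of the weight matrix and bias vector
+</pre></div>
+</div>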
+
+<dl class="class">
+<dt id="singa.layer.Dropout">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">Dropout</code><span class="sig-paren">(</span><em>name</em>, <em>p=0.5</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Dropout" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Layer" title="singa.layer.Layer"><code class="xref py py-class docutils literal"><span class="pre">singa.layer.Layer</span></code></a></p>
+<p>Dropout layer.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>p</strong> (<em>float</em>) &#8211; probability of dropping out an element, i.e., setting it to 0</li>
+<li><strong>name</strong> (<em>string</em>) &#8211; layer name</li>
+</ul>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.layer.Activation">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">Activation</code><span class="sig-paren">(</span><em>name</em>, <em>mode='relu'</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Activation" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Layer" title="singa.layer.Layer"><code class="xref py py-class docutils literal"><span class="pre">singa.layer.Layer</span></code></a></p>
+<p>Activation layers.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>name</strong> (<em>string</em>) &#8211; layer name</li>
+<li><strong>mode</strong> (<em>string</em>) &#8211; &#8216;relu&#8217;, &#8216;sigmoid&#8217;, or &#8216;tanh&#8217;</li>
+<li><strong>input_sample_shape</strong> (<em>tuple</em>) &#8211; shape of a single sample</li>
+</ul>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.layer.Softmax">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">Softmax</code><span class="sig-paren">(</span><em>name</em>, <em>axis=1</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Softmax" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Layer" title="singa.layer.Layer"><code class="xref py py-class docutils literal"><span class="pre">singa.layer.Layer</span></code></a></p>
+<p>Apply softmax.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>axis</strong> (<em>int</em>) &#8211; reshape the input as a matrix with dimensions
+[0, axis) as the rows and [axis, -1) as the columns.</li>
+<li><strong>input_sample_shape</strong> (<em>tuple</em>) &#8211; shape of a single sample</li>
+</ul>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.layer.Flatten">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">Flatten</code><span class="sig-paren">(</span><em>name</em>, <em>axis=1</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.Flatten" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Layer" title="singa.layer.Layer"><code class="xref py py-class docutils literal"><span class="pre">singa.layer.Layer</span></code></a></p>
+<p>Reshape the input tensor into a matrix.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>axis</strong> (<em>int</em>) &#8211; reshape the input as a matrix with dimensions
+[0, axis) as the rows and [axis, -1) as the columns.</li>
+<li><strong>input_sample_shape</strong> (<em>tuple</em>) &#8211; shape for a single sample</li>
+</ul>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
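+
+<p>For example, with axis=1 a (channel, height, width) sample is flattened into a
+single feature vector (a sketch, assuming the &#8216;singacpp&#8217; engine):</p>
+<div class="highlight-default"><div class="highlight"><pre>from singa import layer
+
+layer.engine = 'singacpp'
+flat = layer.Flatten('flat', axis=1, input_sample_shape=(3, 32, 32))
+print(flat.get_output_sample_shape())  # expected (3072,)
+</pre></div>
+</div>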
+
+<dl class="class">
+<dt id="singa.layer.RNN">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">RNN</code><span class="sig-paren">(</span><em>name</em>, <em>hidden_size</em>, <em>rnn_mode='lstm'</em>, <em>dropout=0.0</em>, <em>num_stacks=1</em>, <em>input_mode='linear'</em>, <em>bidirectional=False</em>, <em>param_specs=None</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.RNN" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.Layer" title="singa.layer.Layer"><code class="xref py py-class docutils literal"><span class="pre">singa.layer.Layer</span></code></a></p>
+<p>Recurrent layer with 4 types of units, namely lstm, gru, tanh and relu.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first last simple">
+<li><strong>hidden_size</strong> &#8211; hidden feature size, the same for all stacks of layers.</li>
+<li><strong>rnn_mode</strong> &#8211; decides the rnn unit, which could be one of &#8216;lstm&#8217;, &#8216;gru&#8217;,
+&#8216;tanh&#8217; and &#8216;relu&#8217;, refer to cudnn manual for each mode.</li>
+<li><strong>num_stacks</strong> &#8211; number of stacked rnn layers. It is different from the
+unrolled sequence length.</li>
+<li><strong>input_mode</strong> &#8211; &#8216;linear&#8217; converts the input feature x by a linear
+transformation into a feature vector of size hidden_size;
+&#8216;skip&#8217; does nothing but requires the input feature size to equal
+hidden_size</li>
+<li><strong>bidirectional</strong> &#8211; True for a bidirectional RNN</li>
+<li><strong>param_specs</strong> &#8211; config for initializing the RNN parameters.</li>
+<li><strong>input_sample_shape</strong> &#8211; includes a single integer for the input sample
+feature size.</li>
+</ul>
+</td>
+</tr>
+</tbody>
+</table>
+<dl class="method">
+<dt id="singa.layer.RNN.forward">
+<code class="descname">forward</code><span class="sig-paren">(</span><em>flag</em>, <em>inputs</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.RNN.forward" title="Permalink to this definition">¶</a></dt>
+<dd><p>Forward inputs through the RNN.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>flag</strong> &#8211; kTrain or kEval.</li>
+<li><strong>inputs</strong> &#8211; &lt;x1, x2,...xn, hx, cx&gt;, where xi is the input tensor for the
+i-th position, its shape is (batch_size, input_feature_length);
+the batch_size of xi must be &gt;= that of xi+1; hx is the initial
+hidden state of shape (num_stacks * (2 if bidirectional else 1), batch_size,
+hidden_size). cx is the initial cell state tensor of the same
+shape as hx. cx is valid only for lstm. For other RNNs there is
+no cx. Both hx and cx could be dummy tensors without shape and
+data.</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last"><dl class="docutils">
+<dt>&lt;y1, y2, ... yn, hy, cy&gt;, where yi is the output tensor for the i-th</dt>
+<dd><p class="first last">position, its shape is (batch_size,
+hidden_size * (2 if bidirectional else 1)). hy is the final hidden state
+tensor. cy is the final cell state tensor. cy is only used for
+lstm.</p>
+</dd>
+</dl>
+</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.layer.RNN.backward">
+<code class="descname">backward</code><span class="sig-paren">(</span><em>flag</em>, <em>grad</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.RNN.backward" title="Permalink to this definition">¶</a></dt>
+<dd><p>Backward gradients through the RNN.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>flag</strong> &#8211; for future use.</li>
+<li><strong>grad</strong> &#8211; &lt;dy1, dy2,...dyn, dhy, dcy&gt;, where dyi is the gradient for the
+i-th output, its shape is (batch_size, hidden_size * (2 if bidirectional else 1));
+dhy is the gradient for the final hidden state, its shape is
+(num_stacks * (2 if bidirectional else 1), batch_size,
+hidden_size). dcy is the gradient for the final cell state.
+dcy is valid only for lstm. For other RNNs there is
+no dcy. Both dhy and dcy could be dummy tensors without shape and
+data.</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last"><dl class="docutils">
+<dt>&lt;dx1, dx2, ... dxn, dhx, dcx&gt;, where dxi is the gradient tensor for</dt>
+<dd><p class="first last">the i-th input, its shape is (batch_size,
+input_feature_length). dhx is the gradient for the initial
+hidden state. dcx is the gradient for the initial cell state,
+which is valid only for lstm.</p>
+</dd>
+</dl>
+</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+</dd></dl>
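+
+<p>A sketch of forwarding a short sequence through an lstm RNN, following the
+input convention above (assuming a CUDA-enabled build, since the recurrent
+layers use the cudnn engine; sizes are arbitrary):</p>
+<div class="highlight-default"><div class="highlight"><pre>from singa import layer
+from singa import tensor
+from singa import device
+from singa.model_pb2 import kTrain
+
+layer.engine = 'cudnn'
+dev = device.create_cuda_gpu()
+rnn = layer.RNN('rnn', hidden_size=32, rnn_mode='lstm', input_sample_shape=(20,))
+rnn.to_device(dev)
+
+seq_len, batch_size = 5, 4
+# one input tensor per position, plus dummy hx and cx tensors
+inputs = [tensor.Tensor((batch_size, 20), dev) for _ in range(seq_len)]
+for x in inputs:
+    x.uniform(-1, 1)
+inputs += [tensor.Tensor(), tensor.Tensor()]  # dummy hx and cx
+ret = rnn.forward(kTrain, inputs)  # [y1, ..., yn, hy, cy]
+</pre></div>
+</div>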
+
+<dl class="class">
+<dt id="singa.layer.LSTM">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">LSTM</code><span class="sig-paren">(</span><em>name</em>, <em>hidden_size</em>, <em>dropout=0.0</em>, <em>num_stacks=1</em>, <em>input_mode='linear'</em>, <em>bidirectional=False</em>, <em>param_specs=None</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.LSTM" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.RNN" title="singa.layer.RNN"><code class="xref py py-class docutils literal"><span class="pre">singa.layer.RNN</span></code></a></p>
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.layer.GRU">
+<em class="property">class </em><code class="descclassname">singa.layer.</code><code class="descname">GRU</code><span class="sig-paren">(</span><em>name</em>, <em>hidden_size</em>, <em>dropout=0.0</em>, <em>num_stacks=1</em>, <em>input_mode='linear'</em>, <em>bidirectional=False</em>, <em>param_specs=None</em>, <em>input_sample_shape=None</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.GRU" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.layer.RNN" title="singa.layer.RNN"><code class="xref py py-class docutils literal"><span class="pre">singa.layer.RNN</span></code></a></p>
+</dd></dl>
+
+<dl class="function">
+<dt id="singa.layer.get_layer_list">
+<code class="descclassname">singa.layer.</code><code class="descname">get_layer_list</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#singa.layer.get_layer_list" title="Permalink to this definition">¶</a></dt>
+<dd><p>Return a list of strings, which are the identifiers (tags) of all
+supported layers.</p>
+</dd></dl>
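+
+<p>For instance, to inspect which layer identifiers are available (the output
+depends on how SINGA was built):</p>
+<div class="highlight-default"><div class="highlight"><pre>from singa import layer
+
+for tag in layer.get_layer_list():
+    print(tag)  # e.g., 'cudnn_convolution', 'singacpp_dropout', ...
+</pre></div>
+</div>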
+
 </div>
 <div class="section" id="cpp-api">
 <h2>CPP API<a class="headerlink" href="#cpp-api" title="Permalink to this headline">¶</a></h2>

Modified: incubator/singa/site/trunk/en/docs/loss.html
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/en/docs/loss.html?rev=1756485&r1=1756484&r2=1756485&view=diff
==============================================================================
--- incubator/singa/site/trunk/en/docs/loss.html (original)
+++ incubator/singa/site/trunk/en/docs/loss.html Tue Aug 16 07:30:21 2016
@@ -100,6 +100,7 @@
 <li class="toctree-l2 current"><a class="current reference internal" href="#">Loss</a></li>
 <li class="toctree-l2"><a class="reference internal" href="metric.html">Metric</a></li>
 <li class="toctree-l2"><a class="reference internal" href="optimizer.html">Optimizer</a></li>
+<li class="toctree-l2"><a class="reference internal" href="examples/index.html">Examples</a></li>
 </ul>
 </li>
 </ul>
@@ -160,8 +161,162 @@
           <div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
            <div itemprop="articleBody">
             
-  <div class="section" id="loss">
-<h1>Loss<a class="headerlink" href="#loss" title="Permalink to this headline">¶</a></h1>
+  <div class="section" id="module-singa.loss">
+<span id="loss"></span><h1>Loss<a class="headerlink" href="#module-singa.loss" title="Permalink to this headline">¶</a></h1>
+<p>The loss module includes a set of training loss implementations. Some are
+converted from the C++ implementation, and the rest are implemented directly
+using Python Tensor operations.</p>
+<p>Example usage:</p>
+<div class="highlight-default"><div class="highlight"><pre><span class="kn">from</span> <span class="nn">singa</span> <span class="k">import</span> <span class="n">tensor</span>
+<span class="kn">from</span> <span class="nn">singa</span> <span class="k">import</span> <span class="n">loss</span>
+<span class="kn">from</span> <span class="nn">singa.proto</span> <span class="k">import</span> <span class="n">model_pb2</span>
+
+<span class="n">x</span> <span class="o">=</span> <span class="n">tensor</span><span class="o">.</span><span class="n">Tensor</span><span class="p">((</span><span class="mi">3</span><span class="p">,</span> <span class="mi">5</span><span class="p">))</span>
+<span class="n">x</span><span class="o">.</span><span class="n">uniform</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>  <span class="c"># randomly genearte the prediction activation</span>
+<span class="n">y</span> <span class="o">=</span> <span class="n">tensor</span><span class="o">.</span><span class="n">from_numpy</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="n">int</span><span class="p">))</span>  <span class="c"># set the truth</span>
+
+<span class="n">f</span> <span class="o">=</span> <span class="n">loss</span><span class="o">.</span><span class="n">SoftmaxCrossEntropy</span><span class="p">()</span>
+<span class="n">l</span> <span class="o">=</span> <span class="n">f</span><span class="o">.</span><span class="n">forward</span><span class="p">(</span><span class="n">model_pb2</span><span class="o">.</span><span class="n">kTrain</span><span class="p">,</span> <span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span>  <span class="c"># l is tensor with 3 loss values</span>
+<span class="n">g</span> <span class="o">=</span> <span class="n">f</span><span class="o">.</span><span class="n">backward</span><span class="p">()</span>  <span class="c"># g is a tensor containing all gradients of x w.r.t l</span>
+</pre></div>
+</div>
+<dl class="class">
+<dt id="singa.loss.Loss">
+<em class="property">class </em><code class="descclassname">singa.loss.</code><code class="descname">Loss</code><a class="headerlink" href="#singa.loss.Loss" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <code class="xref py py-class docutils literal"><span class="pre">object</span></code></p>
+<p>Base loss class.</p>
+<p>Subclasses that wrap the C++ loss classes can use the inherited forward,
+backward, and evaluate functions of this base class. Other subclasses need
+to override these functions.</p>
+<dl class="method">
+<dt id="singa.loss.Loss.backward">
+<code class="descname">backward</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#singa.loss.Loss.backward" title="Permalink to this definition">¶</a></dt>
+<dd><table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Returns:</th><td class="field-body">the grad of x w.r.t. the loss</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.loss.Loss.evaluate">
+<code class="descname">evaluate</code><span class="sig-paren">(</span><em>flag</em>, <em>x</em>, <em>y</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.loss.Loss.evaluate" title="Permalink to this definition">¶</a></dt>
+<dd><table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>flag</strong> (<em>int</em>) &#8211; must be kEval, to be removed</li>
+<li><strong>x</strong> (<a class="reference internal" href="tensor.html#singa.tensor.Tensor" title="singa.tensor.Tensor"><em>Tensor</em></a>) &#8211; the prediction Tensor</li>
+<li><strong>y</strong> (<a class="reference internal" href="tensor.html#singa.tensor.Tensor" title="singa.tensor.Tensor"><em>Tensor</em></a>) &#8211; the ground truth Tensor</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last">the averaged loss for all samples in x.</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.loss.Loss.forward">
+<code class="descname">forward</code><span class="sig-paren">(</span><em>flag</em>, <em>x</em>, <em>y</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.loss.Loss.forward" title="Permalink to this definition">¶</a></dt>
+<dd><p>Compute the loss values.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>flag</strong> (<em>int</em>) &#8211; kTrain or kEval. If it is kTrain, then the backward
+function must be called before calling forward again.</li>
+<li><strong>x</strong> (<a class="reference internal" href="tensor.html#singa.tensor.Tensor" title="singa.tensor.Tensor"><em>Tensor</em></a>) &#8211; the prediction Tensor</li>
+<li><strong>y</strong> (<a class="reference internal" href="tensor.html#singa.tensor.Tensor" title="singa.tensor.Tensor"><em>Tensor</em></a>) &#8211; the ground truth Tensor; x.shape[0] must equal y.shape[0]</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last">a tensor of floats for the loss values, one per sample</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.loss.SoftmaxCrossEntropy">
+<em class="property">class </em><code class="descclassname">singa.loss.</code><code class="descname">SoftmaxCrossEntropy</code><a class="headerlink" href="#singa.loss.SoftmaxCrossEntropy" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.loss.Loss" title="singa.loss.Loss"><code class="xref py py-class docutils literal"><span class="pre">singa.loss.Loss</span></code></a></p>
+<p>This loss function is a combination of SoftMax and Cross-Entropy loss.</p>
+<p>It converts the inputs via SoftMax function and then
+computes the cross-entropy loss against the ground truth values.</p>
+</dd></dl>
+
+<dl class="class">
+<dt id="singa.loss.SquaredError">
+<em class="property">class </em><code class="descclassname">singa.loss.</code><code class="descname">SquaredError</code><a class="headerlink" href="#singa.loss.SquaredError" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.loss.Loss" title="singa.loss.Loss"><code class="xref py py-class docutils literal"><span class="pre">singa.loss.Loss</span></code></a></p>
+<p>This loss evaluates the squared error between the prediction and the
+truth values.</p>
+<p>It is implemented using Python Tensor operations.</p>
+<dl class="method">
+<dt id="singa.loss.SquaredError.backward">
+<code class="descname">backward</code><span class="sig-paren">(</span><span class="sig-paren">)</span><a class="headerlink" href="#singa.loss.SquaredError.backward" title="Permalink to this definition">¶</a></dt>
+<dd><p>Compute the gradient of x w.r.t the error.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Returns:</th><td class="field-body">x - y</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.loss.SquaredError.evaluate">
+<code class="descname">evaluate</code><span class="sig-paren">(</span><em>flag</em>, <em>x</em>, <em>y</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.loss.SquaredError.evaluate" title="Permalink to this definition">¶</a></dt>
+<dd><p>Compute the averaged error.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Returns:</th><td class="field-body">a float value as the averaged error</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.loss.SquaredError.forward">
+<code class="descname">forward</code><span class="sig-paren">(</span><em>flag</em>, <em>x</em>, <em>y</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.loss.SquaredError.forward" title="Permalink to this definition">¶</a></dt>
+<dd><p>Compute the error as 0.5 * ||x-y||^2.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>flag</strong> (<em>int</em>) &#8211; kTrain or kEval; if kTrain, then the backward must be
+called before calling forward again.</li>
+<li><strong>x</strong> (<a class="reference internal" href="tensor.html#singa.tensor.Tensor" title="singa.tensor.Tensor"><em>Tensor</em></a>) &#8211; the prediction Tensor</li>
+<li><strong>y</strong> (<a class="reference internal" href="tensor.html#singa.tensor.Tensor" title="singa.tensor.Tensor"><em>Tensor</em></a>) &#8211; the ground truth Tensor, of the same
+shape as x</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last">a Tensor with one error value per sample</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+</dd></dl>
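+
+<p>A sketch of using SquaredError (per the forward and backward descriptions above,
+y is assumed to have the same shape as x; shapes are arbitrary):</p>
+<div class="highlight-default"><div class="highlight"><pre>from singa import tensor
+from singa import loss
+from singa.proto import model_pb2
+
+x = tensor.Tensor((3, 5))
+x.uniform(0, 1)  # predictions
+y = tensor.Tensor((3, 5))
+y.uniform(0, 1)  # ground truth values of the same shape
+
+f = loss.SquaredError()
+l = f.forward(model_pb2.kTrain, x, y)  # one error value per sample
+g = f.backward()                       # x - y
+</pre></div>
+</div>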
+
 </div>
 
 

Modified: incubator/singa/site/trunk/en/docs/metric.html
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/en/docs/metric.html?rev=1756485&r1=1756484&r2=1756485&view=diff
==============================================================================
--- incubator/singa/site/trunk/en/docs/metric.html (original)
+++ incubator/singa/site/trunk/en/docs/metric.html Tue Aug 16 07:30:21 2016
@@ -100,6 +100,7 @@
 <li class="toctree-l2"><a class="reference internal" href="loss.html">Loss</a></li>
 <li class="toctree-l2 current"><a class="current reference internal" href="#">Metric</a></li>
 <li class="toctree-l2"><a class="reference internal" href="optimizer.html">Optimizer</a></li>
+<li class="toctree-l2"><a class="reference internal" href="examples/index.html">Examples</a></li>
 </ul>
 </li>
 </ul>
@@ -160,8 +161,85 @@
           <div role="main" class="document" itemscope="itemscope" itemtype="http://schema.org/Article">
            <div itemprop="articleBody">
             
-  <div class="section" id="metric">
-<h1>Metric<a class="headerlink" href="#metric" title="Permalink to this headline">¶</a></h1>
+  <div class="section" id="module-singa.metric">
+<span id="metric"></span><h1>Metric<a class="headerlink" href="#module-singa.metric" title="Permalink to this headline">¶</a></h1>
+<p>This module includes a set of metric classes for evaluating the model&#8217;s
+performance. The specific metric classes could be converted from the C++
+implementation or implemented directly using Python.</p>
+<p>Example usage:</p>
+<div class="highlight-default"><div class="highlight"><pre><span class="kn">from</span> <span class="nn">singa</span> <span class="k">import</span> <span class="n">tensor</span>
+<span class="kn">from</span> <span class="nn">singa</span> <span class="k">import</span> <span class="n">metric</span>
+
+<span class="n">x</span> <span class="o">=</span> <span class="n">tensor</span><span class="o">.</span><span class="n">Tensor</span><span class="p">((</span><span class="mi">3</span><span class="p">,</span> <span class="mi">5</span><span class="p">))</span>
+<span class="n">x</span><span class="o">.</span><span class="n">uniform</span><span class="p">(</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">)</span>  <span class="c"># randomly genearte the prediction activation</span>
+<span class="n">x</span> <span class="o">=</span> <span class="n">tensor</span><span class="o">.</span><span class="n">SoftMax</span><span class="p">(</span><span class="n">x</span><span class="p">)</span>  <span class="c"># normalize the prediction into probabilities</span>
+<span class="n">y</span> <span class="o">=</span> <span class="n">tensor</span><span class="o">.</span><span class="n">from_numpy</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="mi">0</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">3</span><span class="p">],</span> <span class="n">dtype</span><span class="o">=</span><span class="n">np</span><span class="o">.</span><span class="n">int</span><span class="p">))</span>  <span class="c"># set the truth</span>
+
+<span class="n">f</span> <span class="o">=</span> <span class="n">metric</span><span class="o">.</span><span class="n">Accuracy</span><span class="p">()</span>
+<span class="n">acc</span> <span class="o">=</span> <span class="n">f</span><span class="o">.</span><span class="n">evaluate</span><span class="p">(</span><span class="n">x</span><span class="p">,</span> <span class="n">y</span><span class="p">)</span>  <span class="c"># averaged accuracy over all 3 samples in x</span>
+</pre></div>
+</div>
+<dl class="class">
+<dt id="singa.metric.Metric">
+<em class="property">class </em><code class="descclassname">singa.metric.</code><code class="descname">Metric</code><a class="headerlink" href="#singa.metric.Metric" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <code class="xref py py-class docutils literal"><span class="pre">object</span></code></p>
+<p>Base metric class.</p>
+<p>Subclasses that wrap the C++ metric classes can use the inherited forward
+and evaluate functions of this base class. Other subclasses need
+to override these functions. Users feed in the <strong>predictions</strong> and
+ground truth to get the metric values.</p>
+<dl class="method">
+<dt id="singa.metric.Metric.forward">
+<code class="descname">forward</code><span class="sig-paren">(</span><em>x</em>, <em>y</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.metric.Metric.forward" title="Permalink to this definition">¶</a></dt>
+<dd><p>Compute the metric for each sample.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>x</strong> (<a class="reference internal" href="tensor.html#singa.tensor.Tensor" title="singa.tensor.Tensor"><em>Tensor</em></a>) &#8211; predictions, one row per sample</li>
+<li><strong>y</strong> (<a class="reference internal" href="tensor.html#singa.tensor.Tensor" title="singa.tensor.Tensor"><em>Tensor</em></a>) &#8211; ground truth values, one row per sample</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last">a tensor of floats, one per sample</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+<dl class="method">
+<dt id="singa.metric.Metric.evaluate">
+<code class="descname">evaluate</code><span class="sig-paren">(</span><em>x</em>, <em>y</em><span class="sig-paren">)</span><a class="headerlink" href="#singa.metric.Metric.evaluate" title="Permalink to this definition">¶</a></dt>
+<dd><p>Compute the averaged metric over all samples.</p>
+<table class="docutils field-list" frame="void" rules="none">
+<col class="field-name" />
+<col class="field-body" />
+<tbody valign="top">
+<tr class="field-odd field"><th class="field-name">Parameters:</th><td class="field-body"><ul class="first simple">
+<li><strong>x</strong> (<a class="reference internal" href="tensor.html#singa.tensor.Tensor" title="singa.tensor.Tensor"><em>Tensor</em></a>) &#8211; predictions, one row per sample</li>
+<li><strong>y</strong> (<a class="reference internal" href="tensor.html#singa.tensor.Tensor" title="singa.tensor.Tensor"><em>Tensor</em></a>) &#8211; ground truth values, one row per sample</li>
+</ul>
+</td>
+</tr>
+<tr class="field-even field"><th class="field-name">Returns:</th><td class="field-body"><p class="first last">a float value for the averaged metric</p>
+</td>
+</tr>
+</tbody>
+</table>
+</dd></dl>
+
+</dd></dl>
+
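+<p>For metrics implemented directly in Python, a subclass overrides
+<code class="docutils literal"><span class="pre">forward</span></code> and
+<code class="docutils literal"><span class="pre">evaluate</span></code>. Below is a minimal
+sketch; the <code class="docutils literal"><span class="pre">MAE</span></code> class and the
+use of <code class="docutils literal"><span class="pre">tensor.to_numpy</span></code> for host
+transfers are illustrative assumptions, not part of this module.</p>
+<div class="highlight-default"><div class="highlight"><pre>import numpy as np
+from singa import metric
+from singa import tensor
+
+class MAE(metric.Metric):
+    '''Mean absolute error; a hypothetical Python-only metric.'''
+
+    def forward(self, x, y):
+        # compute one error value per sample on the host
+        px = tensor.to_numpy(x)
+        py = tensor.to_numpy(y)
+        err = np.abs(px - py).mean(axis=1)
+        return tensor.from_numpy(err.astype(np.float32))
+
+    def evaluate(self, x, y):
+        # average the per-sample errors into a single float
+        return float(tensor.to_numpy(self.forward(x, y)).mean())
+</pre></div>
+</div>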
+<dl class="class">
+<dt id="singa.metric.Accuracy">
+<em class="property">class </em><code class="descclassname">singa.metric.</code><code class="descname">Accuracy</code><a class="headerlink" href="#singa.metric.Accuracy" title="Permalink to this definition">¶</a></dt>
+<dd><p>Bases: <a class="reference internal" href="#singa.metric.Metric" title="singa.metric.Metric"><code class="xref py py-class docutils literal"><span class="pre">singa.metric.Metric</span></code></a></p>
+<p>Compute the top-1 accuracy for single-label prediction tasks.</p>
+<p>It calls the C++ functions to do the calculation.</p>
+</dd></dl>
+
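+<p>A usage sketch, reusing <code class="docutils literal"><span class="pre">x</span></code> and
+<code class="docutils literal"><span class="pre">y</span></code> from the module example above
+(the per-sample values are presumably 1.0/0.0 correctness indicators):</p>
+<div class="highlight-default"><div class="highlight"><pre>f = metric.Accuracy()
+per_sample = f.forward(x, y)  # Tensor with one value per sample
+acc = f.evaluate(x, y)        # float, averaged over all samples
+</pre></div>
+</div>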
 </div>
 
 

Modified: incubator/singa/site/trunk/en/docs/neural-net.html
URL: http://svn.apache.org/viewvc/incubator/singa/site/trunk/en/docs/neural-net.html?rev=1756485&r1=1756484&r2=1756485&view=diff
==============================================================================
--- incubator/singa/site/trunk/en/docs/neural-net.html (original)
+++ incubator/singa/site/trunk/en/docs/neural-net.html Tue Aug 16 07:30:21 2016
@@ -168,31 +168,31 @@ category.</p>
 </div><p>Feed-forward models, e.g., CNN and MLP, are easy to configure as their layer
 connections form a directed graph without cycles. The
 configuration for the MLP model shown in Figure 1 is as follows,</p>
-<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">net</span> <span class="p">{</span>
+<div class="highlight-default"><div class="highlight"><pre><span class="n">net</span> <span class="p">{</span>
   <span class="n">layer</span> <span class="p">{</span>
-    <span class="n">name</span> <span class="p">:</span> <span class="s1">&#39;data&quot;</span>
+    <span class="n">name</span> <span class="p">:</span> <span class="s">&#39;data&quot;</span>
     <span class="nb">type</span> <span class="p">:</span> <span class="n">kData</span>
   <span class="p">}</span>
   <span class="n">layer</span> <span class="p">{</span>
-    <span class="n">name</span> <span class="p">:</span> <span class="s1">&#39;image&quot;</span>
+    <span class="n">name</span> <span class="p">:</span> <span class="s">&#39;image&quot;</span>
     <span class="nb">type</span> <span class="p">:</span> <span class="n">kImage</span>
-    <span class="n">srclayer</span><span class="p">:</span> <span class="s1">&#39;data&#39;</span>
+    <span class="n">srclayer</span><span class="p">:</span> <span class="s">&#39;data&#39;</span>
   <span class="p">}</span>
   <span class="n">layer</span> <span class="p">{</span>
-    <span class="n">name</span> <span class="p">:</span> <span class="s1">&#39;label&quot;</span>
+    <span class="n">name</span> <span class="p">:</span> <span class="s">&#39;label&quot;</span>
     <span class="nb">type</span> <span class="p">:</span> <span class="n">kLabel</span>
-    <span class="n">srclayer</span><span class="p">:</span> <span class="s1">&#39;data&#39;</span>
+    <span class="n">srclayer</span><span class="p">:</span> <span class="s">&#39;data&#39;</span>
   <span class="p">}</span>
   <span class="n">layer</span> <span class="p">{</span>
-    <span class="n">name</span> <span class="p">:</span> <span class="s1">&#39;hidden&quot;</span>
+    <span class="n">name</span> <span class="p">:</span> <span class="s">&#39;hidden&quot;</span>
     <span class="nb">type</span> <span class="p">:</span> <span class="n">kHidden</span>
-    <span class="n">srclayer</span><span class="p">:</span> <span class="s1">&#39;image&#39;</span>
+    <span class="n">srclayer</span><span class="p">:</span> <span class="s">&#39;image&#39;</span>
   <span class="p">}</span>
   <span class="n">layer</span> <span class="p">{</span>
-    <span class="n">name</span> <span class="p">:</span> <span class="s1">&#39;softmax&quot;</span>
+    <span class="n">name</span> <span class="p">:</span> <span class="s">&#39;softmax&quot;</span>
     <span class="nb">type</span> <span class="p">:</span> <span class="n">kSoftmaxLoss</span>
-    <span class="n">srclayer</span><span class="p">:</span> <span class="s1">&#39;hidden&#39;</span>
-    <span class="n">srclayer</span><span class="p">:</span> <span class="s1">&#39;label&#39;</span>
+    <span class="n">srclayer</span><span class="p">:</span> <span class="s">&#39;hidden&#39;</span>
+    <span class="n">srclayer</span><span class="p">:</span> <span class="s">&#39;label&#39;</span>
   <span class="p">}</span>
 <span class="p">}</span>
 </pre></div>
@@ -209,23 +209,23 @@ connections, as shown in Figure 3a. In o
 layer field should include each other&#8217;s name.
 The full <a class="reference external" href="rbm.html">RBM example</a> has
 detailed neural net configuration for an RBM model, which looks like</p>
-<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">net</span> <span class="p">{</span>
+<div class="highlight-default"><div class="highlight"><pre><span class="n">net</span> <span class="p">{</span>
   <span class="n">layer</span> <span class="p">{</span>
-    <span class="n">name</span> <span class="p">:</span> <span class="s2">&quot;vis&quot;</span>
+    <span class="n">name</span> <span class="p">:</span> <span class="s">&quot;vis&quot;</span>
     <span class="nb">type</span> <span class="p">:</span> <span class="n">kVisLayer</span>
     <span class="n">param</span> <span class="p">{</span>
-      <span class="n">name</span> <span class="p">:</span> <span class="s2">&quot;w1&quot;</span>
+      <span class="n">name</span> <span class="p">:</span> <span class="s">&quot;w1&quot;</span>
     <span class="p">}</span>
-    <span class="n">srclayer</span><span class="p">:</span> <span class="s2">&quot;hid&quot;</span>
+    <span class="n">srclayer</span><span class="p">:</span> <span class="s">&quot;hid&quot;</span>
   <span class="p">}</span>
   <span class="n">layer</span> <span class="p">{</span>
-    <span class="n">name</span> <span class="p">:</span> <span class="s2">&quot;hid&quot;</span>
+    <span class="n">name</span> <span class="p">:</span> <span class="s">&quot;hid&quot;</span>
     <span class="nb">type</span> <span class="p">:</span> <span class="n">kHidLayer</span>
     <span class="n">param</span> <span class="p">{</span>
-      <span class="n">name</span> <span class="p">:</span> <span class="s2">&quot;w2&quot;</span>
-      <span class="n">share_from</span><span class="p">:</span> <span class="s2">&quot;w1&quot;</span>
+      <span class="n">name</span> <span class="p">:</span> <span class="s">&quot;w2&quot;</span>
+      <span class="n">share_from</span><span class="p">:</span> <span class="s">&quot;w1&quot;</span>
     <span class="p">}</span>
-    <span class="n">srclayer</span><span class="p">:</span> <span class="s2">&quot;vis&quot;</span>
+    <span class="n">srclayer</span><span class="p">:</span> <span class="s">&quot;vis&quot;</span>
   <span class="p">}</span>
 <span class="p">}</span>
 </pre></div>
@@ -249,9 +249,9 @@ layers except the data layer, loss layer
 redundant configurations for the shared layers, users can use the <code class="docutils literal"><span class="pre">exclude</span></code>
 field to filter a layer in the neural net, e.g., the following layer will be
 filtered when creating the test <code class="docutils literal"><span class="pre">NeuralNet</span></code>.</p>
-<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">layer</span> <span class="p">{</span>
+<div class="highlight-default"><div class="highlight"><pre><span class="n">layer</span> <span class="p">{</span>
   <span class="o">...</span>
-  <span class="n">exclude</span> <span class="p">:</span> <span class="n">kTest</span> <span class="c1"># filter this layer for creating test net</span>
+  <span class="n">exclude</span> <span class="p">:</span> <span class="n">kTest</span> <span class="c"># filter this layer for creating test net</span>
 <span class="p">}</span>
 </pre></div>
 </div>
@@ -285,7 +285,7 @@ partitioned into two sub-layers.</p>
 <li><p class="first">Partitioning each singe layer into sub-layers on batch dimension (see
 below). It is enabled by configuring the partition dimension of the layer to
 0, e.g.,</p>
-<div class="highlight-default"><div class="highlight"><pre><span></span> <span class="c1"># with other fields omitted</span>
+<div class="highlight-default"><div class="highlight"><pre> <span class="c"># with other fields omitted</span>
  <span class="n">layer</span> <span class="p">{</span>
    <span class="n">partition_dim</span><span class="p">:</span> <span class="mi">0</span>
  <span class="p">}</span>
@@ -295,7 +295,7 @@ below). It is enabled by configuring the
 <li><p class="first">Partitioning each singe layer into sub-layers on feature dimension (see
 below).  It is enabled by configuring the partition dimension of the layer to
 1, e.g.,</p>
-<div class="highlight-default"><div class="highlight"><pre><span></span> <span class="c1"># with other fields omitted</span>
+<div class="highlight-default"><div class="highlight"><pre> <span class="c"># with other fields omitted</span>
  <span class="n">layer</span> <span class="p">{</span>
    <span class="n">partition_dim</span><span class="p">:</span> <span class="mi">1</span>
  <span class="p">}</span>
@@ -304,7 +304,7 @@ below).  It is enabled by configuring th
 </li>
 <li><p class="first">Partitioning all layers into different subsets. It is enabled by
 configuring the location ID of a layer, e.g.,</p>
-<div class="highlight-default"><div class="highlight"><pre><span></span> <span class="c1"># with other fields omitted</span>
+<div class="highlight-default"><div class="highlight"><pre> <span class="c"># with other fields omitted</span>
  <span class="n">layer</span> <span class="p">{</span>
    <span class="n">location</span><span class="p">:</span> <span class="mi">1</span>
  <span class="p">}</span>
@@ -320,7 +320,7 @@ configuring the location ID of a layer,
 useful for large models. An example application is to implement the
 <a class="reference external" href="http://arxiv.org/abs/1404.5997">idea proposed by Alex</a>.
 Hybrid partitioning is configured as follows,</p>
-<div class="highlight-default"><div class="highlight"><pre><span></span> <span class="c1"># with other fields omitted</span>
+<div class="highlight-default"><div class="highlight"><pre> <span class="c"># with other fields omitted</span>
  <span class="n">layer</span> <span class="p">{</span>
    <span class="n">location</span><span class="p">:</span> <span class="mi">1</span>
  <span class="p">}</span>
@@ -367,7 +367,7 @@ gradients will be averaged by the stub o
 <span id="advanced-user-guide"></span><h2>Advanced user guide<a class="headerlink" href="#advanced-user-guide" title="Permalink to this headline">¶</a></h2>
 <div class="section" id="creation">
 <span id="creation"></span><h3>Creation<a class="headerlink" href="#creation" title="Permalink to this headline">¶</a></h3>
-<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">static</span> <span class="n">NeuralNet</span><span class="o">*</span> <span class="n">NeuralNet</span><span class="p">::</span><span class="n">Create</span><span class="p">(</span><span class="n">const</span> <span class="n">NetProto</span><span class="o">&amp;</span> <span class="n">np</span><span class="p">,</span> <span class="n">Phase</span> <span class="n">phase</span><span class="p">,</span> <span class="nb">int</span> <span class="n">num</span><span class="p">);</span>
+<div class="highlight-default"><div class="highlight"><pre><span class="n">static</span> <span class="n">NeuralNet</span><span class="o">*</span> <span class="n">NeuralNet</span><span class="p">::</span><span class="n">Create</span><span class="p">(</span><span class="n">const</span> <span class="n">NetProto</span><span class="o">&amp;</span> <span class="n">np</span><span class="p">,</span> <span class="n">Phase</span> <span class="n">phase</span><span class="p">,</span> <span class="nb">int</span> <span class="n">num</span><span class="p">);</span>
 </pre></div>
 </div>
 <p>The above function creates a <code class="docutils literal"><span class="pre">NeuralNet</span></code> for a given phase, and returns a
@@ -380,23 +380,23 @@ function takes in the full net configura
 validation and test.  It removes layers for phases other than the specified
 phase based on the <code class="docutils literal"><span class="pre">exclude</span></code> field in
 <a class="reference external" href="layer.html">layer configuration</a>:</p>
-<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">layer</span> <span class="p">{</span>
+<div class="highlight-default"><div class="highlight"><pre><span class="n">layer</span> <span class="p">{</span>
   <span class="o">...</span>
-  <span class="n">exclude</span> <span class="p">:</span> <span class="n">kTest</span> <span class="c1"># filter this layer for creating test net</span>
+  <span class="n">exclude</span> <span class="p">:</span> <span class="n">kTest</span> <span class="c"># filter this layer for creating test net</span>
 <span class="p">}</span>
 </pre></div>
 </div>
 <p>The filtered net configuration is passed to the constructor of <code class="docutils literal"><span class="pre">NeuralNet</span></code>:</p>
-<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">NeuralNet</span><span class="p">::</span><span class="n">NeuralNet</span><span class="p">(</span><span class="n">NetProto</span> <span class="n">netproto</span><span class="p">,</span> <span class="nb">int</span> <span class="n">npartitions</span><span class="p">);</span>
+<div class="highlight-default"><div class="highlight"><pre><span class="n">NeuralNet</span><span class="p">::</span><span class="n">NeuralNet</span><span class="p">(</span><span class="n">NetProto</span> <span class="n">netproto</span><span class="p">,</span> <span class="nb">int</span> <span class="n">npartitions</span><span class="p">);</span>
 </pre></div>
 </div>
 <p>The constructor first creates a graph representing the net structure in</p>
-<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">Graph</span><span class="o">*</span> <span class="n">NeuralNet</span><span class="p">::</span><span class="n">CreateGraph</span><span class="p">(</span><span class="n">const</span> <span class="n">NetProto</span><span class="o">&amp;</span> <span class="n">netproto</span><span class="p">,</span> <span class="nb">int</span> <span class="n">npartitions</span><span class="p">);</span>
+<div class="highlight-default"><div class="highlight"><pre><span class="n">Graph</span><span class="o">*</span> <span class="n">NeuralNet</span><span class="p">::</span><span class="n">CreateGraph</span><span class="p">(</span><span class="n">const</span> <span class="n">NetProto</span><span class="o">&amp;</span> <span class="n">netproto</span><span class="p">,</span> <span class="nb">int</span> <span class="n">npartitions</span><span class="p">);</span>
 </pre></div>
 </div>
 <p>Next, it creates a layer for each node and connects layers if their nodes are
 connected.</p>
-<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">void</span> <span class="n">NeuralNet</span><span class="p">::</span><span class="n">CreateNetFromGraph</span><span class="p">(</span><span class="n">Graph</span><span class="o">*</span> <span class="n">graph</span><span class="p">,</span> <span class="nb">int</span> <span class="n">npartitions</span><span class="p">);</span>
+<div class="highlight-default"><div class="highlight"><pre><span class="n">void</span> <span class="n">NeuralNet</span><span class="p">::</span><span class="n">CreateNetFromGraph</span><span class="p">(</span><span class="n">Graph</span><span class="o">*</span> <span class="n">graph</span><span class="p">,</span> <span class="nb">int</span> <span class="n">npartitions</span><span class="p">);</span>
 </pre></div>
 </div>
 <p>Since the <code class="docutils literal"><span class="pre">NeuralNet</span></code> instance may be shared among multiple workers, the
@@ -408,12 +408,12 @@ connected.</p>
 is enabled by first sharing the Param configuration (in <code class="docutils literal"><span class="pre">NeuralNet::Create</span></code>)
 to create two similar Param objects (e.g., of the same shape), and then calling
 (in <code class="docutils literal"><span class="pre">NeuralNet::CreateNetFromGraph</span></code>),</p>
-<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">void</span> <span class="n">Param</span><span class="p">::</span><span class="n">ShareFrom</span><span class="p">(</span><span class="n">const</span> <span class="n">Param</span><span class="o">&amp;</span> <span class="n">from</span><span class="p">);</span>
+<div class="highlight-default"><div class="highlight"><pre><span class="n">void</span> <span class="n">Param</span><span class="p">::</span><span class="n">ShareFrom</span><span class="p">(</span><span class="n">const</span> <span class="n">Param</span><span class="o">&amp;</span> <span class="n">from</span><span class="p">);</span>
 </pre></div>
 </div>
 <p>It is also possible to share <code class="docutils literal"><span class="pre">Param</span></code>s of two nets, e.g., sharing parameters of
 the training net and the test net,</p>
-<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">void</span> <span class="n">NeuralNet</span><span class="p">:</span><span class="n">ShareParamsFrom</span><span class="p">(</span><span class="n">NeuralNet</span><span class="o">*</span> <span class="n">other</span><span class="p">);</span>
+<div class="highlight-default"><div class="highlight"><pre><span class="n">void</span> <span class="n">NeuralNet</span><span class="p">:</span><span class="n">ShareParamsFrom</span><span class="p">(</span><span class="n">NeuralNet</span><span class="o">*</span> <span class="n">other</span><span class="p">);</span>
 </pre></div>
 </div>
 <p>It will call <code class="docutils literal"><span class="pre">Param::ShareFrom</span></code> for each Param object.</p>
@@ -422,7 +422,7 @@ the training net and the test net,</p>
 <span id="access-functions"></span><h3>Access functions<a class="headerlink" href="#access-functions" title="Permalink to this headline">¶</a></h3>
 <p><code class="docutils literal"><span class="pre">NeuralNet</span></code> provides a couple of access function to get the layers and params
 of the net:</p>
-<div class="highlight-default"><div class="highlight"><pre><span></span><span class="n">const</span> <span class="n">std</span><span class="p">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="n">Layer</span><span class="o">*&gt;&amp;</span> <span class="n">layers</span><span class="p">()</span> <span class="n">const</span><span class="p">;</span>
+<div class="highlight-default"><div class="highlight"><pre><span class="n">const</span> <span class="n">std</span><span class="p">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="n">Layer</span><span class="o">*&gt;&amp;</span> <span class="n">layers</span><span class="p">()</span> <span class="n">const</span><span class="p">;</span>
 <span class="n">const</span> <span class="n">std</span><span class="p">::</span><span class="n">vector</span><span class="o">&lt;</span><span class="n">Param</span><span class="o">*&gt;&amp;</span> <span class="n">params</span><span class="p">()</span> <span class="n">const</span> <span class="p">;</span>
 <span class="n">Layer</span><span class="o">*</span> <span class="n">name2layer</span><span class="p">(</span><span class="n">string</span> <span class="n">name</span><span class="p">)</span> <span class="n">const</span><span class="p">;</span>
 <span class="n">Param</span><span class="o">*</span> <span class="n">paramid2param</span><span class="p">(</span><span class="nb">int</span> <span class="nb">id</span><span class="p">)</span> <span class="n">const</span><span class="p">;</span>