Posted to commits@mahout.apache.org by bu...@apache.org on 2015/04/05 06:49:46 UTC

svn commit: r946372 - in /websites/staging/mahout/trunk/content: ./ users/classification/mlp.html

Author: buildbot
Date: Sun Apr  5 04:49:46 2015
New Revision: 946372

Log:
Staging update by buildbot for mahout

Added:
    websites/staging/mahout/trunk/content/users/classification/mlp.html
Modified:
    websites/staging/mahout/trunk/content/   (props changed)

Propchange: websites/staging/mahout/trunk/content/
------------------------------------------------------------------------------
--- cms:source-revision (original)
+++ cms:source-revision Sun Apr  5 04:49:46 2015
@@ -1 +1 @@
-1671360
+1671372

Added: websites/staging/mahout/trunk/content/users/classification/mlp.html
==============================================================================
--- websites/staging/mahout/trunk/content/users/classification/mlp.html (added)
+++ websites/staging/mahout/trunk/content/users/classification/mlp.html Sun Apr  5 04:49:46 2015
@@ -0,0 +1,483 @@
+    <h1 id="multilayer-perceptron">Multilayer Perceptron</h1>
+<p>A multilayer perceptron is a biologically inspired feed-forward network that can 
+be trained to represent a nonlinear mapping between input and output data. It 
+consists of multiple layers, each containing multiple artificial neurons, and
+can be used for classification and regression tasks in a supervised learning setting. </p>
+<h2 id="command-line-usage">Command line usage</h2>
+<p>The MLP implementation is currently located in the MapReduce-Legacy package. It
+can be used with the following commands: </p>
+<h1 id="model-training">model training</h1>
+<div class="codehilite"><pre>$ <span class="n">bin</span><span class="o">/</span><span class="n">mahout</span> <span class="n">org</span><span class="p">.</span><span class="n">apache</span><span class="p">.</span><span class="n">mahout</span><span class="p">.</span><span class="n">classifier</span><span class="p">.</span><span class="n">mlp</span><span class="p">.</span><span class="n">TrainMultilayerPerceptron</span>
+</pre></div>
+
+
+<h1 id="model-usage">model usage</h1>
+<div class="codehilite"><pre>$ <span class="n">bin</span><span class="o">/</span><span class="n">mahout</span> <span class="n">org</span><span class="p">.</span><span class="n">apache</span><span class="p">.</span><span class="n">mahout</span><span class="p">.</span><span class="n">classifier</span><span class="p">.</span><span class="n">mlp</span><span class="p">.</span><span class="n">RunMultilayerPerceptron</span>
+</pre></div>
+
+
+<p>To train and use the model, a number of parameters can be specified. Parameters without default values have to be specified by the user. Note that not all parameters can be used for both training and running the model; the last column of the table indicates the phase each parameter applies to. An example of the usage is given below.</p>
+<h3 id="parameters">Parameters</h3>
+<table>
+<thead>
+<tr>
+<th align="left">Command</th>
+<th align="right">Default</th>
+<th align="left">Description</th>
+<th align="left">Type</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td align="left">--input -i</td>
+<td align="right"></td>
+<td align="left">Path to the input data (currently, only .csv-files are allowed)</td>
+<td align="left"></td>
+</tr>
+<tr>
+<td align="left">--skipHeader -sh</td>
+<td align="right">false</td>
+<td align="left">Skip first row of the input file (corresponds to the csv headers)</td>
+<td align="left"></td>
+</tr>
+<tr>
+<td align="left">--update -u</td>
+<td align="right">false</td>
+<td align="left">Whether the model should be updated incrementally with every new training instance. If this parameter is not given, the model is trained from scratch.</td>
+<td align="left">training</td>
+</tr>
+<tr>
+<td align="left">--labels -labels</td>
+<td align="right"></td>
+<td align="left">Instance labels separated by whitespaces.</td>
+<td align="left">training</td>
+</tr>
+<tr>
+<td align="left">--model -mo</td>
+<td align="right"></td>
+<td align="left">Location where the model will be stored / is stored (if the specified location has an existing model, it will update the model through incremental learning).</td>
+<td align="left"></td>
+</tr>
+<tr>
+<td align="left">--layerSize -ls</td>
+<td align="right"></td>
+<td align="left">Number of units per layer, including input, hidden and ouput layers. This parameter specifies the topology of the network (see <a href="mlperceptron_structure.png" title="Architecture of a three-layer MLP">this image</a> for an example specified by <code>-ls 4 8 3</code>).</td>
+<td align="left">training</td>
+</tr>
+<tr>
+<td align="left">--squashingFunction -sf</td>
+<td align="right">Sigmoid</td>
+<td align="left">The squashing function to use for the units. Currently only the sigmoid fucntion is available.</td>
+<td align="left">training</td>
+</tr>
+<tr>
+<td align="left">--learningRate -l</td>
+<td align="right">0.5</td>
+<td align="left">The learning rate that is used for weight updates.</td>
+<td align="left">training</td>
+</tr>
+<tr>
+<td align="left">--momemtumWeight -m</td>
+<td align="right">0.1</td>
+<td align="left">The momentum weight that is used for gradient descent. Must be in the range between 0 ... 1.0</td>
+<td align="left">training</td>
+</tr>
+<tr>
+<td align="left">--regularizationWeight -r</td>
+<td align="right">0</td>
+<td align="left">Regularization value for the weight vector. Must be in the range between 0 ... 0.1</td>
+<td align="left">training</td>
+</tr>
+<tr>
+<td align="left">--format -f</td>
+<td align="right">csv</td>
+<td align="left">Input file format. Currently only csv is supported.</td>
+<td align="left"></td>
+</tr>
+<tr>
+<td align="left">--columnRange -cr</td>
+<td align="right"></td>
+<td align="left">Range of the columns to use from the input file, starting with 0 (i.e. <code>-cr 0 5</code> for including the first six columns only)</td>
+<td align="left">testing</td>
+</tr>
+<tr>
+<td align="left">--output -o</td>
+<td align="right"></td>
+<td align="left">Path to store the labeled results from running the model.</td>
+<td align="left">testing</td>
+</tr>
+</tbody>
+</table>
+<h2 id="example-usage">Example usage</h2>
+<p>In this example, we will train a multilayer perceptron for classification on the iris data set. The iris flower data set contains samples of three flower species, where each data point consists of four features, the flower measurements (sepal length, sepal width, petal length and petal width). Every sample is labeled with the flower species it belongs to.</p>
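+<p>For illustration, the expected input is a CSV file with one row per sample, the feature columns first and the label in the last column. The excerpt below shows this layout with well-known iris values; the exact header names of the bundled <code>iris.csv</code> are an assumption here (the header row is skipped via <code>-sh</code> anyway):</p>
+<div class="codehilite"><pre>sepal_length,sepal_width,petal_length,petal_width,label
+5.1,3.5,1.4,0.2,setosa
+7.0,3.2,4.7,1.4,versicolor
+6.3,3.3,6.0,2.5,virginica
+</pre></div>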
+<h3 id="training">Training</h3>
+<p>To train our multilayer perceptron model from the command line, we call the following command:</p>
+<div class="codehilite"><pre>$ <span class="n">bin</span><span class="o">/</span><span class="n">mahout</span> <span class="n">org</span><span class="p">.</span><span class="n">apache</span><span class="p">.</span><span class="n">mahout</span><span class="p">.</span><span class="n">classifier</span><span class="p">.</span><span class="n">mlp</span><span class="p">.</span><span class="n">TrainMultilayerPerceptron</span> <span class="o">\</span>
+            <span class="o">-</span><span class="nb">i</span> <span class="o">./</span><span class="n">mrlegacy</span><span class="o">/</span><span class="n">src</span><span class="o">/</span><span class="n">test</span><span class="o">/</span><span class="n">resources</span><span class="o">/</span><span class="n">iris</span><span class="p">.</span><span class="n">csv</span> <span class="o">-</span><span class="n">sh</span> <span class="o">\</span>
+            <span class="o">-</span><span class="n">labels</span> <span class="n">setosa</span> <span class="n">versicolor</span> <span class="n">virginica</span> <span class="o">\</span>
+            <span class="o">-</span><span class="n">mo</span> <span class="o">/</span><span class="n">tmp</span><span class="o">/</span><span class="n">model</span><span class="p">.</span><span class="n">model</span> <span class="o">-</span><span class="n">ls</span> 4 8 3 <span class="o">-</span><span class="n">l</span> 0<span class="p">.</span>2 <span class="o">-</span><span class="n">m</span> 0<span class="p">.</span>35 <span class="o">-</span><span class="n">r</span> 0<span class="p">.</span>0001
+</pre></div>
+
+
+<p>The individual parameters are explained in the following.</p>
+<ul>
+<li><code>-i ./mrlegacy/src/test/resources/iris.csv</code> use the iris data set as input data</li>
+<li><code>-sh</code> since the file <code>iris.csv</code> contains a header row, this row needs to be skipped </li>
+<li><code>-labels setosa versicolor virginica</code> we specify which class labels should be learned (the flower species in this case)</li>
+<li><code>-mo /tmp/model.model</code> specify where to store the model file</li>
+<li><code>-ls 4 8 3</code> we specify the number of units per layer and thereby the depth and topology of the network. The resulting network structure can be seen in the figure below.</li>
+<li><code>-l 0.2</code> we set the learning rate to <code>0.2</code></li>
+<li><code>-m 0.35</code> the momentum weight is set to <code>0.35</code></li>
+<li><code>-r 0.0001</code> regularization weight is set to <code>0.0001</code></li>
+</ul>
+<p>The picture below shows the architecture defined by the above command. The topology of the network is completely defined through the number of layers and units, because in this implementation of the MLP every unit is fully connected to the units of the next and previous layer. Bias units are added automatically.</p>
+<p><img alt="Multilayer perceptron network" src="mlperceptron_structure.png" title="Architecture of a three-layer MLP" /></p>
+<h3 id="testing">Testing</h3>
+<p>To test / run the multilayer perceptron classification with the trained model, we can use the following command:</p>
+<div class="codehilite"><pre>$ <span class="n">bin</span><span class="o">/</span><span class="n">mahout</span> <span class="n">org</span><span class="p">.</span><span class="n">apache</span><span class="p">.</span><span class="n">mahout</span><span class="p">.</span><span class="n">classifier</span><span class="p">.</span><span class="n">mlp</span><span class="p">.</span><span class="n">RunMultilayerPerceptron</span> <span class="o">\</span>
+            <span class="o">-</span><span class="nb">i</span> <span class="o">./</span><span class="n">mrlegacy</span><span class="o">/</span><span class="n">src</span><span class="o">/</span><span class="n">test</span><span class="o">/</span><span class="n">resources</span><span class="o">/</span><span class="n">iris</span><span class="p">.</span><span class="n">csv</span> <span class="o">-</span><span class="n">sh</span> <span class="o">-</span><span class="n">cr</span> 0 3 <span class="o">\</span>
+            <span class="o">-</span><span class="n">mo</span> <span class="o">/</span><span class="n">tmp</span><span class="o">/</span><span class="n">model</span><span class="p">.</span><span class="n">model</span> <span class="o">-</span><span class="n">o</span> <span class="o">/</span><span class="n">tmp</span><span class="o">/</span><span class="n">labelResult</span><span class="p">.</span><span class="n">txt</span>
+</pre></div>
+
+
+<p>The individual parameters are explained in the following.</p>
+<ul>
+<li><code>-i ./mrlegacy/src/test/resources/iris.csv</code> use the iris data set as input data</li>
+<li><code>-sh</code> since the file <code>iris.csv</code> contains a header row, this row needs to be skipped</li>
+<li><code>-cr 0 3</code> we specify the range of columns to use as features (the first four columns of the input file)</li>
+<li><code>-mo /tmp/model.model</code> specify where the model file is stored</li>
+<li><code>-o /tmp/labelResult.txt</code> specify where the labeled output file will be stored</li>
+</ul>
+<h2 id="implementation">Implementation</h2>
+<p>The Multilayer Perceptron implementation is based on a more general Neural Network class. Command line support was added later and provides a simple way to use the MLP, as shown in the example. It is implemented to run on a single machine using stochastic gradient descent, where the weights are updated using one data point at a time, resulting in a weight update of the form:
+$$ \vec{w}^{(t + 1)} = \vec{w}^{(t)} - \eta \nabla E_n(\vec{w}^{(t)}) $$</p>
+<p>where <em>&eta;</em> is the learning rate and <em>E<sub>n</sub></em> is the error computed for the <em>n</em>-th training instance. It is not yet possible to switch the learning to more advanced methods that use adaptive learning rates.</p>
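+<p>As a minimal illustration of this per-instance update (a sketch, not Mahout's actual internals), the step can be written as follows; the gradient itself comes from backpropagation, which is sketched at the end of this page:</p>
+<div class="codehilite"><pre>// One stochastic gradient descent step: every weight moves a small
+// step against its error gradient for the current training instance.
+static void sgdStep(double[] weights, double[] gradient, double learningRate) {
+  for (int i = 0; i &lt; weights.length; i++) {
+    weights[i] -= learningRate * gradient[i];
+  }
+}
+</pre></div>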
+<p>The number of layers and units per layer can be specified manually and determines the whole topology, with each unit being fully connected to the previous layer. A bias unit is automatically added to the input of every layer. 
+Currently, the logistic sigmoid is used as the squashing function in every hidden and output layer. It is of the form:</p>
+<p>$$ \sigma(a) = \frac{1}{1 + \exp(-a)} $$</p>
+<p>where <em>a</em> is the activation of the unit.</p>
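+<p>In code, the squashing function and its derivative are straightforward; the derivative <em>&sigma;(a)(1 - &sigma;(a))</em> is what backpropagation needs later. This is an illustrative sketch, not Mahout's internal implementation:</p>
+<div class="codehilite"><pre>// Logistic sigmoid squashing function.
+static double sigmoid(double a) {
+  return 1.0 / (1.0 + Math.exp(-a));
+}
+
+// Its derivative, expressed in terms of the function value itself.
+static double sigmoidDerivative(double a) {
+  double s = sigmoid(a);
+  return s * (1.0 - s);
+}
+</pre></div>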
+<p>The command line version <strong>does not perform iterations</strong>, which leads to bad results on small datasets. Another restriction is that the CLI version of the MLP only supports classification, since the labels have to be given explicitly when executing on the command line. </p>
+<p>A learned model can be stored and updated with new training instances using the <code>--update</code> flag. The output of classification is saved as a .txt file and consists only of the assigned labels. Apart from the command-line interface, it is possible to construct and train more specialized neural networks using the API and interfaces in the mrlegacy package. </p>
+<h2 id="theoretical-background">Theoretical Background</h2>
+<p>The <em>multilayer perceptron</em> was inspired by the biological structure of the brain, where multiple neurons are connected and form columns and layers. Perceptual input enters this network through our sensory organs and is then further processed to higher levels. 
+The term multilayer perceptron is a little misleading, since the <em>perceptron</em> is a special case of a single <em>artificial neuron</em> that can be used for simple computations <a href="http://en.wikipedia.org/wiki/Perceptron" title="The perceptron in wikipedia">[1]</a>. The difference is that the perceptron uses a discontinuous nonlinearity, while the MLP neurons implemented in Mahout must use continuous nonlinearities. This is necessary for the implemented learning algorithm, where the error is propagated back from the output layer to the input layer and the weights of the connections are changed according to their contribution to the overall error. This algorithm is called backpropagation and uses gradient descent to update the weights. To compute the gradients we need continuous nonlinearities. But let's start from the beginning!</p>
+<p>The first layer of the MLP represents the input and has no other purpose than routing the input to every connected unit in a feed-forward fashion. The following layers are called hidden layers, and the last layer serves the special purpose of determining the output. The activation of a unit <em>j</em> in a hidden layer is computed through a weighted sum of all its inputs, resulting in 
+$$ a_j = \sum_{i=1}^{D} w_{ji}^{(l)} x_i + w_{j0}^{(l)} $$
+This computes the activation <em>a<sub>j</sub></em> for neuron <em>j</em>, where <em>w<sub>ji</sub></em> is the weight from neuron <em>i</em> to neuron <em>j</em> in layer <em>l</em>. The last term, <em>w<sub>j0</sub></em>, is called the bias and can be used as an offset, independent of the input.</p>
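+<p>A small sketch of how such an activation is computed (illustrative only; storing the bias at index 0 of each weight row is an assumption of this sketch):</p>
+<div class="codehilite"><pre>// Activation of unit j: weighted sum of the inputs plus the bias w_{j0}.
+static double activation(double[][] w, int j, double[] x) {
+  double a = w[j][0];                  // bias term w_{j0}
+  for (int i = 0; i &lt; x.length; i++) {
+    a += w[j][i + 1] * x[i];           // w_{ji} * x_i
+  }
+  return a;
+}
+</pre></div>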
+<p>The activation is then transformed by the aforementioned differentiable, nonlinear <em>activation function</em> and serves as the input to the next layer. The activation function is usually chosen from the family of sigmoidal functions such as <em>tanh</em> or the <em>logistic sigmoid</em> <a href="http://en.wikipedia.org/wiki/Sigmoid_function" title="Sigmoid function on wikipedia">[2]</a>. Often, sigmoidal and logistic sigmoid are used synonymously. Another word for the activation function is <em>squashing function</em>, since the s-shape of this function class <em>squashes</em> the input.</p>
+<p>For different units or layers, different activation functions can be used to obtain different behaviors. Especially in the output layer, the activation function can be chosen to obtain the output value <em>y</em>, depending on the learning problem:
+$$ y_k = \sigma (a_k) $$</p>
+<p>If the learning problem is a linear regression task, sigma can be chosen to be the identity function. For classification problems, the choice of the squashing function depends on the exact task at hand; often softmax activation functions are used. </p>
+<p>The equation for an MLP with three layers (one input, one hidden and one output) is then given by</p>
+<p>$$ y_k(\vec{x}, \vec{w}) = h \left( \sum_{j=1}^{M} w_{kj}^{(2)} h \left( \sum_{i=1}^{D} w_{ji}^{(1)} x_i + w_{j0}^{(1)} \right) + w_{k0}^{(2)} \right) $$ </p>
+<p>where <em>h</em> indicates the respective squashing function that is used in the units of a layer. <em>M</em> and <em>D</em> specify the number of incoming connections to a unit, and we can see that the input to the first layer (hidden layer) is just the original input <em>x</em>, whereas the input into the second layer (output layer) is the transformed output of layer one. The output <em>y</em> of unit <em>k</em> is therefore given by the above equation and depends on the input <em>x</em> and the weight vector <em>w</em>. This shows that the only parameter we can optimize during learning is <em>w</em>, since we cannot do anything about the input <em>x</em>. To facilitate the following steps, we can include the bias terms into the weight vector and correct for the indices by adding another dimension with the value 1 to the input vector. The bias is a constant that is added to the weighted sum and shifts the input of the nonlinear transformation. Including it into the weight vector leads to:</p>
+<p>$$ y_k(\vec{x}, \vec{w}) = h \left( \sum_{j=0}^{M} w_{kj}^{(2)} h \left( \sum_{i=0}^{D} w_{ji}^{(1)} x_i \right) \right) $$ </p>
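+<p>The following sketch evaluates exactly this equation for a three-layer network, absorbing the bias by prepending a constant 1 to the input of each layer. It reuses the <code>sigmoid</code> helper from above as <em>h</em> and is illustrative only, not Mahout's code:</p>
+<div class="codehilite"><pre>// Forward pass of a three-layer MLP: y_k = h(sum_j w2_kj * h(sum_i w1_ji * x_i)).
+static double[] forward(double[][] w1, double[][] w2, double[] input) {
+  double[] x = prepend(1.0, input);        // x_0 = 1 absorbs the bias of layer 1
+  double[] hidden = new double[w1.length];
+  for (int j = 0; j &lt; w1.length; j++) {
+    hidden[j] = sigmoid(dot(w1[j], x));    // h(sum_i w1_ji x_i)
+  }
+  double[] z = prepend(1.0, hidden);       // bias unit of the hidden layer
+  double[] y = new double[w2.length];
+  for (int k = 0; k &lt; w2.length; k++) {
+    y[k] = sigmoid(dot(w2[k], z));         // h(sum_j w2_kj z_j)
+  }
+  return y;
+}
+
+static double dot(double[] a, double[] b) {
+  double s = 0.0;
+  for (int i = 0; i &lt; a.length; i++) { s += a[i] * b[i]; }
+  return s;
+}
+
+static double[] prepend(double v, double[] a) {
+  double[] out = new double[a.length + 1];
+  out[0] = v;
+  System.arraycopy(a, 0, out, 1, a.length);
+  return out;
+}
+</pre></div>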
+<p>The previous paragraphs described how the MLP transforms a given input into some output using a combination of different nonlinear functions. Of course, what we really want is to learn the structure of our data, so that we can feed data with unknown labels into the network and get estimated target labels. To achieve this, we have to train our network. In this context, training means optimizing some function such that the error between the real labels <em>t</em> and the network output <em>y</em> becomes smallest. We have seen in the previous paragraph that our only knob to change is the weight vector <em>w</em>, making the function to be optimized a function of <em>w</em>. For simplicity, and because it is widely used, we choose the so-called <em>sum-of-squares</em> error function as an example, which is given by</p>
+<p>$$ E(\vec{w}) = \frac{1}{2} \sum_{n=1}^N \left( y(\vec{x}_n, \vec{w}) - t_n \right)^2 $$</p>
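+<p>Computing this error over a data set, using the <code>forward</code> sketch above (restricted to a single output unit for brevity; illustrative only):</p>
+<div class="codehilite"><pre>// Sum-of-squares error E(w) = 1/2 * sum_n (y(x_n, w) - t_n)^2.
+static double sumOfSquaresError(double[][] inputs, double[] targets,
+                                double[][] w1, double[][] w2) {
+  double error = 0.0;
+  for (int n = 0; n &lt; inputs.length; n++) {
+    double diff = forward(w1, w2, inputs[n])[0] - targets[n];
+    error += 0.5 * diff * diff;
+  }
+  return error;
+}
+</pre></div>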
+<p>The goal is to minimize this function and thereby increase the performance of our model. A common method to achieve this is gradient descent together with the technique of <em>backpropagation</em>, where the goal is to compute the contribution of every unit to the overall error and to change each weight according to this contribution, in the direction of the negative gradient of the error function at this particular unit. In the following we give a short overview of model training with gradient descent and backpropagation. A more detailed treatment can be found in <a href="http://research.microsoft.com/en-us/um/people/cmbishop/prml/" title="Christopher M. Bishop: Pattern Recognition and Machine Learning, Springer 2009">[3]</a>, from which much of this information is taken.</p>
+<p>The problem with minimizing the error function is that the error can only be computed at the output layer, where we know the targets <em>t</em>, but we want to update the weights of all units. Therefore, we use backpropagation to propagate the error, which is first computed at the output layer, back to the units of the previous layers. For this approach, we also need the derivatives of the activation functions. </p>
+<p>Weights are then updated with a small step in the direction of the negative gradient, regulated by the learning rate <em>&eta;</em>, such that we arrive at the formula for the weight update:</p>
+<p>$$ \vec{w}^{(t + 1)} = \vec{w}^{(t)} - \eta \nabla E(\vec{w}^{(t)}) $$</p>
+<p>A momentum weight can be set as a parameter of the gradient descent method to increase the probability of finding better local or global optima of the error function.</p>
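+<p>To make the whole procedure concrete, here is a sketch of one backpropagation step for the three-layer network from above, with the momentum term blended into the weight update (the <code>v1</code>/<code>v2</code> arrays holding the previous updates are an assumption of this sketch). It reuses the helpers from the earlier sketches and illustrates the technique; it is not Mahout's implementation:</p>
+<div class="codehilite"><pre>// One backpropagation step with momentum for a sigmoid/sum-of-squares network.
+static void backpropStep(double[][] w1, double[][] w2,
+                         double[][] v1, double[][] v2,   // previous weight updates
+                         double[] input, double[] target,
+                         double eta, double momentum) {
+  // Forward pass, keeping the intermediate activations.
+  double[] xb = prepend(1.0, input);
+  double[] z = new double[w1.length];
+  for (int j = 0; j &lt; w1.length; j++) { z[j] = sigmoid(dot(w1[j], xb)); }
+  double[] zb = prepend(1.0, z);
+  double[] y = new double[w2.length];
+  for (int k = 0; k &lt; w2.length; k++) { y[k] = sigmoid(dot(w2[k], zb)); }
+
+  // Output deltas: error signal times the sigmoid derivative y(1 - y).
+  double[] dOut = new double[y.length];
+  for (int k = 0; k &lt; y.length; k++) {
+    dOut[k] = (y[k] - target[k]) * y[k] * (1.0 - y[k]);
+  }
+  // Hidden deltas: propagate the output deltas back through w2.
+  double[] dHid = new double[z.length];
+  for (int j = 0; j &lt; z.length; j++) {
+    double s = 0.0;
+    for (int k = 0; k &lt; y.length; k++) { s += dOut[k] * w2[k][j + 1]; }
+    dHid[j] = z[j] * (1.0 - z[j]) * s;
+  }
+  // Weight updates with momentum: v = momentum * v - eta * gradient.
+  for (int k = 0; k &lt; w2.length; k++) {
+    for (int j = 0; j &lt; zb.length; j++) {
+      v2[k][j] = momentum * v2[k][j] - eta * dOut[k] * zb[j];
+      w2[k][j] += v2[k][j];
+    }
+  }
+  for (int j = 0; j &lt; w1.length; j++) {
+    for (int i = 0; i &lt; xb.length; i++) {
+      v1[j][i] = momentum * v1[j][i] - eta * dHid[j] * xb[i];
+      w1[j][i] += v1[j][i];
+    }
+  }
+}
+</pre></div>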
+<h2 id="references">References</h2>
+<p>[1] <a href="http://en.wikipedia.org/wiki/Perceptron">Perceptron (Wikipedia)</a></p>
+<p>[2] <a href="http://en.wikipedia.org/wiki/Sigmoid_function">Sigmoid function (Wikipedia)</a></p>
+<p>[3] <a href="http://research.microsoft.com/en-us/um/people/cmbishop/prml/">Christopher M. Bishop: Pattern Recognition and Machine Learning, Springer 2009</a></p>