Posted to commits@mahout.apache.org by dl...@apache.org on 2014/05/20 00:01:28 UTC

svn commit: r1596072 - /mahout/site/mahout_cms/trunk/content/users/sparkbindings/play-with-shell.mdtext

Author: dlyubimov
Date: Mon May 19 22:01:28 2014
New Revision: 1596072

URL: http://svn.apache.org/r1596072
Log:
CMS commit to mahout by dlyubimov

Modified:
    mahout/site/mahout_cms/trunk/content/users/sparkbindings/play-with-shell.mdtext

Modified: mahout/site/mahout_cms/trunk/content/users/sparkbindings/play-with-shell.mdtext
URL: http://svn.apache.org/viewvc/mahout/site/mahout_cms/trunk/content/users/sparkbindings/play-with-shell.mdtext?rev=1596072&r1=1596071&r2=1596072&view=diff
==============================================================================
--- mahout/site/mahout_cms/trunk/content/users/sparkbindings/play-with-shell.mdtext (original)
+++ mahout/site/mahout_cms/trunk/content/users/sparkbindings/play-with-shell.mdtext Mon May 19 22:01:28 2014
@@ -67,7 +67,7 @@ val drmData = drmParallelize(dense(
 
 Have a look at this matrix. The first four columns represent the ingredients (our features) and the last column (the rating) is the target variable for our regression. [Linear regression](https://en.wikipedia.org/wiki/Linear_regression) assumes that the **target variable y** is generated by the linear combination of **the feature matrix X** with the **parameter vector β** plus the **noise ε**, summarized in the formula **y = Xβ + ε**. Our goal is to find an estimate of the parameter vector *β* that explains the data very well.
 
-As a first step, we extract *X* and *y* from our data matrix. We get *X* by slicing: we take all rows (denoted by ```::```) and the first four columns, which have the ingredients in milligrams as content. Note that the result is again a DRM. The shell will not execute this code yet, it saves the history of operations and defers the execution until we really access a result. **Mahout's DSL automatically optimizes and parallelizes all operations on DRMs and runs them on Apache Spark.**
+As a first step, we extract `\(\mathbf{X}\)' and *y* from our data matrix. We get *X* by slicing: we take all rows (denoted by ```::```) and the first four columns, which have the ingredients in milligrams as content. Note that the result is again a DRM. The shell will not execute this code yet, it saves the history of operations and defers the execution until we really access a result. **Mahout's DSL automatically optimizes and parallelizes all operations on DRMs and runs them on Apache Spark.**
 
 <div class="codehilite"><pre>
 val drmX = drmData(::, 0 until 4)
@@ -173,4 +173,3 @@ goodness
 
 
 Liked what you saw? Checkout Mahout's overview for the [Scala and Spark bindings](https://mahout.apache.org/users/sparkbindings/home.html).
-