Posted to commits@mahout.apache.org by gi...@apache.org on 2017/11/30 00:23:41 UTC

[3/4] mahout git commit: Automatic Site Publish by Buildbot

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/basics/collocations.html
----------------------------------------------------------------------
diff --git a/users/basics/collocations.html b/users/basics/collocations.html
index 5c6cd24..875b720 100644
--- a/users/basics/collocations.html
+++ b/users/basics/collocations.html
@@ -369,7 +369,7 @@ specified LLR score from being emitted, and the –minSupport argument can
 be used to filter out collocations that appear below a certain number of
 times.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bin/mahout seq2sparse
+<pre><code>bin/mahout seq2sparse
 
 Usage:									    
      [--minSupport &lt;minSupport&gt; --analyzerName &lt;analyzerName&gt; --chunkSize &lt;chunkSize&gt;
@@ -418,12 +418,12 @@ Options
   --sequentialAccessVector (-seq)     (Optional) Whether output vectors should	
 				      be SequentialAccessVectors If set true	
 				      else false 
-</code></pre></div></div>
+</code></pre>
 
 <p><a name="Collocations-CollocDriver"></a></p>
 <h3 id="collocdriver">CollocDriver</h3>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bin/mahout org.apache.mahout.vectorizer.collocations.llr.CollocDriver
+<pre><code>bin/mahout org.apache.mahout.vectorizer.collocations.llr.CollocDriver
 
 Usage:									    
  [--input &lt;input&gt; --output &lt;output&gt; --maxNGramSize &lt;ngramSize&gt; --overwrite    
@@ -462,7 +462,7 @@ Options
 				      final output alongside collocations
    
   --help (-h)			      Print out help	      
-</code></pre></div></div>
+</code></pre>
 
 <p><a name="Collocations-Algorithmdetails"></a></p>
 <h2 id="algorithm-details">Algorithm details</h2>
@@ -494,14 +494,14 @@ frequencies are collected across the entire document.</p>
 <p>Once this is done, ngrams are split into head and tail portions. A key of type GramKey is generated which is used later to join ngrams with their heads and tails in the reducer phase. The GramKey is a composite key made up of a string n-gram fragment as the primary key and a secondary key used for grouping and sorting in the reduce phase. The secondary key will either be EMPTY, in the case where we are collecting the head or tail of an ngram as the value, or it will contain the byte
 form of the ngram when collecting an ngram as the value.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>head_key(EMPTY) -&gt; (head subgram, head frequency)
+<pre><code>head_key(EMPTY) -&gt; (head subgram, head frequency)
 
 head_key(ngram) -&gt; (ngram, ngram frequency) 
 
 tail_key(EMPTY) -&gt; (tail subgram, tail frequency)
 
 tail_key(ngram) -&gt; (ngram, ngram frequency)
-</code></pre></div></div>
+</code></pre>
 
 <p>subgram and ngram values are packaged in Gram objects.</p>
 
@@ -543,7 +543,7 @@ or (subgram_key, ngram) tuple; one from each map task executed in which the
 particular subgram was found.
 The input will be traversed in the following order:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>(head subgram, frequency 1)
+<pre><code>(head subgram, frequency 1)
 (head subgram, frequency 2)
 ... 
 (head subgram, frequency N)
@@ -560,7 +560,7 @@ The input will be traversed in the following order:</p>
 (ngram N, frequency 2)
 ...
 (ngram N, frequency N)
-</code></pre></div></div>
+</code></pre>
 
 <p>Where all of the ngrams above share the same head. Data is presented in the
 same manner for the tail subgrams.</p>
@@ -574,18 +574,18 @@ be incremented.</p>
 
 <p>Pairs are passed to the collector in the following format:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ngram, ngram frequency -&gt; subgram subgram frequency
-</code></pre></div></div>
+<pre><code>ngram, ngram frequency -&gt; subgram, subgram frequency
+</code></pre>
 
 <p>In this manner, the output becomes an unsorted version of the following:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ngram 1, frequency -&gt; ngram 1 head, head frequency
+<pre><code>ngram 1, frequency -&gt; ngram 1 head, head frequency
 ngram 1, frequency -&gt; ngram 1 tail, tail frequency
 ngram 2, frequency -&gt; ngram 2 head, head frequency
 ngram 2, frequency -&gt; ngram 2 tail, tail frequency
 ngram N, frequency -&gt; ngram N head, head frequency
 ngram N, frequency -&gt; ngram N tail, tail frequency
-</code></pre></div></div>
+</code></pre>
 
 <p>Output is in the format k:Gram (ngram, frequency), v:Gram (subgram,
 frequency)</p>
@@ -610,11 +610,11 @@ the work for llr calculation is done in the reduce phase.</p>
 <p>This phase receives the head and tail subgrams and their frequencies for
 each ngram (with frequency) produced for the input:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>ngram 1, frequency -&gt; ngram 1 head, frequency; ngram 1 tail, frequency
+<pre><code>ngram 1, frequency -&gt; ngram 1 head, frequency; ngram 1 tail, frequency
 ngram 2, frequency -&gt; ngram 2 head, frequency; ngram 2 tail, frequency
 ...
 ngram N, frequency -&gt; ngram N head, frequency; ngram N tail, frequency
-</code></pre></div></div>
+</code></pre>
 
 <p>It also reads the full ngram count obtained from the first pass, passed in
 as a configuration option. The parameters to the llr calculation are

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/basics/creating-vectors-from-text.html
----------------------------------------------------------------------
diff --git a/users/basics/creating-vectors-from-text.html b/users/basics/creating-vectors-from-text.html
index ecd9b1e..1dfb217 100644
--- a/users/basics/creating-vectors-from-text.html
+++ b/users/basics/creating-vectors-from-text.html
@@ -310,7 +310,7 @@ option.  Examples of running the driver are included below:</p>
 <p><a name="CreatingVectorsfromText-GeneratinganoutputfilefromaLuceneIndex"></a></p>
 <h4 id="generating-an-output-file-from-a-lucene-index">Generating an output file from a Lucene Index</h4>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$MAHOUT_HOME/bin/mahout lucene.vector 
+<pre><code>$MAHOUT_HOME/bin/mahout lucene.vector 
     --dir (-d) dir                     The Lucene directory      
     --idField idField                  The field in the index    
                                            containing the index.  If 
@@ -362,17 +362,17 @@ option.  Examples of running the driver are included below:</p>
                                            percentage is expressed   
                                            as a value between 0 and  
                                            1. The default is 0.  
-</code></pre></div></div>
+</code></pre>
 
 <h4 id="example-create-50-vectors-from-an-index">Example: Create 50 Vectors from an Index</h4>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$MAHOUT_HOME/bin/mahout lucene.vector
+<pre><code>$MAHOUT_HOME/bin/mahout lucene.vector
     --dir $WORK_DIR/wikipedia/solr/data/index 
     --field body 
     --dictOut $WORK_DIR/solr/wikipedia/dict.txt
     --output $WORK_DIR/solr/wikipedia/out.txt 
     --max 50
-</code></pre></div></div>
+</code></pre>
 
 <p>This uses the index specified by --dir and the body field in it and writes
 out the info to the output dir and the dictionary to dict.txt.	It only
@@ -382,14 +382,14 @@ the index are output.</p>
 <p><a name="CreatingVectorsfromText-50VectorsFromLuceneL2Norm"></a></p>
 <h4 id="example-creating-50-normalized-vectors-from-a-lucene-index-using-the-l_2-norm">Example: Creating 50 Normalized Vectors from a Lucene Index using the <a href="http://en.wikipedia.org/wiki/Lp_space">L_2 Norm</a></h4>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$MAHOUT_HOME/bin/mahout lucene.vector 
+<pre><code>$MAHOUT_HOME/bin/mahout lucene.vector 
     --dir $WORK_DIR/wikipedia/solr/data/index 
     --field body 
     --dictOut $WORK_DIR/solr/wikipedia/dict.txt
     --output $WORK_DIR/solr/wikipedia/out.txt 
     --max 50 
     --norm 2
-</code></pre></div></div>
+</code></pre>
 
 <p><a name="CreatingVectorsfromText-FromDirectoryofTextdocuments"></a></p>
 <h2 id="from-a-directory-of-text-documents">From A Directory of Text documents</h2>
@@ -408,7 +408,7 @@ binary documents to text.</p>
 <p>Mahout has a nifty utility which reads a directory path including its
 sub-directories and creates the SequenceFile in a chunked manner for us.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$MAHOUT_HOME/bin/mahout seqdirectory 
+<pre><code>$MAHOUT_HOME/bin/mahout seqdirectory 
     --input (-i) input                       Path to job input directory.   
     --output (-o) output                     The directory pathname for     
                                                  output.                        
@@ -438,7 +438,7 @@ sub-directories and creates the SequenceFile in a chunked manner for us.</p>
     --tempDir tempDir                        Intermediate output directory  
     --startPhase startPhase                  First phase to run             
     --endPhase endPhase                      Last phase to run  
-</code></pre></div></div>
+</code></pre>
 
 <p>The output of seqDirectory will be a Sequence file &lt; Text, Text &gt; of all documents (/sub-directory-path/documentFileName, documentText).</p>
 
@@ -448,7 +448,7 @@ sub-directories and creates the SequenceFile in a chunked manner for us.</p>
 <p>From the sequence file generated from the above step run the following to
 generate vectors.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$MAHOUT_HOME/bin/mahout seq2sparse
+<pre><code>$MAHOUT_HOME/bin/mahout seq2sparse
     --minSupport (-s) minSupport      (Optional) Minimum Support. Default       
                                           Value: 2                                  
     --analyzerName (-a) analyzerName  The class name of the analyzer            
@@ -497,7 +497,7 @@ generate vectors.</p>
                                           be NamedVectors. If set true else false   
     --logNormalize (-lnorm)           (Optional) Whether output vectors should  
                                           be logNormalize. If set true else false
-</code></pre></div></div>
+</code></pre>
 
 <p>This will create SequenceFiles of tokenized documents &lt; Text, StringTuple &gt;  (docID, tokenizedDoc) and vectorized documents &lt; Text, VectorWritable &gt; (docID, TF-IDF Vector).</p>
 
@@ -510,17 +510,17 @@ generate vectors.</p>
 <h4 id="example-creating-normalized-tf-idf-vectors-from-a-directory-of-text-documents-using-trigrams-and-the-l_2-norm">Example: Creating Normalized <a href="http://en.wikipedia.org/wiki/Tf%E2%80%93idf">TF-IDF</a> Vectors from a directory of text documents using <a href="http://en.wikipedia.org/wiki/N-gram">trigrams</a> and the <a href="http://en.wikipedia.org/wiki/Lp_space">L_2 Norm</a></h4>
 <p>Create sequence files from the directory of text documents:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$MAHOUT_HOME/bin/mahout seqdirectory 
+<pre><code>$MAHOUT_HOME/bin/mahout seqdirectory 
     -i $WORK_DIR/reuters 
     -o $WORK_DIR/reuters-seqdir 
     -c UTF-8
     -chunk 64
     -xm sequential
-</code></pre></div></div>
+</code></pre>
 
 <p>Vectorize the documents using trigrams, L_2 length normalization and a maximum document frequency cutoff of 85%.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$MAHOUT_HOME/bin/mahout seq2sparse 
+<pre><code>$MAHOUT_HOME/bin/mahout seq2sparse 
     -i $WORK_DIR/reuters-out-seqdir/ 
     -o $WORK_DIR/reuters-out-seqdir-sparse-kmeans 
     --namedVec
@@ -528,7 +528,7 @@ generate vectors.</p>
     -ng 3
     -n 2
     --maxDFPercent 85 
-</code></pre></div></div>
+</code></pre>
 
 <p>The sequence file in the $WORK_DIR/reuters-out-seqdir-sparse-kmeans/tfidf-vectors directory can now be used as input to the Mahout <a href="http://mahout.apache.org/users/clustering/k-means-clustering.html">k-Means</a> clustering algorithm.</p>
 
@@ -549,14 +549,14 @@ format. Probably the easiest way to go would be to implement your own
 Iterable&lt;Vector&gt; (called VectorIterable in the example below) and then
 reuse the existing VectorWriter classes:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>VectorWriter vectorWriter = SequenceFile.createWriter(filesystem,
+<pre><code>VectorWriter vectorWriter = SequenceFile.createWriter(filesystem,
                                                       configuration,
                                                       outfile,
                                                       LongWritable.class,
                                                       SparseVector.class);
 
 long numDocs = vectorWriter.write(new VectorIterable(), Long.MAX_VALUE);
-</code></pre></div></div>
+</code></pre>
 
 
    </div>

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/basics/quickstart.html
----------------------------------------------------------------------
diff --git a/users/basics/quickstart.html b/users/basics/quickstart.html
index 6d8a4c0..b6f689d 100644
--- a/users/basics/quickstart.html
+++ b/users/basics/quickstart.html
@@ -287,12 +287,12 @@
 <p>Mahout is also available via a <a href="http://mvnrepository.com/artifact/org.apache.mahout">maven repository</a> under the group id <em>org.apache.mahout</em>.
 If you would like to import the latest release of mahout into a java project, add the following dependency in your <em>pom.xml</em>:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>&lt;dependency&gt;
+<pre><code>&lt;dependency&gt;
     &lt;groupId&gt;org.apache.mahout&lt;/groupId&gt;
     &lt;artifactId&gt;mahout-mr&lt;/artifactId&gt;
     &lt;version&gt;0.10.0&lt;/version&gt;
 &lt;/dependency&gt;
-</code></pre></div></div>
+</code></pre>
 
 <h2 id="features">Features</h2>
 

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/classification/bayesian-commandline.html
----------------------------------------------------------------------
diff --git a/users/classification/bayesian-commandline.html b/users/classification/bayesian-commandline.html
index 6039cfd..ffeea8b 100644
--- a/users/classification/bayesian-commandline.html
+++ b/users/classification/bayesian-commandline.html
@@ -288,14 +288,14 @@ complementary naive bayesian classification algorithms on a Hadoop cluster.</p>
 
 <p>In the examples directory type:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mvn -q exec:java
+<pre><code>mvn -q exec:java
     -Dexec.mainClass="org.apache.mahout.classifier.bayes.mapreduce.bayes.&lt;JOB&gt;"
     -Dexec.args="&lt;OPTIONS&gt;"
 
 mvn -q exec:java
     -Dexec.mainClass="org.apache.mahout.classifier.bayes.mapreduce.cbayes.&lt;JOB&gt;"
     -Dexec.args="&lt;OPTIONS&gt;"
-</code></pre></div></div>
+</code></pre>
 
 <p><a name="bayesian-commandline-Runningitonthecluster"></a></p>
 <h3 id="running-it-on-the-cluster">Running it on the cluster</h3>
@@ -328,7 +328,7 @@ to view all outputs.</p>
 <p><a name="bayesian-commandline-Commandlineoptions"></a></p>
 <h2 id="command-line-options">Command line options</h2>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>BayesDriver, BayesThetaNormalizerDriver, CBayesNormalizedWeightDriver, CBayesDriver, CBayesThetaDriver, CBayesThetaNormalizerDriver, BayesWeightSummerDriver, BayesFeatureDriver, BayesTfIdfDriver Usage:
+<pre><code>BayesDriver, BayesThetaNormalizerDriver, CBayesNormalizedWeightDriver, CBayesDriver, CBayesThetaDriver, CBayesThetaNormalizerDriver, BayesWeightSummerDriver, BayesFeatureDriver, BayesTfIdfDriver Usage:
     [--input &lt;input&gt; --output &lt;output&gt; --help]
   
 Options
@@ -336,7 +336,7 @@ Options
   --input (-i) input	  The Path for input Vectors. Must be a SequenceFile of Writable, Vector.
   --output (-o) output	  The directory pathname for output points.
   --help (-h)		  Print out help.
-</code></pre></div></div>
+</code></pre>
 
 
    </div>

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/classification/bayesian.html
----------------------------------------------------------------------
diff --git a/users/classification/bayesian.html b/users/classification/bayesian.html
index 128e658..22c48df 100644
--- a/users/classification/bayesian.html
+++ b/users/classification/bayesian.html
@@ -288,38 +288,38 @@
 <p>As described in <a href="http://people.csail.mit.edu/jrennie/papers/icml03-nb.pdf">[1]</a> Mahout Naive Bayes is broken down into the following steps (assignments are over all possible index values):</p>
 
 <ul>
-  <li>Let <code class="highlighter-rouge">\(\vec{d}=(\vec{d_1},...,\vec{d_n})\)</code> be a set of documents; <code class="highlighter-rouge">\(d_{ij}\)</code> is the count of word <code class="highlighter-rouge">\(i\)</code> in document <code class="highlighter-rouge">\(j\)</code>.</li>
-  <li>Let <code class="highlighter-rouge">\(\vec{y}=(y_1,...,y_n)\)</code> be their labels.</li>
-  <li>Let <code class="highlighter-rouge">\(\alpha_i\)</code> be a smoothing parameter for all words in the vocabulary; let <code class="highlighter-rouge">\(\alpha=\sum_i{\alpha_i}\)</code>.</li>
-  <li><strong>Preprocessing</strong>(via seq2Sparse) TF-IDF transformation and L2 length normalization of <code class="highlighter-rouge">\(\vec{d}\)</code>
+  <li>Let <code>\(\vec{d}=(\vec{d_1},...,\vec{d_n})\)</code> be a set of documents; <code>\(d_{ij}\)</code> is the count of word <code>\(i\)</code> in document <code>\(j\)</code>.</li>
+  <li>Let <code>\(\vec{y}=(y_1,...,y_n)\)</code> be their labels.</li>
+  <li>Let <code>\(\alpha_i\)</code> be a smoothing parameter for all words in the vocabulary; let <code>\(\alpha=\sum_i{\alpha_i}\)</code>.</li>
+  <li><strong>Preprocessing</strong>(via seq2Sparse) TF-IDF transformation and L2 length normalization of <code>\(\vec{d}\)</code>
     <ol>
-      <li><code class="highlighter-rouge">\(d_{ij} = \sqrt{d_{ij}}\)</code></li>
-      <li><code class="highlighter-rouge">\(d_{ij} = d_{ij}\left(\log{\frac{\sum_k1}{\sum_k\delta_{ik}+1}}+1\right)\)</code></li>
-      <li><code class="highlighter-rouge">\(d_{ij} =\frac{d_{ij}}{\sqrt{\sum_k{d_{kj}^2}}}\)</code></li>
+      <li><code>\(d_{ij} = \sqrt{d_{ij}}\)</code></li>
+      <li><code>\(d_{ij} = d_{ij}\left(\log{\frac{\sum_k1}{\sum_k\delta_{ik}+1}}+1\right)\)</code></li>
+      <li><code>\(d_{ij} =\frac{d_{ij}}{\sqrt{\sum_k{d_{kj}^2}}}\)</code></li>
     </ol>
   </li>
-  <li><strong>Training: Bayes</strong><code class="highlighter-rouge">\((\vec{d},\vec{y})\)</code> calculate term weights <code class="highlighter-rouge">\(w_{ci}\)</code> as:
+  <li><strong>Training: Bayes</strong><code>\((\vec{d},\vec{y})\)</code> calculate term weights <code>\(w_{ci}\)</code> as:
     <ol>
-      <li><code class="highlighter-rouge">\(\hat\theta_{ci}=\frac{d_{ic}+\alpha_i}{\sum_k{d_{kc}}+\alpha}\)</code></li>
-      <li><code class="highlighter-rouge">\(w_{ci}=\log{\hat\theta_{ci}}\)</code></li>
+      <li><code>\(\hat\theta_{ci}=\frac{d_{ic}+\alpha_i}{\sum_k{d_{kc}}+\alpha}\)</code></li>
+      <li><code>\(w_{ci}=\log{\hat\theta_{ci}}\)</code></li>
     </ol>
   </li>
-  <li><strong>Training: CBayes</strong><code class="highlighter-rouge">\((\vec{d},\vec{y})\)</code> calculate term weights <code class="highlighter-rouge">\(w_{ci}\)</code> as:
+  <li><strong>Training: CBayes</strong><code>\((\vec{d},\vec{y})\)</code> calculate term weights <code>\(w_{ci}\)</code> as:
     <ol>
-      <li><code class="highlighter-rouge">\(\hat\theta_{ci} = \frac{\sum_{j:y_j\neq c}d_{ij}+\alpha_i}{\sum_{j:y_j\neq c}{\sum_k{d_{kj}}}+\alpha}\)</code></li>
-      <li><code class="highlighter-rouge">\(w_{ci}=-\log{\hat\theta_{ci}}\)</code></li>
-      <li><code class="highlighter-rouge">\(w_{ci}=\frac{w_{ci}}{\sum_i \lvert w_{ci}\rvert}\)</code></li>
+      <li><code>\(\hat\theta_{ci} = \frac{\sum_{j:y_j\neq c}d_{ij}+\alpha_i}{\sum_{j:y_j\neq c}{\sum_k{d_{kj}}}+\alpha}\)</code></li>
+      <li><code>\(w_{ci}=-\log{\hat\theta_{ci}}\)</code></li>
+      <li><code>\(w_{ci}=\frac{w_{ci}}{\sum_i \lvert w_{ci}\rvert}\)</code></li>
     </ol>
   </li>
   <li><strong>Label Assignment/Testing:</strong>
     <ol>
-      <li>Let <code class="highlighter-rouge">\(\vec{t}= (t_1,...,t_n)\)</code> be a test document; let <code class="highlighter-rouge">\(t_i\)</code> be the count of the word <code class="highlighter-rouge">\(t\)</code>.</li>
-      <li>Label the document according to <code class="highlighter-rouge">\(l(t)=\arg\max_c \sum\limits_{i} t_i w_{ci}\)</code></li>
+      <li>Let <code>\(\vec{t}= (t_1,...,t_n)\)</code> be a test document; let <code>\(t_i\)</code> be the count of the word <code>\(t\)</code>.</li>
+      <li>Label the document according to <code>\(l(t)=\arg\max_c \sum\limits_{i} t_i w_{ci}\)</code></li>
     </ol>
   </li>
 </ul>
 
-<p>As we can see, the main difference between Bayes and CBayes is the weight calculation step.  Where Bayes weighs terms more heavily based on the likelihood that they belong to class <code class="highlighter-rouge">\(c\)</code>, CBayes seeks to maximize term weights on the likelihood that they do not belong to any other class.</p>
+<p>As we can see, the main difference between Bayes and CBayes is the weight calculation step.  Where Bayes weighs terms more heavily based on the likelihood that they belong to class <code>\(c\)</code>, CBayes seeks to maximize term weights on the likelihood that they do not belong to any other class.</p>
 
 <h2 id="running-from-the-command-line">Running from the command line</h2>
 
@@ -330,31 +330,31 @@
     <p><strong>Preprocessing:</strong>
 For a set of Sequence File Formatted documents in PATH_TO_SEQUENCE_FILES the <a href="https://mahout.apache.org/users/basics/creating-vectors-from-text.html">mahout seq2sparse</a> command performs the TF-IDF transformations (-wt tfidf option) and L2 length normalization (-n 2 option) as follows:</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  mahout seq2sparse 
+    <pre><code>  mahout seq2sparse 
     -i ${PATH_TO_SEQUENCE_FILES} 
     -o ${PATH_TO_TFIDF_VECTORS} 
     -nv 
     -n 2
     -wt tfidf
-</code></pre></div>    </div>
+</code></pre>
   </li>
   <li>
     <p><strong>Training:</strong>
-The model is then trained using <code class="highlighter-rouge">mahout trainnb</code> .  The default is to train a Bayes model. The -c option is given to train a CBayes model:</p>
+The model is then trained using <code>mahout trainnb</code>.  The default is to train a Bayes model. The -c option is given to train a CBayes model:</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  mahout trainnb
+    <pre><code>  mahout trainnb
     -i ${PATH_TO_TFIDF_VECTORS} 
     -o ${PATH_TO_MODEL}/model 
     -li ${PATH_TO_MODEL}/labelindex 
     -ow 
     -c
-</code></pre></div>    </div>
+</code></pre>
   </li>
   <li>
     <p><strong>Label Assignment/Testing:</strong>
-Classification and testing on a holdout set can then be performed via <code class="highlighter-rouge">mahout testnb</code>. Again, the -c option indicates that the model is CBayes.  The -seq option tells <code class="highlighter-rouge">mahout testnb</code> to run sequentially:</p>
+Classification and testing on a holdout set can then be performed via <code>mahout testnb</code>. Again, the -c option indicates that the model is CBayes.  The -seq option tells <code>mahout testnb</code> to run sequentially:</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  mahout testnb 
+    <pre><code>  mahout testnb 
     -i ${PATH_TO_TFIDF_TEST_VECTORS}
     -m ${PATH_TO_MODEL}/model 
     -l ${PATH_TO_MODEL}/labelindex 
@@ -362,7 +362,7 @@ Classification and testing on a holdout set can then be performed via <code clas
     -o ${PATH_TO_OUTPUT} 
     -c 
     -seq
-</code></pre></div>    </div>
+</code></pre>
   </li>
 </ul>
 
@@ -372,9 +372,9 @@ Classification and testing on a holdout set can then be performed via <code clas
   <li>
     <p><strong>Preprocessing:</strong></p>
 
-    <p>Only relevant parameters used for Bayes/CBayes as detailed above are shown. Several other transformations can be performed by <code class="highlighter-rouge">mahout seq2sparse</code> and used as input to Bayes/CBayes.  For a full list of <code class="highlighter-rouge">mahout seq2Sparse</code> options see the <a href="https://mahout.apache.org/users/basics/creating-vectors-from-text.html">Creating vectors from text</a> page.</p>
+    <p>Only relevant parameters used for Bayes/CBayes as detailed above are shown. Several other transformations can be performed by <code>mahout seq2sparse</code> and used as input to Bayes/CBayes.  For a full list of <code>mahout seq2Sparse</code> options see the <a href="https://mahout.apache.org/users/basics/creating-vectors-from-text.html">Creating vectors from text</a> page.</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  mahout seq2sparse                         
+    <pre><code>  mahout seq2sparse                         
     --output (-o) output             The directory pathname for output.        
     --input (-i) input               Path to job input directory.              
     --weight (-wt) weight            The kind of weight to use. Currently TF   
@@ -389,12 +389,12 @@ Classification and testing on a holdout set can then be performed via <code clas
                                          else false                                
     --namedVector (-nv)              (Optional) Whether output vectors should  
                                          be NamedVectors. If set true else false   
-</code></pre></div>    </div>
+</code></pre>
   </li>
   <li>
     <p><strong>Training:</strong></p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  mahout trainnb
+    <pre><code>  mahout trainnb
     --input (-i) input               Path to job input directory.                 
     --output (-o) output             The directory pathname for output.                    
     --alphaI (-a) alphaI             Smoothing parameter. Default is 1.0
@@ -406,12 +406,12 @@ Classification and testing on a holdout set can then be performed via <code clas
     --tempDir tempDir                Intermediate output directory                
     --startPhase startPhase          First phase to run                           
     --endPhase endPhase              Last phase to run
-</code></pre></div>    </div>
+</code></pre>
   </li>
   <li>
     <p><strong>Testing:</strong></p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  mahout testnb   
+    <pre><code>  mahout testnb   
     --input (-i) input               Path to job input directory.                  
     --output (-o) output             The directory pathname for output.            
     --overwrite (-ow)                If present, overwrite the output directory    
@@ -426,7 +426,7 @@ Classification and testing on a holdout set can then be performed via <code clas
     --tempDir tempDir                Intermediate output directory                 
     --startPhase startPhase          First phase to run                            
     --endPhase endPhase              Last phase to run  
-</code></pre></div>    </div>
+</code></pre>
   </li>
 </ul>
 

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/classification/breiman-example.html
----------------------------------------------------------------------
diff --git a/users/classification/breiman-example.html b/users/classification/breiman-example.html
index 8d1a60f..c239bd7 100644
--- a/users/classification/breiman-example.html
+++ b/users/classification/breiman-example.html
@@ -300,8 +300,8 @@ results to greater values of <em>m</em></li>
 
 <p>First, we deal with <a href="http://archive.ics.uci.edu/ml/datasets/Glass+Identification">Glass Identification</a>: download the <a href="http://archive.ics.uci.edu/ml/machine-learning-databases/glass/glass.data">dataset</a> file called <strong>glass.data</strong> and store it onto your local machine. Next, we must generate the descriptor file <strong>glass.info</strong> for this dataset with the following command:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bin/mahout org.apache.mahout.classifier.df.tools.Describe -p /path/to/glass.data -f /path/to/glass.info -d I 9 N L
-</code></pre></div></div>
+<pre><code>bin/mahout org.apache.mahout.classifier.df.tools.Describe -p /path/to/glass.data -f /path/to/glass.info -d I 9 N L
+</code></pre>
 
 <p>Substitute <em>/path/to/</em> with the folder where you downloaded the dataset; the argument “I 9 N L” indicates the nature of the variables. Here it means 1
 ignored (I) attribute, followed by 9 numerical(N) attributes, followed by
@@ -309,8 +309,8 @@ the label (L).</p>
 
 <p>Finally, we build and evaluate our random forest classifier as follows:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bin/mahout org.apache.mahout.classifier.df.BreimanExample -d /path/to/glass.data -ds /path/to/glass.info -i 10 -t 100 which builds 100 trees (-t argument) and repeats the test 10 iterations (-i argument) 
-</code></pre></div></div>
+<pre><code>bin/mahout org.apache.mahout.classifier.df.BreimanExample -d /path/to/glass.data -ds /path/to/glass.info -i 10 -t 100
+</code></pre>
 
 <p>This builds 100 trees (-t argument) and repeats the test over 10 iterations (-i argument). The example outputs the following results:</p>
 
@@ -327,13 +327,13 @@ iterations</li>
 
 <p>We can repeat this for a <a href="http://archive.ics.uci.edu/ml/datasets/Connectionist+Bench+%28Sonar,+Mines+vs.+Rocks%29">Sonar</a> use case: download the <a href="http://archive.ics.uci.edu/ml/machine-learning-databases/undocumented/connectionist-bench/sonar/sonar.all-data">dataset</a> file called <strong>sonar.all-data</strong> and store it onto your local machine. Generate the descriptor file <strong>sonar.info</strong> for this dataset with the following command:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bin/mahout org.apache.mahout.classifier.df.tools.Describe -p /path/to/sonar.all-data -f /path/to/sonar.info -d 60 N L
-</code></pre></div></div>
+<pre><code>bin/mahout org.apache.mahout.classifier.df.tools.Describe -p /path/to/sonar.all-data -f /path/to/sonar.info -d 60 N L
+</code></pre>
 
 <p>The argument “60 N L” means 60 numerical(N) attributes, followed by the label (L). Analogous to the previous case, we run the evaluation as follows:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bin/mahout org.apache.mahout.classifier.df.BreimanExample -d /path/to/sonar.all-data -ds /path/to/sonar.info -i 10 -t 100
-</code></pre></div></div>
+<pre><code>bin/mahout org.apache.mahout.classifier.df.BreimanExample -d /path/to/sonar.all-data -ds /path/to/sonar.info -i 10 -t 100
+</code></pre>
 
 
    </div>

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/classification/class-discovery.html
----------------------------------------------------------------------
diff --git a/users/classification/class-discovery.html b/users/classification/class-discovery.html
index 9dcfe83..20f30fc 100644
--- a/users/classification/class-discovery.html
+++ b/users/classification/class-discovery.html
@@ -304,13 +304,13 @@ A classification rule can be represented as follows:</p>
 <p>For a given <em>target</em> class and a weight <em>threshold</em>, the classification
 rule can be read:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>for each row of the dataset
+<pre><code>for each row of the dataset
   if (rule.w1 &lt; threshold || (rule.w1 &gt;= threshold &amp;&amp; row.value1 rule.op1 rule.value1)) &amp;&amp;
      (rule.w2 &lt; threshold || (rule.w2 &gt;= threshold &amp;&amp; row.value2 rule.op2 rule.value2)) &amp;&amp;
      ...
      (rule.wN &lt; threshold || (rule.wN &gt;= threshold &amp;&amp; row.valueN rule.opN rule.valueN)) then
     row is part of the target class
-</code></pre></div></div>
+</code></pre>
 
 <p><em>Important:</em> The label attribute is not evaluated by the rule.</p>
 
@@ -344,11 +344,11 @@ and the following parameters: threshold = 1 and target = 0 (brown).
 
 <p>This rule can be read as follows:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>for each row of the dataset
+<pre><code>for each row of the dataset
   if (0 &lt; 1 || (0 &gt;= 1 &amp;&amp; row.value1 &lt; 20)) &amp;&amp;
      (1 &lt; 1 || (1 &gt;= 1 &amp;&amp; row.value2 != light)) then
     row is part of the "brown Eye Color" class
-</code></pre></div></div>
+</code></pre>
 
 <p>Please note how the rule skipped the label attribute (Eye Color), and how
 the first condition is ignored because its weight is &lt; threshold.</p>

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/classification/hidden-markov-models.html
----------------------------------------------------------------------
diff --git a/users/classification/hidden-markov-models.html b/users/classification/hidden-markov-models.html
index 1a84234..6f4fe33 100644
--- a/users/classification/hidden-markov-models.html
+++ b/users/classification/hidden-markov-models.html
@@ -330,18 +330,18 @@ can be efficiently solved using the Baum-Welch algorithm.</li>
 
 <p>Create an input file to train the model.  Here we have a sequence drawn from the set of states 0, 1, 2, and 3, separated by space characters.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ echo "0 1 2 2 2 1 1 0 0 3 3 3 2 1 2 1 1 1 1 2 2 2 0 0 0 0 0 0 2 2 2 0 0 0 0 0 0 2 2 2 3 3 3 3 3 3 2 3 2 3 2 3 2 1 3 0 0 0 1 0 1 0 2 1 2 1 2 1 2 3 3 3 3 2 2 3 2 1 1 0" &gt; hmm-input
-</code></pre></div></div>
+<pre><code>$ echo "0 1 2 2 2 1 1 0 0 3 3 3 2 1 2 1 1 1 1 2 2 2 0 0 0 0 0 0 2 2 2 0 0 0 0 0 0 2 2 2 3 3 3 3 3 3 2 3 2 3 2 3 2 1 3 0 0 0 1 0 1 0 2 1 2 1 2 1 2 3 3 3 3 2 2 3 2 1 1 0" &gt; hmm-input
+</code></pre>
 
 <p>Now run the baumwelch job to train your model, after first setting MAHOUT_LOCAL to true, to use your local file system.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ export MAHOUT_LOCAL=true
+<pre><code>$ export MAHOUT_LOCAL=true
 $ $MAHOUT_HOME/bin/mahout baumwelch -i hmm-input -o hmm-model -nh 3 -no 4 -e .0001 -m 1000
-</code></pre></div></div>
+</code></pre>
 
 <p>Output like the following should appear in the console.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Initial probabilities: 
+<pre><code>Initial probabilities: 
 0 1 2 
 1.0 0.0 3.5659361683006626E-251 
 Transition matrix:
@@ -355,18 +355,18 @@ Emission matrix:
 1 7.495656581383351E-34 0.2241269055449904 0.4510889999455847 0.32478409450942497 
 2 0.815051477991782 0.18494852200821799 8.465660634827592E-33 2.8603899591778015E-36 
 14/03/22 09:52:21 INFO driver.MahoutDriver: Program took 180 ms (Minutes: 0.003)
-</code></pre></div></div>
+</code></pre>
 
 <p>The model trained with the input set is now in the file ‘hmm-model’, which we can use to build a predicted sequence.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ $MAHOUT_HOME/bin/mahout hmmpredict -m hmm-model -o hmm-predictions -l 10
-</code></pre></div></div>
+<pre><code>$ $MAHOUT_HOME/bin/mahout hmmpredict -m hmm-model -o hmm-predictions -l 10
+</code></pre>
 
 <p>To see the predictions:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ cat hmm-predictions 
+<pre><code>$ cat hmm-predictions 
 0 1 3 3 2 2 2 2 1 2
-</code></pre></div></div>
+</code></pre>
 
 <p><a name="HiddenMarkovModels-Resources"></a></p>
 <h2 id="resources">Resources</h2>

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/classification/mlp.html
----------------------------------------------------------------------
diff --git a/users/classification/mlp.html b/users/classification/mlp.html
index 5283911..4983775 100644
--- a/users/classification/mlp.html
+++ b/users/classification/mlp.html
@@ -285,9 +285,9 @@ can be used for classification and regression tasks in a supervised learning app
 can be used with the following commands:</p>
 
 <h1 id="model-training">model training</h1>
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ bin/mahout org.apache.mahout.classifier.mlp.TrainMultilayerPerceptron  # model usage
+<pre><code>$ bin/mahout org.apache.mahout.classifier.mlp.TrainMultilayerPerceptron
$ bin/mahout org.apache.mahout.classifier.mlp.RunMultilayerPerceptron  # model usage
-</code></pre></div></div>
+</code></pre>
 
 <p>To train and use the model, a number of parameters can be specified. Parameters without default values have to be specified by the user. Note that not all parameters can be used both for training and for running the model. We give an example of the usage below.</p>
 
@@ -336,7 +336,7 @@ $ bin/mahout org.apache.mahout.classifier.mlp.RunMultilayerPerceptron
     <tr>
       <td style="text-align: left">–layerSize -ls</td>
       <td style="text-align: right"> </td>
-      <td style="text-align: left">Number of units per layer, including input, hidden and ouput layers. This parameter specifies the topology of the network (see <a href="mlperceptron_structure.png" title="Architecture of a three-layer MLP">this image</a> for an example specified by <code class="highlighter-rouge">-ls 4 8 3</code>).</td>
+      <td style="text-align: left">Number of units per layer, including input, hidden and ouput layers. This parameter specifies the topology of the network (see <a href="mlperceptron_structure.png" title="Architecture of a three-layer MLP">this image</a> for an example specified by <code>-ls 4 8 3</code>).</td>
       <td style="text-align: left">training</td>
     </tr>
     <tr>
@@ -372,7 +372,7 @@ $ bin/mahout org.apache.mahout.classifier.mlp.RunMultilayerPerceptron
     <tr>
       <td style="text-align: left">–columnRange -cr</td>
       <td style="text-align: right"> </td>
-      <td style="text-align: left">Range of the columns to use from the input file, starting with 0 (i.e. <code class="highlighter-rouge">-cr 0 5</code> for including the first six columns only)</td>
+      <td style="text-align: left">Range of the columns to use from the input file, starting with 0 (i.e. <code>-cr 0 5</code> for including the first six columns only)</td>
       <td style="text-align: left">testing</td>
     </tr>
     <tr>
@@ -393,23 +393,23 @@ The dimensions of the data set are given through some flower parameters (sepal l
 
 <p>To train our multilayer perceptron model from the command line, we call the following command</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ bin/mahout org.apache.mahout.classifier.mlp.TrainMultilayerPerceptron \
+<pre><code>$ bin/mahout org.apache.mahout.classifier.mlp.TrainMultilayerPerceptron \
             -i ./mrlegacy/src/test/resources/iris.csv -sh \
             -labels setosa versicolor virginica \
             -mo /tmp/model.model -ls 4 8 3 -l 0.2 -m 0.35 -r 0.0001
-</code></pre></div></div>
+</code></pre>
 
 <p>The individual parameters are explained below.</p>
 
 <ul>
-  <li><code class="highlighter-rouge">-i ./mrlegacy/src/test/resources/iris.csv</code> use the iris data set as input data</li>
-  <li><code class="highlighter-rouge">-sh</code> since the file <code class="highlighter-rouge">iris.csv</code> contains a header row, this row needs to be skipped</li>
-  <li><code class="highlighter-rouge">-labels setosa versicolor virginica</code> we specify, which class labels should be learnt (which are the flower species in this case)</li>
-  <li><code class="highlighter-rouge">-mo /tmp/model.model</code> specify where to store the model file</li>
-  <li><code class="highlighter-rouge">-ls 4 8 3</code> we specify the structure and depth of our layers. The actual network structure can be seen in the figure below.</li>
-  <li><code class="highlighter-rouge">-l 0.2</code> we set the learning rate to <code class="highlighter-rouge">0.2</code></li>
-  <li><code class="highlighter-rouge">-m 0.35</code> momemtum weight is set to <code class="highlighter-rouge">0.35</code></li>
-  <li><code class="highlighter-rouge">-r 0.0001</code> regularization weight is set to <code class="highlighter-rouge">0.0001</code></li>
+  <li><code>-i ./mrlegacy/src/test/resources/iris.csv</code> use the iris data set as input data</li>
+  <li><code>-sh</code> since the file <code>iris.csv</code> contains a header row, this row needs to be skipped</li>
+  <li><code>-labels setosa versicolor virginica</code> we specify, which class labels should be learnt (which are the flower species in this case)</li>
+  <li><code>-mo /tmp/model.model</code> specify where to store the model file</li>
+  <li><code>-ls 4 8 3</code> we specify the structure and depth of our layers. The actual network structure can be seen in the figure below.</li>
+  <li><code>-l 0.2</code> we set the learning rate to <code>0.2</code></li>
+  <li><code>-m 0.35</code> momemtum weight is set to <code>0.35</code></li>
+  <li><code>-r 0.0001</code> regularization weight is set to <code>0.0001</code></li>
 </ul>
 
 <table>
@@ -431,19 +431,19 @@ The dimensions of the data set are given through some flower parameters (sepal l
 
 <p>To test / run the multilayer perceptron classification on the trained model, we can use the following command</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ bin/mahout org.apache.mahout.classifier.mlp.RunMultilayerPerceptron \
+<pre><code>$ bin/mahout org.apache.mahout.classifier.mlp.RunMultilayerPerceptron \
             -i ./mrlegacy/src/test/resources/iris.csv -sh -cr 0 3 \
             -mo /tmp/model.model -o /tmp/labelResult.txt
-</code></pre></div></div>
+</code></pre>
 
 <p>The individual parameters are explained below.</p>
 
 <ul>
-  <li><code class="highlighter-rouge">-i ./mrlegacy/src/test/resources/iris.csv</code> use the iris data set as input data</li>
-  <li><code class="highlighter-rouge">-sh</code> since the file <code class="highlighter-rouge">iris.csv</code> contains a header row, this row needs to be skipped</li>
-  <li><code class="highlighter-rouge">-cr 0 3</code> we specify the column range of the input file</li>
-  <li><code class="highlighter-rouge">-mo /tmp/model.model</code> specify where the model file is stored</li>
-  <li><code class="highlighter-rouge">-o /tmp/labelResult.txt</code> specify where the labeled output file will be stored</li>
+  <li><code>-i ./mrlegacy/src/test/resources/iris.csv</code> use the iris data set as input data</li>
+  <li><code>-sh</code> since the file <code>iris.csv</code> contains a header row, this row needs to be skipped</li>
+  <li><code>-cr 0 3</code> we specify the column range of the input file</li>
+  <li><code>-mo /tmp/model.model</code> specify where the model file is stored</li>
+  <li><code>-o /tmp/labelResult.txt</code> specify where the labeled output file will be stored</li>
 </ul>
 
 <h2 id="implementation">Implementation</h2>
@@ -460,7 +460,7 @@ Currently, the logistic sigmoid is used as a squashing function in every hidden
 
 <p>The command line version <strong>does not perform iterations</strong>, which leads to bad results on small datasets. Another restriction is that the CLI version of the MLP only supports classification, since the labels have to be given explicitly when executing on the command line.</p>
 
-<p>A learned model can be stored and updated with new training instanced using the <code class="highlighter-rouge">--update</code> flag. Output of classification reults is saved as a .txt-file and only consists of the assigned labels. Apart from the command-line interface, it is possible to construct and compile more specialized neural networks using the API and interfaces in the mrlegacy package.</p>
+<p>A learned model can be stored and updated with new training instances using the <code>--update</code> flag. The classification output is saved as a .txt file and consists only of the assigned labels. Apart from the command-line interface, it is possible to construct and compile more specialized neural networks using the API and interfaces in the mrlegacy package.</p>
 
 <h2 id="theoretical-background">Theoretical Background</h2>
 

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/classification/partial-implementation.html
----------------------------------------------------------------------
diff --git a/users/classification/partial-implementation.html b/users/classification/partial-implementation.html
index 5028896..6310eca 100644
--- a/users/classification/partial-implementation.html
+++ b/users/classification/partial-implementation.html
@@ -316,8 +316,8 @@ $HADOOP_HOME/bin/hadoop fs -put <PATH TO="" DATA=""> testdata{code}</PATH></li>
 <h2 id="generate-a-file-descriptor-for-the-dataset">Generate a file descriptor for the dataset:</h2>
 <p>run the following command:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$HADOOP_HOME/bin/hadoop jar $MAHOUT_HOME/core/target/mahout-core-&lt;VERSION&gt;-job.jar org.apache.mahout.classifier.df.tools.Describe -p testdata/KDDTrain+.arff -f testdata/KDDTrain+.info -d N 3 C 2 N C 4 N C 8 N 2 C 19 N L
-</code></pre></div></div>
+<pre><code>$HADOOP_HOME/bin/hadoop jar $MAHOUT_HOME/core/target/mahout-core-&lt;VERSION&gt;-job.jar org.apache.mahout.classifier.df.tools.Describe -p testdata/KDDTrain+.arff -f testdata/KDDTrain+.info -d N 3 C 2 N C 4 N C 8 N 2 C 19 N L
+</code></pre>
 
 <p>The “N 3 C 2 N C 4 N C 8 N 2 C 19 N L” string describes all the attributes
 of the data. In this case, it means 1 numerical(N) attribute, followed by
@@ -327,8 +327,8 @@ to ignore some attributes</p>
 <p><a name="PartialImplementation-Runtheexample"></a></p>
 <h2 id="run-the-example">Run the example</h2>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$HADOOP_HOME/bin/hadoop jar $MAHOUT_HOME/examples/target/mahout-examples-&lt;version&gt;-job.jar org.apache.mahout.classifier.df.mapreduce.BuildForest -Dmapred.max.split.size=1874231 -d testdata/KDDTrain+.arff -ds testdata/KDDTrain+.info -sl 5 -p -t 100 -o nsl-forest
-</code></pre></div></div>
+<pre><code>$HADOOP_HOME/bin/hadoop jar $MAHOUT_HOME/examples/target/mahout-examples-&lt;version&gt;-job.jar org.apache.mahout.classifier.df.mapreduce.BuildForest -Dmapred.max.split.size=1874231 -d testdata/KDDTrain+.arff -ds testdata/KDDTrain+.info -sl 5 -p -t 100 -o nsl-forest
+</code></pre>
 
 <p>which builds 100 trees (-t argument) using the partial implementation (-p).
 Each tree is built using 5 randomly selected attributes per node (-sl
@@ -356,8 +356,8 @@ nsl-forest/forest.seq</p>
 <h2 id="using-the-decision-forest-to-classify-new-data">Using the Decision Forest to Classify new data</h2>
 <p>run the following command:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$HADOOP_HOME/bin/hadoop jar $MAHOUT_HOME/examples/target/mahout-examples-&lt;version&gt;-job.jar org.apache.mahout.classifier.df.mapreduce.TestForest -i nsl-kdd/KDDTest+.arff -ds nsl-kdd/KDDTrain+.info -m nsl-forest -a -mr -o predictions
-</code></pre></div></div>
+<pre><code>$HADOOP_HOME/bin/hadoop jar $MAHOUT_HOME/examples/target/mahout-examples-&lt;version&gt;-job.jar org.apache.mahout.classifier.df.mapreduce.TestForest -i nsl-kdd/KDDTest+.arff -ds nsl-kdd/KDDTrain+.info -m nsl-forest -a -mr -o predictions
+</code></pre>
 
 <p>This will compute the predictions for the “KDDTest+.arff” dataset (-i argument)
 using the same data descriptor generated for the training dataset (-ds) and

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/classification/twenty-newsgroups.html
----------------------------------------------------------------------
diff --git a/users/classification/twenty-newsgroups.html b/users/classification/twenty-newsgroups.html
index 291719f..c671aab 100644
--- a/users/classification/twenty-newsgroups.html
+++ b/users/classification/twenty-newsgroups.html
@@ -307,35 +307,35 @@ the 20 newsgroups.</p>
   <li>
     <p>If running Hadoop in cluster mode, start the hadoop daemons by executing the following commands:</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>     $ cd $HADOOP_HOME/bin
+    <pre><code>     $ cd $HADOOP_HOME/bin
      $ ./start-all.sh
-</code></pre></div>    </div>
+</code></pre>
 
     <p>Otherwise:</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>     $ export MAHOUT_LOCAL=true
-</code></pre></div>    </div>
+    <pre><code>     $ export MAHOUT_LOCAL=true
+</code></pre>
   </li>
   <li>
     <p>In the trunk directory of Mahout, compile and install Mahout:</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>     $ cd $MAHOUT_HOME
+    <pre><code>     $ cd $MAHOUT_HOME
      $ mvn -DskipTests clean install
-</code></pre></div>    </div>
+</code></pre>
   </li>
   <li>
     <p>Run the <a href="https://github.com/apache/mahout/blob/master/examples/bin/classify-20newsgroups.sh">20 newsgroups example script</a> by executing:</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>     $ ./examples/bin/classify-20newsgroups.sh
-</code></pre></div>    </div>
+    <pre><code>     $ ./examples/bin/classify-20newsgroups.sh
+</code></pre>
   </li>
   <li>
     <p>You will be prompted to select a classification method algorithm:</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>     1. Complement Naive Bayes
+    <pre><code>     1. Complement Naive Bayes
      2. Naive Bayes
      3. Stochastic Gradient Descent
-</code></pre></div>    </div>
+</code></pre>
   </li>
 </ol>
 
@@ -353,7 +353,7 @@ the 20 newsgroups.</p>
 
 <p>Output should look something like:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>=======================================================
+<pre><code>=======================================================
 Confusion Matrix
 -------------------------------------------------------
  a  b  c  d  e  f  g  h  i  j  k  l  m  n  o  p  q  r  s  t &lt;--Classified as
@@ -384,7 +384,7 @@ Kappa                                       0.8808
 Accuracy                                   90.8596%
 Reliability                                86.3632%
 Reliability (standard deviation)            0.2131
-</code></pre></div></div>
+</code></pre>
 
 <p><a name="TwentyNewsgroups-ComplementaryNaiveBayes"></a></p>
 <h2 id="end-to-end-commands-to-build-a-cbayes-model-for-20-newsgroups">End to end commands to build a CBayes model for 20 newsgroups</h2>
@@ -396,14 +396,14 @@ Reliability (standard deviation)            0.2131
   <li>
     <p>Create a working directory for the dataset and all input/output.</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>     $ export WORK_DIR=/tmp/mahout-work-${USER}
+    <pre><code>     $ export WORK_DIR=/tmp/mahout-work-${USER}
      $ mkdir -p ${WORK_DIR}
-</code></pre></div>    </div>
+</code></pre>
   </li>
   <li>
     <p>Download and extract the <em>20news-bydate.tar.gz</em> from the <a href="http://people.csail.mit.edu/jrennie/20Newsgroups/20news-bydate.tar.gz">20newsgroups dataset</a> to the working directory.</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>     $ curl http://people.csail.mit.edu/jrennie/20Newsgroups/20news-bydate.tar.gz 
+    <pre><code>     $ curl http://people.csail.mit.edu/jrennie/20Newsgroups/20news-bydate.tar.gz 
          -o ${WORK_DIR}/20news-bydate.tar.gz
      $ mkdir -p ${WORK_DIR}/20news-bydate
      $ cd ${WORK_DIR}/20news-bydate &amp;&amp; tar xzf ../20news-bydate.tar.gz &amp;&amp; cd .. &amp;&amp; cd ..
@@ -411,62 +411,62 @@ Reliability (standard deviation)            0.2131
      $ cp -R ${WORK_DIR}/20news-bydate/*/* ${WORK_DIR}/20news-all   # If you're running on a Hadoop cluster:
  
      $ hadoop dfs -put ${WORK_DIR}/20news-all ${WORK_DIR}/20news-all
-</code></pre></div>    </div>
+</code></pre>
   </li>
   <li>
     <p>Convert the full 20 newsgroups dataset into a &lt; Text, Text &gt; SequenceFile.</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>     $ mahout seqdirectory 
+    <pre><code>     $ mahout seqdirectory 
          -i ${WORK_DIR}/20news-all 
          -o ${WORK_DIR}/20news-seq 
          -ow
-</code></pre></div>    </div>
+</code></pre>
   </li>
   <li>
 <p>Convert and preprocess the dataset into a &lt; Text, VectorWritable &gt; SequenceFile containing term frequencies for each document.</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>     $ mahout seq2sparse 
+    <pre><code>     $ mahout seq2sparse 
          -i ${WORK_DIR}/20news-seq 
          -o ${WORK_DIR}/20news-vectors
          -lnorm 
          -nv 
          -wt tfidf   # If we wanted to use different parsing methods or transformations on the term frequency vectors we could supply different options here, e.g. -ng 2 for bigrams or -n 2 for L2 length normalization.  See the Creating vectors from text page (http://mahout.apache.org/users/basics/creating-vectors-from-text.html) for a list of all seq2sparse options.
-</code></pre></div>    </div>
+</code></pre>
   </li>
   <li>
     <p>Split the preprocessed dataset into training and testing sets.</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>     $ mahout split 
+    <pre><code>     $ mahout split 
          -i ${WORK_DIR}/20news-vectors/tfidf-vectors 
          --trainingOutput ${WORK_DIR}/20news-train-vectors 
          --testOutput ${WORK_DIR}/20news-test-vectors  
          --randomSelectionPct 40 
          --overwrite --sequenceFiles -xm sequential
-</code></pre></div>    </div>
+</code></pre>
   </li>
   <li>
     <p>Train the classifier.</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>     $ mahout trainnb 
+    <pre><code>     $ mahout trainnb 
          -i ${WORK_DIR}/20news-train-vectors
          -el  
          -o ${WORK_DIR}/model 
          -li ${WORK_DIR}/labelindex 
          -ow 
          -c
-</code></pre></div>    </div>
+</code></pre>
   </li>
   <li>
     <p>Test the classifier.</p>
 
-    <div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>     $ mahout testnb 
+    <pre><code>     $ mahout testnb 
          -i ${WORK_DIR}/20news-test-vectors
          -m ${WORK_DIR}/model 
          -l ${WORK_DIR}/labelindex 
          -ow 
          -o ${WORK_DIR}/20news-testing 
          -c
-</code></pre></div>    </div>
+</code></pre>
   </li>
 </ol>
 

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/classification/wikipedia-classifier-example.html
----------------------------------------------------------------------
diff --git a/users/classification/wikipedia-classifier-example.html b/users/classification/wikipedia-classifier-example.html
index bda386c..0d10dd1 100644
--- a/users/classification/wikipedia-classifier-example.html
+++ b/users/classification/wikipedia-classifier-example.html
@@ -281,32 +281,32 @@
 
 <h2 id="oververview">Oververview</h2>
 
-<p>Tou run the example simply execute the <code class="highlighter-rouge">$MAHOUT_HOME/examples/bin/classify-wikipedia.sh</code> script.</p>
+<p>To run the example, simply execute the <code>$MAHOUT_HOME/examples/bin/classify-wikipedia.sh</code> script.</p>
 
 <p>By default the script is set to run on a medium-sized Wikipedia XML dump.  To run on the full set (the entire English Wikipedia) you can change the download by commenting out line 78 and uncommenting line 80 of <a href="https://github.com/apache/mahout/blob/master/examples/bin/classify-wikipedia.sh">classify-wikipedia.sh</a> [1]. However, this is not recommended unless you have the resources to do so. <em>Be sure to clean your work directory when changing datasets - option (3).</em></p>
 
-<p>The step by step process for Creating a Naive Bayes Classifier for the Wikipedia XML dump is very similar to that for <a href="http://mahout.apache.org/users/classification/twenty-newsgroups.html">creating a 20 Newsgroups Classifier</a> [4].  The only difference being that instead of running <code class="highlighter-rouge">$mahout seqdirectory</code> on the unzipped 20 Newsgroups file, you’ll run <code class="highlighter-rouge">$mahout seqwiki</code> on the unzipped Wikipedia xml dump.</p>
+<p>The step-by-step process for creating a Naive Bayes classifier for the Wikipedia XML dump is very similar to that for <a href="http://mahout.apache.org/users/classification/twenty-newsgroups.html">creating a 20 Newsgroups Classifier</a> [4].  The only difference is that instead of running <code>$mahout seqdirectory</code> on the unzipped 20 Newsgroups file, you’ll run <code>$mahout seqwiki</code> on the unzipped Wikipedia XML dump.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ mahout seqwiki 
-</code></pre></div></div>
+<pre><code>$ mahout seqwiki 
+</code></pre>
 
-<p>The above command launches <code class="highlighter-rouge">WikipediaToSequenceFile.java</code> which accepts a text file of categories [3] and starts an MR job to parse the each document in the XML file.  This process will seek to extract documents with a wikipedia category tag which (exactly, if the <code class="highlighter-rouge">-exactMatchOnly</code> option is set) matches a line in the category file.  If no match is found and the <code class="highlighter-rouge">-all</code> option is set, the document will be dumped into an “unknown” category. The documents will then be written out as a <code class="highlighter-rouge">&lt;Text,Text&gt;</code> sequence file of the form (K:/category/document_title , V: document).</p>
+<p>The above command launches <code>WikipediaToSequenceFile.java</code>, which accepts a text file of categories [3] and starts an MR job to parse each document in the XML file.  This process will seek to extract documents with a Wikipedia category tag which matches (exactly, if the <code>-exactMatchOnly</code> option is set) a line in the category file.  If no match is found and the <code>-all</code> option is set, the document will be dumped into an “unknown” category. The documents will then be written out as a <code>&lt;Text,Text&gt;</code> sequence file of the form (K: /category/document_title, V: document).</p>
 
 <p>There are 3 different example category files available in the /examples/src/test/resources
 directory:  country.txt, country10.txt and country2.txt.  You can edit these categories to extract a different corpus from the Wikipedia dataset.</p>
 
-<p>The CLI options for <code class="highlighter-rouge">seqwiki</code> are as follows:</p>
+<p>The CLI options for <code>seqwiki</code> are as follows:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>--input          (-i)         input pathname String
+<pre><code>--input          (-i)         input pathname String
 --output         (-o)         the output pathname String
 --categories     (-c)         the file containing the Wikipedia categories
 --exactMatchOnly (-e)         if set, then the Wikipedia category must match
                                 exactly instead of simply containing the category string
 --all            (-all)       if set select all categories
 --removeLabels   (-rl)        if set, remove [[Category:labels]] from document text after extracting label.
-</code></pre></div></div>
+</code></pre>
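 
 <p>A typical invocation might look like the following rough sketch; the input and output paths are placeholders that classify-wikipedia.sh normally fills in, and country10.txt is one of the example category files mentioned above:</p>
 
 <pre><code># paths are placeholders; point them at your own work directory and Wikipedia dump
 $ mahout seqwiki \
     -i ${WORK_DIR}/wikixml/enwiki-articles.xml \
     -o ${WORK_DIR}/wikipediainput \
     -c ${MAHOUT_HOME}/examples/src/test/resources/country10.txt \
     -e \
     -rl
 </code></pre>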
 
-<p>After <code class="highlighter-rouge">seqwiki</code>, the script runs <code class="highlighter-rouge">seq2sparse</code>, <code class="highlighter-rouge">split</code>, <code class="highlighter-rouge">trainnb</code> and <code class="highlighter-rouge">testnb</code> as in the <a href="http://mahout.apache.org/users/classification/twenty-newsgroups.html">step by step 20newsgroups example</a>.  When all of the jobs have finished, a confusion matrix will be displayed.</p>
+<p>After <code>seqwiki</code>, the script runs <code>seq2sparse</code>, <code>split</code>, <code>trainnb</code> and <code>testnb</code> as in the <a href="http://mahout.apache.org/users/classification/twenty-newsgroups.html">step-by-step 20 Newsgroups example</a>.  When all of the jobs have finished, a confusion matrix will be displayed.</p>
 
 <h2 id="resources">Resources</h2>
 

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/clustering/canopy-clustering.html
----------------------------------------------------------------------
diff --git a/users/clustering/canopy-clustering.html b/users/clustering/canopy-clustering.html
index 1b17ff2..06d0a13 100644
--- a/users/clustering/canopy-clustering.html
+++ b/users/clustering/canopy-clustering.html
@@ -361,7 +361,7 @@ Both require several arguments:</p>
 
 <p>Invocation using the command line takes the form:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bin/mahout canopy \
+<pre><code>bin/mahout canopy \
     -i &lt;input vectors directory&gt; \
     -o &lt;output working directory&gt; \
     -dm &lt;DistanceMeasure&gt; \
@@ -373,7 +373,7 @@ Both require several arguments:</p>
     -ow &lt;overwrite output directory if present&gt;
     -cl &lt;run input vector clustering after computing Canopies&gt;
     -xm &lt;execution method: sequential or mapreduce&gt;
-</code></pre></div></div>
+</code></pre>
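 
 <p>As an illustration only (the vector directory, output directory and threshold values below are placeholders), a concrete canopy run could look like:</p>
 
 <pre><code># t1 must be greater than t2; both values here are arbitrary examples
 bin/mahout canopy \
     -i output/tfidf-vectors \
     -o output/canopy-centroids \
     -dm org.apache.mahout.common.distance.EuclideanDistanceMeasure \
     -t1 3.0 \
     -t2 2.0 \
     -ow \
     -cl \
     -xm mapreduce
 </code></pre>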
 
 <p>Invocation using Java involves supplying the following arguments:</p>
 

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/clustering/canopy-commandline.html
----------------------------------------------------------------------
diff --git a/users/clustering/canopy-commandline.html b/users/clustering/canopy-commandline.html
index e878275..fb7f2eb 100644
--- a/users/clustering/canopy-commandline.html
+++ b/users/clustering/canopy-commandline.html
@@ -282,8 +282,8 @@ an operating Hadoop cluster on the target machine then the invocation will
 run Canopy on that cluster. If either of the environment variables is
 missing then the stand-alone Hadoop configuration will be invoked instead.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./bin/mahout canopy &lt;OPTIONS&gt;
-</code></pre></div></div>
+<pre><code>./bin/mahout canopy &lt;OPTIONS&gt;
+</code></pre>
 
 <ul>
   <li>In $MAHOUT_HOME/, build the jar containing the job (mvn install). The job
@@ -326,7 +326,7 @@ to view all outputs.</li>
 <p><a name="canopy-commandline-Commandlineoptions"></a></p>
 <h1 id="command-line-options">Command line options</h1>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  --input (-i) input			     Path to job input directory.Must  
+<pre><code>  --input (-i) input			     Path to job input directory.Must  
 					     be a SequenceFile of	    
 					     VectorWritable		    
   --output (-o) output			     The directory pathname for output. 
@@ -340,7 +340,7 @@ to view all outputs.</li>
   --clustering (-cl)			     If present, run clustering after	
 					     the iterations have taken place	 
   --help (-h)				     Print out help		    
-</code></pre></div></div>
+</code></pre>
 
 
    </div>

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/clustering/cluster-dumper.html
----------------------------------------------------------------------
diff --git a/users/clustering/cluster-dumper.html b/users/clustering/cluster-dumper.html
index ba4e841..2fe2421 100644
--- a/users/clustering/cluster-dumper.html
+++ b/users/clustering/cluster-dumper.html
@@ -295,15 +295,15 @@ you can run clusterdumper in 2 modes:</p>
 <h3 id="hadoop-environment">Hadoop Environment</h3>
 
 <p>If you have setup your HADOOP_HOME environment variable, you can use the
-command line utility <code class="highlighter-rouge">mahout</code> to execute the ClusterDumper on Hadoop. In
+command line utility <code>mahout</code> to execute the ClusterDumper on Hadoop. In
 this case we won’t need to get the output clusters to our local machines.
 The utility will read the output clusters present in HDFS and output the
 human-readable cluster values into our local file system. Say you’ve just
 executed the <a href="clustering-of-synthetic-control-data.html">synthetic control example </a>
- and want to analyze the output, you can execute the <code class="highlighter-rouge">mahout clusterdumper</code> utility from the command line.</p>
+ and want to analyze the output; you can then execute the <code>mahout clusterdumper</code> utility from the command line.</p>
 
 <h4 id="cli-options">CLI options:</h4>
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>--help                               Print out help	
+<pre><code>--help                               Print out help	
 --input (-i) input                   The directory containing Sequence
                                        Files for the Clusters	    
 --output (-o) output                 The output file.  If not specified,
@@ -329,7 +329,7 @@ executed the <a href="clustering-of-synthetic-control-data.html">synthetic contr
 --evaluate (-e)                      Run ClusterEvaluator and CDbwEvaluator over the
                                       input. The output will be appended to the rest of
                                       the output at the end.   
-</code></pre></div></div>
+</code></pre>
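 
 <p>Following the synthetic control example referenced above, a minimal run might look like the sketch below; the directory names are placeholders, so use whichever clusters directory your job actually produced:</p>
 
 <pre><code># reads the cluster SequenceFiles from HDFS and writes a readable summary to the local file system
 $ mahout clusterdump \
     -i examples/output/clusters-10 \
     -o clusteranalyze.txt
 </code></pre>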
 
 <h3 id="standalone-java-program">Standalone Java Program</h3>
 
@@ -350,11 +350,11 @@ executed the <a href="clustering-of-synthetic-control-data.html">synthetic contr
 
 <p>In the arguments tab, specify the below arguments</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>--seqFileDir &lt;MAHOUT_HOME&gt;/examples/output/clusters-10 
+<pre><code>--seqFileDir &lt;MAHOUT_HOME&gt;/examples/output/clusters-10 
 --pointsDir &lt;MAHOUT_HOME&gt;/examples/output/clusteredPoints 
 --output &lt;MAHOUT_HOME&gt;/examples/output/clusteranalyze.txt
 replace &lt;MAHOUT_HOME&gt; with the actual path of your $MAHOUT_HOME
-</code></pre></div></div>
+</code></pre>
 
 <ul>
   <li>Hit run to execute the ClusterDumper using Eclipse. Setting breakpoints etc should just work fine.</li>

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/clustering/clustering-of-synthetic-control-data.html
----------------------------------------------------------------------
diff --git a/users/clustering/clustering-of-synthetic-control-data.html b/users/clustering/clustering-of-synthetic-control-data.html
index 2441536..ec32638 100644
--- a/users/clustering/clustering-of-synthetic-control-data.html
+++ b/users/clustering/clustering-of-synthetic-control-data.html
@@ -312,22 +312,22 @@
   <li><a href="/users/clustering/canopy-clustering.html">Canopy Clustering</a></li>
 </ul>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bin/mahout org.apache.mahout.clustering.syntheticcontrol.canopy.Job
-</code></pre></div></div>
+<pre><code>bin/mahout org.apache.mahout.clustering.syntheticcontrol.canopy.Job
+</code></pre>
 
 <ul>
   <li><a href="/users/clustering/k-means-clustering.html">k-Means Clustering</a></li>
 </ul>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bin/mahout org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
-</code></pre></div></div>
+<pre><code>bin/mahout org.apache.mahout.clustering.syntheticcontrol.kmeans.Job
+</code></pre>
 
 <ul>
   <li><a href="/users/clustering/fuzzy-k-means.html">Fuzzy k-Means Clustering</a></li>
 </ul>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bin/mahout org.apache.mahout.clustering.syntheticcontrol.fuzzykmeans.Job
-</code></pre></div></div>
+<pre><code>bin/mahout org.apache.mahout.clustering.syntheticcontrol.fuzzykmeans.Job
+</code></pre>
 
 <p>The clustering output will be produced in the <em>output</em> directory. The output data points are in vector format. In order to read/analyze the output, you can use the <a href="/users/clustering/cluster-dumper.html">clusterdump</a> utility provided by Mahout.</p>
 

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/clustering/clusteringyourdata.html
----------------------------------------------------------------------
diff --git a/users/clustering/clusteringyourdata.html b/users/clustering/clusteringyourdata.html
index 695ed10..6dbe65c 100644
--- a/users/clustering/clusteringyourdata.html
+++ b/users/clustering/clusteringyourdata.html
@@ -315,13 +315,13 @@ In particular for text preparation check out <a href="../basics/creating-vectors
 
 <p>Mahout has a cluster dumper utility that can be used to retrieve and evaluate your clustering data.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./bin/mahout clusterdump &lt;OPTIONS&gt;
-</code></pre></div></div>
+<pre><code>./bin/mahout clusterdump &lt;OPTIONS&gt;
+</code></pre>
 
 <p><a name="ClusteringYourData-Theclusterdumperoptionsare:"></a></p>
 <h2 id="the-cluster-dumper-options-are">The cluster dumper options are:</h2>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  --help (-h)				   Print out help	
+<pre><code>  --help (-h)				   Print out help	
     
   --input (-i) input			   The directory containing Sequence    
 					   Files for the Clusters	    
@@ -359,7 +359,7 @@ In particular for text preparation check out <a href="../basics/creating-vectors
   --evaluate (-e)			   Run ClusterEvaluator and CDbwEvaluator over the
 					   input. The output will be appended to the rest of
 					   the output at the end.   
-</code></pre></div></div>
+</code></pre>
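 
 <p>For instance, to also run the evaluators described under <code>--evaluate</code>, an invocation could look like the following sketch; all paths are placeholders, and the points directory assumes the clustering job was run with its clustering option enabled:</p>
 
 <pre><code>./bin/mahout clusterdump \
     --input output/clusters-10 \
     --pointsDir output/clusteredPoints \
     --output clusteranalyze.txt \
     --evaluate
 </code></pre>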
 
 <p>More information on using clusterdump utility can be found <a href="cluster-dumper.html">here</a></p>
 

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/clustering/fuzzy-k-means-commandline.html
----------------------------------------------------------------------
diff --git a/users/clustering/fuzzy-k-means-commandline.html b/users/clustering/fuzzy-k-means-commandline.html
index 4b8cb3d..7be184e 100644
--- a/users/clustering/fuzzy-k-means-commandline.html
+++ b/users/clustering/fuzzy-k-means-commandline.html
@@ -282,8 +282,8 @@ an operating Hadoop cluster on the target machine then the invocation will
 run FuzzyK on that cluster. If either of the environment variables is
 missing then the stand-alone Hadoop configuration will be invoked instead.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./bin/mahout fkmeans &lt;OPTIONS&gt;
-</code></pre></div></div>
+<pre><code>./bin/mahout fkmeans &lt;OPTIONS&gt;
+</code></pre>
 
 <ul>
  <li>In $MAHOUT_HOME/, build the jar containing the job (mvn install). The job
@@ -324,7 +324,7 @@ to view all outputs.</li>
 <p><a name="fuzzy-k-means-commandline-Commandlineoptions"></a></p>
 <h1 id="command-line-options">Command line options</h1>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  --input (-i) input			       Path to job input directory. 
+<pre><code>  --input (-i) input			       Path to job input directory. 
 					       Must be a SequenceFile of    
 					       VectorWritable		    
   --clusters (-c) clusters		       The input centroids, as Vectors. 
@@ -366,7 +366,7 @@ to view all outputs.</li>
 					       is 0 
   --clustering (-cl)			       If present, run clustering after 
 					       the iterations have taken place  
-</code></pre></div></div>
+</code></pre>
 
 
    </div>

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/clustering/fuzzy-k-means.html
----------------------------------------------------------------------
diff --git a/users/clustering/fuzzy-k-means.html b/users/clustering/fuzzy-k-means.html
index 44f5c14..648c188 100644
--- a/users/clustering/fuzzy-k-means.html
+++ b/users/clustering/fuzzy-k-means.html
@@ -351,7 +351,7 @@ FuzzyKMeansDriver.run().</p>
 
 <p>Invocation using the command line takes the form:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bin/mahout fkmeans \
+<pre><code>bin/mahout fkmeans \
     -i &lt;input vectors directory&gt; \
     -c &lt;input clusters directory&gt; \
     -o &lt;output working directory&gt; \
@@ -365,7 +365,7 @@ FuzzyKMeansDriver.run().</p>
     -e &lt;emit vectors to most likely cluster during clustering&gt;
     -t &lt;threshold to use for clustering if -e is false&gt;
     -xm &lt;execution method: sequential or mapreduce&gt;
-</code></pre></div></div>
+</code></pre>
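 
 <p>A concrete run that samples 20 random points as the initial clusters might look like the sketch below (all paths and parameter values are placeholders); the <em>Note</em> that follows explains the effect of supplying <code>-k</code>:</p>
 
 <pre><code># -m is the fuzziness coefficient and must be &gt; 1
 bin/mahout fkmeans \
     -i output/tfidf-vectors \
     -c output/fkmeans-seed-clusters \
     -o output/fkmeans-clusters \
     -dm org.apache.mahout.common.distance.CosineDistanceMeasure \
     -m 1.5 \
     -k 20 \
     -x 10 \
     -ow \
     -cl \
     -xm mapreduce
 </code></pre>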
 
 <p><em>Note:</em> if the -k argument is supplied, any clusters in the -c directory
 will be overwritten and -k random points will be sampled from the input

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/clustering/k-means-clustering.html
----------------------------------------------------------------------
diff --git a/users/clustering/k-means-clustering.html b/users/clustering/k-means-clustering.html
index 21f9e2f..431aaa7 100644
--- a/users/clustering/k-means-clustering.html
+++ b/users/clustering/k-means-clustering.html
@@ -331,14 +331,14 @@ clustering and convergence values.</p>
 
 <p>Canopy clustering can be used to compute the initial clusters for k-Means:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// run the CanopyDriver job
+<pre><code>// run the CanopyDriver job
 CanopyDriver.runJob("testdata", "output"
 ManhattanDistanceMeasure.class.getName(), (float) 3.1, (float) 2.1, false);
 
 // now run the KMeansDriver job
 KMeansDriver.runJob("testdata", "output/clusters-0", "output",
 EuclideanDistanceMeasure.class.getName(), "0.001", "10", true);
-</code></pre></div></div>
+</code></pre>
 
 <p>In the above example, the input data points are stored in ‘testdata’ and
 the CanopyDriver is configured to output to the ‘output/clusters-0’
@@ -359,7 +359,7 @@ on KMeansDriver.main or by making a Java call to KMeansDriver.runJob().</p>
 
 <p>Invocation using the command line takes the form:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bin/mahout kmeans \
+<pre><code>bin/mahout kmeans \
     -i &lt;input vectors directory&gt; \
     -c &lt;input clusters directory&gt; \
     -o &lt;output working directory&gt; \
@@ -370,7 +370,7 @@ on KMeansDriver.main or by making a Java call to KMeansDriver.runJob().</p>
     -ow &lt;overwrite output directory if present&gt;
     -cl &lt;run input vector clustering after computing Canopies&gt;
     -xm &lt;execution method: sequential or mapreduce&gt;
-</code></pre></div></div>
+</code></pre>
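 
 <p>For example, to cluster tf-idf vectors using canopy centroids (such as those produced by the CanopyDriver example above) as the initial clusters, a run could look like the sketch below; the paths and values are illustrative only:</p>
 
 <pre><code># no -k here, so the canopy centroids found in -c are used as the seeds
 bin/mahout kmeans \
     -i output/tfidf-vectors \
     -c output/clusters-0 \
     -o output/kmeans-clusters \
     -dm org.apache.mahout.common.distance.EuclideanDistanceMeasure \
     -cd 0.001 \
     -x 10 \
     -ow \
     -cl \
     -xm mapreduce
 </code></pre>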
 
 <p>Note: if the -k argument is supplied, any clusters in the -c directory
 will be overwritten and -k random points will be sampled from the input

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/clustering/k-means-commandline.html
----------------------------------------------------------------------
diff --git a/users/clustering/k-means-commandline.html b/users/clustering/k-means-commandline.html
index 318b847..cf7de7a 100644
--- a/users/clustering/k-means-commandline.html
+++ b/users/clustering/k-means-commandline.html
@@ -289,8 +289,8 @@ an operating Hadoop cluster on the target machine then the invocation will
 run k-Means on that cluster. If either of the environment variables is
 missing then the stand-alone Hadoop configuration will be invoked instead.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./bin/mahout kmeans &lt;OPTIONS&gt;
-</code></pre></div></div>
+<pre><code>./bin/mahout kmeans &lt;OPTIONS&gt;
+</code></pre>
 
 <p>In $MAHOUT_HOME/, build the jar containing the job (mvn install). The job
 will be generated in $MAHOUT_HOME/core/target/ and its name will contain
@@ -331,7 +331,7 @@ to view all outputs.</li>
 <p><a name="k-means-commandline-Commandlineoptions"></a></p>
 <h1 id="command-line-options">Command line options</h1>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>  --input (-i) input			       Path to job input directory. 
+<pre><code>  --input (-i) input			       Path to job input directory. 
 					       Must be a SequenceFile of    
 					       VectorWritable		    
   --clusters (-c) clusters		       The input centroids, as Vectors. 
@@ -362,7 +362,7 @@ to view all outputs.</li>
   --help (-h)				       Print out help		    
   --clustering (-cl)			       If present, run clustering after 
 					       the iterations have taken place  
-</code></pre></div></div>
+</code></pre>
 
 
    </div>

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/clustering/latent-dirichlet-allocation.html
----------------------------------------------------------------------
diff --git a/users/clustering/latent-dirichlet-allocation.html b/users/clustering/latent-dirichlet-allocation.html
index 78a8e4f..e857424 100644
--- a/users/clustering/latent-dirichlet-allocation.html
+++ b/users/clustering/latent-dirichlet-allocation.html
@@ -343,7 +343,7 @@ vectors, it’s recommended that you follow the instructions in <a href="../basi
 
 <p>Invocation takes the form:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bin/mahout cvb \
+<pre><code>bin/mahout cvb \
     -i &lt;input path for document vectors&gt; \
     -dict &lt;path to term-dictionary file(s) , glob expression supported&gt; \
     -o &lt;output path for topic-term distributions&gt;
@@ -358,7 +358,7 @@ vectors, it’s recommended that you follow the instructions in <a href="../basi
     -seed &lt;random seed&gt; \
     -tf &lt;fraction of data to hold for testing&gt; \
     -block &lt;number of iterations per perplexity check, ignored unless test_set_percentage&gt;0&gt; \
-</code></pre></div></div>
+</code></pre>
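 
 <p>A minimal concrete run might look like the sketch below; the paths and the topic count are placeholders, the remaining options keep their defaults, and the input is assumed to have been prepared as described above:</p>
 
 <pre><code># assumes document vectors and a dictionary were created beforehand, e.g. with seq2sparse
 bin/mahout cvb \
     -i ${WORK_DIR}/vectors \
     -dict ${WORK_DIR}/dictionary.file-0 \
     -o ${WORK_DIR}/topic-term-dist \
     -k 20 \
     -x 20
 </code></pre>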
 
 <p>Topic smoothing should generally be about 50/K, where K is the number of
 topics. The number of words in the vocabulary can be an upper bound, though
@@ -370,14 +370,14 @@ recommended that you try several values.</p>
 <p>After running LDA you can obtain an output of the computed topics using the
 LDAPrintTopics utility:</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>bin/mahout ldatopics \
+<pre><code>bin/mahout ldatopics \
     -i &lt;input vectors directory&gt; \
     -d &lt;input dictionary file&gt; \
     -w &lt;optional number of words to print&gt; \
     -o &lt;optional output working directory. Default is to console&gt; \
     -h &lt;print out help&gt; \
     -dt &lt;optional dictionary type (text|sequencefile). Default is text&gt;
-</code></pre></div></div>
+</code></pre>
 
 <p><a name="LatentDirichletAllocation-Example"></a></p>
 <h1 id="example">Example</h1>

http://git-wip-us.apache.org/repos/asf/mahout/blob/d9686c8b/users/clustering/lda-commandline.html
----------------------------------------------------------------------
diff --git a/users/clustering/lda-commandline.html b/users/clustering/lda-commandline.html
index d3f4c67..729c061 100644
--- a/users/clustering/lda-commandline.html
+++ b/users/clustering/lda-commandline.html
@@ -285,8 +285,8 @@ Hadoop cluster on the target machine then the invocation will run the LDA
 algorithm on that cluster. If either of the environment variables is
 missing then the stand-alone Hadoop configuration will be invoked instead.</p>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>./bin/mahout cvb &lt;OPTIONS&gt;
-</code></pre></div></div>
+<pre><code>./bin/mahout cvb &lt;OPTIONS&gt;
+</code></pre>
 
 <ul>
  <li>In $MAHOUT_HOME/, build the jar containing the job (mvn install). The job
@@ -327,7 +327,7 @@ to view all outputs.</li>
 <p><a name="lda-commandline-CommandlineoptionsfromMahoutcvbversion0.8"></a></p>
 <h1 id="command-line-options-from-mahout-cvb-version-08">Command line options from Mahout cvb version 0.8</h1>
 
-<div class="highlighter-rouge"><div class="highlight"><pre class="highlight"><code>mahout cvb -h 
+<pre><code>mahout cvb -h 
   --input (-i) input					  Path to job input directory.	      
   --output (-o) output					  The directory pathname for output.  
   --maxIter (-x) maxIter				  The maximum number of iterations.		
@@ -352,7 +352,7 @@ to view all outputs.</li>
   --tempDir tempDir					  Intermediate output directory	     
   --startPhase startPhase				  First phase to run    
   --endPhase endPhase					  Last phase to run
-</code></pre></div></div>
+</code></pre>
 
 
    </div>