Posted to commits@joshua.apache.org by mj...@apache.org on 2016/09/13 21:49:36 UTC

[3/4] incubator-joshua-site git commit: Hid old documentation, pointed to wiki

http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/blob/22be73ab/6/features.html
----------------------------------------------------------------------
diff --git a/6/features.html b/6/features.html
deleted file mode 100644
index 6e617cf..0000000
--- a/6/features.html
+++ /dev/null
@@ -1,192 +0,0 @@
-<!DOCTYPE html>
-<html lang="en">
-  <head>
-    <meta charset="utf-8">
-    <meta http-equiv="X-UA-Compatible" content="IE=edge">
-    <meta name="viewport" content="width=device-width, initial-scale=1">
-    <meta name="description" content="">
-    <meta name="author" content="">
-    <link rel="icon" href="../../favicon.ico">
-
-    <title>Joshua Documentation | Features</title>
-
-    <!-- Bootstrap core CSS -->
-    <link href="/dist/css/bootstrap.min.css" rel="stylesheet">
-
-    <!-- Custom styles for this template -->
-    <link href="/joshua6.css" rel="stylesheet">
-  </head>
-
-  <body>
-
-    <div class="blog-masthead">
-      <div class="container">
-        <nav class="blog-nav">
-          <!-- <a class="blog-nav-item active" href="#">Joshua</a> -->
-          <a class="blog-nav-item" href="/">Joshua</a>
-          <!-- <a class="blog-nav-item" href="/6.0/whats-new.html">New features</a> -->
-          <a class="blog-nav-item" href="/language-packs/">Language packs</a>
-          <a class="blog-nav-item" href="/data/">Datasets</a>
-          <a class="blog-nav-item" href="/support/">Support</a>
-          <a class="blog-nav-item" href="/contributors.html">Contributors</a>
-        </nav>
-      </div>
-    </div>
-
-    <div class="container">
-
-      <div class="row">
-
-        <div class="col-sm-2">
-          <div class="sidebar-module">
-            <!-- <h4>About</h4> -->
-            <center>
-            <img src="/images/joshua-logo-small.png" />
-            <p>Joshua machine translation toolkit</p>
-            </center>
-          </div>
-          <hr>
-          <center>
-            <a href="/releases/current/" target="_blank"><button class="button">Download Joshua 6.0.5</button></a>
-            <br />
-            <a href="/releases/runtime/" target="_blank"><button class="button">Runtime only version</button></a>
-            <p>Released November 5, 2015</p>
-          </center>
-          <hr>
-          <!-- <div class="sidebar-module"> -->
-          <!--   <span id="download"> -->
-          <!--     <a href="http://joshua-decoder.org/downloads/joshua-6.0.tgz">Download</a> -->
-          <!--   </span> -->
-          <!-- </div> -->
-          <div class="sidebar-module">
-            <h4>Using Joshua</h4>
-            <ol class="list-unstyled">
-              <li><a href="/6.0/install.html">Installation</a></li>
-              <li><a href="/6.0/quick-start.html">Quick Start</a></li>
-            </ol>
-          </div>
-          <hr>
-          <div class="sidebar-module">
-            <h4>Building new models</h4>
-            <ol class="list-unstyled">
-              <li><a href="/6.0/pipeline.html">Pipeline</a></li>
-              <li><a href="/6.0/tutorial.html">Tutorial</a></li>
-              <li><a href="/6.0/faq.html">FAQ</a></li>
-            </ol>
-          </div>
-<!--
-          <div class="sidebar-module">
-            <h4>Phrase-based</h4>
-            <ol class="list-unstyled">
-              <li><a href="/6.0/phrase.html">Training</a></li>
-            </ol>
-          </div>
--->
-          <hr>
-          <div class="sidebar-module">
-            <h4>Advanced</h4>
-            <ol class="list-unstyled">
-              <li><a href="/6.0/bundle.html">Building language packs</a></li>
-              <li><a href="/6.0/decoder.html">Decoder options</a></li>
-              <li><a href="/6.0/file-formats.html">File formats</a></li>
-              <li><a href="/6.0/packing.html">Packing TMs</a></li>
-              <li><a href="/6.0/large-lms.html">Building large LMs</a></li>
-            </ol>
-          </div>
-
-          <hr> 
-          <div class="sidebar-module">
-            <h4>Developer</h4>
-            <ol class="list-unstyled">              
-		<li><a href="https://github.com/joshua-decoder/joshua">Github</a></li>
-		<li><a href="http://cs.jhu.edu/~post/joshua-docs">Javadoc</a></li>
-		<li><a href="https://groups.google.com/forum/?fromgroups#!forum/joshua_developers">Mailing list</a></li>              
-            </ol>
-          </div>
-
-        </div><!-- /.blog-sidebar -->
-
-        
-        <div class="col-sm-8 blog-main">
-        
-
-          <div class="blog-title">
-            <h2>Features</h2>
-          </div>
-          
-          <div class="blog-post">
-
-            <p>Joshua 5.0 uses a sparse feature representation to encode features internally.</p>
-
-
-          <!--   <h4 class="blog-post-title">Welcome to Joshua!</h4> -->
-
-          <!--   <p>This blog post shows a few different types of content that's supported and styled with Bootstrap. Basic typography, images, and code are all supported.</p> -->
-          <!--   <hr> -->
-          <!--   <p>Cum sociis natoque penatibus et magnis <a href="#">dis parturient montes</a>, nascetur ridiculus mus. Aenean eu leo quam. Pellentesque ornare sem lacinia quam venenatis vestibulum. Sed posuere consectetur est at lobortis. Cras mattis consectetur purus sit amet fermentum.</p> -->
-          <!--   <blockquote> -->
-          <!--     <p>Curabitur blandit tempus porttitor. <strong>Nullam quis risus eget urna mollis</strong> ornare vel eu leo. Nullam id dolor id nibh ultricies vehicula ut id elit.</p> -->
-          <!--   </blockquote> -->
-          <!--   <p>Etiam porta <em>sem malesuada magna</em> mollis euismod. Cras mattis consectetur purus sit amet fermentum. Aenean lacinia bibendum nulla sed consectetur.</p> -->
-          <!--   <h2>Heading</h2> -->
-          <!--   <p>Vivamus sagittis lacus vel augue laoreet rutrum faucibus dolor auctor. Duis mollis, est non commodo luctus, nisi erat porttitor ligula, eget lacinia odio sem nec elit. Morbi leo risus, porta ac consectetur ac, vestibulum at eros.</p> -->
-          <!--   <h3>Sub-heading</h3> -->
-          <!--   <p>Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus.</p> -->
-          <!--   <pre><code>Example code block</code></pre> -->
-          <!--   <p>Aenean lacinia bibendum nulla sed consectetur. Etiam porta sem malesuada magna mollis euismod. Fusce dapibus, tellus ac cursus commodo, tortor mauris condimentum nibh, ut fermentum massa.</p> -->
-          <!--   <h3>Sub-heading</h3> -->
-          <!--   <p>Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Aenean lacinia bibendum nulla sed consectetur. Etiam porta sem malesuada magna mollis euismod. Fusce dapibus, tellus ac cursus commodo, tortor mauris condimentum nibh, ut fermentum massa justo sit amet risus.</p> -->
-          <!--   <ul> -->
-          <!--     <li>Praesent commodo cursus magna, vel scelerisque nisl consectetur et.</li> -->
-          <!--     <li>Donec id elit non mi porta gravida at eget metus.</li> -->
-          <!--     <li>Nulla vitae elit libero, a pharetra augue.</li> -->
-          <!--   </ul> -->
-          <!--   <p>Donec ullamcorper nulla non metus auctor fringilla. Nulla vitae elit libero, a pharetra augue.</p> -->
-          <!--   <ol> -->
-          <!--     <li>Vestibulum id ligula porta felis euismod semper.</li> -->
-          <!--     <li>Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus.</li> -->
-          <!--     <li>Maecenas sed diam eget risus varius blandit sit amet non magna.</li> -->
-          <!--   </ol> -->
-          <!--   <p>Cras mattis consectetur purus sit amet fermentum. Sed posuere consectetur est at lobortis.</p> -->
-          <!-- </div><\!-- /.blog-post -\-> -->
-
-        </div>
-
-      </div><!-- /.row -->
-
-      
-        
-    </div><!-- /.container -->
-
-    <!-- Bootstrap core JavaScript
-    ================================================== -->
-    <!-- Placed at the end of the document so the pages load faster -->
-    <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
-    <script src="../../dist/js/bootstrap.min.js"></script>
-    <!-- <script src="../../assets/js/docs.min.js"></script> -->
-    <!-- IE10 viewport hack for Surface/desktop Windows 8 bug -->
-    <!-- <script src="../../assets/js/ie10-viewport-bug-workaround.js"></script>
-    -->
-
-    <!-- Start of StatCounter Code for Default Guide -->
-    <script type="text/javascript">
-      var sc_project=8264132; 
-      var sc_invisible=1; 
-      var sc_security="4b97fe2d"; 
-    </script>
-    <script type="text/javascript" src="http://www.statcounter.com/counter/counter.js"></script>
-    <noscript>
-      <div class="statcounter">
-        <a title="hit counter joomla" 
-           href="http://statcounter.com/joomla/"
-           target="_blank">
-          <img class="statcounter"
-               src="http://c.statcounter.com/8264132/0/4b97fe2d/1/"
-               alt="hit counter joomla" />
-        </a>
-      </div>
-    </noscript>
-    <!-- End of StatCounter Code for Default Guide -->
-  </body>
-</html>
-

http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/blob/22be73ab/6/file-formats.html
----------------------------------------------------------------------
diff --git a/6/file-formats.html b/6/file-formats.html
deleted file mode 100644
index 4918253..0000000
--- a/6/file-formats.html
+++ /dev/null
@@ -1,270 +0,0 @@
-<!DOCTYPE html>
-<html lang="en">
-  <head>
-    <meta charset="utf-8">
-    <meta http-equiv="X-UA-Compatible" content="IE=edge">
-    <meta name="viewport" content="width=device-width, initial-scale=1">
-    <meta name="description" content="">
-    <meta name="author" content="">
-    <link rel="icon" href="../../favicon.ico">
-
-    <title>Joshua Documentation | Joshua file formats</title>
-
-    <!-- Bootstrap core CSS -->
-    <link href="/dist/css/bootstrap.min.css" rel="stylesheet">
-
-    <!-- Custom styles for this template -->
-    <link href="/joshua6.css" rel="stylesheet">
-  </head>
-
-  <body>
-
-    <div class="blog-masthead">
-      <div class="container">
-        <nav class="blog-nav">
-          <!-- <a class="blog-nav-item active" href="#">Joshua</a> -->
-          <a class="blog-nav-item" href="/">Joshua</a>
-          <!-- <a class="blog-nav-item" href="/6.0/whats-new.html">New features</a> -->
-          <a class="blog-nav-item" href="/language-packs/">Language packs</a>
-          <a class="blog-nav-item" href="/data/">Datasets</a>
-          <a class="blog-nav-item" href="/support/">Support</a>
-          <a class="blog-nav-item" href="/contributors.html">Contributors</a>
-        </nav>
-      </div>
-    </div>
-
-    <div class="container">
-
-      <div class="row">
-
-        <div class="col-sm-2">
-          <div class="sidebar-module">
-            <!-- <h4>About</h4> -->
-            <center>
-            <img src="/images/joshua-logo-small.png" />
-            <p>Joshua machine translation toolkit</p>
-            </center>
-          </div>
-          <hr>
-          <center>
-            <a href="/releases/current/" target="_blank"><button class="button">Download Joshua 6.0.5</button></a>
-            <br />
-            <a href="/releases/runtime/" target="_blank"><button class="button">Runtime only version</button></a>
-            <p>Released November 5, 2015</p>
-          </center>
-          <hr>
-          <!-- <div class="sidebar-module"> -->
-          <!--   <span id="download"> -->
-          <!--     <a href="http://joshua-decoder.org/downloads/joshua-6.0.tgz">Download</a> -->
-          <!--   </span> -->
-          <!-- </div> -->
-          <div class="sidebar-module">
-            <h4>Using Joshua</h4>
-            <ol class="list-unstyled">
-              <li><a href="/6.0/install.html">Installation</a></li>
-              <li><a href="/6.0/quick-start.html">Quick Start</a></li>
-            </ol>
-          </div>
-          <hr>
-          <div class="sidebar-module">
-            <h4>Building new models</h4>
-            <ol class="list-unstyled">
-              <li><a href="/6.0/pipeline.html">Pipeline</a></li>
-              <li><a href="/6.0/tutorial.html">Tutorial</a></li>
-              <li><a href="/6.0/faq.html">FAQ</a></li>
-            </ol>
-          </div>
-<!--
-          <div class="sidebar-module">
-            <h4>Phrase-based</h4>
-            <ol class="list-unstyled">
-              <li><a href="/6.0/phrase.html">Training</a></li>
-            </ol>
-          </div>
--->
-          <hr>
-          <div class="sidebar-module">
-            <h4>Advanced</h4>
-            <ol class="list-unstyled">
-              <li><a href="/6.0/bundle.html">Building language packs</a></li>
-              <li><a href="/6.0/decoder.html">Decoder options</a></li>
-              <li><a href="/6.0/file-formats.html">File formats</a></li>
-              <li><a href="/6.0/packing.html">Packing TMs</a></li>
-              <li><a href="/6.0/large-lms.html">Building large LMs</a></li>
-            </ol>
-          </div>
-
-          <hr> 
-          <div class="sidebar-module">
-            <h4>Developer</h4>
-            <ol class="list-unstyled">              
-		<li><a href="https://github.com/joshua-decoder/joshua">Github</a></li>
-		<li><a href="http://cs.jhu.edu/~post/joshua-docs">Javadoc</a></li>
-		<li><a href="https://groups.google.com/forum/?fromgroups#!forum/joshua_developers">Mailing list</a></li>              
-            </ol>
-          </div>
-
-        </div><!-- /.blog-sidebar -->
-
-        
-        <div class="col-sm-8 blog-main">
-        
-
-          <div class="blog-title">
-            <h2>Joshua file formats</h2>
-          </div>
-          
-          <div class="blog-post">
-
-            <p>This page describes the formats of Joshua configuration and support files.</p>
-
-<h2 id="translation-models-grammars">Translation models (grammars)</h2>
-
-<p>Joshua supports two grammar file formats: a text-based version (also used by Hiero, shared by
-<a href="">cdec</a>, and supported by <a href="">hierarchical Moses</a>), and an efficient
-<a href="packing.html">packed representation</a> developed by <a href="http://cs.jhu.edu/~juri">Juri Ganitkevich</a>.</p>
-
-<p>Grammar rules follow this format.</p>
-
-<div class="highlighter-rouge"><pre class="highlight"><code>[LHS] ||| SOURCE-SIDE ||| TARGET-SIDE ||| FEATURES
-</code></pre>
-</div>
-
-<p>The source and target sides contain a mixture of terminals and nonterminals. The nonterminals are
-linked across sides by indices. There is no limit to the number of paired nonterminals in the rule
-or on the nonterminal labels (Joshua supports decoding with SAMT and GHKM grammars).</p>
-
-<div class="highlighter-rouge"><pre class="highlight"><code>[X] ||| el chico [X,1] ||| the boy [X,1] ||| -3.14 0 2 17
-[S] ||| el chico [VP,1] ||| the boy [VP,1] ||| -3.14 0 2 17
-[VP] ||| [NP,1] [IN,2] [VB,3] ||| [VB,3] [IN,2] [NP,1] ||| 0.0019026637 0.81322956
-</code></pre>
-</div>
-
-<p>The feature values can have optional labels, e.g.:</p>
-
-<div class="highlighter-rouge"><pre class="highlight"><code>[X] ||| el chico [X,1] ||| the boy [X,1] ||| lexprob=-3.14 lexicalized=1 numwords=2 count=17
-</code></pre>
-</div>
-
-<p>One file common to decoding is the glue grammar, which for Hiero grammars is defined as follows:</p>
-
-<div class="highlighter-rouge"><pre class="highlight"><code>[GOAL] ||| &lt;s&gt; ||| &lt;s&gt; ||| 0
-[GOAL] ||| [GOAL,1] [X,2] ||| [GOAL,1] [X,2] ||| -1
-[GOAL] ||| [GOAL,1] &lt;/s&gt; ||| [GOAL,1] &lt;/s&gt; ||| 0
-</code></pre>
-</div>
-
-<p>Joshua’s <a href="pipeline.html">pipeline</a> supports extraction of Hiero and SAMT grammars via
-<a href="thrax.html">Thrax</a> or GHKM grammars using <a href="http://www-nlp.stanford.edu/~mgalley/">Michel Galley</a>’s
-GHKM extractor (included) or Moses’ GHKM extractor (if Moses is installed).</p>
-
-<h2 id="language-model">Language Model</h2>
-
-<p>Joshua has two language model implementations: <a href="http://kheafield.com/code/kenlm/">KenLM</a> and
-<a href="http://berkeleylm.googlecode.com">BerkeleyLM</a>.  All language model implementations support the
-standard ARPA format output by <a href="http://www.speech.sri.com/projects/srilm/">SRILM</a>.  In addition,
-KenLM and BerkeleyLM support compiled formats that can be loaded more quickly and efficiently. KenLM
-is written in C++ and is supported via a JNI bridge, while BerkeleyLM is written in Java. KenLM is
-the default because of its support for left-state minimization.</p>
-
-<h3 id="compiling-for-kenlm">Compiling for KenLM</h3>
-
-<p>To compile an ARPA language model for KenLM, use the (provided) <code class="highlighter-rouge">build_binary</code> command, located deep within
-the Joshua source code:</p>
-
-<div class="highlighter-rouge"><pre class="highlight"><code>$JOSHUA/bin/build_binary lm.arpa lm.kenlm
-</code></pre>
-</div>
-
-<p>This command takes the <code class="highlighter-rouge">lm.arpa</code> file and produces the compiled version in <code class="highlighter-rouge">lm.kenlm</code>.</p>
-
-<h3 id="compiling-for-berkeleylm">Compiling for BerkeleyLM</h3>
-
-<p>To compile an ARPA language model for BerkeleyLM, type:</p>
-
-<div class="highlighter-rouge"><pre class="highlight"><code>java -cp $JOSHUA/lib/berkeleylm.jar -server -mxMEM edu.berkeley.nlp.lm.io.MakeLmBinaryFromArpa lm.arpa lm.berkeleylm
-</code></pre>
-</div>
-
-<p>The <code class="highlighter-rouge">lm.berkeleylm</code> file can then be listed directly in the <a href="decoder.html">Joshua configuration file</a>.</p>
-
-<h2 id="joshua-configuration-file">Joshua configuration file</h2>
-
-<p>The <a href="decoder.html">decoder page</a> documents decoder command-line and config file options.</p>
-
-<h2 id="thrax-configuration">Thrax configuration</h2>
-
-<p>See <a href="thrax.html">the thrax page</a> for more information about the Thrax configuration file.</p>
-
-
-          <!--   <h4 class="blog-post-title">Welcome to Joshua!</h4> -->
-
-          <!--   <p>This blog post shows a few different types of content that's supported and styled with Bootstrap. Basic typography, images, and code are all supported.</p> -->
-          <!--   <hr> -->
-          <!--   <p>Cum sociis natoque penatibus et magnis <a href="#">dis parturient montes</a>, nascetur ridiculus mus. Aenean eu leo quam. Pellentesque ornare sem lacinia quam venenatis vestibulum. Sed posuere consectetur est at lobortis. Cras mattis consectetur purus sit amet fermentum.</p> -->
-          <!--   <blockquote> -->
-          <!--     <p>Curabitur blandit tempus porttitor. <strong>Nullam quis risus eget urna mollis</strong> ornare vel eu leo. Nullam id dolor id nibh ultricies vehicula ut id elit.</p> -->
-          <!--   </blockquote> -->
-          <!--   <p>Etiam porta <em>sem malesuada magna</em> mollis euismod. Cras mattis consectetur purus sit amet fermentum. Aenean lacinia bibendum nulla sed consectetur.</p> -->
-          <!--   <h2>Heading</h2> -->
-          <!--   <p>Vivamus sagittis lacus vel augue laoreet rutrum faucibus dolor auctor. Duis mollis, est non commodo luctus, nisi erat porttitor ligula, eget lacinia odio sem nec elit. Morbi leo risus, porta ac consectetur ac, vestibulum at eros.</p> -->
-          <!--   <h3>Sub-heading</h3> -->
-          <!--   <p>Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus.</p> -->
-          <!--   <pre><code>Example code block</code></pre> -->
-          <!--   <p>Aenean lacinia bibendum nulla sed consectetur. Etiam porta sem malesuada magna mollis euismod. Fusce dapibus, tellus ac cursus commodo, tortor mauris condimentum nibh, ut fermentum massa.</p> -->
-          <!--   <h3>Sub-heading</h3> -->
-          <!--   <p>Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Aenean lacinia bibendum nulla sed consectetur. Etiam porta sem malesuada magna mollis euismod. Fusce dapibus, tellus ac cursus commodo, tortor mauris condimentum nibh, ut fermentum massa justo sit amet risus.</p> -->
-          <!--   <ul> -->
-          <!--     <li>Praesent commodo cursus magna, vel scelerisque nisl consectetur et.</li> -->
-          <!--     <li>Donec id elit non mi porta gravida at eget metus.</li> -->
-          <!--     <li>Nulla vitae elit libero, a pharetra augue.</li> -->
-          <!--   </ul> -->
-          <!--   <p>Donec ullamcorper nulla non metus auctor fringilla. Nulla vitae elit libero, a pharetra augue.</p> -->
-          <!--   <ol> -->
-          <!--     <li>Vestibulum id ligula porta felis euismod semper.</li> -->
-          <!--     <li>Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus.</li> -->
-          <!--     <li>Maecenas sed diam eget risus varius blandit sit amet non magna.</li> -->
-          <!--   </ol> -->
-          <!--   <p>Cras mattis consectetur purus sit amet fermentum. Sed posuere consectetur est at lobortis.</p> -->
-          <!-- </div><\!-- /.blog-post -\-> -->
-
-        </div>
-
-      </div><!-- /.row -->
-
-      
-        
-    </div><!-- /.container -->
-
-    <!-- Bootstrap core JavaScript
-    ================================================== -->
-    <!-- Placed at the end of the document so the pages load faster -->
-    <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
-    <script src="../../dist/js/bootstrap.min.js"></script>
-    <!-- <script src="../../assets/js/docs.min.js"></script> -->
-    <!-- IE10 viewport hack for Surface/desktop Windows 8 bug -->
-    <!-- <script src="../../assets/js/ie10-viewport-bug-workaround.js"></script>
-    -->
-
-    <!-- Start of StatCounter Code for Default Guide -->
-    <script type="text/javascript">
-      var sc_project=8264132; 
-      var sc_invisible=1; 
-      var sc_security="4b97fe2d"; 
-    </script>
-    <script type="text/javascript" src="http://www.statcounter.com/counter/counter.js"></script>
-    <noscript>
-      <div class="statcounter">
-        <a title="hit counter joomla" 
-           href="http://statcounter.com/joomla/"
-           target="_blank">
-          <img class="statcounter"
-               src="http://c.statcounter.com/8264132/0/4b97fe2d/1/"
-               alt="hit counter joomla" />
-        </a>
-      </div>
-    </noscript>
-    <!-- End of StatCounter Code for Default Guide -->
-  </body>
-</html>
-
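
Note on the removed file-formats.html page above: the text grammar format it describes ([LHS] ||| SOURCE-SIDE ||| TARGET-SIDE ||| FEATURES, with optionally labeled feature values) can be read with a few lines of code. The following is only a minimal sketch and not Joshua's own parser. The names Rule and parse_rule are hypothetical, and keying unlabeled features by their position ("0", "1", ...) is an assumption made for illustration, not Joshua's internal naming.

```python
from typing import Dict, List, NamedTuple


class Rule(NamedTuple):
    """One grammar rule (hypothetical container, not a Joshua class)."""
    lhs: str
    source: List[str]
    target: List[str]
    features: Dict[str, float]


def parse_rule(line: str) -> Rule:
    """Split a text-format rule on '|||' into its four fields."""
    fields = [field.strip() for field in line.split("|||")]
    if len(fields) != 4:
        raise ValueError(f"expected 4 '|||'-separated fields, got {len(fields)}")
    lhs, source, target, feats = fields
    if not (lhs.startswith("[") and lhs.endswith("]")):
        raise ValueError(f"left-hand side must be bracketed: {lhs!r}")

    features: Dict[str, float] = {}
    for index, token in enumerate(feats.split()):
        # Labeled features look like name=value; unlabeled ones are bare numbers.
        name, sep, value = token.rpartition("=")
        # Assumption: unlabeled features are keyed by their position.
        key = name if sep else str(index)
        features[key] = float(value)

    return Rule(lhs[1:-1], source.split(), target.split(), features)


if __name__ == "__main__":
    example = "[X] ||| el chico [X,1] ||| the boy [X,1] ||| lexprob=-3.14 count=17"
    print(parse_rule(example))
```

Nonterminals such as [X,1] are kept as plain tokens here; linking them across the source and target sides by index would be a separate step.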

http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/blob/22be73ab/6/index.html
----------------------------------------------------------------------
diff --git a/6/index.html b/6/index.html
deleted file mode 100644
index 7392541..0000000
--- a/6/index.html
+++ /dev/null
@@ -1,210 +0,0 @@
-<!DOCTYPE html>
-<html lang="en">
-  <head>
-    <meta charset="utf-8">
-    <meta http-equiv="X-UA-Compatible" content="IE=edge">
-    <meta name="viewport" content="width=device-width, initial-scale=1">
-    <meta name="description" content="">
-    <meta name="author" content="">
-    <link rel="icon" href="../../favicon.ico">
-
-    <title>Joshua Documentation | Joshua documentation</title>
-
-    <!-- Bootstrap core CSS -->
-    <link href="/dist/css/bootstrap.min.css" rel="stylesheet">
-
-    <!-- Custom styles for this template -->
-    <link href="/joshua6.css" rel="stylesheet">
-  </head>
-
-  <body>
-
-    <div class="blog-masthead">
-      <div class="container">
-        <nav class="blog-nav">
-          <!-- <a class="blog-nav-item active" href="#">Joshua</a> -->
-          <a class="blog-nav-item" href="/">Joshua</a>
-          <!-- <a class="blog-nav-item" href="/6.0/whats-new.html">New features</a> -->
-          <a class="blog-nav-item" href="/language-packs/">Language packs</a>
-          <a class="blog-nav-item" href="/data/">Datasets</a>
-          <a class="blog-nav-item" href="/support/">Support</a>
-          <a class="blog-nav-item" href="/contributors.html">Contributors</a>
-        </nav>
-      </div>
-    </div>
-
-    <div class="container">
-
-      <div class="row">
-
-        <div class="col-sm-2">
-          <div class="sidebar-module">
-            <!-- <h4>About</h4> -->
-            <center>
-            <img src="/images/joshua-logo-small.png" />
-            <p>Joshua machine translation toolkit</p>
-            </center>
-          </div>
-          <hr>
-          <center>
-            <a href="/releases/current/" target="_blank"><button class="button">Download Joshua 6.0.5</button></a>
-            <br />
-            <a href="/releases/runtime/" target="_blank"><button class="button">Runtime only version</button></a>
-            <p>Released November 5, 2015</p>
-          </center>
-          <hr>
-          <!-- <div class="sidebar-module"> -->
-          <!--   <span id="download"> -->
-          <!--     <a href="http://joshua-decoder.org/downloads/joshua-6.0.tgz">Download</a> -->
-          <!--   </span> -->
-          <!-- </div> -->
-          <div class="sidebar-module">
-            <h4>Using Joshua</h4>
-            <ol class="list-unstyled">
-              <li><a href="/6.0/install.html">Installation</a></li>
-              <li><a href="/6.0/quick-start.html">Quick Start</a></li>
-            </ol>
-          </div>
-          <hr>
-          <div class="sidebar-module">
-            <h4>Building new models</h4>
-            <ol class="list-unstyled">
-              <li><a href="/6.0/pipeline.html">Pipeline</a></li>
-              <li><a href="/6.0/tutorial.html">Tutorial</a></li>
-              <li><a href="/6.0/faq.html">FAQ</a></li>
-            </ol>
-          </div>
-<!--
-          <div class="sidebar-module">
-            <h4>Phrase-based</h4>
-            <ol class="list-unstyled">
-              <li><a href="/6.0/phrase.html">Training</a></li>
-            </ol>
-          </div>
--->
-          <hr>
-          <div class="sidebar-module">
-            <h4>Advanced</h4>
-            <ol class="list-unstyled">
-              <li><a href="/6.0/bundle.html">Building language packs</a></li>
-              <li><a href="/6.0/decoder.html">Decoder options</a></li>
-              <li><a href="/6.0/file-formats.html">File formats</a></li>
-              <li><a href="/6.0/packing.html">Packing TMs</a></li>
-              <li><a href="/6.0/large-lms.html">Building large LMs</a></li>
-            </ol>
-          </div>
-
-          <hr> 
-          <div class="sidebar-module">
-            <h4>Developer</h4>
-            <ol class="list-unstyled">              
-		<li><a href="https://github.com/joshua-decoder/joshua">Github</a></li>
-		<li><a href="http://cs.jhu.edu/~post/joshua-docs">Javadoc</a></li>
-		<li><a href="https://groups.google.com/forum/?fromgroups#!forum/joshua_developers">Mailing list</a></li>              
-            </ol>
-          </div>
-
-        </div><!-- /.blog-sidebar -->
-
-        
-        <div class="col-sm-8 blog-main">
-        
-
-          <div class="blog-title">
-            <h2>Joshua documentation</h2>
-          </div>
-          
-          <div class="blog-post">
-
-            <p>This page contains end-user oriented documentation for the 6.0 release of
-<a href="http://joshua-decoder.org/">the Joshua decoder</a>.</p>
-
-<p>To navigate the documentation, use the links on the navigation bar to
-the left. For more detail on the decoder itself, including its command-line options, see
-<a href="decoder.html">the Joshua decoder page</a>.  You can also learn more about other steps of
-<a href="pipeline.html">the Joshua MT pipeline</a>, including <a href="thrax.html">grammar extraction</a> with Thrax and
-Joshua’s <a href="packing.html">efficient grammar representation</a>.</p>
-
-<p>A <a href="bundle.html">bundled configuration</a>, which is a minimal set of configuration, resource, and script files, can be created and easily transferred and shared.</p>
-
-<h2 id="development">Development</h2>
-
-<p>For developer support, please consult <a href="http://cs.jhu.edu/~post/joshua-docs">the javadoc documentation</a> and the <a href="https://groups.google.com/forum/?fromgroups#!forum/joshua_developers">Joshua developers mailing list</a>.</p>
-
-<h2 id="support">Support</h2>
-
-<p>If you have problems or issues, you might find some help <a href="faq.html">on our answers page</a> or
-<a href="https://groups.google.com/forum/?fromgroups#!forum/joshua_support">in the mailing list archives</a>.</p>
-
-
-          <!--   <h4 class="blog-post-title">Welcome to Joshua!</h4> -->
-
-          <!--   <p>This blog post shows a few different types of content that's supported and styled with Bootstrap. Basic typography, images, and code are all supported.</p> -->
-          <!--   <hr> -->
-          <!--   <p>Cum sociis natoque penatibus et magnis <a href="#">dis parturient montes</a>, nascetur ridiculus mus. Aenean eu leo quam. Pellentesque ornare sem lacinia quam venenatis vestibulum. Sed posuere consectetur est at lobortis. Cras mattis consectetur purus sit amet fermentum.</p> -->
-          <!--   <blockquote> -->
-          <!--     <p>Curabitur blandit tempus porttitor. <strong>Nullam quis risus eget urna mollis</strong> ornare vel eu leo. Nullam id dolor id nibh ultricies vehicula ut id elit.</p> -->
-          <!--   </blockquote> -->
-          <!--   <p>Etiam porta <em>sem malesuada magna</em> mollis euismod. Cras mattis consectetur purus sit amet fermentum. Aenean lacinia bibendum nulla sed consectetur.</p> -->
-          <!--   <h2>Heading</h2> -->
-          <!--   <p>Vivamus sagittis lacus vel augue laoreet rutrum faucibus dolor auctor. Duis mollis, est non commodo luctus, nisi erat porttitor ligula, eget lacinia odio sem nec elit. Morbi leo risus, porta ac consectetur ac, vestibulum at eros.</p> -->
-          <!--   <h3>Sub-heading</h3> -->
-          <!--   <p>Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus.</p> -->
-          <!--   <pre><code>Example code block</code></pre> -->
-          <!--   <p>Aenean lacinia bibendum nulla sed consectetur. Etiam porta sem malesuada magna mollis euismod. Fusce dapibus, tellus ac cursus commodo, tortor mauris condimentum nibh, ut fermentum massa.</p> -->
-          <!--   <h3>Sub-heading</h3> -->
-          <!--   <p>Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Aenean lacinia bibendum nulla sed consectetur. Etiam porta sem malesuada magna mollis euismod. Fusce dapibus, tellus ac cursus commodo, tortor mauris condimentum nibh, ut fermentum massa justo sit amet risus.</p> -->
-          <!--   <ul> -->
-          <!--     <li>Praesent commodo cursus magna, vel scelerisque nisl consectetur et.</li> -->
-          <!--     <li>Donec id elit non mi porta gravida at eget metus.</li> -->
-          <!--     <li>Nulla vitae elit libero, a pharetra augue.</li> -->
-          <!--   </ul> -->
-          <!--   <p>Donec ullamcorper nulla non metus auctor fringilla. Nulla vitae elit libero, a pharetra augue.</p> -->
-          <!--   <ol> -->
-          <!--     <li>Vestibulum id ligula porta felis euismod semper.</li> -->
-          <!--     <li>Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus.</li> -->
-          <!--     <li>Maecenas sed diam eget risus varius blandit sit amet non magna.</li> -->
-          <!--   </ol> -->
-          <!--   <p>Cras mattis consectetur purus sit amet fermentum. Sed posuere consectetur est at lobortis.</p> -->
-          <!-- </div><\!-- /.blog-post -\-> -->
-
-        </div>
-
-      </div><!-- /.row -->
-
-      
-        
-    </div><!-- /.container -->
-
-    <!-- Bootstrap core JavaScript
-    ================================================== -->
-    <!-- Placed at the end of the document so the pages load faster -->
-    <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
-    <script src="../../dist/js/bootstrap.min.js"></script>
-    <!-- <script src="../../assets/js/docs.min.js"></script> -->
-    <!-- IE10 viewport hack for Surface/desktop Windows 8 bug -->
-    <!-- <script src="../../assets/js/ie10-viewport-bug-workaround.js"></script>
-    -->
-
-    <!-- Start of StatCounter Code for Default Guide -->
-    <script type="text/javascript">
-      var sc_project=8264132; 
-      var sc_invisible=1; 
-      var sc_security="4b97fe2d"; 
-    </script>
-    <script type="text/javascript" src="http://www.statcounter.com/counter/counter.js"></script>
-    <noscript>
-      <div class="statcounter">
-        <a title="hit counter joomla" 
-           href="http://statcounter.com/joomla/"
-           target="_blank">
-          <img class="statcounter"
-               src="http://c.statcounter.com/8264132/0/4b97fe2d/1/"
-               alt="hit counter joomla" />
-        </a>
-      </div>
-    </noscript>
-    <!-- End of StatCounter Code for Default Guide -->
-  </body>
-</html>
-

http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/blob/22be73ab/6/install.html
----------------------------------------------------------------------
diff --git a/6/install.html b/6/install.html
deleted file mode 100644
index b972e81..0000000
--- a/6/install.html
+++ /dev/null
@@ -1,301 +0,0 @@
-<!DOCTYPE html>
-<html lang="en">
-  <head>
-    <meta charset="utf-8">
-    <meta http-equiv="X-UA-Compatible" content="IE=edge">
-    <meta name="viewport" content="width=device-width, initial-scale=1">
-    <meta name="description" content="">
-    <meta name="author" content="">
-    <link rel="icon" href="../../favicon.ico">
-
-    <title>Joshua Documentation | Installation</title>
-
-    <!-- Bootstrap core CSS -->
-    <link href="/dist/css/bootstrap.min.css" rel="stylesheet">
-
-    <!-- Custom styles for this template -->
-    <link href="/joshua6.css" rel="stylesheet">
-  </head>
-
-  <body>
-
-    <div class="blog-masthead">
-      <div class="container">
-        <nav class="blog-nav">
-          <!-- <a class="blog-nav-item active" href="#">Joshua</a> -->
-          <a class="blog-nav-item" href="/">Joshua</a>
-          <!-- <a class="blog-nav-item" href="/6.0/whats-new.html">New features</a> -->
-          <a class="blog-nav-item" href="/language-packs/">Language packs</a>
-          <a class="blog-nav-item" href="/data/">Datasets</a>
-          <a class="blog-nav-item" href="/support/">Support</a>
-          <a class="blog-nav-item" href="/contributors.html">Contributors</a>
-        </nav>
-      </div>
-    </div>
-
-    <div class="container">
-
-      <div class="row">
-
-        <div class="col-sm-2">
-          <div class="sidebar-module">
-            <!-- <h4>About</h4> -->
-            <center>
-            <img src="/images/joshua-logo-small.png" />
-            <p>Joshua machine translation toolkit</p>
-            </center>
-          </div>
-          <hr>
-          <center>
-            <a href="/releases/current/" target="_blank"><button class="button">Download Joshua 6.0.5</button></a>
-            <br />
-            <a href="/releases/runtime/" target="_blank"><button class="button">Runtime only version</button></a>
-            <p>Released November 5, 2015</p>
-          </center>
-          <hr>
-          <!-- <div class="sidebar-module"> -->
-          <!--   <span id="download"> -->
-          <!--     <a href="http://joshua-decoder.org/downloads/joshua-6.0.tgz">Download</a> -->
-          <!--   </span> -->
-          <!-- </div> -->
-          <div class="sidebar-module">
-            <h4>Using Joshua</h4>
-            <ol class="list-unstyled">
-              <li><a href="/6.0/install.html">Installation</a></li>
-              <li><a href="/6.0/quick-start.html">Quick Start</a></li>
-            </ol>
-          </div>
-          <hr>
-          <div class="sidebar-module">
-            <h4>Building new models</h4>
-            <ol class="list-unstyled">
-              <li><a href="/6.0/pipeline.html">Pipeline</a></li>
-              <li><a href="/6.0/tutorial.html">Tutorial</a></li>
-              <li><a href="/6.0/faq.html">FAQ</a></li>
-            </ol>
-          </div>
-<!--
-          <div class="sidebar-module">
-            <h4>Phrase-based</h4>
-            <ol class="list-unstyled">
-              <li><a href="/6.0/phrase.html">Training</a></li>
-            </ol>
-          </div>
--->
-          <hr>
-          <div class="sidebar-module">
-            <h4>Advanced</h4>
-            <ol class="list-unstyled">
-              <li><a href="/6.0/bundle.html">Building language packs</a></li>
-              <li><a href="/6.0/decoder.html">Decoder options</a></li>
-              <li><a href="/6.0/file-formats.html">File formats</a></li>
-              <li><a href="/6.0/packing.html">Packing TMs</a></li>
-              <li><a href="/6.0/large-lms.html">Building large LMs</a></li>
-            </ol>
-          </div>
-
-          <hr> 
-          <div class="sidebar-module">
-            <h4>Developer</h4>
-            <ol class="list-unstyled">              
-		<li><a href="https://github.com/joshua-decoder/joshua">Github</a></li>
-		<li><a href="http://cs.jhu.edu/~post/joshua-docs">Javadoc</a></li>
-		<li><a href="https://groups.google.com/forum/?fromgroups#!forum/joshua_developers">Mailing list</a></li>              
-            </ol>
-          </div>
-
-        </div><!-- /.blog-sidebar -->
-
-        
-        <div class="col-sm-8 blog-main">
-        
-
-          <div class="blog-title">
-            <h2>Installation</h2>
-          </div>
-          
-          <div class="blog-post">
-
-            <h3 id="download-and-install">Download and install</h3>
-
-<p>To use Joshua as a standalone decoder (with <a href="/language-packs/">language packs</a>), you only need to download and install the runtime version of the decoder. 
-If you also wish to build translation models from your own data, you will want to install the full version.
-See the instructions below.</p>
-
-<ol>
-  <li>
-    <p>Set up some basic environment variables. 
-You need to define <code class="highlighter-rouge">$JAVA_HOME</code>:</p>
-
-    <div class="highlighter-rouge"><pre class="highlight"><code>export JAVA_HOME=/path/to/java
-
-# JAVA_HOME is not very standardized. Here are some places to look:
-# OS X:  export JAVA_HOME=/Library/Java/JavaVirtualMachines/jdk1.7.0_71.jdk/Contents/Home
-# Linux: export JAVA_HOME=/usr/java/default
-</code></pre>
-    </div>
-  </li>
-  <li>
-    <p>If you are installing the full version of Joshua, you also need to define <code class="highlighter-rouge">$HADOOP</code> to point to your Hadoop installation.
-(Joshua looks for the Hadoop executable in <code class="highlighter-rouge">$HADOOP/bin/hadoop</code>.)</p>
-
-    <div class="highlighter-rouge"><pre class="highlight"><code>export HADOOP=/usr
-</code></pre>
-    </div>
-
-    <p>If you don’t have a Hadoop installation, <a href="pipeline.html">Joshua’s pipeline</a> can install a standalone version for you.</p>
-  </li>
-  <li>
-    <p>To install just the runtime version of Joshua, type:</p>
-
-    <div class="highlighter-rouge"><pre class="highlight"><code>wget -q http://cs.jhu.edu/~post/files/joshua-runtime-6.0.5.tgz
-</code></pre>
-    </div>
-
-    <p>Then unpack the archive and build everything:</p>
-
-    <div class="highlighter-rouge"><pre class="highlight"><code>tar xzf joshua-runtime-6.0.5.tgz
-cd joshua-runtime-6.0.5
-
-# Add this to your init files
-export JOSHUA=$(pwd)
-   
-# build everything
-ant
-</code></pre>
-    </div>
-  </li>
-  <li>
-    <p>To install the full version instead, type:</p>
-
-    <div class="highlighter-rouge"><pre class="highlight"><code>wget -q http://cs.jhu.edu/~post/files/joshua-6.0.5.tgz
-
-tar xzf joshua-6.0.5.tgz
-cd joshua-6.0.5
-
-# Add this to your init files
-export JOSHUA=$(pwd)
-   
-# build everything
-ant
-</code></pre>
-    </div>
-  </li>
-</ol>
-
-<h3 id="building-new-models">Building new models</h3>
-
-<p>If you wish to build models for new language pairs from existing data (such as the <a href="http://statmt.org/wmt14/">WMT data</a>), you need to install some additional dependencies.</p>
-
-<ol>
-  <li>
-    <p>For learning hierarchical models, Joshua includes a tool called <a href="thrax.html">Thrax</a>, which
-is built on Hadoop. If you have a Hadoop installation, make sure that the environment variable
-<code class="highlighter-rouge">$HADOOP</code> is set and points to it. If you don’t, Joshua will roll one out for you in standalone
-mode. Hadoop is only needed if you plan to build new models with Joshua.</p>
-  </li>
-  <li>
-    <p>You will need to install Moses if either of the following applies to you:</p>
-
-    <ul>
-      <li>
-        <p>You wish to build <a href="phrase.html">phrase-based models</a> (Joshua 6 includes a phrase-based
-decoder, but not the tools for building such a model)</p>
-      </li>
-      <li>
-        <p>You are building your own models (phrase- or syntax-based) and wish to use Cherry &amp; Foster’s
-<a href="http://aclweb.org/anthology-new/N/N12/N12-1047v2.pdf">batch MIRA tuner</a> instead of the included
-MERT implementation, <a href="zmert.html">Z-MERT</a>. </p>
-      </li>
-    </ul>
-
-    <p>Follow <a href="http://www.statmt.org/moses/?n=Development.GetStarted">the instructions for installing Moses
-here</a>, and then define the <code class="highlighter-rouge">$MOSES</code>
-environment variable to point to the root of the Moses installation.</p>
-  </li>
-</ol>
-
-<h3 id="more-information">More information</h3>
-
-<p>For more detail on the decoder itself, including its command-line options, see
-<a href="decoder.html">the Joshua decoder page</a>.  You can also learn more about other steps of
-<a href="pipeline.html">the Joshua MT pipeline</a>, including <a href="thrax.html">grammar extraction</a> with Thrax and
-Joshua’s <a href="packing.html">efficient grammar representation</a>.</p>
-
-<p>If you have problems or issues, you might find some help <a href="faq.html">on our answers page</a> or
-<a href="https://groups.google.com/forum/?fromgroups#!forum/joshua_support">in the mailing list archives</a>.</p>
-
-<p>You can also create a <a href="bundle.html">bundled configuration</a>, a minimal set of configuration, resource, and script files that is easy to transfer and share.</p>
-
-
-
-        </div>
-
-      </div><!-- /.row -->
-
-      
-        
-    </div><!-- /.container -->
-
-    <!-- Bootstrap core JavaScript
-    ================================================== -->
-    <!-- Placed at the end of the document so the pages load faster -->
-    <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
-    <script src="../../dist/js/bootstrap.min.js"></script>
-    <!-- <script src="../../assets/js/docs.min.js"></script> -->
-    <!-- IE10 viewport hack for Surface/desktop Windows 8 bug -->
-    <!-- <script src="../../assets/js/ie10-viewport-bug-workaround.js"></script>
-    -->
-
-    <!-- Start of StatCounter Code for Default Guide -->
-    <script type="text/javascript">
-      var sc_project=8264132; 
-      var sc_invisible=1; 
-      var sc_security="4b97fe2d"; 
-    </script>
-    <script type="text/javascript" src="http://www.statcounter.com/counter/counter.js"></script>
-    <noscript>
-      <div class="statcounter">
-        <a title="hit counter joomla" 
-           href="http://statcounter.com/joomla/"
-           target="_blank">
-          <img class="statcounter"
-               src="http://c.statcounter.com/8264132/0/4b97fe2d/1/"
-               alt="hit counter joomla" />
-        </a>
-      </div>
-    </noscript>
-    <!-- End of StatCounter Code for Default Guide -->
-  </body>
-</html>
-
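Note on 6/install.html above: the installation steps depend on a few environment variables. As a rough sketch only (this is not part of Joshua), the Python helper below reports which of them look unset or misconfigured. The function name `check_joshua_env`, its parameters, and its messages are hypothetical; only the variable names `JAVA_HOME`, `JOSHUA`, and `HADOOP` and the `$HADOOP/bin/hadoop` location are taken from the page.

```python
import os


def check_joshua_env(env=None, full_install=False, exists=os.path.exists):
    """Return a list of human-readable problems with the environment.

    Hypothetical helper, not shipped with Joshua. `env` defaults to the
    real process environment; `exists` can be swapped out for testing.
    """
    env = os.environ if env is None else env
    problems = []

    # The page asks for JAVA_HOME and JOSHUA for both install flavours.
    for name in ("JAVA_HOME", "JOSHUA"):
        if not env.get(name):
            problems.append(f"{name} is not set")

    if full_install:
        hadoop = env.get("HADOOP")
        if not hadoop:
            # Not fatal: the page says the pipeline can install a standalone copy.
            problems.append("HADOOP is not set (the pipeline can install a standalone copy)")
        elif not exists(os.path.join(hadoop, "bin", "hadoop")):
            # Joshua looks for the executable in $HADOOP/bin/hadoop.
            problems.append(f"no Hadoop executable at {hadoop}/bin/hadoop")

    return problems
```

Calling `check_joshua_env(full_install=True)` with no other arguments inspects the current shell environment; an empty list means none of the checks above failed. Passing `env` and `exists` explicitly keeps the check testable without touching the real file system.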

http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/blob/22be73ab/6/jacana.html
----------------------------------------------------------------------
diff --git a/6/jacana.html b/6/jacana.html
deleted file mode 100644
index b8f5a79..0000000
--- a/6/jacana.html
+++ /dev/null
@@ -1,331 +0,0 @@
-<!DOCTYPE html>
-<html lang="en">
-  <head>
-    <meta charset="utf-8">
-    <meta http-equiv="X-UA-Compatible" content="IE=edge">
-    <meta name="viewport" content="width=device-width, initial-scale=1">
-    <meta name="description" content="">
-    <meta name="author" content="">
-    <link rel="icon" href="../../favicon.ico">
-
-    <title>Joshua Documentation | Alignment with Jacana</title>
-
-    <!-- Bootstrap core CSS -->
-    <link href="/dist/css/bootstrap.min.css" rel="stylesheet">
-
-    <!-- Custom styles for this template -->
-    <link href="/joshua6.css" rel="stylesheet">
-  </head>
-
-  <body>
-
-    <div class="blog-masthead">
-      <div class="container">
-        <nav class="blog-nav">
-          <!-- <a class="blog-nav-item active" href="#">Joshua</a> -->
-          <a class="blog-nav-item" href="/">Joshua</a>
-          <!-- <a class="blog-nav-item" href="/6.0/whats-new.html">New features</a> -->
-          <a class="blog-nav-item" href="/language-packs/">Language packs</a>
-          <a class="blog-nav-item" href="/data/">Datasets</a>
-          <a class="blog-nav-item" href="/support/">Support</a>
-          <a class="blog-nav-item" href="/contributors.html">Contributors</a>
-        </nav>
-      </div>
-    </div>
-
-    <div class="container">
-
-      <div class="row">
-
-        <div class="col-sm-2">
-          <div class="sidebar-module">
-            <!-- <h4>About</h4> -->
-            <center>
-            <img src="/images/joshua-logo-small.png" />
-            <p>Joshua machine translation toolkit</p>
-            </center>
-          </div>
-          <hr>
-          <center>
-            <a href="/releases/current/" target="_blank"><button class="button">Download Joshua 6.0.5</button></a>
-            <br />
-            <a href="/releases/runtime/" target="_blank"><button class="button">Runtime only version</button></a>
-            <p>Released November 5, 2015</p>
-          </center>
-          <hr>
-          <!-- <div class="sidebar-module"> -->
-          <!--   <span id="download"> -->
-          <!--     <a href="http://joshua-decoder.org/downloads/joshua-6.0.tgz">Download</a> -->
-          <!--   </span> -->
-          <!-- </div> -->
-          <div class="sidebar-module">
-            <h4>Using Joshua</h4>
-            <ol class="list-unstyled">
-              <li><a href="/6.0/install.html">Installation</a></li>
-              <li><a href="/6.0/quick-start.html">Quick Start</a></li>
-            </ol>
-          </div>
-          <hr>
-          <div class="sidebar-module">
-            <h4>Building new models</h4>
-            <ol class="list-unstyled">
-              <li><a href="/6.0/pipeline.html">Pipeline</a></li>
-              <li><a href="/6.0/tutorial.html">Tutorial</a></li>
-              <li><a href="/6.0/faq.html">FAQ</a></li>
-            </ol>
-          </div>
-<!--
-          <div class="sidebar-module">
-            <h4>Phrase-based</h4>
-            <ol class="list-unstyled">
-              <li><a href="/6.0/phrase.html">Training</a></li>
-            </ol>
-          </div>
--->
-          <hr>
-          <div class="sidebar-module">
-            <h4>Advanced</h4>
-            <ol class="list-unstyled">
-              <li><a href="/6.0/bundle.html">Building language packs</a></li>
-              <li><a href="/6.0/decoder.html">Decoder options</a></li>
-              <li><a href="/6.0/file-formats.html">File formats</a></li>
-              <li><a href="/6.0/packing.html">Packing TMs</a></li>
-              <li><a href="/6.0/large-lms.html">Building large LMs</a></li>
-            </ol>
-          </div>
-
-          <hr> 
-          <div class="sidebar-module">
-            <h4>Developer</h4>
-            <ol class="list-unstyled">              
-		<li><a href="https://github.com/joshua-decoder/joshua">Github</a></li>
-		<li><a href="http://cs.jhu.edu/~post/joshua-docs">Javadoc</a></li>
-		<li><a href="https://groups.google.com/forum/?fromgroups#!forum/joshua_developers">Mailing list</a></li>              
-            </ol>
-          </div>
-
-        </div><!-- /.blog-sidebar -->
-
-        
-        <div class="col-sm-8 blog-main">
-        
-
-          <div class="blog-title">
-            <h2>Alignment with Jacana</h2>
-          </div>
-          
-          <div class="blog-post">
-
-            <h2 id="introduction">Introduction</h2>
-
-<p>jacana-xy is a token-based word aligner for machine translation, adapted from the original
-English-English word aligner jacana-align described in the following paper:</p>
-
-<div class="highlighter-rouge"><pre class="highlight"><code>A Lightweight and High Performance Monolingual Word Aligner. Xuchen Yao, Benjamin Van Durme,
-Chris Callison-Burch and Peter Clark. Proceedings of ACL 2013, short papers.
-</code></pre>
-</div>
-
-<p>It currently supports only French-to-English alignment, with a very limited feature set, and is the
-result of a one-week hack at the <a href="http://statmt.org/mtm13">Eighth MT Marathon 2013</a>. Please feel free to check
-out the code, read to the bottom of this page, and
-<a href="http://www.cs.jhu.edu/~xuchen/">send the author an email</a> if you want to add more language pairs to
-it.</p>
-
-<h2 id="build">Build</h2>
-
-<p>jacana-xy is written in a mixture of Java and Scala. If you build with ant, you have to set the
-environment variables <code class="highlighter-rouge">JAVA_HOME</code> and <code class="highlighter-rouge">SCALA_HOME</code>. On my system, I have:</p>
-
-<div class="highlighter-rouge"><pre class="highlight"><code>export JAVA_HOME=/usr/lib/jvm/java-6-sun-1.6.0.26
-export SCALA_HOME=/home/xuchen/Downloads/scala-2.10.2
-</code></pre>
-</div>
-
-<p>Then type:</p>
-
-<div class="highlighter-rouge"><pre class="highlight"><code>ant
-</code></pre>
-</div>
-
-<p>This builds <code class="highlighter-rouge">build/lib/jacana-xy.jar</code>.</p>
-
-<p>If you build from Eclipse, first install scala-ide, then import the whole jacana folder as a Scala project. Eclipse should find the .project file and set up the project automatically for you.</p>
-
-<h2 id="demo">Demo</h2>
-
-<p><code class="highlighter-rouge">scripts-align/runDemoServer.sh</code> starts the web demo. Point your browser to http://localhost:8080/ and you should be able to align some sentences.</p>
-
-<p>Note: so that jacana-xy knows where to look for resource files, pass the <code class="highlighter-rouge">JACANA_HOME</code> property to Java when you run it:</p>
-
-<div class="highlighter-rouge"><pre class="highlight"><code>java -DJACANA_HOME=/path/to/jacana -cp jacana-xy.jar ……
-</code></pre>
-</div>
-
-<h2 id="browser">Browser</h2>
-
-<p>You can also browse one or two alignment files (<code class="highlighter-rouge">*.json</code>) by opening <code class="highlighter-rouge">src/web/AlignmentBrowser.html</code> in Firefox.</p>
-
-<p>Note 1: because of their strict security settings for accessing local files, Chrome and IE won’t work.</p>
-
-<p>Note 2: the input <code class="highlighter-rouge">*.json</code> files must be in the same folder as AlignmentBrowser.html.</p>
-
-<h2 id="align">Align</h2>
-
-<p><code class="highlighter-rouge">scripts-align/alignFile.sh</code> aligns tab-separated sentence files and writes the result to a .json file that the browser accepts:</p>
-
-<div class="highlighter-rouge"><pre class="highlight"><code>java -DJACANA_HOME=../ -jar ../build/lib/jacana-xy.jar -src fr -tgt en -m fr-en.model -a s.txt -o s.json
-</code></pre>
-</div>
-
-<p><code class="highlighter-rouge">scripts-align/alignFile.sh</code> takes GIZA++-style input files (one file containing the source sentences and another containing the target sentences) and writes a single .align file with dashed alignment indices (e.g. “1-2 0-4”):</p>
-
-<div class="highlighter-rouge"><pre class="highlight"><code>java -DJACANA_HOME=../ -jar ../build/lib/jacana-xy.jar -m fr-en.model -src fr -tgt en -a s1.txt -b s2.txt -o s.align
-</code></pre>
-</div>
-
-<h2 id="training">Training</h2>
-
-<div class="highlighter-rouge"><pre class="highlight"><code>java -DJACANA_HOME=../ -jar ../build/lib/jacana-xy.jar -r train.json -d dev.json -t test.json -m /tmp/align.model
-</code></pre>
-</div>
-
-<p>The aligner trains on train.json and reports F1 values on dev.json every 10 iterations. When the stopping criterion is reached, it tests on test.json.</p>
-
-<p>Every 10 iterations, a model file is saved to (in this example) /tmp/align.model.iter_XX.F1_XX.X. Normally I select the one with the best F1 on dev.json, then run a final test on test.json:</p>
-
-<div class="highlighter-rouge"><pre class="highlight"><code>java -DJACANA_HOME=../ -jar ../build/lib/jacana-xy.jar -t test.json -m /tmp/align.model.iter_XX.F1_XX.X
-</code></pre>
-</div>
-
-<p>In this case, since no training data is given, the aligner assumes it’s a test job, reads the model file from the -m option as before, and tests on test.json.</p>
-
-<p>All the json files are in a format like the following (also accepted by the browser for display):</p>
-
-<div class="highlighter-rouge"><pre class="highlight"><code>[
-    {
-        "id": "0008",
-        "name": "Hansards.french-english.0008",
-        "possibleAlign": "0-0 0-1 0-2",
-        "source": "bravo !",
-        "sureAlign": "1-3",
-        "target": "hear , hear !"
-    },
-    {
-        "id": "0009",
-        "name": "Hansards.french-english.0009",
-        "possibleAlign": "1-1 6-5 7-5 6-6 7-6 13-10 13-11",
-        "source": "monsieur le Orateur , ma question se adresse à le ministre chargé de les transports .",
-        "sureAlign": "0-0 2-1 3-2 4-3 5-4 8-7 9-8 10-9 12-10 14-11 15-12",
-        "target": "Mr. Speaker , my question is directed to the Minister of Transport ."
-    }
-]
-</code></pre>
-</div>
-
-<p>The <code class="highlighter-rouge">possibleAlign</code> field is not used.</p>
-
-<p>Training stops after 300 iterations or when the objective changes by less than 0.001 between two iterations, whichever happens first. Both values are currently hard-coded. If you need them to be configurable, send me an email!</p>
-
-<h2 id="support-more-languages">Support More Languages</h2>
-
-<p>To add support for more languages, you need:</p>
-
-<ul>
-  <li>labelled word alignments (the download already includes French-English under alignment-data/fr-en; I also have Chinese-English and Arabic-English; let me know if you have more). Usually 100 labelled sentence pairs are enough.</li>
-  <li>some feature functions implemented for this language pair.</li>
-</ul>
-
-<p>To add more features, you need to implement the following interface:</p>
-
-<p><code class="highlighter-rouge">edu.jhu.jacana.align.feature.AlignFeature</code></p>
-
-<p>and override the following function:</p>
-
-<p><code class="highlighter-rouge">addPhraseBasedFeature</code></p>
-
-<p>For instance, a simple feature that checks whether two words are translations of each other in Wiktionary, for the French-English alignment task, implements the function as:</p>
-
-<div class="highlighter-rouge"><pre class="highlight"><code>def addPhraseBasedFeature(pair: AlignPair, ins: AlignFeatureVector, i: Int, srcSpan: Int, j: Int, tgtSpan: Int,
-      currState: Int, featureAlphabet: Alphabet) {
-  if (j == -1) {
-    // no feature is added in this case
-  } else {
-    // join the tokens covered by the source and target spans
-    val srcTokens = pair.srcTokens.slice(i, i+srcSpan).mkString(" ")
-    val tgtTokens = pair.tgtTokens.slice(j, j+tgtSpan).mkString(" ")
-
-    if (WiktionaryMultilingual.exists(srcTokens, tgtTokens)) {
-      ins.addFeature("InWiktionary", NONE_STATE, currState, 1.0, srcSpan, featureAlphabet)
-    }
-  }
-}
-</code></pre>
-</div>
-
-<p>This is a more general function that also deals with phrase alignment. However, it is best to implement it for token alignment only, because the phrase alignment part is currently very slow to train (60x slower than token alignment).</p>
-
-<p>Some other language-independent and English-only features are implemented under the package edu.jhu.jacana.align.feature, for instance:</p>
-
-<p>StringSimilarityAlignFeature: various string similarity measures</p>
-
-<p>PositionalAlignFeature: features based on relative sentence positions</p>
-
-<p>DistortionAlignFeature: Markovian (state transition) features</p>
-
-<p>When you add features for more languages, just create a new package like the one for French-English:</p>
-
-<p>edu.jhu.jacana.align.feature.fr_en</p>
-
-<p>and start coding!</p>
-
-
-
-
-        </div>
-
-      </div><!-- /.row -->
-
-      
-        
-    </div><!-- /.container -->
-
-    <!-- Bootstrap core JavaScript
-    ================================================== -->
-    <!-- Placed at the end of the document so the pages load faster -->
-    <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
-    <script src="../../dist/js/bootstrap.min.js"></script>
-    <!-- <script src="../../assets/js/docs.min.js"></script> -->
-    <!-- IE10 viewport hack for Surface/desktop Windows 8 bug -->
-    <!-- <script src="../../assets/js/ie10-viewport-bug-workaround.js"></script>
-    -->
-
-    <!-- Start of StatCounter Code for Default Guide -->
-    <script type="text/javascript">
-      var sc_project=8264132; 
-      var sc_invisible=1; 
-      var sc_security="4b97fe2d"; 
-    </script>
-    <script type="text/javascript" src="http://www.statcounter.com/counter/counter.js"></script>
-    <noscript>
-      <div class="statcounter">
-        <a title="hit counter joomla" 
-           href="http://statcounter.com/joomla/"
-           target="_blank">
-          <img class="statcounter"
-               src="http://c.statcounter.com/8264132/0/4b97fe2d/1/"
-               alt="hit counter joomla" />
-        </a>
-      </div>
-    </noscript>
-    <!-- End of StatCounter Code for Default Guide -->
-  </body>
-</html>
-
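Note on 6/jacana.html above: the page describes a JSON record format whose `sureAlign` field holds dashed index pairs. A minimal Python sketch of reading such records follows. The function names `parse_alignment` and `load_records` are hypothetical and not part of jacana-xy. The sketch also assumes the indices are zero-based and in source-target order, which is inferred from the sample records rather than stated on the page.

```python
import json


def parse_alignment(text):
    """Turn a dashed alignment string such as "1-2 0-4" into (src, tgt) index pairs."""
    pairs = []
    for item in text.split():
        src, tgt = item.split("-")
        pairs.append((int(src), int(tgt)))
    return pairs


def load_records(json_text):
    """Parse a jacana-style JSON array into token lists plus sure alignments."""
    records = []
    for entry in json.loads(json_text):
        records.append({
            "id": entry["id"],
            # Sentences are already tokenized and separated by spaces.
            "source": entry["source"].split(),
            "target": entry["target"].split(),
            # possibleAlign is ignored, as the page says it is not used.
            "sure": parse_alignment(entry.get("sureAlign", "")),
        })
    return records


# First sample record from the page.
SAMPLE = """[
  {"id": "0008", "name": "Hansards.french-english.0008",
   "possibleAlign": "0-0 0-1 0-2", "source": "bravo !",
   "sureAlign": "1-3", "target": "hear , hear !"}
]"""

records = load_records(SAMPLE)
for src, tgt in records[0]["sure"]:
    # Print each aligned source/target token pair.
    print(records[0]["source"][src], "<->", records[0]["target"][tgt])
```

The same `parse_alignment` helper should also work on the lines of a `.align` file, since the page describes that format as the same dashed indices (e.g. "1-2 0-4").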

http://git-wip-us.apache.org/repos/asf/incubator-joshua-site/blob/22be73ab/6/large-lms.html
----------------------------------------------------------------------
diff --git a/6/large-lms.html b/6/large-lms.html
deleted file mode 100644
index edf4878..0000000
--- a/6/large-lms.html
+++ /dev/null
@@ -1,390 +0,0 @@
-<!DOCTYPE html>
-<html lang="en">
-  <head>
-    <meta charset="utf-8">
-    <meta http-equiv="X-UA-Compatible" content="IE=edge">
-    <meta name="viewport" content="width=device-width, initial-scale=1">
-    <meta name="description" content="">
-    <meta name="author" content="">
-    <link rel="icon" href="../../favicon.ico">
-
-    <title>Joshua Documentation | Building large LMs with SRILM</title>
-
-    <!-- Bootstrap core CSS -->
-    <link href="/dist/css/bootstrap.min.css" rel="stylesheet">
-
-    <!-- Custom styles for this template -->
-    <link href="/joshua6.css" rel="stylesheet">
-  </head>
-
-  <body>
-
-    <div class="blog-masthead">
-      <div class="container">
-        <nav class="blog-nav">
-          <!-- <a class="blog-nav-item active" href="#">Joshua</a> -->
-          <a class="blog-nav-item" href="/">Joshua</a>
-          <!-- <a class="blog-nav-item" href="/6.0/whats-new.html">New features</a> -->
-          <a class="blog-nav-item" href="/language-packs/">Language packs</a>
-          <a class="blog-nav-item" href="/data/">Datasets</a>
-          <a class="blog-nav-item" href="/support/">Support</a>
-          <a class="blog-nav-item" href="/contributors.html">Contributors</a>
-        </nav>
-      </div>
-    </div>
-
-    <div class="container">
-
-      <div class="row">
-
-        <div class="col-sm-2">
-          <div class="sidebar-module">
-            <!-- <h4>About</h4> -->
-            <center>
-            <img src="/images/joshua-logo-small.png" />
-            <p>Joshua machine translation toolkit</p>
-            </center>
-          </div>
-          <hr>
-          <center>
-            <a href="/releases/current/" target="_blank"><button class="button">Download Joshua 6.0.5</button></a>
-            <br />
-            <a href="/releases/runtime/" target="_blank"><button class="button">Runtime only version</button></a>
-            <p>Released November 5, 2015</p>
-          </center>
-          <hr>
-          <!-- <div class="sidebar-module"> -->
-          <!--   <span id="download"> -->
-          <!--     <a href="http://joshua-decoder.org/downloads/joshua-6.0.tgz">Download</a> -->
-          <!--   </span> -->
-          <!-- </div> -->
-          <div class="sidebar-module">
-            <h4>Using Joshua</h4>
-            <ol class="list-unstyled">
-              <li><a href="/6.0/install.html">Installation</a></li>
-              <li><a href="/6.0/quick-start.html">Quick Start</a></li>
-            </ol>
-          </div>
-          <hr>
-          <div class="sidebar-module">
-            <h4>Building new models</h4>
-            <ol class="list-unstyled">
-              <li><a href="/6.0/pipeline.html">Pipeline</a></li>
-              <li><a href="/6.0/tutorial.html">Tutorial</a></li>
-              <li><a href="/6.0/faq.html">FAQ</a></li>
-            </ol>
-          </div>
-<!--
-          <div class="sidebar-module">
-            <h4>Phrase-based</h4>
-            <ol class="list-unstyled">
-              <li><a href="/6.0/phrase.html">Training</a></li>
-            </ol>
-          </div>
--->
-          <hr>
-          <div class="sidebar-module">
-            <h4>Advanced</h4>
-            <ol class="list-unstyled">
-              <li><a href="/6.0/bundle.html">Building language packs</a></li>
-              <li><a href="/6.0/decoder.html">Decoder options</a></li>
-              <li><a href="/6.0/file-formats.html">File formats</a></li>
-              <li><a href="/6.0/packing.html">Packing TMs</a></li>
-              <li><a href="/6.0/large-lms.html">Building large LMs</a></li>
-            </ol>
-          </div>
-
-          <hr> 
-          <div class="sidebar-module">
-            <h4>Developer</h4>
-            <ol class="list-unstyled">              
-		<li><a href="https://github.com/joshua-decoder/joshua">Github</a></li>
-		<li><a href="http://cs.jhu.edu/~post/joshua-docs">Javadoc</a></li>
-		<li><a href="https://groups.google.com/forum/?fromgroups#!forum/joshua_developers">Mailing list</a></li>              
-            </ol>
-          </div>
-
-        </div><!-- /.blog-sidebar -->
-
-        
-        <div class="col-sm-8 blog-main">
-        
-
-          <div class="blog-title">
-            <h2>Building large LMs with SRILM</h2>
-          </div>
-          
-          <div class="blog-post">
-
-            <p>The following is a tutorial for building a large language model from the
-English Gigaword Fifth Edition corpus
-<a href="http://www.ldc.upenn.edu/Catalog/catalogEntry.jsp?catalogId=LDC2011T07">LDC2011T07</a>
-using SRILM. English text is provided from seven different sources.</p>
-
-<h3 id="step-0-clean-up-the-corpus">Step 0: Clean up the corpus</h3>
-
-<p>The Gigaword corpus has to be stripped of all SGML tags and tokenized.
-Instructions for performing those steps are not included in this
-documentation. A description of this process can be found in a paper
-called <a href="https://akbcwekex2012.files.wordpress.com/2012/05/28_paper.pdf">“Annotated
-Gigaword”</a>.</p>
-
-<p>The Joshua package ships with a script that converts all alphabetical
-characters to their lowercase equivalent. The script is located at
-<code class="highlighter-rouge">$JOSHUA/scripts/lowercase.perl</code>.</p>
-
-<p>Make a directory structure as follows:</p>
-
-<div class="highlighter-rouge"><pre class="highlight"><code>gigaword/
-├── corpus/
-│   ├── afp_eng/
-│   │   ├── afp_eng_199405.lc.gz
-│   │   ├── afp_eng_199406.lc.gz
-│   │   ├── ...
-│   │   └── counts/
-│   ├── apw_eng/
-│   │   ├── apw_eng_199411.lc.gz
-│   │   ├── apw_eng_199412.lc.gz
-│   │   ├── ...
-│   │   └── counts/
-│   ├── cna_eng/
-│   │   ├── ...
-│   │   └── counts/
-│   ├── ltw_eng/
-│   │   ├── ...
-│   │   └── counts/
-│   ├── nyt_eng/
-│   │   ├── ...
-│   │   └── counts/
-│   ├── wpb_eng/
-│   │   ├── ...
-│   │   └── counts/
-│   └── xin_eng/
-│       ├── ...
-│       └── counts/
-└── lm/
-    ├── afp_eng/
-    ├── apw_eng/
-    ├── cna_eng/
-    ├── ltw_eng/
-    ├── nyt_eng/
-    ├── wpb_eng/
-    └── xin_eng/
-</code></pre>
-</div>
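-
-<p>As a minimal sketch, the empty skeleton shown above could be created with a short shell loop. The <code class="highlighter-rouge">GIGAWORD_ROOT</code> variable is a hypothetical name introduced here for illustration; the source directory names are taken from the tree above. The lowercased <code class="highlighter-rouge">.lc.gz</code> files still have to be copied into each source directory separately.</p>

```shell
#!/bin/sh
# Hypothetical root directory; override GIGAWORD_ROOT to place the tree elsewhere.
root=${GIGAWORD_ROOT:-gigaword}

# One corpus directory (with a counts/ subdirectory) and one lm directory per source.
for src in afp_eng apw_eng cna_eng ltw_eng nyt_eng wpb_eng xin_eng; do
  mkdir -p "$root/corpus/$src/counts" "$root/lm/$src"
done
```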
-
-<p>The next step will be to build smaller LMs and then interpolate them into one
-file.</p>
-
-<h3 id="step-1-count-ngrams">Step 1: Count ngrams</h3>
-
-<p>Run the following script once from each source directory under the <code class="highlighter-rouge">corpus/</code>
-directory (edit it to specify the path to the <code class="highlighter-rouge">ngram-count</code> binary as well as
-the number of processors):</p>
-
-<div class="highlighter-rouge"><pre class="highlight"><code><span class="c">#!/bin/sh</span>
-
-<span class="nv">NGRAM_COUNT</span><span class="o">=</span><span class="nv">$SRILM_SRC</span>/bin/i686-m64/ngram-count
-<span class="nv">args</span><span class="o">=</span><span class="s2">""</span>
-
-<span class="k">for </span><span class="nb">source </span><span class="k">in</span> <span class="k">*</span>.gz; <span class="k">do
-   </span><span class="nv">args</span><span class="o">=</span><span class="nv">$args</span><span class="s2">"-sort -order 5 -text </span><span class="nv">$source</span><span class="s2"> -write counts/</span><span class="nv">$source</span><span class="s2">-counts.gz "</span>
-<span class="k">done
-
-</span><span class="nb">echo</span> <span class="nv">$args</span> | xargs --max-procs<span class="o">=</span>4 -n 7 <span class="nv">$NGRAM_COUNT</span>
-</code></pre>
-</div>
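-
-<p>The <code class="highlighter-rouge">-n 7</code> option matters because each input file contributes exactly seven arguments, so <code class="highlighter-rouge">xargs</code> starts one <code class="highlighter-rouge">ngram-count</code> process per file. The sketch below is a dry run that illustrates the grouping: it substitutes <code class="highlighter-rouge">echo</code> for the real binary and uses two made-up file names (<code class="highlighter-rouge">a.gz</code> and <code class="highlighter-rouge">b.gz</code>), so nothing is actually counted.</p>

```shell
#!/bin/sh
# Dry run: build the same argument string as the counting script,
# using two hypothetical file names instead of globbing *.gz.
args=""
for f in a.gz b.gz; do
  args=$args"-sort -order 5 -text $f -write counts/$f-counts.gz "
done

# echo stands in for ngram-count; each output line is one would-be invocation.
out=$(echo $args | xargs -n 7 echo ngram-count)
echo "$out"
```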
-
-<p>Then move each <code class="highlighter-rouge">counts/</code> directory to the corresponding directory under
-<code class="highlighter-rouge">lm/</code>. Now that each ngram has been counted, we can make a language
-model for each of the seven sources.</p>
-
-<h3 id="step-2-make-individual-language-models">Step 2: Make individual language models</h3>
-
-<p>SRILM includes a script, called <code class="highlighter-rouge">make-big-lm</code>, for building large language
-models in resource-limited environments. The manual for this script can be
-read online
-<a href="http://www-speech.sri.com/projects/srilm/manpages/training-scripts.1.html">here</a>.
-Since the Gigaword corpus is so large, it is convenient to use <code class="highlighter-rouge">make-big-lm</code>
-even in environments with many parallel processors and a lot of memory.</p>
-
-<p>Initiate the following script from each of the source directories under the
-<code class="highlighter-rouge">lm/</code> directory (edit it to specify the path to the <code class="highlighter-rouge">make-big-lm</code> script as
-well as the pruning threshold):</p>
-
-<div class="highlighter-rouge"><pre class="highlight"><code><span class="c">#!/bin/bash</span>
-<span class="nb">set</span> -x
-
-<span class="nv">CMD</span><span class="o">=</span><span class="nv">$SRILM_SRC</span>/bin/make-big-lm
-<span class="nv">PRUNE_THRESHOLD</span><span class="o">=</span>1e-8
-
-<span class="nv">$CMD</span> <span class="se">\</span>
-  -name gigalm <span class="sb">`</span><span class="k">for </span>k <span class="k">in </span>counts/<span class="k">*</span>.gz; <span class="k">do </span><span class="nb">echo</span> <span class="s2">" </span><span class="se">\</span><span class="s2">
-  -read </span><span class="nv">$k</span><span class="s2"> "</span>; <span class="k">done</span><span class="sb">`</span> <span class="se">\</span>
-  -lm lm.gz <span class="se">\</span>
-  -max-per-file 100000000 <span class="se">\</span>
-  -order 5 <span class="se">\</span>
-  -kndiscount <span class="se">\</span>
-  -interpolate <span class="se">\</span>
-  -unk <span class="se">\</span>
-  -prune <span class="nv">$PRUNE_THRESHOLD</span>
-</code></pre>
-</div>
-
-<p>The language model attributes chosen are the following:</p>
-
-<ul>
-  <li>N-grams up to order 5</li>
-  <li>Kneser-Ney smoothing</li>
-  <li>N-gram probability estimates at the specified order <em>n</em> are interpolated with
-lower-order estimates</li>
-  <li>The unknown-word token is included as a regular word</li>
-  <li>N-grams are pruned based on the specified threshold</li>
-</ul>
-
-<p>Next, we will mix the models together into a single file.</p>
-
-<h3 id="step-3-mix-models-together">Step 3: Mix models together</h3>
-
-<p>Using development text, interpolation weights can be determined that give the
-highest weight to the source language models with the lowest perplexity on the
-specified development set.</p>
-
-<h4 id="step-3-1-determine-interpolation-weights">Step 3-1: Determine interpolation weights</h4>
-
-<p>Initiate the following script from the <code class="highlighter-rouge">lm/</code> directory (edit it to specify the
-path to the <code class="highlighter-rouge">ngram</code> binary as well as the path to the development text file):</p>
-
-<div class="highlighter-rouge"><pre class="highlight"><code><span class="c">#!/bin/bash</span>
-<span class="nb">set</span> -x
-
-<span class="nv">NGRAM</span><span class="o">=</span><span class="nv">$SRILM_SRC</span>/bin/i686-m64/ngram
-<span class="nv">DEV_TEXT</span><span class="o">=</span>~mpost/expts/wmt12/runs/es-en/data/tune/tune.tok.lc.en
-
-<span class="nb">dirs</span><span class="o">=(</span> afp_eng apw_eng cna_eng ltw_eng nyt_eng wpb_eng xin_eng <span class="o">)</span>
-
-<span class="k">for </span>d <span class="k">in</span> <span class="k">${</span><span class="nv">dirs</span><span class="p">[@]</span><span class="k">}</span> ; <span class="k">do</span>
-  <span class="nv">$NGRAM</span> -debug 2 -order 5 -unk -lm <span class="nv">$d</span>/lm.gz -ppl <span class="nv">$DEV_TEXT</span> &gt; <span class="nv">$d</span>/lm.ppl ;
-<span class="k">done
-
-</span>compute-best-mix <span class="k">*</span>/lm.ppl &gt; best-mix.ppl
-</code></pre>
-</div>
-
-<p>Take a look at the contents of <code class="highlighter-rouge">best-mix.ppl</code>. It will contain a sequence of
-values in parentheses. These are the interpolation weights of the source
-language models, in the order in which the models were specified. Copy and
-paste the values within the parentheses into the script below.</p>
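-
-<p>Instead of copying by hand, the weights could be pulled out with <code class="highlighter-rouge">sed</code>. The sketch below assumes the weights appear on one line as a single space-separated list inside one pair of parentheses; the <code class="highlighter-rouge">sample</code> line is hypothetical and only stands in for the relevant line of <code class="highlighter-rouge">best-mix.ppl</code>, so check it against the real file before relying on it.</p>

```shell
#!/bin/sh
# Hypothetical stand-in for the line of best-mix.ppl that holds the weights.
sample='100 non-oov words, best lambda (0.5 0.3 0.2)'

# Keep only the text between the parentheses.
weights=$(printf '%s\n' "$sample" | sed -n 's/.*(\(.*\)).*/\1/p')
echo "$weights"
```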
-
-<h4 id="step-3-2-combine-the-models">Step 3-2: Combine the models</h4>
-
-<p>Initiate the following script from the <code class="highlighter-rouge">lm/</code> directory (edit it to specify the
-path to the <code class="highlighter-rouge">ngram</code> binary as well as the interpolation weights):</p>
-
-<div class="highlighter-rouge"><pre class="highlight"><code><span class="c">#!/bin/bash</span>
-<span class="nb">set</span> -x
-
-<span class="nv">NGRAM</span><span class="o">=</span><span class="nv">$SRILM_SRC</span>/bin/i686-m64/ngram
-<span class="nv">DIRS</span><span class="o">=(</span>   afp_eng    apw_eng     cna_eng  ltw_eng   nyt_eng  wpb_eng  xin_eng <span class="o">)</span>
-<span class="nv">LAMBDAS</span><span class="o">=(</span>0.00631272 0.000647602 0.251555 0.0134726 0.348953 0.371566 0.00749238<span class="o">)</span>
-
-<span class="nv">$NGRAM</span> -order 5 -unk <span class="se">\</span>
-  -lm      <span class="k">${</span><span class="nv">DIRS</span><span class="p">[0]</span><span class="k">}</span>/lm.gz     -lambda  <span class="k">${</span><span class="nv">LAMBDAS</span><span class="p">[0]</span><span class="k">}</span> <span class="se">\</span>
-  -mix-lm  <span class="k">${</span><span class="nv">DIRS</span><span class="p">[1]</span><span class="k">}</span>/lm.gz <span class="se">\</span>
-  -mix-lm2 <span class="k">${</span><span class="nv">DIRS</span><span class="p">[2]</span><span class="k">}</span>/lm.gz -mix-lambda2 <span class="k">${</span><span class="nv">LAMBDAS</span><span class="p">[2]</span><span class="k">}</span> <span class="se">\</span>
-  -mix-lm3 <span class="k">${</span><span class="nv">DIRS</span><span class="p">[3]</span><span class="k">}</span>/lm.gz -mix-lambda3 <span class="k">${</span><span class="nv">LAMBDAS</span><span class="p">[3]</span><span class="k">}</span> <span class="se">\</span>
-  -mix-lm4 <span class="k">${</span><span class="nv">DIRS</span><span class="p">[4]</span><span class="k">}</span>/lm.gz -mix-lambda4 <span class="k">${</span><span class="nv">LAMBDAS</span><span class="p">[4]</span><span class="k">}</span> <span class="se">\</span>
-  -mix-lm5 <span class="k">${</span><span class="nv">DIRS</span><span class="p">[5]</span><span class="k">}</span>/lm.gz -mix-lambda5 <span class="k">${</span><span class="nv">LAMBDAS</span><span class="p">[5]</span><span class="k">}</span> <span class="se">\</span>
-  -mix-lm6 <span class="k">${</span><span class="nv">DIRS</span><span class="p">[6]</span><span class="k">}</span>/lm.gz -mix-lambda6 <span class="k">${</span><span class="nv">LAMBDAS</span><span class="p">[6]</span><span class="k">}</span> <span class="se">\</span>
-  -write-lm mixed_lm.gz
-</code></pre>
-</div>
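-
-<p>Note that the command passes no weight for the <code class="highlighter-rouge">-mix-lm</code> model (<code class="highlighter-rouge">apw_eng</code>). As far as the SRILM <code class="highlighter-rouge">ngram</code> manual describes it, that model receives whatever weight remains after the other weights are subtracted from 1. The sketch below is only a sanity check on the example weights from the script above: it confirms that they sum to roughly 1 and that the implied weight is close to the listed value.</p>

```shell
#!/bin/sh
# Example weights copied from the script above, in DIRS order.
lambdas="0.00631272 0.000647602 0.251555 0.0134726 0.348953 0.371566 0.00749238"

# Sum of all seven listed weights.
total=$(echo "$lambdas" | awk '{ s = 0; for (i = 1; i <= NF; i++) s += $i; printf "%.6f", s }')

# Weight left over for the second model (-mix-lm) once the other six are subtracted from 1.
implied=$(echo "$lambdas" | awk '{ s = 0; for (i = 1; i <= NF; i++) if (i != 2) s += $i; printf "%.6f", 1 - s }')

echo "sum of listed weights: $total"
echo "implied weight for the -mix-lm model: $implied"
```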
-
-<p>The resulting file, <code class="highlighter-rouge">mixed_lm.gz</code>, is a language model based on all the text in
-the Gigaword corpus, with its mixture weights biased toward the development text
-specified in step 3-1. It is in the ARPA format. The optional next step converts
-it into KenLM format.</p>
-
-<h4 id="step-3-3-convert-to-kenlm">Step 3-3: Convert to KenLM</h4>
-
-<p>The KenLM format has some speed advantages over the ARPA format. Issuing the
-following command will write a new language model file, <code class="highlighter-rouge">mixed_lm.kenlm</code>, that
-is the <code class="highlighter-rouge">mixed_lm.gz</code> language model transformed into the KenLM format.</p>
-
-<div class="highlighter-rouge"><pre class="highlight"><code>$JOSHUA/src/joshua/decoder/ff/lm/kenlm/build_binary mixed_lm.gz mixed_lm.kenlm
-</code></pre>
-</div>
-
-
-
-
-        </div>
-
-      </div><!-- /.row -->
-
-      
-        
-    </div><!-- /.container -->
-
-    <!-- Bootstrap core JavaScript
-    ================================================== -->
-    <!-- Placed at the end of the document so the pages load faster -->
-    <script src="https://ajax.googleapis.com/ajax/libs/jquery/1.11.1/jquery.min.js"></script>
-    <script src="../../dist/js/bootstrap.min.js"></script>
-    <!-- <script src="../../assets/js/docs.min.js"></script> -->
-    <!-- IE10 viewport hack for Surface/desktop Windows 8 bug -->
-    <!-- <script src="../../assets/js/ie10-viewport-bug-workaround.js"></script>
-    -->
-
-    <!-- Start of StatCounter Code for Default Guide -->
-    <script type="text/javascript">
-      var sc_project=8264132; 
-      var sc_invisible=1; 
-      var sc_security="4b97fe2d"; 
-    </script>
-    <script type="text/javascript" src="http://www.statcounter.com/counter/counter.js"></script>
-    <noscript>
-      <div class="statcounter">
-        <a title="hit counter joomla" 
-           href="http://statcounter.com/joomla/"
-           target="_blank">
-          <img class="statcounter"
-               src="http://c.statcounter.com/8264132/0/4b97fe2d/1/"
-               alt="hit counter joomla" />
-        </a>
-      </div>
-    </noscript>
-    <!-- End of StatCounter Code for Default Guide -->
-  </body>
-</html>
-