Posted to commits@pdfbox.apache.org by ms...@apache.org on 2016/05/16 17:16:26 UTC
[3/4] pdfbox-docs git commit: Site checkin for project Apache PDFBox Website

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/799b1d78/content/1.8/cookbook/workingwithmetadata.html
----------------------------------------------------------------------
diff --git a/content/1.8/cookbook/workingwithmetadata.html b/content/1.8/cookbook/workingwithmetadata.html
index e188065..720b8b8 100644
--- a/content/1.8/cookbook/workingwithmetadata.html
+++ b/content/1.8/cookbook/workingwithmetadata.html
@@ -135,7 +135,7 @@
 <h2 id="introduction">Introduction</h2>
 
 <p>PDF documents can contain information describing the document itself or certain objects 
-within the document such as the author of the document or it&#39;s creation date. 
+within the document such as the author of the document or it’s creation date. 
 Basic information can be set and retrieved using the PDDocumentInformation object.</p>
 
 <p>In addition to that more metadata can be retrieved using the XML metadata as decribed below.
@@ -143,7 +143,8 @@ Getting basic Metadata</p>
 
 <p>To set or retrieve basic information about the document the PDDocumentInformation object 
 provides a high level API to that information:</p>
-<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">PDDocumentInformation</span> <span class="n">info</span> <span class="o">=</span> <span class="n">document</span><span class="o">.</span><span class="na">getDocumentInformation</span><span class="o">();</span>
+
+<div class="highlighter-rouge"><pre class="highlight"><code><span class="n">PDDocumentInformation</span> <span class="n">info</span> <span class="o">=</span> <span class="n">document</span><span class="o">.</span><span class="na">getDocumentInformation</span><span class="o">();</span>
 <span class="n">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span> <span class="s">"Page Count="</span> <span class="o">+</span> <span class="n">document</span><span class="o">.</span><span class="na">getNumberOfPages</span><span class="o">()</span> <span class="o">);</span>
 <span class="n">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span> <span class="s">"Title="</span> <span class="o">+</span> <span class="n">info</span><span class="o">.</span><span class="na">getTitle</span><span class="o">()</span> <span class="o">);</span>
 <span class="n">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span> <span class="s">"Author="</span> <span class="o">+</span> <span class="n">info</span><span class="o">.</span><span class="na">getAuthor</span><span class="o">()</span> <span class="o">);</span>
@@ -154,26 +155,32 @@ provides a high level API to that information:</p>
 <span class="n">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span> <span class="s">"Creation Date="</span> <span class="o">+</span> <span class="n">info</span><span class="o">.</span><span class="na">getCreationDate</span><span class="o">()</span> <span class="o">);</span>
 <span class="n">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span> <span class="s">"Modification Date="</span> <span class="o">+</span> <span class="n">info</span><span class="o">.</span><span class="na">getModificationDate</span><span class="o">());</span>
 <span class="n">System</span><span class="o">.</span><span class="na">out</span><span class="o">.</span><span class="na">println</span><span class="o">(</span> <span class="s">"Trapped="</span> <span class="o">+</span> <span class="n">info</span><span class="o">.</span><span class="na">getTrapped</span><span class="o">()</span> <span class="o">);</span>      
-</code></pre></div>
+</code></pre>
+</div>
+
 <h2 id="accessing-pdf-metadata">Accessing PDF Metadata</h2>
 
-<p>See class:org.apache.pdfbox.pdmodel.common.PDMetadata<br>
-See example:AddMetadataFromDocInfo<br>
-See Adobe Documentation:XMP Specification  </p>
+<p>See class:org.apache.pdfbox.pdmodel.common.PDMetadata<br />
+See example:AddMetadataFromDocInfo<br />
+See Adobe Documentation:XMP Specification</p>
 
 <p>PDF documents can have XML metadata associated with certain objects within a PDF document.
 For example, the following PD Model objects have the ability to contain metadata:</p>
-<div class="highlight"><pre><code class="language-" data-lang="">PDDocumentCatalog
+
+<div class="highlighter-rouge"><pre class="highlight"><code>PDDocumentCatalog
 PDPage
 PDXObject
 PDICCBased
 PDStream
-</code></pre></div>
+</code></pre>
+</div>
+
 <p>The metadata that is stored in PDF objects conforms to the XMP specification, it is 
 recommended that you review that specification. Currently there is no high level API for 
 managing the XML metadata, PDFBox uses standard java InputStream/OutputStream to retrieve 
 or set the XML metadata.</p>
-<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">PDDocument</span> <span class="n">doc</span> <span class="o">=</span> <span class="n">PDDocument</span><span class="o">.</span><span class="na">load</span><span class="o">(</span> <span class="o">...</span> <span class="o">);</span>
+
+<div class="highlighter-rouge"><pre class="highlight"><code><span class="n">PDDocument</span> <span class="n">doc</span> <span class="o">=</span> <span class="n">PDDocument</span><span class="o">.</span><span class="na">load</span><span class="o">(</span> <span class="o">...</span> <span class="o">);</span>
 <span class="n">PDDocumentCatalog</span> <span class="n">catalog</span> <span class="o">=</span> <span class="n">doc</span><span class="o">.</span><span class="na">getDocumentCatalog</span><span class="o">();</span>
 <span class="n">PDMetadata</span> <span class="n">metadata</span> <span class="o">=</span> <span class="n">catalog</span><span class="o">.</span><span class="na">getMetadata</span><span class="o">();</span>
 
@@ -184,7 +191,9 @@ or set the XML metadata.</p>
 <span class="n">InputStream</span> <span class="n">newXMPData</span> <span class="o">=</span> <span class="o">...;</span>
 <span class="n">PDMetadata</span> <span class="n">newMetadata</span> <span class="o">=</span> <span class="k">new</span> <span class="n">PDMetadata</span><span class="o">(</span><span class="n">doc</span><span class="o">,</span> <span class="n">newXMLData</span><span class="o">,</span> <span class="kc">false</span> <span class="o">);</span>
 <span class="n">catalog</span><span class="o">.</span><span class="na">setMetadata</span><span class="o">(</span> <span class="n">newMetadata</span> <span class="o">);</span>
-</code></pre></div>
+</code></pre>
+</div>
+
             </div>
         </div>
     </div>

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/799b1d78/content/1.8/dependencies.html
----------------------------------------------------------------------
diff --git a/content/1.8/dependencies.html b/content/1.8/dependencies.html
index d4de865..4e6f649 100644
--- a/content/1.8/dependencies.html
+++ b/content/1.8/dependencies.html
@@ -138,38 +138,39 @@
 
 <p class="alert alert-info">These components are needed during runtime, development and testing dependent on the details below.</p>
 
-<p>The three PDFBox components are named <code>pdfbox</code>, <code>fontbox</code> and <code>jempbox</code>. The Maven groupId of all PDFBox components is org.apache.pdfbox.</p>
+<p>The three PDFBox components are named <code class="highlighter-rouge">pdfbox</code>, <code class="highlighter-rouge">fontbox</code> and <code class="highlighter-rouge">jempbox</code>. The Maven groupId of all PDFBox components is org.apache.pdfbox.</p>
 
 <h3 id="minimum-requirement">Minimum Requirement</h3>
 
 <ul>
-<li>Java 1.5</li>
-<li><a href="http://commons.apache.org/logging/">commons-logging</a></li>
+  <li>Java 1.5</li>
+  <li><a href="http://commons.apache.org/logging/">commons-logging</a></li>
 </ul>
 
 <p>The main PDFBox component, pdfbox, has a hard dependency on the <a href="http://commons.apache.org/logging/">commons-logging</a> library.
-Commons Logging is a generic wrapper around different logging frameworks, so you&#39;ll either need to also use a logging library like <a href="http://logging.apache.org/log4j/">log4j</a>
+Commons Logging is a generic wrapper around different logging frameworks, so you’ll either need to also use a logging library like <a href="http://logging.apache.org/log4j/">log4j</a>
 or let commons-logging fall back to the standard <a href="http://java.sun.com/j2se/1.4.2/docs/guide/util/logging/overview.html">java.util.logging API</a>
 included in the Java platform.</p>
 
 <p>For <strong>PDFBox Preflight only</strong> <a href="https://commons.apache.org/proper/commons-io/">commons-io 1.4</a> is needed.</p>
 
 <h3 id="font-handling">Font Handling</h3>
-
 <p>For font handling the fontbox component is needed.</p>
 
 <h3 id="xmp-metadata">XMP Metadata</h3>
-
 <p>To support XMP metadata the jembox component is needed.</p>
 
 <p>To add the pdfbox, fontbox, jempbox and commons-logging jars to your application, the easiest thing is to declare the Maven dependency shown below. This gives you the main
 pdfbox library directly and the other required jars as transitive dependencies.</p>
-<div class="highlight"><pre><code class="language-" data-lang="">&lt;dependency&gt;
+
+<div class="highlighter-rouge"><pre class="highlight"><code>&lt;dependency&gt;
   &lt;groupId&gt;org.apache.pdfbox&lt;/groupId&gt;
   &lt;artifactId&gt;pdfbox&lt;/artifactId&gt;
   &lt;version&gt;...&lt;/version&gt;
 &lt;/dependency&gt;
-</code></pre></div>
+</code></pre>
+</div>
+
 <p>Set the version field to the latest stable PDFBox version.</p>
 
 <h2 id="optional-dependencies">Optional dependencies</h2>
@@ -178,20 +179,20 @@ pdfbox library directly and the other required jars as transitive dependencies.<
 
 <h3 id="extented-image-format-support">Extented Image Format Support</h3>
 
-<p>To support JBIG2 and writing TIFF images additional libraries are needed. </p>
+<p>To support JBIG2 and writing TIFF images additional libraries are needed.</p>
 
 <p class="alert alert-warning">The image plugins described below are not part of the PDFBox distribution because of incompatible licensing terms. Please make sure to check if the licensing terms are compatible to your usage.</p>
 
 <p>For <strong>JBIG2</strong> support a Java ImageIO Plugin such as the <a href="https://github.com/levigo/jbig2-imageio">Levigo Plugin</a> or <a href="https://github.com/Borisvl/JBIG2-Image-Decoder">JBIG2-Image-Decoder
-</a> will be needed. </p>
+</a> will be needed.</p>
 
-<p>To write <strong>TIFF</strong> images a JAI ImageIO Core library will be needed. </p>
+<p>To write <strong>TIFF</strong> images a JAI ImageIO Core library will be needed.</p>
 
 <h4 id="pdf-encryption-and-signing">PDF Encryption and Signing</h4>
-
 <p>The most notable such optional feature is support for PDF encryption. Instead of implementing its own encryption algorithms, PDFBox uses libraries from the 
 <a href="http://www.bouncycastle.org/">Legion of the Bouncy Castle</a>. Both the bcprov and bcmail libraries are needed and can be included using the Maven dependencies shown below.</p>
-<div class="highlight"><pre><code class="language-" data-lang="">&lt;dependency&gt;
+
+<div class="highlighter-rouge"><pre class="highlight"><code>&lt;dependency&gt;
   &lt;groupId&gt;org.bouncycastle&lt;/groupId&gt;
   &lt;artifactId&gt;bcprov-jdk15&lt;/artifactId&gt;
   &lt;version&gt;1.44&lt;/version&gt;
@@ -201,30 +202,34 @@ pdfbox library directly and the other required jars as transitive dependencies.<
   &lt;artifactId&gt;bcmail-jdk15&lt;/artifactId&gt;
   &lt;version&gt;1.44&lt;/version&gt;
 &lt;/dependency&gt;
-</code></pre></div>
-<p><br/></p>
+</code></pre>
+</div>
 
-<h4 id="support-for-bidirectional-languages">Support for bidirectional languages</h4>
+<p><br /></p>
 
+<h4 id="support-for-bidirectional-languages">Support for bidirectional languages</h4>
 <p>Another important optional feature is support for bidirectional languages like Arabic. PDFBox uses the ICU4J library from the 
 <a href="http://site.icu-project.org/">International Components for Unicode</a> (ICU) project to support such languages in PDF documents. To add the ICU4J jar to your project, 
 use the following Maven dependency.</p>
-<div class="highlight"><pre><code class="language-" data-lang="">&lt;dependency&gt;
+
+<div class="highlighter-rouge"><pre class="highlight"><code>&lt;dependency&gt;
   &lt;groupId&gt;com.ibm.icu&lt;/groupId&gt;
   &lt;artifactId&gt;icu4j&lt;/artifactId&gt;
   &lt;version&gt;3.8&lt;/version&gt;
 &lt;/dependency&gt;
-</code></pre></div>
+</code></pre>
+</div>
+
 <p>PDFBox also contains extra support for use with the <a href="http://lucene.apache.org/">Lucene</a> and <a href="http://ant.apache.org/">Ant</a> projects. Since in these cases PDFBox is just an
 add-on feature to these projects, you should first set up your application to use Lucene or Ant and then add PDFBox support as described on this page.</p>
 
 <h2 id="dependencies-for-ant-builds">Dependencies for Ant builds</h2>
 
-<p>The above instructions expect that you&#39;re using <a href="http://maven.apache.org/">Maven</a> or another build tool like <a href="http://ant.apache.org/ivy/">Ivy</a> that supports Maven dependencies.
-If you instead use tools like <a href="http://ant.apache.org/">Ant</a> where you need to explicitly include all the required library jars in your application, you&#39;ll need to do
+<p>The above instructions expect that you’re using <a href="http://maven.apache.org/">Maven</a> or another build tool like <a href="http://ant.apache.org/ivy/">Ivy</a> that supports Maven dependencies.
+If you instead use tools like <a href="http://ant.apache.org/">Ant</a> where you need to explicitly include all the required library jars in your application, you’ll need to do
 something different.</p>
 
-<p>The easiest approach is to run <code>mvn dependency:copy-dependencies</code> inside the pdfbox directory of the latest PDFBox source release. This will copy all the required and optional
+<p>The easiest approach is to run <code class="highlighter-rouge">mvn dependency:copy-dependencies</code> inside the pdfbox directory of the latest PDFBox source release. This will copy all the required and optional
 libraries discussed above into the pdfbox/target/dependencies directory. You can then simply copy all the libraries you need from this directory to your application.</p>
 
             </div>

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/799b1d78/content/1.8/faq.html
----------------------------------------------------------------------
diff --git a/content/1.8/faq.html b/content/1.8/faq.html
index 7442c5c..aafc4bd 100644
--- a/content/1.8/faq.html
+++ b/content/1.8/faq.html
@@ -135,54 +135,61 @@
 <h3 id="general-questions">General Questions</h3>
 
 <ul>
-<li><a href="#log4j">I am getting the below Log4J warning message, how do I remove it?</a></li>
-<li><a href="#threadsafe">Is PDFBox thread safe?</a></li>
-<li><a href="#notclosed">Why do I get a &quot;Warning: You did not close the PDF Document&quot;?</a></li>
+  <li><a href="#log4j">I am getting the below Log4J warning message, how do I remove it?</a></li>
+  <li><a href="#threadsafe">Is PDFBox thread safe?</a></li>
+  <li><a href="#notclosed">Why do I get a “Warning: You did not close the PDF Document”?</a></li>
 </ul>
 
 <h3 id="text-extraction">Text Extraction</h3>
 
 <ul>
-<li><a href="#notext">How come I am not getting any text from the PDF document?</a></li>
-<li><a href="#gibberish">How come I am getting gibberish(G38G43G36G51G5) when extracting text?</a></li>
-<li><a href="#fontwidth">What does &quot;java.io.IOException: Can&#39;t handle font width&quot; mean?</a></li>
-<li><a href="#permission">Why do I get &quot;You do not have permission to extract text&quot; on some documents?</a></li>
-<li><a href="#partially">Can&#39;t we just extract the text without parsing the whole document or extract text as it is parsed?</a></li>
+  <li><a href="#notext">How come I am not getting any text from the PDF document?</a></li>
+  <li><a href="#gibberish">How come I am getting gibberish(G38G43G36G51G5) when extracting text?</a></li>
+  <li><a href="#fontwidth">What does “java.io.IOException: Can’t handle font width” mean?</a></li>
+  <li><a href="#permission">Why do I get “You do not have permission to extract text” on some documents?</a></li>
+  <li><a href="#partially">Can’t we just extract the text without parsing the whole document or extract text as it is parsed?</a></li>
 </ul>
 
-<h2 id="general-questions">General Questions</h2>
+<h2 id="general-questions-1">General Questions</h2>
 
-<p><a name="log4j"></a></p>
+<p><a name="log4j"></a>
+### I am getting the below Log4J warning message, how do I remove it? ###</p>
 
-<h3 id="i-am-getting-the-below-log4j-warning-message-how-do-i-remove-it">I am getting the below Log4J warning message, how do I remove it?</h3>
-<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="nl">log4j:</span><span class="n">WARN</span> <span class="n">No</span> <span class="n">appenders</span> <span class="n">could</span> <span class="n">be</span> <span class="n">found</span> <span class="k">for</span> <span class="n">logger</span> <span class="o">(</span><span class="n">org</span><span class="o">.</span><span class="na">apache</span><span class="o">.</span><span class="na">pdfbox</span><span class="o">.</span><span class="na">util</span><span class="o">.</span><span class="na">ResourceLoader</span><span class="o">).</span>
+<div class="highlighter-rouge"><pre class="highlight"><code><span class="nl">log4j:</span><span class="n">WARN</span> <span class="n">No</span> <span class="n">appenders</span> <span class="n">could</span> <span class="n">be</span> <span class="n">found</span> <span class="k">for</span> <span class="n">logger</span> <span class="o">(</span><span class="n">org</span><span class="o">.</span><span class="na">apache</span><span class="o">.</span><span class="na">pdfbox</span><span class="o">.</span><span class="na">util</span><span class="o">.</span><span class="na">ResourceLoader</span><span class="o">).</span>
 <span class="nl">log4j:</span><span class="n">WARN</span> <span class="n">Please</span> <span class="n">initialize</span> <span class="n">the</span> <span class="n">log4j</span> <span class="n">system</span> <span class="n">properly</span><span class="o">.</span>
-</code></pre></div>
+</code></pre>
+</div>
+
 <p>This message means that you need to configure the log4j logging system.
 See the <a href="http://logging.apache.org/log4j/1.2/manual.html">log4j documentation</a> for more information.</p>
 
 <p>PDFBox comes with a sample log4j configuration file.  To use it you set a system property like this</p>
-<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">java</span> <span class="o">-</span><span class="n">Dlog4j</span><span class="o">.</span><span class="na">configuration</span><span class="o">=</span><span class="n">log4j</span><span class="o">.</span><span class="na">xml</span> <span class="n">org</span><span class="o">.</span><span class="na">apache</span><span class="o">.</span><span class="na">pdfbox</span><span class="o">.</span><span class="na">ExtractText</span> <span class="o">&lt;</span><span class="n">PDF</span><span class="o">-</span><span class="n">file</span><span class="o">&gt;</span> <span class="o">&lt;</span><span class="n">output</span><span class="o">-</span><span class="n">text</span><span class="o">-</span><span class="n">file</span><span class="o">&gt;</span>
-</code></pre></div>
+
+<div class="highlighter-rouge"><pre class="highlight"><code><span class="n">java</span> <span class="o">-</span><span class="n">Dlog4j</span><span class="o">.</span><span class="na">configuration</span><span class="o">=</span><span class="n">log4j</span><span class="o">.</span><span class="na">xml</span> <span class="n">org</span><span class="o">.</span><span class="na">apache</span><span class="o">.</span><span class="na">pdfbox</span><span class="o">.</span><span class="na">ExtractText</span> <span class="o">&lt;</span><span class="n">PDF</span><span class="o">-</span><span class="n">file</span><span class="o">&gt;</span> <span class="o">&lt;</span><span class="n">output</span><span class="o">-</span><span class="n">text</span><span class="o">-</span><span class="n">file</span><span class="o">&gt;</span>
+</code></pre>
+</div>
+
 <p>If this is not working for you then you may have to specify the log4j config file using a URL path, like this:</p>
-<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">log4j</span><span class="o">.</span><span class="na">configuration</span><span class="o">=</span><span class="nl">file:</span><span class="c1">///&lt;path to config file&gt;</span>
-</code></pre></div>
-<p><a name="threadsafe"></a></p>
 
-<h3 id="is-pdfbox-thread-safe">Is PDFBox thread safe?</h3>
+<div class="highlighter-rouge"><pre class="highlight"><code><span class="n">log4j</span><span class="o">.</span><span class="na">configuration</span><span class="o">=</span><span class="nl">file:</span><span class="c1">///&lt;path to config file&gt;</span>
+</code></pre>
+</div>
+
+<p><a name="threadsafe"></a>
+### Is PDFBox thread safe? ###</p>
 
 <p>No! Only one thread may access a single document at a time. You can have multiple threads
 each accessing their own PDDocument object.</p>
 
-<p><a name="notclosed"></a></p>
-
-<h3 id="why-do-i-get-a-quot-warning-you-did-not-close-the-pdf-document-quot">Why do I get a &quot;Warning: You did not close the PDF Document&quot;?</h3>
+<p><a name="notclosed"></a>
+### Why do I get a “Warning: You did not close the PDF Document”? ###</p>
 
 <p>You need to call close() on the PDDocument inside the finally block, if you
-don&#39;t then the document will not be closed properly.  Also, you must close all
+don’t then the document will not be closed properly.  Also, you must close all
 PDDocument objects that get created.  The following code creates <strong>two</strong>
-PDDocument objects; one from the &quot;new PDDocument()&quot; and the second by the load method.</p>
-<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">PDDocument</span> <span class="n">doc</span> <span class="o">=</span> <span class="k">new</span> <span class="n">PDDocument</span><span class="o">();</span>
+PDDocument objects; one from the “new PDDocument()” and the second by the load method.</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code><span class="n">PDDocument</span> <span class="n">doc</span> <span class="o">=</span> <span class="k">new</span> <span class="n">PDDocument</span><span class="o">();</span>
 <span class="k">try</span>
 <span class="o">{</span>
    <span class="n">doc</span> <span class="o">=</span> <span class="n">PDDocument</span><span class="o">.</span><span class="na">load</span><span class="o">(</span> <span class="s">"my.pdf"</span> <span class="o">);</span>
@@ -194,27 +201,27 @@ PDDocument objects; one from the &quot;new PDDocument()&quot; and the second by
       <span class="n">doc</span><span class="o">.</span><span class="na">close</span><span class="o">();</span>
    <span class="o">}</span>
 <span class="o">}</span>
-</code></pre></div>
-<h2 id="text-extraction">Text Extraction</h2>
+</code></pre>
+</div>
 
-<p><a name="notext"></a></p>
+<h2 id="text-extraction-1">Text Extraction</h2>
 
-<h3 id="how-come-i-am-not-getting-any-text-from-the-pdf-document">How come I am not getting any text from the PDF document?</h3>
+<p><a name="notext"></a>
+### How come I am not getting any text from the PDF document? ###</p>
 
 <p>Text extraction from a pdf document is a complicated task and there are many factors
 involved that effect the possibility and accuracy of text extraction.  It would be helpful
 to the PDFBox team if you could try a couple things.</p>
 
 <ul>
-<li>Open the PDF in Acrobat and try to extract text from there.  If Acrobat can extract text then PDFBox 
-should be able to as well and it is a bug if it cannot.  If Acrobat cannot extract text then PDFBox &#39;probably&#39; cannot either.</li>
-<li>It might really be an image instead of text.  Some PDF documents are just images that have been scanned in.
-You can tell by using the selection tool in Acrobat, if you can&#39;t select any text then it is probably an image.</li>
+  <li>Open the PDF in Acrobat and try to extract text from there.  If Acrobat can extract text then PDFBox 
+should be able to as well and it is a bug if it cannot.  If Acrobat cannot extract text then PDFBox ‘probably’ cannot either.</li>
+  <li>It might really be an image instead of text.  Some PDF documents are just images that have been scanned in.
+You can tell by using the selection tool in Acrobat, if you can’t select any text then it is probably an image.</li>
 </ul>
 
-<p><a name="gibberish"></a></p>
-
-<h3 id="how-come-i-am-getting-gibberish-g38g43g36g51g5-when-extracting-text">How come I am getting gibberish(G38G43G36G51G5) when extracting text?</h3>
+<p><a name="gibberish"></a>
+### How come I am getting gibberish(G38G43G36G51G5) when extracting text? ###</p>
 
 <p>This is because the characters in a PDF document can use a custom encoding
 instead of unicode or ASCII.  When you see gibberish text then it
@@ -222,36 +229,33 @@ probably means that a meaningless internal encoding is being used.  The
 only way to access the text is to use OCR.  This may be a future
 enhancement.</p>
 
-<p><a name="fontwidth"></a></p>
-
-<h3 id="what-does-quot-java-io-ioexception-can-39-t-handle-font-width-quot-mean">What does &quot;java.io.IOException: Can&#39;t handle font width&quot; mean?</h3>
+<p><a name="fontwidth"></a>
+### What does “java.io.IOException: Can’t handle font width” mean? ###</p>
 
-<p>This probably means that the &quot;Resources&quot; directory is not in your classpath. The
+<p>This probably means that the “Resources” directory is not in your classpath. The
 Resources directory is included in the PDFBox jar so this is only a problem if you
 are building PDFBox yourself and not using the binary.</p>
 
-<p><a name="permission"></a></p>
-
-<h3 id="why-do-i-get-quot-you-do-not-have-permission-to-extract-text-quot-on-some-documents">Why do I get &quot;You do not have permission to extract text&quot; on some documents?</h3>
+<p><a name="permission"></a>
+### Why do I get “You do not have permission to extract text” on some documents? ###</p>
 
 <p>PDF documents have certain security permissions that can be applied to them and two 
-passwords associated with them, a user password and a master password. If the &quot;cannot extract text&quot;
+passwords associated with them, a user password and a master password. If the “cannot extract text”
 permission bit is set then you need to decrypt the document with the master password in order
 to extract the text.</p>
 
-<p><a name="partially"></a></p>
-
-<h3 id="can-39-t-we-just-extract-the-text-without-parsing-the-whole-document-or-extract-text-as-it-is-parsed">Can&#39;t we just extract the text without parsing the whole document or extract text as it is parsed?</h3>
+<p><a name="partially"></a>
+### Can’t we just extract the text without parsing the whole document or extract text as it is parsed? ###</p>
 
 <p>Not really, for a couple reasons.</p>
 
 <ul>
-<li>If the document is encrypted then you need to parse at least until the encryption dictionary before 
+  <li>If the document is encrypted then you need to parse at least until the encryption dictionary before 
 you can decrypt.</li>
-<li>Sometimes the PDFont contains vital information needed for text extraction.</li>
-<li>Text on a page does not have to be drawn in reading order. For example: if the page said &quot;Hello World&quot;,
-the pdf could have been written such that &quot;World&quot; gets drawn and then the cursor moves to the left and 
-the word &quot;Hello&quot; is drawn.</li>
+  <li>Sometimes the PDFont contains vital information needed for text extraction.</li>
+  <li>Text on a page does not have to be drawn in reading order. For example: if the page said “Hello World”,
+the pdf could have been written such that “World” gets drawn and then the cursor moves to the left and 
+the word “Hello” is drawn.</li>
 </ul>
 
             </div>

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/799b1d78/content/2.0/commandline.html
----------------------------------------------------------------------
diff --git a/content/2.0/commandline.html b/content/2.0/commandline.html
index 0965aed..24d07b8 100644
--- a/content/2.0/commandline.html
+++ b/content/2.0/commandline.html
@@ -137,7 +137,7 @@
 <p>See the <a href="/2.0/dependencies.html">Dependencies</a> page for instructions on how to set your classpath in order to run 
 PDFBox tools as Java applications.</p>
 
-<p><strong>Table of Contents</strong><br>
+<p><strong>Table of Contents</strong><br />
 <a href="#decrypt">Decrypt</a>
 <a href="#encrypt">Encrypt</a>
 <a href="#extractimages">ExtractImages</a>
@@ -157,282 +157,297 @@ PDFBox tools as Java applications.</p>
 
 <p>NOTE: You must have the owner password to decrypt the document!</p>
 
-<p>usage: <code>java -jar pdfbox-app-2.y.z.jar Decrypt [OPTIONS] &lt;inputfile&gt; [outputfile]</code></p>
-
-<table><thead>
-<tr>
-<th>Command Line Parameter</th>
-<th>Description</th>
-</tr>
-</thead><tbody>
-<tr>
-<td>-password</td>
-<td>Password to the PDF or certificate in keystore.</td>
-</tr>
-<tr>
-<td>-keyStore</td>
-<td>Path to keystore that holds certificate to decrypt the document. This is only required if the document is encrypted with a certificate, otherwise only the password is required.</td>
-</tr>
-<tr>
-<td>-alias</td>
-<td>The alias to the certificate in the keystore.</td>
-</tr>
-<tr>
-<td>inputfile</td>
-<td>The PDF file to decrypt.</td>
-</tr>
-<tr>
-<td>outputfile</td>
-<td>The file to save the decrypted document to. If left blank then it will be the same as the input file.</td>
-</tr>
-</tbody></table>
+<p>usage: <code class="highlighter-rouge">java -jar pdfbox-app-2.y.z.jar Decrypt [OPTIONS] &lt;inputfile&gt; [outputfile]</code></p>
+
+<table>
+  <thead>
+    <tr>
+      <th>Command Line Parameter</th>
+      <th>Description</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>-password</td>
+      <td>Password to the PDF or certificate in keystore.</td>
+    </tr>
+    <tr>
+      <td>-keyStore</td>
+      <td>Path to keystore that holds certificate to decrypt the document. This is only required if the document is encrypted with a certificate, otherwise only the password is required.</td>
+    </tr>
+    <tr>
+      <td>-alias</td>
+      <td>The alias to the certificate in the keystore.</td>
+    </tr>
+    <tr>
+      <td>inputfile</td>
+      <td>The PDF file to decrypt.</td>
+    </tr>
+    <tr>
+      <td>outputfile</td>
+      <td>The file to save the decrypted document to. If left blank then it will be the same as the input file.</td>
+    </tr>
+  </tbody>
+</table>
 
 <h2 id="encrypt">Encrypt</h2>
 
 <p>This application will encrypt a PDF document.</p>
 
-<p>usage: <code>java -jar pdfbox-app-2.y.z.jar Encrypt [OPTIONS] &lt;password&gt; &lt;inputfile&gt;</code></p>
-
-<table><thead>
-<tr>
-<th>Command Line Parameter</th>
-<th>Default</th>
-<th>Description</th>
-</tr>
-</thead><tbody>
-<tr>
-<td>-O</td>
-<td></td>
-<td>The owner password to the PDF, ignored if -certFile is specified.</td>
-</tr>
-<tr>
-<td>-U</td>
-<td></td>
-<td>The user password to the PDF, ignored if -certFile is specified.</td>
-</tr>
-<tr>
-<td>-certFile</td>
-<td></td>
-<td>Path to X.509 cert file.</td>
-</tr>
-<tr>
-<td>-canAssemble</td>
-<td>true</td>
-<td>Set the assemble permission.</td>
-</tr>
-<tr>
-<td>-canExtractContent</td>
-<td>true</td>
-<td>Set the extraction permission.</td>
-</tr>
-<tr>
-<td>-canExtractForAccessibility</td>
-<td>true</td>
-<td>Set the extraction permission.</td>
-</tr>
-<tr>
-<td>-canFillInForm</td>
-<td>true</td>
-<td>Set the fill in form permission.</td>
-</tr>
-<tr>
-<td>-canModify</td>
-<td>true</td>
-<td>Set the modify permission.</td>
-</tr>
-<tr>
-<td>-canModifyAnnotations</td>
-<td>true</td>
-<td>Set the modify annots permission.</td>
-</tr>
-<tr>
-<td>-canPrint</td>
-<td>true</td>
-<td>Set the print permission.</td>
-</tr>
-<tr>
-<td>-canPrintDegraded</td>
-<td>true</td>
-<td>Set the print degraded permission.</td>
-</tr>
-<tr>
-<td>-keyLength</td>
-<td>40, 128 or 256</td>
-<td>The number of bits for the encryption key. For 128 and above bits <a href="http://www.oracle.com/technetwork/java/javase/downloads/jce-7-download-432124.html">Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files</a> must be installed.</td>
-</tr>
-<tr>
-<td>inputfile</td>
-<td></td>
-<td>The PDF file to encrypt.</td>
-</tr>
-<tr>
-<td>outputfile</td>
-<td></td>
-<td>The file to save the encrypted document to. If left blank then it will be the same as the input file.</td>
-</tr>
-</tbody></table>
+<p>usage: <code class="highlighter-rouge">java -jar pdfbox-app-2.y.z.jar Encrypt [OPTIONS] &lt;password&gt; &lt;inputfile&gt;</code></p>
+
+<table>
+  <thead>
+    <tr>
+      <th>Command Line Parameter</th>
+      <th>Default</th>
+      <th>Description</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>-O</td>
+      <td>&nbsp;</td>
+      <td>The owner password to the PDF, ignored if -certFile is specified.</td>
+    </tr>
+    <tr>
+      <td>-U</td>
+      <td>&nbsp;</td>
+      <td>The user password to the PDF, ignored if -certFile is specified.</td>
+    </tr>
+    <tr>
+      <td>-certFile</td>
+      <td>&nbsp;</td>
+      <td>Path to X.509 cert file.</td>
+    </tr>
+    <tr>
+      <td>-canAssemble</td>
+      <td>true</td>
+      <td>Set the assemble permission.</td>
+    </tr>
+    <tr>
+      <td>-canExtractContent</td>
+      <td>true</td>
+      <td>Set the extraction permission.</td>
+    </tr>
+    <tr>
+      <td>-canExtractForAccessibility</td>
+      <td>true</td>
+      <td>Set the extraction permission.</td>
+    </tr>
+    <tr>
+      <td>-canFillInForm</td>
+      <td>true</td>
+      <td>Set the fill in form permission.</td>
+    </tr>
+    <tr>
+      <td>-canModify</td>
+      <td>true</td>
+      <td>Set the modify permission.</td>
+    </tr>
+    <tr>
+      <td>-canModifyAnnotations</td>
+      <td>true</td>
+      <td>Set the modify annots permission.</td>
+    </tr>
+    <tr>
+      <td>-canPrint</td>
+      <td>true</td>
+      <td>Set the print permission.</td>
+    </tr>
+    <tr>
+      <td>-canPrintDegraded</td>
+      <td>true</td>
+      <td>Set the print degraded permission.</td>
+    </tr>
+    <tr>
+      <td>-keyLength</td>
+      <td>40, 128 or 256</td>
+      <td>The number of bits for the encryption key. For 128 and above bits <a href="http://www.oracle.com/technetwork/java/javase/downloads/jce-7-download-432124.html">Java Cryptography Extension (JCE) Unlimited Strength Jurisdiction Policy Files</a> must be installed.</td>
+    </tr>
+    <tr>
+      <td>inputfile</td>
+      <td>&nbsp;</td>
+      <td>The PDF file to encrypt.</td>
+    </tr>
+    <tr>
+      <td>outputfile</td>
+      <td>&nbsp;</td>
+      <td>The file to save the encrypted document to. If left blank then it will be the same as the input file.</td>
+    </tr>
+  </tbody>
+</table>
 
 <h2 id="extractimages">ExtractImages</h2>
 
 <p>This application will extract all images from the given PDF document.</p>
 
-<p>usage: <code>java -jar pdfbox-app-2.y.z.jar ExtractImages [OPTIONS] &lt;inputfile&gt;</code></p>
-
-<table><thead>
-<tr>
-<th>Command Line Parameter</th>
-<th>Default</th>
-<th>Description</th>
-</tr>
-</thead><tbody>
-<tr>
-<td>-password</td>
-<td></td>
-<td>The password to the PDF document.</td>
-</tr>
-<tr>
-<td>-prefix</td>
-<td>PDF name</td>
-<td>Image prefix to use.</td>
-</tr>
-<tr>
-<td>-directJPEG</td>
-<td>false</td>
-<td>Forces the direct extraction of JPEG images regardless of colorspace.</td>
-</tr>
-</tbody></table>
+<p>usage: <code class="highlighter-rouge">java -jar pdfbox-app-2.y.z.jar ExtractImages [OPTIONS] &lt;inputfile&gt;</code></p>
+
+<table>
+  <thead>
+    <tr>
+      <th>Command Line Parameter</th>
+      <th>Default</th>
+      <th>Description</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>-password</td>
+      <td>&nbsp;</td>
+      <td>The password to the PDF document.</td>
+    </tr>
+    <tr>
+      <td>-prefix</td>
+      <td>PDF name</td>
+      <td>Image prefix to use.</td>
+    </tr>
+    <tr>
+      <td>-directJPEG</td>
+      <td>false</td>
+      <td>Forces the direct extraction of JPEG images regardless of colorspace.</td>
+    </tr>
+  </tbody>
+</table>
 
 <h2 id="extracttext">ExtractText</h2>
 
 <p>This application will extract all text from the given PDF document.</p>
 
-<p>usage: <code>java -jar pdfbox-app-2.y.z.jar ExtractText [OPTIONS] &lt;inputfile&gt; [Text file]</code></p>
-
-<table><thead>
-<tr>
-<th>Command Line Parameter</th>
-<th>Default</th>
-<th>Description</th>
-</tr>
-</thead><tbody>
-<tr>
-<td>-password</td>
-<td></td>
-<td>The password to the PDF document.</td>
-</tr>
-<tr>
-<td>-encoding</td>
-<td>default encoding</td>
-<td>The encoding type of the text file, e.g. ISO-8859-1, UTF-8, UTF-16BE.</td>
-</tr>
-<tr>
-<td>-console</td>
-<td>false</td>
-<td>Send text to console instead of file.</td>
-</tr>
-<tr>
-<td>-html</td>
-<td>false</td>
-<td>Output in HTML format instead of raw text.</td>
-</tr>
-<tr>
-<td>-sort</td>
-<td>false</td>
-<td>Sort the text before writing.</td>
-</tr>
-<tr>
-<td>-ignoreBeads</td>
-<td>false</td>
-<td>Disables the separation by beads.</td>
-</tr>
-<tr>
-<td>-force</td>
-<td>false</td>
-<td>Enables pdfbox to ignore corrupt objects.</td>
-</tr>
-<tr>
-<td>-debug</td>
-<td>false</td>
-<td>Enables debug output about the time consumption of every stage.</td>
-</tr>
-<tr>
-<td>-startPage</td>
-<td>1</td>
-<td>The first page to extract, one based.</td>
-</tr>
-<tr>
-<td>-endPage</td>
-<td>Integer.MAX_INT</td>
-<td>The last page to extract, one based.</td>
-</tr>
-</tbody></table>
+<p>usage: <code class="highlighter-rouge">java -jar pdfbox-app-2.y.z.jar ExtractText [OPTIONS] &lt;inputfile&gt; [Text file]</code></p>
+
+<table>
+  <thead>
+    <tr>
+      <th>Command Line Parameter</th>
+      <th>Default</th>
+      <th>Description</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>-password</td>
+      <td>&nbsp;</td>
+      <td>The password to the PDF document.</td>
+    </tr>
+    <tr>
+      <td>-encoding</td>
+      <td>default encoding</td>
+      <td>The encoding type of the text file, e.g. ISO-8859-1, UTF-8, UTF-16BE.</td>
+    </tr>
+    <tr>
+      <td>-console</td>
+      <td>false</td>
+      <td>Send text to console instead of file.</td>
+    </tr>
+    <tr>
+      <td>-html</td>
+      <td>false</td>
+      <td>Output in HTML format instead of raw text.</td>
+    </tr>
+    <tr>
+      <td>-sort</td>
+      <td>false</td>
+      <td>Sort the text before writing.</td>
+    </tr>
+    <tr>
+      <td>-ignoreBeads</td>
+      <td>false</td>
+      <td>Disables the separation by beads.</td>
+    </tr>
+    <tr>
+      <td>-force</td>
+      <td>false</td>
+      <td>Enables pdfbox to ignore corrupt objects.</td>
+    </tr>
+    <tr>
+      <td>-debug</td>
+      <td>false</td>
+      <td>Enables debug output about the time consumption of every stage.</td>
+    </tr>
+    <tr>
+      <td>-startPage</td>
+      <td>1</td>
+      <td>The first page to extract, one based.</td>
+    </tr>
+    <tr>
+      <td>-endPage</td>
+      <td>Integer.MAX_VALUE</td>
+      <td>The last page to extract, one based.</td>
+    </tr>
+  </tbody>
+</table>
 
 <h2 id="overlaypdf">OverlayPDF</h2>
 
 <p>This application will overlay one document with the content of another document</p>
 
-<p>usage: <code>java -jar pdfbox-app-2.y.z.jar OverlayPDF &lt;input.pdf&gt; [OPTIONS] &lt;output.pdf&gt;</code></p>
-
-<table><thead>
-<tr>
-<th>Command Line Parameter</th>
-<th>Default</th>
-<th>Description</th>
-</tr>
-</thead><tbody>
-<tr>
-<td>inputfile</td>
-<td></td>
-<td>The PDF file to be overlayed.</td>
-</tr>
-<tr>
-<td>defaultOverlay.pdf</td>
-<td></td>
-<td>Default overlay file.</td>
-</tr>
-<tr>
-<td>-odd oddPageOverlay.pdf</td>
-<td></td>
-<td>Overlay file used for odd pages.</td>
-</tr>
-<tr>
-<td>-even evenPageOverlay.pdf</td>
-<td></td>
-<td>Overlay file used for even pages.</td>
-</tr>
-<tr>
-<td>-first firstPageOverlay.pdf</td>
-<td></td>
-<td>Overlay file used for the first page.</td>
-</tr>
-<tr>
-<td>-last lastPageOverlay.pdf</td>
-<td></td>
-<td>Overlay file used for the last pages.</td>
-</tr>
-<tr>
-<td>-page pageNumber specificPageOverlay.pdf</td>
-<td></td>
-<td>overlay file used for the given page number, may occur more than once.</td>
-</tr>
-<tr>
-<td>-position</td>
-<td>background</td>
-<td>Where to put the overlay, foreground or background.</td>
-</tr>
-<tr>
-<td>outputfile</td>
-<td></td>
-<td>The resulting pdf file.</td>
-</tr>
-</tbody></table>
+<p>usage: <code class="highlighter-rouge">java -jar pdfbox-app-2.y.z.jar OverlayPDF &lt;input.pdf&gt; [OPTIONS] &lt;output.pdf&gt;</code></p>
+
+<table>
+  <thead>
+    <tr>
+      <th>Command Line Parameter</th>
+      <th>Default</th>
+      <th>Description</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>inputfile</td>
+      <td>&nbsp;</td>
+      <td>The PDF file to be overlaid.</td>
+    </tr>
+    <tr>
+      <td>defaultOverlay.pdf</td>
+      <td>&nbsp;</td>
+      <td>Default overlay file.</td>
+    </tr>
+    <tr>
+      <td>-odd oddPageOverlay.pdf</td>
+      <td>&nbsp;</td>
+      <td>Overlay file used for odd pages.</td>
+    </tr>
+    <tr>
+      <td>-even evenPageOverlay.pdf</td>
+      <td>&nbsp;</td>
+      <td>Overlay file used for even pages.</td>
+    </tr>
+    <tr>
+      <td>-first firstPageOverlay.pdf</td>
+      <td>&nbsp;</td>
+      <td>Overlay file used for the first page.</td>
+    </tr>
+    <tr>
+      <td>-last lastPageOverlay.pdf</td>
+      <td>&nbsp;</td>
+      <td>Overlay file used for the last page.</td>
+    </tr>
+    <tr>
+      <td>-page pageNumber specificPageOverlay.pdf</td>
+      <td>&nbsp;</td>
+      <td>Overlay file used for the given page number; may occur more than once.</td>
+    </tr>
+    <tr>
+      <td>-position</td>
+      <td>background</td>
+      <td>Where to put the overlay, foreground or background.</td>
+    </tr>
+    <tr>
+      <td>outputfile</td>
+      <td>&nbsp;</td>
+      <td>The resulting pdf file.</td>
+    </tr>
+  </tbody>
+</table>
 
 <p>Examples:</p>
 
 <ul>
-<li>OverlayPDF input.pdf overlay.pdf output.pdf</li>
-<li>OverlayPDF input.pdf defaultOverlay.pdf -page 10 overlayForPage10.pdf -position foreground output.pdf</li>
-<li>OverlayPDF input.pdf -odd oddOverlay.pdf -even evenOverlay.pdf output.pdf</li>
+  <li>OverlayPDF input.pdf overlay.pdf output.pdf</li>
+  <li>OverlayPDF input.pdf defaultOverlay.pdf -page 10 overlayForPage10.pdf -position foreground output.pdf</li>
+  <li>OverlayPDF input.pdf -odd oddOverlay.pdf -even evenOverlay.pdf output.pdf</li>
 </ul>
 
 <h2 id="pdfdebugger">PDFDebugger</h2>
@@ -440,121 +455,130 @@ PDFBox tools as Java applications.</p>
 <p>This application opens an existing PDF document so that its internal structure can be analyzed and inspected.
 It is used as a replacement for the PDFReader, which was removed in 2.0.0.</p>
 
-<p>usage: <code>java -jar pdfbox-app-2.y.z.jar PDFDebugger [inputfile]</code></p>
-
-<table><thead>
-<tr>
-<th>Command Line Parameter</th>
-<th>Default</th>
-<th>Description</th>
-</tr>
-</thead><tbody>
-<tr>
-<td>-password</td>
-<td></td>
-<td>The password to the PDF document.</td>
-</tr>
-<tr>
-<td>-viewstructure</td>
-<td></td>
-<td>Activates the &quot;view structure&quot; view on startup.</td>
-</tr>
-<tr>
-<td>inputfile</td>
-<td></td>
-<td>the name of an optional PDF file to open.</td>
-</tr>
-</tbody></table>
+<p>usage: <code class="highlighter-rouge">java -jar pdfbox-app-2.y.z.jar PDFDebugger [inputfile]</code></p>
+
+<table>
+  <thead>
+    <tr>
+      <th>Command Line Parameter</th>
+      <th>Default</th>
+      <th>Description</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>-password</td>
+      <td>&nbsp;</td>
+      <td>The password to the PDF document.</td>
+    </tr>
+    <tr>
+      <td>-viewstructure</td>
+      <td>&nbsp;</td>
+      <td>Activates the “view structure” view on startup.</td>
+    </tr>
+    <tr>
+      <td>inputfile</td>
+      <td>&nbsp;</td>
+      <td>The name of an optional PDF file to open.</td>
+    </tr>
+  </tbody>
+</table>
 
 <h2 id="pdfmerger">PDFMerger</h2>
 
 <p>This application will take a list of pdf documents and merge them, saving the result in a new document.</p>
 
-<p>usage: <code>java -jar pdfbox-app-2.y.z.jar PDFMerger &lt;Source PDF files (2 ..n)&gt; &lt;Target PDF file&gt;</code></p>
+<p>usage: <code class="highlighter-rouge">java -jar pdfbox-app-2.y.z.jar PDFMerger &lt;Source PDF files (2 ..n)&gt; &lt;Target PDF file&gt;</code></p>
 
 <h2 id="pdfsplit">PDFSplit</h2>
 
 <p>This application will take an existing PDF document and split it into a number of other documents</p>
 
-<p>usage: <code>java -jar pdfbox-app-2.y.z.jar PDFSplit [OPTIONS] &lt;PDF file&gt;</code></p>
-
-<table><thead>
-<tr>
-<th>Command Line Parameter</th>
-<th>Default</th>
-<th>Description</th>
-</tr>
-</thead><tbody>
-<tr>
-<td>-password</td>
-<td></td>
-<td>The password to the PDF document.</td>
-</tr>
-<tr>
-<td>-split</td>
-<td></td>
-<td>Number of pages of every splitted part of the pdf.</td>
-</tr>
-<tr>
-<td>-startPage</td>
-<td></td>
-<td>The page to start at.</td>
-</tr>
-<tr>
-<td>-endPage</td>
-<td></td>
-<td>The page to stop at.</td>
-</tr>
-</tbody></table>
+<p>usage: <code class="highlighter-rouge">java -jar pdfbox-app-2.y.z.jar PDFSplit [OPTIONS] &lt;PDF file&gt;</code></p>
+
+<table>
+  <thead>
+    <tr>
+      <th>Command Line Parameter</th>
+      <th>Default</th>
+      <th>Description</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>-password</td>
+      <td>&nbsp;</td>
+      <td>The password to the PDF document.</td>
+    </tr>
+    <tr>
+      <td>-split</td>
+      <td>&nbsp;</td>
+      <td>Number of pages in each part of the split PDF.</td>
+    </tr>
+    <tr>
+      <td>-startPage</td>
+      <td>&nbsp;</td>
+      <td>The page to start at.</td>
+    </tr>
+    <tr>
+      <td>-endPage</td>
+      <td>&nbsp;</td>
+      <td>The page to stop at.</td>
+    </tr>
+  </tbody>
+</table>
 
 <p>Examples:</p>
 
 <ul>
-<li>PDFSplit -split 2 sample_with_13_pages.pdf will split the pdf in pieces of 2 pages each except the last which will contain 1 page only.</li>
-<li>PDFSplit -startPage 5 sample_with_13_pages.pdf will provide a pdf containing all pages of the source pdf starting at page 5</li>
-<li>PDFSplit -startPage 5 -endPage 10 sample_with_13_pages.pdf will provide a pdf containing all pages from 5 to 10 of the source pdf</li>
-<li>PDFSplit -split 2 -startPage 5 -endPage 10 sample_with_13_pages.pdf will provide 3 pdfs containing all pages from 5 to 10 of the source pdf 2 pages each</li>
+  <li>PDFSplit -split 2 sample_with_13_pages.pdf will split the pdf in pieces of 2 pages each except the last which will contain 1 page only.</li>
+  <li>PDFSplit -startPage 5 sample_with_13_pages.pdf will provide a pdf containing all pages of the source pdf starting at page 5</li>
+  <li>PDFSplit -startPage 5 -endPage 10 sample_with_13_pages.pdf will provide a pdf containing all pages from 5 to 10 of the source pdf</li>
+  <li>PDFSplit -split 2 -startPage 5 -endPage 10 sample_with_13_pages.pdf will provide 3 pdfs containing all pages from 5 to 10 of the source pdf 2 pages each</li>
 </ul>
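The page arithmetic behind the PDFSplit examples above can be sketched in a few lines of Java. This is only an illustration of the ranges those examples describe, not the PDFBox implementation; the class `SplitRangeSketch` and its `ranges` method are hypothetical names introduced here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of PDFBox: it only models the page ranges
// described in the PDFSplit examples.
final class SplitRangeSketch {

    /** Returns inclusive, one-based {first, last} page ranges, one per output part. */
    static List<int[]> ranges(int pageCount, int startPage, int endPage, int pagesPerPart) {
        if (pagesPerPart < 1) {
            throw new IllegalArgumentException("pagesPerPart must be at least 1");
        }
        // Clamp the requested range to the pages that actually exist.
        int first = Math.max(1, startPage);
        int last = Math.min(pageCount, endPage);
        List<int[]> parts = new ArrayList<>();
        // long arithmetic avoids overflow when pagesPerPart is very large.
        for (long from = first; from <= last; from += pagesPerPart) {
            long to = Math.min(from + pagesPerPart - 1, last);
            parts.add(new int[] {(int) from, (int) to});
        }
        return parts;
    }

    public static void main(String[] args) {
        // Mirrors "-split 2 -startPage 5 -endPage 10" on a 13-page document.
        for (int[] part : ranges(13, 5, 10, 2)) {
            System.out.println(part[0] + "-" + part[1]);
        }
    }
}
```

With 13 pages, a split size of 2 and no page limits, the last range holds a single page (13), which matches the first example in the list.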
 
 <h2 id="pdftoimage">PDFToImage</h2>
 
 <p>This application will create an image for every page in the PDF document.</p>
 
-<p>usage: <code>java -jar pdfbox-app-2.y.z.jar PDFToImage [OPTIONS] &lt;PDF file&gt;</code></p>
-
-<table><thead>
-<tr>
-<th>Command Line Parameter</th>
-<th>Default</th>
-<th>Description</th>
-</tr>
-</thead><tbody>
-<tr>
-<td>-password</td>
-<td></td>
-<td>The password to the PDF document.</td>
-</tr>
-<tr>
-<td>-imageType</td>
-<td>jpg</td>
-<td>The image type to write to. Currently only jpg or png.</td>
-</tr>
-<tr>
-<td>-outputPrefix</td>
-<td>Name of PDF document</td>
-<td>The prefix to the image file.</td>
-</tr>
-<tr>
-<td>-startPage</td>
-<td>1</td>
-<td>The first page to convert, one based.</td>
-</tr>
-<tr>
-<td>-endPage</td>
-<td>Integer.MAX_INT</td>
-<td>The last page to convert, one based.</td>
-</tr>
-</tbody></table>
+<p>usage: <code class="highlighter-rouge">java -jar pdfbox-app-2.y.z.jar PDFToImage [OPTIONS] &lt;PDF file&gt;</code></p>
+
+<table>
+  <thead>
+    <tr>
+      <th>Command Line Parameter</th>
+      <th>Default</th>
+      <th>Description</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>-password</td>
+      <td>&nbsp;</td>
+      <td>The password to the PDF document.</td>
+    </tr>
+    <tr>
+      <td>-imageType</td>
+      <td>jpg</td>
+      <td>The image type to write to. Currently only jpg or png.</td>
+    </tr>
+    <tr>
+      <td>-outputPrefix</td>
+      <td>Name of PDF document</td>
+      <td>The prefix to the image file.</td>
+    </tr>
+    <tr>
+      <td>-startPage</td>
+      <td>1</td>
+      <td>The first page to convert, one based.</td>
+    </tr>
+    <tr>
+      <td>-endPage</td>
+      <td>Integer.MAX_VALUE</td>
+      <td>The last page to convert, one based.</td>
+    </tr>
+  </tbody>
+</table>
 
 <h2 id="printpdf">PrintPDF</h2>
 
@@ -562,106 +586,116 @@ It is used as replacement for the PDFReader which was removed in 2.0.0.</p>
 
 <p class="alert alert-info">You must have the correct permissions to print the document!</p>
 
-<p>usage: <code>java -jar pdfbox-app-2.y.z.jar PrintPDF [OPTIONS] &lt;inputfile&gt;</code></p>
-
-<table><thead>
-<tr>
-<th>Command Line Parameter</th>
-<th>Description</th>
-</tr>
-</thead><tbody>
-<tr>
-<td>-password</td>
-<td>The password to decrypt the PDF.</td>
-</tr>
-<tr>
-<td>-silentPrint</td>
-<td>Print the PDF without prompting for a printer.</td>
-</tr>
-<tr>
-<td>inputfile</td>
-<td>The PDF file to print.</td>
-</tr>
-</tbody></table>
+<p>usage: <code class="highlighter-rouge">java -jar pdfbox-app-2.y.z.jar PrintPDF [OPTIONS] &lt;inputfile&gt;</code></p>
+
+<table>
+  <thead>
+    <tr>
+      <th>Command Line Parameter</th>
+      <th>Description</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>-password</td>
+      <td>The password to decrypt the PDF.</td>
+    </tr>
+    <tr>
+      <td>-silentPrint</td>
+      <td>Print the PDF without prompting for a printer.</td>
+    </tr>
+    <tr>
+      <td>inputfile</td>
+      <td>The PDF file to print.</td>
+    </tr>
+  </tbody>
+</table>
 
 <h2 id="texttopdf">TextToPDF</h2>
 
 <p>This application will create a PDF document from a text file.</p>
 
-<p>usage: <code>java -jar pdfbox-app-2.y.z.jar TextToPDF [OPTIONS] &lt;outputfile&gt; &lt;textfile&gt;</code></p>
-
-<table><thead>
-<tr>
-<th>Command Line Parameter</th>
-<th>Default</th>
-<th>Description</th>
-</tr>
-</thead><tbody>
-<tr>
-<td>-standardFont</td>
-<td>Helvetica</td>
-<td>The font to use for the text. Either this or -ttf should be specified but not both.</td>
-</tr>
-<tr>
-<td>-ttf</td>
-<td></td>
-<td>The TTF font to use for the text. Either this or -standardFont should be specified but not both.</td>
-</tr>
-<tr>
-<td>-fontSize</td>
-<td>10</td>
-<td>The size of the font to use.</td>
-</tr>
-</tbody></table>
-
-<p>The following font names can be used for the parameter <code>standardFont</code>:</p>
+<p>usage: <code class="highlighter-rouge">java -jar pdfbox-app-2.y.z.jar TextToPDF [OPTIONS] &lt;outputfile&gt; &lt;textfile&gt;</code></p>
+
+<table>
+  <thead>
+    <tr>
+      <th>Command Line Parameter</th>
+      <th>Default</th>
+      <th>Description</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>-standardFont</td>
+      <td>Helvetica</td>
+      <td>The font to use for the text. Either this or -ttf should be specified but not both.</td>
+    </tr>
+    <tr>
+      <td>-ttf</td>
+      <td>&nbsp;</td>
+      <td>The TTF font to use for the text. Either this or -standardFont should be specified but not both.</td>
+    </tr>
+    <tr>
+      <td>-fontSize</td>
+      <td>10</td>
+      <td>The size of the font to use.</td>
+    </tr>
+  </tbody>
+</table>
+
+<p>The following font names can be used for the parameter <code class="highlighter-rouge">standardFont</code>:</p>
 
 <ul>
-<li>Courier</li>
-<li>Courier-Bold</li>
-<li>Courier-Oblique</li>
-<li>Courier-BoldOblique</li>
-<li>Helvetica</li>
-<li>Helvetica-Bold</li>
-<li>Helvetica-Oblique</li>
-<li>Helvetica-BoldOblique</li>
-<li>Symbol</li>
-<li>Times-Bold</li>
-<li>Times-Roman</li>
-<li>Times-Italic</li>
-<li>Times-BoldItalic</li>
-<li>ZapfDingbats</li>
+  <li>Courier</li>
+  <li>Courier-Bold</li>
+  <li>Courier-Oblique</li>
+  <li>Courier-BoldOblique</li>
+  <li>Helvetica</li>
+  <li>Helvetica-Bold</li>
+  <li>Helvetica-Oblique</li>
+  <li>Helvetica-BoldOblique</li>
+  <li>Symbol</li>
+  <li>Times-Bold</li>
+  <li>Times-Roman</li>
+  <li>Times-Italic</li>
+  <li>Times-BoldItalic</li>
+  <li>ZapfDingbats</li>
 </ul>
 
 <h2 id="writedecodeddoc">WriteDecodedDoc</h2>
 
 <p>An application to decompress PDF documents.</p>
 
-<p>usage: <code>java -jar pdfbox-app-2.y.z.jar WriteDecodedDoc &lt;input-file&gt; &lt;output-file&gt;</code></p>
-
-<table><thead>
-<tr>
-<th>Command Line Parameter</th>
-<th>Default</th>
-<th>Description</th>
-</tr>
-</thead><tbody>
-<tr>
-<td>-password</td>
-<td></td>
-<td>The password to the PDF document.</td>
-</tr>
-<tr>
-<td><input-file></td>
-<td></td>
-<td>The PDF file to decompress</td>
-</tr>
-<tr>
-<td><output-file></td>
-<td></td>
-<td>The destination PDF file</td>
-</tr>
-</tbody></table>
+<p>usage: <code class="highlighter-rouge">java -jar pdfbox-app-2.y.z.jar WriteDecodedDoc &lt;input-file&gt; &lt;output-file&gt;</code></p>
+
+<table>
+  <thead>
+    <tr>
+      <th>Command Line Parameter</th>
+      <th>Default</th>
+      <th>Description</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>-password</td>
+      <td>&nbsp;</td>
+      <td>The password to the PDF document.</td>
+    </tr>
+    <tr>
+      <td>&lt;input-file&gt;</td>
+      <td>&nbsp;</td>
+      <td>The PDF file to decompress</td>
+    </tr>
+    <tr>
+      <td>&lt;output-file&gt;</td>
+      <td>&nbsp;</td>
+      <td>The destination PDF file</td>
+    </tr>
+  </tbody>
+</table>
+
 
             </div>
         </div>

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/799b1d78/content/2.0/cookbook/encryption.html
----------------------------------------------------------------------
diff --git a/content/2.0/cookbook/encryption.html b/content/2.0/cookbook/encryption.html
index af0064b..d8978f2 100644
--- a/content/2.0/cookbook/encryption.html
+++ b/content/2.0/cookbook/encryption.html
@@ -132,32 +132,35 @@
             <div class="col-xs-12 col-sm-9">
                 <h1 id="encrypting-a-file">Encrypting a file</h1>
 
-<p>PDF encryption requires two passwords: the &quot;user password&quot; to open and view the file with restricted permissions, the &quot;owner password&quot; to access the file with all permission.</p>
+<p>PDF encryption requires two passwords: the “user password” to open and view the file with restricted permissions, and the “owner password” to access the file with all permissions.</p>
 
 <h2 id="load-and-save-encrypted">Load and save encrypted</h2>
 
 <p>This small sample shows how to encrypt a file so that it can be viewed, but not printed.</p>
-<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="n">PDDocument</span> <span class="n">doc</span> <span class="o">=</span> <span class="n">PDDocument</span><span class="o">.</span><span class="na">load</span><span class="o">(</span><span class="s">"filename.pdf"</span><span class="o">);</span>
+
+<div class="highlighter-rouge"><pre class="highlight"><code><span class="n">PDDocument</span> <span class="n">doc</span> <span class="o">=</span> <span class="n">PDDocument</span><span class="o">.</span><span class="na">load</span><span class="o">(</span><span class="k">new</span> <span class="n">File</span><span class="o">(</span><span class="s">"filename.pdf"</span><span class="o">));</span>
 
 <span class="c1">// Define the length of the encryption key.</span>
 <span class="c1">// Possible values are 40, 128 or 256.</span>
 <span class="kt">int</span> <span class="n">keyLength</span> <span class="o">=</span> <span class="mi">256</span><span class="o">;</span>
-
+    
 <span class="n">AccessPermission</span> <span class="n">ap</span> <span class="o">=</span> <span class="k">new</span> <span class="n">AccessPermission</span><span class="o">();</span>
-
+        
 <span class="c1">// disable printing, everything else is allowed</span>
 <span class="n">ap</span><span class="o">.</span><span class="na">setCanPrint</span><span class="o">(</span><span class="kc">false</span><span class="o">);</span>
-
+        
 <span class="c1">// owner password (to open the file with all permissions) is "12345"</span>
 <span class="c1">// user password (to open the file but with restricted permissions, is empty here) </span>
 <span class="n">StandardProtectionPolicy</span> <span class="n">spp</span> <span class="o">=</span> <span class="k">new</span> <span class="n">StandardProtectionPolicy</span><span class="o">(</span><span class="s">"12345"</span><span class="o">,</span> <span class="s">""</span><span class="o">,</span> <span class="n">ap</span><span class="o">);</span>
 <span class="n">spp</span><span class="o">.</span><span class="na">setEncryptionKeyLength</span><span class="o">(</span><span class="n">keyLength</span><span class="o">);</span>
 <span class="n">spp</span><span class="o">.</span><span class="na">setPermissions</span><span class="o">(</span><span class="n">ap</span><span class="o">);</span>
 <span class="n">doc</span><span class="o">.</span><span class="na">protect</span><span class="o">(</span><span class="n">spp</span><span class="o">);</span>
-
+        
 <span class="n">doc</span><span class="o">.</span><span class="na">save</span><span class="o">(</span><span class="s">"filename-encrypted.pdf"</span><span class="o">);</span>
 <span class="n">doc</span><span class="o">.</span><span class="na">close</span><span class="o">();</span>
-</code></pre></div>
+</code></pre>
+</div>
+
             </div>
         </div>
     </div>

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/799b1d78/content/2.0/dependencies.html
----------------------------------------------------------------------
diff --git a/content/2.0/dependencies.html b/content/2.0/dependencies.html
index 96a7d90..6c3f719 100644
--- a/content/2.0/dependencies.html
+++ b/content/2.0/dependencies.html
@@ -135,11 +135,11 @@
 <p>PDFBox has the following basic dependencies:</p>
 
 <ul>
-<li>Java 6</li>
-<li><a href="http://commons.apache.org/logging/">commons-logging</a></li>
+  <li>Java 6</li>
+  <li><a href="http://commons.apache.org/logging/">commons-logging</a></li>
 </ul>
 
-<p>Commons Logging is a generic wrapper around different logging frameworks, so you&#39;ll either need to also use a logging library like <a href="http://logging.apache.org/log4j/">log4j</a>
+<p>Commons Logging is a generic wrapper around different logging frameworks, so you’ll either need to also use a logging library like <a href="http://logging.apache.org/log4j/">log4j</a>
 or let commons-logging fall back to the standard <a href="http://java.sun.com/j2se/1.4.2/docs/guide/util/logging/overview.html">java.util.logging API</a>
 included in the Java platform.</p>
 
@@ -149,15 +149,15 @@ included in the Java platform.</p>
 
 <p>PDFBox does not ship with all features enabled. Third party components are necessary to get full support for certain functionality.</p>
 
-<h3 id="jai-image-i-o">JAI Image I/O</h3>
+<h3 id="jai-image-io">JAI Image I/O</h3>
 
 <p>PDF supports embedded image files; however, support for some formats requires third party libraries which are distributed under terms incompatible with the Apache 2.0 license:</p>
 
 <ul>
-<li>Reading <strong>JBIG2</strong> images: <a href="https://github.com/levigo/jbig2-imageio">JBIG2 ImageIO</a> or <a href="https://github.com/Borisvl/JBIG2-Image-Decoder">JBIG2-Image-Decoder
+  <li>Reading <strong>JBIG2</strong> images: <a href="https://github.com/levigo/jbig2-imageio">JBIG2 ImageIO</a> or <a href="https://github.com/Borisvl/JBIG2-Image-Decoder">JBIG2-Image-Decoder
 </a></li>
-<li>Reading <strong>JPEG 2000 (JPX)</strong> images: <a href="https://java.net/projects/jai-imageio-core">JAI Image I/O Tools Core</a></li>
-<li>Writing <strong>TIFF</strong> images requires <em>JAI Image I/O Tools Core</em> also.</li>
+  <li>Reading <strong>JPEG 2000 (JPX)</strong> images: <a href="https://java.net/projects/jai-imageio-core">JAI Image I/O Tools Core</a></li>
+  <li>Writing <strong>TIFF</strong> images requires <em>JAI Image I/O Tools Core</em> also.</li>
 </ul>
 
 <p>These libraries are optional and will be loaded if present on the classpath, otherwise support for these image formats will be disabled and a warning will be logged when an unsupported image is encountered.</p>
@@ -167,7 +167,8 @@ included in the Java platform.</p>
 <h3 id="encryption-and-signing">Encryption and Signing</h3>
 
 <p>Encrypting and signing PDFs requires the <em>bcprov</em>, <em>bcmail</em> and <em>bcpkix</em> libraries from the <a href="http://www.bouncycastle.org/">Legion of the Bouncy Castle</a>. These can be included in your Maven project using the following dependencies:</p>
-<div class="highlight"><pre><code class="language-" data-lang="">&lt;dependency&gt;
+
+<div class="highlighter-rouge"><pre class="highlight"><code>&lt;dependency&gt;
     &lt;groupId&gt;org.bouncycastle&lt;/groupId&gt;
     &lt;artifactId&gt;bcprov-jdk15on&lt;/artifactId&gt;
     &lt;version&gt;1.54&lt;/version&gt;
@@ -184,12 +185,17 @@ included in the Java platform.</p>
     &lt;artifactId&gt;bcpkix-jdk15on&lt;/artifactId&gt;
     &lt;version&gt;1.54&lt;/version&gt;
 &lt;/dependency&gt;
-</code></pre></div>
+</code></pre>
+</div>
+
 <h3 id="java-cryptography-extension-jce">Java Cryptography Extension (JCE)</h3>
 
-<p>256-bit AES encryption requires a JDK with &quot;unlimited strength&quot; cryptography, which requires extra files to be installed. For JDK 7, see <a href="http://www.oracle.com/technetwork/java/javase/downloads/jce-7-download-432124.html">Java Cryptography Extension (JCE)</a>. If these files are not installed, building PDFBox will throw an exception with the following message:</p>
-<div class="highlight"><pre><code class="language-" data-lang="">JCE unlimited strength jurisdiction policy files are not installed
-</code></pre></div>
+<p>256-bit AES encryption requires a JDK with “unlimited strength” cryptography, which requires extra files to be installed. For JDK 7, see <a href="http://www.oracle.com/technetwork/java/javase/downloads/jce-7-download-432124.html">Java Cryptography Extension (JCE)</a>. If these files are not installed, building PDFBox will throw an exception with the following message:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>JCE unlimited strength jurisdiction policy files are not installed
+</code></pre>
+</div>
+
             </div>
         </div>
     </div>
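
Note on the JCE paragraph in the hunk above: whether a given JDK allows 256-bit AES can be checked before building. The following is a minimal sketch, assuming only the standard javax.crypto API; the class name JcePolicyCheck and the helper allowsAes256 are hypothetical and are not part of PDFBox.

```java
import java.security.NoSuchAlgorithmException;
import javax.crypto.Cipher;

// Hypothetical helper class (not part of PDFBox): reports whether the running
// JVM's policy files permit AES keys of at least 256 bits.
public class JcePolicyCheck {

    // True if the given maximum key length (in bits) is enough for AES-256.
    static boolean allowsAes256(int maxKeyLengthBits) {
        return maxKeyLengthBits >= 256;
    }

    public static void main(String[] args) throws NoSuchAlgorithmException {
        // Returns Integer.MAX_VALUE when the unlimited strength policy is active.
        int maxKeyLength = Cipher.getMaxAllowedKeyLength("AES");
        System.out.println("Max allowed AES key length: " + maxKeyLength);
        if (allowsAes256(maxKeyLength)) {
            System.out.println("256-bit AES is permitted by the installed policy");
        } else {
            System.out.println("JCE unlimited strength policy files appear to be missing");
        }
    }
}
```

Running the class with the same JDK that will build PDFBox shows which of the two cases applies.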

http://git-wip-us.apache.org/repos/asf/pdfbox-docs/blob/799b1d78/content/2.0/getting-started.html
----------------------------------------------------------------------
diff --git a/content/2.0/getting-started.html b/content/2.0/getting-started.html
index 1d580f8..84d5120 100644
--- a/content/2.0/getting-started.html
+++ b/content/2.0/getting-started.html
@@ -136,29 +136,31 @@
 
 <h2 id="maven">Maven</h2>
 
-<p>To use the latest 2.0 snapshot release from the SVN trunk, you&#39;ll need to add the following dependency:</p>
-<div class="highlight"><pre><code class="language-" data-lang="">&lt;dependency&gt;
+<p>To use the latest 2.0 snapshot release from the SVN trunk, you’ll need to add the following dependency:</p>
+
+<div class="highlighter-rouge"><pre class="highlight"><code>&lt;dependency&gt;
   &lt;groupId&gt;org.apache.pdfbox&lt;/groupId&gt;
   &lt;artifactId&gt;pdfbox&lt;/artifactId&gt;
   &lt;version&gt;2.0.0&lt;/version&gt;
 &lt;/dependency&gt;
-</code></pre></div>
+</code></pre>
+</div>
+
 <h2 id="pdfbox-and-java-8">PDFBox and Java 8</h2>
 
 <p class="alert alert-warning">Important notice when using PDFBox with Java 8
 </p>
-
-<p>Due to the change of the java color management module towards &quot;LittleCMS&quot;, users can experience slow performance in color operations.
+<p>Due to the change of the java color management module towards “LittleCMS”, users can experience slow performance in color operations.
 Solution: disable LittleCMS in favour of the old KCMS (Kodak Color Management System):</p>
 
 <ul>
-<li>start with <code>-Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider</code>or call</li>
-<li><code>System.setProperty(&quot;sun.java2d.cmm&quot;, &quot;sun.java2d.cmm.kcms.KcmsServiceProvider&quot;);</code></li>
+  <li>start with <code class="highlighter-rouge">-Dsun.java2d.cmm=sun.java2d.cmm.kcms.KcmsServiceProvider</code>or call</li>
+  <li><code class="highlighter-rouge">System.setProperty("sun.java2d.cmm", "sun.java2d.cmm.kcms.KcmsServiceProvider");</code></li>
 </ul>
 
-<p>Sources:<br>
-<a href="http://www.subshell.com/en/subshell/blog/Wrong-Colors-in-Images-with-Java8-100.html">http://www.subshell.com/en/subshell/blog/Wrong-Colors-in-Images-with-Java8-100.html</a><br>
-<a href="https://bugs.openjdk.java.net/browse/JDK-8041125">https://bugs.openjdk.java.net/browse/JDK-8041125</a></p>
+<p>Sources:<br />
+http://www.subshell.com/en/subshell/blog/Wrong-Colors-in-Images-with-Java8-100.html<br />
+https://bugs.openjdk.java.net/browse/JDK-8041125</p>
 
             </div>
         </div>