Posted to commits@samza.apache.org by ma...@apache.org on 2014/06/14 00:21:08 UTC

svn commit: r1602533 [4/5] - in /incubator/samza/site: ./ community/ contribute/ learn/documentation/0.7.0/ learn/documentation/0.7.0/api/ learn/documentation/0.7.0/comparisons/ learn/documentation/0.7.0/container/ learn/documentation/0.7.0/introductio...

Modified: incubator/samza/site/learn/documentation/0.7.0/introduction/concepts.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/learn/documentation/0.7.0/introduction/concepts.html?rev=1602533&r1=1602532&r2=1602533&view=diff
==============================================================================
--- incubator/samza/site/learn/documentation/0.7.0/introduction/concepts.html (original)
+++ incubator/samza/site/learn/documentation/0.7.0/introduction/concepts.html Fri Jun 13 22:21:06 2014
@@ -1,4 +1,20 @@
 <!DOCTYPE html>
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
 <html lang="en">
   <head>
     <meta charset="utf-8">
@@ -70,59 +86,93 @@
           </div>
 
           <div class="content">
-            <h2>Concepts</h2>
+            <!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<h2>Concepts</h2>
+
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
 
 <p>This page gives an introduction to the high-level concepts in Samza.</p>
 
-<h3>Streams</h3>
+<h3 id="toc_0">Streams</h3>
 
-<p>Samza processes <em>streams</em>. A stream is composed of immutable <em>messages</em> of a similar type or category. For example, a stream could be all the clicks on a website, or all the updates to a particular database table, or all the logs produced by a service, or any other type of event data. Messages can be appended to a stream or read from a stream. A stream can have any number of <em>consumers</em>, and reading from a stream doesn&#39;t delete the message (so each message is effectively broadcast to all consumers). Messages can optionally have an associated key which is used for partitioning, which we&#39;ll talk about in a second.</p>
+<p>Samza processes <em>streams</em>. A stream is composed of immutable <em>messages</em> of a similar type or category. For example, a stream could be all the clicks on a website, or all the updates to a particular database table, or all the logs produced by a service, or any other type of event data. Messages can be appended to a stream or read from a stream. A stream can have any number of <em>consumers</em>, and reading from a stream doesn&rsquo;t delete the message (so each message is effectively broadcast to all consumers). Messages can optionally have an associated key which is used for partitioning, which we&rsquo;ll talk about in a second.</p>
 
 <p>Samza supports pluggable <em>systems</em> that implement the stream abstraction: in <a href="https://kafka.apache.org/">Kafka</a> a stream is a topic, in a database we might read a stream by consuming updates from a table, in Hadoop we might tail a directory of files in HDFS.</p>
 
 <p><img src="/img/0.7.0/learn/documentation/introduction/job.png" alt="job"></p>
 
-<h3>Jobs</h3>
+<h3 id="toc_1">Jobs</h3>
 
 <p>A Samza <em>job</em> is code that performs a logical transformation on a set of input streams to append output messages to a set of output streams.</p>
 
 <p>If scalability were not a concern, streams and jobs would be all we need. However, in order to scale the throughput of the stream processor, we chop streams and jobs up into smaller units of parallelism: <em>partitions</em> and <em>tasks</em>.</p>
 
-<h3>Partitions</h3>
+<h3 id="toc_2">Partitions</h3>
 
 <p>Each stream is broken into one or more partitions. Each partition in the stream is a totally ordered sequence of messages.</p>
 
 <p>Each message in this sequence has an identifier called the <em>offset</em>, which is unique per partition. The offset can be a sequential integer, byte offset, or string depending on the underlying system implementation.</p>
 
-<p>When a message is appended to a stream, it is appended to only one of the stream&#39;s partitions. The assignment of the message to its partition is done with a key chosen by the writer. For example, if the user ID is used as the key, that ensures that all messages related to a particular user end up in the same partition.</p>
+<p>When a message is appended to a stream, it is appended to only one of the stream&rsquo;s partitions. The assignment of the message to its partition is done with a key chosen by the writer. For example, if the user ID is used as the key, that ensures that all messages related to a particular user end up in the same partition.</p>
 
 <p><img src="/img/0.7.0/learn/documentation/introduction/stream.png" alt="stream"></p>
 
-<h3>Tasks</h3>
+<h3 id="toc_3">Tasks</h3>
 
-<p>A job is scaled by breaking it into multiple <em>tasks</em>. The <em>task</em> is the unit of parallelism of the job, just as the partition is to the stream. Each task consumes data from one partition for each of the job&#39;s input streams.</p>
+<p>A job is scaled by breaking it into multiple <em>tasks</em>. The <em>task</em> is the unit of parallelism of the job, just as the partition is to the stream. Each task consumes data from one partition for each of the job&rsquo;s input streams.</p>
 
 <p>A task processes messages from each of its input partitions sequentially, in the order of message offset. There is no defined ordering across partitions. This allows each task to operate independently. The YARN scheduler assigns each task to a machine, so the job as a whole can be distributed across many machines.</p>
 
-<p>The number of tasks in a job is determined by the number of input partitions (there cannot be more tasks than input partitions, or there would be some tasks with no input). However, you can change the computational resources assigned to the job (the amount of memory, number of CPU cores, etc.) to satisfy the job&#39;s needs. See notes on <em>containers</em> below.</p>
+<p>The number of tasks in a job is determined by the number of input partitions (there cannot be more tasks than input partitions, or there would be some tasks with no input). However, you can change the computational resources assigned to the job (the amount of memory, number of CPU cores, etc.) to satisfy the job&rsquo;s needs. See notes on <em>containers</em> below.</p>
 
 <p>The assignment of partitions to tasks never changes: if a task is on a machine that fails, the task is restarted elsewhere, still consuming the same stream partitions.</p>
 
 <p><img src="/img/0.7.0/learn/documentation/introduction/job_detail.png" alt="job-detail"></p>
 
-<h3>Dataflow Graphs</h3>
+<h3 id="toc_4">Dataflow Graphs</h3>
 
 <p>We can compose multiple jobs to create a dataflow graph, where the nodes are streams containing data, and the edges are jobs performing transformations. This composition is done purely through the streams the jobs take as input and output. The jobs are otherwise totally decoupled: they need not be implemented in the same code base, and adding, removing, or restarting a downstream job will not impact an upstream job.</p>
 
-<p>These graphs are often acyclic&mdash;that is, data usually doesn&#39;t flow from a job, through other jobs, back to itself. However, it is possible to create cyclic graphs if you need to.</p>
+<p>These graphs are often acyclic&mdash;that is, data usually doesn&rsquo;t flow from a job, through other jobs, back to itself. However, it is possible to create cyclic graphs if you need to.</p>
 
 <p><img src="/img/0.7.0/learn/documentation/introduction/dag.png" width="430" alt="Directed acyclic job graph"></p>
 
-<h3>Containers</h3>
+<h3 id="toc_5">Containers</h3>
 
-<p>Partitions and tasks are both <em>logical</em> units of parallelism&mdash;they don&#39;t correspond to any particular assignment of computational resources (CPU, memory, disk space, etc). Containers are the unit of physical parallelism, and a container is essentially a Unix process (or Linux <a href="http://en.wikipedia.org/wiki/Cgroups">cgroup</a>). Each container runs one or more tasks. The number of tasks is determined automatically from the number of partitions in the input and is fixed, but the number of containers (and the CPU and memory resources associated with them) is specified by the user at run time and can be changed at any time.</p>
+<p>Partitions and tasks are both <em>logical</em> units of parallelism&mdash;they don&rsquo;t correspond to any particular assignment of computational resources (CPU, memory, disk space, etc). Containers are the unit of physical parallelism, and a container is essentially a Unix process (or Linux <a href="http://en.wikipedia.org/wiki/Cgroups">cgroup</a>). Each container runs one or more tasks. The number of tasks is determined automatically from the number of partitions in the input and is fixed, but the number of containers (and the CPU and memory resources associated with them) is specified by the user at run time and can be changed at any time.</p>
 
-<h2><a href="architecture.html">Architecture &raquo;</a></h2>
+<h2 id="toc_6"><a href="architecture.html">Architecture &raquo;</a></h2>
 
 
           </div>
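
Note on the concepts.html text above: the page says a message is assigned to a partition using a key chosen by the writer, so that all messages with the same key land in the same partition. A minimal sketch of that idea follows. The class name KeyPartitioner and the hash-modulo scheme are assumptions made for illustration; this is not the partitioner used by Samza or Kafka.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper for illustration only; not part of Samza or Kafka. */
final class KeyPartitioner {
    private KeyPartitioner() {}

    /** Maps a message key to one of numPartitions partitions (0-based). */
    static int partitionFor(String key, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        // Hash the key bytes; equal keys always produce equal hashes.
        int hash = Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
        // floorMod keeps the result in [0, numPartitions) even for negative hashes.
        return Math.floorMod(hash, numPartitions);
    }

    public static void main(String[] args) {
        int partitions = 4; // assumed partition count for the example stream
        for (String userId : new String[] {"alice", "bob", "alice"}) {
            System.out.println(userId + " -> partition " + partitionFor(userId, partitions));
        }
    }
}
```

Both "alice" lines print the same partition number, which is the property the page relies on: a task that owns that partition sees every message for that user, in offset order.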

Modified: incubator/samza/site/learn/documentation/0.7.0/jobs/configuration-table.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/learn/documentation/0.7.0/jobs/configuration-table.html?rev=1602533&r1=1602532&r2=1602533&view=diff
==============================================================================
--- incubator/samza/site/learn/documentation/0.7.0/jobs/configuration-table.html (original)
+++ incubator/samza/site/learn/documentation/0.7.0/jobs/configuration-table.html Fri Jun 13 22:21:06 2014
@@ -1,3 +1,19 @@
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
 <html>
   <body>
     <table cellspacing="2" border="1" cellpadding="2">

Modified: incubator/samza/site/learn/documentation/0.7.0/jobs/configuration.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/learn/documentation/0.7.0/jobs/configuration.html?rev=1602533&r1=1602532&r2=1602533&view=diff
==============================================================================
--- incubator/samza/site/learn/documentation/0.7.0/jobs/configuration.html (original)
+++ incubator/samza/site/learn/documentation/0.7.0/jobs/configuration.html Fri Jun 13 22:21:06 2014
@@ -1,4 +1,20 @@
 <!DOCTYPE html>
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
 <html lang="en">
   <head>
     <meta charset="utf-8">
@@ -70,7 +86,41 @@
           </div>
 
           <div class="content">
-            <h2>Configuration</h2>
+            <!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<h2>Configuration</h2>
+
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
 
 <p>All Samza jobs have a configuration file that defines the job. A very basic configuration file looks like this:</p>
 <div class="highlight"><pre><code class="text language-text" data-lang="text"># Job
@@ -94,12 +144,12 @@ systems.example-system.samza.msg.serde=j
 
 <ol>
 <li>The job section defines things like the name of the job, and whether to use the YarnJobFactory or LocalJobFactory.</li>
-<li>The task section is where you specify the class name for your <a href="../api/overview.html">StreamTask</a>. It&#39;s also where you define what the <a href="../container/streams.html">input streams</a> are for your task.</li>
+<li>The task section is where you specify the class name for your <a href="../api/overview.html">StreamTask</a>. It&rsquo;s also where you define what the <a href="../container/streams.html">input streams</a> are for your task.</li>
 <li>The serializers section defines the classes of the <a href="../container/serialization.html">serdes</a> used for serialization and deserialization of specific objects that are received and sent along different streams.</li>
-<li>The system section defines systems that your StreamTask can read from along with the types of serdes used for sending keys and messages from that system. Usually, you&#39;ll define a Kafka system, if you&#39;re reading from Kafka, although you can also specify your own self-implemented Samza-compatible systems. See the <a href="/startup/hello-samza/0.7.0">hello-samza example project</a>&#39;s Wikipedia system for a good example of a self-implemented system.</li>
+<li>The system section defines systems that your StreamTask can read from along with the types of serdes used for sending keys and messages from that system. Usually, you&rsquo;ll define a Kafka system, if you&rsquo;re reading from Kafka, although you can also specify your own self-implemented Samza-compatible systems. See the <a href="/startup/hello-samza/0.7.0">hello-samza example project</a>&rsquo;s Wikipedia system for a good example of a self-implemented system.</li>
 </ol>
 
-<h3>Required Configuration</h3>
+<h3 id="toc_0">Required Configuration</h3>
 
 <p>Configuration keys that absolutely must be defined for a Samza job are:</p>
 
@@ -110,11 +160,11 @@ systems.example-system.samza.msg.serde=j
 <li>task.inputs</li>
 </ul>
 
-<h3>Configuration Keys</h3>
+<h3 id="toc_1">Configuration Keys</h3>
 
 <p>A complete list of configuration keys can be found on the <a href="configuration-table.html">Configuration Table</a> page.</p>
 
-<h2><a href="packaging.html">Packaging &raquo;</a></h2>
+<h2 id="toc_2"><a href="packaging.html">Packaging &raquo;</a></h2>
 
 
           </div>

Modified: incubator/samza/site/learn/documentation/0.7.0/jobs/job-runner.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/learn/documentation/0.7.0/jobs/job-runner.html?rev=1602533&r1=1602532&r2=1602533&view=diff
==============================================================================
--- incubator/samza/site/learn/documentation/0.7.0/jobs/job-runner.html (original)
+++ incubator/samza/site/learn/documentation/0.7.0/jobs/job-runner.html Fri Jun 13 22:21:06 2014
@@ -1,4 +1,20 @@
 <!DOCTYPE html>
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
 <html lang="en">
   <head>
     <meta charset="utf-8">
@@ -70,7 +86,41 @@
           </div>
 
           <div class="content">
-            <h2>JobRunner</h2>
+            <!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<h2>JobRunner</h2>
+
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
 
 <p>Samza jobs are started using a script called run-job.sh.</p>
 <div class="highlight"><pre><code class="text language-text" data-lang="text">samza-example/target/bin/run-job.sh \
@@ -84,7 +134,7 @@
 </code></pre></div>
 <p>The Config object is just a wrapper around Map&lt;String, String&gt;, with some nice helper methods. Out of the box, Samza ships with the PropertiesConfigFactory, but developers can implement any kind of ConfigFactory they wish.</p>
 
-<p>Once the JobRunner gets your configuration, it gives your configuration to the StreamJobFactory class defined by the &quot;job.factory&quot; property. Samza ships with two job factory implementations: LocalJobFactory and YarnJobFactory. The StreamJobFactory&#39;s responsibility is to give the JobRunner a job that it can run.</p>
+<p>Once the JobRunner gets your configuration, it gives your configuration to the StreamJobFactory class defined by the &ldquo;job.factory&rdquo; property. Samza ships with two job factory implementations: LocalJobFactory and YarnJobFactory. The StreamJobFactory&rsquo;s responsibility is to give the JobRunner a job that it can run.</p>
 <div class="highlight"><pre><code class="text language-text" data-lang="text">public interface StreamJob {
   StreamJob submit();
 
@@ -99,9 +149,9 @@
 </code></pre></div>
 <p>Once the JobRunner gets a job, it calls submit() on the job. This method is what tells the StreamJob implementation to start the SamzaContainer. In the case of LocalJobFactory, it uses a run-container.sh script to execute the SamzaContainer in a separate process, which will start one SamzaContainer locally on the machine where you ran run-job.sh.</p>
 
-<p>This flow differs slightly when you use YARN, but we&#39;ll get to that later.</p>
+<p>This flow differs slightly when you use YARN, but we&rsquo;ll get to that later.</p>
 
-<h2><a href="configuration.html">Configuration &raquo;</a></h2>
+<h2 id="toc_0"><a href="configuration.html">Configuration &raquo;</a></h2>
 
 
           </div>
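
Note on the job-runner.html text above: the page describes the Config object as a wrapper around a map of strings to strings with helper methods. The sketch below shows what such a wrapper might look like. SimpleConfig, its methods, and the key example.retry.count are hypothetical names invented for this illustration; they are not Samza's Config class or its API. Only task.inputs is taken from the documentation in this commit.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for a Config-style wrapper; not Samza's Config class. */
final class SimpleConfig {
    private final Map<String, String> values;

    SimpleConfig(Map<String, String> values) {
        // Defensive copy so later changes to the caller's map do not leak in.
        this.values = Collections.unmodifiableMap(new HashMap<>(values));
    }

    /** Returns the value for key, or defaultValue when the key is absent. */
    String get(String key, String defaultValue) {
        String value = values.get(key);
        return value != null ? value : defaultValue;
    }

    /** Parses the value for key as an int, or returns defaultValue when absent. */
    int getInt(String key, int defaultValue) {
        String value = values.get(key);
        return value != null ? Integer.parseInt(value.trim()) : defaultValue;
    }

    public static void main(String[] args) {
        Map<String, String> raw = new HashMap<>();
        raw.put("task.inputs", "example-system.example-stream");
        raw.put("example.retry.count", "3"); // hypothetical key, illustration only

        SimpleConfig config = new SimpleConfig(raw);
        System.out.println(config.get("task.inputs", "<none>"));
        System.out.println(config.getInt("example.retry.count", 0));
    }
}
```

Keeping every value as a string, as the page describes, means any ConfigFactory only has to produce a flat map; typed access is layered on top by helpers like getInt.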

Modified: incubator/samza/site/learn/documentation/0.7.0/jobs/logging.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/learn/documentation/0.7.0/jobs/logging.html?rev=1602533&r1=1602532&r2=1602533&view=diff
==============================================================================
--- incubator/samza/site/learn/documentation/0.7.0/jobs/logging.html (original)
+++ incubator/samza/site/learn/documentation/0.7.0/jobs/logging.html Fri Jun 13 22:21:06 2014
@@ -1,4 +1,20 @@
 <!DOCTYPE html>
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
 <html lang="en">
   <head>
     <meta charset="utf-8">
@@ -70,13 +86,47 @@
           </div>
 
           <div class="content">
-            <h2>Logging</h2>
+            <!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<h2>Logging</h2>
+
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
 
 <p>Samza uses <a href="http://www.slf4j.org/">SLF4J</a> for all of its logging. By default, Samza only depends on slf4j-api, so you must add an SLF4J runtime dependency to your Samza packages for whichever underlying logging platform you wish to use.</p>
 
-<h3>Log4j</h3>
+<h3 id="toc_0">Log4j</h3>
 
-<p>The <a href="/startup/hello-samza/0.7.0">hello-samza</a> project shows how to use <a href="http://logging.apache.org/log4j/1.2/">log4j</a> with Samza. To turn on log4j logging, you just need to make sure slf4j-log4j12 is in your SamzaContainer&#39;s classpath. In Maven, this can be done by adding the following dependency to your Samza package project.</p>
+<p>The <a href="/startup/hello-samza/0.7.0">hello-samza</a> project shows how to use <a href="http://logging.apache.org/log4j/1.2/">log4j</a> with Samza. To turn on log4j logging, you just need to make sure slf4j-log4j12 is in your SamzaContainer&rsquo;s classpath. In Maven, this can be done by adding the following dependency to your Samza package project.</p>
 <div class="highlight"><pre><code class="text language-text" data-lang="text">&lt;dependency&gt;
   &lt;groupId&gt;org.slf4j&lt;/groupId&gt;
   &lt;artifactId&gt;slf4j-log4j12&lt;/artifactId&gt;
@@ -84,17 +134,17 @@
   &lt;version&gt;1.6.2&lt;/version&gt;
 &lt;/dependency&gt;
 </code></pre></div>
-<p>If you&#39;re not using Maven, just make sure that slf4j-log4j12 ends up in your Samza package&#39;s lib directory.</p>
+<p>If you&rsquo;re not using Maven, just make sure that slf4j-log4j12 ends up in your Samza package&rsquo;s lib directory.</p>
 
-<h4>Log4j configuration</h4>
+<h4 id="toc_1">Log4j configuration</h4>
 
-<p>Samza&#39;s <a href="packaging.html">run-class.sh</a> script will automatically set the following setting if log4j.xml exists in your <a href="packaging.html">Samza package&#39;s</a> lib directory.</p>
+<p>Samza&rsquo;s <a href="packaging.html">run-class.sh</a> script will automatically set the following setting if log4j.xml exists in your <a href="packaging.html">Samza package&rsquo;s</a> lib directory.</p>
 <div class="highlight"><pre><code class="text language-text" data-lang="text">-Dlog4j.configuration=file:$base_dir/lib/log4j.xml
 </code></pre></div>
 <p>The <a href="packaging.html">run-class.sh</a> script will also set the following Java system properties:</p>
 <div class="highlight"><pre><code class="text language-text" data-lang="text">-Dsamza.log.dir=$SAMZA_LOG_DIR -Dsamza.container.name=$SAMZA_CONTAINER_NAME
 </code></pre></div>
-<p>These settings are very useful if you&#39;re using a file-based appender. For example, you can use a daily rolling appender by configuring log4j.xml like this:</p>
+<p>These settings are very useful if you&rsquo;re using a file-based appender. For example, you can use a daily rolling appender by configuring log4j.xml like this:</p>
 <div class="highlight"><pre><code class="text language-text" data-lang="text">&lt;appender name=&quot;RollingAppender&quot; class=&quot;org.apache.log4j.DailyRollingFileAppender&quot;&gt;
    &lt;param name=&quot;File&quot; value=&quot;${samza.log.dir}/${samza.container.name}.log&quot; /&gt;
    &lt;param name=&quot;DatePattern&quot; value=&quot;&#39;.&#39;yyyy-MM-dd&quot; /&gt;
@@ -103,32 +153,32 @@
    &lt;/layout&gt;
 &lt;/appender&gt;
 </code></pre></div>
-<p>Setting up a file-based appender is recommended as a better alternative to using standard out. Standard out log files (see below) don&#39;t roll, and can get quite large if used for logging.</p>
+<p>Setting up a file-based appender is recommended as a better alternative to using standard out. Standard out log files (see below) don&rsquo;t roll, and can get quite large if used for logging.</p>
 
 <p><strong>NOTE:</strong> If you use the task.opts configuration property, the log configuration is disrupted. This is a known bug; please see <a href="https://issues.apache.org/jira/browse/SAMZA-109">SAMZA-109</a> for a workaround.</p>
 
-<h3>Log Directory</h3>
+<h3 id="toc_2">Log Directory</h3>
 
 <p>Samza will look for the <em>SAMZA</em>_<em>LOG</em>_<em>DIR</em> environment variable when it executes. If this variable is defined, all logs will be written to this directory. If the environment variable is empty, or not defined, then Samza will use /tmp. This environment variable can also be referenced inside log4j.xml files (see above).</p>
 
-<h3>Garbage Collection Logging</h3>
+<h3 id="toc_3">Garbage Collection Logging</h3>
 
-<p>Samza&#39;s will automatically set the following garbage collection logging setting, and will output it to <em>$SAMZA</em>_<em>LOG</em>_<em>DIR</em>/gc.log.</p>
+<p>Samza&rsquo;s will automatically set the following garbage collection logging setting, and will output it to <em>$SAMZA</em>_<em>LOG</em>_<em>DIR</em>/gc.log.</p>
 <div class="highlight"><pre><code class="text language-text" data-lang="text">-XX:+PrintGCDateStamps -Xloggc:$SAMZA_LOG_DIR/gc.log
 </code></pre></div>
-<h4>Rotation</h4>
+<h4 id="toc_4">Rotation</h4>
 
-<p>In older versions of Java, it is impossible to have GC logs roll over based on time or size without the use of a secondary tool. This means that your GC logs will never be deleted until a Samza job ceases to run. As of <a href="http://www.oracle.com/technetwork/java/javase/2col/6u34-bugfixes-1733379.html">Java 6 Update 34</a>, and <a href="http://www.oracle.com/technetwork/java/javase/7u2-relnotes-1394228.html">Java 7 Update 2</a>, <a href="http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6941923">new GC command line switches</a> have been added to support this functionality. If you are using a version of Java that supports GC log rotation, it&#39;s highly recommended that you turn it on.</p>
+<p>In older versions of Java, it is impossible to have GC logs roll over based on time or size without the use of a secondary tool. This means that your GC logs will never be deleted until a Samza job ceases to run. As of <a href="http://www.oracle.com/technetwork/java/javase/2col/6u34-bugfixes-1733379.html">Java 6 Update 34</a>, and <a href="http://www.oracle.com/technetwork/java/javase/7u2-relnotes-1394228.html">Java 7 Update 2</a>, <a href="http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=6941923">new GC command line switches</a> have been added to support this functionality. If you are using a version of Java that supports GC log rotation, it&rsquo;s highly recommended that you turn it on.</p>
 
-<h3>YARN</h3>
+<h3 id="toc_5">YARN</h3>
 
 <p>When a Samza job executes on a YARN grid, the <em>$SAMZA</em>_<em>LOG</em>_<em>DIR</em> environment variable will point to a directory that is secured such that only the user executing the Samza job can read and write to it, if YARN is <a href="http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/ClusterSetup.html">securely configured</a>.</p>
 
-<h4>STDOUT</h4>
+<h4 id="toc_6">STDOUT</h4>
 
-<p>Samza&#39;s <a href="../yarn/application-master.html">ApplicationMaster</a> pipes all STDOUT and STDERR output to logs/stdout and logs/stderr, respectively. These files are never rotated.</p>
+<p>Samza&rsquo;s <a href="../yarn/application-master.html">ApplicationMaster</a> pipes all STDOUT and STDERR output to logs/stdout and logs/stderr, respectively. These files are never rotated.</p>
 
-<h2><a href="../yarn/application-master.html">Application Master &raquo;</a></h2>
+<h2 id="toc_7"><a href="reprocessing.html">Reprocessing &raquo;</a></h2>
 
 
           </div>

Modified: incubator/samza/site/learn/documentation/0.7.0/jobs/packaging.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/learn/documentation/0.7.0/jobs/packaging.html?rev=1602533&r1=1602532&r2=1602533&view=diff
==============================================================================
--- incubator/samza/site/learn/documentation/0.7.0/jobs/packaging.html (original)
+++ incubator/samza/site/learn/documentation/0.7.0/jobs/packaging.html Fri Jun 13 22:21:06 2014
@@ -1,4 +1,20 @@
 <!DOCTYPE html>
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
 <html lang="en">
   <head>
     <meta charset="utf-8">
@@ -70,13 +86,47 @@
           </div>
 
           <div class="content">
-            <h2>Packaging</h2>
+            <!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<h2>Packaging</h2>
+
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
 
-<p>The <a href="job-runner.html">JobRunner</a> page talks about run-job.sh, and how it&#39;s used to start a job either locally (LocalJobFactory) or with YARN (YarnJobFactory). In the diagram that shows the execution flow, it also shows a run-container.sh script. This script, along with a run-am.sh script, are what Samza actually calls to execute its code.</p>
+<p>The <a href="job-runner.html">JobRunner</a> page talks about run-job.sh, and how it&rsquo;s used to start a job either locally (LocalJobFactory) or with YARN (YarnJobFactory). In the diagram that shows the execution flow, it also shows a run-container.sh script. This script, along with a run-am.sh script, are what Samza actually calls to execute its code.</p>
 <div class="highlight"><pre><code class="text language-text" data-lang="text">bin/run-am.sh
 bin/run-container.sh
 </code></pre></div>
-<p>The run-container.sh script is responsible for starting the <a href="../container/samza-container.html">SamzaContainer</a>. The run-am.sh script is responsible for starting Samza&#39;s application master for YARN. Thus, the run-am.sh script is only used by the YarnJob, but both YarnJob and ProcessJob use run-container.sh.</p>
+<p>The run-container.sh script is responsible for starting the <a href="../container/samza-container.html">SamzaContainer</a>. The run-am.sh script is responsible for starting Samza&rsquo;s application master for YARN. Thus, the run-am.sh script is only used by the YarnJob, but both YarnJob and ProcessJob use run-container.sh.</p>
 
 <p>Typically, these two scripts are bundled into a tar.gz file that has a structure like this:</p>
 <div class="highlight"><pre><code class="text language-text" data-lang="text">bin/run-am.sh
@@ -85,13 +135,13 @@ bin/run-job.sh
 bin/run-container.sh
 lib/*.jar
 </code></pre></div>
-<p>To run a Samza job, you un-zip its tar.gz file, and execute the run-job.sh script, as defined in the JobRunner section. There are a number of interesting implications from this packaging scheme. First, you&#39;ll notice that there is no configuration in the package. Second, you&#39;ll notice that the lib directory contains all JARs that you&#39;ll need to run your Samza job.</p>
+<p>To run a Samza job, you un-zip its tar.gz file, and execute the run-job.sh script, as defined in the JobRunner section. There are a number of interesting implications from this packaging scheme. First, you&rsquo;ll notice that there is no configuration in the package. Second, you&rsquo;ll notice that the lib directory contains all JARs that you&rsquo;ll need to run your Samza job.</p>
 
-<p>The reason that configuration is decoupled from your Samza job packaging is that it allows configuration to be updated without having to re-build the entire Samza package. This makes life easier for everyone when you just need to tweak one parameter, and don&#39;t want to have to worry about which branch your package was built from, or whether trunk is in a stable state. It also has the added benefit of forcing configuration to be fully resolved at runtime. This means that that the configuration for a job is resolved at the time run-job.sh is called (using --config-path and --config-provider parameters), and from that point on, the configuration is immutable, and passed where it needs to be by Samza (and YARN, if you&#39;re using it).</p>
+<p>The reason that configuration is decoupled from your Samza job packaging is that it allows configuration to be updated without having to re-build the entire Samza package. This makes life easier for everyone when you just need to tweak one parameter, and don&rsquo;t want to have to worry about which branch your package was built from, or whether trunk is in a stable state. It also has the added benefit of forcing configuration to be fully resolved at runtime. This means that the configuration for a job is resolved at the time run-job.sh is called (using --config-path and --config-provider parameters), and from that point on, the configuration is immutable, and passed where it needs to be by Samza (and YARN, if you&rsquo;re using it).</p>
 
 <p>The second statement, that your Samza package contains all JARs that it needs to run, means that a Samza package is entirely self contained. This allows Samza jobs to run on independent Samza versions without conflicting with each other. This is in contrast to Hadoop, where JARs are pulled in from the local machine that the job is running on (using environment variables). With Samza, you might run your job on version 0.7.0, and someone else might run their job on version 0.8.0. There is no problem with this.</p>
 
-<h2><a href="yarn-jobs.html">YARN Jobs &raquo;</a></h2>
+<h2 id="toc_0"><a href="yarn-jobs.html">YARN Jobs &raquo;</a></h2>
 
 
           </div>

Modified: incubator/samza/site/learn/documentation/0.7.0/jobs/yarn-jobs.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/learn/documentation/0.7.0/jobs/yarn-jobs.html?rev=1602533&r1=1602532&r2=1602533&view=diff
==============================================================================
--- incubator/samza/site/learn/documentation/0.7.0/jobs/yarn-jobs.html (original)
+++ incubator/samza/site/learn/documentation/0.7.0/jobs/yarn-jobs.html Fri Jun 13 22:21:06 2014
@@ -1,4 +1,20 @@
 <!DOCTYPE html>
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
 <html lang="en">
   <head>
     <meta charset="utf-8">
@@ -70,18 +86,52 @@
           </div>
 
           <div class="content">
-            <h2>YARN Jobs</h2>
+            <!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<h2>YARN Jobs</h2>
+
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
 
-<p>When you define job.factory.class=org.apache.samza.job.yarn.YarnJobFactory in your job&#39;s configuration, Samza will use YARN to execute your job. The YarnJobFactory will use the YARN_HOME environment variable on the machine that run-job.sh is executed on to get the appropriate YARN configuration, which will define where the YARN resource manager is. The YarnJob will work with the resource manager to get your job started on the YARN cluster.</p>
+<p>When you define job.factory.class=org.apache.samza.job.yarn.YarnJobFactory in your job&rsquo;s configuration, Samza will use YARN to execute your job. The YarnJobFactory will use the YARN_HOME environment variable on the machine that run-job.sh is executed on to get the appropriate YARN configuration, which will define where the YARN resource manager is. The YarnJob will work with the resource manager to get your job started on the YARN cluster.</p>
 
-<p>If you want to use YARN to run your Samza job, you&#39;ll also need to define the location of your Samza job&#39;s package. For example, you might say:</p>
+<p>If you want to use YARN to run your Samza job, you&rsquo;ll also need to define the location of your Samza job&rsquo;s package. For example, you might say:</p>
 <div class="highlight"><pre><code class="text language-text" data-lang="text">yarn.package.path=http://my.http.server/jobs/ingraphs-package-0.0.55.tgz
 </code></pre></div>
 <p>This .tgz file follows the conventions outlined on the <a href="packaging.html">Packaging</a> page (it has bin/run-am.sh and bin/run-container.sh). YARN NodeManagers will take responsibility for downloading this .tgz file on the appropriate machines, and untar&#39;ing them. From there, YARN will execute run-am.sh or run-container.sh for the Samza Application Master, and SamzaContainer, respectively.</p>
 
 <!-- TODO document yarn.container.count and other key configs -->
 
-<h2><a href="logging.html">Logging &raquo;</a></h2>
+<h2 id="toc_0"><a href="logging.html">Logging &raquo;</a></h2>
 
 
           </div>

Modified: incubator/samza/site/learn/documentation/0.7.0/operations/kafka.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/learn/documentation/0.7.0/operations/kafka.html?rev=1602533&r1=1602532&r2=1602533&view=diff
==============================================================================
--- incubator/samza/site/learn/documentation/0.7.0/operations/kafka.html (original)
+++ incubator/samza/site/learn/documentation/0.7.0/operations/kafka.html Fri Jun 13 22:21:06 2014
@@ -1,4 +1,20 @@
 <!DOCTYPE html>
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
 <html lang="en">
   <head>
     <meta charset="utf-8">
@@ -70,7 +86,41 @@
           </div>
 
           <div class="content">
-            <h2>Kafka</h2>
+            <!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<h2>Kafka</h2>
+
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
 
 <!-- TODO kafka page should be fleshed out a bit -->
 
@@ -78,9 +128,9 @@
 
 <p>Kafka has a great <a href="http://kafka.apache.org/08/ops.html">operations wiki</a>, which provides some detail on how to operate Kafka at scale.</p>
 
-<h3>Auto-Create Topics</h3>
+<h3 id="toc_0">Auto-Create Topics</h3>
 
-<p>Kafka brokers should be configured to automatically create topics. Without this, it&#39;s going to be very cumbersome to run Samze jobs, since jobs will write to arbitrary (and sometimes new) topics.</p>
+<p>Kafka brokers should be configured to automatically create topics. Without this, it&rsquo;s going to be very cumbersome to run Samza jobs, since jobs will write to arbitrary (and sometimes new) topics.</p>
 <div class="highlight"><pre><code class="text language-text" data-lang="text">auto.create.topics.enable=true
 </code></pre></div>
 

Modified: incubator/samza/site/learn/documentation/0.7.0/operations/security.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/learn/documentation/0.7.0/operations/security.html?rev=1602533&r1=1602532&r2=1602533&view=diff
==============================================================================
--- incubator/samza/site/learn/documentation/0.7.0/operations/security.html (original)
+++ incubator/samza/site/learn/documentation/0.7.0/operations/security.html Fri Jun 13 22:21:06 2014
@@ -1,4 +1,20 @@
 <!DOCTYPE html>
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
 <html lang="en">
   <head>
     <meta charset="utf-8">
@@ -70,27 +86,61 @@
           </div>
 
           <div class="content">
-            <h2>Security</h2>
+            <!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<h2>Security</h2>
+
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
 
 <p>Samza provides no security. All security is implemented in the stream system, or in the environment in which Samza containers run.</p>
 
-<h3>Securing Streaming Systems</h3>
+<h3 id="toc_0">Securing Streaming Systems</h3>
 
 <p>Samza does not provide any security at the stream system level. It is up to individual streaming systems to enforce their own security. If a stream system requires usernames and passwords in order to consume from specific streams, these values must be supplied via configuration, and used at the StreamConsumer/StreamConsumerFactory implementation. The same holds true if the streaming system uses SSL certificates or Kerberos. The environment in which Samza runs must provide the appropriate certificate or Kerberos ticket, and the StreamConsumer must be implemented to use these certificates or tickets.</p>
 
-<h4>Securing Kafka</h4>
+<h4 id="toc_1">Securing Kafka</h4>
 
-<p>Kafka provides no security for its topics, and therefore Samza doesn&#39;t provide any security when using Kafka topics.</p>
+<p>Kafka provides no security for its topics, and therefore Samza doesn&rsquo;t provide any security when using Kafka topics.</p>
 
-<h3>Securing Samza&#39;s Environment</h3>
+<h3 id="toc_2">Securing Samza&rsquo;s Environment</h3>
 
 <p>The most important thing to keep in mind when securing an environment that Samza containers run in is that <strong>Samza containers execute arbitrary user code</strong>. They must be considered an adversarial application, and the environment must be locked down accordingly.</p>
 
-<h4>Configuration</h4>
+<h4 id="toc_3">Configuration</h4>
 
-<p>Samza reads all configuration at the time a Samza job is started using the run-job.sh script. If configuration contains sensitive information, then care must be taken to provide the JobRunner with the configuration. This means implementing a ConfigFactory that understands the configuration security model, and resolves configuration to Samza&#39;s Config object in a secure way.</p>
+<p>Samza reads all configuration at the time a Samza job is started using the run-job.sh script. If configuration contains sensitive information, then care must be taken to provide the JobRunner with the configuration. This means implementing a ConfigFactory that understands the configuration security model, and resolves configuration to Samza&rsquo;s Config object in a secure way.</p>
 
-<p>During the duration of a Samza job&#39;s execution, the configuration is kept in memory. The only time configuration is visible is:</p>
+<p>During the duration of a Samza job&rsquo;s execution, the configuration is kept in memory. The only time configuration is visible is:</p>
 
 <ol>
 <li>When configuration is resolved using a ConfigFactory.</li>
@@ -100,31 +150,31 @@
 
 <p>If configuration contains sensitive data, then these three points must be secured.</p>
 
-<h4>Ports</h4>
+<h4 id="toc_4">Ports</h4>
 
 <p>The only port that a Samza container opens by default is an un-secured JMX port that is randomly selected at start time. If this is not desired, JMX can be disabled through configuration. See the <a href="configuration.html">Configuration</a> page for details.</p>
 
 <p>Users might open ports from inside a Samza container. If this is not desired, then the user that executes the Samza container must have the appropriate permissions revoked, usually using iptables.</p>
 
-<h4>Logs</h4>
+<h4 id="toc_5">Logs</h4>
 
 <p>Samza container logs contain configuration, and might contain arbitrary sensitive data logged by the user. A secure log directory must be provided to the Samza container.</p>
 
-<h4>Starting a Samza Job</h4>
+<h4 id="toc_6">Starting a Samza Job</h4>
 
 <p>If operators do not wish to allow Samza containers to be executed by arbitrary users, then the mechanism by which Samza containers are deployed must be secured. Usually, this means controlling execution of the run-job.sh script. The recommended pattern is to lock down the machines that Samza containers run on, and execute run-job.sh from either a blessed web service or special machine, and only allow access to the service or machine by specific users.</p>
 
-<h4>Shell Scripts</h4>
+<h4 id="toc_7">Shell Scripts</h4>
 
 <p>Please see the <a href="packaging.html">Packaging</a> section for details on the shell scripts that Samza uses. Samza containers allow users to execute arbitrary shell commands, so user permissions must be locked down to prevent users from damaging the environment or reading sensitive data.</p>
 
-<h4>YARN</h4>
+<h4 id="toc_8">YARN</h4>
 
 <!-- TODO make the security page link to the actual YARN security document, when we write it. -->
 
-<p>Samza provides out-of-the-box YARN integration. Take a look at Samza&#39;s YARN Security page for details.</p>
+<p>Samza provides out-of-the-box YARN integration. Take a look at Samza&rsquo;s YARN Security page for details.</p>
 
-<h2><a href="kafka.html">Kafka &raquo;</a></h2>
+<h2 id="toc_9"><a href="kafka.html">Kafka &raquo;</a></h2>
 
 
           </div>

Modified: incubator/samza/site/learn/documentation/0.7.0/yarn/application-master.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/learn/documentation/0.7.0/yarn/application-master.html?rev=1602533&r1=1602532&r2=1602533&view=diff
==============================================================================
--- incubator/samza/site/learn/documentation/0.7.0/yarn/application-master.html (original)
+++ incubator/samza/site/learn/documentation/0.7.0/yarn/application-master.html Fri Jun 13 22:21:06 2014
@@ -1,4 +1,20 @@
 <!DOCTYPE html>
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
 <html lang="en">
   <head>
     <meta charset="utf-8">
@@ -70,13 +86,47 @@
           </div>
 
           <div class="content">
-            <h2>Application Master</h2>
+            <!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<h2>Application Master</h2>
+
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
 
-<p>YARN is Hadoop&#39;s next-generation cluster manager. It allows developers to deploy and execute arbitrary commands on a grid. If you&#39;re unfamiliar with YARN, or the concept of an ApplicationMaster (AM), please read Hadoop&#39;s <a href="http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html">YARN</a> page.</p>
+<p>YARN is Hadoop&rsquo;s next-generation cluster manager. It allows developers to deploy and execute arbitrary commands on a grid. If you&rsquo;re unfamiliar with YARN, or the concept of an ApplicationMaster (AM), please read Hadoop&rsquo;s <a href="http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html">YARN</a> page.</p>
 
-<h3>Integration</h3>
+<h3 id="toc_0">Integration</h3>
 
-<p>Samza&#39;s main integration with YARN comes in the form of a Samza ApplicationMaster. This is the chunk of code responsible for managing a Samza job in a YARN grid. It decides what to do when a stream processor fails, which machines a Samza job&#39;s <a href="../container/samza-container.html">containers</a> should run on, and so on.</p>
+<p>Samza&rsquo;s main integration with YARN comes in the form of a Samza ApplicationMaster. This is the chunk of code responsible for managing a Samza job in a YARN grid. It decides what to do when a stream processor fails, which machines a Samza job&rsquo;s <a href="../container/samza-container.html">containers</a> should run on, and so on.</p>
 
 <p>When the Samza ApplicationMaster starts up, it does the following:</p>
 
@@ -84,46 +134,46 @@
 <li>Receives configuration from YARN via the STREAMING_CONFIG environment variable.</li>
 <li>Starts a JMX server on a random port.</li>
 <li>Instantiates a metrics registry and reporters to keep track of relevant metrics.</li>
-<li>Registers the AM with YARN&#39;s RM.</li>
-<li>Get the total number of partitions for the Samza job using each input stream&#39;s PartitionManager (see the <a href="../container/streams.html">Streams</a> page for details).</li>
-<li>Read the total number of containers requested from the Samza job&#39;s configuration.</li>
-<li>Assign each partition to a container (called a Task Group in Samza&#39;s AM dashboard).</li>
+<li>Registers the AM with YARN&rsquo;s RM.</li>
+<li>Gets the total number of partitions for the Samza job using each input stream&rsquo;s PartitionManager (see the <a href="../container/streams.html">Streams</a> page for details).</li>
+<li>Reads the total number of containers requested from the Samza job&rsquo;s configuration.</li>
+<li>Assigns each partition to a container (called a Task Group in Samza&rsquo;s AM dashboard).</li>
 <li>Makes a <a href="http://hadoop.apache.org/docs/current/api/org/apache/hadoop/yarn/api/records/ResourceRequest.html">ResourceRequest</a> to YARN for each container.</li>
 <li>Polls the YARN RM every second to check for allocated and released containers.</li>
 </ol>
 
 <p>From this point on, the ApplicationMaster just reacts to events from the RM.</p>
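
<p>As a rough illustration of the partition-to-container assignment step in the list above, the sketch below spreads partitions across task groups in round-robin order. The class and method names are hypothetical, and round-robin is an assumption made for the example; the text above does not say which strategy the Samza AM actually uses.</p>

```java
import java.util.ArrayList;
import java.util.List;

class PartitionAssignmentSketch {
    // Hypothetical helper: spreads partition ids 0..partitionCount-1 across
    // containerCount task groups in round-robin order.
    static List<List<Integer>> assign(int partitionCount, int containerCount) {
        if (containerCount <= 0) {
            throw new IllegalArgumentException("containerCount must be positive");
        }
        List<List<Integer>> groups = new ArrayList<>();
        for (int c = 0; c < containerCount; c++) {
            groups.add(new ArrayList<>());
        }
        for (int p = 0; p < partitionCount; p++) {
            groups.get(p % containerCount).add(p);
        }
        return groups;
    }

    public static void main(String[] args) {
        // Five partitions over two containers.
        System.out.println(assign(5, 2)); // prints [[0, 2, 4], [1, 3]]
    }
}
```

<p>Each inner list corresponds to one task group, so one ResourceRequest would be made per inner list.</p>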
 
-<h3>Fault Tolerance</h3>
+<h3 id="toc_1">Fault Tolerance</h3>
 
 <p>Whenever a container is allocated, the AM will work with the YARN NM to start a SamzaContainer (with appropriate partitions assigned to it) in the container. If a container fails with a non-zero return code, the AM will request a new container, and restart the SamzaContainer. If a SamzaContainer fails too many times within too short a period, the ApplicationMaster will fail the whole Samza job with a non-zero return code. See the yarn.container.retry.count and yarn.container.retry.window.ms <a href="../jobs/configuration.html">configuration</a> parameters for details.</p>
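
<p>The &ldquo;too many failures within a window&rdquo; rule can be pictured with a small sliding-window counter. This is only a sketch under assumptions: the class is hypothetical, its two fields merely stand in for the retry count and retry window settings, and the exact semantics Samza applies (for example, whether failures are counted per container) are described on the configuration page, not here.</p>

```java
import java.util.ArrayDeque;
import java.util.Deque;

class RetryWindowSketch {
    private final int retryCount;      // assumed stand-in for the retry count setting
    private final long retryWindowMs;  // assumed stand-in for the retry window setting
    private final Deque<Long> failureTimes = new ArrayDeque<>();

    RetryWindowSketch(int retryCount, long retryWindowMs) {
        this.retryCount = retryCount;
        this.retryWindowMs = retryWindowMs;
    }

    // Records a failure at nowMs and returns true if the job should be failed,
    // i.e. more than retryCount failures fall inside the last retryWindowMs.
    boolean recordFailure(long nowMs) {
        failureTimes.addLast(nowMs);
        // Drop failures that are older than the window.
        while (!failureTimes.isEmpty() && nowMs - failureTimes.peekFirst() > retryWindowMs) {
            failureTimes.removeFirst();
        }
        return failureTimes.size() > retryCount;
    }

    public static void main(String[] args) {
        RetryWindowSketch tracker = new RetryWindowSketch(2, 1000);
        System.out.println(tracker.recordFailure(0));   // false
        System.out.println(tracker.recordFailure(500)); // false
        System.out.println(tracker.recordFailure(900)); // true: three failures within 1000 ms
    }
}
```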
 
-<p>When the AM receives a reboot signal from YARN, it will throw a SamzaException. This will trigger a clean and successful shutdown of the AM (YARN won&#39;t think the AM failed).</p>
+<p>When the AM receives a reboot signal from YARN, it will throw a SamzaException. This will trigger a clean and successful shutdown of the AM (YARN won&rsquo;t think the AM failed).</p>
 
-<p>If the AM, itself, fails, YARN will handle restarting the AM. When the AM is restarted, all containers that were running will be killed, and the AM will start from scratch. The same list of operations, shown above, will be executed. The AM will request new containers for its SamzaContainers, and proceed as though it has just started for the first time. YARN has a yarn.resourcemanager.am.max-retries configuration parameter that&#39;s defined in <a href="http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-common/yarn-default.xml">yarn-site.xml</a>. This configuration defaults to 1, which means that, by default, a single AM failure will cause your Samza job to stop running.</p>
+<p>If the AM itself fails, YARN will handle restarting it. When the AM is restarted, all containers that were running will be killed, and the AM will start from scratch. The same list of operations shown above will be executed. The AM will request new containers for its SamzaContainers, and proceed as though it has just started for the first time. YARN has a yarn.resourcemanager.am.max-retries configuration parameter that&rsquo;s defined in <a href="http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-common/yarn-default.xml">yarn-site.xml</a>. This configuration defaults to 1, which means that, by default, a single AM failure will cause your Samza job to stop running.</p>
 
-<h3>Dashboard</h3>
+<h3 id="toc_2">Dashboard</h3>
 
-<p>Samza&#39;s ApplicationMaster comes with a dashboard to show useful information such as:</p>
+<p>Samza&rsquo;s ApplicationMaster comes with a dashboard to show useful information such as:</p>
 
 <ol>
 <li>Where containers are located.</li>
 <li>Links to logs.</li>
-<li>The Samza job&#39;s configuration.</li>
+<li>The Samza job&rsquo;s configuration.</li>
 <li>Container failure count.</li>
 </ol>
 
-<p>You can find this dashboard by going to your YARN grid&#39;s ResourceManager page (usually something like <a href="http://localhost:8088/cluster">http://localhost:8088/cluster</a>), and clicking on the &quot;ApplicationMaster&quot; link of a running Samza job.</p>
+<p>You can find this dashboard by going to your YARN grid&rsquo;s ResourceManager page (usually something like <a href="http://localhost:8088/cluster">http://localhost:8088/cluster</a>), and clicking on the &ldquo;ApplicationMaster&rdquo; link of a running Samza job.</p>
 
 <p><img src="/img/0.7.0/learn/documentation/yarn/samza-am-dashboard.png" alt="Screenshot of ApplicationMaster dashboard" class="diagram-large"></p>
 
-<h3>Security</h3>
+<h3 id="toc_3">Security</h3>
 
-<p>The Samza dashboard&#39;s HTTP access is currently un-secured, even when using YARN in secure-mode. This means that users with access to a YARN grid could port-scan a Samza ApplicationMaster&#39;s HTTP server, and open the dashboard in a browser to view its contents. Sensitive configuration can be viewed by anyone, in this way, and care should be taken. There are plans to secure Samza&#39;s ApplicationMaster using <a href="http://docs.hortonworks.com/HDPDocuments/HDP1/HDP-1.3.0/bk_installing_manually_book/content/rpm-chap14-2-3-1.html">Hadoop&#39;s security</a> features (<a href="http://en.wikipedia.org/wiki/SPNEGO">SPENAGO</a>).</p>
+<p>The Samza dashboard&rsquo;s HTTP access is currently unsecured, even when using YARN in secure mode. This means that users with access to a YARN grid could port-scan a Samza ApplicationMaster&rsquo;s HTTP server, and open the dashboard in a browser to view its contents. Sensitive configuration can be viewed by anyone in this way, so care should be taken. There are plans to secure Samza&rsquo;s ApplicationMaster using <a href="http://docs.hortonworks.com/HDPDocuments/HDP1/HDP-1.3.0/bk_installing_manually_book/content/rpm-chap14-2-3-1.html">Hadoop&rsquo;s security</a> features (<a href="http://en.wikipedia.org/wiki/SPNEGO">SPNEGO</a>).</p>
 
-<p>See Samza&#39;s <a href="../operations/security.html">security</a> page for more details.</p>
+<p>See Samza&rsquo;s <a href="../operations/security.html">security</a> page for more details.</p>
 
-<h2><a href="isolation.html">Isolation &raquo;</a></h2>
+<h2 id="toc_4"><a href="isolation.html">Isolation &raquo;</a></h2>
 
 
           </div>

Modified: incubator/samza/site/learn/documentation/0.7.0/yarn/isolation.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/learn/documentation/0.7.0/yarn/isolation.html?rev=1602533&r1=1602532&r2=1602533&view=diff
==============================================================================
--- incubator/samza/site/learn/documentation/0.7.0/yarn/isolation.html (original)
+++ incubator/samza/site/learn/documentation/0.7.0/yarn/isolation.html Fri Jun 13 22:21:06 2014
@@ -1,4 +1,20 @@
 <!DOCTYPE html>
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
 <html lang="en">
   <head>
     <meta charset="utf-8">
@@ -70,33 +86,67 @@
           </div>
 
           <div class="content">
-            <h2>Isolation</h2>
+            <!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<h2>Isolation</h2>
+
 
-<p>When running Samza jobs in a shared, distributed environment, the stream processors can have an impact on one another&#39;s performance. A stream processor that uses 100% of a machine&#39;s CPU will slow down all other stream processors on the machine.</p>
+<p>When running Samza jobs in a shared, distributed environment, the stream processors can have an impact on one another&rsquo;s performance. A stream processor that uses 100% of a machine&rsquo;s CPU will slow down all other stream processors on the machine.</p>
 
-<p>One of <a href="http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html">YARN</a>&#39;s responsibilities is to manage resources so that this doesn&#39;t happen. Each of YARN&#39;s Node Managers (NM) has a chunk of &quot;resources&quot; dedicated to it. The YARN Resource Manager (RM) will only allow a container to be allocated on a NM if it has enough resources to satisfy the container&#39;s needs.</p>
+<p>One of <a href="http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html">YARN</a>&rsquo;s responsibilities is to manage resources so that this doesn&rsquo;t happen. Each of YARN&rsquo;s Node Managers (NM) has a chunk of &ldquo;resources&rdquo; dedicated to it. The YARN Resource Manager (RM) will only allow a container to be allocated on a NM if it has enough resources to satisfy the container&rsquo;s needs.</p>
 
 <p>YARN currently supports resource management for memory and CPU.</p>
 
-<h3>Memory</h3>
+<h3 id="toc_0">Memory</h3>
 
-<p>YARN will automatically enforce memory limits for all containers that it executes. All containers must have a max-memory size defined when they&#39;re created. If the sum of all memory usage for processes associated with a single YARN container exceeds this maximum, YARN will kill the container.</p>
+<p>YARN will automatically enforce memory limits for all containers that it executes. All containers must have a max-memory size defined when they&rsquo;re created. If the sum of all memory usage for processes associated with a single YARN container exceeds this maximum, YARN will kill the container.</p>
 
-<p>Samza supports memory limits using the yarn.container.memory.mb and yarn.am.container.memory.mb configuration parameters. Keep in mind that this is simply the amount of memory YARN will allow a <a href="../container/samza-container.html">SamzaContainer</a> or <a href="application-master.html">ApplicationMaster</a> to have. You&#39;ll still need to configure your heap settings appropriately using task.opts, when using Java (the default is -Xmx160M). See the <a href="../jobs/configuration.html">Configuration</a> and <a href="../jobs/packaging.html">Packaging</a> pages for details.</p>
+<p>Samza supports memory limits using the yarn.container.memory.mb and yarn.am.container.memory.mb configuration parameters. Keep in mind that this is simply the amount of memory YARN will allow a <a href="../container/samza-container.html">SamzaContainer</a> or <a href="application-master.html">ApplicationMaster</a> to have. You&rsquo;ll still need to configure your heap settings appropriately using task.opts, when using Java (the default is -Xmx160M). See the <a href="../jobs/configuration.html">Configuration</a> and <a href="../jobs/packaging.html">Packaging</a> pages for details.</p>
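
<p>Because the container memory limit and the JVM heap are configured separately, it can help to check that the two agree. The sketch below parses an -Xmx flag out of an options string and checks that it is strictly below a container limit given in megabytes. The class and method names are hypothetical, and the strictly-below rule is an assumption that leaves room for the memory a JVM uses outside the heap.</p>

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HeapFitSketch {
    private static final Pattern XMX = Pattern.compile("-Xmx(\\d+)([kKmMgG]?)");

    // Returns the -Xmx value in megabytes (rounded down), or -1 if the
    // options string contains no -Xmx flag.
    static long heapMb(String opts) {
        Matcher m = XMX.matcher(opts);
        if (!m.find()) {
            return -1;
        }
        long value = Long.parseLong(m.group(1));
        switch (m.group(2).toLowerCase()) {
            case "g": return value * 1024;
            case "m": return value;
            case "k": return value / 1024;
            default:  return value / (1024 * 1024); // no suffix means bytes
        }
    }

    // True when a heap size is present and strictly below the container limit.
    static boolean heapFits(String taskOpts, int containerMemoryMb) {
        long heap = heapMb(taskOpts);
        return heap >= 0 && heap < containerMemoryMb;
    }

    public static void main(String[] args) {
        System.out.println(heapFits("-Xmx160M", 1024)); // true
        System.out.println(heapFits("-Xmx2g", 1024));   // false
    }
}
```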
 
-<h3>CPU</h3>
+<h3 id="toc_1">CPU</h3>
 
 <p>YARN has the concept of a virtual core. Each NM is assigned a total number of virtual cores (32, by default). When a container request is made, it must specify how many virtual cores it needs. The YARN RM will only assign the container to a NM that has enough virtual cores to satisfy the request.</p>
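
<p>The virtual-core check described above can be sketched as a first-fit search over per-node free capacity. Everything here is illustrative: the class is hypothetical, first-fit is an assumption, and a real YARN scheduler also weighs memory, queues, and other factors that this sketch ignores.</p>

```java
import java.util.LinkedHashMap;
import java.util.Map;

class VcorePlacementSketch {
    // Returns the name of the first node with enough free virtual cores and
    // deducts the request from it, or null if no node can satisfy the request.
    static String place(Map<String, Integer> freeVcores, int requested) {
        for (Map.Entry<String, Integer> node : freeVcores.entrySet()) {
            if (node.getValue() >= requested) {
                node.setValue(node.getValue() - requested);
                return node.getKey();
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, Integer> free = new LinkedHashMap<>();
        free.put("nm-1", 2);
        free.put("nm-2", 32);
        System.out.println(place(free, 4));  // nm-2, since nm-1 has only 2 free
        System.out.println(place(free, 64)); // null, no node has 64 free
    }
}
```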
 
-<h4>CGroups</h4>
+<h4 id="toc_2">CGroups</h4>
 
-<p>Unlike memory, which YARN can enforce itself (by looking at the /proc folder), YARN can&#39;t enforce CPU isolation, since this must be done at the Linux kernel level. One of YARN&#39;s interesting new features is its support for Linux <a href="https://www.kernel.org/doc/Documentation/cgroups/cgroups.txt">CGroups</a>. CGroups are a way to control process utilization at the kernel level in Linux.</p>
+<p>Unlike memory, which YARN can enforce itself (by looking at the /proc folder), YARN can&rsquo;t enforce CPU isolation, since this must be done at the Linux kernel level. One of YARN&rsquo;s interesting new features is its support for Linux <a href="https://www.kernel.org/doc/Documentation/cgroups/cgroups.txt">CGroups</a>. CGroups are a way to control process utilization at the kernel level in Linux.</p>
 
-<p>If YARN is setup to use CGroups, then YARN will guarantee that a container will get at least the amount of CPU that it requires. Currently, YARN will give you more CPU, if it&#39;s available. For details on enforcing &quot;at most&quot; CPU usage, see <a href="https://issues.apache.org/jira/browse/YARN-810">YARN-810</a>. </p>
+<p>If YARN is set up to use CGroups, then YARN will guarantee that a container will get at least the amount of CPU that it requires. Currently, YARN will give you more CPU if it&rsquo;s available. For details on enforcing &ldquo;at most&rdquo; CPU usage, see <a href="https://issues.apache.org/jira/browse/YARN-810">YARN-810</a>.</p>
 
 <p>See <a href="http://riccomini.name/posts/hadoop/2013-06-14-yarn-with-cgroups/">this blog post</a> for details on setting up YARN with CGroups.</p>
 
-<h2><a href="../operations/security.html">Security &raquo;</a></h2>
+<h2 id="toc_3"><a href="../operations/security.html">Security &raquo;</a></h2>
 
 
           </div>

Modified: incubator/samza/site/learn/tutorials/0.7.0/deploy-samza-job-from-hdfs.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/learn/tutorials/0.7.0/deploy-samza-job-from-hdfs.html?rev=1602533&r1=1602532&r2=1602533&view=diff
==============================================================================
--- incubator/samza/site/learn/tutorials/0.7.0/deploy-samza-job-from-hdfs.html (original)
+++ incubator/samza/site/learn/tutorials/0.7.0/deploy-samza-job-from-hdfs.html Fri Jun 13 22:21:06 2014
@@ -1,4 +1,20 @@
 <!DOCTYPE html>
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
 <html lang="en">
   <head>
     <meta charset="utf-8">
@@ -70,11 +86,45 @@
           </div>
 
           <div class="content">
-            <h2>Deploying a Samza job from HDFS</h2>
+            <!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<h2>Deploying a Samza job from HDFS</h2>
+
 
-<p>This tutorial uses <a href="../../../startup/hello-samza/0.7.0/">hello-samza</a> to illustrate how to run a Samza job if you want to publish the Samza job&#39;s .tar.gz package to HDFS.</p>
+<p>This tutorial uses <a href="../../../startup/hello-samza/0.7.0/">hello-samza</a> to illustrate how to run a Samza job if you want to publish the Samza job&rsquo;s .tar.gz package to HDFS.</p>
 
-<h3>Build a new Samza job package</h3>
+<h3 id="toc_0">Build a new Samza job package</h3>
 
 <p>Build a new Samza job package to include the hadoop-hdfs-version.jar.</p>
 
@@ -101,14 +151,14 @@
 <li>Make sure hadoop-common-version.jar has the same version as your hadoop-hdfs-version.jar. Otherwise, you may still have errors.</li>
 </ul>
 
-<h3>Upload the package</h3>
+<h3 id="toc_1">Upload the package</h3>
 <div class="highlight"><pre><code class="text language-text" data-lang="text">hadoop fs -put ./samza-job-package/target/samza-job-package-0.7.0-dist.tar.gz /path/for/tgz
 </code></pre></div>
-<h3>Add HDFS configuration</h3>
+<h3 id="toc_2">Add HDFS configuration</h3>
 
 <p>Put your cluster&rsquo;s hdfs-site.xml file into the ~/.samza/conf directory (the same place as yarn-site.xml).</p>
 
-<h3>Change properties file</h3>
+<h3 id="toc_3">Change properties file</h3>
 
 <p>Change the yarn.package.path in the properties file to your HDFS location.</p>
 <div class="highlight"><pre><code class="text language-text" data-lang="text">yarn.package.path=hdfs://&lt;hdfs name node ip&gt;:&lt;hdfs name node port&gt;/path/to/tgz

Modified: incubator/samza/site/learn/tutorials/0.7.0/index.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/learn/tutorials/0.7.0/index.html?rev=1602533&r1=1602532&r2=1602533&view=diff
==============================================================================
--- incubator/samza/site/learn/tutorials/0.7.0/index.html (original)
+++ incubator/samza/site/learn/tutorials/0.7.0/index.html Fri Jun 13 22:21:06 2014
@@ -1,4 +1,20 @@
 <!DOCTYPE html>
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
 <html lang="en">
   <head>
     <meta charset="utf-8">
@@ -70,7 +86,41 @@
           </div>
 
           <div class="content">
-            <h2>Tutorials</h2>
+            <!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<h2>Tutorials</h2>
+
 
 <p><a href="remote-debugging-samza.html">Remote Debugging with Samza</a></p>
 
@@ -89,7 +139,7 @@
 <a href="initialize-close.html">Initializing and Closing</a><br/>
 <a href="windowing.html">Windowing</a><br/>
 <a href="committing.html">Committing</a><br/>
--->
+&ndash;>
 
 
           </div>