Posted to commits@samza.apache.org by cr...@apache.org on 2014/07/09 18:37:02 UTC

svn commit: r1609232 [1/4] - in /incubator/samza/site: ./ community/ contribute/ css/ learn/documentation/0.7.0/ learn/documentation/0.7.0/api/ learn/documentation/0.7.0/comparisons/ learn/documentation/0.7.0/container/ learn/documentation/0.7.0/introd...

Author: criccomini
Date: Wed Jul  9 16:37:01 2014
New Revision: 1609232

URL: http://svn.apache.org/r1609232
Log:
updating docs page

Modified:
    incubator/samza/site/README.md
    incubator/samza/site/community/committers.html
    incubator/samza/site/community/irc.html
    incubator/samza/site/community/mailing-lists.html
    incubator/samza/site/contribute/code.html
    incubator/samza/site/contribute/coding-guide.html
    incubator/samza/site/contribute/disclaimer.html
    incubator/samza/site/contribute/projects.html
    incubator/samza/site/contribute/rules.html
    incubator/samza/site/contribute/seps.html
    incubator/samza/site/css/main.css
    incubator/samza/site/css/ropa-sans.css
    incubator/samza/site/index.html
    incubator/samza/site/learn/documentation/0.7.0/api/overview.html
    incubator/samza/site/learn/documentation/0.7.0/comparisons/introduction.html
    incubator/samza/site/learn/documentation/0.7.0/comparisons/mupd8.html
    incubator/samza/site/learn/documentation/0.7.0/comparisons/storm.html
    incubator/samza/site/learn/documentation/0.7.0/container/checkpointing.html
    incubator/samza/site/learn/documentation/0.7.0/container/event-loop.html
    incubator/samza/site/learn/documentation/0.7.0/container/jmx.html
    incubator/samza/site/learn/documentation/0.7.0/container/metrics.html
    incubator/samza/site/learn/documentation/0.7.0/container/samza-container.html
    incubator/samza/site/learn/documentation/0.7.0/container/serialization.html
    incubator/samza/site/learn/documentation/0.7.0/container/state-management.html
    incubator/samza/site/learn/documentation/0.7.0/container/streams.html
    incubator/samza/site/learn/documentation/0.7.0/container/windowing.html
    incubator/samza/site/learn/documentation/0.7.0/index.html
    incubator/samza/site/learn/documentation/0.7.0/introduction/architecture.html
    incubator/samza/site/learn/documentation/0.7.0/introduction/background.html
    incubator/samza/site/learn/documentation/0.7.0/introduction/concepts.html
    incubator/samza/site/learn/documentation/0.7.0/jobs/configuration-table.html
    incubator/samza/site/learn/documentation/0.7.0/jobs/configuration.html
    incubator/samza/site/learn/documentation/0.7.0/jobs/job-runner.html
    incubator/samza/site/learn/documentation/0.7.0/jobs/logging.html
    incubator/samza/site/learn/documentation/0.7.0/jobs/packaging.html
    incubator/samza/site/learn/documentation/0.7.0/jobs/yarn-jobs.html
    incubator/samza/site/learn/documentation/0.7.0/operations/kafka.html
    incubator/samza/site/learn/documentation/0.7.0/operations/security.html
    incubator/samza/site/learn/documentation/0.7.0/yarn/application-master.html
    incubator/samza/site/learn/documentation/0.7.0/yarn/isolation.html
    incubator/samza/site/learn/tutorials/0.7.0/deploy-samza-job-from-hdfs.html
    incubator/samza/site/learn/tutorials/0.7.0/index.html
    incubator/samza/site/learn/tutorials/0.7.0/remote-debugging-samza.html
    incubator/samza/site/learn/tutorials/0.7.0/run-hello-samza-without-internet.html
    incubator/samza/site/learn/tutorials/0.7.0/run-in-multi-node-yarn.html
    incubator/samza/site/less/main.less
    incubator/samza/site/sitemap.xml
    incubator/samza/site/startup/download/index.html
    incubator/samza/site/startup/hello-samza/0.7.0/index.html

Modified: incubator/samza/site/README.md
URL: http://svn.apache.org/viewvc/incubator/samza/site/README.md?rev=1609232&r1=1609231&r2=1609232&view=diff
==============================================================================
--- incubator/samza/site/README.md (original)
+++ incubator/samza/site/README.md Wed Jul  9 16:37:01 2014
@@ -16,17 +16,19 @@
 -->
 ## Setup
 
-Samza's documentation uses Jekyll to build a website out of markdown pages. To install Jekyll, run this command:
+Samza's documentation uses Jekyll to build a website out of markdown pages. Prerequisites:
 
-    sudo gem install jekyll redcarpet
+1. You need [Ruby](https://www.ruby-lang.org/) installed on your machine (run `ruby --version` to check)
+2. Install [Bundler](http://bundler.io/) by running `sudo gem install bundler`
+3. To install Jekyll and its dependencies, change to the `docs` directory and run `bundle install`
 
-To run the website locally, execute:
+To serve the website on [localhost:4000](http://localhost:4000/):
 
-    jekyll serve --watch --host 0.0.0.0
+    bundle exec jekyll serve --watch
 
-To compile the website in the _site directory, execute:
+To compile the website in the \_site directory, execute:
 
-    jekyll build
+    bundle exec jekyll build
 
 ## Versioning
 

Modified: incubator/samza/site/community/committers.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/community/committers.html?rev=1609232&r1=1609231&r2=1609232&view=diff
==============================================================================
--- incubator/samza/site/community/committers.html (original)
+++ incubator/samza/site/community/committers.html Wed Jul  9 16:37:01 2014
@@ -23,6 +23,7 @@
     <link href="/css/bootstrap.min.css" rel="stylesheet"/>
     <link href="/css/font-awesome.min.css" rel="stylesheet"/>
     <link href="/css/main.css" rel="stylesheet"/>
+    <link href="/css/syntax.css" rel="stylesheet"/>
     <link rel="icon" type="image/png" href="/img/samza-icon.png">
   </head>
   <body>
@@ -157,6 +158,11 @@ Committer, and PMC member<br/>
 <a href="https://www.linkedin.com/in/zjshen" target="_blank"><i class="fa fa-linkedin committer-icon"></i></a>
 <a href="https://twitter.com/zhijieshen" target="_blank"><i class="fa fa-twitter committer-icon"></i></a></p>
 
+<p><strong>Yan Fang</strong><br/>
+Committer, and PMC member<br/>
+<a href="https://www.linkedin.com/in/yanfangus" target="_blank"><i class="fa fa-linkedin committer-icon"></i></a>
+<a href="https://twitter.com/yanfang724" target="_blank"><i class="fa fa-twitter committer-icon"></i></a></p>
+
 <h3 id="mentors">Mentors</h3>
 
 <p><strong>Chris Douglas</strong><br/>

Modified: incubator/samza/site/community/irc.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/community/irc.html?rev=1609232&r1=1609231&r2=1609232&view=diff
==============================================================================
--- incubator/samza/site/community/irc.html (original)
+++ incubator/samza/site/community/irc.html Wed Jul  9 16:37:01 2014
@@ -23,6 +23,7 @@
     <link href="/css/bootstrap.min.css" rel="stylesheet"/>
     <link href="/css/font-awesome.min.css" rel="stylesheet"/>
     <link href="/css/main.css" rel="stylesheet"/>
+    <link href="/css/syntax.css" rel="stylesheet"/>
     <link rel="icon" type="image/png" href="/img/samza-icon.png">
   </head>
   <body>

Modified: incubator/samza/site/community/mailing-lists.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/community/mailing-lists.html?rev=1609232&r1=1609231&r2=1609232&view=diff
==============================================================================
--- incubator/samza/site/community/mailing-lists.html (original)
+++ incubator/samza/site/community/mailing-lists.html Wed Jul  9 16:37:01 2014
@@ -23,6 +23,7 @@
     <link href="/css/bootstrap.min.css" rel="stylesheet"/>
     <link href="/css/font-awesome.min.css" rel="stylesheet"/>
     <link href="/css/main.css" rel="stylesheet"/>
+    <link href="/css/syntax.css" rel="stylesheet"/>
     <link rel="icon" type="image/png" href="/img/samza-icon.png">
   </head>
   <body>

Modified: incubator/samza/site/contribute/code.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/contribute/code.html?rev=1609232&r1=1609231&r2=1609232&view=diff
==============================================================================
--- incubator/samza/site/contribute/code.html (original)
+++ incubator/samza/site/contribute/code.html Wed Jul  9 16:37:01 2014
@@ -23,6 +23,7 @@
     <link href="/css/bootstrap.min.css" rel="stylesheet"/>
     <link href="/css/font-awesome.min.css" rel="stylesheet"/>
     <link href="/css/main.css" rel="stylesheet"/>
+    <link href="/css/syntax.css" rel="stylesheet"/>
     <link rel="icon" type="image/png" href="/img/samza-icon.png">
   </head>
   <body>
@@ -131,12 +132,7 @@
 
 <p>If you are a committer you need to use https instead of http to check in, otherwise you will get an error regarding an inability to acquire a lock. Note that older versions of git may also give this error even when the repo was cloned with https; if you experience this try a newer version of git.</p>
 
-<p>The Samza website is built by Jekyll from the markdown files found in the docs subdirectory. For committers wishing to update the webpage first install Jekyll:</p>
-<div class="highlight"><pre><code class="language-text" data-lang="text">gem install jekyll
-</code></pre></div>
-<p>Depending on your system you may also need install some additional dependencies when you try and run it. Note that some Linux distributions may have older versions of Jekyll packaged that treat arguments differently and may result in changes not being incorporated into the generated site.</p>
-
-<p>The script to commit the updated webpage files is docs/_tools/publish-site.sh</p>
+<p>The Samza website is built by Jekyll from the markdown files found in the docs subdirectory. For committers wishing to update the webpage, please see <code>docs/README.md</code> for instructions.</p>
 
 
           </div>

Modified: incubator/samza/site/contribute/coding-guide.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/contribute/coding-guide.html?rev=1609232&r1=1609231&r2=1609232&view=diff
==============================================================================
--- incubator/samza/site/contribute/coding-guide.html (original)
+++ incubator/samza/site/contribute/coding-guide.html Wed Jul  9 16:37:01 2014
@@ -23,6 +23,7 @@
     <link href="/css/bootstrap.min.css" rel="stylesheet"/>
     <link href="/css/font-awesome.min.css" rel="stylesheet"/>
     <link href="/css/main.css" rel="stylesheet"/>
+    <link href="/css/syntax.css" rel="stylesheet"/>
     <link rel="icon" type="image/png" href="/img/samza-icon.png">
   </head>
   <body>

Modified: incubator/samza/site/contribute/disclaimer.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/contribute/disclaimer.html?rev=1609232&r1=1609231&r2=1609232&view=diff
==============================================================================
--- incubator/samza/site/contribute/disclaimer.html (original)
+++ incubator/samza/site/contribute/disclaimer.html Wed Jul  9 16:37:01 2014
@@ -23,6 +23,7 @@
     <link href="/css/bootstrap.min.css" rel="stylesheet"/>
     <link href="/css/font-awesome.min.css" rel="stylesheet"/>
     <link href="/css/main.css" rel="stylesheet"/>
+    <link href="/css/syntax.css" rel="stylesheet"/>
     <link rel="icon" type="image/png" href="/img/samza-icon.png">
   </head>
   <body>

Modified: incubator/samza/site/contribute/projects.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/contribute/projects.html?rev=1609232&r1=1609231&r2=1609232&view=diff
==============================================================================
--- incubator/samza/site/contribute/projects.html (original)
+++ incubator/samza/site/contribute/projects.html Wed Jul  9 16:37:01 2014
@@ -23,6 +23,7 @@
     <link href="/css/bootstrap.min.css" rel="stylesheet"/>
     <link href="/css/font-awesome.min.css" rel="stylesheet"/>
     <link href="/css/main.css" rel="stylesheet"/>
+    <link href="/css/syntax.css" rel="stylesheet"/>
     <link rel="icon" type="image/png" href="/img/samza-icon.png">
   </head>
   <body>

Modified: incubator/samza/site/contribute/rules.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/contribute/rules.html?rev=1609232&r1=1609231&r2=1609232&view=diff
==============================================================================
--- incubator/samza/site/contribute/rules.html (original)
+++ incubator/samza/site/contribute/rules.html Wed Jul  9 16:37:01 2014
@@ -23,6 +23,7 @@
     <link href="/css/bootstrap.min.css" rel="stylesheet"/>
     <link href="/css/font-awesome.min.css" rel="stylesheet"/>
     <link href="/css/main.css" rel="stylesheet"/>
+    <link href="/css/syntax.css" rel="stylesheet"/>
     <link rel="icon" type="image/png" href="/img/samza-icon.png">
   </head>
   <body>

Modified: incubator/samza/site/contribute/seps.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/contribute/seps.html?rev=1609232&r1=1609231&r2=1609232&view=diff
==============================================================================
--- incubator/samza/site/contribute/seps.html (original)
+++ incubator/samza/site/contribute/seps.html Wed Jul  9 16:37:01 2014
@@ -23,6 +23,7 @@
     <link href="/css/bootstrap.min.css" rel="stylesheet"/>
     <link href="/css/font-awesome.min.css" rel="stylesheet"/>
     <link href="/css/main.css" rel="stylesheet"/>
+    <link href="/css/syntax.css" rel="stylesheet"/>
     <link rel="icon" type="image/png" href="/img/samza-icon.png">
   </head>
   <body>

Modified: incubator/samza/site/css/main.css
URL: http://svn.apache.org/viewvc/incubator/samza/site/css/main.css?rev=1609232&r1=1609231&r2=1609232&view=diff
==============================================================================
--- incubator/samza/site/css/main.css (original)
+++ incubator/samza/site/css/main.css Wed Jul  9 16:37:01 2014
@@ -1,3 +1,21 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
 /* Non-responsive overrides
  *
  * Utilitze the following CSS to disable the responsive-ness of the container,
@@ -136,14 +154,13 @@ h4 {
 pre {
   border: 0px !important;
   border-radius: 0px !important;
-  overflow: scroll !important;
-  white-space: pre;
-  overflow-wrap: normal;
-  word-wrap: normal !important;
+  overflow-x: auto;
+  background-color: #f7f7f7;
+  font-size: 12px;
 }
 pre code {
+  overflow-wrap: normal;
   white-space: pre;
-  font-size: 12px;
 }
 th.header {
   cursor: pointer;
@@ -193,29 +210,12 @@ td.key {
 .committer-icon {
   font-size: 16px;
 }
-ul.documentation-list {
-  list-style: none;
-  padding-left: 20px;
-}
 img.diagram-large {
   width: 100%;
 }
-table.documentation {
-  border-collapse: collapse;
-  font-size: 12px;
-  margin: 1em 0;
-}
-table.documentation th, table.documentation td {
-  text-align: left;
-  vertical-align: top;
-  border: 1px solid #888;
-  padding: 5px;
-}
-table.documentation th.nowrap, table.documentation td.nowrap {
-  white-space: nowrap;
-}
-table.documentation th {
-  background-color: #eee;
+ul.documentation-list {
+  list-style: none;
+  padding-left: 20px;
 }
 .footer {
   clear: both;

Modified: incubator/samza/site/css/ropa-sans.css
URL: http://svn.apache.org/viewvc/incubator/samza/site/css/ropa-sans.css?rev=1609232&r1=1609231&r2=1609232&view=diff
==============================================================================
--- incubator/samza/site/css/ropa-sans.css (original)
+++ incubator/samza/site/css/ropa-sans.css Wed Jul  9 16:37:01 2014
@@ -1,3 +1,22 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+
 @font-face {
   font-family: 'Ropa Sans';
   font-style: normal;

Modified: incubator/samza/site/index.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/index.html?rev=1609232&r1=1609231&r2=1609232&view=diff
==============================================================================
--- incubator/samza/site/index.html (original)
+++ incubator/samza/site/index.html Wed Jul  9 16:37:01 2014
@@ -23,6 +23,7 @@
     <link href="/css/bootstrap.min.css" rel="stylesheet"/>
     <link href="/css/font-awesome.min.css" rel="stylesheet"/>
     <link href="/css/main.css" rel="stylesheet"/>
+    <link href="/css/syntax.css" rel="stylesheet"/>
     <link rel="icon" type="image/png" href="/img/samza-icon.png">
   </head>
   <body>

Modified: incubator/samza/site/learn/documentation/0.7.0/api/overview.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/learn/documentation/0.7.0/api/overview.html?rev=1609232&r1=1609231&r2=1609232&view=diff
==============================================================================
--- incubator/samza/site/learn/documentation/0.7.0/api/overview.html (original)
+++ incubator/samza/site/learn/documentation/0.7.0/api/overview.html Wed Jul  9 16:37:01 2014
@@ -23,6 +23,7 @@
     <link href="/css/bootstrap.min.css" rel="stylesheet"/>
     <link href="/css/font-awesome.min.css" rel="stylesheet"/>
     <link href="/css/main.css" rel="stylesheet"/>
+    <link href="/css/syntax.css" rel="stylesheet"/>
     <link rel="icon" type="image/png" href="/img/samza-icon.png">
   </head>
   <body>
@@ -123,54 +124,57 @@
 -->
 
 <p>When writing a stream processor for Samza, you must implement the <a href="javadocs/org/apache/samza/task/StreamTask.html">StreamTask</a> interface:</p>
-<div class="highlight"><pre><code class="language-text" data-lang="text">package com.example.samza;
 
-public class MyTaskClass implements StreamTask {
+<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="kn">package</span> <span class="n">com</span><span class="o">.</span><span class="na">example</span><span class="o">.</span><span class="na">samza</span><span class="o">;</span>
+
+<span class="kd">public</span> <span class="kd">class</span> <span class="nc">MyTaskClass</span> <span class="kd">implements</span> <span class="n">StreamTask</span> <span class="o">{</span>
+
+  <span class="kd">public</span> <span class="kt">void</span> <span class="nf">process</span><span class="o">(</span><span class="n">IncomingMessageEnvelope</span> <span class="n">envelope</span><span class="o">,</span>
+                      <span class="n">MessageCollector</span> <span class="n">collector</span><span class="o">,</span>
+                      <span class="n">TaskCoordinator</span> <span class="n">coordinator</span><span class="o">)</span> <span class="o">{</span>
+    <span class="c1">// process message</span>
+  <span class="o">}</span>
+<span class="o">}</span></code></pre></div>
 
-  public void process(IncomingMessageEnvelope envelope,
-                      MessageCollector collector,
-                      TaskCoordinator coordinator) {
-    // process message
-  }
-}
-</code></pre></div>
 <p>When you run your job, Samza will create several instances of your class (potentially on multiple machines). These task instances process the messages in the input streams.</p>
 
 <p>In your job&rsquo;s configuration you can tell Samza which streams you want to consume. An incomplete example could look like this (see the <a href="../jobs/configuration.html">configuration documentation</a> for more detail):</p>
-<div class="highlight"><pre><code class="language-text" data-lang="text"># This is the class above, which Samza will instantiate when the job is run
-task.class=com.example.samza.MyTaskClass
 
-# Define a system called &quot;kafka&quot; (you can give it any name, and you can define
-# multiple systems if you want to process messages from different sources)
-systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory
-
-# The job consumes a topic called &quot;PageViewEvent&quot; from the &quot;kafka&quot; system
-task.inputs=kafka.PageViewEvent
-
-# Define a serializer/deserializer called &quot;json&quot; which parses JSON messages
-serializers.registry.json.class=org.apache.samza.serializers.JsonSerdeFactory
-
-# Use the &quot;json&quot; serializer for messages in the &quot;PageViewEvent&quot; topic
-systems.kafka.streams.PageViewEvent.samza.msg.serde=json
-</code></pre></div>
+<div class="highlight"><pre><code class="language-jproperties" data-lang="jproperties"><span class="c"># This is the class above, which Samza will instantiate when the job is run</span>
+<span class="na">task.class</span><span class="o">=</span><span class="s">com.example.samza.MyTaskClass</span>
+
+<span class="c"># Define a system called &quot;kafka&quot; (you can give it any name, and you can define</span>
+<span class="c"># multiple systems if you want to process messages from different sources)</span>
+<span class="na">systems.kafka.samza.factory</span><span class="o">=</span><span class="s">org.apache.samza.system.kafka.KafkaSystemFactory</span>
+
+<span class="c"># The job consumes a topic called &quot;PageViewEvent&quot; from the &quot;kafka&quot; system</span>
+<span class="na">task.inputs</span><span class="o">=</span><span class="s">kafka.PageViewEvent</span>
+
+<span class="c"># Define a serializer/deserializer called &quot;json&quot; which parses JSON messages</span>
+<span class="na">serializers.registry.json.class</span><span class="o">=</span><span class="s">org.apache.samza.serializers.JsonSerdeFactory</span>
+
+<span class="c"># Use the &quot;json&quot; serializer for messages in the &quot;PageViewEvent&quot; topic</span>
+<span class="na">systems.kafka.streams.PageViewEvent.samza.msg.serde</span><span class="o">=</span><span class="s">json</span></code></pre></div>
+
 <p>For each message that Samza receives from the task&rsquo;s input streams, the <em>process</em> method is called. The <a href="javadocs/org/apache/samza/system/IncomingMessageEnvelope.html">envelope</a> contains three things of importance: the message, the key, and the stream that the message came from.</p>
-<div class="highlight"><pre><code class="language-text" data-lang="text">/** Every message that is delivered to a StreamTask is wrapped
- * in an IncomingMessageEnvelope, which contains metadata about
- * the origin of the message. */
-public class IncomingMessageEnvelope {
-  /** A deserialized message. */
-  Object getMessage() { ... }
-
-  /** A deserialized key. */
-  Object getKey() { ... }
-
-  /** The stream and partition that this message came from. */
-  SystemStreamPartition getSystemStreamPartition() { ... }
-}
-</code></pre></div>
+
+<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="cm">/** Every message that is delivered to a StreamTask is wrapped</span>
+<span class="cm"> * in an IncomingMessageEnvelope, which contains metadata about</span>
+<span class="cm"> * the origin of the message. */</span>
+<span class="kd">public</span> <span class="kd">class</span> <span class="nc">IncomingMessageEnvelope</span> <span class="o">{</span>
+  <span class="cm">/** A deserialized message. */</span>
+  <span class="n">Object</span> <span class="nf">getMessage</span><span class="o">()</span> <span class="o">{</span> <span class="o">...</span> <span class="o">}</span>
+
+  <span class="cm">/** A deserialized key. */</span>
+  <span class="n">Object</span> <span class="nf">getKey</span><span class="o">()</span> <span class="o">{</span> <span class="o">...</span> <span class="o">}</span>
+
+  <span class="cm">/** The stream and partition that this message came from. */</span>
+  <span class="n">SystemStreamPartition</span> <span class="nf">getSystemStreamPartition</span><span class="o">()</span> <span class="o">{</span> <span class="o">...</span> <span class="o">}</span>
+<span class="o">}</span></code></pre></div>
+
 <p>The key and value are declared as Object, and need to be cast to the correct type. If you don&rsquo;t configure a <a href="../container/serialization.html">serializer/deserializer</a>, they are typically Java byte arrays. A deserializer can convert these bytes into any other type, for example the JSON deserializer mentioned above parses the byte array into java.util.Map, java.util.List and String objects.</p>
 
-<p>The getSystemStreamPartition() method returns a <a href="javadocs/org/apache/samza/system/SystemStreamPartition.html">SystemStreamPartition</a> object, which tells you where the message came from. It consists of three parts:</p>
+<p>The <code>getSystemStreamPartition()</code> method returns a <a href="javadocs/org/apache/samza/system/SystemStreamPartition.html">SystemStreamPartition</a> object, which tells you where the message came from. It consists of three parts:</p>
 
 <ol>
 <li>The <em>system</em>: the name of the system from which the message came, as defined in your job configuration. You can have multiple systems for input and/or output, each with a different name.</li>
@@ -179,53 +183,56 @@ public class IncomingMessageEnvelope {
 </ol>
 
 <p>The API looks like this:</p>
-<div class="highlight"><pre><code class="language-text" data-lang="text">/** A triple of system name, stream name and partition. */
-public class SystemStreamPartition extends SystemStream {
 
-  /** The name of the system which provides this stream. It is
-      defined in the Samza job&#39;s configuration. */
-  public String getSystem() { ... }
-
-  /** The name of the stream/topic/queue within the system. */
-  public String getStream() { ... }
-
-  /** The partition within the stream. */
-  public Partition getPartition() { ... }
-}
-</code></pre></div>
+<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="cm">/** A triple of system name, stream name and partition. */</span>
+<span class="kd">public</span> <span class="kd">class</span> <span class="nc">SystemStreamPartition</span> <span class="kd">extends</span> <span class="n">SystemStream</span> <span class="o">{</span>
+
+  <span class="cm">/** The name of the system which provides this stream. It is</span>
+<span class="cm">      defined in the Samza job&#39;s configuration. */</span>
+  <span class="kd">public</span> <span class="n">String</span> <span class="nf">getSystem</span><span class="o">()</span> <span class="o">{</span> <span class="o">...</span> <span class="o">}</span>
+
+  <span class="cm">/** The name of the stream/topic/queue within the system. */</span>
+  <span class="kd">public</span> <span class="n">String</span> <span class="nf">getStream</span><span class="o">()</span> <span class="o">{</span> <span class="o">...</span> <span class="o">}</span>
+
+  <span class="cm">/** The partition within the stream. */</span>
+  <span class="kd">public</span> <span class="n">Partition</span> <span class="nf">getPartition</span><span class="o">()</span> <span class="o">{</span> <span class="o">...</span> <span class="o">}</span>
+<span class="o">}</span></code></pre></div>
+
 <p>In the example job configuration above, the system name is &ldquo;kafka&rdquo;, the stream name is &ldquo;PageViewEvent&rdquo;. (The name &ldquo;kafka&rdquo; isn&rsquo;t special &mdash; you can give your system any name you want.) If you have several input streams feeding into your StreamTask, you can use the SystemStreamPartition to determine what kind of message you&rsquo;ve received.</p>
 
 <p>What about sending messages? If you take a look at the process() method in StreamTask, you&rsquo;ll see that you get a <a href="javadocs/org/apache/samza/task/MessageCollector.html">MessageCollector</a>.</p>
-<div class="highlight"><pre><code class="language-text" data-lang="text">/** When a task wishes to send a message, it uses this interface. */
-public interface MessageCollector {
-  void send(OutgoingMessageEnvelope envelope);
-}
-</code></pre></div>
+
+<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="cm">/** When a task wishes to send a message, it uses this interface. */</span>
+<span class="kd">public</span> <span class="kd">interface</span> <span class="nc">MessageCollector</span> <span class="o">{</span>
+  <span class="kt">void</span> <span class="nf">send</span><span class="o">(</span><span class="n">OutgoingMessageEnvelope</span> <span class="n">envelope</span><span class="o">);</span>
+<span class="o">}</span></code></pre></div>
+
 <p>To send a message, you create an <a href="javadocs/org/apache/samza/system/OutgoingMessageEnvelope.html">OutgoingMessageEnvelope</a> object and pass it to the message collector. At a minimum, the envelope specifies the message you want to send, and the system and stream name to send it to. Optionally you can specify the partitioning key and other parameters. See the <a href="javadocs/org/apache/samza/system/OutgoingMessageEnvelope.html">javadoc</a> for details.</p>
 
-<p><strong>NOTE:</strong> Please only use the MessageCollector object within the process() method. If you hold on to a MessageCollector instance and use it again later, your messages may not be sent correctly.</p>
+<p><strong>NOTE:</strong> Please only use the MessageCollector object within the <code>process()</code> method. If you hold on to a MessageCollector instance and use it again later, your messages may not be sent correctly.</p>
 
 <p>For example, here&rsquo;s a simple task that splits each input message into words, and emits each word as a separate message:</p>
-<div class="highlight"><pre><code class="language-text" data-lang="text">public class SplitStringIntoWords implements StreamTask {
 
-  // Send outgoing messages to a stream called &quot;words&quot;
-  // in the &quot;kafka&quot; system.
-  private final SystemStream OUTPUT_STREAM =
-    new SystemStream(&quot;kafka&quot;, &quot;words&quot;);
-
-  public void process(IncomingMessageEnvelope envelope,
-                      MessageCollector collector,
-                      TaskCoordinator coordinator) {
-    String message = (String) envelope.getMessage();
-
-    for (String word : message.split(&quot; &quot;)) {
-      // Use the word as the key, and 1 as the value.
-      // A second task can add the 1&#39;s to get the word count.
-      collector.send(new OutgoingMessageEnvelope(OUTPUT_STREAM, word, 1));
-    }
-  }
-}
-</code></pre></div>
+<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="kd">public</span> <span class="kd">class</span> <span class="nc">SplitStringIntoWords</span> <span class="kd">implements</span> <span class="n">StreamTask</span> <span class="o">{</span>
+
+  <span class="c1">// Send outgoing messages to a stream called &quot;words&quot;</span>
+  <span class="c1">// in the &quot;kafka&quot; system.</span>
+  <span class="kd">private</span> <span class="kd">final</span> <span class="n">SystemStream</span> <span class="n">OUTPUT_STREAM</span> <span class="o">=</span>
+    <span class="k">new</span> <span class="nf">SystemStream</span><span class="o">(</span><span class="s">&quot;kafka&quot;</span><span class="o">,</span> <span class="s">&quot;words&quot;</span><span class="o">);</span>
+
+  <span class="kd">public</span> <span class="kt">void</span> <span class="nf">process</span><span class="o">(</span><span class="n">IncomingMessageEnvelope</span> <span class="n">envelope</span><span class="o">,</span>
+                      <span class="n">MessageCollector</span> <span class="n">collector</span><span class="o">,</span>
+                      <span class="n">TaskCoordinator</span> <span class="n">coordinator</span><span class="o">)</span> <span class="o">{</span>
+    <span class="n">String</span> <span class="n">message</span> <span class="o">=</span> <span class="o">(</span><span class="n">String</span><span class="o">)</span> <span class="n">envelope</span><span class="o">.</span><span class="na">getMessage</span><span class="o">();</span>
+
+    <span class="k">for</span> <span class="o">(</span><span class="n">String</span> <span class="n">word</span> <span class="o">:</span> <span class="n">message</span><span class="o">.</span><span class="na">split</span><span class="o">(</span><span class="s">&quot; &quot;</span><span class="o">))</span> <span class="o">{</span>
+      <span class="c1">// Use the word as the key, and 1 as the value.</span>
+      <span class="c1">// A second task can add the 1&#39;s to get the word count.</span>
+      <span class="n">collector</span><span class="o">.</span><span class="na">send</span><span class="o">(</span><span class="k">new</span> <span class="nf">OutgoingMessageEnvelope</span><span class="o">(</span><span class="n">OUTPUT_STREAM</span><span class="o">,</span> <span class="n">word</span><span class="o">,</span> <span class="mi">1</span><span class="o">));</span>
+    <span class="o">}</span>
+  <span class="o">}</span>
+<span class="o">}</span></code></pre></div>
+
 <h2 id="samzacontainer-&raquo;"><a href="../container/samza-container.html">SamzaContainer &raquo;</a></h2>
 
 

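Note on the SplitStringIntoWords example in the hunk above: the comment there says a second task can add the 1's to get the word count. A minimal stand-alone sketch of that split-then-sum logic follows. It uses no Samza classes; the class name WordCountSketch and the method countWords are hypothetical. Unlike the documented task, it skips the empty tokens that split(" ") produces for repeated spaces, which is an assumption about what a word count should ignore.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-alone sketch: no Samza classes are used here.
class WordCountSketch {

  // Mirrors the loop in SplitStringIntoWords (one "word -> 1" pair per token)
  // and folds the 1's into a running total, as a downstream task might.
  static Map<String, Integer> countWords(String message) {
    Map<String, Integer> counts = new LinkedHashMap<>();
    for (String word : message.split(" ")) {
      if (word.isEmpty()) {
        // split(" ") yields empty tokens for leading or repeated spaces.
        continue;
      }
      counts.merge(word, 1, Integer::sum);
    }
    return counts;
  }

  public static void main(String[] args) {
    // The double space between "or" and "not" is deliberate.
    System.out.println(countWords("to be or  not to be"));
  }
}
```

In a real job the summing would live in a separate task that consumes the "words" stream, so that all occurrences of a given key can be added up in one place.
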
Modified: incubator/samza/site/learn/documentation/0.7.0/comparisons/introduction.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/learn/documentation/0.7.0/comparisons/introduction.html?rev=1609232&r1=1609231&r2=1609232&view=diff
==============================================================================
--- incubator/samza/site/learn/documentation/0.7.0/comparisons/introduction.html (original)
+++ incubator/samza/site/learn/documentation/0.7.0/comparisons/introduction.html Wed Jul  9 16:37:01 2014
@@ -23,6 +23,7 @@
     <link href="/css/bootstrap.min.css" rel="stylesheet"/>
     <link href="/css/font-awesome.min.css" rel="stylesheet"/>
     <link href="/css/main.css" rel="stylesheet"/>
+    <link href="/css/syntax.css" rel="stylesheet"/>
     <link rel="icon" type="image/png" href="/img/samza-icon.png">
   </head>
   <body>

Modified: incubator/samza/site/learn/documentation/0.7.0/comparisons/mupd8.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/learn/documentation/0.7.0/comparisons/mupd8.html?rev=1609232&r1=1609231&r2=1609232&view=diff
==============================================================================
--- incubator/samza/site/learn/documentation/0.7.0/comparisons/mupd8.html (original)
+++ incubator/samza/site/learn/documentation/0.7.0/comparisons/mupd8.html Wed Jul  9 16:37:01 2014
@@ -23,6 +23,7 @@
     <link href="/css/bootstrap.min.css" rel="stylesheet"/>
     <link href="/css/font-awesome.min.css" rel="stylesheet"/>
     <link href="/css/main.css" rel="stylesheet"/>
+    <link href="/css/syntax.css" rel="stylesheet"/>
     <link rel="icon" type="image/png" href="/img/samza-icon.png">
   </head>
   <body>

Modified: incubator/samza/site/learn/documentation/0.7.0/comparisons/storm.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/learn/documentation/0.7.0/comparisons/storm.html?rev=1609232&r1=1609231&r2=1609232&view=diff
==============================================================================
--- incubator/samza/site/learn/documentation/0.7.0/comparisons/storm.html (original)
+++ incubator/samza/site/learn/documentation/0.7.0/comparisons/storm.html Wed Jul  9 16:37:01 2014
@@ -23,6 +23,7 @@
     <link href="/css/bootstrap.min.css" rel="stylesheet"/>
     <link href="/css/font-awesome.min.css" rel="stylesheet"/>
     <link href="/css/main.css" rel="stylesheet"/>
+    <link href="/css/syntax.css" rel="stylesheet"/>
     <link rel="icon" type="image/png" href="/img/samza-icon.png">
   </head>
   <body>

Modified: incubator/samza/site/learn/documentation/0.7.0/container/checkpointing.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/learn/documentation/0.7.0/container/checkpointing.html?rev=1609232&r1=1609231&r2=1609232&view=diff
==============================================================================
--- incubator/samza/site/learn/documentation/0.7.0/container/checkpointing.html (original)
+++ incubator/samza/site/learn/documentation/0.7.0/container/checkpointing.html Wed Jul  9 16:37:01 2014
@@ -23,6 +23,7 @@
     <link href="/css/bootstrap.min.css" rel="stylesheet"/>
     <link href="/css/font-awesome.min.css" rel="stylesheet"/>
     <link href="/css/main.css" rel="stylesheet"/>
+    <link href="/css/syntax.css" rel="stylesheet"/>
     <link rel="icon" type="image/png" href="/img/samza-icon.png">
   </head>
   <body>
@@ -143,59 +144,80 @@
 <p>This guarantee is called <em>at-least-once processing</em>: Samza ensures that your job doesn&rsquo;t miss any messages, even if containers need to be restarted. However, it is possible for your job to see the same message more than once when a container is restarted. We are planning to address this in a future version of Samza, but for now it is just something to be aware of: for example, if you are counting page views, a forcefully killed container could cause events to be slightly over-counted. You can reduce duplication by checkpointing more frequently, at a slight performance cost.</p>
 
 <p>For checkpoints to be effective, they need to be written somewhere where they will survive faults. Samza allows you to write checkpoints to the file system (using FileSystemCheckpointManager), but that doesn&rsquo;t help if the machine fails and the container needs to be restarted on another machine. The most common configuration is to use Kafka for checkpointing. You can enable this with the following job configuration:</p>
-<div class="highlight"><pre><code class="language-text" data-lang="text"># The name of your job determines the name under which checkpoints will be stored
-job.name=example-job
 
-# Define a system called &quot;kafka&quot; for consuming and producing to a Kafka cluster
-systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory
+<div class="highlight"><pre><code class="language-jproperties" data-lang="jproperties"><span class="c"># The name of your job determines the name under which checkpoints will be stored</span>
+<span class="na">job.name</span><span class="o">=</span><span class="s">example-job</span>
+
+<span class="c"># Define a system called &quot;kafka&quot; for consuming and producing to a Kafka cluster</span>
+<span class="na">systems.kafka.samza.factory</span><span class="o">=</span><span class="s">org.apache.samza.system.kafka.KafkaSystemFactory</span>
+
+<span class="c"># Declare that we want our job&#39;s checkpoints to be written to Kafka</span>
+<span class="na">task.checkpoint.factory</span><span class="o">=</span><span class="s">org.apache.samza.checkpoint.kafka.KafkaCheckpointManagerFactory</span>
+<span class="na">task.checkpoint.system</span><span class="o">=</span><span class="s">kafka</span>
+
+<span class="c"># By default, a checkpoint is written every 60 seconds. You can change this if you like.</span>
+<span class="na">task.commit.ms</span><span class="o">=</span><span class="s">60000</span></code></pre></div>
 
-# Declare that we want our job&#39;s checkpoints to be written to Kafka
-task.checkpoint.factory=org.apache.samza.checkpoint.kafka.KafkaCheckpointManagerFactory
-task.checkpoint.system=kafka
-
-# By default, a checkpoint is written every 60 seconds. You can change this if you like.
-task.commit.ms=60000
-</code></pre></div>
 <p>In this configuration, Samza writes checkpoints to a separate Kafka topic called __samza_checkpoint_&lt;job-name&gt;_&lt;job-id&gt; (in the example configuration above, the topic would be called __samza_checkpoint_example-job_1). Once per minute, Samza automatically sends a message to this topic, in which the current offsets of the input streams are encoded. When a Samza container starts up, it looks for the most recent offset message in this topic, and loads that checkpoint.</p>
 
 <p>Sometimes it can be useful to use checkpoints only for some input streams, but not for others. In this case, you can tell Samza to ignore any checkpointed offsets for a particular stream name:</p>
-<div class="highlight"><pre><code class="language-text" data-lang="text"># Ignore any checkpoints for the topic &quot;my-special-topic&quot;
-systems.kafka.streams.my-special-topic.samza.reset.offset=true
 
-# Always start consuming &quot;my-special-topic&quot; at the oldest available offset
-systems.kafka.streams.my-special-topic.samza.offset.default=oldest
-</code></pre></div>
+<div class="highlight"><pre><code class="language-jproperties" data-lang="jproperties"><span class="c"># Ignore any checkpoints for the topic &quot;my-special-topic&quot;</span>
+<span class="na">systems.kafka.streams.my-special-topic.samza.reset.offset</span><span class="o">=</span><span class="s">true</span>
+
+<span class="c"># Always start consuming &quot;my-special-topic&quot; at the oldest available offset</span>
+<span class="na">systems.kafka.streams.my-special-topic.samza.offset.default</span><span class="o">=</span><span class="s">oldest</span></code></pre></div>
+
 <p>The following table explains the meaning of these configuration parameters:</p>
 
-<table class="documentation">
-  <tr>
-    <th>Parameter name</th>
-    <th>Value</th>
-    <th>Meaning</th>
-  </tr>
-  <tr>
-    <td rowspan="2" class="nowrap">systems.&lt;system&gt;.<br>streams.&lt;stream&gt;.<br>samza.reset.offset</td>
-    <td>false (default)</td>
-    <td>When container starts up, resume processing from last checkpoint</td>
-  </tr>
-  <tr>
-    <td>true</td>
-    <td>Ignore checkpoint (pretend that no checkpoint is present)</td>
-  </tr>
-  <tr>
-    <td rowspan="2" class="nowrap">systems.&lt;system&gt;.<br>streams.&lt;stream&gt;.<br>samza.offset.default</td>
-    <td>upcoming (default)</td>
-    <td>When container starts and there is no checkpoint (or the checkpoint is ignored), only process messages that are published after the job is started, but no old messages</td>
-  </tr>
-  <tr>
-    <td>oldest</td>
-    <td>When container starts and there is no checkpoint (or the checkpoint is ignored), jump back to the oldest available message in the system, and consume all messages from that point onwards (most likely this means repeated processing of messages already seen previously)</td>
-  </tr>
+<table class="table table-condensed table-bordered table-striped">
+  <thead>
+    <tr>
+      <th>Parameter name</th>
+      <th>Value</th>
+      <th>Meaning</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td rowspan="2" class="nowrap">systems.&lt;system&gt;.<br>streams.&lt;stream&gt;.<br>samza.reset.offset</td>
+      <td>false (default)</td>
+      <td>When container starts up, resume processing from last checkpoint</td>
+    </tr>
+    <tr>
+      <td>true</td>
+      <td>Ignore checkpoint (pretend that no checkpoint is present)</td>
+    </tr>
+    <tr>
+      <td rowspan="2" class="nowrap">systems.&lt;system&gt;.<br>streams.&lt;stream&gt;.<br>samza.offset.default</td>
+      <td>upcoming (default)</td>
+      <td>When container starts and there is no checkpoint (or the checkpoint is ignored), only process messages that are published after the job is started, but no old messages</td>
+    </tr>
+    <tr>
+      <td>oldest</td>
+      <td>When container starts and there is no checkpoint (or the checkpoint is ignored), jump back to the oldest available message in the system, and consume all messages from that point onwards (most likely this means repeated processing of messages already seen previously)</td>
+    </tr>
+  </tbody>
 </table>
 
 <p>Note that the example configuration above causes your tasks to start consuming from the oldest offset <em>every time a container starts up</em>. This is useful in case you have some in-memory state in your tasks that you need to rebuild from source data in an input stream. If you are using streams in this way, you may also find <a href="streams.html">bootstrap streams</a> useful.</p>
 
-<p>If you want to make a one-off change to a job&rsquo;s consumer offsets, for example to force old messages to be processed again with a new version of your code, you can use CheckpointTool to manipulate the job&rsquo;s checkpoint. The tool is included in Samza&rsquo;s <a href="/contribute/code.html">source repository</a> and documented in the README.</p>
+<h3 id="manipulating-checkpoints-manually">Manipulating Checkpoints Manually</h3>
+
+<p>If you want to make a one-off change to a job&rsquo;s consumer offsets, for example to force old messages to be <a href="../jobs/reprocessing.html">processed again</a> with a new version of your code, you can use CheckpointTool to inspect and manipulate the job&rsquo;s checkpoint. The tool is included in Samza&rsquo;s <a href="/contribute/code.html">source repository</a>.</p>
+
+<p>To inspect a job&rsquo;s latest checkpoint, you need to specify your job&rsquo;s config file, so that the tool knows which job it is dealing with:</p>
+
+<div class="highlight"><pre><code class="language-bash" data-lang="bash">samza-example/target/bin/checkpoint-tool.sh <span class="se">\</span>
+  --config-path<span class="o">=</span>file:///path/to/job/config.properties</code></pre></div>
+
+<p>This command prints out the latest checkpoint in a properties file format. You can save the output to a file, and edit it as you wish. For example, to jump back to the oldest possible point in time, you can set all the offsets to 0. Then you can feed that properties file back into checkpoint-tool.sh and save the modified checkpoint:</p>
+
+<div class="highlight"><pre><code class="language-bash" data-lang="bash">samza-example/target/bin/checkpoint-tool.sh <span class="se">\</span>
+  --config-path<span class="o">=</span>file:///path/to/job/config.properties <span class="se">\</span>
+  --new-offsets<span class="o">=</span>file:///path/to/new/offsets.properties</code></pre></div>
+
+<p>Note that Samza only reads checkpoints on container startup. In order for your checkpoint change to take effect, you need to first stop the job, then save the modified offsets, and then start the job again. If you write a checkpoint while the job is running, it will most likely have no effect.</p>
 
 <h2 id="state-management-&raquo;"><a href="state-management.html">State Management &raquo;</a></h2>
 

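Note on the "Manipulating Checkpoints Manually" text in the hunk above: it suggests saving the checkpoint-tool.sh output, setting all the offsets to 0, and feeding the file back in. The editing step can be sketched in plain Java as below. This is an illustration only: the class name ResetOffsetsSketch is hypothetical, the keys in main are placeholders rather than the tool's real key layout, and the code assumes every entry in the file is an offset.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper, not part of Samza.
class ResetOffsetsSketch {

  // Parses checkpoint text in properties format and sets every value to "0".
  // Assumes every entry in the input is an offset.
  static Properties resetAll(String checkpointText) {
    Properties offsets = new Properties();
    try {
      offsets.load(new StringReader(checkpointText));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    // stringPropertyNames() returns a snapshot, so updating while iterating is safe.
    for (String key : offsets.stringPropertyNames()) {
      offsets.setProperty(key, "0");
    }
    return offsets;
  }

  public static void main(String[] args) throws IOException {
    // Placeholder keys; the real layout is whatever checkpoint-tool.sh prints.
    String saved = "example-stream-partition-0=1606\nexample-stream-partition-1=985\n";
    StringWriter out = new StringWriter();
    resetAll(saved).store(out, "offsets reset to 0");
    System.out.print(out);
  }
}
```

The resulting text would be saved to the file passed via --new-offsets. As the documentation above says, the job has to be stopped before the modified checkpoint is written.
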
Modified: incubator/samza/site/learn/documentation/0.7.0/container/event-loop.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/learn/documentation/0.7.0/container/event-loop.html?rev=1609232&r1=1609231&r2=1609232&view=diff
==============================================================================
--- incubator/samza/site/learn/documentation/0.7.0/container/event-loop.html (original)
+++ incubator/samza/site/learn/documentation/0.7.0/container/event-loop.html Wed Jul  9 16:37:01 2014
@@ -23,6 +23,7 @@
     <link href="/css/bootstrap.min.css" rel="stylesheet"/>
     <link href="/css/font-awesome.min.css" rel="stylesheet"/>
     <link href="/css/main.css" rel="stylesheet"/>
+    <link href="/css/syntax.css" rel="stylesheet"/>
     <link rel="icon" type="image/png" href="/img/samza-icon.png">
   </head>
   <body>
@@ -151,12 +152,13 @@
 <p>To receive notifications when such events happen, you can implement the <a href="../api/javadocs/org/apache/samza/task/TaskLifecycleListenerFactory.html">TaskLifecycleListenerFactory</a> interface. It returns a <a href="../api/javadocs/org/apache/samza/task/TaskLifecycleListener.html">TaskLifecycleListener</a>, whose methods are called by Samza at the appropriate times.</p>
 
 <p>You can then tell Samza to use your lifecycle listener with the following properties in your job configuration:</p>
-<div class="highlight"><pre><code class="language-text" data-lang="text"># Define a listener called &quot;my-listener&quot; by giving the factory class name
-task.lifecycle.listener.my-listener.class=com.example.foo.MyListenerFactory
 
-# Enable it in this job (multiple listeners can be separated by commas)
-task.lifecycle.listeners=my-listener
-</code></pre></div>
+<div class="highlight"><pre><code class="language-jproperties" data-lang="jproperties"><span class="c"># Define a listener called &quot;my-listener&quot; by giving the factory class name</span>
+<span class="na">task.lifecycle.listener.my-listener.class</span><span class="o">=</span><span class="s">com.example.foo.MyListenerFactory</span>
+
+<span class="c"># Enable it in this job (multiple listeners can be separated by commas)</span>
+<span class="na">task.lifecycle.listeners</span><span class="o">=</span><span class="s">my-listener</span></code></pre></div>
+
 <p>The Samza container creates one instance of your <a href="../api/javadocs/org/apache/samza/task/TaskLifecycleListener.html">TaskLifecycleListener</a>. If the container has multiple task instances (processing different input stream partitions), the beforeInit, afterInit, beforeClose and afterClose methods are called for each task instance. The <a href="../api/javadocs/org/apache/samza/task/TaskContext.html">TaskContext</a> argument of those methods gives you more information about the partitions.</p>
 
 <h2 id="jmx-&raquo;"><a href="jmx.html">JMX &raquo;</a></h2>

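Note on the lifecycle listener configuration in the hunk above: the two properties work as a pair, one naming the enabled listeners and one giving the factory class for each name. The sketch below shows how that pairing can be read, using only java.util.Properties. It is not Samza's own resolution code; the class name ListenerConfigSketch, the whitespace trimming, and the error raised for a missing factory are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical sketch, not Samza's implementation.
class ListenerConfigSketch {

  // Maps each name listed in task.lifecycle.listeners to the factory class
  // configured under task.lifecycle.listener.<name>.class.
  static Map<String, String> resolveFactories(Properties config) {
    Map<String, String> factories = new LinkedHashMap<>();
    String enabled = config.getProperty("task.lifecycle.listeners", "");
    for (String raw : enabled.split(",")) {
      String name = raw.trim();
      if (name.isEmpty()) {
        continue;
      }
      String factory = config.getProperty("task.lifecycle.listener." + name + ".class");
      if (factory == null) {
        throw new IllegalArgumentException("No factory class configured for listener " + name);
      }
      factories.put(name, factory);
    }
    return factories;
  }

  public static void main(String[] args) {
    Properties config = new Properties();
    config.setProperty("task.lifecycle.listener.my-listener.class",
        "com.example.foo.MyListenerFactory");
    config.setProperty("task.lifecycle.listeners", "my-listener");
    System.out.println(resolveFactories(config));
  }
}
```

A listener that is defined but not listed in task.lifecycle.listeners is simply never looked up in this sketch, which matches the "enable it in this job" wording of the configuration comment.
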
Modified: incubator/samza/site/learn/documentation/0.7.0/container/jmx.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/learn/documentation/0.7.0/container/jmx.html?rev=1609232&r1=1609231&r2=1609232&view=diff
==============================================================================
--- incubator/samza/site/learn/documentation/0.7.0/container/jmx.html (original)
+++ incubator/samza/site/learn/documentation/0.7.0/container/jmx.html Wed Jul  9 16:37:01 2014
@@ -23,6 +23,7 @@
     <link href="/css/bootstrap.min.css" rel="stylesheet"/>
     <link href="/css/font-awesome.min.css" rel="stylesheet"/>
     <link href="/css/main.css" rel="stylesheet"/>
+    <link href="/css/syntax.css" rel="stylesheet"/>
     <link rel="icon" type="image/png" href="/img/samza-icon.png">
   </head>
   <body>
@@ -125,12 +126,13 @@
 <p>Samza&rsquo;s containers and YARN ApplicationMaster enable <a href="http://docs.oracle.com/javase/tutorial/jmx/">JMX</a> by default. JMX can be used for managing the JVM; for example, you can connect to it using <a href="http://docs.oracle.com/javase/7/docs/technotes/guides/management/jconsole.html">jconsole</a>, which is included in the JDK.</p>
 
 <p>You can tell Samza to publish its internal <a href="metrics.html">metrics</a>, and any custom metrics you define, as JMX MBeans. To enable this, set the following properties in your job configuration:</p>
-<div class="highlight"><pre><code class="language-text" data-lang="text"># Define a Samza metrics reporter called &quot;jmx&quot;, which publishes to JMX
-metrics.reporter.jmx.class=org.apache.samza.metrics.reporter.JmxReporterFactory
 
-# Use it (if you have multiple reporters defined, separate them with commas)
-metrics.reporters=jmx
-</code></pre></div>
+<div class="highlight"><pre><code class="language-jproperties" data-lang="jproperties"><span class="c"># Define a Samza metrics reporter called &quot;jmx&quot;, which publishes to JMX</span>
+<span class="na">metrics.reporter.jmx.class</span><span class="o">=</span><span class="s">org.apache.samza.metrics.reporter.JmxReporterFactory</span>
+
+<span class="c"># Use it (if you have multiple reporters defined, separate them with commas)</span>
+<span class="na">metrics.reporters</span><span class="o">=</span><span class="s">jmx</span></code></pre></div>
+
 <p>JMX needs to be configured to use a specific port, but in a distributed environment, there is no way of knowing in advance which ports are available on the machines running your containers. Therefore Samza chooses the JMX port randomly. If you need to connect to it, you can find the port by looking in the container&rsquo;s logs, which report the JMX server details as follows:</p>
 <div class="highlight"><pre><code class="language-text" data-lang="text">2014-06-02 21:50:17 JmxServer [INFO] According to InetAddress.getLocalHost.getHostName we are samza-grid-1234.example.com
 2014-06-02 21:50:17 JmxServer [INFO] Started JmxServer registry port=50214 server port=50215 url=service:jmx:rmi://localhost:50215/jndi/rmi://localhost:50214/jmxrmi

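Note on the JMX log lines in the hunk above: since the port is chosen at random, a client has to recover the service URL from the container log. A small sketch of that step follows, using the JDK's javax.management.remote.JMXServiceURL. The class name JmxUrlSketch and the regular expression are hypothetical, and the code assumes the log line carries a url= field shaped like the sample.

```java
import java.net.MalformedURLException;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import javax.management.remote.JMXServiceURL;

// Hypothetical helper, not part of Samza.
class JmxUrlSketch {

  // Assumes the log line contains "url=service:jmx:..." as in the sample output.
  private static final Pattern URL_FIELD = Pattern.compile("url=(service:jmx:\\S+)");

  static JMXServiceURL fromLogLine(String logLine) {
    Matcher m = URL_FIELD.matcher(logLine);
    if (!m.find()) {
      throw new IllegalArgumentException("no JMX url in log line");
    }
    try {
      return new JMXServiceURL(m.group(1));
    } catch (MalformedURLException e) {
      throw new IllegalArgumentException(e);
    }
  }

  public static void main(String[] args) {
    String line = "2014-06-02 21:50:17 JmxServer [INFO] Started JmxServer "
        + "registry port=50214 server port=50215 "
        + "url=service:jmx:rmi://localhost:50215/jndi/rmi://localhost:50214/jmxrmi";
    JMXServiceURL url = fromLogLine(line);
    System.out.println(url.getHost() + ":" + url.getPort());
    // A live connection would be opened with JMXConnectorFactory.connect(url).
  }
}
```

The URL in the sample names localhost, so connecting from another machine presumably means substituting the host name reported in the first log line.
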
Modified: incubator/samza/site/learn/documentation/0.7.0/container/metrics.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/learn/documentation/0.7.0/container/metrics.html?rev=1609232&r1=1609231&r2=1609232&view=diff
==============================================================================
--- incubator/samza/site/learn/documentation/0.7.0/container/metrics.html (original)
+++ incubator/samza/site/learn/documentation/0.7.0/container/metrics.html Wed Jul  9 16:37:01 2014
@@ -23,6 +23,7 @@
     <link href="/css/bootstrap.min.css" rel="stylesheet"/>
     <link href="/css/font-awesome.min.css" rel="stylesheet"/>
     <link href="/css/main.css" rel="stylesheet"/>
+    <link href="/css/syntax.css" rel="stylesheet"/>
     <link rel="icon" type="image/png" href="/img/samza-icon.png">
   </head>
   <body>
@@ -127,68 +128,71 @@
 <p>Metrics can be reported in various ways. You can expose them via <a href="jmx.html">JMX</a>, which is useful in development. In production, a common setup is for each Samza container to periodically publish its metrics to a &ldquo;metrics&rdquo; Kafka topic, in which the metrics from all Samza jobs are aggregated. You can then consume this stream in another Samza job, and send the metrics to your favorite graphing system such as <a href="http://graphite.wikidot.com/">Graphite</a>.</p>
 
 <p>To set up your job to publish metrics to Kafka, you can use the following configuration:</p>
-<div class="highlight"><pre><code class="language-text" data-lang="text"># Define a metrics reporter called &quot;snapshot&quot;, which publishes metrics
-# every 60 seconds.
-metrics.reporters=snapshot
-metrics.reporter.snapshot.class=org.apache.samza.metrics.reporter.MetricsSnapshotReporterFactory
-
-# Tell the snapshot reporter to publish to a topic called &quot;metrics&quot;
-# in the &quot;kafka&quot; system.
-metrics.reporter.snapshot.stream=kafka.metrics
-
-# Encode metrics data as JSON.
-serializers.registry.metrics.class=org.apache.samza.serializers.MetricsSnapshotSerdeFactory
-systems.kafka.streams.metrics.samza.msg.serde=metrics
-</code></pre></div>
+
+<div class="highlight"><pre><code class="language-jproperties" data-lang="jproperties"><span class="c"># Define a metrics reporter called &quot;snapshot&quot;, which publishes metrics</span>
+<span class="c"># every 60 seconds.</span>
+<span class="na">metrics.reporters</span><span class="o">=</span><span class="s">snapshot</span>
+<span class="na">metrics.reporter.snapshot.class</span><span class="o">=</span><span class="s">org.apache.samza.metrics.reporter.MetricsSnapshotReporterFactory</span>
+
+<span class="c"># Tell the snapshot reporter to publish to a topic called &quot;metrics&quot;</span>
+<span class="c"># in the &quot;kafka&quot; system.</span>
+<span class="na">metrics.reporter.snapshot.stream</span><span class="o">=</span><span class="s">kafka.metrics</span>
+
+<span class="c"># Encode metrics data as JSON.</span>
+<span class="na">serializers.registry.metrics.class</span><span class="o">=</span><span class="s">org.apache.samza.serializers.MetricsSnapshotSerdeFactory</span>
+<span class="na">systems.kafka.streams.metrics.samza.msg.serde</span><span class="o">=</span><span class="s">metrics</span></code></pre></div>
+
 <p>With this configuration, the job automatically sends several JSON-encoded messages to the &ldquo;metrics&rdquo; topic in Kafka every 60 seconds. The messages look something like this:</p>
-<div class="highlight"><pre><code class="language-text" data-lang="text">{
-  &quot;header&quot;: {
-    &quot;container-name&quot;: &quot;samza-container-0&quot;,
-    &quot;host&quot;: &quot;samza-grid-1234.example.com&quot;,
-    &quot;job-id&quot;: &quot;1&quot;,
-    &quot;job-name&quot;: &quot;my-samza-job&quot;,
-    &quot;reset-time&quot;: 1401729000347,
-    &quot;samza-version&quot;: &quot;0.0.1&quot;,
-    &quot;source&quot;: &quot;Partition-2&quot;,
-    &quot;time&quot;: 1401729420566,
-    &quot;version&quot;: &quot;0.0.1&quot;
-  },
-  &quot;metrics&quot;: {
-    &quot;org.apache.samza.container.TaskInstanceMetrics&quot;: {
-      &quot;commit-calls&quot;: 7,
-      &quot;commit-skipped&quot;: 77948,
-      &quot;kafka-input-topic-offset&quot;: &quot;1606&quot;,
-      &quot;messages-sent&quot;: 985,
-      &quot;process-calls&quot;: 1093,
-      &quot;send-calls&quot;: 985,
-      &quot;send-skipped&quot;: 76970,
-      &quot;window-calls&quot;: 0,
-      &quot;window-skipped&quot;: 77955
-    }
-  }
-}
-</code></pre></div>
+
+<div class="highlight"><pre><code class="language-json" data-lang="json"><span class="p">{</span>
+  <span class="nt">&quot;header&quot;</span><span class="p">:</span> <span class="p">{</span>
+    <span class="nt">&quot;container-name&quot;</span><span class="p">:</span> <span class="s2">&quot;samza-container-0&quot;</span><span class="p">,</span>
+    <span class="nt">&quot;host&quot;</span><span class="p">:</span> <span class="s2">&quot;samza-grid-1234.example.com&quot;</span><span class="p">,</span>
+    <span class="nt">&quot;job-id&quot;</span><span class="p">:</span> <span class="s2">&quot;1&quot;</span><span class="p">,</span>
+    <span class="nt">&quot;job-name&quot;</span><span class="p">:</span> <span class="s2">&quot;my-samza-job&quot;</span><span class="p">,</span>
+    <span class="nt">&quot;reset-time&quot;</span><span class="p">:</span> <span class="mi">1401729000347</span><span class="p">,</span>
+    <span class="nt">&quot;samza-version&quot;</span><span class="p">:</span> <span class="s2">&quot;0.0.1&quot;</span><span class="p">,</span>
+    <span class="nt">&quot;source&quot;</span><span class="p">:</span> <span class="s2">&quot;Partition-2&quot;</span><span class="p">,</span>
+    <span class="nt">&quot;time&quot;</span><span class="p">:</span> <span class="mi">1401729420566</span><span class="p">,</span>
+    <span class="nt">&quot;version&quot;</span><span class="p">:</span> <span class="s2">&quot;0.0.1&quot;</span>
+  <span class="p">},</span>
+  <span class="nt">&quot;metrics&quot;</span><span class="p">:</span> <span class="p">{</span>
+    <span class="nt">&quot;org.apache.samza.container.TaskInstanceMetrics&quot;</span><span class="p">:</span> <span class="p">{</span>
+      <span class="nt">&quot;commit-calls&quot;</span><span class="p">:</span> <span class="mi">7</span><span class="p">,</span>
+      <span class="nt">&quot;commit-skipped&quot;</span><span class="p">:</span> <span class="mi">77948</span><span class="p">,</span>
+      <span class="nt">&quot;kafka-input-topic-offset&quot;</span><span class="p">:</span> <span class="s2">&quot;1606&quot;</span><span class="p">,</span>
+      <span class="nt">&quot;messages-sent&quot;</span><span class="p">:</span> <span class="mi">985</span><span class="p">,</span>
+      <span class="nt">&quot;process-calls&quot;</span><span class="p">:</span> <span class="mi">1093</span><span class="p">,</span>
+      <span class="nt">&quot;send-calls&quot;</span><span class="p">:</span> <span class="mi">985</span><span class="p">,</span>
+      <span class="nt">&quot;send-skipped&quot;</span><span class="p">:</span> <span class="mi">76970</span><span class="p">,</span>
+      <span class="nt">&quot;window-calls&quot;</span><span class="p">:</span> <span class="mi">0</span><span class="p">,</span>
+      <span class="nt">&quot;window-skipped&quot;</span><span class="p">:</span> <span class="mi">77955</span>
+    <span class="p">}</span>
+  <span class="p">}</span>
+<span class="p">}</span></code></pre></div>
+
 <p>There is a separate message for each task instance, and the header tells you the job name, job ID and partition of the task. The metrics allow you to see how many messages have been processed and sent, the current offset in the input stream partition, and other details. There are additional messages which give you metrics about the JVM (heap size, garbage collection information, threads etc.), internal metrics of the Kafka producers and consumers, and more.</p>
 
 <p>It&rsquo;s easy to generate custom metrics in your job, if there&rsquo;s some value you want to keep an eye on. You can use Samza&rsquo;s built-in metrics framework, which is similar in design to Coda Hale&rsquo;s <a href="http://metrics.codahale.com/">metrics</a> library. </p>
 
 <p>You can register your custom metrics through a <a href="../api/javadocs/org/apache/samza/metrics/MetricsRegistry.html">MetricsRegistry</a>. Your stream task needs to implement <a href="../api/javadocs/org/apache/samza/task/InitableTask.html">InitableTask</a>, so that you can get the metrics registry from the <a href="../api/javadocs/org/apache/samza/task/TaskContext.html">TaskContext</a>. This simple example shows how to count the number of messages processed by your task:</p>
-<div class="highlight"><pre><code class="language-text" data-lang="text">public class MyJavaStreamTask implements StreamTask, InitableTask {
-  private Counter messageCount;
 
-  public void init(Config config, TaskContext context) {
-    this.messageCount = context
-      .getMetricsRegistry()
-      .newCounter(getClass().getName(), &quot;message-count&quot;);
-  }
-
-  public void process(IncomingMessageEnvelope envelope,
-                      MessageCollector collector,
-                      TaskCoordinator coordinator) {
-    messageCount.inc();
-  }
-}
-</code></pre></div>
+<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="kd">public</span> <span class="kd">class</span> <span class="nc">MyJavaStreamTask</span> <span class="kd">implements</span> <span class="n">StreamTask</span><span class="o">,</span> <span class="n">InitableTask</span> <span class="o">{</span>
+  <span class="kd">private</span> <span class="n">Counter</span> <span class="n">messageCount</span><span class="o">;</span>
+
+  <span class="kd">public</span> <span class="kt">void</span> <span class="nf">init</span><span class="o">(</span><span class="n">Config</span> <span class="n">config</span><span class="o">,</span> <span class="n">TaskContext</span> <span class="n">context</span><span class="o">)</span> <span class="o">{</span>
+    <span class="k">this</span><span class="o">.</span><span class="na">messageCount</span> <span class="o">=</span> <span class="n">context</span>
+      <span class="o">.</span><span class="na">getMetricsRegistry</span><span class="o">()</span>
+      <span class="o">.</span><span class="na">newCounter</span><span class="o">(</span><span class="n">getClass</span><span class="o">().</span><span class="na">getName</span><span class="o">(),</span> <span class="s">&quot;message-count&quot;</span><span class="o">);</span>
+  <span class="o">}</span>
+
+  <span class="kd">public</span> <span class="kt">void</span> <span class="nf">process</span><span class="o">(</span><span class="n">IncomingMessageEnvelope</span> <span class="n">envelope</span><span class="o">,</span>
+                      <span class="n">MessageCollector</span> <span class="n">collector</span><span class="o">,</span>
+                      <span class="n">TaskCoordinator</span> <span class="n">coordinator</span><span class="o">)</span> <span class="o">{</span>
+    <span class="n">messageCount</span><span class="o">.</span><span class="na">inc</span><span class="o">();</span>
+  <span class="o">}</span>
+<span class="o">}</span></code></pre></div>
+
 <p>Samza currently supports two kinds of metrics: <a href="../api/javadocs/org/apache/samza/metrics/Counter.html">counters</a> and <a href="../api/javadocs/org/apache/samza/metrics/Gauge.html">gauges</a>. Use a counter when you want to track how often something occurs, and a gauge when you want to report the level of something, such as the size of a buffer. Each task instance (for each input stream partition) gets its own set of metrics.</p>
 
 <p>If you want to report metrics in some other way, e.g. directly to a graphing system (without going via Kafka), you can implement a <a href="../api/javadocs/org/apache/samza/metrics/MetricsReporterFactory.html">MetricsReporterFactory</a> and reference it in your job configuration.</p>

Modified: incubator/samza/site/learn/documentation/0.7.0/container/samza-container.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/learn/documentation/0.7.0/container/samza-container.html?rev=1609232&r1=1609231&r2=1609232&view=diff
==============================================================================
--- incubator/samza/site/learn/documentation/0.7.0/container/samza-container.html (original)
+++ incubator/samza/site/learn/documentation/0.7.0/container/samza-container.html Wed Jul  9 16:37:01 2014
@@ -23,6 +23,7 @@
     <link href="/css/bootstrap.min.css" rel="stylesheet"/>
     <link href="/css/font-awesome.min.css" rel="stylesheet"/>
     <link href="/css/main.css" rel="stylesheet"/>
+    <link href="/css/syntax.css" rel="stylesheet"/>
     <link rel="icon" type="image/png" href="/img/samza-icon.png">
   </head>
   <body>
@@ -142,16 +143,17 @@
 <h3 id="tasks-and-partitions">Tasks and Partitions</h3>
 
 <p>When the container starts, it creates instances of the <a href="../api/overview.html">task class</a> that you&rsquo;ve written. If the task class implements the <a href="../api/javadocs/org/apache/samza/task/InitableTask.html">InitableTask</a> interface, the SamzaContainer will also call the init() method.</p>
-<div class="highlight"><pre><code class="language-text" data-lang="text">/** Implement this if you want a callback when your task starts up. */
-public interface InitableTask {
-  void init(Config config, TaskContext context);
-}
-</code></pre></div>
+
+<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="cm">/** Implement this if you want a callback when your task starts up. */</span>
+<span class="kd">public</span> <span class="kd">interface</span> <span class="nc">InitableTask</span> <span class="o">{</span>
+  <span class="kt">void</span> <span class="nf">init</span><span class="o">(</span><span class="n">Config</span> <span class="n">config</span><span class="o">,</span> <span class="n">TaskContext</span> <span class="n">context</span><span class="o">);</span>
+<span class="o">}</span></code></pre></div>
+
 <p>How many instances of your task class are created depends on the number of partitions in the job&rsquo;s input streams. If your Samza job has ten partitions, there will be ten instantiations of your task class: one for each partition. The first task instance will receive all messages for partition one, the second instance will receive all messages for partition two, and so on.</p>
 
 <p><img src="/img/0.7.0/learn/documentation/container/tasks-and-partitions.svg" alt="Illustration of tasks consuming partitions" class="diagram-large"></p>
 
-<p>The number of partitions in the input streams is determined by the systems from which you are consuming. For example, if your input system is Kafka, you can specify the number of partitions when you create a topic.</p>
+<p>The number of partitions in the input streams is determined by the systems from which you are consuming. For example, if your input system is Kafka, you can specify the number of partitions when you create a topic from the command line, or set a default with num.partitions in Kafka&rsquo;s server properties file.</p>
 
 <p>If a Samza job has more than one input stream, the number of task instances for the Samza job is the maximum number of partitions across all input streams. For example, if a Samza job is reading from PageViewEvent (12 partitions), and ServiceMetricEvent (14 partitions), then the Samza job would have 14 task instances (numbered 0 through 13). Task instances 12 and 13 only receive events from ServiceMetricEvent, because there is no corresponding PageViewEvent partition.</p>
 
@@ -171,12 +173,27 @@ public interface InitableTask {
 
 <p>If your job has multiple input streams, Samza provides a simple but powerful mechanism for joining data from different streams: each task instance receives messages from one partition of <em>each</em> of the input streams. For example, say you have two input streams, A and B, each with four partitions. Samza creates four task instances to process them, and assigns the partitions as follows:</p>
 
-<table class="documentation">
-<tr><th>Task instance</th><th>Consumes stream partitions</th></tr>
-<tr><td>0</td><td>stream A partition 0, stream B partition 0</td></tr>
-<tr><td>1</td><td>stream A partition 1, stream B partition 1</td></tr>
-<tr><td>2</td><td>stream A partition 2, stream B partition 2</td></tr>
-<tr><td>3</td><td>stream A partition 3, stream B partition 3</td></tr>
+<table class="table table-condensed table-bordered table-striped">
+  <thead>
+    <tr>
+      <th>Task instance</th>
+      <th>Consumes stream partitions</th>
+    </tr>
+  </thead>
+  <tbody>
+    <tr>
+      <td>0</td><td>stream A partition 0, stream B partition 0</td>
+    </tr>
+    <tr>
+      <td>1</td><td>stream A partition 1, stream B partition 1</td>
+    </tr>
+    <tr>
+      <td>2</td><td>stream A partition 2, stream B partition 2</td>
+    </tr>
+    <tr>
+      <td>3</td><td>stream A partition 3, stream B partition 3</td>
+    </tr>
+  </tbody>
 </table>
 
 <p>Thus, if you want two events in different streams to be processed by the same task instance, you need to ensure they are sent to the same partition number. You can achieve this by using the same partitioning key when <a href="../api/overview.html">sending the messages</a>. Joining streams is discussed in detail in the <a href="state-management.html">state management</a> section.</p>
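<p>The assignment rule described above can be sketched in a few lines of Java. This is only an illustration, not Samza code: the class <code>PartitionAssignmentSketch</code>, its method names, and the <code>stream/partition</code> string format are all hypothetical. It assumes that the number of task instances is the maximum partition count across the input streams, and that task <em>i</em> consumes partition <em>i</em> of every stream that has such a partition.</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; this class is hypothetical and not part of the Samza API. */
final class PartitionAssignmentSketch {

  /** The number of task instances is the largest partition count among the input streams. */
  static int taskCount(Map<String, Integer> partitionsPerStream) {
    int max = 0;
    for (int partitions : partitionsPerStream.values()) {
      max = Math.max(max, partitions);
    }
    return max;
  }

  /**
   * Task instance i consumes partition i of each stream that has a partition i.
   * Streams with fewer partitions contribute nothing to the higher-numbered tasks.
   */
  static List<String> partitionsFor(int task, Map<String, Integer> partitionsPerStream) {
    List<String> assigned = new ArrayList<>();
    for (Map.Entry<String, Integer> stream : partitionsPerStream.entrySet()) {
      if (task < stream.getValue()) {
        assigned.add(stream.getKey() + "/" + task);
      }
    }
    return assigned;
  }
}
```

<p>With PageViewEvent (12 partitions) and ServiceMetricEvent (14 partitions) as inputs, <code>taskCount</code> yields 14, and task 13 is assigned only the ServiceMetricEvent partition, matching the earlier example.</p>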

Modified: incubator/samza/site/learn/documentation/0.7.0/container/serialization.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/learn/documentation/0.7.0/container/serialization.html?rev=1609232&r1=1609231&r2=1609232&view=diff
==============================================================================
--- incubator/samza/site/learn/documentation/0.7.0/container/serialization.html (original)
+++ incubator/samza/site/learn/documentation/0.7.0/container/serialization.html Wed Jul  9 16:37:01 2014
@@ -23,6 +23,7 @@
     <link href="/css/bootstrap.min.css" rel="stylesheet"/>
     <link href="/css/font-awesome.min.css" rel="stylesheet"/>
     <link href="/css/main.css" rel="stylesheet"/>
+    <link href="/css/syntax.css" rel="stylesheet"/>
     <link rel="icon" type="image/png" href="/img/samza-icon.png">
   </head>
   <body>
@@ -131,30 +132,31 @@
 </ol>
 
 <p>You can use whatever makes sense for your job; Samza doesn&rsquo;t impose any particular data model or serialization scheme on you. However, the cleanest solution is usually to use Samza&rsquo;s serde layer. The following configuration example shows how to use it.</p>
-<div class="highlight"><pre><code class="language-text" data-lang="text"># Define a system called &quot;kafka&quot;
-systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory
 
-# The job is going to consume a topic called &quot;PageViewEvent&quot; from the &quot;kafka&quot; system
-task.inputs=kafka.PageViewEvent
+<div class="highlight"><pre><code class="language-jproperties" data-lang="jproperties"><span class="c"># Define a system called &quot;kafka&quot;</span>
+<span class="na">systems.kafka.samza.factory</span><span class="o">=</span><span class="s">org.apache.samza.system.kafka.KafkaSystemFactory</span>
 
-# Define a serde called &quot;json&quot; which parses/serializes JSON objects
-serializers.registry.json.class=org.apache.samza.serializers.JsonSerdeFactory
+<span class="c"># The job is going to consume a topic called &quot;PageViewEvent&quot; from the &quot;kafka&quot; system</span>
+<span class="na">task.inputs</span><span class="o">=</span><span class="s">kafka.PageViewEvent</span>
+
+<span class="c"># Define a serde called &quot;json&quot; which parses/serializes JSON objects</span>
+<span class="na">serializers.registry.json.class</span><span class="o">=</span><span class="s">org.apache.samza.serializers.JsonSerdeFactory</span>
+
+<span class="c"># Define a serde called &quot;integer&quot; which encodes an integer as 4 binary bytes (big-endian)</span>
+<span class="na">serializers.registry.integer.class</span><span class="o">=</span><span class="s">org.apache.samza.serializers.IntegerSerdeFactory</span>
+
+<span class="c"># For messages in the &quot;PageViewEvent&quot; topic, the key (the ID of the user viewing the page)</span>
+<span class="c"># is encoded as a binary integer, and the message is encoded as JSON.</span>
+<span class="na">systems.kafka.streams.PageViewEvent.samza.key.serde</span><span class="o">=</span><span class="s">integer</span>
+<span class="na">systems.kafka.streams.PageViewEvent.samza.msg.serde</span><span class="o">=</span><span class="s">json</span>
+
+<span class="c"># Define a key-value store which stores the most recent page view for each user ID.</span>
+<span class="c"># Again, the key is an integer user ID, and the value is JSON.</span>
+<span class="na">stores.LastPageViewPerUser.factory</span><span class="o">=</span><span class="s">org.apache.samza.storage.kv.KeyValueStorageEngineFactory</span>
+<span class="na">stores.LastPageViewPerUser.changelog</span><span class="o">=</span><span class="s">kafka.last-page-view-per-user</span>
+<span class="na">stores.LastPageViewPerUser.key.serde</span><span class="o">=</span><span class="s">integer</span>
+<span class="na">stores.LastPageViewPerUser.msg.serde</span><span class="o">=</span><span class="s">json</span></code></pre></div>
 
-# Define a serde called &quot;integer&quot; which encodes an integer as 4 binary bytes (big-endian)
-serializers.registry.integer.class=org.apache.samza.serializers.IntegerSerdeFactory
-
-# For messages in the &quot;PageViewEvent&quot; topic, the key (the ID of the user viewing the page)
-# is encoded as a binary integer, and the message is encoded as JSON.
-systems.kafka.streams.PageViewEvent.samza.key.serde=integer
-systems.kafka.streams.PageViewEvent.samza.msg.serde=json
-
-# Define a key-value store which stores the most recent page view for each user ID.
-# Again, the key is an integer user ID, and the value is JSON.
-stores.LastPageViewPerUser.factory=org.apache.samza.storage.kv.KeyValueStorageEngineFactory
-stores.LastPageViewPerUser.changelog=kafka.last-page-view-per-user
-stores.LastPageViewPerUser.key.serde=integer
-stores.LastPageViewPerUser.msg.serde=json
-</code></pre></div>
 <p>Each serde is defined with a factory class. Samza comes with several built-in serdes for UTF-8 strings, binary-encoded integers, JSON (requires the samza-serializers dependency) and more. You can also create your own serializer by implementing the <a href="../api/javadocs/org/apache/samza/serializers/SerdeFactory.html">SerdeFactory</a> interface.</p>
 
 <p>The name you give to a serde (such as &ldquo;json&rdquo; and &ldquo;integer&rdquo; in the example above) is only for convenience in your job configuration; you can choose whatever name you like. For each stream and each state store, you can use the serde name to declare how messages should be serialized and deserialized.</p>
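<p>To make the &ldquo;integer&rdquo; encoding from the configuration example concrete, here is a minimal sketch of converting an integer to and from 4 big-endian bytes. It deliberately does not implement Samza&rsquo;s serde interfaces: the class <code>IntegerCodecSketch</code> and its method names are hypothetical, and the actual built-in integer serde may differ in its details.</p>

```java
import java.nio.ByteBuffer;

/** Illustrative sketch only; this class is hypothetical and not Samza's integer serde. */
final class IntegerCodecSketch {

  /** Encodes an int as 4 bytes. ByteBuffer uses big-endian byte order by default. */
  static byte[] toBytes(int value) {
    return ByteBuffer.allocate(4).putInt(value).array();
  }

  /** Decodes the first 4 bytes of the array as a big-endian int. */
  static int fromBytes(byte[] bytes) {
    return ByteBuffer.wrap(bytes).getInt();
  }
}
```

<p>For example, <code>toBytes(258)</code> produces the bytes 0, 0, 1, 2, and <code>fromBytes</code> turns them back into 258. A custom serde would wrap this kind of logic behind a factory class referenced from <code>serializers.registry.*.class</code>.</p>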

Modified: incubator/samza/site/learn/documentation/0.7.0/container/state-management.html
URL: http://svn.apache.org/viewvc/incubator/samza/site/learn/documentation/0.7.0/container/state-management.html?rev=1609232&r1=1609231&r2=1609232&view=diff
==============================================================================
--- incubator/samza/site/learn/documentation/0.7.0/container/state-management.html (original)
+++ incubator/samza/site/learn/documentation/0.7.0/container/state-management.html Wed Jul  9 16:37:01 2014
@@ -23,6 +23,7 @@
     <link href="/css/bootstrap.min.css" rel="stylesheet"/>
     <link href="/css/font-awesome.min.css" rel="stylesheet"/>
     <link href="/css/main.css" rel="stylesheet"/>
+    <link href="/css/syntax.css" rel="stylesheet"/>
     <link rel="icon" type="image/png" href="/img/samza-icon.png">
   </head>
   <body>
@@ -244,74 +245,51 @@
 <p>Samza includes an additional in-memory caching layer in front of LevelDB, which avoids the cost of deserialization for frequently-accessed objects and batches writes. If the same key is updated multiple times in quick succession, the batching coalesces those updates into a single write. The writes are flushed to the changelog when a task <a href="checkpointing.html">commits</a>.</p>
 
 <p>To use a key-value store in your job, add the following to your job config:</p>
-<div class="highlight"><pre><code class="language-text" data-lang="text"># Use the key-value store implementation for a store called &quot;my-store&quot;
-stores.my-store.factory=org.apache.samza.storage.kv.KeyValueStorageEngineFactory
 
-# Use the Kafka topic &quot;my-store-changelog&quot; as the changelog stream for this store.
-# This enables automatic recovery of the store after a failure. If you don&#39;t
-# configure this, no changelog stream will be generated.
-stores.my-store.changelog=kafka.my-store-changelog
-
-# Encode keys and values in the store as UTF-8 strings.
-serializers.registry.string.class=org.apache.samza.serializers.StringSerdeFactory
-stores.my-store.key.serde=string
-stores.my-store.msg.serde=string
-</code></pre></div>
+<div class="highlight"><pre><code class="language-jproperties" data-lang="jproperties"><span class="c"># Use the key-value store implementation for a store called &quot;my-store&quot;</span>
+<span class="na">stores.my-store.factory</span><span class="o">=</span><span class="s">org.apache.samza.storage.kv.KeyValueStorageEngineFactory</span>
+
+<span class="c"># Use the Kafka topic &quot;my-store-changelog&quot; as the changelog stream for this store.</span>
+<span class="c"># This enables automatic recovery of the store after a failure. If you don&#39;t</span>
+<span class="c"># configure this, no changelog stream will be generated.</span>
+<span class="na">stores.my-store.changelog</span><span class="o">=</span><span class="s">kafka.my-store-changelog</span>
+
+<span class="c"># Encode keys and values in the store as UTF-8 strings.</span>
+<span class="na">serializers.registry.string.class</span><span class="o">=</span><span class="s">org.apache.samza.serializers.StringSerdeFactory</span>
+<span class="na">stores.my-store.key.serde</span><span class="o">=</span><span class="s">string</span>
+<span class="na">stores.my-store.msg.serde</span><span class="o">=</span><span class="s">string</span></code></pre></div>
+
 <p>See the <a href="serialization.html">serialization section</a> for more information on the <em>serde</em> options.</p>
 
 <p>Here is a simple example that writes every incoming message to the store:</p>
-<div class="highlight"><pre><code class="language-text" data-lang="text">public class MyStatefulTask implements StreamTask, InitableTask {
-  private KeyValueStore&lt;String, String&gt; store;
 
-  public void init(Config config, TaskContext context) {
-    this.store = (KeyValueStore&lt;String, String&gt;) context.getStore(&quot;my-store&quot;);
-  }
-
-  public void process(IncomingMessageEnvelope envelope,
-                      MessageCollector collector,
-                      TaskCoordinator coordinator) {
-    store.put((String) envelope.getKey(), (String) envelope.getMessage());
-  }
-}
-</code></pre></div>
+<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="kd">public</span> <span class="kd">class</span> <span class="nc">MyStatefulTask</span> <span class="kd">implements</span> <span class="n">StreamTask</span><span class="o">,</span> <span class="n">InitableTask</span> <span class="o">{</span>
+  <span class="kd">private</span> <span class="n">KeyValueStore</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">String</span><span class="o">&gt;</span> <span class="n">store</span><span class="o">;</span>
+
+  <span class="kd">public</span> <span class="kt">void</span> <span class="nf">init</span><span class="o">(</span><span class="n">Config</span> <span class="n">config</span><span class="o">,</span> <span class="n">TaskContext</span> <span class="n">context</span><span class="o">)</span> <span class="o">{</span>
+    <span class="k">this</span><span class="o">.</span><span class="na">store</span> <span class="o">=</span> <span class="o">(</span><span class="n">KeyValueStore</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">String</span><span class="o">&gt;)</span> <span class="n">context</span><span class="o">.</span><span class="na">getStore</span><span class="o">(</span><span class="s">&quot;my-store&quot;</span><span class="o">);</span>
+  <span class="o">}</span>
+
+  <span class="kd">public</span> <span class="kt">void</span> <span class="nf">process</span><span class="o">(</span><span class="n">IncomingMessageEnvelope</span> <span class="n">envelope</span><span class="o">,</span>
+                      <span class="n">MessageCollector</span> <span class="n">collector</span><span class="o">,</span>
+                      <span class="n">TaskCoordinator</span> <span class="n">coordinator</span><span class="o">)</span> <span class="o">{</span>
+    <span class="n">store</span><span class="o">.</span><span class="na">put</span><span class="o">((</span><span class="n">String</span><span class="o">)</span> <span class="n">envelope</span><span class="o">.</span><span class="na">getKey</span><span class="o">(),</span> <span class="o">(</span><span class="n">String</span><span class="o">)</span> <span class="n">envelope</span><span class="o">.</span><span class="na">getMessage</span><span class="o">());</span>
+  <span class="o">}</span>
+<span class="o">}</span></code></pre></div>
+
 <p>Here is the complete key-value store API:</p>
-<div class="highlight"><pre><code class="language-text" data-lang="text">public interface KeyValueStore&lt;K, V&gt; {
-  V get(K key);
-  void put(K key, V value);
-  void putAll(List&lt;Entry&lt;K,V&gt;&gt; entries);
-  void delete(K key);
-  KeyValueIterator&lt;K,V&gt; range(K from, K to);
-  KeyValueIterator&lt;K,V&gt; all();
-}
-</code></pre></div>
-<p>Here is a list of additional configurations accepted by the key-value store, along with their default values:</p>
-<div class="highlight"><pre><code class="language-text" data-lang="text"># The number of writes to batch together
-stores.my-store.write.batch.size=500
-
-# The number of objects to keep in Samza&#39;s cache (in front of LevelDB).
-# This must be at least as large as write.batch.size.
-# A cache size of 0 disables all caching and batching.
-stores.my-store.object.cache.size=1000
-
-# The size of the off-heap leveldb block cache in bytes, per container.
-# If you have multiple tasks within one container, each task is given a
-# proportional share of this cache.
-stores.my-store.container.cache.size.bytes=104857600
-
-# The amount of memory leveldb uses for buffering writes before they are
-# written to disk, per container. If you have multiple tasks within one
-# container, each task is given a proportional share of this buffer.
-# This setting also determines the size of leveldb&#39;s segment files.
-stores.my-store.container.write.buffer.size.bytes=33554432
-
-# Enable block compression? (set compression=none to disable)
-stores.my-store.leveldb.compression=snappy
-
-# If compression is enabled, leveldb groups approximately this many
-# uncompressed bytes into one compressed block. You probably don&#39;t need
-# to change this unless you are a compulsive fiddler.
-stores.my-store.leveldb.block.size.bytes=4096
-</code></pre></div>
+
+<div class="highlight"><pre><code class="language-java" data-lang="java"><span class="kd">public</span> <span class="kd">interface</span> <span class="nc">KeyValueStore</span><span class="o">&lt;</span><span class="n">K</span><span class="o">,</span> <span class="n">V</span><span class="o">&gt;</span> <span class="o">{</span>
+  <span class="n">V</span> <span class="nf">get</span><span class="o">(</span><span class="n">K</span> <span class="n">key</span><span class="o">);</span>
+  <span class="kt">void</span> <span class="nf">put</span><span class="o">(</span><span class="n">K</span> <span class="n">key</span><span class="o">,</span> <span class="n">V</span> <span class="n">value</span><span class="o">);</span>
+  <span class="kt">void</span> <span class="nf">putAll</span><span class="o">(</span><span class="n">List</span><span class="o">&lt;</span><span class="n">Entry</span><span class="o">&lt;</span><span class="n">K</span><span class="o">,</span><span class="n">V</span><span class="o">&gt;&gt;</span> <span class="n">entries</span><span class="o">);</span>
+  <span class="kt">void</span> <span class="nf">delete</span><span class="o">(</span><span class="n">K</span> <span class="n">key</span><span class="o">);</span>
+  <span class="n">KeyValueIterator</span><span class="o">&lt;</span><span class="n">K</span><span class="o">,</span><span class="n">V</span><span class="o">&gt;</span> <span class="nf">range</span><span class="o">(</span><span class="n">K</span> <span class="n">from</span><span class="o">,</span> <span class="n">K</span> <span class="n">to</span><span class="o">);</span>
+  <span class="n">KeyValueIterator</span><span class="o">&lt;</span><span class="n">K</span><span class="o">,</span><span class="n">V</span><span class="o">&gt;</span> <span class="nf">all</span><span class="o">();</span>
+<span class="o">}</span></code></pre></div>
+
+<p>Additional configuration properties for the key-value store are documented in the <a href="../jobs/configuration-table.html#keyvalue">configuration reference</a>.</p>
+
 <h3 id="implementing-common-use-cases-with-the-key-value-store">Implementing common use cases with the key-value store</h3>
 
 <p>Earlier in this section we discussed some example use cases for stateful stream processing. Let&rsquo;s look at how each of these could be implemented using a key-value storage engine such as Samza&rsquo;s LevelDB.</p>