Posted to commits@samza.apache.org by jm...@apache.org on 2017/06/09 18:46:24 UTC

svn commit: r1798258 [21/22] - in /samza/site: ./ archive/ community/ contribute/ img/latest/learn/documentation/introduction/ img/latest/learn/tutorials/ img/latest/learn/tutorials/hello-samza-high-level/ learn/documentation/latest/ learn/documentatio...

Added: samza/site/learn/tutorials/latest/hello-samza-high-level-code.html
URL: http://svn.apache.org/viewvc/samza/site/learn/tutorials/latest/hello-samza-high-level-code.html?rev=1798258&view=auto
==============================================================================
--- samza/site/learn/tutorials/latest/hello-samza-high-level-code.html (added)
+++ samza/site/learn/tutorials/latest/hello-samza-high-level-code.html Fri Jun  9 18:46:20 2017
@@ -0,0 +1,583 @@
+<!DOCTYPE html>
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+<html lang="en">
+  <head>
+    <meta charset="utf-8">
+    <title>Samza - Hello Samza High Level API - Code Walkthrough</title>
+    <link href='/css/ropa-sans.css' rel='stylesheet' type='text/css'/>
+    <link href="/css/bootstrap.min.css" rel="stylesheet"/>
+    <link href="/css/font-awesome.min.css" rel="stylesheet"/>
+    <link href="/css/main.css" rel="stylesheet"/>
+    <link href="/css/syntax.css" rel="stylesheet"/>
+    <link rel="icon" type="image/png" href="/img/samza-icon.png">
+    <script src="/js/jquery-1.11.1.min.js"></script>
+  </head>
+  <body>
+    <div class="wrapper">
+      <div class="wrapper-content">
+
+        <div class="masthead">
+          <div class="container">
+            <div class="masthead-logo">
+              <a href="/" class="logo">samza</a>
+            </div>
+            <div class="masthead-icons">
+              <div class="pull-right">
+                <a href="/startup/download"><i class="fa fa-arrow-circle-o-down masthead-icon"></i></a>
+                <a href="https://git-wip-us.apache.org/repos/asf?p=samza.git;a=tree" target="_blank"><i class="fa fa-code masthead-icon" style="font-weight: bold;"></i></a>
+                <a href="https://twitter.com/samzastream" target="_blank"><i class="fa fa-twitter masthead-icon"></i></a>
+                <!-- this icon only shows in versioned pages -->
+                
+                  
+                    
+                  
+                  <a href="http://samza.apache.org/learn/tutorials/0.13/hello-samza-high-level-code.html"><i id="switch-version-button"></i></a>
+                   <!-- links for the navigation bar -->
+                
+
+              </div>
+            </div>
+          </div><!-- /.container -->
+        </div>
+
+        <div class="container">
+          <div class="menu">
+            <h1><i class="fa fa-rocket"></i> Getting Started</h1>
+            <ul>
+              <li><a href="/startup/hello-samza/latest">Hello Samza</a></li>
+              <li><a href="/startup/download">Download</a></li>
+              <li><a href="/startup/preview">Feature Preview</a></li>
+            </ul>
+
+            <h1><i class="fa fa-book"></i> Learn</h1>
+            <ul>
+              <li><a href="/learn/documentation/latest">Documentation</a></li>
+              <li><a href="/learn/documentation/latest/jobs/configuration-table.html">Configuration</a></li>
+              <li><a href="/learn/documentation/latest/container/metrics-table.html">Metrics</a></li>
+              <li><a href="/learn/documentation/latest/api/javadocs/">Javadocs</a></li>
+              <li><a href="/learn/tutorials/latest">Tutorials</a></li>
+              <li><a href="https://cwiki.apache.org/confluence/display/SAMZA/FAQ">FAQ</a></li>
+              <li><a href="https://cwiki.apache.org/confluence/display/SAMZA/Apache+Samza">Wiki</a></li>
+              <li><a href="https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=51812876">Papers &amp; Talks</a></li>
+              <li><a href="http://blogs.apache.org/samza">Blog</a></li>
+            </ul>
+
+            <h1><i class="fa fa-comments"></i> Community</h1>
+            <ul>
+              <li><a href="/community/mailing-lists.html">Mailing Lists</a></li>
+              <li><a href="/community/irc.html">IRC</a></li>
+              <li><a href="https://issues.apache.org/jira/browse/SAMZA">Bugs</a></li>
+              <li><a href="https://cwiki.apache.org/confluence/display/SAMZA/Powered+By">Powered by</a></li>
+              <li><a href="https://cwiki.apache.org/confluence/display/SAMZA/Ecosystem">Ecosystem</a></li>
+              <li><a href="/community/committers.html">Committers</a></li>
+            </ul>
+
+            <h1><i class="fa fa-code"></i> Contribute</h1>
+            <ul>
+              <li><a href="/contribute/contributors-corner.html">Contributor's Corner</a></li>
+              <li><a href="/contribute/coding-guide.html">Coding Guide</a></li>
+              <li><a href="/contribute/design-documents.html">Design Documents</a></li>
+              <li><a href="/contribute/code.html">Code</a></li>
+              <li><a href="/contribute/tests.html">Tests</a></li>
+            </ul>
+
+            <h1><i class="fa fa-history"></i> Archive</h1>
+            <ul>
+              <li><a href="/archive/index.html#latest">latest</a></li>
+              <li><a href="/archive/index.html#13">0.13</a></li>
+              <li><a href="/archive/index.html#12">0.12</a></li>
+              <li><a href="/archive/index.html#11">0.11</a></li>
+              <li><a href="/archive/index.html#10">0.10</a></li>
+              <li><a href="/archive/index.html#09">0.9</a></li>
+              <li><a href="/archive/index.html#08">0.8</a></li>
+              <li><a href="/archive/index.html#07">0.7</a></li>
+            </ul>
+          </div>
+
+          <div class="content">
+
+<h2>Hello Samza High Level API - Code Walkthrough</h2>
+
+
+<p>This tutorial introduces the high level API by showing you how to build the Wikipedia application from the <a href="hello-samza-high-level-yarn.html">hello-samza high level API Yarn tutorial</a>. Upon completion of this tutorial, you&rsquo;ll know how to implement and configure a <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/application/StreamApplication.html">StreamApplication</a>. Along the way, you&rsquo;ll see how to use some of the basic operators as well as how to leverage key-value stores and metrics in an app.</p>
+
+<p>The same <a href="https://github.com/apache/samza-hello-samza">hello-samza</a> project is used for this tutorial as for many of the others. You will clone that project and by the end of the tutorial, you will have implemented a duplicate of the <code>WikipediaApplication</code>.</p>
+
+<p>Let&rsquo;s get started.</p>
+
+<h3 id="get-the-code">Get the Code</h3>
+
+<p>Check out the hello-samza project:</p>
+
+<div class="highlight"><pre><code class="bash">git clone https://git.apache.org/samza-hello-samza.git hello-samza
+<span class="nb">cd </span>hello-samza
+git checkout latest</code></pre></div>
+
+<p>This project already contains implementations of the wikipedia application using both the low-level task API and the high-level API. The low-level task implementations are in the <code>samza.examples.wikipedia.task</code> package. The high-level application implementation is in the <code>samza.examples.wikipedia.application</code> package.</p>
+
+<p>This tutorial will provide step by step instructions to recreate the existing wikipedia application.</p>
+
+<h3 id="introduction-to-wikipedia-consumer">Introduction to Wikipedia Consumer</h3>
+
+<p>In order to consume events from Wikipedia, the hello-samza project includes a <code>WikipediaSystemFactory</code> implementation of the Samza <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/system/SystemFactory.html">SystemFactory</a> that provides a <code>WikipediaConsumer</code>.</p>
+
+<p>The WikipediaConsumer is an implementation of <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/system/SystemConsumer.html">SystemConsumer</a> that can consume events from Wikipedia. It is also a listener for events from the <code>WikipediaFeed</code>. Note that the events received in <code>onEvent</code> are of the type <code>WikipediaFeedEvent</code>, so we will expect that type for messages on our input streams. For other systems, the messages may come in the form of <code>byte[]</code>. In that case you may want to configure a Samza <a href="/learn/documentation/latest/container/serialization.html">serde</a>, and the application should expect the output type of that serde.</p>
+
+<p>Now that we understand the Wikipedia system and the types of inputs we&rsquo;ll be processing, we can proceed with creating our application.</p>
+
+<h3 id="create-the-initial-config">Create the Initial Config</h3>
+
+<p>In the hello-samza project, configs are kept in the <em>src/main/config/</em> path. This is where we will add the config for our application.
+Create a new file named <em>my-wikipedia-application.properties</em> in this location.</p>
+
+<h4 id="core-configuration">Core Configuration</h4>
+
+<p>Let&rsquo;s start by adding some of the core properties to the file:</p>
+
+<div class="highlight"><pre><code class="bash"><span class="c"># Licensed to the Apache Software Foundation (ASF) under one</span>
+<span class="c"># or more contributor license agreements.  See the NOTICE file</span>
+<span class="c"># distributed with this work for additional information</span>
+<span class="c"># regarding copyright ownership.  The ASF licenses this file</span>
+<span class="c"># to you under the Apache License, Version 2.0 (the</span>
+<span class="c"># &quot;License&quot;); you may not use this file except in compliance</span>
+<span class="c"># with the License.  You may obtain a copy of the License at</span>
+<span class="c">#</span>
+<span class="c">#   http://www.apache.org/licenses/LICENSE-2.0</span>
+<span class="c">#</span>
+<span class="c"># Unless required by applicable law or agreed to in writing,</span>
+<span class="c"># software distributed under the License is distributed on an</span>
+<span class="c"># &quot;AS IS&quot; BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY</span>
+<span class="c"># KIND, either express or implied.  See the License for the</span>
+<span class="c"># specific language governing permissions and limitations</span>
+<span class="c"># under the License.</span>
+
+app.class<span class="o">=</span>samza.examples.wikipedia.application.MyWikipediaApplication
+app.runner.class<span class="o">=</span>org.apache.samza.runtime.RemoteApplicationRunner
+
+job.factory.class<span class="o">=</span>org.apache.samza.job.yarn.YarnJobFactory
+job.name<span class="o">=</span>my-wikipedia-application
+job.default.system<span class="o">=</span>kafka
+
+yarn.package.path<span class="o">=</span>file://<span class="k">${</span><span class="nv">basedir</span><span class="k">}</span>/target/<span class="k">${</span><span class="nv">project</span><span class="p">.artifactId</span><span class="k">}</span>-<span class="k">${</span><span class="nv">pom</span><span class="p">.version</span><span class="k">}</span>-dist.tar.gz</code></pre></div>
+
+<p>Be sure to include the Apache header. The project will not compile without it. </p>
+
+<p>Here&rsquo;s a brief summary of what we configured so far.</p>
+
+<ul>
+<li><strong>app.class</strong>: the class that defines the application logic. We will create this class later.</li>
+<li><strong>app.runner.class</strong>: the runner implementation which will launch our application. Since we are using YARN, we use <code>RemoteApplicationRunner</code> which is required for any cluster-based deployment.</li>
+<li><strong>job.factory.class</strong>: the <a href="/learn/documentation/latest/jobs/job-runner.html">factory</a> that will create the runtime instances of our jobs. Since we are using YARN, we want each job to be created as a <a href="/learn/documentation/latest/jobs/yarn-jobs.html">YARN job</a>, so we use <code>YarnJobFactory</code>.</li>
+<li><strong>job.name</strong>: the primary identifier for the job.</li>
+<li><strong>job.default.system</strong>: the default system to use for input, output, and internal metadata streams. This can be overridden on a per-stream basis. The <em>kafka</em> system will be defined in the next section.</li>
+<li><strong>yarn.package.path</strong>: tells YARN where to find the <a href="/learn/documentation/latest/jobs/packaging.html">job package</a> so the Node Managers can download it.</li>
+</ul>
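+
+<p>As a quick sanity check on the properties listed above, the following is a minimal sketch that parses properties-format text and reports which of the six core keys are missing. <code>CoreConfigCheck</code> and <code>missingKeys</code> are hypothetical names introduced for illustration; they are not part of Samza, and this is not how Samza itself loads or validates config.</p>
+

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

/** Hypothetical helper, not part of Samza: reports which core keys a config lacks. */
final class CoreConfigCheck {
  // The six properties summarized in the list above.
  static final String[] REQUIRED_KEYS = {
      "app.class", "app.runner.class", "job.factory.class",
      "job.name", "job.default.system", "yarn.package.path"
  };

  /** Parses properties-format text and returns the required keys that are absent or blank. */
  static List<String> missingKeys(String propertiesText) {
    Properties props = new Properties();
    try {
      props.load(new StringReader(propertiesText));
    } catch (IOException e) {
      // Reading from an in-memory string should not fail; rethrow unchecked if it does.
      throw new UncheckedIOException(e);
    }
    List<String> missing = new ArrayList<>();
    for (String key : REQUIRED_KEYS) {
      String value = props.getProperty(key);
      if (value == null || value.trim().isEmpty()) {
        missing.add(key);
      }
    }
    return missing;
  }
}
```

+<p>Passing the full contents of <em>my-wikipedia-application.properties</em> as shown above should yield an empty list; a non-empty result names the keys still to be added.</p>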
+
+<p>These basic configurations are enough to launch the application on YARN but we haven’t defined any streaming systems for Samza to use, so the application would not process anything.</p>
+
+<p>Next, let&rsquo;s define the streaming systems with which the application will interact. </p>
+
+<h4 id="define-systems">Define Systems</h4>
+
+<p>This Wikipedia application will consume events from Wikipedia and produce stats to a Kafka topic. We need to define those systems in config before Samza can use them. Add the following lines to the config:</p>
+
+<div class="highlight"><pre><code class="bash">systems.wikipedia.samza.factory<span class="o">=</span>samza.examples.wikipedia.system.WikipediaSystemFactory
+systems.wikipedia.host<span class="o">=</span>irc.wikimedia.org
+systems.wikipedia.port<span class="o">=</span>6667
+
+systems.kafka.samza.factory<span class="o">=</span>org.apache.samza.system.kafka.KafkaSystemFactory
+systems.kafka.consumer.zookeeper.connect<span class="o">=</span>localhost:2181/
+systems.kafka.producer.bootstrap.servers<span class="o">=</span>localhost:9092
+systems.kafka.default.stream.replication.factor<span class="o">=</span>1
+systems.kafka.default.stream.samza.msg.serde<span class="o">=</span>json</code></pre></div>
+
+<p>The above configuration defines 2 systems: one called <em>wikipedia</em> and one called <em>kafka</em>.</p>
+
+<p>A factory is required for each system, so the <em>systems.system-name.samza.factory</em> property is required for both systems. The other properties are system and use-case specific.</p>
+
+<p>For the <em>kafka</em> system, we set the default replication factor to 1 for all streams because this application is intended for a demo deployment which utilizes a Kafka cluster with only 1 broker, so a replication factor larger than 1 is invalid. The default serde is JSON, which means by default any streams consumed from or produced to the <em>kafka</em> system will use a <em>json</em> serde, which we will define in the next section.</p>
+
+<p>The <em>wikipedia</em> system does not need a serde because the <code>WikipediaConsumer</code> already produces a usable type.</p>
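+
+<p>The rule that every system needs a factory can be illustrated with a small check. This is a sketch only: <code>SystemFactoryCheck</code> is a hypothetical helper, not a Samza class, and it assumes system names contain no dots.</p>
+

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical helper, not part of Samza: finds systems that lack a factory entry. */
final class SystemFactoryCheck {
  private static final String PREFIX = "systems.";
  private static final String FACTORY_SUFFIX = ".samza.factory";

  /** Returns the names of systems mentioned in the config without a systems.NAME.samza.factory key. */
  static Set<String> systemsWithoutFactory(Map<String, String> config) {
    Set<String> declared = new TreeSet<>();
    Set<String> withFactory = new TreeSet<>();
    for (String key : config.keySet()) {
      if (!key.startsWith(PREFIX)) {
        continue;
      }
      String rest = key.substring(PREFIX.length());
      int dot = rest.indexOf('.');
      if (dot <= 0) {
        continue; // no system name segment to extract
      }
      String name = rest.substring(0, dot); // assumes the system name has no dots
      declared.add(name);
      if (key.equals(PREFIX + name + FACTORY_SUFFIX)) {
        withFactory.add(name);
      }
    }
    declared.removeAll(withFactory);
    return declared;
  }
}
```

+<p>For the config above, both <em>wikipedia</em> and <em>kafka</em> have a factory, so the returned set would be empty.</p>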
+
+<h4 id="serdes">Serdes</h4>
+
+<p>Next, we need to configure the <a href="/learn/documentation/latest/container/serialization.html">serdes</a> we will use for streams and stores in the application.</p>
+
+<div class="highlight"><pre><code class="bash">serializers.registry.json.class<span class="o">=</span>org.apache.samza.serializers.JsonSerdeFactory
+serializers.registry.string.class<span class="o">=</span>org.apache.samza.serializers.StringSerdeFactory
+serializers.registry.integer.class<span class="o">=</span>org.apache.samza.serializers.IntegerSerdeFactory</code></pre></div>
+
+<p>The <em>json</em> serde was used for the <em>kafka</em> system above. The <em>string</em> and <em>integer</em> serdes will be used later.</p>
+
+<h4 id="configure-streams">Configure Streams</h4>
+
+<p>Samza identifies streams using a unique stream ID. In most cases, the stream ID is the same as the actual stream name. However, if a stream has a name that doesn&rsquo;t match the pattern <code>[A-Za-z0-9_-]+</code>, we need to configure a separate <em>physical.name</em> to associate the actual stream name with a legal stream ID. The Wikipedia channels we will consume have a &lsquo;#&rsquo; character in the names. So for each of them we must pick a legal stream ID and then configure the physical name to match the channel.</p>
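+
+<p>To see why the Wikipedia channel names cannot be used directly as stream IDs, here is a minimal sketch that tests a candidate against the pattern quoted above. <code>StreamIdCheck</code> is a hypothetical name for illustration and is not Samza&rsquo;s own validation code.</p>
+

```java
import java.util.regex.Pattern;

/** Hypothetical helper: checks a candidate stream ID against the pattern from the text. */
final class StreamIdCheck {
  // Letters, digits, underscore and hyphen only; at least one character.
  private static final Pattern LEGAL_ID = Pattern.compile("[A-Za-z0-9_-]+");

  static boolean isLegalStreamId(String candidate) {
    return candidate != null && LEGAL_ID.matcher(candidate).matches();
  }
}
```

+<p>With this check, <code>en-wikipedia</code> is accepted, while <code>#en.wikipedia</code> is rejected because of the &lsquo;#&rsquo; and &lsquo;.&rsquo; characters.</p>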
+
+<p>Samza uses the <em>job.default.system</em> for any streams that do not explicitly specify a system. In the previous sections, we defined 2 systems, <em>wikipedia</em> and <em>kafka</em>, and we configured <em>kafka</em> as the default. To understand why, let&rsquo;s look at the streams and how Samza will use them.</p>
+
+<p>For this app, Samza will:</p>
+
+<ol>
+<li>Consume from input streams</li>
+<li>Produce to an output stream and a metrics stream</li>
+<li>Both produce and consume from job-coordination, checkpoint, and changelog streams</li>
+</ol>
+
+<p>While the <em>wikipedia</em> system is necessary for case 1, it does not support producers (we can&rsquo;t write Samza output to Wikipedia), which are needed for cases 2-3. So it is more convenient to use <em>kafka</em> as the default system. We can then explicitly configure the input streams to use the <em>wikipedia</em> system.</p>
+
+<div class="highlight"><pre><code class="bash">streams.en-wikipedia.samza.system<span class="o">=</span>wikipedia
+streams.en-wikipedia.samza.physical.name<span class="o">=</span><span class="c">#en.wikipedia</span>
+
+streams.en-wiktionary.samza.system<span class="o">=</span>wikipedia
+streams.en-wiktionary.samza.physical.name<span class="o">=</span><span class="c">#en.wiktionary</span>
+
+streams.en-wikinews.samza.system<span class="o">=</span>wikipedia
+streams.en-wikinews.samza.physical.name<span class="o">=</span><span class="c">#en.wikinews</span></code></pre></div>
+
+<p>The above configuration declares 3 streams with the IDs <em>en-wikipedia</em>, <em>en-wiktionary</em>, and <em>en-wikinews</em>. It associates each stream with the <em>wikipedia</em> system we defined earlier and sets the physical name to the corresponding Wikipedia channel.</p>
+
+<p>Since all the Kafka streams for cases 2-3 are on the default system and do not include special characters in their names, we do not need to configure them explicitly.</p>
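+
+<p>The fallback behavior described in this section can be sketched as a simple lookup: use the per-stream settings when present, otherwise fall back to <em>job.default.system</em> and to the stream ID itself. <code>StreamResolver</code> is a hypothetical class written for this illustration, and the stream ID <code>some-output</code> in the note below is likewise made up; Samza&rsquo;s real resolution logic may differ in detail.</p>
+

```java
import java.util.Map;

/** Hypothetical sketch of how a stream ID maps to a system and a physical name. */
final class StreamResolver {
  /** Returns a two-element array: {system, physicalName}. */
  static String[] resolve(Map<String, String> config, String streamId) {
    // An explicit per-stream system wins; otherwise use the job-wide default.
    String system = config.getOrDefault(
        "streams." + streamId + ".samza.system",
        config.get("job.default.system"));
    // An explicit physical name wins; otherwise the stream ID doubles as the name.
    String physicalName = config.getOrDefault(
        "streams." + streamId + ".samza.physical.name",
        streamId);
    return new String[] {system, physicalName};
  }
}
```

+<p>Given the config built so far, resolving <code>en-wikipedia</code> gives the <em>wikipedia</em> system and the physical name <code>#en.wikipedia</code>, while an unconfigured ID such as <code>some-output</code> resolves to the <em>kafka</em> system under its own name.</p>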
+
+<h3 id="create-a-streamapplication">Create a StreamApplication</h3>
+
+<p>With the core configuration settled, we turn our attention to code.</p>
+
+<h3 id="define-application-logic">Define Application Logic</h3>
+
+<p>Let&rsquo;s create the application class we configured above. The next 8 sections walk you through writing the code for the Wikipedia application.</p>
+
+<p>Create a new class named <code>MyWikipediaApplication</code> in the <code>samza.examples.wikipedia.application</code> package. The class must implement <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/application/StreamApplication.html">StreamApplication</a> and should look like this:</p>
+
+<div class="highlight"><pre><code class="java"><span class="cm">/*</span>
+<span class="cm"> * Licensed to the Apache Software Foundation (ASF) under one</span>
+<span class="cm"> * or more contributor license agreements.  See the NOTICE file</span>
+<span class="cm"> * distributed with this work for additional information</span>
+<span class="cm"> * regarding copyright ownership.  The ASF licenses this file</span>
+<span class="cm"> * to you under the Apache License, Version 2.0 (the</span>
+<span class="cm"> * &quot;License&quot;); you may not use this file except in compliance</span>
+<span class="cm"> * with the License.  You may obtain a copy of the License at</span>
+<span class="cm"> *</span>
+<span class="cm"> *   http://www.apache.org/licenses/LICENSE-2.0</span>
+<span class="cm"> *</span>
+<span class="cm"> * Unless required by applicable law or agreed to in writing,</span>
+<span class="cm"> * software distributed under the License is distributed on an</span>
+<span class="cm"> * &quot;AS IS&quot; BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY</span>
+<span class="cm"> * KIND, either express or implied.  See the License for the</span>
+<span class="cm"> * specific language governing permissions and limitations</span>
+<span class="cm"> * under the License.</span>
+<span class="cm"> */</span>
+<span class="kn">package</span> <span class="n">samza</span><span class="o">.</span><span class="na">examples</span><span class="o">.</span><span class="na">wikipedia</span><span class="o">.</span><span class="na">application</span><span class="o">;</span>
+
+<span class="kn">import</span> <span class="nn">org.apache.samza.application.StreamApplication</span><span class="o">;</span>
+<span class="kn">import</span> <span class="nn">org.apache.samza.config.Config</span><span class="o">;</span>
+<span class="kn">import</span> <span class="nn">org.apache.samza.operators.StreamGraph</span><span class="o">;</span>
+
+<span class="kd">public</span> <span class="kd">class</span> <span class="nc">MyWikipediaApplication</span> <span class="kd">implements</span> <span class="n">StreamApplication</span> <span class="o">{</span>
+  <span class="nd">@Override</span>
+  <span class="kd">public</span> <span class="kt">void</span> <span class="nf">init</span><span class="o">(</span><span class="n">StreamGraph</span> <span class="n">streamGraph</span><span class="o">,</span> <span class="n">Config</span> <span class="n">config</span><span class="o">)</span> <span class="o">{</span>
+    
+  <span class="o">}</span>
+<span class="o">}</span></code></pre></div>
+
+<p>Be sure to include the Apache header. The project will not compile without it.</p>
+
+<p>The <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/application/StreamApplication.html#init-org.apache.samza.operators.StreamGraph-org.apache.samza.config.Config-">init</a> method is where the application logic is defined. The <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/config/Config.html">Config</a> argument is the runtime configuration loaded from the properties file we defined earlier. The <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/operators/StreamGraph.html">StreamGraph</a> argument provides methods to declare input streams. You can then invoke a number of flexible operations on those streams. The result of each operation is another stream, so you can keep chaining more operations or direct the result to an output stream.</p>
+
+<p>Next, we will declare the input streams for the Wikipedia application.</p>
+
+<h4 id="inputs">Inputs</h4>
+
+<p>The Wikipedia application consumes events from three channels. Let&rsquo;s declare each of those channels as an input stream via the <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/operators/StreamGraph.html">StreamGraph</a> in the <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/application/StreamApplication.html#init-org.apache.samza.operators.StreamGraph-org.apache.samza.config.Config-">init</a> method.</p>
+
+<div class="highlight"><pre><code class="java"><span class="n">MessageStream</span><span class="o">&lt;</span><span class="n">WikipediaFeedEvent</span><span class="o">&gt;</span> <span class="n">wikipediaEvents</span> <span class="o">=</span> <span class="n">streamGraph</span><span class="o">.</span><span class="na">getInputStream</span><span class="o">(</span><span class="s">&quot;en-wikipedia&quot;</span><span class="o">,</span> <span class="o">(</span><span class="n">k</span><span class="o">,</span> <span class="n">v</span><span class="o">)</span> <span class="o">-&gt;</span> <span class="o">(</span><span class="n">WikipediaFeedEvent</span><span class="o">)</span> <span class="n">v</span><span class="o">);</span>
+<span class="n">MessageStream</span><span class="o">&lt;</span><span class="n">WikipediaFeedEvent</span><span class="o">&gt;</span> <span class="n">wiktionaryEvents</span> <span class="o">=</span> <span class="n">streamGraph</span><span class="o">.</span><span class="na">getInputStream</span><span class="o">(</span><span class="s">&quot;en-wiktionary&quot;</span><span class="o">,</span> <span class="o">(</span><span class="n">k</span><span class="o">,</span> <span class="n">v</span><span class="o">)</span> <span class="o">-&gt;</span> <span class="o">(</span><span class="n">WikipediaFeedEvent</span><span class="o">)</span> <span class="n">v</span><span class="o">);</span>
+<span class="n">MessageStream</span><span class="o">&lt;</span><span class="n">WikipediaFeedEvent</span><span class="o">&gt;</span> <span class="n">wikiNewsEvents</span> <span class="o">=</span> <span class="n">streamGraph</span><span class="o">.</span><span class="na">getInputStream</span><span class="o">(</span><span class="s">&quot;en-wikinews&quot;</span><span class="o">,</span> <span class="o">(</span><span class="n">k</span><span class="o">,</span> <span class="n">v</span><span class="o">)</span> <span class="o">-&gt;</span> <span class="o">(</span><span class="n">WikipediaFeedEvent</span><span class="o">)</span> <span class="n">v</span><span class="o">);</span></code></pre></div>
+
+<p>The first argument to the <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/operators/StreamGraph.html#getInputStream-java.lang.String-java.util.function.BiFunction-">getInputStream</a> method is the stream ID. Each ID must match the corresponding stream IDs we configured earlier.</p>
+
+<p>The second argument is the <em>message builder</em>. It converts the input key and message to the appropriate type. In this case, we don&rsquo;t have a key and want to send the events as-is, so we have a very simple builder that just forwards the input value.</p>
+
+<p>Note the streams are all MessageStreams of type WikipediaFeedEvent. <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/operators/MessageStream.html">MessageStream</a> is the in-memory representation of a stream in Samza. It uses generics to ensure type safety across the streams and operations. We knew the WikipediaFeedEvent type by inspecting the WikipediaConsumer above and we made it explicit with the cast on the output of the MessageBuilder. If our inputs used a serde, we would know the type based on which serde is configured for the input streams.</p>
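+
+<p>The message builder is an ordinary two-argument function, so its behavior can be tried in isolation. The sketch below uses <code>java.util.function.BiFunction</code> with a stand-in <code>FakeEvent</code> class (an assumption made so the example compiles without the hello-samza project); it shows that the builder ignores the key and that the cast fails fast if the value has an unexpected type.</p>
+

```java
import java.util.function.BiFunction;

/** Stand-alone illustration of a pass-through message builder. */
final class MessageBuilderSketch {
  /** Stand-in for WikipediaFeedEvent; the real class lives in the hello-samza project. */
  static final class FakeEvent {
    final String rawEvent;

    FakeEvent(String rawEvent) {
      this.rawEvent = rawEvent;
    }
  }

  // Same shape as the lambda passed to getInputStream: ignore the key, cast the value.
  static final BiFunction<Object, Object, FakeEvent> BUILDER = (k, v) -> (FakeEvent) v;

  static FakeEvent build(Object key, Object value) {
    return BUILDER.apply(key, value);
  }
}
```

+<p>Calling <code>build(null, event)</code> returns the same <code>event</code> instance, whereas passing a value of another type raises a <code>ClassCastException</code>.</p>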
+
+<h4 id="merge">Merge</h4>
+
+<p>We&rsquo;d like to use the same processing logic for all three input streams, so we will use the <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/operators/MessageStream.html#mergeAll-java.util.Collection-">mergeAll</a> operator to merge them together. Note: this is not the same as a <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/operators/MessageStream.html#join-org.apache.samza.operators.MessageStream-org.apache.samza.operators.functions.JoinFunction-java.time.Duration-">join</a> because we are not associating events by key. We are simply combining three streams into one, like a union.</p>
+
+<p>Add the following snippet to the <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/application/StreamApplication.html#init-org.apache.samza.operators.StreamGraph-org.apache.samza.config.Config-">init</a> method. It merges all the input streams into a new one called <em>allWikipediaEvents</em>:</p>
+
+<div class="highlight"><pre><code class="java"><span class="n">MessageStream</span><span class="o">&lt;</span><span class="n">WikipediaFeed</span><span class="o">.</span><span class="na">WikipediaFeedEvent</span><span class="o">&gt;</span> <span class="n">allWikipediaEvents</span> <span class="o">=</span> <span class="n">MessageStream</span><span class="o">.</span><span class="na">mergeAll</span><span class="o">(</span><span class="n">ImmutableList</span><span class="o">.</span><span class="na">of</span><span class="o">(</span><span class="n">wikipediaEvents</span><span class="o">,</span> <span class="n">wiktionaryEvents</span><span class="o">,</span> <span class="n">wikiNewsEvents</span><span class="o">));</span></code></pre></div>
+
+<p>Note there is a <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/operators/MessageStream.html#merge-java.util.Collection-">merge</a> operator instance method on MessageStream, but the static <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/operators/MessageStream.html#mergeAll-java.util.Collection-">mergeAll</a> method is a more convenient alternative if you need to merge many streams.</p>
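+
+<p>To make the union-versus-join distinction concrete, here is a sketch that merges plain in-memory lists. It is an analogy only: <code>MergeSketch</code> is a hypothetical class, lists are finite and are concatenated in order, whereas a merged Samza stream carries messages from all inputs as they arrive. What the two share is that every input element appears in the output and no matching by key takes place.</p>
+

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical list-based analogy for merging streams. */
final class MergeSketch {
  /** Returns one list holding every element of every input; nothing is matched by key. */
  static <T> List<T> mergeAll(List<List<T>> inputs) {
    List<T> merged = new ArrayList<>();
    for (List<T> input : inputs) {
      merged.addAll(input);
    }
    return merged;
  }
}
```
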
+
+<h4 id="parse">Parse</h4>
+
+<p>The next step is to parse the events and extract some information. We will use the pre-existing <code>WikipediaParser.parseEvent()</code> method to do this. The parser extracts some flags we want to monitor as well as some metadata about the event. Inspect the method signature. The input is a <code>WikipediaFeedEvent</code> and the output is a <code>Map&lt;String, Object&gt;</code>. These types will be reflected in the types of the streams before and after the operation.</p>
+
+<p>In the <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/application/StreamApplication.html#init-org.apache.samza.operators.StreamGraph-org.apache.samza.config.Config-">init</a> method, invoke the <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/operators/MessageStream.html#map-org.apache.samza.operators.functions.MapFunction-">map</a> operation on <code>allWikipediaEvents</code>, passing the <code>WikipediaParser::parseEvent</code> method reference as follows:</p>
+
+<div class="highlight"><pre><code class="java"><span class="n">allWikipediaEvents</span><span class="o">.</span><span class="na">map</span><span class="o">(</span><span class="nl">WikipediaParser:</span><span class="o">:</span><span class="n">parseEvent</span><span class="o">);</span></code></pre></div>
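+
+<p>The way a method reference changes the element type can be seen with ordinary Java streams. In the sketch below, <code>ParseSketch.parseEvent</code> is a hypothetical stand-in that splits a <code>title|byteDiff</code> string; the input format and the map keys <code>title</code> and <code>diff-bytes</code> are assumptions for this example and do not describe the real <code>WikipediaParser</code>.</p>
+

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical stand-in showing map() with a method reference. */
final class ParseSketch {
  /** Splits "title|byteDiff" into a map. The real WikipediaParser works differently. */
  static Map<String, Object> parseEvent(String raw) {
    String[] parts = raw.split("\\|", 2);
    Map<String, Object> parsed = new HashMap<>();
    parsed.put("title", parts[0]);
    parsed.put("diff-bytes", Integer.parseInt(parts[1]));
    return parsed;
  }

  /** A list of String goes in, a list of Map<String, Object> comes out. */
  static List<Map<String, Object>> parseAll(List<String> rawEvents) {
    return rawEvents.stream()
        .map(ParseSketch::parseEvent)
        .collect(Collectors.toList());
  }
}
```

+<p>The same type change applies in the application: mapping a stream of <code>WikipediaFeedEvent</code> through the parser produces a stream of <code>Map&lt;String, Object&gt;</code>.</p>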
+
+<h4 id="window">Window</h4>
+
+<p>Now that we have the relevant information extracted, let&rsquo;s perform some aggregations over a 10-second <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/operators/windows/Window.html">window</a>.</p>
+
+<p>First, we need a container class for statistics we want to track. Add the following static class after the <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/application/StreamApplication.html#init-org.apache.samza.operators.StreamGraph-org.apache.samza.config.Config-">init</a> method.</p>
+
+<div class="highlight"><pre><code class="java"><span class="kd">private</span> <span class="kd">static</span> <span class="kd">class</span> <span class="nc">WikipediaStats</span> <span class="o">{</span>
+  <span class="kt">int</span> <span class="n">edits</span> <span class="o">=</span> <span class="mi">0</span><span class="o">;</span>
+  <span class="kt">int</span> <span class="n">byteDiff</span> <span class="o">=</span> <span class="mi">0</span><span class="o">;</span>
+  <span class="n">Set</span><span class="o">&lt;</span><span class="n">String</span><span class="o">&gt;</span> <span class="n">titles</span> <span class="o">=</span> <span class="k">new</span> <span class="n">HashSet</span><span class="o">&lt;</span><span class="n">String</span><span class="o">&gt;();</span>
+  <span class="n">Map</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Integer</span><span class="o">&gt;</span> <span class="n">counts</span> <span class="o">=</span> <span class="k">new</span> <span class="n">HashMap</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Integer</span><span class="o">&gt;();</span>
+<span class="o">}</span></code></pre></div>
+
+<p>Now we need to define the logic to aggregate the stats over the duration of the window. To do this, we implement <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/operators/functions/FoldLeftFunction.html">FoldLeftFunction</a> by adding the following class after the <code>WikipediaStats</code> class:</p>
+
+<div class="highlight"><pre><code class="java"><span class="kd">private</span> <span class="kd">class</span> <span class="nc">WikipediaStatsAggregator</span> <span class="kd">implements</span> <span class="n">FoldLeftFunction</span><span class="o">&lt;</span><span class="n">Map</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Object</span><span class="o">&gt;,</span> <span class="n">WikipediaStats</span><span class="o">&gt;</span> <span class="o">{</span>
+
+  <span class="nd">@Override</span>
+  <span class="kd">public</span> <span class="n">WikipediaStats</span> <span class="nf">apply</span><span class="o">(</span><span class="n">Map</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Object</span><span class="o">&gt;</span> <span class="n">edit</span><span class="o">,</span> <span class="n">WikipediaStats</span> <span class="n">stats</span><span class="o">)</span> <span class="o">{</span>
+    <span class="c1">// Update window stats</span>
+    <span class="n">stats</span><span class="o">.</span><span class="na">edits</span><span class="o">++;</span>
+    <span class="n">stats</span><span class="o">.</span><span class="na">byteDiff</span> <span class="o">+=</span> <span class="o">(</span><span class="n">Integer</span><span class="o">)</span> <span class="n">edit</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="s">&quot;diff-bytes&quot;</span><span class="o">);</span>
+    <span class="n">stats</span><span class="o">.</span><span class="na">titles</span><span class="o">.</span><span class="na">add</span><span class="o">((</span><span class="n">String</span><span class="o">)</span> <span class="n">edit</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="s">&quot;title&quot;</span><span class="o">));</span>
+
+    <span class="n">Map</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Boolean</span><span class="o">&gt;</span> <span class="n">flags</span> <span class="o">=</span> <span class="o">(</span><span class="n">Map</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Boolean</span><span class="o">&gt;)</span> <span class="n">edit</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="s">&quot;flags&quot;</span><span class="o">);</span>
+    <span class="k">for</span> <span class="o">(</span><span class="n">Map</span><span class="o">.</span><span class="na">Entry</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Boolean</span><span class="o">&gt;</span> <span class="n">flag</span> <span class="o">:</span> <span class="n">flags</span><span class="o">.</span><span class="na">entrySet</span><span class="o">())</span> <span class="o">{</span>
+      <span class="k">if</span> <span class="o">(</span><span class="n">Boolean</span><span class="o">.</span><span class="na">TRUE</span><span class="o">.</span><span class="na">equals</span><span class="o">(</span><span class="n">flag</span><span class="o">.</span><span class="na">getValue</span><span class="o">()))</span> <span class="o">{</span>
+        <span class="n">stats</span><span class="o">.</span><span class="na">counts</span><span class="o">.</span><span class="na">compute</span><span class="o">(</span><span class="n">flag</span><span class="o">.</span><span class="na">getKey</span><span class="o">(),</span> <span class="o">(</span><span class="n">k</span><span class="o">,</span> <span class="n">v</span><span class="o">)</span> <span class="o">-&gt;</span> <span class="n">v</span> <span class="o">==</span> <span class="kc">null</span> <span class="o">?</span> <span class="mi">1</span> <span class="o">:</span> <span class="n">v</span> <span class="o">+</span> <span class="mi">1</span><span class="o">);</span>
+      <span class="o">}</span>
+    <span class="o">}</span>
+
+    <span class="k">return</span> <span class="n">stats</span><span class="o">;</span>
+  <span class="o">}</span>
+<span class="o">}</span></code></pre></div>
+
+<p>Note: the type parameters for <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/operators/functions/FoldLeftFunction.html">FoldLeftFunction</a> reflect the upstream data type and the window value type, respectively. In our case, the upstream type is the output of the parser and the window value is our <code>WikipediaStats</code> class.</p>
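+
+<p>The fold itself can be pictured with a small plain-Java sketch. This is only an illustration of left-fold semantics under assumed names (<code>foldLeft</code>, <code>Stats</code>, <code>update</code>, <code>edit</code>); it is not the Samza <code>FoldLeftFunction</code> interface, and Samza drives the calls for you at runtime.</p>
+
```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Supplier;

// Plain-Java sketch of a left fold; not the Samza FoldLeftFunction interface.
class FoldLeftSketch {
  // Hypothetical, trimmed-down window value.
  static final class Stats {
    int edits = 0;
    int byteDiff = 0;
  }

  // Applies the folder to each message in arrival order, threading the
  // accumulated value through, starting from a fresh initial value.
  static <M, WV> WV foldLeft(List<M> messages, Supplier<WV> initialValue, BiFunction<M, WV, WV> folder) {
    WV value = initialValue.get();
    for (M message : messages) {
      value = folder.apply(message, value);
    }
    return value;
  }

  // Same argument order as the aggregator above: (message, current value).
  static Stats update(Map<String, Object> edit, Stats stats) {
    stats.edits++;
    stats.byteDiff += (Integer) edit.get("diff-bytes");
    return stats;
  }

  // Builds a minimal parsed edit containing only the field used above.
  static Map<String, Object> edit(int diffBytes) {
    Map<String, Object> edit = new HashMap<>();
    edit.put("diff-bytes", diffBytes);
    return edit;
  }

  public static void main(String[] args) {
    Stats stats = foldLeft(Arrays.asList(edit(10), edit(-4)), Stats::new, FoldLeftSketch::update);
    System.out.println(stats.edits + " edits, " + stats.byteDiff + " bytes");
  }
}
```
+
+<p>Here <code>M</code> corresponds to the upstream type and <code>WV</code> to the window value type, which mirrors the two type parameters described in the note.</p>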
+
+<p>Finally, we can define our <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/operators/windows/Window.html">window</a> back in the <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/application/StreamApplication.html#init-org.apache.samza.operators.StreamGraph-org.apache.samza.config.Config-">init</a> method by chaining the result of the parser:</p>
+
+<div class="highlight"><pre><code class="java"><span class="n">allWikipediaEvents</span><span class="o">.</span><span class="na">map</span><span class="o">(</span><span class="nl">WikipediaParser:</span><span class="o">:</span><span class="n">parseEvent</span><span class="o">)</span>
+        <span class="o">.</span><span class="na">window</span><span class="o">(</span><span class="n">Windows</span><span class="o">.</span><span class="na">tumblingWindow</span><span class="o">(</span><span class="n">Duration</span><span class="o">.</span><span class="na">ofSeconds</span><span class="o">(</span><span class="mi">10</span><span class="o">),</span> <span class="nl">WikipediaStats:</span><span class="o">:</span><span class="k">new</span><span class="o">,</span> <span class="k">new</span> <span class="n">WikipediaStatsAggregator</span><span class="o">()));</span></code></pre></div>
+
+<p>This defines an unkeyed <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/operators/windows/Windows.html">tumbling window</a> that spans 10s, which instantiates a new <code>WikipediaStats</code> object at the beginning of each window and aggregates the stats using <code>WikipediaStatsAggregator</code>.</p>
+
+<p>The output of the window is a <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/operators/windows/WindowPane.html">WindowPane</a> with a key and value. Since we used an unkeyed tumbling window, the key is <code>Void</code>. The value is our <code>WikipediaStats</code> object.</p>
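+
+<p>As a rough mental model, a tumbling window assigns each message to exactly one fixed-size, non-overlapping bucket. The plain-Java sketch below shows that bucketing with explicit timestamps. The timestamps and the names <code>windowStart</code> and <code>countPerWindow</code> are assumptions made for illustration; how Samza measures time and triggers windows is decided by the framework, not by this code.</p>
+
```java
import java.util.Map;
import java.util.TreeMap;

// Plain-Java sketch of tumbling-window bucketing; not how Samza implements it.
class TumblingWindowSketch {
  // Start of the window that contains the timestamp. Windows cover
  // [start, start + size) and never overlap.
  static long windowStart(long timestampMs, long windowSizeMs) {
    return Math.floorDiv(timestampMs, windowSizeMs) * windowSizeMs;
  }

  // Counts events per window, keyed by window start time.
  static Map<Long, Integer> countPerWindow(long[] timestampsMs, long windowSizeMs) {
    Map<Long, Integer> counts = new TreeMap<>();
    for (long ts : timestampsMs) {
      counts.merge(windowStart(ts, windowSizeMs), 1, Integer::sum);
    }
    return counts;
  }

  public static void main(String[] args) {
    long[] timestamps = {1_000L, 9_999L, 10_000L, 25_000L};
    System.out.println(countPerWindow(timestamps, 10_000L));
  }
}
```
+
+<p>With a 10-second window size, the first two timestamps fall into the window starting at 0, while the other two land in later windows. Each bucket would receive its own fresh <code>WikipediaStats</code> value in the tutorial application.</p>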
+
+<h4 id="output">Output</h4>
+
+<p>We want to use a JSON serializer to output the window values to Kafka, so we will do one more <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/operators/MessageStream.html#map-org.apache.samza.operators.functions.MapFunction-">map</a> to format the output.</p>
+
+<p>First, let&rsquo;s define the method to format the stats as a <code>Map&lt;String, Integer&gt;</code> so the <em>json</em> serde can handle it. Paste the following after the aggregator class:</p>
+
+<div class="highlight"><pre><code class="java"><span class="kd">private</span> <span class="n">Map</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Integer</span><span class="o">&gt;</span> <span class="n">formatOutput</span><span class="o">(</span><span class="n">WindowPane</span><span class="o">&lt;</span><span class="n">Void</span><span class="o">,</span> <span class="n">WikipediaStats</span><span class="o">&gt;</span> <span class="n">statsWindowPane</span><span class="o">)</span> <span class="o">{</span>
+  <span class="n">WikipediaStats</span> <span class="n">stats</span> <span class="o">=</span> <span class="n">statsWindowPane</span><span class="o">.</span><span class="na">getMessage</span><span class="o">();</span>
+
+  <span class="n">Map</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Integer</span><span class="o">&gt;</span> <span class="n">counts</span> <span class="o">=</span> <span class="k">new</span> <span class="n">HashMap</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Integer</span><span class="o">&gt;(</span><span class="n">stats</span><span class="o">.</span><span class="na">counts</span><span class="o">);</span>
+  <span class="n">counts</span><span class="o">.</span><span class="na">put</span><span class="o">(</span><span class="s">&quot;edits&quot;</span><span class="o">,</span> <span class="n">stats</span><span class="o">.</span><span class="na">edits</span><span class="o">);</span>
+  <span class="n">counts</span><span class="o">.</span><span class="na">put</span><span class="o">(</span><span class="s">&quot;bytes-added&quot;</span><span class="o">,</span> <span class="n">stats</span><span class="o">.</span><span class="na">byteDiff</span><span class="o">);</span>
+  <span class="n">counts</span><span class="o">.</span><span class="na">put</span><span class="o">(</span><span class="s">&quot;unique-titles&quot;</span><span class="o">,</span> <span class="n">stats</span><span class="o">.</span><span class="na">titles</span><span class="o">.</span><span class="na">size</span><span class="o">());</span>
+
+  <span class="k">return</span> <span class="n">counts</span><span class="o">;</span>
+<span class="o">}</span></code></pre></div>
+
+<p>Now, we can invoke the method by adding another <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/operators/MessageStream.html#map-org.apache.samza.operators.functions.MapFunction-">map</a> operation to the chain in <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/application/StreamApplication.html#init-org.apache.samza.operators.StreamGraph-org.apache.samza.config.Config-">init</a>. The operator chain should now look like this:</p>
+
+<div class="highlight"><pre><code class="java"><span class="n">allWikipediaEvents</span><span class="o">.</span><span class="na">map</span><span class="o">(</span><span class="nl">WikipediaParser:</span><span class="o">:</span><span class="n">parseEvent</span><span class="o">)</span>
+        <span class="o">.</span><span class="na">window</span><span class="o">(</span><span class="n">Windows</span><span class="o">.</span><span class="na">tumblingWindow</span><span class="o">(</span><span class="n">Duration</span><span class="o">.</span><span class="na">ofSeconds</span><span class="o">(</span><span class="mi">10</span><span class="o">),</span> <span class="nl">WikipediaStats:</span><span class="o">:</span><span class="k">new</span><span class="o">,</span> <span class="k">new</span> <span class="n">WikipediaStatsAggregator</span><span class="o">()))</span>
+        <span class="o">.</span><span class="na">map</span><span class="o">(</span><span class="k">this</span><span class="o">::</span><span class="n">formatOutput</span><span class="o">);</span></code></pre></div>
+
+<p>Next we need to get the output stream to which we will send the stats. Insert the following line below the creation of the 3 input streams:</p>
+
+<div class="highlight"><pre><code class="java"><span class="n">OutputStream</span><span class="o">&lt;</span><span class="n">Void</span><span class="o">,</span> <span class="n">Map</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Integer</span><span class="o">&gt;,</span> <span class="n">Map</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Integer</span><span class="o">&gt;&gt;</span>
+        <span class="n">wikipediaStats</span> <span class="o">=</span> <span class="n">streamGraph</span><span class="o">.</span><span class="na">getOutputStream</span><span class="o">(</span><span class="s">&quot;wikipedia-stats&quot;</span><span class="o">,</span> <span class="n">m</span> <span class="o">-&gt;</span> <span class="kc">null</span><span class="o">,</span> <span class="n">m</span> <span class="o">-&gt;</span> <span class="n">m</span><span class="o">);</span></code></pre></div>
+
+<p>The <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/operators/OutputStream.html">OutputStream</a> is parameterized by 3 types: the key type for the output, the value type for the output, and the upstream message type.</p>
+
+<p>The first parameter of <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/operators/StreamGraph.html#getOutputStream-java.lang.String-java.util.function.Function-java.util.function.Function-">getOutputStream</a> is the output stream ID. We will use <em>wikipedia-stats</em>. Because the ID contains no special characters, we won&rsquo;t bother configuring a physical name, so Samza will use the stream ID as the topic name.</p>
+
+<p>The second and third parameters are the <em>key extractor</em> and the <em>message extractor</em>, respectively. We have no key, so the <em>key extractor</em> simply produces null. The <em>message extractor</em> simply passes the message because it&rsquo;s already the correct type for the <em>json</em> serde. Note: we could have skipped the previous <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/operators/MessageStream.html#map-org.apache.samza.operators.functions.MapFunction-">map</a> operator and invoked our formatter here, but we kept them separate for pedagogical purposes.</p>
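+
+<p>The role of the two extractors can be sketched in plain Java as a pair of functions applied to each upstream message. The helper <code>toEnvelope</code> and the use of <code>Map.Entry</code> as a key/value pair are assumptions for illustration only; they are not part of the Samza <code>OutputStream</code> API.</p>
+
```java
import java.util.AbstractMap;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Plain-Java sketch of key and message extractors; not the Samza OutputStream API.
class ExtractorSketch {
  // Turns an upstream message into the (key, value) pair that would be sent.
  static <K, V, M> Map.Entry<K, V> toEnvelope(M message, Function<M, K> keyExtractor, Function<M, V> msgExtractor) {
    return new AbstractMap.SimpleImmutableEntry<>(keyExtractor.apply(message), msgExtractor.apply(message));
  }

  public static void main(String[] args) {
    Map<String, Integer> counts = new HashMap<>();
    counts.put("edits", 3);

    // No key, and the message is passed through unchanged, as in the tutorial.
    Map.Entry<Void, Map<String, Integer>> envelope = toEnvelope(counts, m -> null, m -> m);
    System.out.println(envelope.getKey() + " -> " + envelope.getValue());
  }
}
```
+
+<p>The three type variables <code>K</code>, <code>V</code> and <code>M</code> line up with the key type, value type and upstream type of the output stream. Passing a formatting function as the message extractor instead of <code>m -&gt; m</code> is the alternative mentioned in the note above.</p>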
+
+<p>Finally, we can send our output to the output stream using the <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/operators/MessageStream.html#sendTo-org.apache.samza.operators.OutputStream-">sendTo</a> operator:</p>
+
+<div class="highlight"><pre><code class="java"><span class="n">allWikipediaEvents</span><span class="o">.</span><span class="na">map</span><span class="o">(</span><span class="nl">WikipediaParser:</span><span class="o">:</span><span class="n">parseEvent</span><span class="o">)</span>
+        <span class="o">.</span><span class="na">window</span><span class="o">(</span><span class="n">Windows</span><span class="o">.</span><span class="na">tumblingWindow</span><span class="o">(</span><span class="n">Duration</span><span class="o">.</span><span class="na">ofSeconds</span><span class="o">(</span><span class="mi">10</span><span class="o">),</span> <span class="nl">WikipediaStats:</span><span class="o">:</span><span class="k">new</span><span class="o">,</span> <span class="k">new</span> <span class="n">WikipediaStatsAggregator</span><span class="o">()))</span>
+        <span class="o">.</span><span class="na">map</span><span class="o">(</span><span class="k">this</span><span class="o">::</span><span class="n">formatOutput</span><span class="o">)</span>
+        <span class="o">.</span><span class="na">sendTo</span><span class="o">(</span><span class="n">wikipediaStats</span><span class="o">);</span></code></pre></div>
+
+<p>Tip: Because the MessageStream type information is preserved in the operator chain, it is often easier to define the OutputStream inline with the <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/operators/MessageStream.html#sendTo-org.apache.samza.operators.OutputStream-">sendTo</a> operator and then refactor it for readability. That way you don&rsquo;t have to hunt down the types.</p>
+
+<h4 id="kvstore">KVStore</h4>
+
+<p>We now have an operational Wikipedia application which provides stats aggregated over a 10 second interval. One of those stats is a count of the number of edits within the 10s window. But what if we want to keep an additional durable counter of the total edits?</p>
+
+<p>We will do this by keeping a separate count outside the window and persisting it in a <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/storage/kv/KeyValueStore.html">KeyValueStore</a>.</p>
+
+<p>We start by defining the store in the config file:</p>
+
+<div class="highlight"><pre><code class="bash">stores.wikipedia-stats.factory<span class="o">=</span>org.apache.samza.storage.kv.RocksDbKeyValueStorageEngineFactory
+stores.wikipedia-stats.changelog<span class="o">=</span>kafka.wikipedia-stats-changelog
+stores.wikipedia-stats.key.serde<span class="o">=</span>string
+stores.wikipedia-stats.msg.serde<span class="o">=</span>integer</code></pre></div>
+
+<p>These properties declare a <a href="http://rocksdb.org/">RocksDB</a> key-value store named &ldquo;wikipedia-stats&rdquo;. The store is replicated to a changelog stream called &ldquo;wikipedia-stats-changelog&rdquo; on the <em>kafka</em> system for durability. It uses the <em>string</em> and <em>integer</em> serdes you defined earlier for keys and values respectively.</p>
+
+<p>Next, we add a total count member variable to the <code>WikipediaStats</code> class:</p>
+
+<div class="highlight"><pre><code class="java"><span class="kt">int</span> <span class="n">totalEdits</span> <span class="o">=</span> <span class="mi">0</span><span class="o">;</span></code></pre></div>
+
+<p>To use the store in the application, we need to get it from the <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/task/TaskContext.html">TaskContext</a>. Also, since we want to emit the total edit count along with the window edit count, it&rsquo;s easiest to update both of them in our aggregator. Declare the store as a member variable of the <code>WikipediaStatsAggregator</code> class:</p>
+
+<div class="highlight"><pre><code class="java"><span class="kd">private</span> <span class="n">KeyValueStore</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Integer</span><span class="o">&gt;</span> <span class="n">store</span><span class="o">;</span></code></pre></div>
+
+<p>Then override the <a href="/learn/documentation/latest/api/javadocs/org/apache/samza/operators/functions/InitableFunction.html#init-org.apache.samza.config.Config-org.apache.samza.task.TaskContext-">init</a> method in <code>WikipediaStatsAggregator</code> to initialize the store.</p>
+
+<div class="highlight"><pre><code class="java"><span class="nd">@Override</span>
+<span class="kd">public</span> <span class="kt">void</span> <span class="nf">init</span><span class="o">(</span><span class="n">Config</span> <span class="n">config</span><span class="o">,</span> <span class="n">TaskContext</span> <span class="n">context</span><span class="o">)</span> <span class="o">{</span>
+  <span class="n">store</span> <span class="o">=</span> <span class="o">(</span><span class="n">KeyValueStore</span><span class="o">&lt;</span><span class="n">String</span><span class="o">,</span> <span class="n">Integer</span><span class="o">&gt;)</span> <span class="n">context</span><span class="o">.</span><span class="na">getStore</span><span class="o">(</span><span class="s">&quot;wikipedia-stats&quot;</span><span class="o">);</span>
+<span class="o">}</span></code></pre></div>
+
+<p>Update and persist the counter in the <code>apply</code> method.</p>
+
+<div class="highlight"><pre><code class="java"><span class="n">Integer</span> <span class="n">editsAllTime</span> <span class="o">=</span> <span class="n">store</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="s">&quot;count-edits-all-time&quot;</span><span class="o">);</span>
+<span class="k">if</span> <span class="o">(</span><span class="n">editsAllTime</span> <span class="o">==</span> <span class="kc">null</span><span class="o">)</span> <span class="n">editsAllTime</span> <span class="o">=</span> <span class="mi">0</span><span class="o">;</span>
+<span class="n">editsAllTime</span><span class="o">++;</span>
+<span class="n">store</span><span class="o">.</span><span class="na">put</span><span class="o">(</span><span class="s">&quot;count-edits-all-time&quot;</span><span class="o">,</span> <span class="n">editsAllTime</span><span class="o">);</span>
+<span class="n">stats</span><span class="o">.</span><span class="na">totalEdits</span> <span class="o">=</span> <span class="n">editsAllTime</span><span class="o">;</span></code></pre></div>
+
+<p>Finally, update the <code>MyWikipediaApplication#formatOutput</code> method to include the total counter.</p>
+
+<div class="highlight"><pre><code class="java"><span class="n">counts</span><span class="o">.</span><span class="na">put</span><span class="o">(</span><span class="s">&quot;edits-all-time&quot;</span><span class="o">,</span> <span class="n">stats</span><span class="o">.</span><span class="na">totalEdits</span><span class="o">);</span></code></pre></div>
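+
+<p>The counter update is a plain read-modify-write cycle. The sketch below reproduces it with a <code>HashMap</code> standing in for the store, so it runs without Samza but is not durable. The class and method names are hypothetical.</p>
+
```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java sketch of the read-modify-write counter. A HashMap stands in
// for the Samza KeyValueStore, so nothing here survives a restart.
class DurableCounterSketch {
  static final String KEY = "count-edits-all-time";

  // Reads the current total, treats a missing key as zero, increments,
  // writes the new total back, and returns it.
  static int increment(Map<String, Integer> store) {
    Integer total = store.get(KEY);
    if (total == null) {
      total = 0;
    }
    total++;
    store.put(KEY, total);
    return total;
  }

  public static void main(String[] args) {
    Map<String, Integer> store = new HashMap<>();
    increment(store);
    increment(store);
    System.out.println(store.get(KEY));
  }
}
```
+
+<p>In the real application the store is backed by RocksDB and the changelog stream configured earlier, which is what lets the total outlive any single window.</p>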
+
+<h4 id="metrics">Metrics</h4>
+
+<p>Lastly, let&rsquo;s add a metric to the application which counts the number of repeat edits to each title within the window interval.</p>
+
+<p>As with the key-value store, we must first define the metrics reporters in the config file.</p>
+
+<div class="highlight"><pre><code class="bash">metrics.reporters<span class="o">=</span>snapshot,jmx
+metrics.reporter.snapshot.class<span class="o">=</span>org.apache.samza.metrics.reporter.MetricsSnapshotReporterFactory
+metrics.reporter.snapshot.stream<span class="o">=</span>kafka.metrics
+metrics.reporter.jmx.class<span class="o">=</span>org.apache.samza.metrics.reporter.JmxReporterFactory</code></pre></div>
+
+<p>The above properties define 2 metrics reporters. The first emits metrics to a <em>metrics</em> topic on the <em>kafka</em> system. The second reporter emits metrics to JMX.</p>
+
+<p>In the <code>WikipediaStatsAggregator</code> class, declare a counter member variable.</p>
+
+<div class="highlight"><pre><code class="java"><span class="kd">private</span> <span class="n">Counter</span> <span class="n">repeatEdits</span><span class="o">;</span></code></pre></div>
+
+<p>Then add the following to the <code>WikipediaStatsAggregator#init</code> method to initialize the counter.</p>
+
+<div class="highlight"><pre><code class="java"><span class="n">repeatEdits</span> <span class="o">=</span> <span class="n">context</span><span class="o">.</span><span class="na">getMetricsRegistry</span><span class="o">().</span><span class="na">newCounter</span><span class="o">(</span><span class="s">&quot;edit-counters&quot;</span><span class="o">,</span> <span class="s">&quot;repeat-edits&quot;</span><span class="o">);</span></code></pre></div>
+
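+<p>Update the counter in the <code>apply</code> method. Replace the existing <code>stats.titles.add(...)</code> call with the following, which uses the return value of <code>add</code> to detect a title that was already seen in the window:</p>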
+
+<div class="highlight"><pre><code class="java"><span class="kt">boolean</span> <span class="n">newTitle</span> <span class="o">=</span> <span class="n">stats</span><span class="o">.</span><span class="na">titles</span><span class="o">.</span><span class="na">add</span><span class="o">((</span><span class="n">String</span><span class="o">)</span> <span class="n">edit</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="s">&quot;title&quot;</span><span class="o">));</span>
+
+<span class="k">if</span> <span class="o">(!</span><span class="n">newTitle</span><span class="o">)</span> <span class="o">{</span>
+  <span class="n">repeatEdits</span><span class="o">.</span><span class="na">inc</span><span class="o">();</span>
+  <span class="n">log</span><span class="o">.</span><span class="na">info</span><span class="o">(</span><span class="s">&quot;Frequent edits for title: {}&quot;</span><span class="o">,</span> <span class="n">edit</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="s">&quot;title&quot;</span><span class="o">));</span>
+<span class="o">}</span></code></pre></div>
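+
+<p>The repeat detection relies on <code>Set.add</code> returning <code>false</code> when the element is already present. A minimal plain-Java sketch of that logic follows; a local <code>long</code> stands in for the Samza <code>Counter</code>, and the class and method names are hypothetical.</p>
+
```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Plain-Java sketch of repeat-edit detection; a long stands in for the metric.
class RepeatEditSketch {
  // Counts edits whose title was already seen earlier in the same window.
  static long countRepeats(List<String> titlesInWindow) {
    Set<String> seen = new HashSet<>();
    long repeats = 0;
    for (String title : titlesInWindow) {
      // Set.add returns false when the element was already in the set.
      if (!seen.add(title)) {
        repeats++;
      }
    }
    return repeats;
  }

  public static void main(String[] args) {
    System.out.println(countRepeats(Arrays.asList("A", "B", "A", "A")));
  }
}
```
+
+<p>Note that in the tutorial application the title set is reset with each new <code>WikipediaStats</code> instance, while the metric counter keeps accumulating across windows.</p>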
+
+<h4 id="run-and-view-plan">Run and View Plan</h4>
+
+<p>You can set up the grid and run the application using the same instructions from the <a href="hello-samza-high-level-yarn.html">Hello Samza High Level API YARN tutorial</a>. The only difference is to replace the <code>wikipedia-application.properties</code> config file in the <em>config-path</em> command line parameter with <code>my-wikipedia-application.properties</code>.</p>
+
+<h3 id="summary">Summary</h3>
+
+<p>Congratulations! You have built and executed a Wikipedia stream application on Samza using the high level API. The final application should be directly comparable to the pre-existing <code>WikipediaApplication</code> in the project.</p>
+
+<p>You can provide feedback on this tutorial in the <a href="mailto:dev@samza.apache.org">dev mailing list</a>.</p>
+
+
+          </div>
+        </div>
+
+      </div><!-- /.wrapper-content -->
+    </div><!-- /.wrapper -->
+
+    <div class="footer">
+      <div class="container">
+        <!-- nothing for now. -->
+      </div>
+    </div>
+
+  
+    <script>
+      $( document ).ready(function() {
+        if ( $.fn.urlExists( "/learn/tutorials/0.13/hello-samza-high-level-code.html" ) ) {
+          $("#switch-version-button").addClass("fa fa-history masthead-icon");
+        }
+      });
+
+      /* a function to test whether the url exists or not */
+      (function( $ ) {
+        $.fn.urlExists = function(url) {
+          var http = new XMLHttpRequest();
+          http.open('HEAD', url, false);
+          http.send();
+          return http.status != 404;
+        };
+      }( jQuery ));
+    </script>
+  
+
+    <!-- Google Analytics -->
+    <script>
+      (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
+      (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
+      m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
+      })(window,document,'script','//www.google-analytics.com/analytics.js','ga');
+
+      ga('create', 'UA-43122768-1', 'apache.org');
+      ga('send', 'pageview');
+
+    </script>
+  </body>
+</html>

Added: samza/site/learn/tutorials/latest/hello-samza-high-level-yarn.html
URL: http://svn.apache.org/viewvc/samza/site/learn/tutorials/latest/hello-samza-high-level-yarn.html?rev=1798258&view=auto
==============================================================================
--- samza/site/learn/tutorials/latest/hello-samza-high-level-yarn.html (added)
+++ samza/site/learn/tutorials/latest/hello-samza-high-level-yarn.html Fri Jun  9 18:46:20 2017
@@ -0,0 +1,286 @@
+<!DOCTYPE html>
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+<html lang="en">
+  <head>
+    <meta charset="utf-8">
+    <title>Samza - Hello Samza High Level API - YARN Deployment</title>
+    <link href='/css/ropa-sans.css' rel='stylesheet' type='text/css'/>
+    <link href="/css/bootstrap.min.css" rel="stylesheet"/>
+    <link href="/css/font-awesome.min.css" rel="stylesheet"/>
+    <link href="/css/main.css" rel="stylesheet"/>
+    <link href="/css/syntax.css" rel="stylesheet"/>
+    <link rel="icon" type="image/png" href="/img/samza-icon.png">
+    <script src="/js/jquery-1.11.1.min.js"></script>
+  </head>
+  <body>
+    <div class="wrapper">
+      <div class="wrapper-content">
+
+        <div class="masthead">
+          <div class="container">
+            <div class="masthead-logo">
+              <a href="/" class="logo">samza</a>
+            </div>
+            <div class="masthead-icons">
+              <div class="pull-right">
+                <a href="/startup/download"><i class="fa fa-arrow-circle-o-down masthead-icon"></i></a>
+                <a href="https://git-wip-us.apache.org/repos/asf?p=samza.git;a=tree" target="_blank"><i class="fa fa-code masthead-icon" style="font-weight: bold;"></i></a>
+                <a href="https://twitter.com/samzastream" target="_blank"><i class="fa fa-twitter masthead-icon"></i></a>
+                <!-- this icon only shows in versioned pages -->
+                
+                  
+                    
+                  
+                  <a href="http://samza.apache.org/learn/tutorials/0.13/hello-samza-high-level-yarn.html"><i id="switch-version-button"></i></a>
+                   <!-- links for the navigation bar -->
+                
+
+              </div>
+            </div>
+          </div><!-- /.container -->
+        </div>
+
+        <div class="container">
+          <div class="menu">
+            <h1><i class="fa fa-rocket"></i> Getting Started</h1>
+            <ul>
+              <li><a href="/startup/hello-samza/latest">Hello Samza</a></li>
+              <li><a href="/startup/download">Download</a></li>
+              <li><a href="/startup/preview">Feature Preview</a></li>
+            </ul>
+
+            <h1><i class="fa fa-book"></i> Learn</h1>
+            <ul>
+              <li><a href="/learn/documentation/latest">Documentation</a></li>
+              <li><a href="/learn/documentation/latest/jobs/configuration-table.html">Configuration</a></li>
+              <li><a href="/learn/documentation/latest/container/metrics-table.html">Metrics</a></li>
+              <li><a href="/learn/documentation/latest/api/javadocs/">Javadocs</a></li>
+              <li><a href="/learn/tutorials/latest">Tutorials</a></li>
+              <li><a href="https://cwiki.apache.org/confluence/display/SAMZA/FAQ">FAQ</a></li>
+              <li><a href="https://cwiki.apache.org/confluence/display/SAMZA/Apache+Samza">Wiki</a></li>
+              <li><a href="https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=51812876">Papers &amp; Talks</a></li>
+              <li><a href="http://blogs.apache.org/samza">Blog</a></li>
+            </ul>
+
+            <h1><i class="fa fa-comments"></i> Community</h1>
+            <ul>
+              <li><a href="/community/mailing-lists.html">Mailing Lists</a></li>
+              <li><a href="/community/irc.html">IRC</a></li>
+              <li><a href="https://issues.apache.org/jira/browse/SAMZA">Bugs</a></li>
+              <li><a href="https://cwiki.apache.org/confluence/display/SAMZA/Powered+By">Powered by</a></li>
+              <li><a href="https://cwiki.apache.org/confluence/display/SAMZA/Ecosystem">Ecosystem</a></li>
+              <li><a href="/community/committers.html">Committers</a></li>
+            </ul>
+
+            <h1><i class="fa fa-code"></i> Contribute</h1>
+            <ul>
+              <li><a href="/contribute/contributors-corner.html">Contributor's Corner</a></li>
+              <li><a href="/contribute/coding-guide.html">Coding Guide</a></li>
+              <li><a href="/contribute/design-documents.html">Design Documents</a></li>
+              <li><a href="/contribute/code.html">Code</a></li>
+              <li><a href="/contribute/tests.html">Tests</a></li>
+            </ul>
+
+            <h1><i class="fa fa-history"></i> Archive</h1>
+            <ul>
+              <li><a href="/archive/index.html#latest">latest</a></li>
+              <li><a href="/archive/index.html#13">0.13</a></li>
+              <li><a href="/archive/index.html#12">0.12</a></li>
+              <li><a href="/archive/index.html#11">0.11</a></li>
+              <li><a href="/archive/index.html#10">0.10</a></li>
+              <li><a href="/archive/index.html#09">0.9</a></li>
+              <li><a href="/archive/index.html#08">0.8</a></li>
+              <li><a href="/archive/index.html#07">0.7</a></li>
+            </ul>
+          </div>
+
+          <div class="content">
+            <!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<h2>Hello Samza High Level API - YARN Deployment</h2>
+
+<p>The <a href="https://github.com/apache/samza-hello-samza">hello-samza</a> project is an example project designed to help you run your first Samza application. It includes example applications that use the low-level task API as well as the high-level API.</p>
+
+<p>This tutorial demonstrates a simple Wikipedia application created with the high-level API. The <a href="/startup/hello-samza/latest/index.html">Hello Samza tutorial</a> is the low-level analog to this tutorial: it implements the same logic with the task API. The two tutorials are designed to be as similar as possible. The primary differences are that with the high-level API we accomplish the equivalent of three separate low-level jobs with a single application, we skip the intermediate topics for simplicity, and we can visualize the execution plan after we start the application.</p>
+
+<h3 id="get-the-code">Get the Code</h3>
+
+<p>Check out the hello-samza project:</p>
+
+<div class="highlight"><pre><code class="bash">git clone https://git.apache.org/samza-hello-samza.git hello-samza
+<span class="nb">cd </span>hello-samza
+git checkout latest</code></pre></div>
+
+<p>This project contains everything you&rsquo;ll need to run your first Samza application.</p>
+
+<h3 id="start-a-grid">Start a Grid</h3>
+
+<p>A Samza grid usually comprises three different systems: <a href="http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html">YARN</a>, <a href="http://kafka.apache.org/">Kafka</a>, and <a href="http://zookeeper.apache.org/">ZooKeeper</a>. The hello-samza project comes with a script called &ldquo;grid&rdquo; to help you set up these systems. Start by running:</p>
+
+<div class="highlight"><pre><code class="bash">./bin/grid bootstrap</code></pre></div>
+
+<p>This command will download, install, and start ZooKeeper, Kafka, and YARN. It will also check out the latest version of Samza and build it. All package files will be put in a sub-directory called &ldquo;deploy&rdquo; inside hello-samza&rsquo;s root folder.</p>
+
+<p>If you get a complaint that JAVA_HOME is not set, then you&rsquo;ll need to set it to the path where Java is installed on your system.</p>
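+
+<p>As a minimal sketch, assuming a Bourne-compatible shell, you can set the variable for the current session before re-running the grid script. The path below is only a placeholder, not a location this tutorial prescribes; substitute the directory where your JDK is actually installed:</p>
+
+<div class="highlight"><pre><code class="bash"># Placeholder path (an assumption): replace it with your own JDK install directory
+export JAVA_HOME=/path/to/your/jdk
+# Confirm that the variable points at a working Java installation
+&quot;$JAVA_HOME/bin/java&quot; -version</code></pre></div>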
+
+<p>Once the grid command completes, you can verify that YARN is up and running by going to <a href="http://localhost:8088">http://localhost:8088</a>. This is the YARN UI.</p>
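+
+<p>You can also check from the command line. The following is a sketch that assumes curl is installed and that the YARN ResourceManager serves its REST API on the same port as the UI. It requests the cluster information resource and prints the JSON reply, which includes a state field for the ResourceManager:</p>
+
+<div class="highlight"><pre><code class="bash"># Query the YARN ResourceManager REST API for basic cluster information
+curl -s http://localhost:8088/ws/v1/cluster/info</code></pre></div>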
+
+<h3 id="build-a-samza-application-package">Build a Samza Application Package</h3>
+
+<p>Before you can run a Samza application, you need to build a package for it. This package is what YARN uses to deploy your apps on the grid.</p>
+
+<p>NOTE: If you are building from the latest branch of the hello-samza project, first run the following step from your local Samza project:</p>
+
+<div class="highlight"><pre><code class="bash">./gradlew publishToMavenLocal</code></pre></div>
+
+<p>Then, continue with the following commands in the hello-samza project:</p>
+
+<div class="highlight"><pre><code class="bash">mvn clean package
+mkdir -p deploy/samza
+tar -xvf ./target/hello-samza-0.13.1-SNAPSHOT-dist.tar.gz -C deploy/samza</code></pre></div>
+
+<h3 id="run-a-samza-application">Run a Samza Application</h3>
+
+<p>After you&rsquo;ve built your Samza package, you can start the app on the grid using the run-app.sh script.</p>
+
+<div class="highlight"><pre><code class="bash">./deploy/samza/bin/run-app.sh --config-factory<span class="o">=</span>org.apache.samza.config.factories.PropertiesConfigFactory --config-path<span class="o">=</span>file://<span class="nv">$PWD</span>/deploy/samza/config/wikipedia-application.properties</code></pre></div>
+
+<p>The app will do all of the following:</p>
+
+<ol>
+<li>Consume 3 feeds of real-time edits from Wikipedia</li>
+<li>Parse the events to extract information about the size of the edit, who made the change, etc.</li>
+<li>Calculate counts, every ten seconds, for all edits that were made during that window</li>
+<li>Output the counts to the wikipedia-stats topic</li>
+</ol>
+
+<p>For details about how the app works, take a look at the <a href="hello-samza-high-level-code.html">code walkthrough</a>.</p>
+
+<p>Give the job a minute to start up, and then tail the Kafka topic:</p>
+
+<div class="highlight"><pre><code class="bash">./deploy/kafka/bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic wikipedia-stats</code></pre></div>
+
+<p>The messages in the stats topic look like this:</p>
+
+<div class="highlight"><pre><code class="json"><span class="p">{</span><span class="nt">&quot;is-talk&quot;</span><span class="p">:</span><span class="mi">2</span><span class="p">,</span><span class="nt">&quot;bytes-added&quot;</span><span class="p">:</span><span class="mi">5276</span><span class="p">,</span><span class="nt">&quot;edits&quot;</span><span class="p">:</span><span class="mi">13</span><span class="p">,</span><span class="nt">&quot;unique-titles&quot;</span><span class="p">:</span><span class="mi">13</span><span class="p">}</span>
+<span class="p">{</span><span class="nt">&quot;is-bot-edit&quot;</span><span class="p">:</span><span class="mi">1</span><span class="p">,</span><span class="nt">&quot;is-talk&quot;</span><span class="p">:</span><span class="mi">3</span><span class="p">,</span><span class="nt">&quot;bytes-added&quot;</span><span class="p">:</span><span class="mi">4211</span><span class="p">,</span><span class="nt">&quot;edits&quot;</span><span class="p">:</span><span class="mi">30</span><span class="p">,</span><span class="nt">&quot;unique-titles&quot;</span><span class="p">:</span><span class="mi">30</span><span class="p">,</span><span class="nt">&quot;is-unpatrolled&quot;</span><span class="p">:</span><span class="mi">1</span><span class="p">,</span><span class="nt">&quot;is-new&quot;</span><span class="p">:</span><span class="mi">2</span><span class="p">,</span><span class="nt">&quot;is-minor&quot;</span><span class="p">:</span><span class="mi">7</span><span class="p">}</span>
+<span class="p">{</span><span class="nt">&quot;bytes-added&quot;</span><span class="p">:</span><span class="mi">3180</span><span class="p">,</span><span class="nt">&quot;edits&quot;</span><span class="p">:</span><span class="mi">19</span><span class="p">,</span><span class="nt">&quot;unique-titles&quot;</span><span class="p">:</span><span class="mi">19</span><span class="p">,</span><span class="nt">&quot;is-unpatrolled&quot;</span><span class="p">:</span><span class="mi">1</span><span class="p">,</span><span class="nt">&quot;is-new&quot;</span><span class="p">:</span><span class="mi">1</span><span class="p">,</span><span class="nt">&quot;is-minor&quot;</span><span class="p">:</span><span class="mi">3</span><span class="p">}</span>
+<span class="p">{</span><span class="nt">&quot;bytes-added&quot;</span><span class="p">:</span><span class="mi">2218</span><span class="p">,</span><span class="nt">&quot;edits&quot;</span><span class="p">:</span><span class="mi">18</span><span class="p">,</span><span class="nt">&quot;unique-titles&quot;</span><span class="p">:</span><span class="mi">18</span><span class="p">,</span><span class="nt">&quot;is-unpatrolled&quot;</span><span class="p">:</span><span class="mi">2</span><span class="p">,</span><span class="nt">&quot;is-new&quot;</span><span class="p">:</span><span class="mi">2</span><span class="p">,</span><span class="nt">&quot;is-minor&quot;</span><span class="p">:</span><span class="mi">3</span><span class="p">}</span></code></pre></div>
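+
+<p>If you only care about one field, a small sketch like the following may help. It assumes the jq command-line JSON processor is installed, which hello-samza neither requires nor provides; the field name edits is taken from the sample messages above:</p>
+
+<div class="highlight"><pre><code class="bash"># Assumes jq is installed separately; it is not part of hello-samza
+# Prints only the value of the edits field from each stats message
+./deploy/kafka/bin/kafka-console-consumer.sh --zookeeper localhost:2181 --topic wikipedia-stats | jq '.edits'</code></pre></div>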
+
+<p>Pretty neat, right? Now, check out the YARN UI again (<a href="http://localhost:8088">http://localhost:8088</a>). This time around, you&rsquo;ll see your Samza job is running!</p>
+
+<h3 id="view-the-execution-plan">View the Execution Plan</h3>
+
+<p>Each application goes through an execution planner. After starting the job, you can visualize the execution plan by opening the following file in a browser:</p>
+
+<div class="highlight"><pre><code class="bash">deploy/samza/bin/plan.html</code></pre></div>
+
+<p>This plan will make more sense after the <a href="hello-samza-high-level-code.html">code walkthrough</a>. For now, just take note that this visualization is available and it is useful for visibility into the structure of the application. For this tutorial, the plan should look something like this:</p>
+
+<p><img src="/img/latest/learn/tutorials/hello-samza-high-level/wikipedia-execution-plan.png" alt="Execution plan" style="max-width: 100%; height: auto;" onclick="window.open(this.src)"/></p>
+
+<h3 id="shutdown">Shutdown</h3>
+
+<p>To shut down the app, use the same <em>run-app.sh</em> script with an extra <em>--operation=kill</em> argument:</p>
+
+<div class="highlight"><pre><code class="bash">./deploy/samza/bin/run-app.sh --config-factory<span class="o">=</span>org.apache.samza.config.factories.PropertiesConfigFactory --config-path<span class="o">=</span>file://<span class="nv">$PWD</span>/deploy/samza/config/wikipedia-application.properties --operation<span class="o">=</span><span class="nb">kill</span></code></pre></div>
+
+<p>After you&rsquo;re done, you can clean everything up using the same grid script.</p>
+
+<div class="highlight"><pre><code class="bash">./bin/grid stop all</code></pre></div>
+
+<p>Congratulations! You&rsquo;ve now set up a local grid that includes YARN, Kafka, and ZooKeeper, and run a Samza application on it. Curious how this application was built? See the <a href="hello-samza-high-level-code.html">code walkthrough</a>.</p>
+
+
+          </div>
+        </div>
+
+      </div><!-- /.wrapper-content -->
+    </div><!-- /.wrapper -->
+
+    <div class="footer">
+      <div class="container">
+        <!-- nothing for now. -->
+      </div>
+    </div>
+
+  
+    <script>
+      $( document ).ready(function() {
+        if ( $.fn.urlExists( "/learn/tutorials/0.13/hello-samza-high-level-yarn.html" ) ) {
+          $("#switch-version-button").addClass("fa fa-history masthead-icon");
+        }
+      });
+
+      /* a function to test whether the url exists or not */
+      (function( $ ) {
+        $.fn.urlExists = function(url) {
+          var http = new XMLHttpRequest();
+          http.open('HEAD', url, false);
+          http.send();
+          return http.status != 404;
+        };
+      }( jQuery ));
+    </script>
+  
+
+    <!-- Google Analytics -->
+    <script>
+      (function(i,s,o,g,r,a,m){i['GoogleAnalyticsObject']=r;i[r]=i[r]||function(){
+      (i[r].q=i[r].q||[]).push(arguments)},i[r].l=1*new Date();a=s.createElement(o),
+      m=s.getElementsByTagName(o)[0];a.async=1;a.src=g;m.parentNode.insertBefore(a,m)
+      })(window,document,'script','//www.google-analytics.com/analytics.js','ga');
+
+      ga('create', 'UA-43122768-1', 'apache.org');
+      ga('send', 'pageview');
+
+    </script>
+  </body>
+</html>