Posted to commits@samza.apache.org by ja...@apache.org on 2018/10/12 02:47:45 UTC

[01/32] samza git commit: Reorganize website content, link hyper-links correctly, fix image links

Repository: samza
Updated Branches:
  refs/heads/master 6dd012217 -> f7ebe5918


http://git-wip-us.apache.org/repos/asf/samza/blob/1bf8bf5a/docs/learn/documentation/versioned/operations/monitoring.md
----------------------------------------------------------------------
diff --git a/docs/learn/documentation/versioned/operations/monitoring.md b/docs/learn/documentation/versioned/operations/monitoring.md
new file mode 100644
index 0000000..ccd2442
--- /dev/null
+++ b/docs/learn/documentation/versioned/operations/monitoring.md
@@ -0,0 +1,612 @@
+---
+layout: page
+title: Monitoring
+---
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+# Monitoring Samza Applications
+
+This section provides details on monitoring of Samza jobs, not to be confused with _Samza Monitors_ (components of the Samza-REST service that provide cluster-wide monitoring capabilities).
+
+
+
+Like any other production software, it is critical to monitor the health of our Samza jobs. Samza relies on metrics for monitoring and includes an extensible metrics library. While a few standard metrics are provided out-of-the-box, it is easy to define metrics specific to your application.
+
+
+* [A. Metrics Reporters](#a-metrics-reporters)
+  + [A.1 Reporting Metrics to JMX (JMX Reporter)](#jmxreporter)
+    - [Enabling the JMX Reporter](#enablejmxreporter)
+    - [Using the JMX Reporter](#usejmxreporter)
+  + [A.2 Reporting Metrics to Kafka (MetricsSnapshot Reporter)](#snapshotreporter)
+    - [Enabling the MetricsSnapshot Reporter](#enablesnapshotreporter)
+  + [A.3 Creating a Custom MetricsReporter](#customreporter)
+* [B. Metric Types in Samza](#metrictypes)
+* [C. Adding User-Defined Metrics](#userdefinedmetrics)
+  + [Low-level API](#lowlevelapi)
+  + [High-Level API](#highlevelapi)
+* [D. Key Internal Samza Metrics](#keyinternalsamzametrics)
+  + [D.1 Vital Metrics](#vitalmetrics)
+  + [D.2 Store Metrics](#storemetrics)
+  + [D.3 Operator Metrics](#operatormetrics)
+* [E. Metrics Reference Sheet](#metricssheet)
+
+## A. Metrics Reporters
+
+Samza's metrics library encapsulates the metrics collection and sampling logic. Metrics Reporters in Samza are responsible for emitting metrics to external services, which may archive, process, or visualize the metrics' values, or trigger alerts based on them.
+
+Samza includes default implementations for two such Metrics Reporters:
+
+1. A _JMXReporter_ (detailed [below](#jmxreporter)) which allows using standard JMX clients for probing containers to retrieve metrics encoded as JMX MBeans. Visualization tools such as [Grafana](https://grafana.com/dashboards/3457) could also be used to visualize this JMX data.
+
+2. A _MetricsSnapshot_ reporter (detailed [below](#snapshotreporter)) which allows periodically publishing all metrics to Kafka. A downstream Samza job could then consume and publish these metrics to other metrics management systems such as [Prometheus](https://prometheus.io/) and [Graphite](https://graphiteapp.org/).
+
+Note that Samza allows multiple Metrics Reporters to be used simultaneously.
+
+
+### <a name="jmxreporter"></a> A.1 Reporting Metrics to JMX (JMX Reporter)
+
+This reporter encodes all of Samza's internal and user-defined metrics as JMX MBeans and hosts a JMX MBean server. Standard JMX clients (such as JConsole and VisualVM) can thus be used to probe Samza's containers and YARN-ApplicationMaster to retrieve these metrics' values. JMX also provides additional profiling capabilities (e.g., for CPU and memory utilization), which are also enabled by this reporter.
+
+#### <a name="enablejmxreporter"></a> Enabling the JMX Reporter
+JMXReporter can be enabled by adding the following configuration.
+
+```
+# Define a Samza metrics reporter called "jmx", which publishes to JMX
+metrics.reporter.jmx.class=org.apache.samza.metrics.reporter.JmxReporterFactory
+
+# Use the jmx reporter (if using multiple reporters, separate them with commas)
+metrics.reporters=jmx
+
+```
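+
+Because `metrics.reporters` accepts a comma-separated list, the JMX reporter can run alongside other reporters. A minimal sketch, assuming the job should also publish to Kafka through the MetricsSnapshot reporter (section A.2) registered under the name "snapshot"; the serializer settings described in A.2 would still need to be added:
+
+```
+# Register both reporter factories under the names "jmx" and "snapshot"
+metrics.reporter.jmx.class=org.apache.samza.metrics.reporter.JmxReporterFactory
+metrics.reporter.snapshot.class=org.apache.samza.metrics.reporter.MetricsSnapshotReporterFactory
+
+# Topic the snapshot reporter publishes to (same value as used in section A.2)
+metrics.reporter.snapshot.stream=kafka.metrics
+
+# Enable both reporters
+metrics.reporters=jmx,snapshot
+```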
+
+#### <a name="usejmxreporter"></a> Using the JMX Reporter
+
+To connect to the JMX MBean server, first obtain the JMX Server URL and port, published in the container logs:
+
+
+```
+
+2018-08-14 11:30:49.888 [main] JmxServer [INFO] Started JmxServer registry port=54661 server port=54662 url=service:jmx:rmi://localhost:54662/jndi/rmi://localhost:54661/jmxrmi
+
+```
+
+
+If using the **JConsole** JMX client, launch it with the service URL as:
+
+```
+jconsole service:jmx:rmi://localhost:54662/jndi/rmi://localhost:54661/jmxrmi
+```
+
+<img src="/img/{{site.version}}/learn/documentation/operations/jconsole.png" alt="JConsole" class="diagram-large">
+
+ 
+
+If using the VisualVM JMX client, run:
+
+```
+jvisualvm
+```
+
+After **VisualVM** starts, click the &quot;Add JMX Connection&quot; button and paste in your JMX server URL (obtained from the logs).
+Install the VisualVM-MBeans plugin (Tools->Plugins) to view the metrics MBeans.
+
+<img src="/img/{{site.version}}/learn/documentation/operations/visualvm.png" alt="VisualVM" class="diagram-small">
+
+ 
+###  <a name="snapshotreporter"></a> A.2 Reporting Metrics to Kafka (MetricsSnapshot Reporter)
+
+This reporter publishes metrics to Kafka.
+
+#### <a name="enablesnapshotreporter"></a> Enabling the MetricsSnapshot Reporter
+To enable this reporter, simply append the following to your job&#39;s configuration.
+
+```
+#Define a metrics reporter called "snapshot"
+metrics.reporters=snapshot
+metrics.reporter.snapshot.class=org.apache.samza.metrics.reporter.MetricsSnapshotReporterFactory
+```
+
+
+Specify the Kafka topic to which the reporter should publish:
+
+```
+metrics.reporter.snapshot.stream=kafka.metrics
+```
+
+
+Specify the serializer to be used for the metrics data:
+
+```
+serializers.registry.metrics.class=org.apache.samza.serializers.MetricsSnapshotSerdeFactory
+systems.kafka.streams.metrics.samza.msg.serde=metrics
+```
+With this configuration, all containers (including the YARN-ApplicationMaster) will publish their JSON-encoded metrics 
+to a Kafka topic called &quot;metrics&quot; every 60 seconds.
+The following is an example of such a metrics message:
+
+```
+{
+  "header": {
+    "container-name": "samza-container-0",
+    "exec-env-container-id": "YARN-generated containerID",
+    "host": "samza-grid-1234.example.com",
+    "job-id": "1",
+    "job-name": "my-samza-job",
+    "reset-time": 1401729000347,
+    "samza-version": "0.0.1",
+    "source": "TaskName-Partition1",
+    "time": 1401729420566,
+    "version": "0.0.1"
+  },
+  "metrics": {
+    "org.apache.samza.container.TaskInstanceMetrics": {
+      "commit-calls": 1,
+      "window-calls": 0,
+      "process-calls": 14,
+      "messages-actually-processed": 14,
+      "send-calls": 0,
+      "flush-calls": 1,
+      "pending-messages": 0,
+      "messages-in-flight": 0,
+      "async-callback-complete-calls": 14,
+      "wikipedia-#en.wikipedia-0-offset": 8979
+    }
+  }
+}
+```
+
+
+Each message contains a header which includes information about the job, time, and container from which the metrics were obtained. 
+The remainder of the message contains the metric values, grouped by their types, such as TaskInstanceMetrics, SamzaContainerMetrics,  KeyValueStoreMetrics, JVMMetrics, etc. Detailed descriptions of the various metric categories and metrics are available [here](#metricssheet).
+
+It is possible to configure the MetricsSnapshot reporter to use a different serializer with this configuration:
+
+```
+serializers.registry.metrics.class=<classpath-to-my-custom-serializer-factory>
+```
+
+
+
+To configure the reporter to publish with a different frequency (default 60 seconds), add the following to your job's configuration:
+
+```
+metrics.reporter.snapshot.interval=<publish frequency in seconds>
+```
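+
+For instance, to publish a snapshot every 30 seconds instead of the default 60 (the value 30 is only illustrative):
+
+```
+metrics.reporter.snapshot.interval=30
+```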
+
+Similarly, to limit the set of metrics emitted, you can use the regex-based blacklist supported by this reporter. For example, to publish only SamzaContainerMetrics (by blacklisting everything else), use:
+
+```
+metrics.reporter.snapshot.blacklist=^(?!.*?(?:SamzaContainerMetrics)).*$
+```
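+
+The blacklist can also be used in the opposite direction, to drop one group and keep the rest. A hedged sketch, assuming the pattern must match the entire metric identifier and that the identifier includes the group name (as the negative-lookahead example above suggests); verify this against your Samza version before relying on it:
+
+```
+# Suppress only the JvmMetrics group; all other groups are still published
+metrics.reporter.snapshot.blacklist=.*JvmMetrics.*
+```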
+
+
+### <a name="customreporter"></a> A.3 Creating a Custom MetricsReporter
+
+Creating a custom MetricsReporter entails implementing the MetricsReporter interface. The lifecycle of Metrics Reporters is managed by Samza and is aligned with the Samza container lifecycle. Metrics Reporters can poll metric values and can receive callbacks when new metrics are added at runtime, e.g., user-defined metrics. Metrics Reporters are responsible for maintaining executor pools, IO connections, and any in-memory state that they require in order to export metrics to the desired external system, and managing the lifecycles of such components.
+
+After implementation, a custom reporter can be enabled by appending the following to the Samza job&#39;s configuration:
+
+```
+#Define a metrics reporter with a desired name
+metrics.reporter.<my-custom-reporter-name>.class=<classpath-of-my-custom-reporter-factory>
+
+
+#Enable its use for metrics reporting
+metrics.reporters=<my-custom-reporter-name>
+```
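+
+As a concrete illustration, the fragment below registers a reporter under the hypothetical name "my-reporter", backed by a hypothetical factory class `com.example.metrics.MyReporterFactory`, and enables it alongside the JMX reporter from section A.1:
+
+```
+# "my-reporter" and com.example.metrics.MyReporterFactory are placeholder names
+metrics.reporter.my-reporter.class=com.example.metrics.MyReporterFactory
+metrics.reporter.jmx.class=org.apache.samza.metrics.reporter.JmxReporterFactory
+
+# Comma-separated list of all enabled reporters
+metrics.reporters=jmx,my-reporter
+```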
+
+
+
+## <a name="metrictypes"></a> B. Metric Types in Samza 
+
+Metrics in Samza are divided into three types -- _Gauges_, _Counters_, and _Timers_.
+
+_Gauges_ are useful when measuring the magnitude of a certain system property, e.g., the current queue length, or a buffer size.
+
+_Counters_ are useful in measuring metrics that are cumulative values, e.g., the number of messages processed since container startup. Certain counters are also useful when visualized with their rate-of-change, e.g., the rate of message processing.
+
+_Timers_ are useful for storing and reporting a sliding window of timing values. Samza also supports a ListGauge metric type, which can be used to store and report a list of values of a simple type, such as strings.
+
+## <a name="userdefinedmetrics"></a> C. Adding User-Defined Metrics
+
+
+To add a new metric, use the _MetricsRegistry_ available from the TaskContext passed to the init() method.
+The code snippets below show examples of registering and updating a user-defined Counter metric.
+Timers and gauges can be used similarly from within your task class.
+
+### <a name="lowlevelapi"></a> Low-level API
+
+Simply have your task implement the InitableTask interface and access the MetricsRegistry from the TaskContext.
+
+```
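+import org.apache.samza.config.Config;
+import org.apache.samza.metrics.Counter;
+import org.apache.samza.system.IncomingMessageEnvelope;
+import org.apache.samza.task.InitableTask;
+import org.apache.samza.task.MessageCollector;
+import org.apache.samza.task.StreamTask;
+import org.apache.samza.task.TaskContext;
+import org.apache.samza.task.TaskCoordinator;
+
+public class MyJavaStreamTask implements StreamTask, InitableTask {
+
+  private Counter messageCount;
+
+  @Override
+  public void init(Config config, TaskContext context) {
+    // Register a counter named "message-count" under this class's metric group
+    this.messageCount = context.getMetricsRegistry().newCounter(getClass().getName(), "message-count");
+  }
+
+  @Override
+  public void process(IncomingMessageEnvelope envelope, MessageCollector collector, TaskCoordinator coordinator) {
+    messageCount.inc();
+  }
+}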
+```
+
+### <a name="highlevelapi"></a> High-Level API
+
+In the high-level API, you can define a ContextManager and access the MetricsRegistry from the TaskContext, using which you can add and update your metrics.
+
+```
+public class MyJavaStreamApp implements StreamApplication {
+
+  private Counter messageCount = null;
+
+  @Override
+  public void init(StreamGraph graph, Config config) {
+    graph.withContextManager(new DemoContextManager());
+    MessageStream<IndexedRecord> viewEvent = ...;
+    viewEvent
+        .map(this::countMessage)
+        ...;
+  }
+
+  // Increments the counter for every message that flows through the map operator
+  private IndexedRecord countMessage(IndexedRecord value) {
+    messageCount.inc();
+    return value;
+  }
+
+  public final class DemoContextManager implements ContextManager {
+
+    @Override
+    public void init(Config config, TaskContext context) {
+      // Register the counter once the task's MetricsRegistry is available
+      messageCount = context.getMetricsRegistry()
+          .newCounter(getClass().getName(), "message-count");
+    }
+
+    @Override
+    public void close() { }
+  }
+}
+```
+
+## <a name="keyinternalsamzametrics"></a> D. Key Internal Samza Metrics
+
+Samza&#39;s internal metrics allow for detailed monitoring of a Samza job and all its components. Detailed descriptions 
+of all internal metrics are listed in a reference sheet [here](#e-metrics-reference-sheet). 
+However, a small subset of internal metrics facilitates easy high-level monitoring of a job.
+
+These key metrics can be grouped into three categories: _Vital metrics_, _Store metrics_, and _Operator metrics_. 
+We explain each of these categories in detail below.
+
+### <a name="vitalmetrics"></a> D.1. Vital Metrics
+
+These metrics indicate the vital signs of a Samza job's health. Note that these metrics are categorized into different groups based on the Samza component that emits them (e.g., SamzaContainerMetrics, TaskInstanceMetrics, ApplicationMaster metrics, etc.).
+
+| **Metric Name** | **Group** | **Meaning** |
+| --- | --- | --- |
+| **Availability -- Are there any resource failures impacting my job?** |
+| job-healthy | ContainerProcessManagerMetrics | A binary value, where 1 indicates that all the required containers configured for a job are running, 0 otherwise. |
+| failed-containers | ContainerProcessManagerMetrics  | Number of containers that have failed in the job&#39;s lifetime |
+| **Input Processing Lag -- Is my job lagging?** |
+| \<Topic\>-\<Partition\>-messages-behind-high-watermark | KafkaSystemConsumerMetrics | Number of input messages waiting to be processed on an input topic-partition |
+| consumptionLagMs | EventHubSystemConsumer | Time difference between the processing and enqueuing (into EventHub)  of input events |
+| millisBehindLatest | KinesisSystemConsumerMetrics | Current processing lag measured from the tip of the stream, expressed in milliseconds. |
+| **Output/Produce Errors -- Is my job failing to produce output?** |
+| producer-send-failed | KafkaSystemProducerMetrics | Number of send requests to Kafka (e.g., output topics) that failed due to unrecoverable errors |
+| flush-failed | HdfsSystemProducerMetrics | Number of failed flushes to HDFS |
+| **Processing Time -- Is my job spending too much time processing inputs?** |
+| process-ns | SamzaContainerMetrics | Amount of time the job is spending in processing each input |
+| commit-ns | SamzaContainerMetrics | Amount of time the job is spending in checkpointing inputs (and flushing producers, checkpointing KV stores, flushing side input stores). The frequency of this function is configured using _task.commit.ms_ |
+| window-ns | SamzaContainerMetrics | In case of WindowableTasks being used, amount of time the job is spending in its window() operations |
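+
+The commit interval referenced for commit-ns is set in the job configuration, in milliseconds. A minimal sketch (the value shown is only illustrative):
+
+```
+# Checkpoint inputs and flush producers and stores once per minute
+task.commit.ms=60000
+```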
+
+### <a name="storemetrics"></a>  D.2. Store Metrics
+
+Stateful Samza jobs typically use RocksDB backed KV stores for storing state. Therefore, timing metrics associated with 
+KV stores can be useful for monitoring input lag. These are some key metrics for KV stores. 
+The metrics reference sheet [here](#e-metrics-reference-sheet) details all metrics for KV stores.
+
+
+
+| **Metric name** | **Group** | **Meaning** |
+| --- | --- | --- |
+| get-ns, put-ns, delete-ns, all-ns | KeyValueStorageEngineMetrics | Time spent performing respective KV store operations |
+
+
+
+### <a name="operatormetrics"></a>  D.3. Operator Metrics
+
+If your Samza job uses Samza&#39;s Fluent API or Samza-SQL, Samza creates a DAG (directed acyclic graph) of 
+_operators_ to form the required data processing pipeline. In such cases, operator metrics allow fine-grained 
+monitoring of such operators. Key operator metrics are listed below, while a detailed list is present 
+in the metrics reference sheet.
+
+| **Metric name** | **Group** | **Meaning** |
+| --- | --- | --- |
+| <Operator-ID\>-handle-message-ns | WindowOperatorImpl, PartialJoinOperatorImpl, StreamOperatorImpl, StreamTableJoinOperatorImpl, etc | Time spent handling a given input message by the operator |
+
+
+
+## <a name="metricssheet"></a>  E. Metrics Reference Sheet
+The suffixes &quot;-ms&quot; and &quot;-ns&quot; in metric names indicate milliseconds and nanoseconds, respectively. All &quot;average time&quot; metrics are calculated over a sliding time window of 300 seconds.
+
+All \<system\>, \<stream\>, \<partition\>, \<store-name\>, and \<topic\> placeholders are populated with the corresponding actual values at runtime.
+
+
+| **Group** | **Metric name** | **Meaning** |
+| --- | --- | --- |
+| **ContainerProcessManagerMetrics** | running-containers | Total number of running containers. |
+| | needed-containers | Number of containers needed for the job to be declared healthy. |
+| | completed-containers | Number of containers that have completed their execution and exited. |
+| | failed-containers | Number of containers that have failed in the job&#39;s lifetime. |
+| | released-containers | Number of containers released due to overallocation by the YARN-ResourceManager. |
+| | container-count | Number of containers configured for the job. |
+| | redundant-notifications | Number of redundant onResourceCompleted callbacks received from the RM after container shutdown. |
+| | job-healthy | A binary value, where 1 indicates that all the required containers configured for a job are running, 0 otherwise. |
+| | preferred-host-requests | Number of container resource-requests for a preferred host received by the cluster manager. |
+| | any-host-requests | Number of container resource-requests for _any_ host received by the cluster manager |
+| | expired-preferred-host-requests | Number of expired resource requests for a preferred host received by the cluster manager. |
+| | expired-any-host-requests | Number of expired resource requests for any host received by the cluster manager. |
+| | host-affinity-match-pct | Percentage of non-expired preferred host requests. This measures the % of resource-requests for which host-affinity provided the preferred host. |
+
+| **Group** | **Metric name** | **Meaning** |
+| --- | --- | --- |
+| **SamzaContainerMetrics (Timer metrics)** | choose-ns | Average time spent by a task instance for choosing the input to process; this includes time spent waiting for input, selecting one in case of multiple inputs, and deserializing input. |
+| | window-ns | In case of WindowableTasks being used, average time a task instance is spending in its window() operations. |
+| | timer-ns | Average time spent in the timer-callback when a timer registered with TaskContext fires. |
+| | process-ns | Average time the job is spending in processing each input. |
+| | commit-ns | Average time the job is spending in checkpointing inputs (and flushing producers, checkpointing KV stores, flushing side input stores). The frequency of this function is configured using _task.commit.ms._ |
+| | block-ns | Average time the run loop is blocked because all task instances are busy processing input; could indicate lag accumulating. |
+| | container-startup-time | Time spent in starting the container. This includes time to start the JMX server, starting metrics reporters, starting system producers, consumers, system admins, offset manager, locality manager, disk space manager, security manager, statistics manager, and initializing all task instances. |
+
+| **Group** | **Metric name** | **Meaning** |
+| --- | --- | --- |
+| **SamzaContainerMetrics (Counters and Gauges)** | commit-calls | Number of commits. Each commit includes input checkpointing, flushing producers, checkpointing KV stores, flushing side input stores, etc. |
+| | window-calls | In case of WindowableTask, this measures the number of window invocations. |
+| | timer-calls | Number of timer callbacks. |
+| | process-calls | Number of process method invocations. |
+| | process-envelopes | Number of input message envelopes processed. |
+| | process-null-envelopes | Number of times no input message envelope was available for the run loop to process. |
+| | event-loop-utilization | The duty-cycle of the event loop. That is, the fraction of time of each event loop iteration that is spent in process(), window(), and commit. |
+| | disk-usage-bytes | Total disk space size used by key-value stores (in bytes). |
+| | disk-quota-bytes | Disk memory usage quota for key-value stores (in bytes). |
+| | executor-work-factor | The work factor of the run loop. A work factor of 1 indicates full throughput, while a work factor of less than 1 will introduce delays into the execution to approximate the requested work factor. The work factor is set by the disk space monitor in accordance with the disk quota policy. Given the latest percentage of available disk quota, this policy returns the work factor that should be applied. |
+| | physical-memory-mb | The physical memory used by the Samza container process (native + on heap) (in MBs). |
+| | <TaskName\>-<StoreName\>-restore-time | Time taken to restore task stores (per task store). |
+
+
+| **Group** | **Metric name** | **Meaning** |
+| --- | --- | --- |
+| **Job-Coordinator Metrics (Gauge)** | \<system\>-\<stream\>-partitionCount | The current number of partitions detected by the Stream Partition Count Monitor. This can be enabled by configuring _job.coordinator.monitor-partition-change_ to true. |
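+
+Following the description above, the Stream Partition Count Monitor would be switched on with a setting along these lines:
+
+```
+job.coordinator.monitor-partition-change=true
+```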
+
+| **Group** | **Metric name** | **Meaning** |
+| --- | --- | --- |
+| **TaskInstance Metrics (Counters and Gauges)** | \<system\>-\<stream\>-\<partition\>-offset | The offset of the last processed message on the given system-stream-partition input. |
+|   | commit-calls | Number of commit calls for the task. Each commit call involves checkpointing inputs (and flushing producers, checkpointing KV stores, flushing side input stores). |
+|   | window-calls | In case of WindowableTask, the number of window() invocations on the task. |
+|   | process-calls | Number of process method calls. |
+|   | send-calls | Number of send method calls (representing number of messages that were sent to the underlying SystemProducers) |
+|   | flush-calls | Number of times the underlying system producers were flushed. |
+|   | messages-actually-processed | Number of messages processed by the task. |
+|   | pending-messages | Number of pending messages in the pending envelope queue. |
+|   | messages-in-flight | Number of input messages currently being processed. This is impacted by the task.max.concurrency configuration. |
+|   | async-callback-complete-calls | Number of processAsync invocations that have completed (applicable to AsyncStreamTasks). |
+
+| **Group** | **Metric name** | **Meaning** |
+| --- | --- | --- |
+| **OffsetManagerMetrics (Gauge)** | \<system\>-\<stream\>-\<partition\>-checkpointed-offset | Latest checkpointed offsets for each input system-stream-partition. |
+
+
+| **Group** | **Metric name** | **Meaning** |
+| --- | --- | --- |
+| **JvmMetrics (Timers)** | gc-time-millis | Total time spent in GC. |
+|   | <gc-name\>-time-millis | Total time spent in garbage collection (for each garbage collector) (in milliseconds) |
+
+
+| **Group** | **Metric name** | **Meaning** |
+| --- | --- | --- |
+| **JvmMetrics (Counters and Gauges)** | gc-count | Number of GC invocations. |
+|   | mem-heap-committed-mb | Size of committed heap memory (in MBs). Because the guest allocates memory lazily to the JVM heap and because the difference between Free and Used memory is opaque to the guest, the guest commits memory to the JVM heap as it is required. The Committed memory, therefore, is a measure of how much memory the JVM heap is really consuming in the guest. See [https://pubs.vmware.com/vfabric52/index.jsp?topic=/com.vmware.vfabric.em4j.1.2/em4j/conf-heap-management.html](https://pubs.vmware.com/vfabric52/index.jsp?topic=/com.vmware.vfabric.em4j.1.2/em4j/conf-heap-management.html) |
+|   | mem-heap-used-mb | Used memory from the perspective of the JVM is (Working set + Garbage) and Free memory is (Current heap size – Used memory). |
+|   | mem-heap-max-mb | Size of maximum heap memory (in MBs). This is defined by the -Xmx option. |
+|   | mem-nonheap-committed-mb | Size of non-heap memory committed in MBs. |
+|   | mem-nonheap-used-mb | Size of non-heap memory used in MBs. |
+|   | mem-nonheap-max-mb | Size of non-heap memory in MBs. This can be changed using the -XX:MaxPermSize VM option. |
+|   | threads-new | Number of threads not started at that instant. |
+|   | threads-runnable | Number of running threads at that instant. |
+|   | threads-timed-waiting | Current number of threads in the TIMED\_WAITING state, which is described as: &quot;A thread that is waiting for another thread to perform an action for up to a specified waiting time is in this state.&quot; |
+|   | threads-waiting | Current number of waiting threads. |
+|   | threads-blocked | Current number of blocked threads. |
+|   | threads-terminated | Current number of terminated threads. |
+|   | \<gc-name\>-gc-count | Number of garbage collection calls (for each garbage collector). |
+| **(Emitted only if the OS supports it)** | process-cpu-usage | Returns the &quot;recent cpu usage&quot; for the Java Virtual Machine process. |
+| **(Emitted only if the OS supports it)** | system-cpu-usage | Returns the &quot;recent cpu usage&quot; for the whole system. |
+| **(Emitted only if the OS supports it)** | open-file-descriptor-count | Count of open file descriptors. |
+
+
+| **Group** | **Metric name** | **Meaning** |
+| --- | --- | --- |
+| **SystemConsumersMetrics (Counters and Gauges)** <br/> These metrics are emitted when multiplexing and coordinating between per-system consumers and message choosers for polling | chose-null | Number of times the message chooser returned a null message envelope. This is typically indicative of low input traffic on one or more input partitions. |
+|   | chose-object | Number of times the message chooser returned a non-null message envelope. |
+|   | deserialization-error | Number of times an incoming message was not deserialized successfully. |
+|   | ssps-needed-by-chooser | Number of systems for which no buffered message exists, and hence these systems need to be polled (to obtain a message). |
+|   | poll-timeout | The timeout for polling at that instant. |
+|   | unprocessed-messages | Number of unprocessed messages buffered in SystemConsumers. |
+|   | \<system\>-polls | Number of times the given system was polled |
+|   | \<system\>-ssp-fetches-per-poll | Number of partitions of the given system polled at that instant. |
+|   | \<system\>-messages-per-poll | Number of times the SystemConsumer for the underlying system was polled to get new messages. |
+|   | \<system\>-\<stream\>-\<partition\>-messages-chosen | Number of messages that were chosen by the MessageChooser for particular system stream partition. |
+
+
+| **Group** | **Metric name** | **Meaning** |
+| --- | --- | --- |
+| **SystemConsumersMetrics (Timers)** | poll-ns | Average time spent polling all underlying systems for new messages (in nanoseconds). |
+|   | deserialization-ns | Average time spent deserializing incoming messages (in nanoseconds). |
+
+
+| **Group** | **Metric name** | **Meaning** |
+| --- | --- | --- |
+| **KafkaSystemConsumersMetrics (Timers)** | \<system\>-\<topic\>-\<partition\>-offset-change | The next offset to be read for this topic and partition. |
+|   | \<system\>-\<topic\>-\<partition\>-bytes-read | Total size of all messages read for a topic partition (payload + key size). |
+|   | \<system\>-\<topic\>-\<partition\>-messages-read | Number of messages read for a topic partition. |
+|   | \<system\>-\<topic\>-\<partition\>-high-watermark | Offset of the last committed message in Kafka&#39;s topic partition. |
+|   | \<system\>-\<topic\>-\<partition\>-messages-behind-high-watermark | Number of input messages waiting to be processed on an input topic-partition. That is, the difference between high watermark and next offset. |
+|   | \<system\>-<host\>-<port\>-reconnects | Number of reconnects to a broker on a particular host and port. |
+|   | \<system\>-<host\>-<port\>-bytes-read | Total size of all messages read from a broker on a particular host and port. |
+|   | \<system\>-<host\>-<port\>-messages-read | Number of times the consumer used a broker on a particular host and port to get new messages. |
+|   | \<system\>-<host\>-<port\>-skipped-fetch-requests | Number of times the fetchMessage method is called but no topic/partitions needed new messages. |
+|   | \<system\>-<host\>-<port\>-topic-partitions | Number of broker&#39;s topic partitions which are being consumed. |
+|   | poll-count | Number of polls the KafkaSystemConsumer performed to get new messages. |
+|   | no-more-messages-SystemStreamPartition [\<system\>, \<stream\>, \<partition\>] | Indicates if the Kafka consumer is at the head for particular partition. |
+|   | blocking-poll-count-SystemStreamPartition [\<system\>, \<stream\>, \<partition\>] | Number of times a blocking poll is executed (polling until we get at least one message, or until we catch up to the head of the stream) (per partition). |
+|   | blocking-poll-timeout-count-SystemStreamPartition [\<system\>, \<stream\>, \<partition\>] | Number of times a blocking poll has timed out (polling until we get at least one message within a timeout period) (per partition). |
+|   | buffered-message-count-SystemStreamPartition [\<system\>, \<stream\>, \<partition\>] | Current number of messages in queue (per partition). |
+|   | buffered-message-size-SystemStreamPartition [\<system\>, \<stream\>, \<partition\>] | Current size of messages in queue (if systems.system.samza.fetch.threshold.bytes is defined) (per partition). |
+
+
+
+
+| **Group** | **Metric name** | **Meaning** |
+| --- | --- | --- |
+| **SystemProducersMetrics (Counters and Gauges)** <br/>These metrics are aggregated across Producers. | sends | Number of send method calls, representing the total number of sent messages. |
+|   | flushes | Number of flush method calls for all registered producers. |
+|   | <source\>-sends | Number of sent messages for a particular source (task instance). |
+|   | <source\>-flushes | Number of flushes for particular source (task instance). |
+|   | serialization error | Number of errors that occurred while serializing envelopes before sending. |
+
+| **Group** | **Metric name** | **Meaning** |
+| --- | --- | --- |
+| **KafkaSystemProducersMetrics (Counters)** | \<system\>-producer-sends | Number of send invocations to the KafkaSystemProducer. |
+|   | \<system\>-producer-send-success | Number of send requests that were successfully completed by the KafkaSystemProducer. |
+|   | \<system\>-producer-send-failed | Number of send requests to Kafka (e.g., output topics) that failed due to unrecoverable errors |
+|   | \<system\>-flushes | Number of calls made to flush in the KafkaSystemProducer. |
+|   | \<system\>-flush-failed | Number of times flush operation failed. |
+
+
+| **Group** | **Metric name** | **Meaning** |
+| --- | --- | --- |
+| **KafkaSystemProducersMetrics (Timers)** | \<system\>-flush-ns | Represents average time the flush call takes to complete (in nanoseconds). |
+
+
+| **Group** | **Metric name** | **Meaning** |
+| --- | --- | --- |
+| **KeyValueStorageEngineMetrics (Counters)** <br/> These metrics provide insight into the type and number of KV Store operations taking place | <store-name\>-puts | Total number of put operations on the given KV store. |
+|   | <store-name\>-put-alls | Total number of putAll operations on the given KV store. |
+|   | <store-name\>-gets | Total number of get operations on the given KV store. |
+|   | <store-name\>-get-alls | Total number of getAll operations on the given KV store. |
+|   | <store-name\>-alls | Total number of accesses to the iterator on the given KV store. |
+|   | <store-name\>-ranges | Total number of accesses to a sorted-range iterator on the given KV store. |
+|   | <store-name\>-deletes | Total number of delete operations on the given KV store. |
+|   | <store-name\>-delete-alls | Total number of deleteAll operations on the given KV store. |
+|   | <store-name\>-flushes | Total number of flush operations on the given KV store. |
+|   | <store-name\>-restored-messages | Number of entries in the KV store restored from the changelog for that store. |
+|   | <store-name\>-restored-bytes | Size in bytes of entries in the KV store restored from the changelog for that store. |
+|   | <store-name\>-snapshots | Total number of snapshot operations on the given KV store. |
+
+
+| **Group** | **Metric name** | **Meaning** |
+| --- | --- | --- |
+| **KeyValueStorageEngineMetrics (Timers)** <br/> These metrics provide insight into the latencies of KV Store operations. | <store-name\>-get-ns | Average duration of the get operation on the given KV Store. |
+|   | <store-name\>-get-all-ns | Average duration of the getAll operation on the given KV Store. |
+|   | <store-name\>-put-ns | Average duration of the put operation on the given KV Store. |
+|   | <store-name\>-put-all-ns | Average duration of the putAll operation on the given KV Store. |
+|   | <store-name\>-delete-ns | Average duration of the delete operation on the given KV Store. |
+|   | <store-name\>-delete-all-ns | Average duration of the deleteAll operation on the given KV Store. |
+|   | <store-name\>-flush-ns | Average duration of the flush operation on the given KV Store. |
+|   | <store-name\>-all-ns | Average duration of obtaining an iterator (using the all operation) on the given KV Store. |
+|   | <store-name\>-range-ns | Average duration of obtaining a sorted-range iterator (using the range operation) on the given KV Store. |
+|   | <store-name\>-snapshot-ns | Average duration of the snapshot operation on the given KV Store. |
+
+
+| **Group** | **Metric name** | **Meaning** |
+| --- | --- | --- |
+| **KeyValueStoreMetrics (Counters)** <br/> These metrics are measured at the App-facing layer for different KV Stores, e.g., RocksDBStore, InMemoryKVStore. | <store-name\>-gets, <store-name\>-getAlls, <store-name\>-puts, <store-name\>-putAlls, <store-name\>-deletes, <store-name\>-deleteAlls, <store-name\>-alls, <store-name\>-ranges, <store-name\>-flushes | Total number of the specified operation on the given KV Store. (These metrics are equivalent to the respective ones under KeyValueStorageEngineMetrics.) |
+|   | bytes-read | Total number of bytes read (when serving reads -- gets, getAlls, and iterations). |
+|   | bytes-written | Total number of bytes written (when serving writes -- puts, putAlls). |
+
+
+| **Group** | **Metric name** | **Meaning** |
+| --- | --- | --- |
+| **SerializedKeyValueStoreMetrics (Counters)** <br/> These metrics are measured at the serialization layer. | <store-name\>-gets, <store-name\>-getAlls, <store-name\>-puts, <store-name\>-putAlls, <store-name\>-deletes, <store-name\>-deleteAlls, <store-name\>-alls, <store-name\>-ranges, <store-name\>-flushes | Total number of the specified operation on the given KV Store. (These metrics are equivalent to the respective ones under KeyValueStorageEngineMetrics.) |
+|   | bytes-deserialized | Total number of bytes deserialized (when serving reads -- gets, getAlls, and iterations). |
+|   | bytes-serialized | Total number of bytes serialized (when serving reads and writes -- gets, getAlls, puts, putAlls). In addition to writes, serialization is also done during reads to serialize the key to bytes for lookup in the underlying store. |
+
+
+| **Group** | **Metric name** | **Meaning** |
+| --- | --- | --- |
+| **LoggedStoreMetrics (Counters)** <br/> These metrics are measured at the changelog-backup layer for KV stores. | <store-name\>-gets, <store-name\>-puts, <store-name\>-alls, <store-name\>-deletes, <store-name\>-flushes, <store-name\>-ranges | Total number of the specified operation on the given KV Store. |
+
+
+
+
+| **Group** | **Metric name** | **Meaning** |
+| --- | --- | --- |
+| **CachedStoreMetrics (Counters and Gauges)** <br/> These metrics are measured at the caching layer for RocksDB-backed KV stores. | <store-name\>-gets, <store-name\>-puts, <store-name\>-alls, <store-name\>-deletes, <store-name\>-flushes, <store-name\>-ranges | Total number of the specified operation on the given KV Store. |
+|   | cache-hits | Total number of get and getAll operations that hit cached entries. |
+|   | put-all-dirty-entries-batch-size | Total number of dirty KV-entries written-back to the underlying store. |
+|   | dirty-count | Number of entries in the cache marked dirty at that instant. |
+|   | cache-count | Number of entries in the cache at that instant. |
+
+
+| **Group** | **Metric name** | **Meaning** |
+| --- | --- | --- |
+| **RoundRobinChooserMetrics (Counters)** | buffered-messages | Size of the queue with potential messages to process. |
+
+
+| **Group** | **Metric name** | **Meaning** |
+| --- | --- | --- |
+| **BatchingChooserMetrics (Counters and Gauges)** | batch-resets | Number of batch resets because they exceeded the max batch size limit. |
+|   | batched-envelopes | Number of envelopes in the batch at the current instant. |
+
+
+| **Group** | **Metric name** | **Meaning** |
+| --- | --- | --- |
+| **BootstrappingChooserMetrics (Gauges)** | lagging-batch-streams | Number of bootstrapping streams that are lagging. |
+|   | \<system\>-\<stream\>-lagging-partitions | Number of lagging partitions in the stream (for each stream marked as bootstrapping stream). |
+
+
+
+| **Group** | **Metric name** | **Meaning** |
+| --- | --- | --- |
+| **HdfsSystemProducerMetrics (Counters)** | system-producer-sends | Total number of attempts to write to HDFS. |
+|   | system-send-success | Total number of successful writes to HDFS. |
+|   | system-send-failed | Total number of failures while sending envelopes to HDFS. |
+|   | system-flushes | Total number of attempts to flush data to HDFS. |
+|   | system-flush-success | Total number of successful flushes of all written data to HDFS. |
+|   | system-flush-failed | Total number of failures while flushing data to HDFS. |
+
+| **Group** | **Metric name** | **Meaning** |
+| --- | --- | --- |
+| **HdfsSystemProducerMetrics (Timers)** | system-send-ms | Average time spent writing messages to HDFS (in milliseconds). |
+|   | system-flush-ms | Average time spent flushing messages to HDFS (in milliseconds). |
+
+
+| **Group** | **Metric name** | **Meaning** |
+| --- | --- | --- |
+| **ElasticsearchSystemProducerMetrics (Counters)** | system-bulk-send-success | Total number of successful bulk requests. |
+|   | system-docs-inserted | Total number of documents created. |
+|   | system-docs-updated | Total number of documents updated. |
+|   | system-version-conflicts | Number of requests that failed due to conflicts with the current state of the document. |


[22/32] samza git commit: Merge branch 'master' of https://github.com/apache/samza

Posted by ja...@apache.org.
Merge branch 'master' of https://github.com/apache/samza


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/85050645
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/85050645
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/85050645

Branch: refs/heads/master
Commit: 85050645cf0673d3f4e89a5461b1f565d2ccf846
Parents: cad265f 992b217
Author: Jagadish <jv...@linkedin.com>
Authored: Thu Oct 11 16:20:21 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Thu Oct 11 16:20:21 2018 -0700

----------------------------------------------------------------------
 .../samza/operators/BaseTableDescriptor.java    | 10 +-
 .../table/remote/RemoteTableDescriptor.java     |  4 +-
 .../kv/inmemory/InMemoryTableDescriptor.java    |  4 +-
 .../storage/kv/RocksDbTableDescriptor.java      | 98 +++++++++++++++++++-
 .../kv/BaseLocalStoreBackedTableDescriptor.java | 20 +++-
 .../apache/samza/test/framework/TestRunner.java | 45 +++++++--
 .../system/InMemorySystemDescriptor.java        |  1 -
 .../AsyncStreamTaskIntegrationTest.java         |  3 +-
 .../StreamApplicationIntegrationTest.java       | 61 +++++++++++-
 .../framework/StreamTaskIntegrationTest.java    | 93 +++++++++++++++++++
 .../table/PageViewToProfileJoinFunction.java    |  2 +-
 .../table/TestLocalTableWithSideInputs.java     | 20 ++--
 12 files changed, 322 insertions(+), 39 deletions(-)
----------------------------------------------------------------------



[04/32] samza git commit: Fix navigation layout on home-page. Add overall Samza architecture diagrams

Posted by ja...@apache.org.
Fix navigation layout on home-page. Add overall Samza architecture diagrams


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/115041ac
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/115041ac
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/115041ac

Branch: refs/heads/master
Commit: 115041ac48bafff343358a6cedc540a50df67d14
Parents: c480b7e
Author: Jagadish <jv...@linkedin.com>
Authored: Tue Oct 2 22:04:55 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Tue Oct 2 22:05:16 2018 -0700

----------------------------------------------------------------------
 docs/_includes/main-navigation.html             |   4 ++--
 docs/_layouts/default.html                      |  19 +++++++++++--------
 docs/css/main.new.css                           |  16 +++++++++++++++-
 .../documentation/api/samza-arch-detailed.png   | Bin 0 -> 71311 bytes
 .../learn/documentation/api/samza-arch1.png     | Bin 0 -> 94384 bytes
 .../learn/documentation/api/samza-arch2.png     | Bin 0 -> 81257 bytes
 6 files changed, 28 insertions(+), 11 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/samza/blob/115041ac/docs/_includes/main-navigation.html
----------------------------------------------------------------------
diff --git a/docs/_includes/main-navigation.html b/docs/_includes/main-navigation.html
index d9ad2d0..626501e 100644
--- a/docs/_includes/main-navigation.html
+++ b/docs/_includes/main-navigation.html
@@ -29,8 +29,8 @@
       </a>
     </div>
     <div class="main-navigation__items" data-menu-opened>
-      <a class="main-navigation__item" href="/case-studies/">Home</a>
-      <a class="main-navigation__item" href="/learn/documentation/{{site.version}}/introduction/background.html">Docs</a>
+      <a class="main-navigation__item" href="/">Home</a>
+      <a class="main-navigation__item" href="/learn/documentation/{{site.version}}/core-concepts/core-concepts.html">Docs</a>
       <a class="main-navigation__item" href="/powered-by/">Powered By</a>
       <a class="main-navigation__item" href="/startup/download/">Downloads</a>
       <a class="main-navigation__item" href="/blog/">Blog</a>

http://git-wip-us.apache.org/repos/asf/samza/blob/115041ac/docs/_layouts/default.html
----------------------------------------------------------------------
diff --git a/docs/_layouts/default.html b/docs/_layouts/default.html
index 74ca7ab..ba10e8b 100644
--- a/docs/_layouts/default.html
+++ b/docs/_layouts/default.html
@@ -58,7 +58,7 @@
           Apache Samza
         </h1>
         <h2 class="section__title section__title--sub">
-          Build scalable, fault-tolerant applications that process your data in real-time
+          A distributed stream processing framework
         </h2>
         <div class="content">
           <a class="button" href="/startup/hello-samza/{{site.version}}">
@@ -106,15 +106,18 @@
 
 
   <div class="section section--what-is-samza">
-    <div class="section__title">What is Apache Samza?</div>
-    <div class="content">
+
+   <!-- <div class="section__title">What is Apache Samza?</div> -->
+  <div class="content--samza-intro">
       <p>
-        <strong>Apache Samza</strong>, a top level project of the
-        <a href="https://www.apache.org/" target="_blank">Apache Software Foundation</a>, is a distributed stream processing framework. It uses
-        <a target="_blank" href="https://kafka.apache.org">Apache Kafka</a> for messaging, and
-        <a target="_blank" href="https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html">Apache Hadoop YARN</a> to provide fault tolerance, processor isolation, security, and resource management.
+        Samza allows you to build stateful applications that process data in real-time from multiple sources including Apache Kafka.
+        <br/> <br/> 
+        Battle-tested at scale, it supports flexible deployment options to run on <a target="_blank" href="https://kafka.apache.org">YARN</a> or as a 
+        <a href="/learn/documentation/latest/deployment/standalone.html">standalone library</a>.
       </p>
-    </div>
+
+      <img src="/img/latest/learn/documentation/api/samza-arch1.png">
+  </div>
   </div>
 
   <div class="section section--highlight section--bottom-flare section--features">

http://git-wip-us.apache.org/repos/asf/samza/blob/115041ac/docs/css/main.new.css
----------------------------------------------------------------------
diff --git a/docs/css/main.new.css b/docs/css/main.new.css
index 2c77a70..623ca55 100644
--- a/docs/css/main.new.css
+++ b/docs/css/main.new.css
@@ -95,10 +95,24 @@ a.side-navigation__group-title:hover::after {
 }
 
 .content {
+    max-width: 1200px;
+    margin: auto;
+    padding: 20px;
+    line-height: 25px;
+}
+
+.content--samza-intro {
   max-width: 1200px;
   margin: auto;
   padding: 20px;
   line-height: 25px;
+  display: flex;
+  align-items: center;
+}
+
+.content--samza-intro p {
+  font-size: 18px;
+  line-height: 25px;
 }
 
 .content p {
@@ -482,7 +496,7 @@ footer .side-by-side > * {
 }
 
 .section--what-is-samza {
-  padding: 0px 20px 100px;
+  padding: 0px 1px 1px;
 }
 
 .section--hero {

http://git-wip-us.apache.org/repos/asf/samza/blob/115041ac/docs/img/versioned/learn/documentation/api/samza-arch-detailed.png
----------------------------------------------------------------------
diff --git a/docs/img/versioned/learn/documentation/api/samza-arch-detailed.png b/docs/img/versioned/learn/documentation/api/samza-arch-detailed.png
new file mode 100644
index 0000000..acdaa65
Binary files /dev/null and b/docs/img/versioned/learn/documentation/api/samza-arch-detailed.png differ

http://git-wip-us.apache.org/repos/asf/samza/blob/115041ac/docs/img/versioned/learn/documentation/api/samza-arch1.png
----------------------------------------------------------------------
diff --git a/docs/img/versioned/learn/documentation/api/samza-arch1.png b/docs/img/versioned/learn/documentation/api/samza-arch1.png
new file mode 100644
index 0000000..68172fa
Binary files /dev/null and b/docs/img/versioned/learn/documentation/api/samza-arch1.png differ

http://git-wip-us.apache.org/repos/asf/samza/blob/115041ac/docs/img/versioned/learn/documentation/api/samza-arch2.png
----------------------------------------------------------------------
diff --git a/docs/img/versioned/learn/documentation/api/samza-arch2.png b/docs/img/versioned/learn/documentation/api/samza-arch2.png
new file mode 100644
index 0000000..24c1740
Binary files /dev/null and b/docs/img/versioned/learn/documentation/api/samza-arch2.png differ


[30/32] samza git commit: Add Powered By pages for Samza users in the community

Posted by ja...@apache.org.
Add Powered By pages for Samza users in the community


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/a05fee91
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/a05fee91
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/a05fee91

Branch: refs/heads/master
Commit: a05fee910e5878819852bd3f51129849caf87dd6
Parents: 9cd7cdd
Author: Jagadish <jv...@linkedin.com>
Authored: Thu Oct 11 19:20:16 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Thu Oct 11 19:20:16 2018 -0700

----------------------------------------------------------------------
 docs/_powered-by/doubledutch.md | 4 +---
 docs/_powered-by/fortscale.md   | 3 +--
 docs/_powered-by/vmware.md      | 5 +----
 3 files changed, 3 insertions(+), 9 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/samza/blob/a05fee91/docs/_powered-by/doubledutch.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/doubledutch.md b/docs/_powered-by/doubledutch.md
index 95ae897..fda5aae 100644
--- a/docs/_powered-by/doubledutch.md
+++ b/docs/_powered-by/doubledutch.md
@@ -19,6 +19,4 @@ domain: doubledutch.me
    limitations under the License.
 -->
 
-<a class="external-link" href="www.doubledutch.me" rel="nofollow">DoubleDutch</a> provides mobile applications and performance analytics for events, conferences, and trade shows for more than 1,000 customers including SAP, UBM, and Urban Land Institute. It uses Samza to power their analytics platform and stream data live into an event dashboard for real-time insights.
-
-
+<a class="external-link" href="www.doubledutch.me" rel="nofollow">DoubleDutch</a> provides mobile applications and performance analytics for events, conferences, and trade shows for more than 1,000 customers including SAP, UBM, and Urban Land Institute. It uses Samza to power their analytics platform and stream data live into an event dashboard for real-time insights.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/samza/blob/a05fee91/docs/_powered-by/fortscale.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/fortscale.md b/docs/_powered-by/fortscale.md
index acbfea7..eac4502 100644
--- a/docs/_powered-by/fortscale.md
+++ b/docs/_powered-by/fortscale.md
@@ -19,5 +19,4 @@ domain: fortscale.com
    limitations under the License.
 -->
 
-<a class="external-link" href="https://www.fortscale.com/" rel="nofollow">Fortscale</a> is redefining behavioral analytics, with the industry’s first embeddable engine, making behavioral analytics available for everyone. It is using Samza to process security events as part of their data ingestion pipelines and for the creation of on-line machine learning models.
-
+<a class="external-link" href="https://www.fortscale.com/" rel="nofollow">Fortscale</a> is redefining behavioral analytics, with the industry’s first embeddable engine, making behavioral analytics available for everyone. It is using Samza to process security events as part of their data ingestion pipelines and for the creation of on-line machine learning models.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/samza/blob/a05fee91/docs/_powered-by/vmware.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/vmware.md b/docs/_powered-by/vmware.md
index 132f6f8..6ef5c4a 100644
--- a/docs/_powered-by/vmware.md
+++ b/docs/_powered-by/vmware.md
@@ -22,7 +22,4 @@ priority: 01
 
 <a class="external-link" href="http://www.vmware.com/products/vrealize-network-insight.html" rel="nofollow">vRealize Network Insight (vRNI)</a> is VMware’s flagship product for delivering intelligent operations for software defined network environments (e.g. NSX).
  
-At the heart of the vRNI architecture are a set of distributed processing and analytics modules that crunch large amounts of streaming data on a cluster of multiple machines. It is critical that these operations are carried out in a way that is reliable, efficient and robust - even in the face of dynamic faults in the underlying infrastructure layers. Vmware has been successfully using Apache Samza as a distributed streaming data processing framework for executing these analytical modules reliably and efficiently at a very large scale, thus helping them focus on our core business problems.
-
-
-
+At the heart of the vRNI architecture are a set of distributed processing and analytics modules that crunch large amounts of streaming data on a cluster of multiple machines. It is critical that these operations are carried out in a way that is reliable, efficient and robust - even in the face of dynamic faults in the underlying infrastructure layers. Vmware has been successfully using Apache Samza as a distributed streaming data processing framework for executing these analytical modules reliably and efficiently at a very large scale, thus helping them focus on our core business problems.
\ No newline at end of file


[18/32] samza git commit: Merge branch 'master' of https://github.com/apache/samza

Posted by ja...@apache.org.
Merge branch 'master' of https://github.com/apache/samza


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/44329cf4
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/44329cf4
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/44329cf4

Branch: refs/heads/master
Commit: 44329cf4842252edf0f0a9c4c2e57f6a61696f7b
Parents: d9431b7 623661e
Author: Jagadish <jv...@linkedin.com>
Authored: Wed Oct 10 18:42:16 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Wed Oct 10 18:42:16 2018 -0700

----------------------------------------------------------------------
 build.gradle                                    |  50 ++-
 docs/_docs/replace-versioned.sh                 |   3 +
 docs/_layouts/default.html                      |   6 +-
 docs/_meetups/april-2018.md                     |  52 ---
 docs/_meetups/aug-2016.md                       |  44 ++
 docs/_meetups/aug-2017.md                       |  31 ++
 docs/_meetups/dec-2017.md                       |  45 ++
 docs/_meetups/december-2018.md                  |  24 +-
 docs/_meetups/feb-2015.md                       |  42 ++
 docs/_meetups/feb-2016.md                       |  32 ++
 docs/_meetups/feb-2017.md                       |  45 ++
 docs/_meetups/july-2015.md                      |  43 ++
 docs/_meetups/july-2018.md                      |   6 -
 docs/_meetups/jun-2015.md                       |  31 ++
 docs/_meetups/jun-2016.md                       |  56 +++
 docs/_meetups/june-2018.md                      |  31 ++
 docs/_meetups/mar-2018.md                       |  46 ++
 docs/_meetups/may-2014.md                       |  31 ++
 docs/_meetups/may-2015.md                       |  54 +++
 docs/_meetups/may-2016.md                       |  31 ++
 docs/_meetups/may-2017.md                       |  53 +++
 docs/_meetups/nov-2013.md                       |  41 ++
 docs/_meetups/nov-2014.md                       |  50 +++
 docs/_meetups/nov-2016.md                       |  42 ++
 docs/_meetups/nov-2017.md                       |  31 ++
 docs/_meetups/oct-2014.md                       |  41 ++
 docs/_meetups/oct-2015.md                       |  66 +++
 docs/_meetups/sep-2013.md                       |  42 ++
 docs/_meetups/sep-2014.md                       |  42 ++
 docs/_meetups/sep-2015.md                       |  30 ++
 docs/_meetups/sep-2017.md                       |  56 +++
 docs/_menu/index.html                           |   2 +-
 docs/css/google-fonts.css                       |  18 +
 .../versioned/jobs/configuration-table.html     |   4 +-
 .../documentation/versioned/jobs/logging.md     | 156 ++++++-
 .../versioned/hello-samza-high-level-code.md    |   2 +-
 docs/meetups/index.html                         |  20 +-
 docs/startup/quick-start/versioned/index.md     | 254 +++++++++++
 gradle.properties                               |   2 +-
 gradle/dependency-versions.gradle               |   3 +-
 .../application/ApplicationDescriptor.java      |  30 +-
 .../samza/container/SamzaContainerContext.java  |  55 ---
 .../context/ApplicationContainerContext.java    |   7 +-
 .../samza/context/ApplicationTaskContext.java   |   4 +
 .../org/apache/samza/context/JobContext.java    |   1 +
 .../apache/samza/operators/ContextManager.java  |  49 ---
 .../operators/functions/InitableFunction.java   |   9 +-
 .../samza/scheduler/CallbackScheduler.java      |   1 +
 .../samza/scheduler/ScheduledCallback.java      |   5 +-
 .../samza/storage/StorageEngineFactory.java     |   8 +-
 .../org/apache/samza/table/ReadableTable.java   |   9 +-
 .../org/apache/samza/table/TableProvider.java   |   9 +-
 .../org/apache/samza/task/InitableTask.java     |   6 +-
 .../java/org/apache/samza/task/TaskContext.java |  98 -----
 .../java/org/apache/samza/util/RateLimiter.java |   9 +-
 samza-azure/src/test/resources/log4j.xml        |   8 -
 samza-azure/src/test/resources/log4j2.xml       |  32 ++
 .../application/ApplicationDescriptorImpl.java  |  55 ++-
 .../apache/samza/container/TaskContextImpl.java | 169 -------
 .../org/apache/samza/context/ContextImpl.java   |  60 ++-
 .../apache/samza/context/JobContextImpl.java    |  22 +-
 .../apache/samza/context/TaskContextImpl.java   |  34 +-
 .../samza/execution/ExecutionPlanner.java       | 144 +++++-
 .../execution/IntermediateStreamManager.java    | 253 +++--------
 .../org/apache/samza/execution/JobGraph.java    |  34 +-
 .../execution/OperatorSpecGraphAnalyzer.java    | 134 +++++-
 .../operators/impl/BroadcastOperatorImpl.java   |   9 +-
 .../samza/operators/impl/InputOperatorImpl.java |   5 +-
 .../samza/operators/impl/OperatorImpl.java      |  54 +--
 .../samza/operators/impl/OperatorImplGraph.java |  84 ++--
 .../operators/impl/OutputOperatorImpl.java      |   5 +-
 .../operators/impl/PartialJoinOperatorImpl.java |  11 +-
 .../operators/impl/PartitionByOperatorImpl.java |  17 +-
 .../operators/impl/SendToTableOperatorImpl.java |  15 +-
 .../samza/operators/impl/SinkOperatorImpl.java  |   9 +-
 .../operators/impl/StreamOperatorImpl.java      |   7 +-
 .../impl/StreamTableJoinOperatorImpl.java       |  18 +-
 .../operators/impl/WindowOperatorImpl.java      |  15 +-
 .../operators/spec/FilterOperatorSpec.java      |   7 +-
 .../samza/operators/spec/MapOperatorSpec.java   |   7 +-
 .../apache/samza/processor/StreamProcessor.java |  78 ++--
 .../samza/runtime/LocalApplicationRunner.java   |   3 +-
 .../samza/runtime/LocalContainerRunner.java     |  14 +-
 .../apache/samza/storage/StorageRecovery.java   |   9 +-
 .../org/apache/samza/table/TableManager.java    |  18 +-
 .../samza/table/caching/CachingTable.java       |  28 +-
 .../table/caching/CachingTableProvider.java     |   8 +-
 .../table/caching/guava/GuavaCacheTable.java    |  18 +-
 .../caching/guava/GuavaCacheTableProvider.java  |   2 +-
 .../table/remote/RemoteReadWriteTable.java      |  24 +-
 .../samza/table/remote/RemoteReadableTable.java |  30 +-
 .../samza/table/remote/RemoteTableProvider.java |  26 +-
 .../samza/table/utils/BaseTableProvider.java    |  11 +-
 .../table/utils/DefaultTableReadMetrics.java    |  11 +-
 .../table/utils/DefaultTableWriteMetrics.java   |  11 +-
 .../samza/table/utils/TableMetricsUtil.java     |  21 +-
 .../org/apache/samza/task/AsyncRunLoop.java     |   5 +-
 .../samza/task/AsyncStreamTaskAdapter.java      |   6 +-
 .../apache/samza/task/StreamOperatorTask.java   |  33 +-
 .../org/apache/samza/task/TaskFactoryUtil.java  |   4 +-
 .../samza/util/EmbeddedTaggedRateLimiter.java   |  30 +-
 .../apache/samza/container/SamzaContainer.scala |  71 +--
 .../apache/samza/container/TaskInstance.scala   |  53 ++-
 .../samza/job/local/ThreadJobFactory.scala      |   8 +-
 .../TestStreamApplicationDescriptorImpl.java    |  37 +-
 .../TestTaskApplicationDescriptorImpl.java      |  36 +-
 .../org/apache/samza/context/MockContext.java   |  73 ++++
 .../apache/samza/context/TestContextImpl.java   |  12 +-
 .../samza/context/TestTaskContextImpl.java      |  15 +-
 .../execution/ExecutionPlannerTestBase.java     |   2 +-
 .../samza/execution/TestExecutionPlanner.java   | 399 ++++++++++++++---
 .../TestIntermediateStreamManager.java          |  68 ---
 .../TestJobNodeConfigurationGenerator.java      |  22 +-
 .../samza/operators/TestJoinOperator.java       |  29 +-
 .../samza/operators/impl/TestOperatorImpl.java  |  52 ++-
 .../operators/impl/TestOperatorImplGraph.java   | 137 +++---
 .../operators/impl/TestSinkOperatorImpl.java    |   7 +-
 .../operators/impl/TestStreamOperatorImpl.java  |   6 -
 .../impl/TestStreamTableJoinOperatorImpl.java   |  17 +-
 .../operators/impl/TestWindowOperator.java      | 114 ++---
 .../samza/operators/spec/TestOperatorSpec.java  |   2 +-
 .../spec/TestPartitionByOperatorSpec.java       |   2 +-
 .../operators/spec/TestWindowOperatorSpec.java  |   9 +-
 .../samza/processor/TestStreamProcessor.java    |   9 +-
 .../samza/storage/MockStorageEngineFactory.java |  16 +-
 .../apache/samza/table/TestTableManager.java    |  18 +-
 .../samza/table/caching/TestCachingTable.java   |  48 +-
 .../samza/table/remote/TestRemoteTable.java     |  39 +-
 .../table/remote/TestRemoteTableDescriptor.java |  41 +-
 .../retry/TestRetriableTableFunctions.java      |  12 +-
 .../apache/samza/task/IdentityStreamTask.java   |   6 +-
 .../org/apache/samza/task/TestAsyncRunLoop.java |  44 +-
 .../samza/task/TestAsyncStreamAdapter.java      |   6 +-
 .../samza/task/TestEpochTimeScheduler.java      |   3 +-
 .../samza/task/TestStreamOperatorTask.java      |  27 ++
 .../util/TestEmbeddedTaggedRateLimiter.java     |  48 +-
 .../samza/container/TestSamzaContainer.scala    |  51 ++-
 .../samza/container/TestTaskInstance.scala      |  90 +++-
 .../processor/StreamProcessorTestUtils.scala    |  31 +-
 .../InMemoryKeyValueStorageEngineFactory.scala  |  13 +-
 .../samza/storage/kv/RocksDbKeyValueReader.java |  11 +-
 .../samza/storage/kv/RocksDbOptionsHelper.java  |  15 +-
 .../RocksDbKeyValueStorageEngineFactory.scala   |  23 +-
 .../storage/kv/TestRocksDbTableDescriptor.java  |   3 +-
 .../kv/BaseLocalStoreBackedTableProvider.java   |  18 +-
 .../kv/LocalStoreBackedReadWriteTable.java      |  10 +-
 .../kv/LocalStoreBackedReadableTable.java       |  10 +-
 .../kv/BaseKeyValueStorageEngineFactory.scala   |  41 +-
 .../TestBaseLocalStoreBackedTableProvider.java  |  18 +-
 .../apache/samza/config/Log4jSystemConfig.java  |  88 ++++
 .../samza/logging/log4j2/StreamAppender.java    | 436 +++++++++++++++++++
 .../logging/log4j2/StreamAppenderMetrics.java   |  43 ++
 .../serializers/LoggingEventJsonSerde.java      | 194 +++++++++
 .../LoggingEventJsonSerdeFactory.java           |  36 ++
 .../serializers/LoggingEventStringSerde.java    |  76 ++++
 .../LoggingEventStringSerdeFactory.java         |  32 ++
 .../samza/logging/log4j2/MockSystemAdmin.java   |  74 ++++
 .../samza/logging/log4j2/MockSystemFactory.java |  45 ++
 .../logging/log4j2/MockSystemProducer.java      |  61 +++
 .../log4j2/MockSystemProducerAppender.java      |  77 ++++
 .../logging/log4j2/TestStreamAppender.java      | 298 +++++++++++++
 .../TestLoggingEventStringSerde.java            |  52 +++
 samza-log4j2/src/test/resources/log4j2.xml      |  37 ++
 samza-rest/src/main/resources/log4j2.xml        |  40 ++
 samza-shell/src/main/bash/checkpoint-tool.sh    |   6 +-
 samza-shell/src/main/bash/kill-all.sh           |   8 +-
 .../src/main/bash/kill-yarn-job-by-name.sh      |   7 +-
 samza-shell/src/main/bash/kill-yarn-job.sh      |   6 +-
 samza-shell/src/main/bash/list-yarn-job.sh      |   6 +-
 samza-shell/src/main/bash/read-rocksdb-tool.sh  |   6 +-
 samza-shell/src/main/bash/run-app.sh            |   6 +-
 samza-shell/src/main/bash/run-class.sh          |  10 +-
 samza-shell/src/main/bash/run-config-manager.sh |   6 +-
 .../main/bash/run-coordinator-stream-writer.sh  |   6 +-
 samza-shell/src/main/bash/run-job.sh            |   6 +-
 samza-shell/src/main/bash/stat-yarn-job.sh      |   6 +-
 samza-shell/src/main/bash/state-storage-tool.sh |   6 +-
 samza-shell/src/main/bash/validate-yarn-job.sh  |   6 +-
 .../src/main/resources/log4j2-console.xml       |  35 ++
 .../sql/runner/SamzaSqlApplicationContext.java  |  44 ++
 .../samza/sql/translator/FilterTranslator.java  |   9 +-
 .../samza/sql/translator/ModifyTranslator.java  |  11 +-
 .../samza/sql/translator/ProjectTranslator.java |   8 +-
 .../samza/sql/translator/QueryTranslator.java   |  49 +--
 .../samza/sql/translator/ScanTranslator.java    |  11 +-
 .../apache/samza/sql/e2e/TestSamzaSqlTable.java |   1 -
 .../runner/TestSamzaSqlApplicationRunner.java   |   2 -
 .../samza/sql/system/TestAvroSystemFactory.java |   1 -
 .../sql/testutil/TestIOResolverFactory.java     |   1 -
 .../sql/testutil/TestSamzaSqlFileParser.java    |   2 -
 .../sql/translator/TestFilterTranslator.java    |  16 +-
 .../sql/translator/TestProjectTranslator.java   |  25 +-
 .../sql/translator/TestQueryTranslator.java     |  47 +-
 samza-sql/src/test/resources/log4j.xml          |   9 -
 samza-sql/src/test/resources/log4j2.xml         |  35 ++
 .../samza/example/KeyValueStoreExample.java     |   7 +-
 .../test/framework/MessageStreamAssert.java     |  15 +-
 .../test/integration/NegateNumberTask.java      |   9 +-
 .../test/integration/SimpleStatefulTask.java    |  13 +-
 .../test/integration/StatePerfTestTask.java     |   9 +-
 .../samza/test/integration/join/Checker.java    |  19 +-
 .../samza/test/integration/join/Emitter.java    |  23 +-
 .../samza/test/integration/join/Joiner.java     |  26 +-
 .../samza/test/integration/join/Watcher.java    |  17 +-
 samza-test/src/main/resources/log4j2.xml        |  41 ++
 .../performance/TestKeyValuePerformance.scala   |  21 +-
 .../test/performance/TestPerformanceTask.scala  |  19 +-
 .../processor/TestZkStreamProcessorBase.java    |   6 +-
 .../test/framework/FaultInjectionTest.java      |   1 -
 .../samza/test/framework/TestSchedulingApp.java |   2 +-
 .../test/processor/IdentityStreamTask.java      |   5 +-
 .../test/processor/TestStreamProcessor.java     |   5 +-
 .../apache/samza/test/table/TestLocalTable.java |  33 +-
 .../table/TestLocalTableWithSideInputs.java     |  28 +-
 .../samza/test/table/TestRemoteTable.java       |  27 +-
 .../table/TestTableDescriptorsProvider.java     |   3 +-
 .../test/integration/StreamTaskTestUtil.scala   |  19 +-
 .../integration/TestShutdownStatefulTask.scala  |   8 +-
 .../test/integration/TestStatefulTask.scala     |   8 +-
 samza-tools/src/main/resources/log4j.xml        |   9 -
 samza-tools/src/main/resources/log4j2.xml       |  32 ++
 settings.gradle                                 |   1 +
 222 files changed, 5649 insertions(+), 2121 deletions(-)
----------------------------------------------------------------------



[15/32] samza git commit: Merge branch 'master' of https://github.com/apache/samza

Posted by ja...@apache.org.
Merge branch 'master' of https://github.com/apache/samza


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/652260a4
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/652260a4
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/652260a4

Branch: refs/heads/master
Commit: 652260a4cdc5bfb3b89f35807210835d7d5de2f5
Parents: e384e5c 531b35e
Author: Jagadish <jv...@linkedin.com>
Authored: Fri Oct 5 14:33:03 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Fri Oct 5 14:33:03 2018 -0700

----------------------------------------------------------------------
 .../versioned/jobs/configuration-table.html     | 22 ++------
 .../samza/config/JobCoordinatorConfig.java      | 34 ++++++++----
 .../samza/config/TestJobCoordinatorConfig.java  | 58 --------------------
 .../samza/sql/testutil/SamzaSqlTestConfig.java  |  1 -
 .../apache/samza/test/framework/TestRunner.java |  3 -
 .../EndOfStreamIntegrationTest.java             |  2 -
 .../WatermarkIntegrationTest.java               |  2 -
 .../samza/test/framework/SchedulingTest.java    |  1 -
 .../operator/TestRepartitionJoinWindowApp.java  |  3 -
 .../test/operator/TestRepartitionWindowApp.java |  1 -
 .../apache/samza/test/table/TestLocalTable.java |  2 -
 11 files changed, 27 insertions(+), 102 deletions(-)
----------------------------------------------------------------------



[02/32] samza git commit: Reorganize website content, link hyper-links correctly, fix image links

Posted by ja...@apache.org.
Reorganize website content, link hyper-links correctly, fix image links


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/1bf8bf5a
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/1bf8bf5a
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/1bf8bf5a

Branch: refs/heads/master
Commit: 1bf8bf5a632cac7548223ab9d990ce6e70c1c2f0
Parents: 334d24e
Author: Jagadish <jv...@linkedin.com>
Authored: Mon Oct 1 15:44:22 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Mon Oct 1 15:45:42 2018 -0700

----------------------------------------------------------------------
 .../learn/documentation/container/jconsole.png  | Bin 145220 -> 0 bytes
 .../learn/documentation/operations/jconsole.png | Bin 0 -> 145220 bytes
 .../learn/documentation/operations/visualvm.png | Bin 0 -> 198050 bytes
 .../versioned/api/high-level-api.md             |  24 +
 .../versioned/api/low-level-api.md              |  52 ++
 .../documentation/versioned/api/samza-sql.md    |  52 ++
 .../architecture/architecture-overview.md       |  23 +
 .../versioned/architecture/kinesis.md           |  23 +
 .../documentation/versioned/aws/kinesis.md      | 124 ----
 .../versioned/connectors/eventhubs.md           |  24 +
 .../documentation/versioned/connectors/hdfs.md  |  24 +
 .../versioned/connectors/kinesis.md             | 124 ++++
 .../versioned/connectors/overview.md            |  24 +
 .../versioned/container/monitoring.md           | 612 -------------------
 .../versioned/core-concepts/core-concepts.md    |  23 +
 .../versioned/deployment/standalone.md          | 217 +++++++
 .../documentation/versioned/deployment/yarn.md  |  27 +
 docs/learn/documentation/versioned/index.html   |  35 +-
 .../versioned/operations/monitoring.md          | 612 +++++++++++++++++++
 19 files changed, 1264 insertions(+), 756 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/samza/blob/1bf8bf5a/docs/img/versioned/learn/documentation/container/jconsole.png
----------------------------------------------------------------------
diff --git a/docs/img/versioned/learn/documentation/container/jconsole.png b/docs/img/versioned/learn/documentation/container/jconsole.png
deleted file mode 100644
index 6058b16..0000000
Binary files a/docs/img/versioned/learn/documentation/container/jconsole.png and /dev/null differ

http://git-wip-us.apache.org/repos/asf/samza/blob/1bf8bf5a/docs/img/versioned/learn/documentation/operations/jconsole.png
----------------------------------------------------------------------
diff --git a/docs/img/versioned/learn/documentation/operations/jconsole.png b/docs/img/versioned/learn/documentation/operations/jconsole.png
new file mode 100644
index 0000000..6058b16
Binary files /dev/null and b/docs/img/versioned/learn/documentation/operations/jconsole.png differ

http://git-wip-us.apache.org/repos/asf/samza/blob/1bf8bf5a/docs/img/versioned/learn/documentation/operations/visualvm.png
----------------------------------------------------------------------
diff --git a/docs/img/versioned/learn/documentation/operations/visualvm.png b/docs/img/versioned/learn/documentation/operations/visualvm.png
new file mode 100644
index 0000000..4399d7f
Binary files /dev/null and b/docs/img/versioned/learn/documentation/operations/visualvm.png differ

http://git-wip-us.apache.org/repos/asf/samza/blob/1bf8bf5a/docs/learn/documentation/versioned/api/high-level-api.md
----------------------------------------------------------------------
diff --git a/docs/learn/documentation/versioned/api/high-level-api.md b/docs/learn/documentation/versioned/api/high-level-api.md
new file mode 100644
index 0000000..2a54215
--- /dev/null
+++ b/docs/learn/documentation/versioned/api/high-level-api.md
@@ -0,0 +1,24 @@
+---
+layout: page
+title: Streams DSL
+---
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+
+# High level API section 1
+

http://git-wip-us.apache.org/repos/asf/samza/blob/1bf8bf5a/docs/learn/documentation/versioned/api/low-level-api.md
----------------------------------------------------------------------
diff --git a/docs/learn/documentation/versioned/api/low-level-api.md b/docs/learn/documentation/versioned/api/low-level-api.md
new file mode 100644
index 0000000..c162ca2
--- /dev/null
+++ b/docs/learn/documentation/versioned/api/low-level-api.md
@@ -0,0 +1,52 @@
+---
+layout: page
+title: Low level API
+---
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+
+# Section 1
+
+# Sample Applications
+
+
+# Section 2
+
+# Section 3
+
+
+# Section 4
+
+The table below summarizes table metrics:
+
+
+| Metrics | Class | Description |
+|---------|-------|-------------|
+|`get-ns`|`ReadableTable`|Average latency of `get/getAsync()` operations|
+|`getAll-ns`|`ReadableTable`|Average latency of `getAll/getAllAsync()` operations|
+|`num-gets`|`ReadableTable`|Count of `get/getAsync()` operations|
+|`num-getAlls`|`ReadableTable`|Count of `getAll/getAllAsync()` operations|
+
+
+### Section 5 example
+
+It is up to the developer whether to implement both `TableReadFunction` and 
+`TableWriteFunction` in one class or two separate classes. Defining them in 
+separate classes can be cleaner if their implementations are elaborate and 
+extended, whereas keeping them in a single class may be more practical if 
+they share a considerable amount of code or are relatively short.
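
Editor's sketch of the single-class option described in the paragraph above. The nested `ReadFn` and `WriteFn` interfaces are simplified, hypothetical stand-ins and are not Samza's actual `TableReadFunction` / `TableWriteFunction` signatures. The in-memory map is likewise an assumption, chosen only to show the two paths sharing state.

```java
import java.util.HashMap;
import java.util.Map;

// Sketch only: ReadFn/WriteFn are hypothetical stand-ins, not Samza interfaces.
class TableFunctionsSketch {

  interface ReadFn<K, V> {
    V get(K key);
  }

  interface WriteFn<K, V> {
    void put(K key, V value);

    void delete(K key);
  }

  // Single-class variant: the read and write paths share one backing store,
  // which is the "shared code" case the text mentions.
  static class InMemoryReadWriteFn<K, V> implements ReadFn<K, V>, WriteFn<K, V> {
    private final Map<K, V> store = new HashMap<>();

    @Override
    public V get(K key) {
      return store.get(key);
    }

    @Override
    public void put(K key, V value) {
      store.put(key, value);
    }

    @Override
    public void delete(K key) {
      store.remove(key);
    }
  }

  public static void main(String[] args) {
    InMemoryReadWriteFn<String, Integer> fn = new InMemoryReadWriteFn<>();
    fn.put("page-views", 3);
    System.out.println(fn.get("page-views")); // value written above
    fn.delete("page-views");
    System.out.println(fn.get("page-views")); // null once the key is deleted
  }
}
```

Splitting this into two classes would mean passing the shared store (or a client handle) to both, which is why the single-class form tends to suit short implementations.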

http://git-wip-us.apache.org/repos/asf/samza/blob/1bf8bf5a/docs/learn/documentation/versioned/api/samza-sql.md
----------------------------------------------------------------------
diff --git a/docs/learn/documentation/versioned/api/samza-sql.md b/docs/learn/documentation/versioned/api/samza-sql.md
new file mode 100644
index 0000000..bad7545
--- /dev/null
+++ b/docs/learn/documentation/versioned/api/samza-sql.md
@@ -0,0 +1,52 @@
+---
+layout: page
+title: Samza SQL
+---
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+
+# Section 1
+
+# Sample Applications
+
+
+# Section 2
+
+# Section 3
+
+
+# Section 4
+
+The table below summarizes table metrics:
+
+
+| Metrics | Class | Description |
+|---------|-------|-------------|
+|`get-ns`|`ReadableTable`|Average latency of `get/getAsync()` operations|
+|`getAll-ns`|`ReadableTable`|Average latency of `getAll/getAllAsync()` operations|
+|`num-gets`|`ReadableTable`|Count of `get/getAsync()` operations|
+|`num-getAlls`|`ReadableTable`|Count of `getAll/getAllAsync()` operations|
+
+
+### Section 5 example
+
+It is up to the developer whether to implement both `TableReadFunction` and 
+`TableWriteFunction` in one class or two separate classes. Defining them in 
+separate classes can be cleaner if their implementations are elaborate and 
+extended, whereas keeping them in a single class may be more practical if 
+they share a considerable amount of code or are relatively short.

http://git-wip-us.apache.org/repos/asf/samza/blob/1bf8bf5a/docs/learn/documentation/versioned/architecture/architecture-overview.md
----------------------------------------------------------------------
diff --git a/docs/learn/documentation/versioned/architecture/architecture-overview.md b/docs/learn/documentation/versioned/architecture/architecture-overview.md
new file mode 100644
index 0000000..6c1fbb1
--- /dev/null
+++ b/docs/learn/documentation/versioned/architecture/architecture-overview.md
@@ -0,0 +1,23 @@
+---
+layout: page
+title: Architecture page
+---
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+## Samza architecture page
+

http://git-wip-us.apache.org/repos/asf/samza/blob/1bf8bf5a/docs/learn/documentation/versioned/architecture/kinesis.md
----------------------------------------------------------------------
diff --git a/docs/learn/documentation/versioned/architecture/kinesis.md b/docs/learn/documentation/versioned/architecture/kinesis.md
new file mode 100644
index 0000000..6c1fbb1
--- /dev/null
+++ b/docs/learn/documentation/versioned/architecture/kinesis.md
@@ -0,0 +1,23 @@
+---
+layout: page
+title: Architecture page
+---
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+## Samza architecture page
+

http://git-wip-us.apache.org/repos/asf/samza/blob/1bf8bf5a/docs/learn/documentation/versioned/aws/kinesis.md
----------------------------------------------------------------------
diff --git a/docs/learn/documentation/versioned/aws/kinesis.md b/docs/learn/documentation/versioned/aws/kinesis.md
deleted file mode 100644
index a866484..0000000
--- a/docs/learn/documentation/versioned/aws/kinesis.md
+++ /dev/null
@@ -1,124 +0,0 @@
----
-layout: page
-title: Kinesis Connector
----
-<!--
-   Licensed to the Apache Software Foundation (ASF) under one or more
-   contributor license agreements.  See the NOTICE file distributed with
-   this work for additional information regarding copyright ownership.
-   The ASF licenses this file to You under the Apache License, Version 2.0
-   (the "License"); you may not use this file except in compliance with
-   the License.  You may obtain a copy of the License at
-
-       http://www.apache.org/licenses/LICENSE-2.0
-
-   Unless required by applicable law or agreed to in writing, software
-   distributed under the License is distributed on an "AS IS" BASIS,
-   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-   See the License for the specific language governing permissions and
-   limitations under the License.
--->
-
-## Overview
-
-The Samza Kinesis connector provides access to [Amazon Kinesis Data Streams](https://aws.amazon.com/kinesis/data-streams),
-Amazon’s data streaming service. A Kinesis Data Stream is similar to a Kafka topic and can have multiple partitions.
-Each message consumed from a Kinesis Data Stream is an instance of [Record](http://docs.aws.amazon.com/goto/WebAPI/kinesis-2013-12-02/Record).
-Samza’s [KinesisSystemConsumer](https://github.com/apache/samza/blob/master/samza-aws/src/main/java/org/apache/samza/system/kinesis/consumer/KinesisSystemConsumer.java)
-wraps the Record into a [KinesisIncomingMessageEnvelope](https://github.com/apache/samza/blob/master/samza-aws/src/main/java/org/apache/samza/system/kinesis/consumer/KinesisIncomingMessageEnvelope.java).
-
-## Consuming from Kinesis
-
-### Basic Configuration
-
-You can configure your Samza jobs to process data from Kinesis streams. To configure a Samza job to consume from Kinesis
-streams, add the configuration below:
-
-{% highlight jproperties %}
-# define a kinesis system factory with your identifier, e.g. kinesis-system
-systems.kinesis-system.samza.factory=org.apache.samza.system.kinesis.KinesisSystemFactory
-
-# the kinesis system consumer works only with AllSspToSingleTaskGrouperFactory
-job.systemstreampartition.grouper.factory=org.apache.samza.container.grouper.stream.AllSspToSingleTaskGrouperFactory
-
-# define your streams
-task.inputs=kinesis-system.input0
-
-# define required properties for your streams
-systems.kinesis-system.streams.input0.aws.region=YOUR-STREAM-REGION
-systems.kinesis-system.streams.input0.aws.accessKey=YOUR-ACCESS-KEY
-sensitive.systems.kinesis-system.streams.input0.aws.secretKey=YOUR-SECRET-KEY
-{% endhighlight %}
-
-The tuple required to access the Kinesis data stream must be provided, namely the following fields:<br>
-**YOUR-STREAM-REGION**, **YOUR-ACCESS-KEY**, **YOUR-SECRET-KEY**.
-
-
-### Advanced Configuration
-
-#### AWS Client configs
-You can configure any [AWS client config](http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/ClientConfiguration.html)
-with the prefix **systems.system-name.aws.clientConfig.***
-
-{% highlight jproperties %}
-systems.system-name.aws.clientConfig.CONFIG-PARAM=CONFIG-VALUE
-{% endhighlight %}
-
-As an example, to set a *proxy host* and *proxy port* for the AWS Client:
-
-{% highlight jproperties %}
-systems.system-name.aws.clientConfig.ProxyHost=my-proxy-host.com
-systems.system-name.aws.clientConfig.ProxyPort=my-proxy-port
-{% endhighlight %}
-
-#### Kinesis Client Library Configs
-Samza Kinesis Connector uses [Kinesis Client Library](https://docs.aws.amazon.com/streams/latest/dev/developing-consumers-with-kcl.html#kinesis-record-processor-overview-kcl)
-(KCL) to access the Kinesis data streams. You can set any [Kinesis Client Lib Configuration](https://github.com/awslabs/amazon-kinesis-client/blob/master/amazon-kinesis-client-multilang/src/main/java/software/amazon/kinesis/coordinator/KinesisClientLibConfiguration.java)
-for a stream by configuring it under **systems.system-name.streams.stream-name.aws.kcl.***
-
-{% highlight jproperties %}
-systems.system-name.streams.stream-name.aws.kcl.CONFIG-PARAM=CONFIG-VALUE
-{% endhighlight %}
-
-Obtain the config param from the public functions in [Kinesis Client Lib Configuration](https://github.com/awslabs/amazon-kinesis-client/blob/master/amazon-kinesis-client-multilang/src/main/java/software/amazon/kinesis/coordinator/KinesisClientLibConfiguration.java)
-by removing the *"with"* prefix. For example: config param corresponding to **withTableName()** is **TableName**.
-
-### Resetting Offsets
-
-The source of truth for checkpointing while using Kinesis Connector is not the Samza checkpoint topic but Kinesis itself.
-The Kinesis Client Library (KCL) [uses DynamoDB](https://docs.aws.amazon.com/streams/latest/dev/kinesis-record-processor-ddb.html)
-to store its checkpoints. By default, the Kinesis Connector reads from the latest offset in the stream.
-
-To reset the checkpoints and consume from earliest/latest offset of a Kinesis data stream, please change the KCL TableName
-and set the appropriate starting position for the stream as shown below.
-
-{% highlight jproperties %}
-# change the TableName to a unique name to reset the checkpoint
-systems.kinesis-system.streams.input0.aws.kcl.TableName=my-app-table-name
-# set the starting position to either TRIM_HORIZON (oldest) or LATEST (newest)
-systems.kinesis-system.streams.input0.aws.kcl.InitialPositionInStream=my-start-position
-{% endhighlight %}
-
-To manipulate checkpoints to start from a particular position in the Kinesis stream, in lieu of Samza CheckpointTool,
-please log in to the AWS Console and change the offsets in the DynamoDB table with the table name that you have specified
-in the config above. By default, the table name has the following format:
-"\<job name\>-\<job id\>-\<kinesis stream\>".
-
-### Known Limitations
-
-The following limitations apply to Samza jobs consuming from Kinesis streams using the Samza consumer:
-
-- Stateful processing (e.g., windows or joins) is not supported on Kinesis streams. However, you can accomplish this by
-chaining two Samza jobs where the first job reads from Kinesis and sends to Kafka while the second job processes the
-data from Kafka.
-- Kinesis streams cannot be configured as [bootstrap](https://samza.apache.org/learn/documentation/latest/container/streams.html)
-or [broadcast](https://samza.apache.org/learn/documentation/latest/container/samza-container.html) streams.
-- Kinesis streams must be used ONLY with the [AllSspToSingleTaskGrouperFactory](https://github.com/apache/samza/blob/master/samza-core/src/main/java/org/apache/samza/container/grouper/stream/AllSspToSingleTaskGrouperFactory.java)
-as the Kinesis consumer does the partition management by itself. No other grouper is supported.
-- A Samza job that consumes from Kinesis cannot consume from any other input source. However, you can send your results
-to any destination (eg: Kafka, EventHubs), and have another Samza job consume them.
-
-## Producing to Kinesis
-
-The KinesisSystemProducer for Samza is not yet implemented.
-
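
Editor's sketch of the KCL key-naming rule stated in the Kinesis connector page (drop the "with" prefix from the builder method name and place the remainder under `systems.system-name.streams.stream-name.aws.kcl.`), together with the default DynamoDB table-name format the page gives. The class and method names here are hypothetical helpers for illustration; they are not part of Samza or the KCL.

```java
// Sketch only: KinesisConfigKeys and its methods are hypothetical helpers.
class KinesisConfigKeys {

  // Maps a KCL builder method name to its config param,
  // e.g. "withTableName" becomes "TableName".
  static String kclParam(String builderMethod) {
    if (!builderMethod.startsWith("with") || builderMethod.length() == 4) {
      throw new IllegalArgumentException("expected a with-prefixed method name: " + builderMethod);
    }
    return builderMethod.substring(4);
  }

  // Builds the per-stream Samza config key for a KCL setting.
  static String kclKey(String system, String stream, String builderMethod) {
    return "systems." + system + ".streams." + stream + ".aws.kcl." + kclParam(builderMethod);
  }

  // Default checkpoint table name: <job name>-<job id>-<kinesis stream>.
  static String defaultTableName(String jobName, String jobId, String stream) {
    return jobName + "-" + jobId + "-" + stream;
  }

  public static void main(String[] args) {
    // prints systems.kinesis-system.streams.input0.aws.kcl.TableName
    System.out.println(kclKey("kinesis-system", "input0", "withTableName"));
    // prints my-job-1-input0
    System.out.println(defaultTableName("my-job", "1", "input0"));
  }
}
```

Knowing the default table name is useful when resetting offsets, since that is the DynamoDB table to inspect when no `TableName` override is configured.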

http://git-wip-us.apache.org/repos/asf/samza/blob/1bf8bf5a/docs/learn/documentation/versioned/connectors/eventhubs.md
----------------------------------------------------------------------
diff --git a/docs/learn/documentation/versioned/connectors/eventhubs.md b/docs/learn/documentation/versioned/connectors/eventhubs.md
new file mode 100644
index 0000000..b99b46d
--- /dev/null
+++ b/docs/learn/documentation/versioned/connectors/eventhubs.md
@@ -0,0 +1,24 @@
+---
+layout: page
+title: Eventhubs Connector
+---
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+# Section 1
+# Section 2
+

http://git-wip-us.apache.org/repos/asf/samza/blob/1bf8bf5a/docs/learn/documentation/versioned/connectors/hdfs.md
----------------------------------------------------------------------
diff --git a/docs/learn/documentation/versioned/connectors/hdfs.md b/docs/learn/documentation/versioned/connectors/hdfs.md
new file mode 100644
index 0000000..a78c4aa
--- /dev/null
+++ b/docs/learn/documentation/versioned/connectors/hdfs.md
@@ -0,0 +1,24 @@
+---
+layout: page
+title: HDFS Connector
+---
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+# Section 1
+# Section 2
+

http://git-wip-us.apache.org/repos/asf/samza/blob/1bf8bf5a/docs/learn/documentation/versioned/connectors/kinesis.md
----------------------------------------------------------------------
diff --git a/docs/learn/documentation/versioned/connectors/kinesis.md b/docs/learn/documentation/versioned/connectors/kinesis.md
new file mode 100644
index 0000000..a866484
--- /dev/null
+++ b/docs/learn/documentation/versioned/connectors/kinesis.md
@@ -0,0 +1,124 @@
+---
+layout: page
+title: Kinesis Connector
+---
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+## Overview
+
+The Samza Kinesis connector provides access to [Amazon Kinesis Data Streams](https://aws.amazon.com/kinesis/data-streams),
+Amazon’s data streaming service. A Kinesis Data Stream is similar to a Kafka topic and can have multiple partitions.
+Each message consumed from a Kinesis Data Stream is an instance of [Record](http://docs.aws.amazon.com/goto/WebAPI/kinesis-2013-12-02/Record).
+Samza’s [KinesisSystemConsumer](https://github.com/apache/samza/blob/master/samza-aws/src/main/java/org/apache/samza/system/kinesis/consumer/KinesisSystemConsumer.java)
+wraps the Record into a [KinesisIncomingMessageEnvelope](https://github.com/apache/samza/blob/master/samza-aws/src/main/java/org/apache/samza/system/kinesis/consumer/KinesisIncomingMessageEnvelope.java).
+
+## Consuming from Kinesis
+
+### Basic Configuration
+
+You can configure your Samza jobs to process data from Kinesis streams. To configure a Samza job to consume from Kinesis
+streams, add the configuration below:
+
+{% highlight jproperties %}
+# define a kinesis system factory with your identifier, e.g. kinesis-system
+systems.kinesis-system.samza.factory=org.apache.samza.system.kinesis.KinesisSystemFactory
+
+# the kinesis system consumer works only with AllSspToSingleTaskGrouperFactory
+job.systemstreampartition.grouper.factory=org.apache.samza.container.grouper.stream.AllSspToSingleTaskGrouperFactory
+
+# define your streams
+task.inputs=kinesis-system.input0
+
+# define required properties for your streams
+systems.kinesis-system.streams.input0.aws.region=YOUR-STREAM-REGION
+systems.kinesis-system.streams.input0.aws.accessKey=YOUR-ACCESS-KEY
+sensitive.systems.kinesis-system.streams.input0.aws.secretKey=YOUR-SECRET-KEY
+{% endhighlight %}
+
+The tuple required to access the Kinesis data stream must be provided, namely the following fields:<br>
+**YOUR-STREAM-REGION**, **YOUR-ACCESS-KEY**, **YOUR-SECRET-KEY**.
+
+
+### Advanced Configuration
+
+#### AWS Client configs
+You can configure any [AWS client config](http://docs.aws.amazon.com/AWSJavaSDK/latest/javadoc/com/amazonaws/ClientConfiguration.html)
+with the prefix **systems.system-name.aws.clientConfig.***
+
+{% highlight jproperties %}
+systems.system-name.aws.clientConfig.CONFIG-PARAM=CONFIG-VALUE
+{% endhighlight %}
+
+As an example, to set a *proxy host* and *proxy port* for the AWS Client:
+
+{% highlight jproperties %}
+systems.system-name.aws.clientConfig.ProxyHost=my-proxy-host.com
+systems.system-name.aws.clientConfig.ProxyPort=my-proxy-port
+{% endhighlight %}
+
+#### Kinesis Client Library Configs
+Samza Kinesis Connector uses [Kinesis Client Library](https://docs.aws.amazon.com/streams/latest/dev/developing-consumers-with-kcl.html#kinesis-record-processor-overview-kcl)
+(KCL) to access the Kinesis data streams. You can set any [Kinesis Client Lib Configuration](https://github.com/awslabs/amazon-kinesis-client/blob/master/amazon-kinesis-client-multilang/src/main/java/software/amazon/kinesis/coordinator/KinesisClientLibConfiguration.java)
+for a stream by configuring it under **systems.system-name.streams.stream-name.aws.kcl.***
+
+{% highlight jproperties %}
+systems.system-name.streams.stream-name.aws.kcl.CONFIG-PARAM=CONFIG-VALUE
+{% endhighlight %}
+
+Obtain the config param from the public functions in [Kinesis Client Lib Configuration](https://github.com/awslabs/amazon-kinesis-client/blob/master/amazon-kinesis-client-multilang/src/main/java/software/amazon/kinesis/coordinator/KinesisClientLibConfiguration.java)
+by removing the *"with"* prefix. For example: config param corresponding to **withTableName()** is **TableName**.
+
+### Resetting Offsets
+
+The source of truth for checkpointing while using the Kinesis Connector is not the Samza checkpoint topic but Kinesis itself.
+The Kinesis Client Library (KCL) [uses DynamoDB](https://docs.aws.amazon.com/streams/latest/dev/kinesis-record-processor-ddb.html)
+to store its checkpoints. By default, the Kinesis Connector reads from the latest offset in the stream.
+
+To reset the checkpoints and consume from the earliest or latest offset of a Kinesis data stream, change the KCL TableName
+and set the appropriate starting position for the stream as shown below.
+
+{% highlight jproperties %}
+# Change the TableName to a unique name to reset the checkpoint.
+systems.kinesis-system.streams.input0.aws.kcl.TableName=my-app-table-name
+# Set the starting position to either TRIM_HORIZON (oldest) or LATEST (newest).
+systems.kinesis-system.streams.input0.aws.kcl.InitialPositionInStream=my-start-position
+{% endhighlight %}
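+
+For instance, a minimal sketch of replaying a stream from its oldest retained record could look like the following. The table name suffix is a hypothetical naming choice; any name the job has not used before should work.
+
+{% highlight jproperties %}
+# A previously unused table name means KCL finds no stored checkpoints (the name is hypothetical)
+systems.kinesis-system.streams.input0.aws.kcl.TableName=my-app-table-name-v2
+# TRIM_HORIZON starts from the oldest record still retained by the stream
+systems.kinesis-system.streams.input0.aws.kcl.InitialPositionInStream=TRIM_HORIZON
+{% endhighlight %}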
+
+To manipulate checkpoints to start from a particular position in the Kinesis stream, do not use the Samza CheckpointTool.
+Instead, log in to the AWS Console and change the offsets in the DynamoDB table whose name you specified
+in the config above. By default, the table name has the following format:
+"\<job name\>-\<job id\>-\<kinesis stream\>".
+
+### Known Limitations
+
+The following limitations apply to Samza jobs consuming from Kinesis streams using the Samza consumer:
+
+- Stateful processing (e.g., windows or joins) is not supported on Kinesis streams. However, you can accomplish this by
+chaining two Samza jobs where the first job reads from Kinesis and sends to Kafka while the second job processes the
+data from Kafka.
+- Kinesis streams cannot be configured as [bootstrap](https://samza.apache.org/learn/documentation/latest/container/streams.html)
+or [broadcast](https://samza.apache.org/learn/documentation/latest/container/samza-container.html) streams.
+- Kinesis streams must be used ONLY with the [AllSspToSingleTaskGrouperFactory](https://github.com/apache/samza/blob/master/samza-core/src/main/java/org/apache/samza/container/grouper/stream/AllSspToSingleTaskGrouperFactory.java)
+because the Kinesis consumer manages partitions itself. No other grouper is supported.
+- A Samza job that consumes from Kinesis cannot consume from any other input source. However, you can send your results
+to any destination (e.g., Kafka, EventHubs), and have another Samza job consume them.
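+
+The grouper requirement above can be sketched in the job config as follows. This assumes the **job.systemstreampartition.grouper.factory** property and the package name implied by the linked source file; verify both against your Samza version.
+
+{% highlight jproperties %}
+# Use the only grouper supported for Kinesis inputs (property name and package are assumptions to verify)
+job.systemstreampartition.grouper.factory=org.apache.samza.container.grouper.stream.AllSspToSingleTaskGrouperFactory
+{% endhighlight %}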
+
+## Producing to Kinesis
+
+The KinesisSystemProducer for Samza is not yet implemented.
+

http://git-wip-us.apache.org/repos/asf/samza/blob/1bf8bf5a/docs/learn/documentation/versioned/connectors/overview.md
----------------------------------------------------------------------
diff --git a/docs/learn/documentation/versioned/connectors/overview.md b/docs/learn/documentation/versioned/connectors/overview.md
new file mode 100644
index 0000000..579c494
--- /dev/null
+++ b/docs/learn/documentation/versioned/connectors/overview.md
@@ -0,0 +1,24 @@
+---
+layout: page
+title: Connectors overview
+---
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+# Section 1
+# Section 2
+

http://git-wip-us.apache.org/repos/asf/samza/blob/1bf8bf5a/docs/learn/documentation/versioned/container/monitoring.md
----------------------------------------------------------------------
diff --git a/docs/learn/documentation/versioned/container/monitoring.md b/docs/learn/documentation/versioned/container/monitoring.md
deleted file mode 100644
index af6ec77..0000000
--- a/docs/learn/documentation/versioned/container/monitoring.md
+++ /dev/null
@@ -1,612 +0,0 @@
----
-layout: page
-title: Monitoring
----
-<!--
-   Licensed to the Apache Software Foundation (ASF) under one or more
-   contributor license agreements.  See the NOTICE file distributed with
-   this work for additional information regarding copyright ownership.
-   The ASF licenses this file to You under the Apache License, Version 2.0
-   (the "License"); you may not use this file except in compliance with
-   the License.  You may obtain a copy of the License at
-
-       http://www.apache.org/licenses/LICENSE-2.0
-
-   Unless required by applicable law or agreed to in writing, software
-   distributed under the License is distributed on an "AS IS" BASIS,
-   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-   See the License for the specific language governing permissions and
-   limitations under the License.
--->
-
-# Monitoring Samza Applications
-
-This section provides details on monitoring of Samza jobs, not to be confused with _Samza Monitors_ (components of the Samza-REST service that provide cluster-wide monitoring capabilities).
-
-
-
-Like any other production software, it is critical to monitor the health of our Samza jobs. Samza relies on metrics for monitoring and includes an extensible metrics library. While a few standard metrics are provided out-of-the-box, it is easy to define metrics specific to your application.
-
-
-* [A. Metrics Reporters](#a-metrics-reporters)
-  + [A.1 Reporting Metrics to JMX (JMX Reporter)](#jmxreporter)
-    + [Enabling the JMX Reporter](#enablejmxreporter)
-    - [Using the JMX Reporter](#usejmxreporter)
-  + [A.2 Reporting Metrics to Kafka (MetricsSnapshot Reporter)](#snapshotreporter)
-    - [Enabling the MetricsSnapshot Reporter](#enablesnapshotreporter)
-  + [A.3 Creating a Custom MetricsReporter](#customreporter)
-* [B. Metric Types in Samza](#metrictypes)
-* [C. Adding User-Defined Metrics](#userdefinedmetrics)
-  + [Low-level API](#lowlevelapi)
-  + [High-Level API](#highlevelapi)
-* [D. Key Internal Samza Metrics](#keyinternalsamzametrics)
-  + [D.1 Vital Metrics](#vitalmetrics)
-  + [D.2 Store Metrics](#storemetrics)
-  + [D.3 Operator Metrics](#operatormetrics)
-* [E. Metrics Reference Sheet](#metricssheet)
-
-## A. Metrics Reporters
-
-Samza&#39;s metrics library encapsulates the metrics collection and sampling logic. Metrics Reporters in Samza are responsible for emitting metrics to external services which may archive, process, visualize the metrics&#39; values, or trigger alerts based on them.
-
-Samza includes default implementations for two such Metrics Reporters:
-
-1. A _JMXReporter_ (detailed [below](#jmxreporter)) which allows using standard JMX clients for probing containers to retrieve metrics encoded as JMX MBeans. Visualization tools such as [Grafana](https://grafana.com/dashboards/3457) could also be used to visualize this JMX data.
-
-2. A _MetricsSnapshot_ reporter (detailed [below](#snapshotreporter)) which allows periodically publishing all metrics to Kafka. A downstream Samza job could then consume and publish these metrics to other metrics management systems such as [Prometheus](https://prometheus.io/) and [Graphite](https://graphiteapp.org/).
-
-Note that Samza allows multiple Metrics Reporters to be used simultaneously.
-
-
-### <a name="jmxreporter"></a> A.1 Reporting Metrics to JMX (JMX Reporter)
-
-This reporter encodes all its internal and user-defined metrics as JMX MBeans and hosts a JMX MBean server. Standard JMX clients (such as JConsole, VisualVM) can thus be used to probe Samza&#39;s containers and YARN-ApplicationMaster to retrieve these metrics&#39; values. JMX also provides additional profiling capabilities (e.g., for CPU and memory utilization), which are also enabled by this reporter.
-
-#### <a name="enablejmxreporter"></a> Enabling the JMX Reporter
-JMXReporter can be enabled by adding the following configuration.
-
-```
-# Define a Samza metrics reporter called "jmx", which publishes to JMX
-metrics.reporter.jmx.class=org.apache.samza.metrics.reporter.JmxReporterFactory
-
-# Use the jmx reporter (if using multiple reporters, separate them with commas)
-metrics.reporters=jmx
-
-```
-
-#### <a name="usejmxreporter"></a> Using the JMX Reporter
-
-To connect to the JMX MBean server, first obtain the JMX Server URL and port, published in the container logs:
-
-
-```
-
-2018-08-14 11:30:49.888 [main] JmxServer [INFO] Started JmxServer registry port=54661 server port=54662 url=service:jmx:rmi://localhost:54662/jndi/rmi://localhost:54661/jmxrmi
-
-```
-
-
-If using the **JConsole** JMX client, launch it with the service URL as:
-
-```
-jconsole service:jmx:rmi://localhost:54662/jndi/rmi://localhost:54661/jmxrmi
-```
-
-<img src="/img/versioned/learn/documentation/container/jconsole.png" alt="JConsole" class="diagram-large">
-
- 
-
-If using the VisualVM JMX client, run:
-
-```
-jvisualvm
-```
-
-After **VisualVM** starts, click the &quot;Add JMX Connection&quot; button and paste in your JMX server URL (obtained from the logs).
-Install the VisualVM-MBeans plugin (Tools->Plugin) to view the metrics MBeans.
-
-<img src="/img/versioned/learn/documentation/container/visualvm.png" alt="VisualVM" class="diagram-small">
-
- 
-###  <a name="snapshotreporter"></a> A.2 Reporting Metrics to Kafka (MetricsSnapshot Reporter)
-
-This reporter publishes metrics to Kafka.
-
-#### <a name="enablesnapshotreporter"></a> Enabling the MetricsSnapshot Reporter
-To enable this reporter, simply append the following to your job&#39;s configuration.
-
-```
-#Define a metrics reporter called "snapshot"
-metrics.reporters=snapshot
-metrics.reporter.snapshot.class=org.apache.samza.metrics.reporter.MetricsSnapshotReporterFactory
-```
-
-
-Specify the Kafka topic to which the reporter should publish:
-
-```
-metrics.reporter.snapshot.stream=kafka.metrics
-```
-
-
-Specify the serializer to be used for the metrics data:
-
-```
-serializers.registry.metrics.class=org.apache.samza.serializers.MetricsSnapshotSerdeFactory
-systems.kafka.streams.metrics.samza.msg.serde=metrics
-```
-With this configuration, all containers (including the YARN-ApplicationMaster) will publish their JSON-encoded metrics 
-to a Kafka topic called &quot;metrics&quot; every 60 seconds.
-The following is an example of such a metrics message:
-
-```
-{
-  "header": {
-    "container-name": "samza-container-0",
-    "exec-env-container-id": "YARN-generated containerID",
-    "host": "samza-grid-1234.example.com",
-    "job-id": "1",
-    "job-name": "my-samza-job",
-    "reset-time": 1401729000347,
-    "samza-version": "0.0.1",
-    "source": "TaskName-Partition1",
-    "time": 1401729420566,
-    "version": "0.0.1"
-  },
-  "metrics": {
-    "org.apache.samza.container.TaskInstanceMetrics": {
-      "commit-calls": 1,
-      "window-calls": 0,
-      "process-calls": 14,
-      "messages-actually-processed": 14,
-      "send-calls": 0,
-      "flush-calls": 1,
-      "pending-messages": 0,
-      "messages-in-flight": 0,
-      "async-callback-complete-calls": 14,
-      "wikipedia-#en.wikipedia-0-offset": 8979
-    }
-  }
-}
-```
-
-
-Each message contains a header which includes information about the job, time, and container from which the metrics were obtained. 
-The remainder of the message contains the metric values, grouped by their types, such as TaskInstanceMetrics, SamzaContainerMetrics,  KeyValueStoreMetrics, JVMMetrics, etc. Detailed descriptions of the various metric categories and metrics are available [here](#metricssheet).
-
-The MetricsSnapshot reporter can be configured to use a different serializer with this configuration:
-
-```
-serializers.registry.metrics.class=<classpath-to-my-custom-serializer-factory>
-```
-
-
-
-To configure the reporter to publish at a different frequency (default 60 seconds), add the following to your job&#39;s configuration:
-
-```
-metrics.reporter.snapshot.interval=<publish frequency in seconds>
-```
-
-Similarly, to limit the set of metrics emitted, you can use the regex-based blacklist supported by this reporter. For example, to limit it to publishing only SamzaContainerMetrics, use:
-
-```
-metrics.reporter.snapshot.blacklist=^(?!.*?(?:SamzaContainerMetrics)).*$
-```
-
-
-### <a name="customreporter"></a> A.3 Creating a Custom MetricsReporter
-
-Creating a custom MetricsReporter entails implementing the MetricsReporter interface. The lifecycle of Metrics Reporters is managed by Samza and is aligned with the Samza container lifecycle. Metrics Reporters can poll metric values and can receive callbacks when new metrics are added at runtime, e.g., user-defined metrics. Metrics Reporters are responsible for maintaining executor pools, IO connections, and any in-memory state that they require in order to export metrics to the desired external system, and managing the lifecycles of such components.
-
-After implementation, a custom reporter can be enabled by appending the following to the Samza job&#39;s configuration:
-
-```
-#Define a metrics reporter with a desired name
-metrics.reporter.<my-custom-reporter-name>.class=<classpath-of-my-custom-reporter-factory>
-
-
-#Enable its use for metrics reporting
-metrics.reporters=<my-custom-reporter-name>
-```
-
-
-
-## <a name="metrictypes"></a> B. Metric Types in Samza 
-
-Metrics in Samza are divided into three types -- _Gauges_, _Counters_, and _Timers_.
-
-_Gauges_ are useful when measuring the magnitude of a certain system property, e.g., the current queue length, or a buffer size.
-
-_Counters_ are useful in measuring metrics that are cumulative values, e.g., the number of messages processed since container startup. Certain counters are also useful when visualized with their rate-of-change, e.g., the rate of message processing.
-
-_Timers_ are useful for storing and reporting a sliding-window of timing values.
-
-Samza also supports a _ListGauge_ type metric, which can be used to store and report a list of any primitive-type such as strings.
-
-## <a name="userdefinedmetrics"></a> C. Adding User-Defined Metrics
-
-
-To add a new metric, you can simply use the _MetricsRegistry_ in the provided TaskContext of 
-the init() method to register new metrics. The code snippets below show examples of registering and updating a user-defined
- Counter metric. Timers and gauges can similarly be used from within your task class.
-
-### <a name="lowlevelapi"></a> Low-level API
-
-Simply have your task implement the InitableTask interface and access the MetricsRegistry from the TaskContext.
-
-```
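-import org.apache.samza.config.Config;
-import org.apache.samza.metrics.Counter;
-import org.apache.samza.system.IncomingMessageEnvelope;
-import org.apache.samza.task.InitableTask;
-import org.apache.samza.task.MessageCollector;
-import org.apache.samza.task.StreamTask;
-import org.apache.samza.task.TaskContext;
-import org.apache.samza.task.TaskCoordinator;
-
-public class MyJavaStreamTask implements StreamTask, InitableTask {
-
-  private Counter messageCount;
-
-  @Override
-  public void init(Config config, TaskContext context) {
-    // Register a counter in a group named after this class
-    this.messageCount = context.getMetricsRegistry().newCounter(getClass().getName(), "message-count");
-  }
-
-  @Override
-  public void process(IncomingMessageEnvelope envelope, MessageCollector collector, TaskCoordinator coordinator) {
-    messageCount.inc();
-  }
-}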
-```
-
-### <a name="highlevelapi"></a> High-Level API
-
-In the high-level API, you can define a ContextManager and access the MetricsRegistry from the TaskContext, using which you can add and update your metrics.
-
-```
-public class MyJavaStreamApp implements StreamApplication {
-
-  private Counter messageCount = null;
-
-  @Override
-  public void init(StreamGraph graph, Config config) {
-    graph.withContextManager(new DemoContextManager());
-    MessageStream<IndexedRecord> viewEvent = ...;
-    viewEvent
-        .map(this::countMessage)
-        ...;
-  }
-
-  // Declared on the application class so that this::countMessage resolves
-  private IndexedRecord countMessage(IndexedRecord value) {
-    messageCount.inc();
-    return value;
-  }
-
-  public final class DemoContextManager implements ContextManager {
-
-    @Override
-    public void init(Config config, TaskContext context) {
-      messageCount = context.getMetricsRegistry()
-          .newCounter(getClass().getName(), "message-count");
-    }
-
-    @Override
-    public void close() { }
-  }
-}
-```
-
-## <a name="keyinternalsamzametrics"></a> D. Key Internal Samza Metrics
-
-Samza&#39;s internal metrics allow for detailed monitoring of a Samza job and all its components. Detailed descriptions 
-of all internal metrics are listed in a reference sheet [here](#e-metrics-reference-sheet). 
-However, a small subset of internal metrics facilitates easy high-level monitoring of a job.
-
-These key metrics can be grouped into three categories: _Vital metrics_, _Store metrics_, and _Operator metrics_. 
-We explain each of these categories in detail below.
-
-### <a name="vitalmetrics"></a> D.1. Vital Metrics
-
-These metrics indicate the vital signs of a Samza job&#39;s health. Note that these metrics are categorized into different groups based on the Samza component that emits them (e.g., SamzaContainerMetrics, TaskInstanceMetrics, ApplicationMaster metrics, etc.).
-
-| **Metric Name** | **Group** | **Meaning** |
-| --- | --- | --- |
-| **Availability -- Are there any resource failures impacting my job?** |
-| job-healthy | ContainerProcessManagerMetrics | A binary value, where 1 indicates that all the required containers configured for a job are running, 0 otherwise. |
-| failed-containers | ContainerProcessManagerMetrics  | Number of containers that have failed in the job&#39;s lifetime |
-| **Input Processing Lag -- Is my job lagging?** |
-| \<Topic\>-\<Partition\>-messages-behind-high-watermark | KafkaSystemConsumerMetrics | Number of input messages waiting to be processed on an input topic-partition |
-| consumptionLagMs | EventHubSystemConsumer | Time difference between the processing and enqueuing (into EventHub)  of input events |
-| millisBehindLatest | KinesisSystemConsumerMetrics | Current processing lag measured from the tip of the stream, expressed in milliseconds. |
-| **Output/Produce Errors -- Is my job failing to produce output?** |
-| producer-send-failed | KafkaSystemProducerMetrics | Number of send requests to Kafka (e.g., output topics) that failed due to unrecoverable errors |
-| flush-failed | HdfsSystemProducerMetrics | Number of failed flushes to HDFS |
-| **Processing Time -- Is my job spending too much time processing inputs?** |
-| process-ns | SamzaContainerMetrics | Amount of time the job is spending in processing each input |
-| commit-ns | SamzaContainerMetrics | Amount of time the job is spending in checkpointing inputs (and flushing producers, checkpointing KV stores, flushing side input stores). The frequency of this function is configured using _task.commit.ms_ |
-| window-ns | SamzaContainerMetrics | In case of WindowableTasks being used, amount of time the job is spending in its window() operations |
-
-### <a name="storemetrics"></a>  D.2. Store Metrics
-
-Stateful Samza jobs typically use RocksDB backed KV stores for storing state. Therefore, timing metrics associated with 
-KV stores can be useful for monitoring input lag. These are some key metrics for KV stores. 
-The metrics reference sheet [here](#e-metrics-reference-sheet) details all metrics for KV stores.
-
-
-
-| **Metric name** | **Group** | **Meaning** |
-| --- | --- | --- |
-| get-ns, put-ns, delete-ns, all-ns | KeyValueStorageEngineMetrics | Time spent performing respective KV store operations |
-
-
-
-### <a name="operatormetrics"></a>  D.3. Operator Metrics
-
-If your Samza job uses Samza&#39;s Fluent API or Samza-SQL, Samza creates a DAG (directed acyclic graph) of 
-_operators_ to form the required data processing pipeline. In such cases, operator metrics allow fine-grained 
-monitoring of such operators. Key operator metrics are listed below, while a detailed list is present 
-in the metrics reference sheet.
-
-| **Metric name** | **Group** | **Meaning** |
-| --- | --- | --- |
-| \<Operator-ID\>-handle-message-ns | WindowOperatorImpl, PartialJoinOperatorImpl, StreamOperatorImpl, StreamTableJoinOperatorImpl, etc. | Time spent handling a given input message by the operator |
-
-
-
-## <a name="metricssheet"></a>  E. Metrics Reference Sheet
-The suffixes &quot;-ms&quot; and &quot;-ns&quot; on metric names indicate milliseconds and nanoseconds respectively. All &quot;average time&quot; metrics are calculated over a sliding time window of 300 seconds.
-
-All \<system\>, \<stream\>, \<partition\>, \<store-name\>, \<topic\>, are populated with the corresponding actual values at runtime.
-
-
-| **Group** | **Metric name** | **Meaning** |
-| --- | --- | --- |
-| **ContainerProcessManagerMetrics** | running-containers | Total number of running containers. |
-| | needed-containers | Number of containers needed for the job to be declared healthy. |
-| | completed-containers | Number of containers that have completed their execution and exited. |
-| | failed-containers | Number of containers that have failed in the job&#39;s lifetime. |
-| | released-containers | Number of containers released due to overallocation by the YARN-ResourceManager. |
-| | container-count | Number of containers configured for the job. |
-| | redundant-notifications | Number of redundant onResourceCompleted callbacks received from the RM after container shutdown. |
-| | job-healthy | A binary value, where 1 indicates that all the required containers configured for a job are running, 0 otherwise. |
-| | preferred-host-requests | Number of container resource-requests for a preferred host received by the cluster manager. |
-| | any-host-requests | Number of container resource-requests for _any_ host received by the cluster manager |
-| | expired-preferred-host-requests | Number of expired resource-requests for a preferred host received by the cluster manager. |
-| | expired-any-host-requests | Number of expired resource-requests for _any_ host received by the cluster manager. |
-| | host-affinity-match-pct | Percentage of non-expired preferred host requests. This measures the % of resource-requests for which host-affinity provided the preferred host. |
-
-| **Group** | **Metric name** | **Meaning** |
-| --- | --- | --- |
-| **SamzaContainerMetrics (Timer metrics)** | choose-ns | Average time spent by a task instance for choosing the input to process; this includes time spent waiting for input, selecting one in case of multiple inputs, and deserializing input. |
-| | window-ns | In case of WindowableTasks being used, average time a task instance is spending in its window() operations. |
-| | timer-ns | Average time spent in the timer-callback when a timer registered with TaskContext fires. |
-| | process-ns | Average time the job is spending in processing each input. |
-| | commit-ns | Average time the job is spending in checkpointing inputs (and flushing producers, checkpointing KV stores, flushing side input stores). The frequency of this function is configured using _task.commit.ms._ |
-| | block-ns | Average time the run loop is blocked because all task instances are busy processing input; could indicate lag accumulating. |
-| | container-startup-time | Time spent in starting the container. This includes time to start the JMX server, starting metrics reporters, starting system producers, consumers, system admins, offset manager, locality manager, disk space manager, security manager, statistics manager, and initializing all task instances. |
-
-| **Group** | **Metric name** | **Meaning** |
-| --- | --- | --- |
-| **SamzaContainerMetrics (Counters and Gauges)** | commit-calls | Number of commits. Each commit includes input checkpointing, flushing producers, checkpointing KV stores, flushing side input stores, etc. |
-| | window-calls | In case of WindowableTask, this measures the number of window invocations. |
-| | timer-calls | Number of timer callbacks. |
-| | process-calls | Number of process method invocations. |
-| | process-envelopes | Number of input message envelopes processed. |
-| | process-null-envelopes | Number of times no input message envelope was available for the run loop to process. |
-| | event-loop-utilization | The duty-cycle of the event loop. That is, the fraction of time of each event loop iteration that is spent in process(), window(), and commit. |
-| | disk-usage-bytes | Total disk space size used by key-value stores (in bytes). |
-| | disk-quota-bytes | Disk memory usage quota for key-value stores (in bytes). |
-| | executor-work-factor | The work factor of the run loop. A work factor of 1 indicates full throughput, while a work factor of less than 1 will introduce delays into the execution to approximate the requested work factor. The work factor is set by the disk space monitor in accordance with the disk quota policy. Given the latest percentage of available disk quota, this policy returns the work factor that should be applied. |
-| | physical-memory-mb | The physical memory used by the Samza container process (native + on heap) (in MBs). |
-| | \<TaskName\>-\<StoreName\>-restore-time | Time taken to restore task stores (per task store). |
-
-
-| **Group** | **Metric name** | **Meaning** |
-| --- | --- | --- |
-| **Job-Coordinator Metrics (Gauge)** | \<system\>-\<stream\>-partitionCount | The current number of partitions detected by the Stream Partition Count Monitor. This can be enabled by configuring _job.coordinator.monitor-partition-change_ to true. |
-
-| **Group** | **Metric name** | **Meaning** |
-| --- | --- | --- |
-| **TaskInstance Metrics (Counters and Gauges)** | \<system\>-\<stream\>-\<partition\>-offset | The offset of the last processed message on the given system-stream-partition input. |
-|   | commit-calls | Number of commit calls for the task. Each commit call involves checkpointing inputs (and flushing producers, checkpointing KV stores, flushing side input stores). |
-|   | window-calls | In case of WindowableTask, the number of window() invocations on the task. |
-|   | process-calls | Number of process method calls. |
-|   | send-calls | Number of send method calls (representing number of messages that were sent to the underlying SystemProducers) |
-|   | flush-calls | Number of times the underlying system producers were flushed. |
-|   | messages-actually-processed | Number of messages processed by the task. |
-|   | pending-messages | Number of pending messages in the pending envelope queue. |
-|   | messages-in-flight | Number of input messages currently being processed. This is impacted by the task.max.concurrency configuration. |
-|   | async-callback-complete-calls | Number of processAsync invocations that have completed (applicable to AsyncStreamTasks). |
-
-| **Group** | **Metric name** | **Meaning** |
-| --- | --- | --- |
-| **OffsetManagerMetrics (Gauge)** | \<system\>-\<stream\>-\<partition\>-checkpointed-offset | Latest checkpointed offsets for each input system-stream-partition. |
-
-
-| **Group** | **Metric name** | **Meaning** |
-| --- | --- | --- |
-| **JvmMetrics (Timers)** | gc-time-millis | Total time spent in GC. |
-|   | \<gc-name\>-time-millis | Total time spent in garbage collection (for each garbage collector) (in milliseconds). |
-
-
-| **Group** | **Metric name** | **Meaning** |
-| --- | --- | --- |
-| **JvmMetrics (Counters and Gauges)** | gc-count | Number of GC invocations. |
-|   | mem-heap-committed-mb | Size of committed heap memory (in MBs). Because the guest allocates memory lazily to the JVM heap and because the difference between Free and Used memory is opaque to the guest, the guest commits memory to the JVM heap as it is required. The Committed memory, therefore, is a measure of how much memory the JVM heap is really consuming in the guest. See [https://pubs.vmware.com/vfabric52/index.jsp?topic=/com.vmware.vfabric.em4j.1.2/em4j/conf-heap-management.html](https://pubs.vmware.com/vfabric52/index.jsp?topic=/com.vmware.vfabric.em4j.1.2/em4j/conf-heap-management.html). |
-|   | mem-heap-used-mb | Used memory from the perspective of the JVM is (Working set + Garbage) and Free memory is (Current heap size – Used memory). |
-|   | mem-heap-max-mb | Size of maximum heap memory (in MBs). This is defined by the -Xmx option. |
-|   | mem-nonheap-committed-mb | Size of non-heap memory committed in MBs. |
-|   | mem-nonheap-used-mb | Size of non-heap memory used in MBs. |
-|   | mem-nonheap-max-mb | Size of non-heap memory in MBs. This can be changed using the -XX:MaxPermSize VM option. |
-|   | threads-new | Number of threads not started at that instant. |
-|   | threads-runnable | Number of running threads at that instant. |
-|   | threads-timed-waiting | Current number of threads in the TIMED\_WAITING state, which is defined as: &quot;A thread that is waiting for another thread to perform an action for up to a specified waiting time is in this state.&quot; |
-|   | threads-waiting | Current number of waiting threads. |
-|   | threads-blocked | Current number of blocked threads. |
-|   | threads-terminated | Current number of terminated threads. |
-|   | \<gc-name\>-gc-count | Number of garbage collection calls (for each garbage collector). |
-| **(Emitted only if the OS supports it)** | process-cpu-usage | Returns the &quot;recent cpu usage&quot; for the Java Virtual Machine process. |
-| **(Emitted only if the OS supports it)** | system-cpu-usage | Returns the &quot;recent cpu usage&quot; for the whole system. |
-| **(Emitted only if the OS supports it)** | open-file-descriptor-count | Count of open file descriptors. |
-
-
-| **Group** | **Metric name** | **Meaning** |
-| --- | --- | --- |
-| **SystemConsumersMetrics (Counters and Gauges)** <br/> These metrics are emitted when multiplexing and coordinating between per-system consumers and message choosers for polling | chose-null | Number of times the message chooser returned a null message envelope. This is typically indicative of low input traffic on one or more input partitions. |
-|   | chose-object | Number of times the message chooser returned a non-null message envelope. |
-|   | deserialization-error | Number of times an incoming message was not deserialized successfully. |
-|   | ssps-needed-by-chooser | Number of systems for which no buffered message exists, and hence these systems need to be polled (to obtain a message). |
-|   | poll-timeout | The timeout for polling at that instant. |
-|   | unprocessed-messages | Number of unprocessed messages buffered in SystemConsumers. |
-|   | \<system\>-polls | Number of times the given system was polled |
-|   | \<system\>-ssp-fetches-per-poll | Number of partitions of the given system polled at that instant. |
-|   | \<system\>-messages-per-poll | Number of times the SystemConsumer for the underlying system was polled to get new messages. |
-|   | \<system\>-\<stream\>-\<partition\>-messages-chosen | Number of messages that were chosen by the MessageChooser for particular system stream partition. |
-
-
-| **Group** | **Metric name** | **Meaning** |
-| --- | --- | --- |
-| **SystemConsumersMetrics (Timers)** | poll-ns | Average time spent polling all underlying systems for new messages (in nanoseconds). |
-|   | deserialization-ns | Average time spent deserializing incoming messages (in nanoseconds). |
-
-
-| **Group** | **Metric name** | **Meaning** |
-| --- | --- | --- |
-| **KafkaSystemConsumersMetrics (Timers)** | \<system\>-\<topic\>-\<partition\>-offset-change | The next offset to be read for this topic and partition. |
-|   | \<system\>-\<topic\>-\<partition\>-bytes-read | Total size of all messages read for a topic partition (payload + key size). |
-|   | \<system\>-\<topic\>-\<partition\>-messages-read | Number of messages read for a topic partition. |
-|   | \<system\>-\<topic\>-\<partition\>-high-watermark | Offset of the last committed message in Kafka&#39;s topic partition. |
-|   | \<system\>-\<topic\>-\<partition\>-messages-behind-high-watermark | Number of input messages waiting to be processed on an input topic-partition. That is, the difference between high watermark and next offset. |
-|   | \<system\>-<host\>-<port\>-reconnects | Number of reconnects to a broker on a particular host and port. |
-|   | \<system\>-<host\>-<port\>-bytes-read | Total size of all messages read from a broker on a particular host and port. |
-|   | \<system\>-<host\>-<port\>-messages-read | Number of times the consumer used a broker on a particular host and port to get new messages. |
-|   | \<system\>-<host\>-<port\>-skipped-fetch-requests | Number of times the fetchMessage method is called but no topic/partitions needed new messages. |
-|   | \<system\>-<host\>-<port\>-topic-partitions | Number of the broker's topic partitions that are being consumed. |
-|   | poll-count | Number of polls the KafkaSystemConsumer performed to get new messages. |
-|   | no-more-messages-SystemStreamPartition [\<system\>, \<stream\>, \<partition\>] | Indicates if the Kafka consumer is at the head for particular partition. |
-|   | blocking-poll-count-SystemStreamPartition [\<system\>, \<stream\>, \<partition\>] | Number of times a blocking poll is executed (polling until we get at least one message, or until we catch up to the head of the stream) (per partition). |
-|   | blocking-poll-timeout-count-SystemStreamPartition [\<system\>, \<stream\>, \<partition\>] | Number of times a blocking poll has timed out (polling until we get at least one message within a timeout period) (per partition). |
-|   | buffered-message-count-SystemStreamPartition [\<system\>, \<stream\>, \<partition\>] | Current number of messages in queue (per partition). |
-|   | buffered-message-size-SystemStreamPartition [\<system\>, \<stream\>, \<partition\>] | Current size of messages in queue (if systems.system.samza.fetch.threshold.bytes is defined) (per partition). |
-|   | \<system\>-\<topic\>-\<partition\>-offset-change | The next offset to be read for a topic partition. |
-|   | \<system\>-\<topic\>-\<partition\>-bytes-read | Total size of all messages read for a topic partition (payload + key size). |
-
-
-
-
-| **Group** | **Metric name** | **Meaning** |
-| --- | --- | --- |
-| **SystemProducersMetrics (Counters and Gauges)** <br/>These metrics are aggregated across Producers. | sends | Number of send method calls. Representing total number of sent messages. |
-|   | flushes | Number of flush method calls for all registered producers. |
-|   | <source\>-sends | Number of sent messages for a particular source (task instance). |
-|   | <source\>-flushes | Number of flushes for particular source (task instance). |
-|   | serialization error | Number of errors that occurred while serializing envelopes before sending. |
-
-| **Group** | **Metric name** | **Meaning** |
-| --- | --- | --- |
-| **KafkaSystemProducersMetrics (Counters)** | \<system\>-producer-sends | Number of send invocations to the KafkaSystemProducer. |
-|   | \<system\>-producer-send-success | Number of send requests that were successfully completed by the KafkaSystemProducer. |
-|   | \<system\>-producer-send-failed | Number of send requests to Kafka (e.g., output topics) that failed due to unrecoverable errors |
-|   | \<system\>-flushes | Number of calls made to flush in the KafkaSystemProducer. |
-|   | \<system\>-flush-failed | Number of times flush operation failed. |
-
-
-| **Group** | **Metric name** | **Meaning** |
-| --- | --- | --- |
-| **KafkaSystemProducersMetrics (Timers)** | \<system\>-flush-ns | Represents average time the flush call takes to complete (in nanoseconds). |
-
-
-| **Group** | **Metric name** | **Meaning** |
-| --- | --- | --- |
-| **KeyValueStorageEngineMetrics (Counters)** <br/> These metrics provide insight into the type and number of KV Store operations taking place | <store-name\>-puts | Total number of put operations on the given KV store. |
-|   | <store-name\>-put-alls | Total number of putAll operations on the given KV store. |
-|   | <store-name\>-gets | Total number of get operations on the given KV store. |
-|   | <store-name\>-get-alls | Total number of getAll operations on the given KV store. |
-|   | <store-name\>-alls | Total number of accesses to the iterator on the given KV store. |
-|   | <store-name\>-ranges | Total number of accesses to a sorted-range iterator on the given KV store. |
-|   | <store-name\>-deletes | Total number of delete operations on the given KV store. |
-|   | <store-name\>-delete-alls | Total number of deleteAll operations on the given KV store. |
-|   | <store-name\>-flushes | Total number of flush operations on the given KV store. |
-|   | <store-name\>-restored-messages | Number of entries in the KV store restored from the changelog for that store. |
-|   | <store-name\>-restored-bytes | Size in bytes of entries in the KV store restored from the changelog for that store. |
-|   | <store-name\>-snapshots | Total number of snapshot operations on the given KV store. |
-
-
-| **Group** | **Metric name** | **Meaning** |
-| --- | --- | --- |
-| **KeyValueStorageEngineMetrics (Timers)** <br/> These metrics provide insight into the latencies of KV Store operations. | <store-name\>-get-ns | Average duration of the get operation on the given KV Store. |
-|   | <store-name\>-get-all-ns | Average duration of the getAll operation on the given KV Store. |
-|   | <store-name\>-put-ns | Average duration of the put operation on the given KV Store. |
-|   | <store-name\>-put-all-ns | Average duration of the putAll operation on the given KV Store. |
-|   | <store-name\>-delete-ns | Average duration of the delete operation on the given KV Store. |
-|   | <store-name\>-delete-all-ns | Average duration of the deleteAll operation on the given KV Store. |
-|   | <store-name\>-flush-ns | Average duration of the flush operation on the given KV Store. |
-|   | <store-name\>-all-ns | Average duration of obtaining an iterator (using the all operation) on the given KV Store. |
-|   | <store-name\>-range-ns | Average duration of obtaining a sorted-range iterator (using the range operation) on the given KV Store. |
-|   | <store-name\>-snapshot-ns | Average duration of the snapshot operation on the given KV Store. |
-
-
-| **Group** | **Metric name** | **Meaning** |
-| --- | --- | --- |
-| **KeyValueStoreMetrics (Counters)** <br/> These metrics are measured at the App-facing layer for different KV Stores, e.g., RocksDBStore, InMemoryKVStore. | <store-name\>-gets, <store-name\>-getAlls, <store-name\>-puts, <store-name\>-putAlls, <store-name\>-deletes, <store-name\>-deleteAlls, <store-name\>-alls, <store-name\>-ranges, <store-name\>-flushes | Total number of the specified operation on the given KV Store. (These metrics are equivalent to the respective ones under KeyValueStorageEngineMetrics.) |
-|   | bytes-read | Total number of bytes read (when serving reads -- gets, getAlls, and iterations). |
-|   | bytes-written | Total number of bytes written (when serving writes -- puts, putAlls). |
-
-
-| **Group** | **Metric name** | **Meaning** |
-| --- | --- | --- |
-| **SerializedKeyValueStoreMetrics (Counters)** <br/> These metrics are measured at the serialization layer. | <store-name\>-gets, <store-name\>-getAlls, <store-name\>-puts, <store-name\>-putAlls, <store-name\>-deletes, <store-name\>-deleteAlls, <store-name\>-alls, <store-name\>-ranges, <store-name\>-flushes | Total number of the specified operation on the given KV Store. (These metrics are equivalent to the respective ones under KeyValueStorageEngineMetrics.) |
-|   | bytes-deserialized | Total number of bytes deserialized (when serving reads -- gets, getAlls, and iterations). |
-|   | bytes-serialized | Total number of bytes serialized (when serving reads and writes -- gets, getAlls, puts, putAlls). In addition to writes, serialization is also done during reads to serialize key to bytes for lookup in the underlying store. |
-
-
-| **Group** | **Metric name** | **Meaning** |
-| --- | --- | --- |
-| **LoggedStoreMetrics (Counters)** <br/> These metrics are measured at the changeLog-backup layer for KV stores. | <store-name\>-gets, <store-name\>-puts, <store-name\>-alls, <store-name\>-deletes, <store-name\>-flushes, <store-name\>-ranges | Total number of the specified operation on the given KV Store. |
-
-
-
-| **Group** | **Metric name** | **Meaning** |
-| --- | --- | --- |
-| **CachedStoreMetrics (Counters and Gauges)** <br/> These metrics are measured at the caching layer for RocksDB-backed KV stores. | <store-name\>-gets, <store-name\>-puts, <store-name\>-alls, <store-name\>-deletes, <store-name\>-flushes, <store-name\>-ranges, | Total number of the specified operation on the given KV Store.|
-|   | cache-hits | Total number of get and getAll operations that hit cached entries. |
-|   | put-all-dirty-entries-batch-size | Total number of dirty KV-entries written-back to the underlying store. |
-|   | dirty-count | Number of entries in the cache marked dirty at that instant. |
-|   | cache-count | Number of entries in the cache at that instant. |
-
-
-| **Group** | **Metric name** | **Meaning** |
-| --- | --- | --- |
-| **RoundRobinChooserMetrics (Counters)** | buffered-messages | Size of the queue with potential messages to process. |
-
-
-| **Group** | **Metric name** | **Meaning** |
-| --- | --- | --- |
-| **BatchingChooserMetrics (Counters and gauges)** | batch-resets | Number of times a batch was reset because it exceeded the max batch size limit. |
-|   | batched-envelopes | Number of envelopes in the batch at the current instant. |
-
-
-| **Group** | **Metric name** | **Meaning** |
-| --- | --- | --- |
-| **BootstrappingChooserMetrics (Gauges)** | lagging-batch-streams | Number of bootstrapping streams that are lagging. |
-|   | \<system\>-\<stream\>-lagging-partitions | Number of lagging partitions in the stream (for each stream marked as bootstrapping stream). |
-
-
-
-| **Group** | **Metric name** | **Meaning** |
-| --- | --- | --- |
-| **HdfsSystemProducerMetrics (Counters)** | system-producer-sends | Total number of attempts to write to HDFS. |
-|   | system-send-success | Total number of successful writes to HDFS. |
-|   | system-send-failed | Total number of failures while sending envelopes to HDFS. |
-|   | system-flushes | Total number of attempts to flush data to HDFS. |
-|   | system-flush-success | Total number of times all written data was successfully flushed to HDFS. |
-|   | system-flush-failed | Total number of failures while flushing data to HDFS. |
-
-| **Group** | **Metric name** | **Meaning** |
-| --- | --- | --- |
-| **HdfsSystemProducerMetrics (Timers)** | system-send-ms | Average time spent for writing messages to HDFS (in milliseconds). |
-|   | system-flush-ms | Average time spent for flushing messages to HDFS (in milliseconds). |
-
-
-| **Group** | **Metric name** | **Meaning** |
-| --- | --- | --- |
-| **ElasticsearchSystemProducerMetrics (Counters)** | system-bulk-send-success | Total number of successful bulk requests |
-|   | system-docs-inserted | Total number of documents created. |
-|   | system-docs-updated | Total number of documents updated. |
-|   | system-version-conflicts | Number of requests that failed due to conflicts with the current state of the document. |

http://git-wip-us.apache.org/repos/asf/samza/blob/1bf8bf5a/docs/learn/documentation/versioned/core-concepts/core-concepts.md
----------------------------------------------------------------------
diff --git a/docs/learn/documentation/versioned/core-concepts/core-concepts.md b/docs/learn/documentation/versioned/core-concepts/core-concepts.md
new file mode 100644
index 0000000..449b338
--- /dev/null
+++ b/docs/learn/documentation/versioned/core-concepts/core-concepts.md
@@ -0,0 +1,23 @@
+---
+layout: page
+title: Core concepts
+---
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+## Core concepts page
+

http://git-wip-us.apache.org/repos/asf/samza/blob/1bf8bf5a/docs/learn/documentation/versioned/deployment/standalone.md
----------------------------------------------------------------------
diff --git a/docs/learn/documentation/versioned/deployment/standalone.md b/docs/learn/documentation/versioned/deployment/standalone.md
new file mode 100644
index 0000000..c7425f6
--- /dev/null
+++ b/docs/learn/documentation/versioned/deployment/standalone.md
@@ -0,0 +1,217 @@
+---
+layout: page
+title: Run as an embedded library
+---
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+- [Introduction](#introduction)
+- [User guide](#user-guide)
+  - [Setup dependencies](#setup-dependencies)
+  - [Configuration](#configuration)
+  - [Code sample](#code-sample)
+- [Quick start guide](#quick-start-guide)
+  - [Setup zookeeper](#setup-zookeeper)
+  - [Setup kafka](#setup-kafka)
+  - [Build binaries](#build-binaries)
+  - [Deploy binaries](#deploy-binaries)
+  - [Inspect results](#inspect-results)
+- [Coordinator internals](#coordinator-internals)
+
+## Introduction
+
+With Samza 0.13.0, the deployment model of Samza jobs has been simplified and decoupled from YARN. The _standalone_ model packages the stream processing capabilities of Samza as a library with pluggable coordination. This library model offers an easier integration path and promotes a flexible deployment model for an application. Using the standalone mode, you can use Samza processors directly in your application and deploy Samza applications to self-managed clusters.
+
+A standalone application typically comprises multiple _stream processors_. A _stream processor_ encapsulates a user-defined processing function and is responsible for processing a subset of input topic partitions. Each stream processor of a standalone application is uniquely identified by a _processorId_.
+
+Samza provides a pluggable _job coordinator_ layer to perform leader election and assign work to the stream processors. Standalone supports ZooKeeper coordination out of the box and uses it for distributed coordination between the stream processors of a standalone application. A processor becomes part of a standalone application by setting its app.name (e.g., app.name=group\_1) and joining the group.
+
+In Samza standalone, the input topic partitions are distributed among the available processors dynamically at runtime. In each standalone application, one stream processor is initially chosen as the leader to mediate the assignment of input topic partitions to the stream processors. If the number of available processors changes (for example, if a processor is shut down or added), the leader regenerates the partition assignments and redistributes them to all the processors.
+
+When the processor group changes, the act of reassigning input topic partitions to the remaining live processors in the group is known as rebalancing the group. If the leader processor of a standalone application fails, another stream processor of the application is chosen as the leader.
+
+## User guide
+
+Samza standalone is designed to give you more control over the deployment of your application, so it is your responsibility to configure and deploy the processors. With ZooKeeper coordination, you have to configure the URL of a ZooKeeper instance.
+
+A stream processor is identified by a unique processorId, which is generated by the pluggable ProcessorIdGenerator abstraction. The processorId of the stream processor is used with the coordination service. Samza supports a UUID-based ProcessorIdGenerator out of the box.
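+
+As a hedged illustration, the generator would typically be selected through configuration. The key and class name below are assumptions drawn from Samza's configuration reference, not from this page, so verify them against the Samza version you use.
+
+```bash
+# Assumed key and class name: selects the UUID-based ProcessorIdGenerator
+app.processor-id-generator.class=org.apache.samza.runtime.UUIDGenerator
+```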
+
+The diagram below shows an input topic with three partitions and a standalone application with three processors consuming messages from it.
+
+<img src="/img/versioned/learn/documentation/standalone/standalone-application.jpg" alt="Standalone application" height="550px" width="700px" align="middle">
+
+When a group is first initialized, each stream processor typically starts processing messages from either the earliest or the latest offset of the input topic partition. The messages in each partition are delivered sequentially to the user-defined processing function. As the stream processor makes progress, it commits the offsets of the messages it has successfully processed. For example, in the figure above, the stream processor position is at offset 7 and its last committed offset is at offset 3.
+
+When an input partition is reassigned to another processor in the group, the initial position is set to the last committed offset. If processor-1 in the example above suddenly crashed, the live processor taking over the partition would begin consumption from offset 3. In that case, it would have to reprocess the messages from offset 3 up to the crashed processor's position of 7.
+
+### Setup dependencies
+
+Add the following Maven dependencies for Samza standalone to your project.
+
+```xml
+<dependency>
+    <groupId>org.apache.samza</groupId>
+    <artifactId>samza-kafka_2.11</artifactId>
+    <version>1.0</version>
+</dependency>
+<dependency>
+    <groupId>org.apache.samza</groupId>
+    <artifactId>samza-core_2.11</artifactId>
+    <version>1.0</version>
+</dependency>
+<dependency>
+    <groupId>org.apache.samza</groupId>
+    <artifactId>samza-api</artifactId>
+    <version>1.0</version>
+</dependency>
+```
+
+### Configuration
+
+A Samza standalone application requires the following mandatory configurations:
+
+```bash
+job.coordinator.factory=org.apache.samza.zk.ZkJobCoordinatorFactory
+job.coordinator.zk.connect=your_zk_connection(for local zookeeper, use localhost:2181)
+task.name.grouper.factory=org.apache.samza.container.grouper.task.GroupByContainerIdsFactory 
+```
+
+You have to configure the stream processor with the Kafka brokers, as in the following sample (which assumes that the broker is running on localhost):
+
+```bash 
+systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory
+systems.kafka.samza.msg.serde=json
+systems.kafka.consumer.zookeeper.connect=localhost:2181
+systems.kafka.producer.bootstrap.servers=localhost:9092 
+```
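+
+Taken together, the settings above could be collected into a single properties file. The sketch below is illustrative only: it repeats the values from the two samples in this section, and assumes a local ZooKeeper and Kafka broker, plus the app.name used by the code sample in the next section.
+
+```bash
+# Application (group) name shared by all processors of this application
+app.name=sample-test
+
+# ZooKeeper-based coordination (assumes a local ZooKeeper instance)
+job.coordinator.factory=org.apache.samza.zk.ZkJobCoordinatorFactory
+job.coordinator.zk.connect=localhost:2181
+task.name.grouper.factory=org.apache.samza.container.grouper.task.GroupByContainerIdsFactory
+
+# Kafka system (assumes a broker running on localhost)
+systems.kafka.samza.factory=org.apache.samza.system.kafka.KafkaSystemFactory
+systems.kafka.samza.msg.serde=json
+systems.kafka.consumer.zookeeper.connect=localhost:2181
+systems.kafka.producer.bootstrap.servers=localhost:9092
+```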
+
+### Code sample
+
+Here's a sample standalone application with app.name set to sample-test. Running this class launches a stream processor.
+
+```java
+public class PageViewEventExample implements StreamApplication {
+
+  public static void main(String[] args) {
+    CommandLine cmdLine = new CommandLine();
+    OptionSet options = cmdLine.parser().parse(args);
+    Config config = cmdLine.loadConfig(options);
+
+    ApplicationRunner runner = ApplicationRunners.getApplicationRunner(ApplicationClassUtils.fromConfig(config), config);
+    runner.run();
+    runner.waitForFinish();
+  }
+
+  @Override
+  public void describe(StreamAppDescriptor appDesc) {
+     MessageStream<PageViewEvent> pageViewEvents =
+         appDesc.getInputStream("inputStream", new JsonSerdeV2<>(PageViewEvent.class));
+     OutputStream<PageViewEvent> pageViewEventOutputStream =
+         appDesc.getOutputStream("outputStream", new JsonSerdeV2<>(PageViewEvent.class));
+     pageViewEvents.sendTo(pageViewEventOutputStream);
+  }
+}
+```
+
+## Quick start guide
+
+The [Hello-samza](https://github.com/apache/samza-hello-samza/) project contains sample Samza standalone applications. The following step-by-step guide shows how to install, build, and run a standalone application using a local ZooKeeper cluster for coordination. Check out the hello-samza project by running the following commands:
+
+```bash
+git clone https://git.apache.org/samza-hello-samza.git hello-samza
+cd hello-samza 
+```
+
+### Setup Zookeeper
+
+Run the following commands to install and start a local ZooKeeper cluster.
+
+```bash
+./bin/grid install zookeeper
+./bin/grid start zookeeper
+```
+
+### Setup Kafka
+
+Run the following commands to install and start a local Kafka cluster.
+
+```bash
+./bin/grid install kafka
+./bin/grid start kafka
+```
+
+### Build binaries
+
+Before you can run the standalone job, you need to build a package for it using the following commands.
+
+```bash
+mvn clean package
+mkdir -p deploy/samza
+tar -xvf ./target/hello-samza-0.15.0-SNAPSHOT-dist.tar.gz -C deploy/samza 
+```
+
+### Deploy binaries
+
+To run the sample standalone application [WikipediaZkLocalApplication](https://github.com/apache/samza-hello-samza/blob/master/src/main/java/samza/examples/wikipedia/application/WikipediaZkLocalApplication.java), run the following commands:
+
+```bash
+./bin/deploy.sh
+./deploy/samza/bin/run-class.sh samza.examples.wikipedia.application.WikipediaZkLocalApplication  --config-factory=org.apache.samza.config.factories.PropertiesConfigFactory --config-path=file://$PWD/deploy/samza/config/wikipedia-application-local-runner.properties
+```
+
+### Inspect results
+
+The standalone application reads messages from the wikipedia-edits topic and, every ten seconds, calculates counts for all edits that were made during that window. It outputs these counts to the local wikipedia-stats Kafka topic. To inspect events in the output topic, run the following command.
+
+```bash
+./deploy/kafka/bin/kafka-console-consumer.sh  --zookeeper localhost:2181 --topic wikipedia-stats
+```
+
+Events produced to the output topic from the standalone application launched above will be of the following form:
+
+```
+{"is-talk":2,"bytes-added":5276,"edits":13,"unique-titles":13}
+{"is-bot-edit":1,"is-talk":3,"bytes-added":4211,"edits":30,"unique-titles":30,"is-unpatrolled":1,"is-new":2,"is-minor":7}
+```
+
+## Coordinator internals
+
+A Samza application comprises multiple stream processors. A processor becomes part of a standalone application by setting its app.name (e.g., app.name=group\_1) and joining the group. In Samza standalone, the input topic partitions are distributed among the available processors dynamically at runtime. One processor is elected as the leader; it generates the partition assignments and distributes them to the other processors in the group. If the number of available processors changes (for example, if some processors are shut down or added), the partition assignments are regenerated and redistributed to all the processors.
+
+To mediate the partition assignments between processors, Samza standalone relies on a coordination service. The main responsibilities of the coordination service are the following:
+
+**Leader election** - Elects a single processor to generate the partition assignments and distribute them to the other processors in the group.
+
+**Distributed barrier** - Coordination primitive used by the processors to reach consensus (agree) on a partition assignment.
+
+By default, embedded Samza uses ZooKeeper to coordinate between the processors of an application and to store the partition assignment state. The coordination sequence for a standalone application is listed below:
+
+1. Each processor (participant) registers with the coordination service (e.g., ZooKeeper) using its participant ID.
+
+2. One of the participants is elected as the leader.
+
+3. The leader monitors the list of all the active participants.
+
+4. Whenever the list of participants in a group changes, the leader generates a new partition assignment for the current participants and persists it to a common storage.
+
+5. Participants are notified that the new partition assignment is available. Notification is done through the coordination service (e.g., ZooKeeper).
+
+6. The participants will stop processing, pick up the new partition assignment, and then resume processing.
+
+To ensure that no partition is processed by two different processors at the same time, processing is paused and all the processors synchronize on a distributed barrier. Once all the processors are paused, the new partition assignments are applied, after which processing resumes.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/samza/blob/1bf8bf5a/docs/learn/documentation/versioned/deployment/yarn.md
----------------------------------------------------------------------
diff --git a/docs/learn/documentation/versioned/deployment/yarn.md b/docs/learn/documentation/versioned/deployment/yarn.md
new file mode 100644
index 0000000..06f0446
--- /dev/null
+++ b/docs/learn/documentation/versioned/deployment/yarn.md
@@ -0,0 +1,27 @@
+---
+layout: page
+title: Run on YARN
+---
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+
+# YARN section 1
+# YARN section 2
+# YARN section 3
+# YARN section 4
+# YARN section 5

http://git-wip-us.apache.org/repos/asf/samza/blob/1bf8bf5a/docs/learn/documentation/versioned/index.html
----------------------------------------------------------------------
diff --git a/docs/learn/documentation/versioned/index.html b/docs/learn/documentation/versioned/index.html
index 49592f6..80035bb 100644
--- a/docs/learn/documentation/versioned/index.html
+++ b/docs/learn/documentation/versioned/index.html
@@ -19,20 +19,18 @@ title: Documentation
    limitations under the License.
 -->
 
-<h4><a href="comparisons/introduction.html">Core concepts</a></h4>
-<hr/>
-
-<h4>Architecture</h4>
+<h4><a href="core-concepts/core-concepts.html">CORE CONCEPTS</a></h4>
+<h4><a href="architecture/architecture-overview.html">ARCHITECTURE</a></h4>
 
 
 <h4>API</h4>
 
 <ul class="documentation-list">
-  <li><a href="comparisons/introduction.html">Low-level API</a></li>
-  <li><a href="comparisons/mupd8.html">Streams DSL</a></li>
+  <li><a href="api/low-level-api.html">Low-level API</a></li>
+  <li><a href="api/high-level-api.html">Streams DSL</a></li>
   <li><a href="api/table-api.html">Table API</a></li>
-  <li><a href="comparisons/storm.html">Samza SQL</a></li>
-  <li><a href="comparisons/spark-streaming.html">Apache BEAM</a></li>
+  <li><a href="api/samza-sql.html">Samza SQL</a></li>
+  <li><a href="https://beam.apache.org/documentation/runners/samza/">Apache BEAM</a></li>
 <!-- TODO comparisons pages
   <li><a href="comparisons/aurora.html">Aurora</a></li>
   <li><a href="comparisons/jms.html">JMS</a></li>
@@ -43,28 +41,25 @@ title: Documentation
 <h4>Deployment</h4>
 
 <ul class="documentation-list">
-  <li><a href="api/overview.html">Deployment overview</a></li>
-  <li><a href="deployment/deployment-model.html">Deployment model</a></li>
-  <li><a href="api/overview.html">Run on YARN</a></li>
-  <li><a href="standalone/standalone.html">Run as an embedded library</a></li>
+  <li><a href="deployment/deployment-model.html">Deployment options</a></li>
+  <li><a href="deployment/yarn.html">Run on YARN</a></li>
+  <li><a href="deployment/standalone.html">Run as an embedded library</a></li>
 </ul>
 
 <h4>Connectors</h4>
 
 <ul class="documentation-list">
-  <li><a href="jobs/job-runner.html">Connectors overview</a></li>
-  <li><a href="jobs/configuration.html">Apache Kafka</a></li>
-  <li><a href="jobs/packaging.html">Apache Hadoop</a></li>
-  <li><a href="jobs/yarn-jobs.html">Azure EventHubs</a></li>
-  <li><a href="aws/kinesis.html">AWS Kinesis</a></li>
+  <li><a href="connectors/overview.html">Connectors overview</a></li>
+  <li><a href="connectors/kafka.html">Apache Kafka</a></li>
+  <li><a href="connectors/hdfs.html">Apache Hadoop</a></li>
+  <li><a href="connectors/eventhubs.html">Azure EventHubs</a></li>
+  <li><a href="connectors/kinesis.html">AWS Kinesis</a></li>
 </ul>
 
 <h4>Operations</h4>
 
 <ul class="documentation-list">
-  <li><a href="yarn/application-master.html">Debugging</a></li>
-  <li><a href="yarn/isolation.html">Monitoring & metrics</a></li>
-  <li><a href="yarn/isolation.html">Samza REST service</a></li>
+  <li><a href="operations/monitoring.html">Monitoring</a></li>
 </ul>
 
 </div>


[11/32] samza git commit: Merge branch 'master' of https://github.com/apache/samza

Posted by ja...@apache.org.
Merge branch 'master' of https://github.com/apache/samza


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/02fea74b
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/02fea74b
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/02fea74b

Branch: refs/heads/master
Commit: 02fea74bc398fef2c00d166da2530e35dedd68ee
Parents: 3b7ff0d 8e61a45
Author: Jagadish <jv...@linkedin.com>
Authored: Tue Oct 2 22:56:28 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Tue Oct 2 22:56:28 2018 -0700

----------------------------------------------------------------------

----------------------------------------------------------------------



[27/32] samza git commit: Merge branch 'master' of https://github.com/apache/samza

Posted by ja...@apache.org.
Merge branch 'master' of https://github.com/apache/samza


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/c4cbebeb
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/c4cbebeb
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/c4cbebeb

Branch: refs/heads/master
Commit: c4cbebebc6f3952514eca48cf3fe29640ddba4cf
Parents: c35e6c7 8926b3e
Author: Jagadish <jv...@linkedin.com>
Authored: Thu Oct 11 19:12:41 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Thu Oct 11 19:12:41 2018 -0700

----------------------------------------------------------------------
 docs/community/committers.html                  |  7 +++--
 .../descriptors/GenericInputDescriptor.java     |  5 ++++
 .../descriptors/GenericOutputDescriptor.java    |  5 ++++
 .../base/stream/InputDescriptor.java            | 28 +++++++++-----------
 .../base/stream/StreamDescriptor.java           |  2 +-
 .../TestExpandingInputDescriptor.java           |  6 ++---
 .../descriptors/TestGenericInputDescriptor.java | 16 +++++------
 .../descriptors/TestSimpleInputDescriptor.java  |  6 ++---
 .../TestTransformingInputDescriptor.java        |  6 ++---
 .../system/kafka/TestKafkaInputDescriptor.java  |  6 ++---
 10 files changed, 42 insertions(+), 45 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/samza/blob/c4cbebeb/docs/community/committers.html
----------------------------------------------------------------------
diff --cc docs/community/committers.html
index 3b65181,ab4e2ed..3144716
--- a/docs/community/committers.html
+++ b/docs/community/committers.html
@@@ -20,9 -20,8 +20,8 @@@ exclude_from_loop: tru
     limitations under the License.
  -->
  
- <h6>
--Samza is developed by a friendly community of contributors. The Samza Project Management Committee(PMC) is responsible for the management and oversight of the project.
- </h6>
++=======
+ 
  <hr class="committers-hr"/>
  
  <ul class="committers">
@@@ -95,4 -94,4 +94,4 @@@
  
    {% endfor %}
  
--</ul>
++</ul>


[16/32] samza git commit: Fix pixels for image layout on the home-page

Posted by ja...@apache.org.
Fix pixels for image layout on the home-page


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/15c6e87f
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/15c6e87f
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/15c6e87f

Branch: refs/heads/master
Commit: 15c6e87feecbc5bda4b123efead6f3386e44cc09
Parents: 652260a
Author: Jagadish <jv...@linkedin.com>
Authored: Mon Oct 8 11:57:04 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Mon Oct 8 11:57:04 2018 -0700

----------------------------------------------------------------------
 docs/_layouts/default.html | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/samza/blob/15c6e87f/docs/_layouts/default.html
----------------------------------------------------------------------
diff --git a/docs/_layouts/default.html b/docs/_layouts/default.html
index 1b555d8..c10403d 100644
--- a/docs/_layouts/default.html
+++ b/docs/_layouts/default.html
@@ -117,7 +117,7 @@
       </p>
 
       <!-- <img src="/img/latest/learn/documentation/api/samza-arch3.png"> -->
-      <img src="/img/latest/learn/documentation/api/samza-arch6.png" width="50%" height="50%" hspace="30px">
+      <img src="/img/latest/learn/documentation/api/samza-arch4.png" width="44.7%" height="50%" hspace="25px">
   </div>
   </div>
 


[07/32] samza git commit: Merge branch 'master' of https://github.com/apache/samza

Posted by ja...@apache.org.
Merge branch 'master' of https://github.com/apache/samza


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/8970dd28
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/8970dd28
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/8970dd28

Branch: refs/heads/master
Commit: 8970dd2890073af40b4adad3ca6868dc40edf87d
Parents: 73ec4e9 5545592
Author: Jagadish <jv...@linkedin.com>
Authored: Tue Oct 2 22:39:50 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Tue Oct 2 22:39:50 2018 -0700

----------------------------------------------------------------------
 .../documentation/yarn/am-container-info.png    | Bin 0 -> 206702 bytes
 .../learn/documentation/yarn/am-job-model.png   | Bin 0 -> 191856 bytes
 .../documentation/yarn/am-runtime-configs.png   | Bin 0 -> 304936 bytes
 .../documentation/yarn/am-runtime-metadata.png  | Bin 0 -> 566827 bytes
 .../yarn/coordinator-internals.png              | Bin 0 -> 30163 bytes
 .../learn/documentation/yarn/yarn-am-ui.png     | Bin 0 -> 119934 bytes
 .../documentation/versioned/deployment/yarn.md  | 314 ++++++++++++++++++-
 7 files changed, 308 insertions(+), 6 deletions(-)
----------------------------------------------------------------------



[17/32] samza git commit: Merge branch 'master' of https://github.com/apache/samza

Posted by ja...@apache.org.
Merge branch 'master' of https://github.com/apache/samza


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/d9431b70
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/d9431b70
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/d9431b70

Branch: refs/heads/master
Commit: d9431b7005e8cf7d5b40d7ce85c1a14b1f9ca34b
Parents: 15c6e87 ff3717d
Author: Jagadish <jv...@linkedin.com>
Authored: Mon Oct 8 13:45:13 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Mon Oct 8 13:45:13 2018 -0700

----------------------------------------------------------------------
 .../org/apache/samza/config/KafkaConfig.scala   | 25 +++++++++++++-------
 .../apache/samza/config/TestKafkaConfig.scala   | 18 ++++++++++++--
 2 files changed, 33 insertions(+), 10 deletions(-)
----------------------------------------------------------------------



[05/32] samza git commit: Merge branch 'master' of https://github.com/apache/samza into HEAD

Posted by ja...@apache.org.
Merge branch 'master' of https://github.com/apache/samza into HEAD


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/187ff892
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/187ff892
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/187ff892

Branch: refs/heads/master
Commit: 187ff8929ca08e50df1b2d76c679a1726b9757bc
Parents: 115041a 4ae30f1
Author: Jagadish <jv...@linkedin.com>
Authored: Tue Oct 2 22:07:06 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Tue Oct 2 22:07:06 2018 -0700

----------------------------------------------------------------------
 .../context/ApplicationContainerContext.java    |  46 ++++++++
 .../ApplicationContainerContextFactory.java     |  44 ++++++++
 .../samza/context/ApplicationTaskContext.java   |  47 ++++++++
 .../context/ApplicationTaskContextFactory.java  |  48 ++++++++
 .../apache/samza/context/ContainerContext.java  |  46 ++++++++
 .../java/org/apache/samza/context/Context.java  |  77 +++++++++++++
 .../org/apache/samza/context/JobContext.java    |  46 ++++++++
 .../org/apache/samza/context/TaskContext.java   |  88 +++++++++++++++
 .../apache/samza/job/model/ContainerModel.java  |  90 +++++++++++++++
 .../org/apache/samza/job/model/TaskModel.java   | 110 +++++++++++++++++++
 .../samza/scheduler/CallbackScheduler.java      |  43 ++++++++
 .../java/org/apache/samza/task/TaskContext.java |   1 +
 .../apache/samza/container/TaskContextImpl.java |   4 +
 .../samza/context/ContainerContextImpl.java     |  43 ++++++++
 .../org/apache/samza/context/ContextImpl.java   |  74 +++++++++++++
 .../apache/samza/context/JobContextImpl.java    |  49 +++++++++
 .../apache/samza/context/TaskContextImpl.java   |  87 +++++++++++++++
 .../apache/samza/job/model/ContainerModel.java  |  92 ----------------
 .../org/apache/samza/job/model/TaskModel.java   | 105 ------------------
 .../samza/scheduler/CallbackSchedulerImpl.java  |  44 ++++++++
 .../apache/samza/context/TestContextImpl.java   |  98 +++++++++++++++++
 .../samza/context/TestTaskContextImpl.java      |  98 +++++++++++++++++
 .../scheduler/TestCallbackSchedulerImpl.java    |  72 ++++++++++++
 .../apache/samza/test/framework/TestRunner.java |  64 +++++------
 .../AsyncStreamTaskIntegrationTest.java         |   2 +-
 .../StreamApplicationIntegrationTest.java       |  23 ----
 .../framework/StreamTaskIntegrationTest.java    |   4 +-
 .../table/TestLocalTableWithSideInputs.java     |  14 +--
 28 files changed, 1295 insertions(+), 264 deletions(-)
----------------------------------------------------------------------



[03/32] samza git commit: Change styling on home-page

Posted by ja...@apache.org.
Change styling on home-page


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/c480b7ed
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/c480b7ed
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/c480b7ed

Branch: refs/heads/master
Commit: c480b7ed634f8218fd7b8099bb8d83a7b3489a1a
Parents: 1bf8bf5
Author: Jagadish <jv...@linkedin.com>
Authored: Tue Oct 2 15:54:19 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Tue Oct 2 15:54:19 2018 -0700

----------------------------------------------------------------------
 docs/css/main.new.css | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/samza/blob/c480b7ed/docs/css/main.new.css
----------------------------------------------------------------------
diff --git a/docs/css/main.new.css b/docs/css/main.new.css
index 3235c87..2c77a70 100644
--- a/docs/css/main.new.css
+++ b/docs/css/main.new.css
@@ -455,7 +455,7 @@ footer .side-by-side > * {
 }
 
 .section--highlight {
-  color: #9a9a9a;
+  color: #fbfbfb;
   background: #111;
   background-color: #111;
   background-image:  linear-gradient(to bottom, #3f3f3f, transparent 50%);
@@ -632,11 +632,11 @@ footer .side-by-side > * {
 }
 
 .section--highlight .section__item {
-  color: #7d7d7d;
+  color: #fbfbfb;
 }
 
 .section--highlight .section__item-title {
-  color: #848484;
+  color: #ec1c23;
 }
 
 .section__item-title {


[26/32] samza git commit: Add Powered By pages for Samza users in the community

Posted by ja...@apache.org.
Add Powered By pages for Samza users in the community


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/c35e6c7c
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/c35e6c7c
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/c35e6c7c

Branch: refs/heads/master
Commit: c35e6c7c9a625a1974e862aa36b3de32f0f0d6e4
Parents: 2110c5f
Author: Jagadish <jv...@linkedin.com>
Authored: Thu Oct 11 19:10:03 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Thu Oct 11 19:10:03 2018 -0700

----------------------------------------------------------------------
 docs/_powered-by/TEMPLATE.md        |  1 +
 docs/_powered-by/doubledutch.md     | 24 ++++++++++++++++++++++++
 docs/_powered-by/fortscale.md       | 23 +++++++++++++++++++++++
 docs/_powered-by/happypancake.md    | 22 ++++++++++++++++++++++
 docs/_powered-by/improve-digital.md | 22 ++++++++++++++++++++++
 docs/_powered-by/intuit.md          |  1 +
 docs/_powered-by/jha.md             | 25 +++++++++++++++++++++++++
 docs/_powered-by/linkedin.md        |  1 +
 docs/_powered-by/metamarkets.md     | 25 +++++++++++++++++++++++++
 docs/_powered-by/mobileaware.md     |  2 +-
 docs/_powered-by/movio.md           | 22 ++++++++++++++++++++++
 docs/_powered-by/netflix.md         | 24 ++++++++++++++++++++++++
 docs/_powered-by/ntent.md           | 28 ++++++++++++++++++++++++++++
 docs/_powered-by/optimizely.md      | 28 ++++++++++++++++++++++++++++
 docs/_powered-by/redfin.md          | 30 ++++++++++++++++++++++++++++++
 docs/_powered-by/state.md           | 27 +++++++++++++++++++++++++++
 docs/_powered-by/tivo.md            | 25 +++++++++++++++++++++++++
 docs/_powered-by/tripadvisor.md     | 30 ++++++++++++++++++++++++++++++
 docs/_powered-by/vintank.md         | 23 +++++++++++++++++++++++
 docs/_powered-by/vmware.md          | 32 ++++++++++++++++++++++++++++++++
 docs/community/committers.html      |  3 ++-
 docs/powered-by/index.html          |  3 ++-
 22 files changed, 418 insertions(+), 3 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/samza/blob/c35e6c7c/docs/_powered-by/TEMPLATE.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/TEMPLATE.md b/docs/_powered-by/TEMPLATE.md
index 8225f7e..81c59fe 100644
--- a/docs/_powered-by/TEMPLATE.md
+++ b/docs/_powered-by/TEMPLATE.md
@@ -2,6 +2,7 @@
 exclude_from_loop: true # wont be able to find this page, useful for draft
 name: Company # formatted name of company eg LinkedIn
 domain: company.com # just the domain, no protocol
+priority: 2
 ---
 <!--
    Licensed to the Apache Software Foundation (ASF) under one or more

http://git-wip-us.apache.org/repos/asf/samza/blob/c35e6c7c/docs/_powered-by/doubledutch.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/doubledutch.md b/docs/_powered-by/doubledutch.md
new file mode 100644
index 0000000..95ae897
--- /dev/null
+++ b/docs/_powered-by/doubledutch.md
@@ -0,0 +1,24 @@
+---
+name: DoubleDutch
+domain: doubledutch.me
+---
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<a class="external-link" href="http://www.doubledutch.me" rel="nofollow">DoubleDutch</a> provides mobile applications and performance analytics for events, conferences, and trade shows for more than 1,000 customers including SAP, UBM, and Urban Land Institute. It uses Samza to power its analytics platform and stream data live into an event dashboard for real-time insights.
+
+

http://git-wip-us.apache.org/repos/asf/samza/blob/c35e6c7c/docs/_powered-by/fortscale.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/fortscale.md b/docs/_powered-by/fortscale.md
new file mode 100644
index 0000000..acbfea7
--- /dev/null
+++ b/docs/_powered-by/fortscale.md
@@ -0,0 +1,23 @@
+---
+name: Fortscale
+domain: fortscale.com
+---
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<a class="external-link" href="https://www.fortscale.com/" rel="nofollow">Fortscale</a> is redefining behavioral analytics with the industry’s first embeddable engine, making behavioral analytics available to everyone. It uses Samza to process security events as part of its data ingestion pipelines and to create online machine learning models.
+

http://git-wip-us.apache.org/repos/asf/samza/blob/c35e6c7c/docs/_powered-by/happypancake.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/happypancake.md b/docs/_powered-by/happypancake.md
new file mode 100644
index 0000000..d4a68da
--- /dev/null
+++ b/docs/_powered-by/happypancake.md
@@ -0,0 +1,22 @@
+---
+name: HappyPancake
+domain: happypancake.com
+---
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<a class="external-link" href="https://www.happypancake.com/" rel="nofollow">Happy Pancake</a>, Northern Europe's largest internet dating service, is using Samza for all event handlers and data replication.

http://git-wip-us.apache.org/repos/asf/samza/blob/c35e6c7c/docs/_powered-by/improve-digital.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/improve-digital.md b/docs/_powered-by/improve-digital.md
new file mode 100644
index 0000000..d275960
--- /dev/null
+++ b/docs/_powered-by/improve-digital.md
@@ -0,0 +1,22 @@
+---
+name: ImproveDigital
+domain: improvedigital.com
+---
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<a class="external-link" href="https://www.improvedigital.com/" rel="nofollow">Improve Digital</a> is using Samza as the foundation of its realtime processing capabilities, data analytics needs and alerting systems.

http://git-wip-us.apache.org/repos/asf/samza/blob/c35e6c7c/docs/_powered-by/intuit.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/intuit.md b/docs/_powered-by/intuit.md
index e77eddc..ac04c77 100644
--- a/docs/_powered-by/intuit.md
+++ b/docs/_powered-by/intuit.md
@@ -1,6 +1,7 @@
 ---
 name: Intuit
 domain: intuit.com
+priority: 05
 ---
 <!--
    Licensed to the Apache Software Foundation (ASF) under one or more

http://git-wip-us.apache.org/repos/asf/samza/blob/c35e6c7c/docs/_powered-by/jha.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/jha.md b/docs/_powered-by/jha.md
new file mode 100644
index 0000000..3574d5d
--- /dev/null
+++ b/docs/_powered-by/jha.md
@@ -0,0 +1,25 @@
+---
+name: Banno
+domain: banno.com
+---
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<a class="external-link" href="http://www.banno.com" rel="nofollow">Jack Henry and Associates</a> is an S&P 400 company that supports more than 11,300 financial institutions with core processing services. It leverages Samza to process user activity data across its Banno suite of products for financial institutions.
+
+
+

http://git-wip-us.apache.org/repos/asf/samza/blob/c35e6c7c/docs/_powered-by/linkedin.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/linkedin.md b/docs/_powered-by/linkedin.md
index 2ba4c40..9b1adcf 100644
--- a/docs/_powered-by/linkedin.md
+++ b/docs/_powered-by/linkedin.md
@@ -1,6 +1,7 @@
 ---
 name: LinkedIn
 domain: linkedin.com
+priority: 04
 ---
 <!--
    Licensed to the Apache Software Foundation (ASF) under one or more

http://git-wip-us.apache.org/repos/asf/samza/blob/c35e6c7c/docs/_powered-by/metamarkets.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/metamarkets.md b/docs/_powered-by/metamarkets.md
new file mode 100644
index 0000000..7ca3111
--- /dev/null
+++ b/docs/_powered-by/metamarkets.md
@@ -0,0 +1,25 @@
+---
+name: Metamarkets
+domain: metamarkets.com
+---
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<a class="external-link" href="http://www.metamarkets.com" rel="nofollow">Metamarkets</a> offers an interactive analytics platform for buyers and sellers of programmatic advertising. It uses Samza to transform and join real-time event streams, then forward them into a Druid cluster for interactive querying.
+
+
+

http://git-wip-us.apache.org/repos/asf/samza/blob/c35e6c7c/docs/_powered-by/mobileaware.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/mobileaware.md b/docs/_powered-by/mobileaware.md
index ea07155..08824c5 100644
--- a/docs/_powered-by/mobileaware.md
+++ b/docs/_powered-by/mobileaware.md
@@ -19,4 +19,4 @@ domain: mobileaware.com
    limitations under the License.
 -->
 
-At <a class="external-link" href="https://www.mobileaware.com/" rel="nofollow">MobileAware</a>, we use Samza to enrich events with more contextual data from various sources (CMDB, Change Management, Incident Management, Problem Management). This gives us more meaningful events that an operations centre person can act on.
\ No newline at end of file
+At <a class="external-link" href="https://www.mobileaware.com/" rel="nofollow">MobileAware</a>, we use Samza to enrich events with more contextual data from various sources (CMDB, Change Management, Incident Management, Problem Management). This gives us more meaningful events that an operations centre person can act on.

http://git-wip-us.apache.org/repos/asf/samza/blob/c35e6c7c/docs/_powered-by/movio.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/movio.md b/docs/_powered-by/movio.md
new file mode 100644
index 0000000..4a8b661
--- /dev/null
+++ b/docs/_powered-by/movio.md
@@ -0,0 +1,22 @@
+---
+name: Movio
+domain: movio.co
+---
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<a class="external-link" href="http://movio.co" rel="nofollow">Movio</a> offers data-driven marketing solutions for the film industry. It uses Samza to process and enrich billions of change data capture events on all databases in real time.

http://git-wip-us.apache.org/repos/asf/samza/blob/c35e6c7c/docs/_powered-by/netflix.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/netflix.md b/docs/_powered-by/netflix.md
new file mode 100644
index 0000000..98c74d9
--- /dev/null
+++ b/docs/_powered-by/netflix.md
@@ -0,0 +1,24 @@
+---
+name: Netflix
+domain: netflix.com
+---
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<a class="external-link" href="http://www.netflix.com" rel="nofollow">Netflix</a> uses single-stage Samza jobs to route over 700 billion events / 1 petabyte per day from fronting Kafka clusters to S3/Hive. A portion of these events is routed to Kafka and Elasticsearch with support for custom index creation, basic filtering, and projection. It runs over 10,000 Samza jobs in as many Docker containers.
+
+

http://git-wip-us.apache.org/repos/asf/samza/blob/c35e6c7c/docs/_powered-by/ntent.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/ntent.md b/docs/_powered-by/ntent.md
new file mode 100644
index 0000000..05660f5
--- /dev/null
+++ b/docs/_powered-by/ntent.md
@@ -0,0 +1,28 @@
+---
+name: Ntent
+domain: ntent.com
+---
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<a class="external-link" href="http://www.ntent.com" rel="nofollow">Ntent</a> blends semantic search with natural language processing technologies to predict and create relevant content experiences. It uses Samza to power its streaming content ingestion system. Ntent takes crawled web pages and news articles, and passes them through a multi-stage processing pipeline that cleanses, classifies, extracts features that power other learning models, stores, and indexes the content for search.
+
+
+
+
+
+

http://git-wip-us.apache.org/repos/asf/samza/blob/c35e6c7c/docs/_powered-by/optimizely.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/optimizely.md b/docs/_powered-by/optimizely.md
new file mode 100644
index 0000000..4103329
--- /dev/null
+++ b/docs/_powered-by/optimizely.md
@@ -0,0 +1,28 @@
+---
+name: Optimizely
+domain: optimizely.com
+priority: 07
+---
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<a class="external-link" href="https://www.optimizely.com" rel="nofollow">Optimizely</a>, the world's leader in customer experience optimization, uses Apache Samza to aggregate and enrich billions of events per day to power real-time analytics of Experiments and Personalization experiences.
+
+
+
+
+

http://git-wip-us.apache.org/repos/asf/samza/blob/c35e6c7c/docs/_powered-by/redfin.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/redfin.md b/docs/_powered-by/redfin.md
new file mode 100644
index 0000000..3c9affe
--- /dev/null
+++ b/docs/_powered-by/redfin.md
@@ -0,0 +1,30 @@
+---
+name: Redfin
+domain: redfin.com
+priority: 06
+---
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<a class="external-link" href="https://redfin.com" rel="nofollow">Redfin</a> provides real estate search and brokerage services through a combination of real estate web platforms. It uses Samza and Kafka to send millions of email and push notifications to its customers every day. Redfin chose Samza for distributed processing because it integrates well with Kafka. Samza also provides managed state and resilient local storage, which Redfin found to be very useful features.
+
+
+
+
+
+
+

http://git-wip-us.apache.org/repos/asf/samza/blob/c35e6c7c/docs/_powered-by/state.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/state.md b/docs/_powered-by/state.md
new file mode 100644
index 0000000..28cb0c1
--- /dev/null
+++ b/docs/_powered-by/state.md
@@ -0,0 +1,27 @@
+---
+name: State
+domain: state.com
+---
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<a class="external-link" href="https://state.com" rel="nofollow">State</a> is a public global opinion network that focuses on empowering individuals, democracy, and social progress. It uses Samza to process and join streams of changes from MongoDB to update a wide range of realtime services that support the website and mobile apps. These include search, user recommendations, opinion metrics and lots more.
+
+
+
+
+

http://git-wip-us.apache.org/repos/asf/samza/blob/c35e6c7c/docs/_powered-by/tivo.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/tivo.md b/docs/_powered-by/tivo.md
new file mode 100644
index 0000000..99c2d21
--- /dev/null
+++ b/docs/_powered-by/tivo.md
@@ -0,0 +1,25 @@
+---
+name: Tivo
+domain: tivo.com
+priority: 03
+---
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<a class="external-link" href="https://www.tivo.com" rel="nofollow">TiVo</a> is a digital video recorder that allows users to save TV programs for later viewing based on an electronic TV programming schedule. It uses Samza to do online processing of views and ratings to help power personalized content recommendations and analytics.
+
+

http://git-wip-us.apache.org/repos/asf/samza/blob/c35e6c7c/docs/_powered-by/tripadvisor.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/tripadvisor.md b/docs/_powered-by/tripadvisor.md
new file mode 100644
index 0000000..6980a82
--- /dev/null
+++ b/docs/_powered-by/tripadvisor.md
@@ -0,0 +1,30 @@
+---
+name: Tripadvisor
+domain: tripadvisor.com
+priority: 02
+---
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<a class="external-link" href="https://tripadvisor.com" rel="nofollow">Tripadvisor</a> is the world's largest travel site, enabling travelers to plan and book the perfect trip. It uses Apache Samza to process billions of events daily for analytics, machine learning, and site improvement.
+
+
+
+
+
+
+

http://git-wip-us.apache.org/repos/asf/samza/blob/c35e6c7c/docs/_powered-by/vintank.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/vintank.md b/docs/_powered-by/vintank.md
new file mode 100644
index 0000000..ecd6842
--- /dev/null
+++ b/docs/_powered-by/vintank.md
@@ -0,0 +1,23 @@
+---
+name: VinTank
+domain: vintank.com
+---
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<a class="external-link" href="https://www.crunchbase.com/organization/vintank" rel="nofollow">VinTank</a> is the leading software solution for social media management for the wine and hospitality industry. It uses Samza to power its social media analysis and NLP pipeline. Measuring over one billion conversations about wine, profiling over 30 million social wine consumers and serving over 1000 wine brands, VinTank helps wineries, restaurants, and hotels connect with and understand their customers.
+

http://git-wip-us.apache.org/repos/asf/samza/blob/c35e6c7c/docs/_powered-by/vmware.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/vmware.md b/docs/_powered-by/vmware.md
new file mode 100644
index 0000000..5699bc1
--- /dev/null
+++ b/docs/_powered-by/vmware.md
@@ -0,0 +1,32 @@
+---
+name: Vmware
+domain: vmware.com
+priority: 01
+---
+<!--
+   Licensed to the Apache Software Foundation (ASF) under one or more
+   contributor license agreements.  See the NOTICE file distributed with
+   this work for additional information regarding copyright ownership.
+   The ASF licenses this file to You under the Apache License, Version 2.0
+   (the "License"); you may not use this file except in compliance with
+   the License.  You may obtain a copy of the License at
+
+       http://www.apache.org/licenses/LICENSE-2.0
+
+   Unless required by applicable law or agreed to in writing, software
+   distributed under the License is distributed on an "AS IS" BASIS,
+   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+   See the License for the specific language governing permissions and
+   limitations under the License.
+-->
+
+<a class="external-link" href="http://www.vmware.com/products/vrealize-network-insight.html" rel="nofollow">vRealize Network Insight (vRNI)</a> is VMware’s flagship product for delivering intelligent operations for software defined network environments (e.g. NSX).
+ 
+At the heart of the vRNI architecture is a set of distributed processing and analytics modules that crunch large amounts of streaming data on a cluster of multiple machines. It is critical that these operations are carried out in a way that is reliable, efficient and robust, even in the face of dynamic faults in the underlying infrastructure layers. VMware has been successfully using Apache Samza as a distributed streaming data processing framework for executing these analytical modules reliably and efficiently at a very large scale, which lets the team focus on its core business problems.
+
+
+
+
+
+
+

http://git-wip-us.apache.org/repos/asf/samza/blob/c35e6c7c/docs/community/committers.html
----------------------------------------------------------------------
diff --git a/docs/community/committers.html b/docs/community/committers.html
index ab4e2ed..3b65181 100644
--- a/docs/community/committers.html
+++ b/docs/community/committers.html
@@ -20,8 +20,9 @@ exclude_from_loop: true
    limitations under the License.
 -->
 
+<h6>
 Samza is developed by a friendly community of contributors. The Samza Project Management Committee (PMC) is responsible for the management and oversight of the project.
-
+</h6>
 <hr class="committers-hr"/>
 
 <ul class="committers">

http://git-wip-us.apache.org/repos/asf/samza/blob/c35e6c7c/docs/powered-by/index.html
----------------------------------------------------------------------
diff --git a/docs/powered-by/index.html b/docs/powered-by/index.html
index 3838165..2f7a971 100644
--- a/docs/powered-by/index.html
+++ b/docs/powered-by/index.html
@@ -23,8 +23,9 @@ exclude_from_loop: true
 A list of companies powered by Samza
 
 <ul class="powered-by">
+{% assign sorted = site.powered-by | sort: 'priority' %}
 
-  {% for company in site.powered-by %}
+  {% for company in sorted %} 
     {% if company.exclude_from_loop %}
         {% continue %}
     {% endif %}


[23/32] samza git commit: Add images for Samza's core concepts

Posted by ja...@apache.org.
Add images for Samza's core concepts


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/341c06b4
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/341c06b4
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/341c06b4

Branch: refs/heads/master
Commit: 341c06b47bc4333817d5e53214f8731aeb7e8004
Parents: 8505064
Author: Jagadish <jv...@linkedin.com>
Authored: Thu Oct 11 16:21:24 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Thu Oct 11 16:21:24 2018 -0700

----------------------------------------------------------------------
 .../core-concepts/stream-application.png           | Bin 0 -> 6383 bytes
 .../core-concepts/streams-partitions.png           | Bin 0 -> 13432 bytes
 2 files changed, 0 insertions(+), 0 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/samza/blob/341c06b4/docs/img/versioned/learn/documentation/core-concepts/stream-application.png
----------------------------------------------------------------------
diff --git a/docs/img/versioned/learn/documentation/core-concepts/stream-application.png b/docs/img/versioned/learn/documentation/core-concepts/stream-application.png
new file mode 100644
index 0000000..060f4a6
Binary files /dev/null and b/docs/img/versioned/learn/documentation/core-concepts/stream-application.png differ

http://git-wip-us.apache.org/repos/asf/samza/blob/341c06b4/docs/img/versioned/learn/documentation/core-concepts/streams-partitions.png
----------------------------------------------------------------------
diff --git a/docs/img/versioned/learn/documentation/core-concepts/streams-partitions.png b/docs/img/versioned/learn/documentation/core-concepts/streams-partitions.png
new file mode 100644
index 0000000..8201d13
Binary files /dev/null and b/docs/img/versioned/learn/documentation/core-concepts/streams-partitions.png differ


[08/32] samza git commit: Fix links corresponding to images in the YARN documentation page

Posted by ja...@apache.org.
Fix links corresponding to images in the YARN documentation page


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/8ce2a9eb
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/8ce2a9eb
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/8ce2a9eb

Branch: refs/heads/master
Commit: 8ce2a9ebe4b7f28ab8cb7a9e665909fdb1f92173
Parents: 8970dd2
Author: Jagadish <jv...@linkedin.com>
Authored: Tue Oct 2 22:48:04 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Tue Oct 2 22:48:04 2018 -0700

----------------------------------------------------------------------
 docs/_layouts/default.html                            |  2 +-
 docs/learn/documentation/versioned/deployment/yarn.md | 13 +++++--------
 2 files changed, 6 insertions(+), 9 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/samza/blob/8ce2a9eb/docs/_layouts/default.html
----------------------------------------------------------------------
diff --git a/docs/_layouts/default.html b/docs/_layouts/default.html
index ba10e8b..bcc4a50 100644
--- a/docs/_layouts/default.html
+++ b/docs/_layouts/default.html
@@ -112,7 +112,7 @@
       <p>
         Samza allows you to build stateful applications that process data in real-time from multiple sources including Apache Kafka.
         <br/> <br/> 
-        Battle-tested at scale, it supports flexible deployment options to run on <a target="_blank" href="https://kafka.apache.org">YARN</a> or as a 
+        Battle-tested at scale, it supports flexible deployment options to run on <a href="/learn/documentation/latest/deployment/yarn.html">YARN</a> or as a 
         <a href="/learn/documentation/latest/deployment/standalone.html">standalone library</a>.
       </p>
 

http://git-wip-us.apache.org/repos/asf/samza/blob/8ce2a9eb/docs/learn/documentation/versioned/deployment/yarn.md
----------------------------------------------------------------------
diff --git a/docs/learn/documentation/versioned/deployment/yarn.md b/docs/learn/documentation/versioned/deployment/yarn.md
index d6a8f55..e9e7f03 100644
--- a/docs/learn/documentation/versioned/deployment/yarn.md
+++ b/docs/learn/documentation/versioned/deployment/yarn.md
@@ -116,26 +116,23 @@ yarn.package.path=https://url/to/artifact/artifact-version-dist.tar.gz
 The AM implementation in Samza exposes metadata about the job via both a JSON REST interface and a Web UI.
 This Web UI can be accessed by clicking the Tracking UI (*ApplicationMaster*) link on the YARN RM dashboard.
 
-<img src="/img/versioned/learn/documentation/yarn/yarn-am-ui.png" alt="yarn-ui" class="diagram-small">
-
+![diagram-medium](/img/{{site.version}}/learn/documentation/yarn/yarn-am-ui.png)
 
 The Application Master UI provides you the ability to view:
 
  - Job level runtime metadata
+![diagram-small](/img/{{site.version}}/learn/documentation/yarn/am-runtime-metadata.png)
 
-<img src="/img/versioned/learn/documentation/yarn/am-runtime-metadata.png" alt="yarn-runtime-metadata" class="diagram-small">
 
  - Container information
-
-<img src="/img/versioned/learn/documentation/yarn/am-container-info.png" alt="yam-container-info" class="diagram-small">
+![diagram-small](/img/{{site.version}}/learn/documentation/yarn/am-container-info.png)
 
  - Job model (SystemStreamPartition to Task and Container mapping)
+![diagram-small](/img/{{site.version}}/learn/documentation/yarn/am-job-model.png)
 
-<img src="/img/versioned/learn/documentation/yarn/am-job-model.png" alt="am-job-model" class="diagram-small">
 
 - Runtime configs
-
- <img src="/img/versioned/learn/documentation/yarn/am-runtime-configs.png" alt="am-runtime-configs" class="diagram-small">
+![diagram-small](/img/{{site.version}}/learn/documentation/yarn/am-runtime-configs.png)
 
 
 # Viewing logs


[25/32] samza git commit: Update PMCs and committers

Posted by ja...@apache.org.
Update PMCs and committers


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/2110c5f5
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/2110c5f5
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/2110c5f5

Branch: refs/heads/master
Commit: 2110c5f5d439fa1dc93131f2c0ec92f6a2eabea3
Parents: 3c88699
Author: Jagadish <jv...@linkedin.com>
Authored: Thu Oct 11 16:49:40 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Thu Oct 11 16:49:40 2018 -0700

----------------------------------------------------------------------
 docs/_committers/angela-murrell.md | 28 ----------------------------
 docs/community/committers.html     |  2 +-
 2 files changed, 1 insertion(+), 29 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/samza/blob/2110c5f5/docs/_committers/angela-murrell.md
----------------------------------------------------------------------
diff --git a/docs/_committers/angela-murrell.md b/docs/_committers/angela-murrell.md
deleted file mode 100644
index d1125f6..0000000
--- a/docs/_committers/angela-murrell.md
+++ /dev/null
@@ -1,28 +0,0 @@
----
-name: Angela Murrell
-website: 
-linkedin: https://www.linkedin.com/in/angela-murrell-92689088/
-twitter: angie_splice
-image: 
-github: amurrell
-pmc_member: false
-job_title:
-samza_title:
-order: 100
----
-<!--
-   Licensed to the Apache Software Foundation (ASF) under one or more
-   contributor license agreements.  See the NOTICE file distributed with
-   this work for additional information regarding copyright ownership.
-   The ASF licenses this file to You under the Apache License, Version 2.0
-   (the "License"); you may not use this file except in compliance with
-   the License.  You may obtain a copy of the License at
-
-       http://www.apache.org/licenses/LICENSE-2.0
-
-   Unless required by applicable law or agreed to in writing, software
-   distributed under the License is distributed on an "AS IS" BASIS,
-   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-   See the License for the specific language governing permissions and
-   limitations under the License.
--->
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/samza/blob/2110c5f5/docs/community/committers.html
----------------------------------------------------------------------
diff --git a/docs/community/committers.html b/docs/community/committers.html
index d155fb9..ab4e2ed 100644
--- a/docs/community/committers.html
+++ b/docs/community/committers.html
@@ -20,7 +20,7 @@ exclude_from_loop: true
    limitations under the License.
 -->
 
-A list of people who have contributed to Samza
+Samza is developed by a friendly community of contributors. The Samza Project Management Committee (PMC) is responsible for the management and oversight of the project.
 
 <hr class="committers-hr"/>
 


[14/32] samza git commit: Merge branch 'master' of https://github.com/apache/samza

Posted by ja...@apache.org.
Merge branch 'master' of https://github.com/apache/samza


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/e384e5ce
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/e384e5ce
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/e384e5ce

Branch: refs/heads/master
Commit: e384e5ce0eb347c8cc29101cf5c592e63f44ff4f
Parents: 2e7a6cd 4baaddb
Author: Jagadish <jv...@linkedin.com>
Authored: Fri Oct 5 11:05:37 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Fri Oct 5 11:05:37 2018 -0700

----------------------------------------------------------------------
 .../documentation/versioned/api/table-api.md    |  2 +-
 .../versioned/connectors/overview.md            | 32 ++++++++++++++++++--
 2 files changed, 31 insertions(+), 3 deletions(-)
----------------------------------------------------------------------



[21/32] samza git commit: Add an architecture page

Posted by ja...@apache.org.
Add an architecture page


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/cad265fa
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/cad265fa
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/cad265fa

Branch: refs/heads/master
Commit: cad265fa263e31c4d5f477d77084015268954b2f
Parents: ace5c65
Author: Jagadish <jv...@linkedin.com>
Authored: Thu Oct 11 16:18:52 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Thu Oct 11 16:18:52 2018 -0700

----------------------------------------------------------------------
 .../architecture/distributed-execution.png      | Bin 0 -> 16510 bytes
 .../architecture/fault-tolerance.png            | Bin 0 -> 20244 bytes
 .../architecture/incremental-checkpointing.png  | Bin 0 -> 15352 bytes
 .../documentation/architecture/state-store.png  | Bin 0 -> 16436 bytes
 .../architecture/task-assignment.png            | Bin 0 -> 11384 bytes
 .../architecture/architecture-overview.md       |  65 ++++++++++++++++++-
 .../versioned/core-concepts/core-concepts.md    |  65 ++++++++++++++++++-
 7 files changed, 128 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/samza/blob/cad265fa/docs/img/versioned/learn/documentation/architecture/distributed-execution.png
----------------------------------------------------------------------
diff --git a/docs/img/versioned/learn/documentation/architecture/distributed-execution.png b/docs/img/versioned/learn/documentation/architecture/distributed-execution.png
new file mode 100644
index 0000000..b4f7714
Binary files /dev/null and b/docs/img/versioned/learn/documentation/architecture/distributed-execution.png differ

http://git-wip-us.apache.org/repos/asf/samza/blob/cad265fa/docs/img/versioned/learn/documentation/architecture/fault-tolerance.png
----------------------------------------------------------------------
diff --git a/docs/img/versioned/learn/documentation/architecture/fault-tolerance.png b/docs/img/versioned/learn/documentation/architecture/fault-tolerance.png
new file mode 100644
index 0000000..5146f03
Binary files /dev/null and b/docs/img/versioned/learn/documentation/architecture/fault-tolerance.png differ

http://git-wip-us.apache.org/repos/asf/samza/blob/cad265fa/docs/img/versioned/learn/documentation/architecture/incremental-checkpointing.png
----------------------------------------------------------------------
diff --git a/docs/img/versioned/learn/documentation/architecture/incremental-checkpointing.png b/docs/img/versioned/learn/documentation/architecture/incremental-checkpointing.png
new file mode 100644
index 0000000..4465ed5
Binary files /dev/null and b/docs/img/versioned/learn/documentation/architecture/incremental-checkpointing.png differ

http://git-wip-us.apache.org/repos/asf/samza/blob/cad265fa/docs/img/versioned/learn/documentation/architecture/state-store.png
----------------------------------------------------------------------
diff --git a/docs/img/versioned/learn/documentation/architecture/state-store.png b/docs/img/versioned/learn/documentation/architecture/state-store.png
new file mode 100644
index 0000000..6cf8c24
Binary files /dev/null and b/docs/img/versioned/learn/documentation/architecture/state-store.png differ

http://git-wip-us.apache.org/repos/asf/samza/blob/cad265fa/docs/img/versioned/learn/documentation/architecture/task-assignment.png
----------------------------------------------------------------------
diff --git a/docs/img/versioned/learn/documentation/architecture/task-assignment.png b/docs/img/versioned/learn/documentation/architecture/task-assignment.png
new file mode 100644
index 0000000..9cd4ada
Binary files /dev/null and b/docs/img/versioned/learn/documentation/architecture/task-assignment.png differ

http://git-wip-us.apache.org/repos/asf/samza/blob/cad265fa/docs/learn/documentation/versioned/architecture/architecture-overview.md
----------------------------------------------------------------------
diff --git a/docs/learn/documentation/versioned/architecture/architecture-overview.md b/docs/learn/documentation/versioned/architecture/architecture-overview.md
index 6c1fbb1..282352c 100644
--- a/docs/learn/documentation/versioned/architecture/architecture-overview.md
+++ b/docs/learn/documentation/versioned/architecture/architecture-overview.md
@@ -19,5 +19,68 @@ title: Architecture page
    limitations under the License.
 -->
 
-## Samza architecture page
+- [Distributed execution](#distributed-execution)
+     - [Task](#task)
+     - [Container](#container)
+     - [Coordinator](#coordinator)
+- [Threading model and ordering](#threading-model)
+- [Incremental checkpointing](#incremental-checkpoints)
+- [State management](#state-management)
+- [Fault tolerance of state](#fault-tolerance-of-state)
+- [Host affinity](#host-affinity)
+
+
+
+## Distributed execution
+
+### Task 
+
+![diagram-large](/img/{{site.version}}/learn/documentation/architecture/task-assignment.png)
+
+Samza scales your application by logically breaking it down into multiple tasks. A task is the unit of parallelism for your application. Each task consumes data from one partition of your input streams. The assignment of partitions to tasks never changes: if a task is on a machine that fails, the task is restarted elsewhere, still consuming the same stream partitions. Since there is no ordering of messages across partitions, tasks can execute entirely independently of each other without sharing any state. 
+
+
+### Container 
+![diagram-large](/img/{{site.version}}/learn/documentation/architecture/distributed-execution.png)
+
+Just as a task is the logical unit of parallelism for your application, a container is the physical unit. You can think of each container (also called a worker) as a JVM process that runs one or more tasks. An application typically has multiple containers distributed across hosts. 
+
+### Coordinator
+Each application also has a coordinator which manages the assignment of tasks across the individual containers. The coordinator monitors the liveness of individual containers and redistributes the tasks among the remaining ones during a failure. <br/><br/>
+The coordinator itself is pluggable, enabling Samza to support multiple deployment options. You can use Samza as a light-weight embedded library that easily integrates with a larger application. Alternatively, you can deploy and run it as a managed framework using a cluster-manager like YARN. It is worth noting that Samza is the only system that offers first-class support for both these deployment options. Some systems like Kafka Streams only support the embedded library model, while others like Flink and Spark Streaming only offer the framework model for stream-processing.
+
+## Threading model and ordering
+
+Samza offers a flexible threading model to run each task. When running your applications, you can control the number of workers needed to process your data. You can also configure the number of threads each worker uses to run its assigned tasks. Each thread can run one or more tasks. Tasks don’t share any state - hence, you don’t have to worry about coordination across these threads. 
+
+Another common scenario in stream processing is interacting with remote services or databases. For example, a notification system might process each incoming message, generate an email and invoke a REST API to deliver it. Samza offers a fully asynchronous API for use-cases like this, which require high-throughput remote I/O. 
+
+By default, all messages delivered to a task are processed by the same thread. This guarantees in-order processing of messages within a partition. However, some applications don’t care about in-order processing of messages. For such use-cases, Samza also supports processing messages out-of-order within a single partition. This typically offers higher throughput by allowing for multiple concurrent messages in each partition.
+
+## Incremental checkpointing 
+![diagram-large](/img/{{site.version}}/learn/documentation/architecture/incremental-checkpointing.png)
+
+Samza guarantees that messages won’t be lost, even if your job crashes, if a machine dies, if there is a network fault, or something else goes wrong. To achieve this property, each task periodically persists the last processed offsets for its input stream partitions. If a task needs to be restarted on a different worker due to a failure, it resumes processing from its latest checkpoint. 
+
+Samza’s checkpointing mechanism ensures each task also stores the contents of its state-store consistently with its last processed offsets. Checkpoints are flushed incrementally, i.e., the state-store only flushes the delta since the previous checkpoint instead of flushing its entire state.
+
+## State management
+Samza offers scalable, high-performance storage to enable you to build stateful stream-processing applications. This is implemented by associating each Samza task with its own instance of a local database (aka. a state-store). The state-store associated with a particular task only stores data corresponding to the partitions processed by that task. This is important: when you scale out your job by giving it more computing resources, Samza transparently migrates the tasks from one machine to another. By giving each task its own state, tasks can be relocated without affecting your overall application. 
+![diagram-large](/img/{{site.version}}/learn/documentation/architecture/state-store.png)
+
+Here are some key advantages of this architecture. <br/>
+- The state is stored on disk, so the job can maintain more state than would fit in memory. <br/>
+- It is stored on the same machine as the task, to avoid the performance problems of making database queries over the network. <br/>
+- Each job has its own store, to avoid the isolation issues in a shared remote database (if you make an expensive query, it affects only the current task, nobody else). <br/>
+- Different storage engines can be plugged in - for example, a remote data-store that enables richer query capabilities <br/>
+
+## Fault tolerance of state
+Distributed stream processing systems need to recover quickly from failures to resume their processing. While having a durable local store offers great performance, we should still guarantee fault-tolerance. For this purpose, Samza replicates every change to the local store into a separate stream (called a changelog for the store). This allows you to later recover the data in the store by reading the contents of the changelog from the beginning. A log-compacted Kafka topic is typically used as a changelog since Kafka automatically retains the most recent value for each key.
+![diagram-large](/img/{{site.version}}/learn/documentation/architecture/fault-tolerance.png)
+
+## Host affinity
+If your application has several terabytes of state, then bootstrapping it every time by reading the changelog will stall progress. So, it’s critical to be able to recover state swiftly during failures. For this purpose, Samza takes data-locality into account when scheduling tasks on hosts. This is implemented by persisting metadata about the host each task is currently running on. 
+
+During a new deployment of the application, Samza tries to re-schedule the tasks on the same hosts they were previously on. This enables the task to re-use the snapshot of its local-state from its previous run on that host. We call this feature _host-affinity_ since it tries to preserve the assignment of tasks to hosts. This is a key differentiator that enables Samza applications to scale to several terabytes of local-state with effectively zero downtime.
+
 
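The "Fault tolerance of state" section in the architecture page above describes mirroring every local-store write to a changelog and rebuilding the store by replaying it. The following is a minimal sketch of that idea only; it does not use Samza's or Kafka's APIs. The names `LocalStore`, `restore` and `compact` are hypothetical, and the changelog is assumed to be a plain in-memory list rather than a Kafka topic.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical stand-in for a changelog stream: an append-only list of
# (key, value) records. A value of None marks a deletion.
Changelog = List[Tuple[str, Optional[str]]]


class LocalStore:
    """Toy local state-store that mirrors every change to a changelog."""

    def __init__(self, changelog: Changelog) -> None:
        self.data: Dict[str, str] = {}
        self.changelog = changelog

    def put(self, key: str, value: str) -> None:
        self.data[key] = value
        self.changelog.append((key, value))

    def delete(self, key: str) -> None:
        self.data.pop(key, None)
        self.changelog.append((key, None))


def restore(changelog: Changelog) -> Dict[str, str]:
    """Rebuild store contents by replaying the changelog from the beginning."""
    state: Dict[str, str] = {}
    for key, value in changelog:
        if value is None:
            state.pop(key, None)
        else:
            state[key] = value
    return state


def compact(changelog: Changelog) -> Changelog:
    """Keep only the most recent record per key (a simplified view of log compaction)."""
    latest: Dict[str, Optional[str]] = {}
    for key, value in changelog:
        latest.pop(key, None)  # re-insert so order reflects the last write
        latest[key] = value
    return list(latest.items())
```

Replaying the compacted log yields the same state as replaying the full log, which is why a compacted topic is sufficient for recovery while bounding the amount of data to read.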

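The "Host affinity" section in the architecture page above says Samza persists the host each task ran on and tries to schedule the task there again. A rough sketch of that placement preference follows, under stated assumptions: `place_tasks` and its parameters are hypothetical, the fallback is a simple round-robin, and none of this reflects Samza's actual scheduler.

```python
from typing import Dict, List


def place_tasks(
    tasks: List[str],
    available_hosts: List[str],
    last_host: Dict[str, str],
) -> Dict[str, str]:
    """Assign each task to a host, preferring the host it last ran on."""
    if not available_hosts:
        raise ValueError("no hosts available")
    placement: Dict[str, str] = {}
    fallback_index = 0
    for task in tasks:
        preferred = last_host.get(task)
        if preferred in available_hosts:
            # The previous host is still available: reuse it so the task
            # can pick up its existing local state.
            placement[task] = preferred
        else:
            # Otherwise spread the task over the available hosts; its state
            # would then have to be restored from the changelog.
            placement[task] = available_hosts[fallback_index % len(available_hosts)]
            fallback_index += 1
    return placement
```

For example, with `last_host = {"t0": "h1", "t1": "h2", "t2": "h3"}` and only `h1` and `h2` available, `t0` and `t1` stay where they were and only `t2` is moved.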
http://git-wip-us.apache.org/repos/asf/samza/blob/cad265fa/docs/learn/documentation/versioned/core-concepts/core-concepts.md
----------------------------------------------------------------------
diff --git a/docs/learn/documentation/versioned/core-concepts/core-concepts.md b/docs/learn/documentation/versioned/core-concepts/core-concepts.md
index 449b338..52965a2 100644
--- a/docs/learn/documentation/versioned/core-concepts/core-concepts.md
+++ b/docs/learn/documentation/versioned/core-concepts/core-concepts.md
@@ -18,6 +18,69 @@ title: Core concepts
    See the License for the specific language governing permissions and
    limitations under the License.
 -->
+- [Introduction](#introduction)
+- [Streams, Partitions](#streams-partitions)
+- [Stream Application](#stream-application)
+- [State](#state)
+- [Time](#time)
+- [Processing guarantee](#processing-guarantee)
 
-## Core concepts page
+## Introduction
+
+Apache Samza is a scalable data processing engine that allows you to process and analyze your data in real-time. Here is a summary of Samza’s features that simplify building your applications:
+
+*Unified API:* Use a simple API to describe your application-logic in a manner independent of your data-source. The same API can process both batch and streaming data.
+
+*Pluggability at every level:* Process and transform data from any source. Samza offers built-in integrations with [Apache Kafka](/learn/documentation/{{site.version}}/connectors/kafka.html), [AWS Kinesis](/learn/documentation/{{site.version}}/connectors/kinesis.html), [Azure EventHubs](/learn/documentation/{{site.version}}/connectors/eventhubs.html), ElasticSearch and [Apache Hadoop](/learn/documentation/{{site.version}}/connectors/hdfs.html). It is also easy to integrate with your own sources.
+
+*Samza as an embedded library:* Integrate effortlessly with your existing applications, eliminating the need to spin up and operate a separate cluster for stream processing. Samza can be used as a light-weight client-library [embedded](/learn/documentation/{{site.version}}/deployment/standalone.html) in your Java/Scala applications.
+
+*Write once, Run anywhere:* [Flexible deployment options](/learn/documentation/{{site.version}}/deployment/deployment-model.html)  to run applications anywhere - from public clouds to containerized environments to bare-metal hardware.
+
+*Samza as a managed service:* Run stream-processing as a managed service by integrating with popular cluster-managers including [Apache YARN](https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html). 
+
+*Fault-tolerance:*  Transparently migrate tasks along with their associated state in the event of failures. Samza supports [host-affinity](/learn/documentation/{{site.version}}/architecture/architecture-overview.html#host-affinity) and [incremental checkpointing](/learn/documentation/{{site.version}}/architecture/architecture-overview.html#incremental-checkpoints) to enable fast recovery from failures.
+
+*Massive scale:* Battle-tested on applications that use several terabytes of state and run on thousands of cores. Samza [powers](/powered-by/) applications at multiple large companies, including LinkedIn, Uber, TripAdvisor and Slack.
+
+Next, we will introduce Samza’s terminology. You will realize that it is extremely easy to get started with [building](/quickstart/{{site.version}}) your first stream-processing application. 
+
+
+## Streams, Partitions
+Samza processes your data in the form of streams. A _stream_ is a collection of immutable messages, usually of the same type or category. Each message in a stream is modelled as a key-value pair. 
+
+![diagram-medium](/img/{{site.version}}/learn/documentation/core-concepts/streams-partitions.png)
+<br/>
+A stream can have multiple producers that write data to it and multiple consumers that read data from it. Data in a stream can be unbounded (e.g., a Kafka topic) or bounded (e.g., a set of files on HDFS).
+
+A stream is sharded into multiple partitions for scaling how its data is processed. Each _partition_ is an ordered, replayable sequence of records. When a message is written to a stream, it ends up in one of its partitions. Each message in a partition is uniquely identified by an _offset_.
+
+Samza supports pluggable systems that can implement the stream abstraction. As an example, Kafka implements a stream as a topic, while a database might implement a stream as a sequence of updates to its tables.
+
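To make the terms concrete, here is a minimal in-memory sketch of a partitioned stream in plain Python. It is not Samza or Kafka code: the `Stream` class and the CRC32-based partitioner are assumptions chosen for illustration, and real systems pick their own partitioning function.

```python
import zlib


class Stream:
    """Toy stream: a fixed number of append-only partitions."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def send(self, key, value):
        # Hash the key so that equal keys always land in the same partition.
        partition = zlib.crc32(key.encode("utf-8")) % len(self.partitions)
        log = self.partitions[partition]
        log.append((key, value))
        offset = len(log) - 1  # position of the message within its partition
        return partition, offset

    def read(self, partition, offset):
        # Replayable: any retained offset can be read again.
        return self.partitions[partition][offset]


stream = Stream(num_partitions=4)
p1, o1 = stream.send("user-1", "click")
p2, o2 = stream.send("user-1", "purchase")
```

Both messages share a key, so they land in the same partition with consecutive offsets, and `stream.read(p1, o1)` returns the first message again no matter how often it is called.
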
+## Stream Application
+A _stream application_ processes messages from input streams, transforms them and emits results to an output stream or a database. It is built by chaining multiple operators, each of which takes in one or more streams and transforms them.
+
+![diagram-medium](/img/{{site.version}}/learn/documentation/core-concepts/stream-application.png)
+
+Samza offers three top-level APIs to help you build your stream applications: <br/>
+1. The [Samza Streams DSL](/learn/documentation/{{site.version}}/api/high-level-api.html), which offers several built-in operators such as map and filter. This is the recommended API for most use-cases. <br/>
+2. The [low-level API](/learn/documentation/{{site.version}}/api/low-level-api.html), which offers greater flexibility and control in defining your processing-logic <br/>
+3. [Samza SQL](/learn/documentation/{{site.version}}/api/samza-sql.html), which offers a declarative SQL interface to create your applications <br/>
+
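The idea of chaining operators can be sketched with plain Python generators. This is a conceptual illustration only; `filter_stream` and `map_stream` are hypothetical helpers and do not reflect the signatures of any of the Samza APIs listed above.

```python
def filter_stream(messages, predicate):
    """Operator that keeps only the messages matching the predicate."""
    for key, value in messages:
        if predicate(key, value):
            yield key, value


def map_stream(messages, fn):
    """Operator that transforms the value of every message."""
    for key, value in messages:
        yield key, fn(value)


# Input stream of (key, value) pairs.
page_views = [
    ("u1", {"page": "/home", "ms": 120}),
    ("u2", {"page": "/cart", "ms": 900}),
    ("u1", {"page": "/cart", "ms": 300}),
]

# Chain two operators: filter, then map.
cart_views = filter_stream(page_views, lambda key, value: value["page"] == "/cart")
latencies = map_stream(cart_views, lambda value: value["ms"])
output = list(latencies)
```

Each operator consumes one stream and produces another, so an application is just a composition of such steps ending in an output stream or a database write.
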
+## State
+Samza supports both stateless and stateful stream processing. _Stateless processing_, as the name implies, does not retain any state associated with the current message after it has been processed. A good example of this is filtering an incoming stream of user-records by a field (e.g., userId) and writing the filtered messages to their own stream.
+
+In contrast, _stateful processing_ requires you to record some state about a message even after processing it. Consider the example of counting the number of unique users to a website every five minutes. This requires you to record state about each user seen thus far, for deduping later. Samza offers a fault-tolerant, scalable state-store for this purpose.
+
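The unique-user example can be sketched as follows. This is plain Python with an in-memory dictionary standing in for the state store, so it is neither fault-tolerant nor scalable; `UniqueUserCounter` is a hypothetical name and timestamps are assumed to be epoch seconds.

```python
from collections import defaultdict

WINDOW_SECONDS = 5 * 60


class UniqueUserCounter:
    def __init__(self):
        # State: window start (epoch seconds) -> set of user ids seen so far.
        self.seen = defaultdict(set)

    def process(self, user_id, timestamp):
        window_start = timestamp - (timestamp % WINDOW_SECONDS)
        self.seen[window_start].add(user_id)  # the set dedupes repeat visits

    def count(self, window_start):
        return len(self.seen.get(window_start, ()))


counter = UniqueUserCounter()
for user_id, ts in [("alice", 0), ("bob", 10), ("alice", 299), ("alice", 300)]:
    counter.process(user_id, ts)
```

The first window counts two unique users even though `alice` appears twice in it, which is only possible because the set of users seen so far is kept between messages.
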
+## Time
+Time is a fundamental concept in stream processing, especially how it is modeled and interpreted by the system. Samza supports two notions of time: processing time and event time. By default, all built-in Samza operators use processing time. In processing time, the timestamp of a message is determined by when it is processed by the system. For example, an event generated by a sensor could be processed by Samza several milliseconds later.
+
+On the other hand, in event time, the timestamp of an event is determined by when it actually occurred at the source. For example, a sensor which generates an event could embed the time of occurrence as a part of the event itself. Samza provides event-time based processing through its integration with [Apache Beam](https://beam.apache.org/documentation/runners/samza/).
+
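The difference between the two notions shows up as soon as events are grouped into windows. The sketch below is plain Python with hypothetical names (`window_start`, `assign_window`, `event_time_ms`) and assumes one-minute windows with millisecond timestamps.

```python
def window_start(timestamp_ms, window_ms=60_000):
    """Start of the fixed-size window that contains the timestamp."""
    return timestamp_ms - (timestamp_ms % window_ms)


def assign_window(event, arrival_ms, use_event_time):
    # Event time reads the timestamp embedded in the event itself;
    # processing time uses the moment the system handles the event.
    ts = event["event_time_ms"] if use_event_time else arrival_ms
    return window_start(ts)


event = {"sensor": "s1", "event_time_ms": 59_500}  # occurred just before the minute boundary
arrival_ms = 60_200                                # processed 700 ms later

by_event_time = assign_window(event, arrival_ms, use_event_time=True)
by_processing_time = assign_window(event, arrival_ms, use_event_time=False)
```

The same event falls into the window starting at 0 under event time but into the window starting at 60000 under processing time, because it crossed the minute boundary while in flight.
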
+## Processing guarantee
+Samza supports at-least-once processing. As the name implies, this ensures that each message in the input stream is processed by the system at least once. This guarantees no data loss even when there are failures, making Samza a practical choice for building fault-tolerant applications.
+
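One common way to obtain an at-least-once guarantee is to checkpoint the input offset only after messages have been processed, and to resume from the last checkpoint after a failure. The sketch below illustrates that general technique in plain Python; the `run` function, the `commit_every` interval and the simulated crash are assumptions made for illustration and are not Samza's implementation.

```python
def run(messages, checkpoint, processed, commit_every=2, fail_at=None):
    """Process from the checkpointed offset, committing every few messages."""
    offset = checkpoint["offset"]
    while offset < len(messages):
        if fail_at is not None and offset == fail_at:
            raise RuntimeError("simulated crash")
        processed.append(messages[offset])  # the processing side effect
        offset += 1
        if offset % commit_every == 0:
            checkpoint["offset"] = offset   # commit only after processing
    checkpoint["offset"] = offset


messages = ["a", "b", "c", "d", "e"]
checkpoint = {"offset": 0}
processed = []

try:
    run(messages, checkpoint, processed, fail_at=3)  # crashes before "d"
except RuntimeError:
    pass

run(messages, checkpoint, processed)  # restart from the last checkpoint
```

After the restart, `"c"` is processed a second time because it was handled after the last committed offset. No message is lost, but duplicates are possible, so downstream logic should tolerate reprocessing.
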
+
+Next Steps: We are now ready to have a closer look at Samza’s architecture.
+## [Architecture &raquo;](/learn/documentation/{{site.version}}/architecture/architecture-overview.html)
 


[31/32] samza git commit: Merge branch 'master' of https://github.com/apache/samza

Posted by ja...@apache.org.
Merge branch 'master' of https://github.com/apache/samza


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/da808fc3
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/da808fc3
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/da808fc3

Branch: refs/heads/master
Commit: da808fc3b55f7f7feb668e7550434ae66c8fb5d0
Parents: a05fee9 6dd0122
Author: Jagadish <jv...@linkedin.com>
Authored: Thu Oct 11 19:22:30 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Thu Oct 11 19:22:30 2018 -0700

----------------------------------------------------------------------

----------------------------------------------------------------------



[29/32] samza git commit: Add Powered By pages for Samza users in the community

Posted by ja...@apache.org.
Add Powered By pages for Samza users in the community


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/9cd7cdda
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/9cd7cdda
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/9cd7cdda

Branch: refs/heads/master
Commit: 9cd7cdda97b6b5e75998cab77e93d05ca04770df
Parents: a8d06cc
Author: Jagadish <jv...@linkedin.com>
Authored: Thu Oct 11 19:18:57 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Thu Oct 11 19:18:57 2018 -0700

----------------------------------------------------------------------
 docs/_powered-by/jha.md         | 5 +----
 docs/_powered-by/metamarkets.md | 5 +----
 docs/_powered-by/mobileaware.md | 2 +-
 docs/_powered-by/movio.md       | 2 +-
 docs/_powered-by/netflix.md     | 4 +---
 docs/_powered-by/ntent.md       | 8 +-------
 docs/_powered-by/optimizely.md  | 7 +------
 docs/_powered-by/redfin.md      | 9 +--------
 docs/_powered-by/state.md       | 5 -----
 docs/_powered-by/tivo.md        | 4 +---
 docs/_powered-by/tripadvisor.md | 9 +--------
 docs/_powered-by/vintank.md     | 3 +--
 docs/_powered-by/vmware.md      | 4 ----
 13 files changed, 11 insertions(+), 56 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/samza/blob/9cd7cdda/docs/_powered-by/jha.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/jha.md b/docs/_powered-by/jha.md
index 3574d5d..84c098f 100644
--- a/docs/_powered-by/jha.md
+++ b/docs/_powered-by/jha.md
@@ -19,7 +19,4 @@ domain: banno.com
    limitations under the License.
 -->
 
-<a class="external-link" href="www.banno.com" rel="nofollow">Jack Henry and Associates</a>  is an S&P 400 company that supports more than 11,300 financial institutions with core processing services. It leverages Samza to process user activity data across its Banno suite of products for financial institutions.
-
-
-
+<a class="external-link" href="www.banno.com" rel="nofollow">Jack Henry and Associates</a>  is an S&P 400 company that supports more than 11,300 financial institutions with core processing services. It leverages Samza to process user activity data across its Banno suite of products for financial institutions.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/samza/blob/9cd7cdda/docs/_powered-by/metamarkets.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/metamarkets.md b/docs/_powered-by/metamarkets.md
index 7ca3111..14b59e1 100644
--- a/docs/_powered-by/metamarkets.md
+++ b/docs/_powered-by/metamarkets.md
@@ -19,7 +19,4 @@ domain: metamarkets.com
    limitations under the License.
 -->
 
-<a class="external-link" href="www.metamarkets.com" rel="nofollow">Metamarkets</a> Metamarkets offers an interactive analytics platform for buyers and sellers of programmatic advertising. It uses Samza to transform and join real-time event streams, then forward them into a Druid cluster for interactive querying.
-
-
-
+<a class="external-link" href="www.metamarkets.com" rel="nofollow">Metamarkets</a> Metamarkets offers an interactive analytics platform for buyers and sellers of programmatic advertising. It uses Samza to transform and join real-time event streams, then forward them into a Druid cluster for interactive querying.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/samza/blob/9cd7cdda/docs/_powered-by/mobileaware.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/mobileaware.md b/docs/_powered-by/mobileaware.md
index 08824c5..ea07155 100644
--- a/docs/_powered-by/mobileaware.md
+++ b/docs/_powered-by/mobileaware.md
@@ -19,4 +19,4 @@ domain: mobileaware.com
    limitations under the License.
 -->
 
-At <a class="external-link" href="https://www.mobileaware.com/" rel="nofollow">MobileAware</a>, we use Samza to enrich events with more contextual data from various sources (CMDB, Change Management, Incident Management, Problem Management). This gives us more meaningful events that an operations centre person can act on.
+At <a class="external-link" href="https://www.mobileaware.com/" rel="nofollow">MobileAware</a>, we use Samza to enrich events with more contextual data from various sources (CMDB, Change Management, Incident Management, Problem Management). This gives us more meaningful events that an operations centre person can act on.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/samza/blob/9cd7cdda/docs/_powered-by/movio.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/movio.md b/docs/_powered-by/movio.md
index 4a8b661..5fb0c7b 100644
--- a/docs/_powered-by/movio.md
+++ b/docs/_powered-by/movio.md
@@ -19,4 +19,4 @@ domain: movio.co
    limitations under the License.
 -->
 
-<a class="external-link" href="http://www.ntent.com" rel="nofollow">Movio</a> offers data-driven marketing solutions for the film industry. At Movio, they use Samza to process and enrich billions of change data capture events on all databases in real-time. 
+<a class="external-link" href="http://www.ntent.com" rel="nofollow">Movio</a> offers data-driven marketing solutions for the film industry. At Movio, they use Samza to process and enrich billions of change data capture events on all databases in real-time. 
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/samza/blob/9cd7cdda/docs/_powered-by/netflix.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/netflix.md b/docs/_powered-by/netflix.md
index 98c74d9..61fcb5f 100644
--- a/docs/_powered-by/netflix.md
+++ b/docs/_powered-by/netflix.md
@@ -19,6 +19,4 @@ domain: netflix.com
    limitations under the License.
 -->
 
-<a class="external-link" href="www.netflix.com" rel="nofollow">Netflix</a> uses single-stage Samza jobs to route over 700 billion events / 1 peta byte per day from fronting Kafka clusters to s3/hive. A portion of these events are routed to Kafka and ElasticSearch with support for custom index creation, basic filtering and projection. We run over 10,000 samza jobs in that many docker containers.
-
-
+<a class="external-link" href="www.netflix.com" rel="nofollow">Netflix</a> uses single-stage Samza jobs to route over 700 billion events / 1 peta byte per day from fronting Kafka clusters to s3/hive. A portion of these events are routed to Kafka and ElasticSearch with support for custom index creation, basic filtering and projection. We run over 10,000 samza jobs in that many docker containers.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/samza/blob/9cd7cdda/docs/_powered-by/ntent.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/ntent.md b/docs/_powered-by/ntent.md
index 05660f5..ae9b0fa 100644
--- a/docs/_powered-by/ntent.md
+++ b/docs/_powered-by/ntent.md
@@ -19,10 +19,4 @@ domain: ntent.com
    limitations under the License.
 -->
 
-<a class="external-link" href="http://www.ntent.com" rel="nofollow">Ntent</a> blends semantic search with natural language processing technologies to predict and create relevant content experiences.  They use Samza to power their streaming content ingestion system. Ntent takes crawled web pages and news articles, and passes them through a multi-stage processing pipeline that cleanses, classifies, extracts features that power other learning models, stores, and indexes the content for search.
-
-
-
-
-
-
+<a class="external-link" href="http://www.ntent.com" rel="nofollow">Ntent</a> blends semantic search with natural language processing technologies to predict and create relevant content experiences.  They use Samza to power their streaming content ingestion system. Ntent takes crawled web pages and news articles, and passes them through a multi-stage processing pipeline that cleanses, classifies, extracts features that power other learning models, stores, and indexes the content for search.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/samza/blob/9cd7cdda/docs/_powered-by/optimizely.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/optimizely.md b/docs/_powered-by/optimizely.md
index 4103329..458c4a0 100644
--- a/docs/_powered-by/optimizely.md
+++ b/docs/_powered-by/optimizely.md
@@ -20,9 +20,4 @@ priority: 07
    limitations under the License.
 -->
 
-<a class="external-link" href="https://www.optimizely.com" rel="nofollow">Optimizely</a>, the world's leader in customer experience optimization uses Apache Samza to aggregate and enrich billions of events per day to power real-time analytics of Experiments and Personalization experiences.
-
-
-
-
-
+<a class="external-link" href="https://www.optimizely.com" rel="nofollow">Optimizely</a>, the world's leader in customer experience optimization uses Apache Samza to aggregate and enrich billions of events per day to power real-time analytics of Experiments and Personalization experiences.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/samza/blob/9cd7cdda/docs/_powered-by/redfin.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/redfin.md b/docs/_powered-by/redfin.md
index 3c9affe..865e997 100644
--- a/docs/_powered-by/redfin.md
+++ b/docs/_powered-by/redfin.md
@@ -20,11 +20,4 @@ priority: 06
    limitations under the License.
 -->
 
-<a class="external-link" href="https://redfin.com" rel="nofollow">Redfin</a> provides real estate search and brokerage services through a combination of real estate web platforms. It uses Samza and Kafka for sending millions of email and push notifications to our customers everyday. Redfin chose Samza for distributed processing because it integrates really well with Kafka. Samza also provides managed state and a resilient local storage which Redfin found to be very useful features.
-
-
-
-
-
-
-
+<a class="external-link" href="https://redfin.com" rel="nofollow">Redfin</a> provides real estate search and brokerage services through a combination of real estate web platforms. It uses Samza and Kafka for sending millions of email and push notifications to our customers everyday. Redfin chose Samza for distributed processing because it integrates really well with Kafka. Samza also provides managed state and a resilient local storage which Redfin found to be very useful features.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/samza/blob/9cd7cdda/docs/_powered-by/state.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/state.md b/docs/_powered-by/state.md
index 28cb0c1..5d3399b 100644
--- a/docs/_powered-by/state.md
+++ b/docs/_powered-by/state.md
@@ -20,8 +20,3 @@ domain: state.com
 -->
 
 <a class="external-link" href="https://state.com" rel="nofollow">State</a> is a public global opinion network that focuses on empowering individuals, democracy, and social progress. It uses Samza to process and join streams of changes from MongoDB to update a wide range of realtime services that support the website and mobile apps. These include search, user recommendations, opinion metrics and lots more.
-
-
-
-
-

http://git-wip-us.apache.org/repos/asf/samza/blob/9cd7cdda/docs/_powered-by/tivo.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/tivo.md b/docs/_powered-by/tivo.md
index 99c2d21..ee4b6a0 100644
--- a/docs/_powered-by/tivo.md
+++ b/docs/_powered-by/tivo.md
@@ -20,6 +20,4 @@ priority: 03
    limitations under the License.
 -->
 
-<a class="external-link" href="www.tivo.com" rel="nofollow">Tivo</a> TiVo is a digital video recorder that allows users to save TV programs for later viewing based on an electronic TV programming schedule. It leverages Samza leveraging Samza to do online processing of views and ratings to help power personalized content recommendations and analytics.
-
-
+<a class="external-link" href="www.tivo.com" rel="nofollow">Tivo</a> TiVo is a digital video recorder that allows users to save TV programs for later viewing based on an electronic TV programming schedule. It leverages Samza leveraging Samza to do online processing of views and ratings to help power personalized content recommendations and analytics.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/samza/blob/9cd7cdda/docs/_powered-by/tripadvisor.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/tripadvisor.md b/docs/_powered-by/tripadvisor.md
index 6980a82..d3067fd 100644
--- a/docs/_powered-by/tripadvisor.md
+++ b/docs/_powered-by/tripadvisor.md
@@ -20,11 +20,4 @@ priority: 02
    limitations under the License.
 -->
 
-<a class="external-link" href="https://tripadvisor.com" rel="nofollow">Tripadvisor</a> is the world's largest travel site, enabling travelers to plan and book the perfect trip. It uses Apache Samza to process billions of events daily for analytics, machine learning, and site improvement.
-
-
-
-
-
-
-
+<a class="external-link" href="https://tripadvisor.com" rel="nofollow">Tripadvisor</a> is the world's largest travel site, enabling travelers to plan and book the perfect trip. It uses Apache Samza to process billions of events daily for analytics, machine learning, and site improvement.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/samza/blob/9cd7cdda/docs/_powered-by/vintank.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/vintank.md b/docs/_powered-by/vintank.md
index ecd6842..94908b0 100644
--- a/docs/_powered-by/vintank.md
+++ b/docs/_powered-by/vintank.md
@@ -19,5 +19,4 @@ domain: vintank.com
    limitations under the License.
 -->
 
-<a class="external-link" href="https://www.crunchbase.com/organization/vintank" rel="nofollow">VinTank</a>, is the leading software solution for social media management for the wine and hospitality industry. It uses Samza to power their social media analysis and NLP pipeline. Measuring over one billion conversations about wine, profiling over 30 million social wine consumers and serving over 1000 wine brands, VinTank helps wineries, restaurants, and hotels connect and understand their customers.
-
+<a class="external-link" href="https://www.crunchbase.com/organization/vintank" rel="nofollow">VinTank</a>, is the leading software solution for social media management for the wine and hospitality industry. It uses Samza to power their social media analysis and NLP pipeline. Measuring over one billion conversations about wine, profiling over 30 million social wine consumers and serving over 1000 wine brands, VinTank helps wineries, restaurants, and hotels connect and understand their customers.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/samza/blob/9cd7cdda/docs/_powered-by/vmware.md
----------------------------------------------------------------------
diff --git a/docs/_powered-by/vmware.md b/docs/_powered-by/vmware.md
index 5699bc1..132f6f8 100644
--- a/docs/_powered-by/vmware.md
+++ b/docs/_powered-by/vmware.md
@@ -26,7 +26,3 @@ At the heart of the vRNI architecture are a set of distributed processing and an
 
 
 
-
-
-
-


[09/32] samza git commit: Fix images rendered in markdown

Posted by ja...@apache.org.
Fix images rendered in markdown


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/76e4a1c6
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/76e4a1c6
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/76e4a1c6

Branch: refs/heads/master
Commit: 76e4a1c633d6d3cf8a0a37d302971469cd9b3558
Parents: 8ce2a9e
Author: Jagadish <jv...@linkedin.com>
Authored: Tue Oct 2 22:52:56 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Tue Oct 2 22:52:56 2018 -0700

----------------------------------------------------------------------
 docs/learn/documentation/versioned/deployment/yarn.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/samza/blob/76e4a1c6/docs/learn/documentation/versioned/deployment/yarn.md
----------------------------------------------------------------------
diff --git a/docs/learn/documentation/versioned/deployment/yarn.md b/docs/learn/documentation/versioned/deployment/yarn.md
index e9e7f03..c30346b 100644
--- a/docs/learn/documentation/versioned/deployment/yarn.md
+++ b/docs/learn/documentation/versioned/deployment/yarn.md
@@ -311,7 +311,8 @@ The `ClusterBasedJobCoordinator` is used as the control hub for a running Samza
 
 The `ClusterBasedJobCoordinator` contains a component called the `ContainerProcessManager` to handle metadata regarding container allocations. It uses the information (eg: host affinity) obtained from configs and the `CoordinatorStream` in order to make container allocation requests to the cluster manager (RM). In the case of YARN the config for `samza.cluster-manager.factory` which encapsulates the Application Master, is configured to `org.apache.samza.job.yarn.YarnResourceManagerFactory` and the `ContainerProcessManager` uses `YarnResourceManager` to interact with the RM.
 
-<img src="/img/versioned/learn/documentation/yarn/coordinator-internals.png" alt="yarn-coordinator-internals" class="diagram-small">
+![diagram-small](/img/{{site.version}}/learn/documentation/yarn/coordinator-internals.png)
+
 
 The following is a walkthrough of the different actions taken when the `run-job.sh` script is run:
 - When the job is submitted using `run-app.sh` the JobRunner invoked as part of this script first writes all the configs to the coordinator stream.


[28/32] samza git commit: Add Powered By pages for Samza users in the community

Posted by ja...@apache.org.
Add Powered By pages for Samza users in the community


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/a8d06cc4
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/a8d06cc4
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/a8d06cc4

Branch: refs/heads/master
Commit: a8d06cc4cc988b9160a9a5d2c2afa52f2145770b
Parents: c4cbebe
Author: Jagadish <jv...@linkedin.com>
Authored: Thu Oct 11 19:16:18 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Thu Oct 11 19:16:18 2018 -0700

----------------------------------------------------------------------
 docs/community/committers.html | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/samza/blob/a8d06cc4/docs/community/committers.html
----------------------------------------------------------------------
diff --git a/docs/community/committers.html b/docs/community/committers.html
index 3144716..ab4e2ed 100644
--- a/docs/community/committers.html
+++ b/docs/community/committers.html
@@ -20,7 +20,7 @@ exclude_from_loop: true
    limitations under the License.
 -->
 
-=======
+Samza is developed by a friendly community of contributors. The Samza Project Management Committee (PMC) is responsible for the management and oversight of the project.
 
 <hr class="committers-hr"/>
 
@@ -94,4 +94,4 @@ exclude_from_loop: true
 
   {% endfor %}
 
-</ul>
+</ul>
\ No newline at end of file


[13/32] samza git commit: Merge branch 'master' of https://github.com/apache/samza

Posted by ja...@apache.org.
Merge branch 'master' of https://github.com/apache/samza


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/2e7a6cdb
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/2e7a6cdb
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/2e7a6cdb

Branch: refs/heads/master
Commit: 2e7a6cdba16349e6bf8b3cdbe0f5128cd0463712
Parents: 07120f4 6b31894
Author: Jagadish <jv...@linkedin.com>
Authored: Fri Oct 5 11:04:23 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Fri Oct 5 11:04:23 2018 -0700

----------------------------------------------------------------------
 docs/_includes/footer.html                      |   4 +-
 docs/_includes/main-navigation.html             |   2 +-
 docs/_layouts/default.html                      |   6 +-
 docs/_menu/index.html                           |   4 +-
 docs/community/contact-us.md                    |  77 +++++
 docs/community/mailing-lists.md                 |  28 --
 docs/contribute/contributors-corner.md          | 123 ++++---
 docs/contribute/design-documents.md             |  58 ----
 docs/contribute/enhancement-proposal.md         |  96 ++++++
 docs/css/main.new.css                           |   6 +-
 .../versioned/jobs/basic-configurations.md      | 168 ----------
 .../versioned/jobs/configuration.md             |   4 +-
 .../versioned/jobs/samza-configurations.md      | 335 +++++++++++++++++++
 .../versioned/operations/monitoring.md          |   6 +-
 .../StreamApplicationDescriptor.java            |   6 -
 .../apache/samza/operators/MessageStream.java   |  25 +-
 .../StreamApplicationDescriptorImpl.java        |  17 +-
 .../TaskApplicationDescriptorImpl.java          |   2 +-
 .../apache/samza/container/LocalityManager.java |   8 -
 .../grouper/task/GroupByContainerCount.java     |  88 ++---
 .../task/GroupByContainerCountFactory.java      |   4 +-
 .../grouper/task/TaskAssignmentManager.java     |   2 +-
 .../samza/operators/MessageStreamImpl.java      |  11 -
 .../apache/samza/processor/StreamProcessor.java |  64 ++--
 .../apache/samza/container/SamzaContainer.scala |  17 +-
 .../TestStreamApplicationDescriptorImpl.java    |  13 +-
 .../grouper/task/TestGroupByContainerCount.java | 106 +++---
 .../grouper/task/TestGroupByContainerIds.java   |  12 +-
 .../grouper/task/TestTaskAssignmentManager.java |  12 +-
 .../samza/coordinator/TestJobModelManager.java  |  11 +-
 .../execution/ExecutionPlannerTestBase.java     |   8 +-
 .../samza/execution/TestExecutionPlanner.java   |  17 +-
 .../execution/TestJobGraphJsonGenerator.java    |   6 +-
 .../TestJobNodeConfigurationGenerator.java      |  22 --
 .../samza/operators/TestMessageStreamImpl.java  |  30 --
 .../operators/impl/TestOperatorImplGraph.java   |   4 +-
 .../spec/TestPartitionByOperatorSpec.java       |  12 +-
 .../samza/processor/TestStreamProcessor.java    |   7 +-
 .../samza/rest/proxy/task/SamzaTaskProxy.java   |   4 +-
 .../EndOfStreamIntegrationTest.java             |  11 +-
 .../WatermarkIntegrationTest.java               |   4 +-
 .../test/framework/FaultInjectionTest.java      | 126 +++++++
 .../StreamApplicationIntegrationTest.java       |  98 +++---
 ...StreamApplicationIntegrationTestHarness.java |   8 +-
 .../processor/TestZkLocalApplicationRunner.java |  10 +-
 .../apache/samza/test/table/TestLocalTable.java |   2 +-
 .../table/TestLocalTableWithSideInputs.java     |   2 +-
 .../webapp/TestApplicationMasterRestClient.java |   5 +-
 48 files changed, 988 insertions(+), 703 deletions(-)
----------------------------------------------------------------------



[24/32] samza git commit: Merge branch 'master' of https://github.com/apache/samza

Posted by ja...@apache.org.
Merge branch 'master' of https://github.com/apache/samza


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/3c886992
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/3c886992
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/3c886992

Branch: refs/heads/master
Commit: 3c8869920be301dd8afeb2cfe287dc58c85a8a4d
Parents: 341c06b 9903539
Author: Jagadish <jv...@linkedin.com>
Authored: Thu Oct 11 16:33:44 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Thu Oct 11 16:33:44 2018 -0700

----------------------------------------------------------------------
 .../documentation/versioned/api/samza-sql.md    | 174 +++++++++++++++++--
 1 file changed, 156 insertions(+), 18 deletions(-)
----------------------------------------------------------------------



[06/32] samza git commit: Merge branch 'master' of https://github.com/apache/samza

Posted by ja...@apache.org.
Merge branch 'master' of https://github.com/apache/samza


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/73ec4e91
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/73ec4e91
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/73ec4e91

Branch: refs/heads/master
Commit: 73ec4e919e744678b30bf251bc4600386820bf96
Parents: 187ff89 ad578e2
Author: Jagadish <jv...@linkedin.com>
Authored: Tue Oct 2 22:34:42 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Tue Oct 2 22:34:42 2018 -0700

----------------------------------------------------------------------
 .../versioned/connectors/eventhubs.md           |  95 ++++++++++++-
 .../documentation/versioned/connectors/hdfs.md  | 133 ++++++++++++++++++-
 docs/learn/documentation/versioned/index.html   |   2 +-
 3 files changed, 225 insertions(+), 5 deletions(-)
----------------------------------------------------------------------



[19/32] samza git commit: Merge branch 'master' of https://github.com/apache/samza

Posted by ja...@apache.org.
Merge branch 'master' of https://github.com/apache/samza


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/531b668e
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/531b668e
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/531b668e

Branch: refs/heads/master
Commit: 531b668ea52a701559dbc2b3515031d02cc22407
Parents: 44329cf 3c78e06
Author: Jagadish <jv...@linkedin.com>
Authored: Thu Oct 11 12:32:27 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Thu Oct 11 12:32:27 2018 -0700

----------------------------------------------------------------------
 .../documentation/versioned/azure/eventhubs.md  |  30 +--
 .../versioned/connectors/eventhubs.md           |  22 +-
 .../versioned/jobs/configuration-table.html     |  32 +--
 .../versioned/jobs/samza-configurations.md      |  32 +--
 .../descriptors/GenericSystemDescriptor.java    |   6 -
 .../org/apache/samza/storage/StorageEngine.java |   4 +-
 .../samza/system/eventhub/EventHubConfig.java   |  45 ++--
 .../eventhub/EventHubsInputDescriptor.java      | 121 +++++++++++
 .../eventhub/EventHubsOutputDescriptor.java     | 104 +++++++++
 .../eventhub/EventHubsSystemDescriptor.java     | 217 +++++++++++++++++++
 .../eventhub/producer/AsyncSystemProducer.java  |   3 +-
 .../eventhub/TestEventHubsInputDescriptor.java  |  91 ++++++++
 .../eventhub/TestEventHubsOutputDescriptor.java |  88 ++++++++
 .../eventhub/TestEventHubsSystemDescriptor.java | 112 ++++++++++
 .../org/apache/samza/execution/JobPlanner.java  |   7 +-
 .../descriptors/DelegatingSystemDescriptor.java |   6 -
 .../samza/storage/TaskStorageManager.scala      |   2 +-
 .../system/kafka/KafkaSystemDescriptor.java     |   6 -
 .../samza/storage/kv/RocksDbKeyValueStore.scala |   2 +-
 .../samza/storage/kv/AccessLoggedStore.scala    |   6 +-
 .../kv/BaseKeyValueStorageEngineFactory.scala   |   2 +-
 .../storage/kv/KeyValueStorageEngine.scala      |  10 +-
 .../storage/kv/TestKeyValueStorageEngine.scala  |   6 +-
 23 files changed, 842 insertions(+), 112 deletions(-)
----------------------------------------------------------------------



[12/32] samza git commit: Add new diagrams for the Samza homepage. Fix home-page rendering CSS

Posted by ja...@apache.org.
Add new diagrams for the Samza homepage. Fix home-page rendering CSS


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/07120f40
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/07120f40
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/07120f40

Branch: refs/heads/master
Commit: 07120f40e40fc3ae2039ab8525ba4df77e4c8401
Parents: 02fea74
Author: Jagadish <jv...@linkedin.com>
Authored: Thu Oct 4 18:17:14 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Thu Oct 4 18:17:14 2018 -0700

----------------------------------------------------------------------
 docs/_layouts/default.html                         |   3 ++-
 .../learn/documentation/api/samza-arch3.png        | Bin 0 -> 54723 bytes
 .../learn/documentation/api/samza-arch4.png        | Bin 0 -> 53799 bytes
 .../learn/documentation/api/samza-arch5.png        | Bin 0 -> 53535 bytes
 .../learn/documentation/api/samza-arch6.png        | Bin 0 -> 52928 bytes
 .../learn/documentation/api/samza-arch7.png        | Bin 0 -> 52383 bytes
 .../learn/documentation/api/samza-arch8.png        | Bin 0 -> 50516 bytes
 7 files changed, 2 insertions(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/samza/blob/07120f40/docs/_layouts/default.html
----------------------------------------------------------------------
diff --git a/docs/_layouts/default.html b/docs/_layouts/default.html
index bcc4a50..127fb17 100644
--- a/docs/_layouts/default.html
+++ b/docs/_layouts/default.html
@@ -116,7 +116,8 @@
         <a href="/learn/documentation/latest/deployment/standalone.html">standalone library</a>.
       </p>
 
-      <img src="/img/latest/learn/documentation/api/samza-arch1.png">
+      <!-- <img src="/img/latest/learn/documentation/api/samza-arch3.png"> -->
+      <img src="/img/latest/learn/documentation/api/samza-arch6.png" width="50%" height="50%" hspace="30px">
   </div>
   </div>
 

http://git-wip-us.apache.org/repos/asf/samza/blob/07120f40/docs/img/versioned/learn/documentation/api/samza-arch3.png
----------------------------------------------------------------------
diff --git a/docs/img/versioned/learn/documentation/api/samza-arch3.png b/docs/img/versioned/learn/documentation/api/samza-arch3.png
new file mode 100644
index 0000000..089eb1e
Binary files /dev/null and b/docs/img/versioned/learn/documentation/api/samza-arch3.png differ

http://git-wip-us.apache.org/repos/asf/samza/blob/07120f40/docs/img/versioned/learn/documentation/api/samza-arch4.png
----------------------------------------------------------------------
diff --git a/docs/img/versioned/learn/documentation/api/samza-arch4.png b/docs/img/versioned/learn/documentation/api/samza-arch4.png
new file mode 100644
index 0000000..04547f3
Binary files /dev/null and b/docs/img/versioned/learn/documentation/api/samza-arch4.png differ

http://git-wip-us.apache.org/repos/asf/samza/blob/07120f40/docs/img/versioned/learn/documentation/api/samza-arch5.png
----------------------------------------------------------------------
diff --git a/docs/img/versioned/learn/documentation/api/samza-arch5.png b/docs/img/versioned/learn/documentation/api/samza-arch5.png
new file mode 100644
index 0000000..a1b65af
Binary files /dev/null and b/docs/img/versioned/learn/documentation/api/samza-arch5.png differ

http://git-wip-us.apache.org/repos/asf/samza/blob/07120f40/docs/img/versioned/learn/documentation/api/samza-arch6.png
----------------------------------------------------------------------
diff --git a/docs/img/versioned/learn/documentation/api/samza-arch6.png b/docs/img/versioned/learn/documentation/api/samza-arch6.png
new file mode 100644
index 0000000..651f096
Binary files /dev/null and b/docs/img/versioned/learn/documentation/api/samza-arch6.png differ

http://git-wip-us.apache.org/repos/asf/samza/blob/07120f40/docs/img/versioned/learn/documentation/api/samza-arch7.png
----------------------------------------------------------------------
diff --git a/docs/img/versioned/learn/documentation/api/samza-arch7.png b/docs/img/versioned/learn/documentation/api/samza-arch7.png
new file mode 100644
index 0000000..5c166bf
Binary files /dev/null and b/docs/img/versioned/learn/documentation/api/samza-arch7.png differ

http://git-wip-us.apache.org/repos/asf/samza/blob/07120f40/docs/img/versioned/learn/documentation/api/samza-arch8.png
----------------------------------------------------------------------
diff --git a/docs/img/versioned/learn/documentation/api/samza-arch8.png b/docs/img/versioned/learn/documentation/api/samza-arch8.png
new file mode 100644
index 0000000..a871d41
Binary files /dev/null and b/docs/img/versioned/learn/documentation/api/samza-arch8.png differ


[20/32] samza git commit: Merge branch 'master' of https://github.com/apache/samza

Posted by ja...@apache.org.
Merge branch 'master' of https://github.com/apache/samza


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/ace5c653
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/ace5c653
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/ace5c653

Branch: refs/heads/master
Commit: ace5c653ecd37d5cdd1c0206c21dd119cf733a19
Parents: 531b668 1eb4c26
Author: Jagadish <jv...@linkedin.com>
Authored: Thu Oct 11 15:04:22 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Thu Oct 11 15:04:22 2018 -0700

----------------------------------------------------------------------
 .../samza/table/TableDescriptorsProvider.java   |   2 +-
 .../samza/application/ApplicationUtil.java      |   1 -
 .../table/caching/CachingTableDescriptor.java   |  47 +-
 .../org/apache/samza/config/SystemConfig.scala  |   2 +-
 .../samza/table/caching/TestCachingTable.java   |  13 +-
 .../samza/config/KafkaConsumerConfig.java       | 194 ++++++
 .../samza/system/kafka/KafkaSystemAdmin.java    | 665 +++++++++++++++++++
 .../samza/system/kafka/KafkaSystemConsumer.java | 366 ++++++++++
 .../org/apache/samza/config/KafkaConfig.scala   |   5 +
 .../samza/config/KafkaConsumerConfig.java       | 201 ------
 .../samza/system/kafka/KafkaConsumerProxy.java  |   2 +
 .../samza/system/kafka/KafkaSystemAdmin.scala   | 608 -----------------
 .../kafka/KafkaSystemAdminUtilsScala.scala      | 192 ++++++
 .../samza/system/kafka/KafkaSystemConsumer.java | 371 -----------
 .../samza/system/kafka/KafkaSystemFactory.scala |  63 +-
 .../scala/org/apache/samza/util/KafkaUtil.scala |  11 -
 .../samza/config/TestKafkaConsumerConfig.java   |  60 +-
 .../system/kafka/TestKafkaSystemAdminJava.java  | 185 ++++--
 .../kafka/TestKafkaSystemAdminWithMock.java     | 317 +++++++++
 .../system/kafka/TestKafkaSystemConsumer.java   | 225 +++++++
 .../kafka/TestKafkaSystemConsumerMetrics.java   | 109 +++
 .../system/kafka/TestKafkaSystemAdmin.scala     | 109 ++-
 .../system/kafka/TestKafkaSystemConsumer.java   | 220 ------
 .../operator/TestRepartitionJoinWindowApp.java  |  18 +-
 .../samza/test/table/TestRemoteTable.java       |   6 +-
 .../AbstractIntegrationTestHarness.scala        |  60 +-
 26 files changed, 2399 insertions(+), 1653 deletions(-)
----------------------------------------------------------------------



[10/32] samza git commit: Merge branch 'master' of https://github.com/apache/samza

Posted by ja...@apache.org.
Merge branch 'master' of https://github.com/apache/samza


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/3b7ff0dd
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/3b7ff0dd
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/3b7ff0dd

Branch: refs/heads/master
Commit: 3b7ff0dd7953bc98b17b6a559db2dda267e62f54
Parents: 76e4a1c 7beb09e
Author: Jagadish <jv...@linkedin.com>
Authored: Tue Oct 2 22:54:50 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Tue Oct 2 22:54:50 2018 -0700

----------------------------------------------------------------------

----------------------------------------------------------------------



[32/32] samza git commit: Minor: Fix typo in Core Concepts section

Posted by ja...@apache.org.
Minor: Fix typo in Core Concepts section


Project: http://git-wip-us.apache.org/repos/asf/samza/repo
Commit: http://git-wip-us.apache.org/repos/asf/samza/commit/f7ebe591
Tree: http://git-wip-us.apache.org/repos/asf/samza/tree/f7ebe591
Diff: http://git-wip-us.apache.org/repos/asf/samza/diff/f7ebe591

Branch: refs/heads/master
Commit: f7ebe5918cb8a040a6b22007ebb1310c72694e89
Parents: da808fc
Author: Jagadish <jv...@linkedin.com>
Authored: Thu Oct 11 19:47:22 2018 -0700
Committer: Jagadish <jv...@linkedin.com>
Committed: Thu Oct 11 19:47:22 2018 -0700

----------------------------------------------------------------------
 docs/learn/documentation/versioned/core-concepts/core-concepts.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/samza/blob/f7ebe591/docs/learn/documentation/versioned/core-concepts/core-concepts.md
----------------------------------------------------------------------
diff --git a/docs/learn/documentation/versioned/core-concepts/core-concepts.md b/docs/learn/documentation/versioned/core-concepts/core-concepts.md
index 52965a2..c4e5c21 100644
--- a/docs/learn/documentation/versioned/core-concepts/core-concepts.md
+++ b/docs/learn/documentation/versioned/core-concepts/core-concepts.md
@@ -78,7 +78,7 @@ Time is a fundamental concept in stream processing, especially how it is modeled
 On the other hand, in event time, the timestamp of an event is determined by when it actually occurred in the source. For example, a sensor which generates an event could embed the time of occurrence as a part of the event itself. Samza provides event-time based processing by its integration with [Apache BEAM](https://beam.apache.org/documentation/runners/samza/).
 
 ## Processing guarantee
-Samza supports at-least once processing. As the name implies, this ensures that each message in the input stream is processed by the system at-least once. This guarantees no data-loss even when there are failures making Samza a practical choice for building fault-tolerant applications.
+Samza supports at-least once processing. As the name implies, this ensures that each message in the input stream is processed by the system at-least once. This guarantees no data-loss even when there are failures, thereby making Samza a practical choice for building fault-tolerant applications.
 
 
 Next Steps: We are now ready to have a closer look at Samza’s architecture.
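
The at-least-once guarantee described in the hunk above can be illustrated with a small, self-contained sketch. It uses no Samza API: the class name AtLeastOnceSketch, the run method, and its parameters (checkpointEvery, crashAtOffset) are hypothetical names chosen for illustration. It assumes a simple model in which the processor commits its read offset periodically and, after a failure, resumes from the last committed offset. Under that assumption no message is lost, but messages processed after the last checkpoint and before the crash are processed again.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Samza.
final class AtLeastOnceSketch {

    /**
     * Processes every message in input, simulating one crash just before the
     * message at crashAtOffset is handled. After the crash, reading resumes
     * from the last committed offset, so some messages are handled twice.
     */
    static List<String> run(List<String> input, int checkpointEvery, int crashAtOffset) {
        List<String> processed = new ArrayList<>();
        int committed = 0;       // offset to resume from after a restart
        boolean crashed = false; // the simulated failure happens only once
        int offset = committed;

        while (offset < input.size()) {
            if (!crashed && offset == crashAtOffset) {
                crashed = true;
                offset = committed; // restart: rewind to the last checkpoint
                continue;
            }
            processed.add(input.get(offset)); // "process" the message
            offset++;
            if (offset % checkpointEvery == 0) {
                committed = offset; // checkpoint only after processing
            }
        }
        return processed;
    }

    public static void main(String[] args) {
        // Checkpoint every 2 messages; crash just before offset 3 ("d").
        List<String> out = run(List.of("a", "b", "c", "d", "e"), 2, 3);
        System.out.println(out); // prints [a, b, c, c, d, e]
    }
}
```

In this run the checkpoint at offset 2 precedes the crash, so "c" is replayed and appears twice, while every input message appears at least once. Downstream logic that must not double-count would need to be idempotent or deduplicate on its own.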