Posted to commits@apex.apache.org by th...@apache.org on 2016/03/02 02:40:33 UTC

[7/8] incubator-apex-core git commit: APEXCORE-293 Adding Apex Core documentation

APEXCORE-293 Adding Apex Core documentation


Project: http://git-wip-us.apache.org/repos/asf/incubator-apex-core/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-apex-core/commit/e1da746e
Tree: http://git-wip-us.apache.org/repos/asf/incubator-apex-core/tree/e1da746e
Diff: http://git-wip-us.apache.org/repos/asf/incubator-apex-core/diff/e1da746e

Branch: refs/heads/APEXCORE-293
Commit: e1da746eaff9f5b0766637daf8e4dda3be965bf3
Parents: 44f220f
Author: Sasha Parfenov <sa...@apache.org>
Authored: Fri Feb 26 17:12:29 2016 -0800
Committer: Thomas Weise <th...@datatorrent.com>
Committed: Sun Feb 28 22:49:38 2016 -0800

----------------------------------------------------------------------
 README.md                              |   2 +
 docs/README.md                         |  34 +++++++
 docs/apex.md                           |  14 ---
 docs/apex_development_setup.md         | 117 +++++++++-------------
 docs/apex_malhar.md                    |  64 ++++++------
 docs/application_development.md        | 140 ++++++--------------------
 docs/application_packages.md           |   8 +-
 docs/autometrics.md                    | 150 ++--------------------------
 docs/configuration_packages.md         |   8 +-
 docs/dtcli.md                          |   9 +-
 docs/favicon.ico                       | Bin 0 -> 25597 bytes
 docs/images/MalharOperatorOverview.png | Bin 297948 -> 0 bytes
 docs/images/malhar-operators.png       | Bin 0 -> 109734 bytes
 docs/index.md                          |  20 ++++
 docs/operator_development.md           |   8 +-
 mkdocs.yml                             |  15 +++
 16 files changed, 201 insertions(+), 388 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-apex-core/blob/e1da746e/README.md
----------------------------------------------------------------------
diff --git a/README.md b/README.md
index 4b22f71..97c2b03 100644
--- a/README.md
+++ b/README.md
@@ -11,6 +11,8 @@ Please visit the [documentation section](http://apex.incubator.apache.org/docs.h
 
 [Malhar](https://github.com/apache/incubator-apex-malhar) is a library of application building blocks and examples that will help you build out your first Apex application quickly.
 
+The documentation build and hosting process is explained in [docs/README.md](docs/README.md).
+
 ##Contributing
 
 This project welcomes new contributors.  If you would like to help by adding new features, enhancements or fixing bugs, check out the [contributing guidelines](http://apex.incubator.apache.org/contributing.html).

http://git-wip-us.apache.org/repos/asf/incubator-apex-core/blob/e1da746e/docs/README.md
----------------------------------------------------------------------
diff --git a/docs/README.md b/docs/README.md
new file mode 100644
index 0000000..c9fe8da
--- /dev/null
+++ b/docs/README.md
@@ -0,0 +1,34 @@
+# Apex Documentation
+
+Apex documentation repository for content available on http://apex.incubator.apache.org/docs/
+
+Documentation is written in [Markdown](https://guides.github.com/features/mastering-markdown/) format and statically generated into HTML using [MkDocs](http://www.mkdocs.org/).  All documentation is located in the [docs](docs) directory, and the [mkdocs.yml](mkdocs.yml) file describes the navigation structure of the published documentation.
+
+## Authoring
+
+New pages can be added under [docs](docs) or a related sub-directory, and a reference to the new page must be added to the [mkdocs.yml](mkdocs.yml) file to make it available in the navigation, as shown in the example below.  Embedded images are typically added to an images folder at the same level as the new page.
+
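+For example, a new page is registered in the navigation section of [mkdocs.yml](mkdocs.yml) along the following lines (the page titles and file names here are illustrative, not the actual navigation of this repository):
+
+```yaml
+pages:
+- Home: index.md
+- Development Setup: apex_development_setup.md
+- My New Page: my_new_page.md
+```
+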
+When creating or editing pages, it can be useful to see live results and how the documents will appear when published.  A live preview is available by running the following command at the root of the repository:
+
+```bash
+mkdocs serve
+```
+
+For additional details see [writing your docs](http://www.mkdocs.org/user-guide/writing-your-docs/) guide.
+
+## Site Configuration
+
+Guides on applying site-wide [configuration](http://www.mkdocs.org/user-guide/configuration/) and [theming](http://www.mkdocs.org/user-guide/styling-your-docs/) are available on the MkDocs site.
+
+## Deployment
+
+**Under Review**: The current deployment process is under review and may change from the one outlined below.
+
+
+Deployment is done from the master branch of the repository by executing the following command:
+
+```bash
+mkdocs gh-deploy --clean
+```
+
+This results in all the documentation under [docs](docs) being statically generated into HTML files and deployed at the top level of the [gh-pages](https://github.com/apache/incubating-apex-core/tree/gh-pages) branch.  For more details on how this is done, see [MkDocs - Deploying Github Pages](http://www.mkdocs.org/user-guide/deploying-your-docs/#github-pages).
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-apex-core/blob/e1da746e/docs/apex.md
----------------------------------------------------------------------
diff --git a/docs/apex.md b/docs/apex.md
deleted file mode 100644
index 215a957..0000000
--- a/docs/apex.md
+++ /dev/null
@@ -1,14 +0,0 @@
-Apache Apex
-================================================================================
-
-Apache Apex (incubating) is the industry’s only open source, enterprise-grade unified stream and batch processing engine.  Apache Apex includes key features requested by open source developer community that are not available in current open source technologies.
-
-* Event processing guarantees
-* In-memory performance & scalability
-* Fault tolerance and state management
-* Native rolling and tumbling window support
-* Hadoop-native YARN & HDFS implementation
-
-For additional information visit [Apache Apex](http://apex.incubator.apache.org/).
-
-[![](images/apex_logo.png)](http://apex.incubator.apache.org/)

http://git-wip-us.apache.org/repos/asf/incubator-apex-core/blob/e1da746e/docs/apex_development_setup.md
----------------------------------------------------------------------
diff --git a/docs/apex_development_setup.md b/docs/apex_development_setup.md
index 777f2f9..0bbabc5 100644
--- a/docs/apex_development_setup.md
+++ b/docs/apex_development_setup.md
@@ -1,36 +1,29 @@
 Apache Apex Development Environment Setup
 =========================================
 
-This document discusses the steps needed for setting up a development environment for creating applications that run on the Apache Apex or the DataTorrent RTS streaming platform.
+This document discusses the steps needed for setting up a development environment for creating applications that run on the Apache Apex platform.
 
 
-Microsoft Windows
-------------------------------
+Development Tools
+-------------------------------------------------------------------------------
 
-There are a few tools that will be helpful when developing Apache Apex applications, some required and some optional:
+There are a few tools that will be helpful when developing Apache Apex applications, including:
 
-1.  *git* -- A revision control system (version 1.7.1 or later). There are multiple git clients available for Windows (<http://git-scm.com/download/win> for example), so download and install a client of your choice.
+1.  **git** - A revision control system (version 1.7.1 or later). There are multiple git clients available for Windows (<http://git-scm.com/download/win> for example), so download and install a client of your choice.
 
-2.  *java JDK* (not JRE). Includes the Java Runtime Environment as well as the Java compiler and a variety of tools (version 1.7.0\_79 or later). Can be downloaded from the Oracle website.
+2.  **java JDK** (not JRE) - Includes the Java Runtime Environment as well as the Java compiler and a variety of tools (version 1.7.0\_79 or later). Can be downloaded from the Oracle website.
 
-3.  *maven* -- Apache Maven is a build system for Java projects (version 3.0.5 or later). It can be downloaded from <https://maven.apache.org/download.cgi>.
+3.  **maven** - Apache Maven is a build system for Java projects (version 3.0.5 or later). It can be downloaded from <https://maven.apache.org/download.cgi>.
 
-4.  *VirtualBox* -- Oracle VirtualBox is a virtual machine manager (version 4.3 or later) and can be downloaded from <https://www.virtualbox.org/wiki/Downloads>. It is needed to run the Data Torrent Sandbox.
+4.  **IDE** (Optional) - If you prefer to use an IDE (Integrated Development Environment) such as *NetBeans*, *Eclipse* or *IntelliJ*, install that as well.
 
-5.  *DataTorrent Sandbox* -- The sandbox can be downloaded from <https://www.datatorrent.com/download>. It is useful for testing simple applications since it contains Apache Hadoop and Data Torrent RTS 3.1.1 pre-installed with a time-limited Enterprise License. If you already installed the RTS Enterprise Edition (evaluation or production license) on a cluster, you can use that setup for deployment and testing instead of the sandbox.
+After installing these tools, make sure that the directories containing the executable files are in your PATH environment variable.
 
-6.  (Optional) If you prefer to use an IDE (Integrated Development Environment) such as *NetBeans*, *Eclipse* or *IntelliJ*, install that as well.
+* **Windows** - Open a console window and enter the command `echo %PATH%` to see the value of the `PATH` variable and verify that the directories for the Java, git, and maven executables are present.  For JDK executables like _java_ and _javac_, the directory might be something like `C:\\Program Files\\Java\\jdk1.7.0\_80\\bin`; for _git_ it might be `C:\\Program Files\\Git\\bin`; and for maven it might be `C:\\Users\\user\\Software\\apache-maven-3.3.3\\bin`.  If any are missing, you can change the value of `PATH` at _Control Panel_ &#x21e8; _Advanced System Settings_ &#x21e8; _Advanced tab_ &#x21e8; _Environment Variables_.
+* **Linux and Mac** - Open a console/terminal window and enter the command `echo $PATH` to see the value of the `PATH` variable and verify that the directories for the Java, git, and maven executables are present.  If any are missing, make sure the software is downloaded and installed, and, if necessary, add and export the PATH entries in `~/.profile` or `~/.bash_profile`.  For example, to add maven located in `/sfw/maven/apache-maven-3.3.3` to PATH, add the line: `export PATH=$PATH:/sfw/maven/apache-maven-3.3.3/bin`
 
 
-After installing these tools, make sure that the directories containing the executable files are in your PATH environment; for example, for the JDK executables like _java_ and _javac_, the directory might be something like `C:\\Program Files\\Java\\jdk1.7.0\_80\\bin`; for _git_ it might be `C:\\Program Files\\Git\\bin`; and for maven it might be `C:\\Users\\user\\Software\\apache-maven-3.3.3\\bin`. Open a console window and enter the command:
-
-    echo %PATH%
-
-to see the value of the `PATH` variable and verify that the above directories are present. If not, you can change its value clicking on the button at _Control Panel_ &#x21e8; _Advanced System Settings_ &#x21e8; _Advanced tab_ &#x21e8; _Environment Variables_.
-
-
-Now run the following commands and ensure that the output is something similar to that shown in the table below:
-
+Confirm by running the following commands and comparing with the output shown in the table below:
 
 <table>
 <colgroup>
@@ -59,65 +52,52 @@ Now run the following commands and ensure that the output is something similar t
 <tr class="odd">
 <td align="left"><p><tt>mvn --version</tt></p></td>
 <td align="left"><p>Apache Maven 3.3.3 (7994120775791599e205a5524ec3e0dfe41d4a06; 2015-04-22T06:57:37-05:00)</p>
-<p>Maven home: C:\Users\ram\Software\apache-maven-3.3.3\bin\..</p>
-<p>Java version: 1.7.0_80, vendor: Oracle Corporation</p>
-<p>Java home: C:\Program Files\Java\jdk1.7.0_80\jre</p>
-<p>Default locale: en_US, platform encoding: Cp1252</p>
-<p>OS name: &quot;windows 8&quot;, version: &quot;6.2&quot;, arch: &quot;amd64&quot;, family: &quot;windows&quot;</p></td>
+<p>...</p>
+</td>
 </tr>
 </tbody>
 </table>
 
 
-To install the sandbox, first download it from <https://www.datatorrent.com/download> and import the downloaded file into VirtualBox. Once the import completes, you can select it and click the  Start button to start the sandbox.
-
-
-The sandbox is configured with 6GB RAM; if your development machine has 16GB or more, you can increase the sandbox RAM to 8GB or more using the VirtualBox console. This will yield better performance and support larger applications. Additionally, you can change the network adapter from **NAT** to **Bridged Adapter**; this will allow you to login to the sandbox from your host machine using an _ssh_ tool like **PuTTY** and also to transfer files to and from the host using `pscp` on Windows. Of course all such configuration must be done when when the sandbox is not running.
-
-
-You can choose to develop either directly on the sandbox or on your development machine. The advantage of the former is that most of the tools (e.g. _jdk_, _git_, _maven_) are pre-installed and also the package files created by your project are directly available to the Data Torrent tools such as  **dtManage** and **dtcli**. The disadvantage is that the sandbox is a memory-limited environment so running a memory-hungry tool like a Java IDE on it may starve other applications of memory.
+Creating a New Apex Project
+-------------------------------------------------------------------------------
 
+After the development tools are configured, you can use the maven archetype to create a basic Apache Apex project.  **Note:** When executing the commands below, replace `3.3.0-incubating` with the [latest available version](http://apex.apache.org/downloads.html) of Apache Apex.
 
-You can now use the maven archetype to create a basic Apache Apex project as follows: Put these lines in a Windows command file called, for example, `newapp.cmd` and run it:
 
-    @echo off
-    @rem Script for creating a new application
-    setlocal
-    mvn archetype:generate ^
-    -DarchetypeRepository=https://www.datatorrent.com/maven/content/repositories/releases ^
-      -DarchetypeGroupId=com.datatorrent ^
-      -DarchetypeArtifactId=apex-app-archetype ^
-      -DarchetypeVersion=3.1.1 ^
-      -DgroupId=com.example ^
-      -Dpackage=com.example.myapexapp ^
-      -DartifactId=myapexapp ^
-      -Dversion=1.0-SNAPSHOT
-    endlocal
+* **Windows** - Create a new Windows command file called `newapp.cmd` by copying the lines below, and execute it.  When you run this file, the properties will be displayed and you will be prompted with `` Y: :``; just press **Enter** to complete the project generation.  The caret (^) at the end of some lines indicates that a continuation line follows. 
 
+        @echo off
+        @rem Script for creating a new application
+        setlocal
+        mvn archetype:generate ^
+         -DarchetypeGroupId=org.apache.apex ^
+         -DarchetypeArtifactId=apex-app-archetype -DarchetypeVersion=3.3.0-incubating ^
+         -DgroupId=com.example -Dpackage=com.example.myapexapp -DartifactId=myapexapp ^
+         -Dversion=1.0-SNAPSHOT
+        endlocal
 
 
-The caret (^) at the end of some lines indicates that a continuation line follows. When you run this file, the properties will be displayed and you will be prompted with `` Y: :``; just press **Enter** to complete the project generation.
+* **Linux** - Execute the lines below in a terminal window.  The new project will be created in the current working directory.  The backslash (\\) at the end of the lines indicates continuation.
 
+        mvn archetype:generate \
+         -DarchetypeGroupId=org.apache.apex \
+         -DarchetypeArtifactId=apex-app-archetype -DarchetypeVersion=3.3.0-incubating \
+         -DgroupId=com.example -Dpackage=com.example.myapexapp -DartifactId=myapexapp \
+         -Dversion=1.0-SNAPSHOT
 
-This command file also exists in the Data Torrent _examples_ repository which you can check out with:
 
-    git clone https://github.com/DataTorrent/examples
-
-You will find the script under `examples\tutorials\topnwords\scripts\newapp.cmd`.
-
-You can also, if you prefer, use an IDE to generate the project as described in Section 3 of [Application Packages](application_packages.md) but use the archetype version 3.1.1 instead of 3.0.0.
-
-
-When the run completes successfully, you should see a new directory named `myapexapp` containing a maven project for building a basic Apache Apex application. It includes 3 source files:**Application.java**,  **RandomNumberGenerator.java** and **ApplicationTest.java**. You can now build the application by stepping into the new directory and running the appropriate maven command:
+When the run completes successfully, you should see a new directory named `myapexapp` containing a maven project for building a basic Apache Apex application. It includes 3 source files: **Application.java**, **RandomNumberGenerator.java**, and **ApplicationTest.java**. You can now build the application by stepping into the new directory and running the maven package command:
 
     cd myapexapp
     mvn clean package -DskipTests
 
-The build should create the application package file `myapexapp\target\myapexapp-1.0-SNAPSHOT.apa`. This file can then be uploaded to the Data Torrent GUI tool on the sandbox (called **dtManage**) and launched  from there. It generates a stream of random numbers and prints them out, each prefixed by the string  `hello world: `.  If you built this package on the host, you can transfer it to the sandbox using the `pscp` tool bundled with **PuTTY** mentioned earlier.
+The build should create the application package file `myapexapp/target/myapexapp-1.0-SNAPSHOT.apa`. This application package can then be used to launch the example application via **dtCli** or other visual management tools.  When running, this application will generate a stream of random numbers and print them out, each prefixed by the string `hello world:`.
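+
+For reference, the generated **Application.java** defines the application DAG along these lines (a simplified sketch; the actual archetype output may differ in details such as operator and port names):
+
+```java
+package com.example.myapexapp;
+
+import org.apache.hadoop.conf.Configuration;
+
+import com.datatorrent.api.DAG;
+import com.datatorrent.api.StreamingApplication;
+import com.datatorrent.api.annotation.ApplicationAnnotation;
+import com.datatorrent.lib.io.ConsoleOutputOperator;
+
+@ApplicationAnnotation(name = "MyFirstApplication")
+public class Application implements StreamingApplication
+{
+  @Override
+  public void populateDAG(DAG dag, Configuration conf)
+  {
+    // wire the random number generator to a console writer that prints each tuple
+    RandomNumberGenerator randomGenerator = dag.addOperator("randomGenerator", RandomNumberGenerator.class);
+    ConsoleOutputOperator console = dag.addOperator("console", new ConsoleOutputOperator());
+    dag.addStream("randomData", randomGenerator.out, console.input);
+  }
+}
+```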
 
+Building Apex Demos
+-------------------------------------------------------------------------------
 
-If you want to checkout the Apache Apex source repositories and build them, you can do so by running the script `build-apex.cmd` located in the same place in the examples repository described above. The source repositories contain more substantial demo applications and the associated source code. Alternatively, if you do not want to use the script, you can follow these simple manual steps:
-
+If you want to see more substantial Apex demo applications and the associated source code, you can follow these simple steps to check out and build them.
 
 1.  Check out the source code repositories:
 
@@ -126,26 +106,23 @@ If you want to checkout the Apache Apex source repositories and build them, you
 
 2.  Switch to the appropriate release branch and build each repository:
 
-        pushd incubator-apex-core
-        git checkout release-3.1
+        cd incubator-apex-core
         mvn clean install -DskipTests
-        popd
-        pushd incubator-apex-malhar
-        git checkout release-3.1
+
+        cd incubator-apex-malhar
         mvn clean install -DskipTests
-        popd
+
 
 The `install` argument to the `mvn` command installs resources from each project to your local maven repository (typically `.m2/repository` under your home directory), and **not** to the system directories, so Administrator privileges are not required. The  `-DskipTests` argument skips running unit tests since they take a long time. If this is a first-time installation, it might take several minutes to complete because maven will download a number of associated plugins.
 
-After the build completes, you should see the demo application package files in the target directory under each demo subdirectory in `incubator-apex-malhar\demos\`.
+After the build completes, you should see the demo application package files in the target directory under each demo subdirectory in `incubator-apex-malhar/demos`.
+
 
-Linux
-------------------
 
-Most of the instructions for Linux (and other Unix-like systems) are similar to those for Windows described above, so we will just note the differences.
+Sandbox
+-------------------------------------------------------------------------------
 
+To jump start development with an Apache Hadoop single node cluster, [DataTorrent Sandbox](https://www.datatorrent.com/download) powered by VirtualBox is available on Windows, Linux, or Mac platforms.  The sandbox is configured by default to run with 6GB RAM; if your development machine has 16GB or more, you can increase the sandbox RAM to 8GB or more using the VirtualBox console.  This will yield better performance and support larger applications.  The advantage of developing in the sandbox is that most of the tools (e.g. _jdk_, _git_, _maven_), Hadoop YARN and HDFS, and a distribution of Apache Apex and DataTorrent RTS are pre-installed.  The disadvantage is that the sandbox is a memory-limited environment, and requires settings changes and restarts to adjust memory available for development and testing.
 
-The pre-requisites (such as _git_, _maven_, etc.) are the same as for Windows described above; please run the commands in the table and ensure that appropriate versions are present in your PATH environment variable (the command to display that variable is: `echo $PATH`).
 
 
-The maven archetype command is the same except that continuation lines use a backslash (``\``) instead of caret (``^``); the script for it is available in the same location and is named `newapp` (without the `.cmd` extension). The script to checkout and build the Apache Apex repositories is named `build-apex`.

http://git-wip-us.apache.org/repos/asf/incubator-apex-core/blob/e1da746e/docs/apex_malhar.md
----------------------------------------------------------------------
diff --git a/docs/apex_malhar.md b/docs/apex_malhar.md
index ef2e371..45dee76 100644
--- a/docs/apex_malhar.md
+++ b/docs/apex_malhar.md
@@ -1,19 +1,19 @@
 Apache Apex Malhar
 ================================================================================
 
-Apache Apex Malhar is an open source operator and codec library that can be used with the Apache Apex platform to build real-time streaming applications.  As part of enabling enterprises extract value quickly, Malhar operators help get data in, analyze it in real-time and get data out of Hadoop in real-time with no paradigm limitations.  In addition to the operators, the library contains a number of demos applications, demonstrating operator features and capabilities.
+Apache Apex Malhar is an open source operator and codec library that can be used with the [Apache Apex](http://apex.apache.org/) platform to build real-time streaming applications.  Enabling users to extract value quickly, Malhar operators help get data in, analyze it in real-time, and get data out of Hadoop.  In addition to the operators, the library contains a number of demo applications demonstrating operator features and capabilities.  To see the full list of available operators and related documentation, visit [Apex Malhar on Github](https://github.com/apache/incubator-apex-malhar).
 
-![MalharDiagram](images/MalharOperatorOverview.png)
+![MalharDiagram](images/malhar-operators.png)
 
 # Capabilities common across Malhar operators
 
-For most streaming platforms, connectors are afterthoughts and often end up being simple ‘bolt-ons’ to the platform. As a result they often cause performance issues or data loss when put through failure scenarios and scalability requirements. Malhar operators do not face these issues as they were designed to be integral parts of apex*.md RTS. Hence, they have following core streaming runtime capabilities
+For most streaming platforms, connectors are afterthoughts and often end up being simple ‘bolt-ons’ to the platform. As a result, they often cause performance issues or data loss when put through failure scenarios and scalability requirements. Malhar operators do not face these issues as they were designed to be integral parts of Apex. Hence, they have the following core streaming runtime capabilities:
 
-1.  **Fault tolerance** – Apache Apex Malhar operators where applicable have fault tolerance built in. They use the checkpoint capability provided by the framework to ensure that there is no data loss under ANY failure scenario.
-2.  **Processing guarantees** – Malhar operators where applicable provide out of the box support for ALL three processing guarantees – exactly once, at-least once & at-most once WITHOUT requiring the user to write any additional code.  Some operators like MQTT operator deal with source systems that cant track processed data and hence need the operators to keep track of the data. Malhar has support for a generic operator that uses alternate storage like HDFS to facilitate this. Finally for databases that support transactions or support any sort of atomic batch operations Malhar operators can do exactly once down to the tuple level.
+1.  **Fault tolerance** – Malhar operators where applicable have fault tolerance built in. They use the checkpoint capability provided by the framework to ensure that there is no data loss under ANY failure scenario.
+2.  **Processing guarantees** – Malhar operators where applicable provide out of the box support for ALL three processing guarantees – exactly once, at-least once, and at-most once – WITHOUT requiring the user to write any additional code.  Some operators, like the MQTT operator, deal with source systems that cannot track processed data and hence need the operators to keep track of the data.  Malhar has support for a generic operator that uses alternate storage like HDFS to facilitate this.  Finally, for databases that support transactions or any sort of atomic batch operations, Malhar operators can do exactly once down to the tuple level.
 3.  **Dynamic updates** – Based on changing business conditions you often have to tweak several parameters used by the operators in your streaming application without incurring any application downtime. You can also change properties of a Malhar operator at runtime without having to bring down the application.
 4.  **Ease of extensibility** – Malhar operators are based on templates that are easy to extend.
-5.  **Partitioning support** – In streaming applications the input data stream often needs to be partitioned based on the contents of the stream. Also for operators that ingest data from external systems partitioning needs to be done based on the capabilities of the external system. E.g. With the Kafka or Flume operator, the operator can automatically scale up or down based on the changes in the number of Kafka partitions or Flume channels
+5.  **Partitioning support** – In streaming applications the input data stream often needs to be partitioned based on the contents of the stream. Also, for operators that ingest data from external systems, partitioning needs to be done based on the capabilities of the external system.  For example, with Kafka, the operator can automatically scale up or down based on the changes in the number of Kafka partitions.
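+
+As an illustration of the partitioning support, static partitioning of an operator can be requested through a DAG attribute (a minimal sketch; the operator variable `kafkaInput` and the partition count of 4 are illustrative):
+
+```java
+import com.datatorrent.api.Context.OperatorContext;
+import com.datatorrent.common.partitioner.StatelessPartitioner;
+
+// inside populateDAG(): run the input operator as four parallel partitions
+dag.setAttribute(kafkaInput, OperatorContext.PARTITIONER, new StatelessPartitioner<>(4));
+```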
 
 # Operator Library Overview
 
@@ -21,45 +21,39 @@ For most streaming platforms, connectors are afterthoughts and often end up bein
 
 Below is a summary of the various sub categories of input and output operators. Input operators also have a corresponding output operator
 
-*   **File Systems** – Most streaming analytics use cases we have seen require the data to be stored in HDFS or perhaps S3 if the application is running in AWS. Also, customers often need to re-run their streaming analytical applications against historical data or consume data from upstream processes that are perhaps writing to some NFS share. Hence, it’s not just enough to be able to save data to various file systems. You also have to be able to read data from them. RTS supports input & output operators for HDFS, S3, NFS & Local Files
-*   **Flume** – NOTE: Flume operator is not yet part of Malhar
+*   **File Systems** – Most streaming analytics use cases require the data to be stored in HDFS or perhaps S3 if the application is running in AWS.  Users often need to re-run their streaming analytical applications against historical data or consume data from upstream processes that are perhaps writing to some NFS share.  Apex supports input & output operators for HDFS, S3, NFS & Local Files.  There are also File Splitter and Block Reader operators, which can accelerate processing of large files by splitting and parallelizing the work across non-overlapping sets of file blocks.
+*   **Relational Databases** – Most stream processing use cases require some reference data lookups to enrich, tag or filter streaming data. There is also a need to save results of the streaming analytical computation to a database so an operational dashboard can see them. Apex supports a JDBC operator so you can read/write data from any JDBC-compliant RDBMS like Oracle, MySQL, SQLite, etc.
+*   **NoSQL Databases** – NoSQL key-value pair databases like Cassandra & HBase are a common part of streaming analytics application architectures to lookup reference data or store results.  Malhar has operators for HBase, Cassandra, Accumulo, Aerospike, MongoDB, and CouchDB.
+*   **Messaging Systems** – Kafka, JMS, and similar systems are the workhorses of messaging infrastructure in most enterprises.  Malhar has a robust, industry-tested set of operators to read and write Kafka, JMS, ZeroMQ, and RabbitMQ messages.
+*   **Notification Systems** – Malhar includes an operator for sending notifications via SMTP.
+*   **In-memory Databases & Caching platforms** - Some streaming use cases need instantaneous access to shared state across the application. Caching platforms and in-memory databases serve this purpose really well. To support these use cases, Malhar has operators for memcached and Redis.
+*   **Social Media** - Malhar includes an operator to connect to the popular Twitter stream fire hose.
+*   **Protocols** - Malhar provides connectors that can communicate over HTTP, RSS, Socket, WebSocket, FTP, and MQTT.
 
-Many customers have existing Flume deployments that are being used to aggregate log data from variety of sources. However Flume does not allow analytics on the log data on the fly. The Flume input/output operator enables RTS to consume data from flume and analyze it in real-time before being persisted.
+## Parsers
 
-*   **Relational databases** – Most stream processing use cases require some reference data lookups to enrich, tag or filter streaming data. There is also a need to save results of the streaming analytical computation to a database so an operational dashboard can see them. RTS supports a JDBC operator so you can read/write data from any JDBC compliant RDBMS like Oracle, MySQL etc.
-*   **NoSQL databases** –NoSQL key-value pair databases like Cassandra & HBase are becoming a common part of streaming analytics application architectures to lookup reference data or store results. Malhar has operators for HBase, Cassandra, Accumulo (common with govt. & healthcare companies) MongoDB & CouchDB.
-*   **Messaging systems** – JMS brokers have been the workhorses of messaging infrastructure in most enterprises. Also Kafka is fast coming up in almost every customer we talk to. Malhar has operators to read/write to Kafka, any JMS implementation, ZeroMQ & RabbitMQ.
-*   **Notification systems** – Almost every streaming analytics application has some notification requirements that are tied to a business condition being triggered. Malhar supports sending notifications via SMTP & SNMP. It also has an alert escalation mechanism built in so users don’t get spammed by notifications (a common drawback in most streaming platforms)
-*   **In-memory Databases & Caching platforms** - Some streaming use cases need instantaneous access to shared state across the application. Caching platforms and in-memory databases serve this purpose really well. To support these use cases, Malhar has operators for memcached & Redis
-*   **Protocols** - Streaming use cases driven by machine-to-machine communication have one thing in common – there is no standard dominant protocol being used for communication. Malhar currently has support for MQTT. It is one of the more commonly, adopted protocols we see in the IoT space. Malhar also provides connectors that can directly talk to HTTP, RSS, Socket, WebSocket & FTP sources
+There are many industry vertical specific data formats that a streaming application developer might need to parse. Often there are existing parsers available for these that can be directly plugged into an Apache Apex application. For example, in the Telco space, a Java-based CDR parser can be plugged directly into an Apache Apex operator. To further simplify the development experience, Malhar also provides operators for parsing common formats like XML (DOM & SAX), JSON (flat map converter), Apache log files, syslog, etc.
 
+## Stream manipulation
 
+Streaming data inevitably needs processing to clean, filter, tag, summarize, etc. The goal of Malhar is to enable the application developer to focus on WHAT needs to be done to the stream to get it in the right format, and not worry about the HOW.  Malhar has several operators to perform common stream manipulation actions such as GroupBy, Join, Distinct/Unique, Limit, OrderBy, Split, Sample, Inner join, Outer join, Select, and Update.
 
 ## Compute
 
-One of the most important promises of a streaming analytics platform like Apache Apex is the ability to do analytics in real-time. However delivering on the promise becomes really difficult when the platform does not provide out of the box operators to support variety of common compute functions as the user then has to worry about making these scalable, fault tolerant etc. Malhar takes this responsibility away from the application developer by providing a huge variety of out of the box computational operators. The application developer can thus focus on the analysis.
+One of the most important promises of a streaming analytics platform like Apache Apex is the ability to do analytics in real-time. However, delivering on that promise becomes difficult when the platform does not provide out-of-the-box operators for a variety of common compute functions, as the user then has to worry about making these scalable, fault tolerant, stateful, etc.  Malhar takes this responsibility away from the application developer by providing a variety of out-of-the-box computational operators.
 
 Below is just a snapshot of the compute operators available in Malhar
 
-*   Statistics & Math - Provide various mathematical and statistical computations over application defined time windows.
-*   Filtering & pattern matching
-*   Machine learning & Algorithms
-*   Real-time model scoring is a very common use case for stream processing platforms. &nbsp;Malhar allows users to invoke their R models from streaming applications
-*   Sorting, Maps, Frequency, TopN, BottomN, Random Generator etc.
-
-
-## Query & Script invocation
-
-Many streaming use cases are legacy implementations that need to be ported over. This often requires re-use some of the existing investments and code that perhaps would be really hard to re-write. With this in mind, Malhar supports invoking external scripts and queries as part of the streaming application using operators for invoking SQL query, Shell script, Ruby, Jython, and JavaScript etc.
-
-## Parsers
-
-There are many industry vertical specific data formats that a streaming application developer might need to parse. Often there are existing parsers available for these that can be directly plugged into an Apache Apex application. For example in the Telco space, a Java based CDR parser can be directly plugged into Apache Apex operator. To further simplify development experience, Malhar also provides some operators for parsing common formats like XML (DOM & SAX), JSON (flat map converter), Apache log files, syslog, etc.
-
-## Stream manipulation
+*   Statistics and math - Various mathematical and statistical computations over application defined time windows.
+*   Filtering and pattern matching
+*   Sorting, maps, frequency, TopN, BottomN
+*   Random data generators
 
-Streaming data aka ‘stream’ is raw data that inevitably needs processing to clean, filter, tag, summarize etc. The goal of Malhar is to enable the application developer to focus on ‘WHAT’ needs to be done to the stream to get it in the right format and not worry about the ‘HOW’. Hence, Malhar has several operators to perform the common stream manipulation actions like – DeDupe, GroupBy, Join, Distinct/Unique, Limit, OrderBy, Split, Sample, Inner join, Outer join, Select, Update etc.
+## Language Support
 
-## Social Media
+Migrating to a new platform often requires reuse of existing code that would be difficult or time-consuming to rewrite.  With this in mind, Malhar supports invoking code written in other languages by wrapping it in one of the library operators, and allows execution of software written in:
 
-Malhar includes an operator to connect to the popular Twitter stream fire hose.
+* JavaScript
+* Python
+* R
+* Ruby
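+
+A minimal sketch of this wrapping pattern, assuming Malhar's `JavaScriptOperator` from the `com.datatorrent.lib.script` package (class and property names may vary between releases, and the script body is illustrative):
+
+```java
+import com.datatorrent.lib.script.JavaScriptOperator;
+
+// inside populateDAG(): apply a JavaScript expression to each incoming tuple
+JavaScriptOperator script = dag.addOperator("script", new JavaScriptOperator());
+script.setScript("val = val * 2;");
+```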
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-apex-core/blob/e1da746e/docs/application_development.md
----------------------------------------------------------------------
diff --git a/docs/application_development.md b/docs/application_development.md
index a76ad8a..c5c9075 100644
--- a/docs/application_development.md
+++ b/docs/application_development.md
@@ -1,50 +1,16 @@
 Application Developer Guide
 ===========================
 
-Real-time big data processing is not only important but has become
-critical for businesses which depend on accurate and timely analysis of
-their business data. A few businesses have yielded to very expensive
-solutions like building an in-house, real-time analytics infrastructure
-supported by an internal development team, or buying expensive
-proprietary software. A large number of businesses are dealing with the
-requirement just by trying to make Hadoop do their batch jobs in smaller
-iterations. Over the last few years, Hadoop has become ubiquitous in the
-big data processing space, replacing expensive proprietary hardware and
-software solutions for massive data processing with very cost-effective,
-fault-tolerant, open-sourced, and commodity-hardware-based solutions.
-While Hadoop has been a game changer for companies, it is primarily a
-batch-oriented system, and does not yet have a viable option for
-real-time data processing.  Most companies with real-time data
-processing end up having to build customized solutions in addition to
-their Hadoop infrastructure.
-
- 
-
-The DataTorrent platform is designed to process massive amounts of
-real-time events natively in Hadoop. This can be event ingestion,
-processing, and aggregation for real-time data analytics, or can be
-real-time business logic decisioning such as cell tower load balancing,
-real-time ads bidding, or fraud detection.  The platform has the ability
-to repair itself in real-time (without data loss) if hardware fails, and
-adapt to changes in load by adding and removing computing resources
-automatically.
-
-
-
-DataTorrent is a native Hadoop application. It runs as a YARN
-(Hadoop 2.x) application and leverages Hadoop as a distributed operating
-system. All the basic distributed operating system capabilities of
-Hadoop like resource allocation (Resource Manager, distributed file system (HDFS),
-multi-tenancy, security, fault-tolerance, scalability, etc.
-are supported natively in all streaming applications.  Just as Hadoop
-for map-reduce handles all the details of the application allowing you
-to only focus on writing the application (the mapper and reducer
-functions), the platform handles all the details of streaming execution,
-allowing you to only focus on your business logic. Using the platform
-removes the need to maintain separate clusters for real-time
-applications.
-
-
+The Apex platform is designed to process massive amounts of
+real-time events natively in Hadoop.  It runs as a YARN (Hadoop 2.x) 
+application and leverages Hadoop as a distributed operating
+system.  All the basic distributed operating system capabilities of
+Hadoop like resource management (YARN), distributed file system (HDFS),
+multi-tenancy, security, fault-tolerance, and scalability are supported natively 
+in all Apex applications.  The platform handles all the details of application 
+execution, including dynamic scaling, state checkpointing and recovery, event 
+processing guarantees, etc., allowing you to focus on writing your application logic without
+mixing operational and functional concerns.
 
 In the platform, building a streaming application can be extremely
 easy and intuitive.  The application is represented as a Directed
@@ -56,25 +22,24 @@ processing is not available in the Operator Library, one can easily
 write a custom operator. We refer those interested in creating their own
 operators to the [Operator Development Guide](operator_development.md).
 
+
 Running A Test Application
 =======================================
 
-This chapter will help you with a quick start on running an
-application. If you are starting with the platform for the first time,
-it would be informative to open an existing application and see it run.
-Do the following steps to run the PI demo, which computes the value of
-PI  in a simple
-manner:
+If you are starting with the Apex platform for the first time,
+it can be informative to launch an existing application and see it run.
+One of the simplest examples provided in [Apex Malhar](apex_malhar.md) is the Pi demo application,
+which computes the value of PI using random numbers.  After [setting up the development environment](apex_development_setup.md),
+the Pi demo can be launched as follows:
 
-1.  Open up platform files in your IDE (for example NetBeans, or Eclipse)
-2.  Open Demos project
-3.  Open Test Packages and run ApplicationTest.java in pi package
-4.  See the results in your system console
+1.  Open up Apex Malhar files in your IDE (for example Eclipse, IntelliJ, or NetBeans)
+2.  Navigate to `demos/pi/src/test/java/com/datatorrent/demos/ApplicationTest.java`
+3.  Run the test for ApplicationTest.java
+4.  View the output in the system console
 
 
-
-Congratulations, you just ran your first real-time streaming demo
-:) This demo is very simple and has four operators. The first operator
+Congratulations, you just ran your first real-time streaming demo :) 
+This demo is very simple and has four operators. The first operator
 emits random integers between 0 and 30,000. The second operator receives
 these coefficients and emits a hashmap with x and y values each time it
 receives two values. The third operator takes these values and computes
@@ -119,6 +84,7 @@ platform. In the remaining part of this document we will go through
 details needed for you to develop and run streaming applications in
 Malhar.
 
+
 Test Application: Yahoo! Finance Quotes
 ----------------------------------------------------
 
@@ -622,7 +588,7 @@ attribute APPLICATION_WINDOW_COUNT.
 In the rest of this chapter we will run through the process of
 running this application. We assume that  you are familiar with details
 of your Hadoop infrastructure. For installation
-details please refer to the [Installation Guide](installation.md).
+details please refer to the [Installation Guide](http://docs.datatorrent.com/installation/).
 
 
 Running a Test Application
@@ -1449,7 +1415,7 @@ not impact functionality of the operator. Users can change certain
 attributes in runtime. Users cannot add attributes to operators; they
 are pre-defined by the platform. They are interpreted by the platform
 and thus cannot be defined in user created code (like properties).
-Details of attributes are covered in  [Configuration Guide](configuration.md).
+Details of attributes are covered in  [Configuration Guide](http://docs.datatorrent.com/configuration/).
 
 ### Operator State
 
@@ -1857,10 +1823,7 @@ Hadoop is a multi-tenant distributed operating system. Security is
 an intrinsic element of multi-tenancy as without it a cluster cannot be
 reasonably be shared among enterprise applications. Streaming
 applications follow all multi-tenancy security models used in Hadoop as
-they are native Hadoop applications. For details refer to the
-[Operation and Installation
-Guide](https://www.datatorrent.com/docs/guides/OperationandInstallationGuide.html)
-.
+they are native Hadoop applications.
 
 Security
 ---------------------
@@ -2824,7 +2787,7 @@ is not yet available.
 
 
 
-9: Dynamic Application Modifications
+Dynamic Application Modifications
 =================================================
 
 Dynamic application modifications are being worked on and most of
@@ -2872,7 +2835,7 @@ Dynamic modifications to applications are foundational part of the
 platform. They enable users to build layers over the applications. Users
 can also save all the changes done since the application launch, and
 therefore predictably get the application to its current state. For
-details refer to  [Configuration Guide](configuration.md)
+details refer to  [Configuration Guide](http://docs.datatorrent.com/configuration/)
 .
 
 
@@ -2881,54 +2844,11 @@ details refer to  [Configuration Guide](configuration.md)
 
 
 
-
-User Interface
-===========================
-
-The platform provides a rich user interface. This includes tools
-to monitor the application system metrics (throughput, latency, resource
-utilization, etc.); dashboards for application data, replay, errors; and
-a Developer studio for application creation, launch etc. For details
-refer to  [UI Console Guide](dtmanage.md).
-
-
-
 Demos
 ==================
 
-In this section we list some of the demos that come packaged with
-installer. The source code for the demos is available in the open-source
+The source code for the demos is available in the open-source
 [Apache Apex-Malhar repository](https://github.com/apache/incubator-apex-malhar).
 All of these do computations in real-time. Developers are encouraged to
 review them as they use various features of the platform and provide an
-opportunity for quick learning.
-
-1.  Computation of PI:
-    Computes PI by generating a random location on X-Y plane and
-    measuring how often it lies within the unit circle centered
-    at (0,0).
-2.  Yahoo! Finance quote computation:
-    Computes ticker quote, 1-day chart (per min), and simple moving
-    averages (per 5 min).
-3.  Echoserver Reads messages from a
-    network connection and echoes them back out.
-4.  Twitter top N tweeted urls: Computes
-    top N tweeted urls over last 5 minutes
-5.  Twitter trending hashtags: Computes
-    the top Twitter Hashtags over the last 5 minutes
-6.  Twitter top N frequent words:
-    Computes top N frequent words in a sliding window
-7.  Word count: Computes word count for
-    all words within a large file
-8.  Mobile location tracker: Tracks
-    100,000 cell phones within an area code moving at car speed (jumping
-    cell phone towers every 1-5 seconds).
-9.  Frauddetect: Analyzes a stream of
-    credit card merchant transactions.
-10. Mroperator:Contains several
-    map-reduce applications.
-11. R: Analyzes a synthetic stream of
-    eruption event data for the Old Faithful
-    geyser (https://en.wikipedia.org/wiki/Old_Faithful).
-12. Machinedata: Analyzes a synthetic
-    stream of events to determine health of a machine.  
+opportunity for quick learning.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-apex-core/blob/e1da746e/docs/application_packages.md
----------------------------------------------------------------------
diff --git a/docs/application_packages.md b/docs/application_packages.md
index 521779a..06c2980 100644
--- a/docs/application_packages.md
+++ b/docs/application_packages.md
@@ -107,7 +107,7 @@ other IDEs, like Eclipse or IntelliJ, is similar.
 # Writing Your Own App Package
 
 
-Please refer to the [Creating Apps](create.md) on the basics on how to write an Apache Apex application.  In your AppPackage project, you can add custom operators (refer to [Operator Development Guide](https://www.datatorrent.com/docs/guides/OperatorDeveloperGuide.html)), project dependencies, default and required configuration properties, pre-set configurations and other metadata.
+Please refer to [Creating Apps](http://docs.datatorrent.com/create/) for the basics of how to write an Apache Apex application.  In your AppPackage project, you can add custom operators (refer to the [Operator Development Guide](operator_development.md)), project dependencies, default and required configuration properties, pre-set configurations, and other metadata.
 
 ## Adding (and removing) project dependencies
 
@@ -398,8 +398,6 @@ property:
 
         dt.attr.APPLICATION_NAME
 
-There are also other properties that can be set.  For details on
-properties, refer to the [Operation and Installation Guide](https://www.datatorrent.com/docs/guides/OperationandInstallationGuide.html).
 
 In this example, property some_name_1 is a required property which
 must be set at launch time, or it must be set by a pre-set configuration
@@ -623,12 +621,12 @@ Here is an example of launching an application through curl:
  lications/MyFirstApplication/launch
 ```
 
-Please refer to the [Gateway API reference](https://www.google.com/url?q=https://www.datatorrent.com/docs/guides/DTGatewayAPISpecification.html&sa=D&usg=AFQjCNEWfN7-e7fd6MoWZjmJUE3GW7UwdQ) for the complete specification of the REST API.
+Please refer to the [Gateway API](http://docs.datatorrent.com/dtgateway_api/) for the complete specification of the REST API.
 
 # Examining and Launching Application Packages Through Apex CLI
 
 If you are working with Application Packages in the local filesystem and
-do not want to deal with dtGateway, you can use the Apex Command Line Interface (dtcli).  Please refer to the [Gateway API](dtgateway_api.md)
+do not want to deal with dtGateway, you can use the Apex Command Line Interface (dtcli).  Please refer to the [Gateway API](http://docs.datatorrent.com/dtgateway_api/)
 to see samples for these commands.
 
 ## Getting Application Package Meta Information

http://git-wip-us.apache.org/repos/asf/incubator-apex-core/blob/e1da746e/docs/autometrics.md
----------------------------------------------------------------------
diff --git a/docs/autometrics.md b/docs/autometrics.md
index f6000e8..c534fb2 100644
--- a/docs/autometrics.md
+++ b/docs/autometrics.md
@@ -123,6 +123,13 @@ An instance of above aggregator can be specified as the `METRIC_AGGREGATOR` for
 ```
 
 # Retrieving AutoMetrics
+
+There are two options for retrieving the AutoMetrics:
+
+* Through the DataTorrent Gateway REST API
+* Through the REST service on the port of the running STRAM
+
+
 The Gateway REST API provides a way to retrieve the latest AutoMetrics for each logical operator.  For example:
 
 ```
@@ -167,145 +174,4 @@ GET /ws/v2/applications/{appid}/logicalPlan/operators/{opName}
 }
 ```
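+
+As a usage sketch, the same resource can be fetched with a command line HTTP client such as curl (the host, port, and ids below are placeholders):
+
+```bash
+curl http://<gateway-host>:<gateway-port>/ws/v2/applications/<appid>/logicalPlan/operators/<opName>
+```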
 
-However, just like AutoMetrics, the Gateway only provides the latest metrics.  For historical metrics, we will need the help of App Data Tracker.
-
-# App Data Tracker
-As discussed above, STRAM aggregates the AutoMetrics from physical operators (partitions) to something that makes sense in one logical operator.  It pushes the aggregated AutoMetrics values using Websocket to the Gateway at every second along with system metrics for each operator.  Gateway relays the information to an application called App Data Tracker.  It is another Apex application that runs in the background and further aggregates the incoming values by time bucket and stores the values in HDHT.  It also allows the outside to retrieve the aggregated AutoMetrics and system metrics through websocket interface.
-
-![AppDataTracker](images/autometrics/adt.png)
-
-App Data Tracker is enabled by having these properties in dt-site.xml:
-
-```xml
-<property>
-  <name>dt.appDataTracker.enable</name>
-  <value>true</value>
-</property>
-<property>
-  <name>dt.appDataTracker.transport</name>
-  <value>builtin:AppDataTrackerFeed</value>
-</property>
-<property>
-  <name>dt.attr.METRICS_TRANSPORT</name>
-  <value>builtin:AppDataTrackerFeed</value>
-</property>
-```
-
-All the applications launched after the App Data Tracker is enabled will have metrics sent to it.
-
-**Note**: The App Data Tracker will be shown running in dtManage as a “system app”.  It will show up if the “show system apps” button is pressed.
-
-By default, the time buckets App Data Tracker aggregates upon are one minute, one hour and one day.  It can be overridden by changing the operator attribute `METRICS_DIMENSIONS_SCHEME`.
-
-Also by default, the app data tracker performs all these aggregations: SUM, MIN, MAX, AVG, COUNT, FIRST, LAST on all number metrics.  You can also override by changing the same operator attribute `METRICS_DIMENSIONS_SCHEME`, provided the custom aggregator is known to the App Data Tracker.  (See next section)
-
-# Custom Aggregator in App Data Tracker
-Custom aggregators allow you to do your own custom computation on statistics generated by any of your applications. In order to implement a Custom aggregator you have to do two things:
-
-1. Combining new inputs with the current aggregation
-2. Combining two aggregations together into one aggregation
-
-Let’s consider the case where we want to perform the following rolling average:
-
-Y_n = ½ * X_n + ½ * X_n-1 + ¼ * X_n-2 + ⅛ * X_n-3 +...
-
-This aggregation could be performed by the following Custom Aggregator:
-
-```java
-@Name("IIRAVG")
-public class AggregatorIIRAVG extends AbstractIncrementalAggregator
-{
-  ...
-
-  private void aggregateHelper(DimensionsEvent dest, DimensionsEvent src)
-  {
-    double[] destVals = dest.getAggregates().getFieldsDouble();
-    double[] srcVals = src.getAggregates().getFieldsDouble();
-
-    for (int index = 0; index < destLongs.length; index++) {
-      destVals[index] = .5 * destVals[index] + .5 * srcVals[index];
-    }
-  }
-
-  @Override
-  public void aggregate(Aggregate dest, InputEvent src)
-  {
-    //Aggregate a current aggregation with a new input
-    aggregateHelper(dest, src);
-  }
-
-  @Override
-  public void aggregate(Aggregate destAgg, Aggregate srcAgg)
-  {
-    //Combine two existing aggregations together
-    aggregateHelper(destAgg, srcAgg);
-  }
-}
-```
-
-## Discovery of Custom Aggregators
-AppDataTracker searches for custom aggregator jars under the following directories statically before launching:
-
-1. {dt\_installation\_dir}/plugin/aggregators
-2. {user\_home\_dir}/.dt/plugin/aggregators
-
-It uses reflection to find all the classes that extend from `IncrementalAggregator` and `OTFAggregator` in these jars and registers them with the name provided by `@Name` annotation (or class name when `@Name` is absent).
-
-# Using `METRICS_DIMENSIONS_SCHEME`
-
-Here is a sample code snippet on how you can make use of `METRICS_DIMENSIONS_SCHEME` to set your own time buckets and your own set of aggregators for certain `AutoMetric`s performed by the App Data Tracker in your application.
-
-```java
-  @Override
-  public void populateDAG(DAG dag, Configuration configuration)
-  {
-    ...
-    LineReceiver lineReceiver = dag.addOperator("LineReceiver", new LineReceiver());
-    ...
-    AutoMetric.DimensionsScheme dimensionsScheme = new AutoMetric.DimensionsScheme()
-    {
-      String[] timeBuckets = new String[] { "1s", "1m", "1h" };
-      String[] lengthAggregators = new String[] { "IIRAVG", "SUM" };
-      String[] countAggregators = new String[] { "SUM" };
-
-      /* Setting the aggregation time bucket to be one second, one minute and one hour */
-      @Override
-      public String[] getTimeBuckets()
-      {
-        return timeBuckets;
-      }
-
-      @Override
-      public String[] getDimensionAggregationsFor(String logicalMetricName)
-      {
-        if ("length".equals(logicalMetricName)) {
-          return lengthAggregators;
-        } else if ("count".equals(logicalMetricName)) {
-          return countAggregators;
-        } else {
-          return null; // use default
-        }
-      }
-    };
-
-    dag.setAttribute(lineReceiver, OperatorContext.METRICS_DIMENSIONS_SCHEME, dimensionsScheme);
-    ...
-  }
-```
-
-
-# Dashboards
-With App Data Tracker enabled, you can visualize the AutoMetrics and system metrics in the Dashboards within dtManage.  As shown in the diagram in the App Data Tracker section, dtGateway relays queries and query results to and from the App Data Tracker.  In this way, dtManage sends queries to, and receives results from, the App Data Tracker via dtGateway, and uses the results to let the user visualize the data.
-
-Click the visualize button on dtManage's application page.
-
-![AppDataTracker](images/autometrics/visualize.png)
-
-You will see the dashboard for the AutoMetrics and the system metrics.
-
-![AppDataTracker](images/autometrics/dashboard.png)
-
-The left widget shows the AutoMetrics of `length` and `count` for the LineReceiver operator.  The right widget shows the system metrics.
-
-The Dashboards include some simple built-in widgets for visualizing the data, such as line charts and bar charts.
-Users will also be able to implement their own widgets to visualize their data.
+However, just like AutoMetrics, the Gateway only provides the latest metrics.  For historical metrics, we will need the help of [App Data Tracker](http://docs.datatorrent.com/autometrics/#app-data-tracker).
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-apex-core/blob/e1da746e/docs/configuration_packages.md
----------------------------------------------------------------------
diff --git a/docs/configuration_packages.md b/docs/configuration_packages.md
index 30f1717..a0d5f90 100644
--- a/docs/configuration_packages.md
+++ b/docs/configuration_packages.md
@@ -91,13 +91,13 @@ Example:
 ```
   <groupId>com.example</groupId>
   <version>1.0.0</version>
-  <artifactId>mydtconf</artifactId>
+  <artifactId>mycustomconf</artifactId>
   <packaging>jar</packaging>
   <!-- change these to the appropriate values -->
-  <name>My DataTorrent Application Configuration</name>
-  <description>My DataTorrent Application Configuration Description</description>
+  <name>My Custom Application Configuration</name>
+  <description>My Custom Application Configuration Description</description>
   <properties>
-    <datatorrent.apppackage.name>mydtapp</datatorrent.apppackage.name>
+    <datatorrent.apppackage.name>mycustomapp</datatorrent.apppackage.name>
     <datatorrent.apppackage.minversion>1.0.0</datatorrent.apppackage.minversion>
    <datatorrent.apppackage.maxversion>1.9999.9999</datatorrent.apppackage.maxversion>
     <datatorrent.appconf.classpath>classpath/*</datatorrent.appconf.classpath>

http://git-wip-us.apache.org/repos/asf/incubator-apex-core/blob/e1da746e/docs/dtcli.md
----------------------------------------------------------------------
diff --git a/docs/dtcli.md b/docs/dtcli.md
index 813a27f..8cf11e6 100644
--- a/docs/dtcli.md
+++ b/docs/dtcli.md
@@ -1,12 +1,7 @@
 Apache Apex Command Line Interface
 ================================================================================
 
-dtCli, the Apache Apex command line interface, can be used to launch, monitor, and manage
-Apache Apex applications.  dtCli is a wrapper around the [REST API](dtgateway_api.md) provided by dtGatway, and
-provides a developer friendly way of interacting with Apache Apex platform. The CLI enables a much higher level of feature set by
-hiding deep details of REST API.  Another advantage of dtCli is to provide scope, by connecting and executing commands in a context
-of specific application.  dtCli enables easy integration with existing enterprise toolset for automated application monitoring
-and management.  Currently the following high level tasks are supported.
+dtCli, the Apache Apex command line interface, can be used to launch, monitor, and manage Apache Apex applications.  It provides a developer-friendly way of interacting with the Apache Apex platform.  Another advantage of dtCli is that it provides scope, by connecting to and executing commands in the context of a specific application.  dtCli enables easy integration with existing enterprise toolsets for automated application monitoring and management.  Currently, the following high-level tasks are supported:
 
 -   Launch or kill applications
 -   View system metrics including load, throughput, latency, etc.
@@ -19,7 +14,7 @@ and management.  Currently the following high level tasks are supported.
 
 ## dtcli Commands
 
-dtCli can be launched by running following command on the same machine where dtGatway was installed
+dtCli can be launched by running the following command
 
     dtcli
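+
+Once started, it presents an interactive prompt where the commands documented below can be issued.  For example (a hypothetical session; output omitted):
+
+    dt> list-apps
+    dt> connect <app-id>
+    dt> exit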
 

http://git-wip-us.apache.org/repos/asf/incubator-apex-core/blob/e1da746e/docs/favicon.ico
----------------------------------------------------------------------
diff --git a/docs/favicon.ico b/docs/favicon.ico
new file mode 100644
index 0000000..c0b3dae
Binary files /dev/null and b/docs/favicon.ico differ

http://git-wip-us.apache.org/repos/asf/incubator-apex-core/blob/e1da746e/docs/images/MalharOperatorOverview.png
----------------------------------------------------------------------
diff --git a/docs/images/MalharOperatorOverview.png b/docs/images/MalharOperatorOverview.png
deleted file mode 100644
index 40bee4a..0000000
Binary files a/docs/images/MalharOperatorOverview.png and /dev/null differ

http://git-wip-us.apache.org/repos/asf/incubator-apex-core/blob/e1da746e/docs/images/malhar-operators.png
----------------------------------------------------------------------
diff --git a/docs/images/malhar-operators.png b/docs/images/malhar-operators.png
new file mode 100644
index 0000000..ac09622
Binary files /dev/null and b/docs/images/malhar-operators.png differ

http://git-wip-us.apache.org/repos/asf/incubator-apex-core/blob/e1da746e/docs/index.md
----------------------------------------------------------------------
diff --git a/docs/index.md b/docs/index.md
new file mode 100644
index 0000000..6a78abf
--- /dev/null
+++ b/docs/index.md
@@ -0,0 +1,20 @@
+Apache Apex (Incubating)
+================================================================================
+
+Apex is a Hadoop YARN native big data processing platform, enabling real-time stream processing as well as batch processing.  Apex provides the following benefits:
+
+* High scalability and performance
+* Fault tolerance and state management
+* Hadoop-native YARN & HDFS implementation
+* Event processing guarantees
+* Separation of functional and operational concerns
+* Simple API supports generic Java code
+
+The platform has been demonstrated to scale linearly across Hadoop clusters under extreme loads of billions of events per second.  Hardware and process failures are quickly recovered from, with HDFS-backed checkpointing and automatic operator recovery preserving application state and resuming execution in seconds.  Functional and operational specifications are separated.  Apex provides a simple API, which enables users to write generic, reusable code.  The code is dropped in as-is, and the platform automatically handles the various operational concerns, such as state management, fault tolerance, scalability, security, metrics, etc.  This frees users to focus on functional development, and lets the platform provide operability support.
+
+The core Apex platform is supplemented by Malhar, a library of connector and logic functions that enables rapid application development.  These operators and modules provide access to HDFS, S3, NFS, FTP, and other file systems; Kafka, ActiveMQ, RabbitMQ, JMS, and other message systems; and MySQL, Cassandra, MongoDB, Redis, HBase, CouchDB, generic JDBC, and other database connectors.  The Malhar library also includes a host of other common business logic patterns that help users significantly reduce the time it takes to go into production.  Ease of integration with all other big data technologies is one of the primary missions of Malhar.
+
+
+For additional information visit [Apache Apex (incubating)](http://apex.incubator.apache.org/).
+
+[![](favicon.ico)](http://apex.incubator.apache.org/)

http://git-wip-us.apache.org/repos/asf/incubator-apex-core/blob/e1da746e/docs/operator_development.md
----------------------------------------------------------------------
diff --git a/docs/operator_development.md b/docs/operator_development.md
index f502725..85ebab5 100644
--- a/docs/operator_development.md
+++ b/docs/operator_development.md
@@ -287,7 +287,7 @@ Code
 
 The source code for the tutorial can be found here:
 
-[https://github.com/DataTorrent/examples/tree/master/tutorials/operatorTutorial](https://www.google.com/url?q=https://github.com/DataTorrent/examples/tree/master/tutorials/operatorTutorial&sa=D&usg=AFQjCNHAAgSpNprHJVvy9GSjdlD1uwU7jw)
+[https://github.com/DataTorrent/examples/tree/master/tutorials/operatorTutorial](https://github.com/DataTorrent/examples/tree/master/tutorials/operatorTutorial)
 
 
 Operator Reference <a name="operator_reference"></a>
@@ -447,3 +447,9 @@ ports.
 1. Invoke constructor; non-transients initialized.
 2. Copy state from checkpoint -- initialized values from step 1 are
 replaced.
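+
+A minimal sketch of what this recovery sequence means for operator state (a hypothetical operator; field names are illustrative):
+
+```java
+public class CountingOperator extends BaseOperator
+{
+  // Checkpointed: set to 0 by the constructor (step 1), then overwritten
+  // with the value saved in the checkpoint during recovery (step 2).
+  private long tupleCount = 0;
+
+  // Transient: not part of the checkpoint, so after recovery it remains at
+  // its default and must be re-created, typically in setup().
+  private transient StringBuilder lineBuffer;
+}
+```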
+
+
+Malhar Operator Library
+==========================
+
+To see the full list of Apex Malhar operators along with related documentation, visit [Apex Malhar on Github](https://github.com/apache/incubator-apex-malhar)

http://git-wip-us.apache.org/repos/asf/incubator-apex-core/blob/e1da746e/mkdocs.yml
----------------------------------------------------------------------
diff --git a/mkdocs.yml b/mkdocs.yml
new file mode 100644
index 0000000..c6a26d7
--- /dev/null
+++ b/mkdocs.yml
@@ -0,0 +1,15 @@
+site_name: Apache Apex Documentation
+site_favicon: favicon.ico
+theme: readthedocs
+pages:
+- Apache Apex: index.md
+- Apache Apex-Malhar: apex_malhar.md
+- Development:
+    - Development Setup: apex_development_setup.md
+    - Applications: application_development.md
+    - Application Packages: application_packages.md
+    - Configuration Packages: configuration_packages.md
+    - Operators: operator_development.md
+    - AutoMetric API: autometrics.md
+- Operations:
+    - dtCli: dtcli.md