Posted to commits@drill.apache.org by br...@apache.org on 2015/02/26 01:31:10 UTC

[06/13] drill git commit: DRILL-2315: Confluence conversion plus fixes

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/sql-ref/nested/001-flatten.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/sql-ref/nested/001-flatten.md b/_docs/drill-docs/sql-ref/nested/001-flatten.md
deleted file mode 100644
index 124db91..0000000
--- a/_docs/drill-docs/sql-ref/nested/001-flatten.md
+++ /dev/null
@@ -1,89 +0,0 @@
----
-title: "FLATTEN Function"
-parent: "Nested Data Functions"
----
-The FLATTEN function is useful for flexible exploration of repeated data.
-FLATTEN separates the elements in a repeated field into individual records. To
-maintain the association between each flattened value and the other fields in
-the record, all of the other columns are copied into each new record. A very
-simple example would turn this data (one record):
-
-    {
-      "x" : 5,
-      "y" : "a string",
-      "z" : [ 1,2,3]
-    }
-
-into three distinct records:
-
-    select x, y, flatten(z) as z from table;
-    +-------------+----------------+-----------+
-    | x           | y              | z         |
-    +-------------+----------------+-----------+
-    | 5           | "a string"     | 1         |
-    | 5           | "a string"     | 2         |
-    | 5           | "a string"     | 3         |
-    +-------------+----------------+-----------+
-
-The function takes a single argument, which must be an array (the `z` column
-in this example).
-
-For a more interesting example, consider the JSON data in the publicly
-available [Yelp](https://www.yelp.com/dataset_challenge/dataset) data set. The
-first query below returns three columns from the
-`yelp_academic_dataset_business.json` file: `name`, `hours`, and `categories`.
-The query is restricted to distinct rows where the name is `zpizza`. The
-query returns only one row that meets those criteria; however, note that this
-row contains an array of four categories:
-
-    0: jdbc:drill:zk=local> select distinct name, hours, categories 
-    from dfs.yelp.`yelp_academic_dataset_business.json` 
-    where name ='zpizza';
-    +------------+------------+------------+
-    |    name    |   hours    | categories |
-    +------------+------------+------------+
-    | zpizza     | {"Tuesday":{"close":"22:00","open":"10:00"},"Friday":{"close":"23:00","open":"10:00"},"Monday":{"close":"22:00","open":"10:00"},"Wednesday":{"close":"22:00","open":"10:00"},"Thursday":{"close":"22:00","open":"10:00"},"Sunday":{"close":"22:00","open":"10:00"},"Saturday":{"close":"23:00","open":"10:00"}} | ["Gluten-Free","Pizza","Vegan","Restaurants"] |
-
-The FLATTEN function can operate on this single row and return multiple rows,
-one for each category:
-
-    0: jdbc:drill:zk=local> select distinct name, flatten(categories) as categories 
-    from dfs.yelp.`yelp_academic_dataset_business.json` 
-    where name ='zpizza' order by 2;
-    +------------+-------------+
-    |    name    | categories  |
-    +------------+-------------+
-    | zpizza     | Gluten-Free |
-    | zpizza     | Pizza       |
-    | zpizza     | Restaurants |
-    | zpizza     | Vegan       |
-    +------------+-------------+
-    4 rows selected (2.797 seconds)
-
-Having used the FLATTEN function to break down arrays into distinct rows, you
-can run queries that do deeper analysis on the flattened result set. For
-example, you can use FLATTEN in a subquery, then apply WHERE clause
-constraints or aggregate functions to the results in the outer query.
-
-The following query uses the same data file as the previous query to flatten
-the categories array, then runs the COUNT function on the flattened result:
-
-    select celltbl.catl, count(celltbl.catl) catcount 
-    from (select flatten(categories) catl 
-    from dfs.yelp.`yelp_academic_dataset_business.json`) celltbl 
-    group by celltbl.catl 
-    order by count(celltbl.catl) desc limit 5;
- 
-    +---------------+------------+
-    |    catl       |  catcount  |
-    +---------------+------------+
-    | Restaurants   | 14303      |
-    | Shopping      | 6428       |
-    | Food          | 5209       |
-    | Beauty & Spas | 3421       |
-    | Nightlife     | 2870       |
-    +---------------+------------+
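-
-Similarly, a WHERE clause in the outer query can filter the flattened rows.
-A minimal sketch (output not shown):
-
-    select celltbl.catl 
-    from (select flatten(categories) catl 
-    from dfs.yelp.`yelp_academic_dataset_business.json`) celltbl 
-    where celltbl.catl like '%Pizza%';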
-
-A common use case for FLATTEN is its use in conjunction with the
-[KVGEN](/confluence/display/DRILL/KVGEN+Function) function.
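-
-For example, a query of this shape (a sketch; `mapcol` and the file name are
-assumed, not taken from a real data set) pairs the two functions to produce
-one row per map key:
-
-    select flatten(kvgen(mapcol)) from dfs.yelp.`somefile.json`;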
-

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/sql-ref/nested/002-kvgen.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/sql-ref/nested/002-kvgen.md b/_docs/drill-docs/sql-ref/nested/002-kvgen.md
deleted file mode 100644
index a27a781..0000000
--- a/_docs/drill-docs/sql-ref/nested/002-kvgen.md
+++ /dev/null
@@ -1,150 +0,0 @@
----
-title: "KVGEN Function"
-parent: "Nested Data Functions"
----
-KVGEN stands for _key-value generation_. This function is useful when complex
-data files contain arbitrary maps that consist of relatively "unknown" column
-names. Instead of having to specify columns in the map to access the data, you
-can use KVGEN to return a list of the keys that exist in the map. KVGEN turns
-a map with a wide set of columns into an array of key-value pairs.
-
-In turn, you can write analytic queries that return a subset of the generated
-keys or constrain the keys in some way. For example, you can use the
-[FLATTEN](/confluence/display/DRILL/FLATTEN+Function) function to break the
-array down into multiple distinct rows and further query those rows.
-
-For example, assume that a JSON file contains this data:  
-
-    {"a": "valA", "b": "valB"}
-    {"c": "valC", "d": "valD"}
-
-KVGEN would operate on this data to generate:
-
-    [{"key": "a", "value": "valA"}, {"key": "b", "value": "valB"}]
-    [{"key": "c", "value": "valC"}, {"key": "d", "value": "valD"}]
-
-Applying the [FLATTEN](/confluence/display/DRILL/FLATTEN+Function) function to
-this data would return:
-
-    {"key": "a", "value": "valA"}
-    {"key": "b", "value": "valB"}
-    {"key": "c", "value": "valC"}
-    {"key": "d", "value": "valD"}
-
-Assume that a JSON file called `kvgendata.json` includes multiple records that
-look like this one:
-
-    {
-	    "rownum": 1,
-	    "bigintegercol": {
-	        "int_1": 1,
-	        "int_2": 2,
-	        "int_3": 3
-	    },
-	    "varcharcol": {
-	        "varchar_1": "abc",
-	        "varchar_2": "def",
-	        "varchar_3": "xyz"
-	    },
-	    "boolcol": {
-	        "boolean_1": true,
-	        "boolean_2": false,
-	        "boolean_3": true
-	    },
-	    "float8col": {
-	        "f8_1": 1.1,
-	        "f8_2": 2.2
-	    },
-	    "complex": [
-	        {
-	            "col1": 3
-	        },
-	        {
-	            "col2": 2,
-	            "col3": 1
-	        },
-	        {
-	            "col1": 7
-	        }
-	    ]
-    }
- 
-	{
-	    "rownum": 3,
-	    "bigintegercol": {
-	        "int_1": 1,
-	        "int_3": 3
-	    },
-	    "varcharcol": {
-	        "varchar_1": "abcde",
-	        "varchar_2": null,
-	        "varchar_3": "xyz",
-	        "varchar_4": "xyz2"
-	    },
-	    "boolcol": {
-	        "boolean_1": true,
-	        "boolean_2": false
-	    },
-	    "float8col": {
-	        "f8_1": 1.1,
-	        "f8_3": 6.6
-	    },
-	    "complex": [
-	        {
-	            "col1": 2,
-	            "col3": 1
-	        }
-	    ]
-	}
-	...
-
-
-A SELECT * query against this specific record returns the following row:
-
-    0: jdbc:drill:zk=local> select * from dfs.yelp.`kvgendata.json` where rownum=1;
- 
-	+------------+---------------+------------+------------+------------+------------+
-	|   rownum   | bigintegercol | varcharcol |  boolcol   | float8col  |  complex   |
-	+------------+---------------+------------+------------+------------+------------+
-	| 1          | {"int_1":1,"int_2":2,"int_3":3} | {"varchar_1":"abc","varchar_2":"def","varchar_3":"xyz"} | {"boolean_1":true,"boolean_2":false,"boolean_3":true} | {"f8_1":1.1,"f8_2":2.2} | [{"col1":3},{"col2":2,"col3":1},{"col1":7}] |
-	+------------+---------------+------------+------------+------------+------------+
-	1 row selected (0.122 seconds)
-
-You can use the KVGEN function to turn the maps in this data into key-value
-pairs. For example:
-
-	0: jdbc:drill:zk=local> select kvgen(varcharcol) from dfs.yelp.`kvgendata.json`;
-	+------------+
-	|   EXPR$0   |
-	+------------+
-	| [{"key":"varchar_1","value":"abc"},{"key":"varchar_2","value":"def"},{"key":"varchar_3","value":"xyz"}] |
-	| [{"key":"varchar_1","value":"abcd"}] |
-	| [{"key":"varchar_1","value":"abcde"},{"key":"varchar_3","value":"xyz"},{"key":"varchar_4","value":"xyz2"}] |
-	| [{"key":"varchar_1","value":"abc"},{"key":"varchar_2","value":"def"}] |
-	+------------+
-	4 rows selected (0.091 seconds)
-
-Now you can apply the FLATTEN function to break out the key-value pairs into
-distinct rows:
-
-	0: jdbc:drill:zk=local> select flatten(kvgen(varcharcol)) from dfs.yelp.`kvgendata.json`;
-	+------------+
-	|   EXPR$0   |
-	+------------+
-	| {"key":"varchar_1","value":"abc"} |
-	| {"key":"varchar_2","value":"def"} |
-	| {"key":"varchar_3","value":"xyz"} |
-	| {"key":"varchar_1","value":"abcd"} |
-	| {"key":"varchar_1","value":"abcde"} |
-	| {"key":"varchar_3","value":"xyz"} |
-	| {"key":"varchar_4","value":"xyz2"} |
-	| {"key":"varchar_1","value":"abc"} |
-	| {"key":"varchar_2","value":"def"} |
-	+------------+
-	9 rows selected (0.151 seconds)
-
-See the description of [FLATTEN](/confluence/display/DRILL/FLATTEN+Function)
-for an example of a query against the flattened data.
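-
-For instance, a sketch of a query that constrains the generated keys (the
-subquery alias and member access are assumed; `key` and `value` are enclosed
-in backticks because `value` is a reserved word):
-
-    select t.kvpair.`key` as k, t.kvpair.`value` as v
-    from (select flatten(kvgen(varcharcol)) kvpair
-    from dfs.yelp.`kvgendata.json`) t
-    where t.kvpair.`key` = 'varchar_4';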
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/sql-ref/nested/003-repeated-cnt.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/sql-ref/nested/003-repeated-cnt.md b/_docs/drill-docs/sql-ref/nested/003-repeated-cnt.md
deleted file mode 100644
index a66075c..0000000
--- a/_docs/drill-docs/sql-ref/nested/003-repeated-cnt.md
+++ /dev/null
@@ -1,34 +0,0 @@
----
-title: "REPEATED_COUNT Function"
-parent: "Nested Data Functions"
----
-This function counts the values in an array. The following example returns the
-counts for the `categories` array in the `yelp_academic_dataset_business.json`
-file. The counts are restricted to rows that contain the string `pizza`.
-
-	SELECT name, REPEATED_COUNT(categories) 
-	FROM   dfs.yelp.`yelp_academic_dataset_business.json` 
-	WHERE  name LIKE '%pizza%';
-	 
-	+---------------+------------+
-	|    name       |   EXPR$1   |
-	+---------------+------------+
-	| Villapizza    | 2          |
-	| zpizza        | 4          |
-	| zpizza        | 4          |
-	| Luckys pizza  | 2          |
-	| Zpizza        | 2          |
-	| S2pizzabar    | 4          |
-	| Dominos pizza | 5          |
-	+---------------+------------+
-	 
-	7 rows selected (2.03 seconds)
-
-The function requires a single argument, which must be an array. Note that
-this function is not a standard SQL aggregate function and does not require
-the count to be grouped by other columns in the select list (such as `name` in
-this example).
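-
-Because REPEATED_COUNT is a scalar function, it can also appear in a WHERE
-clause. A minimal sketch against the same file (output not verified):
-
-	SELECT name
-	FROM   dfs.yelp.`yelp_academic_dataset_business.json`
-	WHERE  REPEATED_COUNT(categories) > 3;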
-
-For another example of this function, see the following lesson in the Apache
-Drill Tutorial for Hadoop: [Lesson 3: Run Queries on Complex Data
-Types](/confluence/display/DRILL/Lesson+3%3A+Run+Queries+on+Complex+Data+Types).
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/tutorial/001-install-sandbox.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/tutorial/001-install-sandbox.md b/_docs/drill-docs/tutorial/001-install-sandbox.md
deleted file mode 100644
index e63ddd4..0000000
--- a/_docs/drill-docs/tutorial/001-install-sandbox.md
+++ /dev/null
@@ -1,56 +0,0 @@
----
-title: "Installing the Apache Drill Sandbox"
-parent: "Apache Drill Tutorial"
----
-This tutorial uses the MapR Sandbox, which is a Hadoop environment pre-configured with Apache Drill.
-
-To complete the tutorial on the MapR Sandbox with Apache Drill, work through
-the following pages in order:
-
-  * [Installing the Apache Drill Sandbox](/confluence/display/DRILL/Installing+the+Apache+Drill+Sandbox)
-  * [Getting to Know the Drill Setup](/confluence/display/DRILL/Getting+to+Know+the+Drill+Setup)
-  * [Lesson 1: Learn About the Data Set](/confluence/display/DRILL/Lesson+1%3A+Learn+About+the+Data+Set)
-  * [Lesson 2: Run Queries with ANSI SQL](/confluence/display/DRILL/Lesson+2%3A+Run+Queries+with+ANSI+SQL)
-  * [Lesson 3: Run Queries on Complex Data Types](/confluence/display/DRILL/Lesson+3%3A+Run+Queries+on+Complex+Data+Types)
-  * [Summary](/confluence/display/DRILL/Summary)
-
-# About Apache Drill
-
-Drill is an Apache open-source SQL query engine for Big Data exploration.
-Drill is designed from the ground up to support high-performance analysis on
-the semi-structured and rapidly evolving data coming from modern Big Data
-applications, while still providing the familiarity and ecosystem of ANSI SQL,
-the industry-standard query language. Drill provides plug-and-play integration
-with existing Apache Hive and Apache HBase deployments. Apache Drill 0.5 offers
-the following key features:
-
-  * Low-latency SQL queries
-
-  * Dynamic queries on self-describing data in files (such as JSON, Parquet, text) and MapR-DB/HBase tables, without requiring metadata definitions in the Hive metastore.
-
-  * ANSI SQL
-
-  * Nested data support
-
-  * Integration with Apache Hive (queries on Hive tables and views, support for all Hive file formats and Hive UDFs)
-
-  * BI/SQL tool integration using standard JDBC/ODBC drivers
-
-# MapR Sandbox with Apache Drill
-
-MapR includes Apache Drill as part of the Hadoop distribution. The MapR
-Sandbox with Apache Drill is a fully functional single-node cluster that can
-be used to get an overview on Apache Drill in a Hadoop environment. Business
-and technical analysts, product managers, and developers can use the sandbox
-environment to get a feel for the power and capabilities of Apache Drill by
-performing various types of queries. Once you get a flavor for the technology,
-refer to the [Apache Drill web site](http://incubator.apache.org/drill/) and
-[Apache Drill
-documentation](https://cwiki.apache.org/confluence/display/DRILL/Apache+Drill+Wiki) for more
-details.
-
-Note that Hadoop is not a prerequisite for Drill and users can start ramping
-up with Drill by running SQL queries directly on the local file system. Refer
-to [Apache Drill in 10
-Minutes](https://cwiki.apache.org/confluence/display/DRILL/Apache+Drill+in+10+Minutes)
-for an introduction to using Drill in local
-(embedded) mode.
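-
-For example, in embedded mode you can query the sample file that ships on
-Drill's classpath through the `cp` storage plugin. A minimal sketch:
-
-    select * from cp.`employee.json` limit 3;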
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/tutorial/002-get2kno-sb.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/tutorial/002-get2kno-sb.md b/_docs/drill-docs/tutorial/002-get2kno-sb.md
deleted file mode 100644
index e7b24a8..0000000
--- a/_docs/drill-docs/tutorial/002-get2kno-sb.md
+++ /dev/null
@@ -1,235 +0,0 @@
----
-title: "Getting to Know the Drill Sandbox"
-parent: "Apache Drill Tutorial"
----
-This section describes the configuration of the Apache Drill system that you
-have installed and introduces the overall use case for the tutorial.
-
-# Storage Plugins Overview
-
-The Hadoop cluster within the sandbox is set up with MapR-FS, MapR-DB, and
-Hive, which all serve as data sources for Drill in this tutorial. Before you
-can run queries against these data sources, Drill requires each one to be
-configured as a storage plugin. A storage plugin defines the abstraction on
-the data sources for Drill to talk to and provides interfaces to read/write
-and get metadata from the data source. Each storage plugin also exposes
-optimization rules for Drill to leverage for efficient query execution.
-
-Take a look at the pre-configured storage plugins by opening the Drill Web UI.
-
-Feel free to skip this section and jump directly to the queries: [Lesson 1:
-Learn About the Data
-Set](/confluence/display/DRILL/Lesson+1%3A+Learn+About+the+Data+Set)
-
-  * Launch a web browser and go to: `http://<IP address of the sandbox>:8047`
-  * Go to the Storage tab
-  * Open the configured storage plugins one at a time by clicking Update
-
-You will see the following plugins configured.
-
-## maprdb
-
-A storage plugin configuration for MapR-DB in the sandbox. Drill uses a single
-storage plugin for connecting to HBase as well as MapR-DB, which is an
-enterprise-grade in-Hadoop NoSQL database. See the [Apache Drill
-Wiki](https://cwiki.apache.org/confluence/display/DRILL/Registering+HBase) for
-information on how to configure Drill to query HBase.
-
-    {
-      "type" : "hbase",
-      "enabled" : true,
-      "config" : {
-        "hbase.table.namespace.mappings" : "*:/tables"
-      }
-     }
-
-## dfs
-
-This is a storage plugin configuration for the MapR file system (MapR-FS) in
-the sandbox. The connection attribute indicates the type of distributed file
-system: in this case, MapR-FS. Drill can work with any distributed file system,
-including HDFS, S3, and so on.
-
-The configuration also includes a set of workspaces; each one represents a
-location in MapR-FS:
-
-  * root: access to the root file system location
-  * clicks: access to nested JSON log data
-  * logs: access to flat (non-nested) JSON log data in the logs directory and its subdirectories
-  * views: a workspace for creating views
-
-A workspace in Drill is a location where users can easily access a specific
-set of data and collaborate with each other by sharing artifacts. Users can
-create as many workspaces as they need within Drill.
-
-Each workspace can also be configured as “writable” or not, which indicates
-whether users can write data to this location and defines the storage format
-in which the data will be written (parquet, csv, json). These attributes
-become relevant when you explore Drill SQL commands, especially CREATE TABLE
-AS (CTAS) and CREATE VIEW.
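-
-For example, a CTAS statement run in a writable workspace writes its result in
-the workspace's configured storage format. A sketch (the table name
-`clicks_summary` is hypothetical):
-
-    use dfs.clicks;
-    create table clicks_summary as
-    select t.user_info.device as device, count(*) as clicks
-    from `clicks/clicks.json` t
-    group by t.user_info.device;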
-
-Drill can query files and directories directly and can detect the file formats
-based on the file extension or the first few bits of data within the file.
-However, Drill requires additional format information for some file types, such
-as delimiters for text files; these are specified in the “formats” section below.
-
-    {
-      "type": "file",
-      "enabled": true,
-      "connection": "maprfs:///",
-      "workspaces": {
-        "root": {
-          "location": "/mapr/demo.mapr.com/data",
-          "writable": false,
-          "storageformat": null
-        },
-        "clicks": {
-          "location": "/mapr/demo.mapr.com/data/nested",
-          "writable": true,
-          "storageformat": "parquet"
-        },
-        "logs": {
-          "location": "/mapr/demo.mapr.com/data/flat",
-          "writable": true,
-          "storageformat": "parquet"
-        },
-        "views": {
-          "location": "/mapr/demo.mapr.com/data/views",
-          "writable": true,
-          "storageformat": "parquet"
-        }
-      },
-      "formats": {
-        "psv": {
-          "type": "text",
-          "extensions": [
-            "tbl"
-          ],
-          "delimiter": "|"
-        },
-        "csv": {
-          "type": "text",
-          "extensions": [
-            "csv"
-          ],
-          "delimiter": ","
-        },
-        "tsv": {
-          "type": "text",
-          "extensions": [
-            "tsv"
-          ],
-          "delimiter": "\t"
-        },
-        "parquet": {
-          "type": "parquet"
-        },
-        "json": {
-          "type": "json"
-        }
-      }
-    }
-
-## hive
-
-A storage plugin configuration for a Hive data warehouse within the sandbox.
-Drill connects to the Hive metastore by using the configured metastore thrift
-URI. Metadata for Hive tables is automatically available for users to query.
-
-     {
-      "type": "hive",
-      "enabled": true,
-      "configProps": {
-        "hive.metastore.uris": "thrift://localhost:9083",
-        "hive.metastore.sasl.enabled": "false"
-      }
-    }
-
-# Client Application Interfaces
-
-Drill also provides application interfaces that client tools can use to
-connect to and query Drill. The interfaces include the following.
-
-### ODBC/JDBC drivers
-
-Drill provides ODBC/JDBC drivers to connect from BI tools such as Tableau,
-MicroStrategy, SQuirreL, and Jaspersoft; refer to [Using ODBC to Access Apache
-Drill from BI Tools](http://doc.mapr.com/display/MapR/Using+ODBC+to+Access+Apache+Drill+from+BI+Tools)
-and [Using JDBC to Access Apache
-Drill](http://doc.mapr.com/display/MapR/Using+JDBC+to+Access+Apache+Drill+from+SQuirreL) to learn
-more.
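-
-For reference, the JDBC connection URL for the embedded mode used throughout
-this tutorial is the one visible in the sqlline prompts:
-
-    jdbc:drill:zk=local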
-
-### SQLLine
-
-SQLLine is a JDBC application that comes packaged with Drill. In order to
-start working with it, you can use the command line on the demo cluster to log
-in as root, then enter `sqlline`. Use `mapr` as the login password. For
-example:
-
-    $ ssh root@localhost -p 2222
-    Password:
-    Last login: Mon Sep 15 13:46:08 2014 from 10.250.0.28
-    Welcome to your Mapr Demo virtual machine.
-    [root@maprdemo ~]# sqlline
-    sqlline version 1.1.6
-    0: jdbc:drill:>
-
-### Drill Web UI
-
-The Drill Web UI is a simple user interface for configuring and managing Apache
-Drill. This UI can be launched from any of the nodes in the Drill cluster. The
-configuration for Drill includes setting up storage plugins that represent the
-data sources on which Drill performs queries. The sandbox comes with storage
-plugins configured for the Hive, HBase, MapR file system, and local file
-system.
-
-Users and developers can get the necessary information for tuning and
-performing diagnostics on queries, such as the list of queries executed in a
-session and detailed query plan profiles for each.
-
-Detailed configuration and management of Drill is out of scope for this
-tutorial.
-
-The Web interface for Apache Drill also provides a query UI where users can
-submit queries to Drill and observe results. Here is a screen shot of the Web
-UI for Apache Drill:
-
-![](../../img/DrillWebUI.png)  
-
-### REST API
-
-Drill provides a simple REST API for the users to query data as well as manage
-the system. The Web UI leverages the REST API to talk to Drill.
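-
-As a sketch of the kind of call the Web UI makes (the `/query.json` endpoint
-and payload shape are assumptions; check the REST documentation for your Drill
-version):
-
-    curl -X POST -H "Content-Type: application/json" \
-      -d '{"queryType": "SQL", "query": "select * from sys.version"}' \
-      http://localhost:8047/query.json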
-
-This tutorial introduces sample queries that you can run by using SQLLine.
-Note that you can run the queries just as easily by launching the Drill Web
-UI. No additional installation or configuration is required.
-
-# Use Case Overview
-
-As you run through the queries in this tutorial, put yourself in the shoes of
-an analyst with basic SQL skills. Let us imagine that the analyst works for an
-emerging online retail business that accepts purchases from its customers
-through both an established web-based interface and a new mobile application.
-
-The analyst is data-driven and operates mostly on the business side with
-little or no interaction with the IT department. Recently the central IT team
-has implemented a Hadoop-based infrastructure to reduce the cost of the legacy
-database system, and most of the DWH/ETL workload is now handled by
-Hadoop/Hive. The master customer profile information and product catalog are
-managed in MapR-DB, which is a NoSQL database. The IT team has also started
-acquiring clickstream data that comes from web and mobile applications. This
-data is stored in Hadoop as JSON files.
-
-The analyst has a number of data sources that he could explore, but exploring
-them in isolation is not the way to go. There are some potentially very
-interesting analytical connections between these data sources. For example, it
-would be good to be able to analyze customer records in the clickstream data
-and tie them to the master customer data in MapR-DB.
-
-The analyst decides to explore various data sources and he chooses to do that
-by using Apache Drill. Think about the flexibility and analytic capability of
-Apache Drill as you work through the tutorial.
-
-# What's Next
-
-Start running queries by going to [Lesson 1: Learn About the Data
-Set](/confluence/display/DRILL/Lesson+1%3A+Learn+About+the+Data+Set).
-

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/tutorial/003-lesson1.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/tutorial/003-lesson1.md b/_docs/drill-docs/tutorial/003-lesson1.md
deleted file mode 100644
index 8f3465f..0000000
--- a/_docs/drill-docs/tutorial/003-lesson1.md
+++ /dev/null
@@ -1,423 +0,0 @@
----
-title: "Lession 1: Learn about the Data Set"
-parent: "Apache Drill Tutorial"
----
-## Goal
-
-This lesson is simply about discovering what data is available, in what
-format, using simple SQL SELECT statements. Drill is capable of analyzing data
-without prior knowledge or definition of its schema. This means that you can
-start querying data immediately (and even as it changes), regardless of its
-format.
-
-The data set for the tutorial consists of:
-
-  * Transactional data: stored as a Hive table
-
-  * Product catalog and master customer data: stored as MapR-DB tables
-
-  * Clickstream and logs data: stored in the MapR file system as JSON files
-
-## Queries in This Lesson
-
-This lesson consists of select * queries on each data source.
-
-## Before You Begin
-
-### Start sqlline
-
-If sqlline is not already started, use a Terminal or Command window to log
-into the demo VM as root, then enter `sqlline`:
-
-    $ ssh root@10.250.0.6
-    Password:
-    Last login: Mon Sep 15 13:46:08 2014 from 10.250.0.28
-    Welcome to your Mapr Demo virtual machine.
-    [root@maprdemo ~]# sqlline
-    sqlline version 1.1.6
-    0: jdbc:drill:>
-
-You can run queries from this prompt to complete the tutorial. To exit from
-`sqlline`, type:
-
-    0: jdbc:drill:> !quit
-
-Note that though this tutorial demonstrates the queries using SQLLine, you can
-also execute queries using the Drill Web UI.
-
-### List the available workspaces and databases:
-
-    0: jdbc:drill:> show databases;
-    +-------------+
-    | SCHEMA_NAME |
-    +-------------+
-    | hive.default |
-    | dfs.default |
-    | dfs.logs    |
-    | dfs.root    |
-    | dfs.views   |
-    | dfs.clicks  |
-    | dfs.data    |
-    | dfs.tmp     |
-    | sys         |
-    | maprdb      |
-    | cp.default  |
-    | INFORMATION_SCHEMA |
-    +-------------+
-    12 rows selected
-
-Note that this command exposes all the metadata available from the storage
-plugins configured with Drill as a set of schemas. This includes the Hive and
-MapR-DB databases as well as the workspaces configured in the file system. As
-you run queries in the tutorial, you will switch among these schemas by
-submitting the USE command. This behavior resembles the ability to use
-different database schemas (namespaces) in a relational database system.
-
-## Query Hive Tables
-
-The orders table is a six-column Hive table defined in the Hive metastore.
-This is a Hive external table pointing to the data stored in flat files on the
-MapR file system. The orders table contains 122,000 rows.
-
-### Set the schema to hive:
-
-    0: jdbc:drill:> use hive;
-    +------------+------------+
-    | ok | summary |
-    +------------+------------+
-    | true | Default schema changed to 'hive' |
-    +------------+------------+
-
-You will run the USE command throughout this tutorial. The USE command sets
-the schema for the current session.
-
-### Describe the table:
-
-You can use the DESCRIBE command to show the columns and data types for a Hive
-table:
-
-    0: jdbc:drill:> describe orders;
-    +-------------+------------+-------------+
-    | COLUMN_NAME | DATA_TYPE  | IS_NULLABLE |
-    +-------------+------------+-------------+
-    | order_id    | BIGINT     | YES         |
-    | month       | VARCHAR    | YES         |
-    | cust_id     | BIGINT     | YES         |
-    | state       | VARCHAR    | YES         |
-    | prod_id     | BIGINT     | YES         |
-    | order_total | INTEGER    | YES         |
-    +-------------+------------+-------------+
-
-The DESCRIBE command returns complete schema information for Hive tables based
-on the metadata available in the Hive metastore.
-
-### Select 5 rows from the orders table:
-
-    0: jdbc:drill:> select * from orders limit 5;
-    +------------+------------+------------+------------+------------+-------------+
-    | order_id | month | cust_id | state | prod_id | order_total |
-    +------------+------------+------------+------------+------------+-------------+
-    | 67212 | June | 10001 | ca | 909 | 13 |
-    | 70302 | June | 10004 | ga | 420 | 11 |
-    | 69090 | June | 10011 | fl | 44 | 76 |
-    | 68834 | June | 10012 | ar | 0 | 81 |
-    | 71220 | June | 10018 | az | 411 | 24 |
-    +------------+------------+------------+------------+------------+-------------+
-
-Because orders is a Hive table, you can query the data in the same way that
-you would query the columns in a relational database table. Note the use of
-the standard LIMIT clause, which limits the result set to the specified number
-of rows. You can use LIMIT with or without an ORDER BY clause.
-
-Drill provides seamless integration with Hive by allowing queries on Hive
-tables defined in the metastore with no extra configuration. Note that Hive is
-not a prerequisite for Drill, but simply serves as a storage plugin or data
-source for Drill. Drill also lets users query all Hive file formats (including
-custom serdes). Additionally, any UDFs defined in Hive can be leveraged as
-part of Drill queries.
-
-Because Drill has its own low-latency SQL query execution engine, you can
-query Hive tables with high performance and support for interactive and ad-hoc
-data exploration.
-
-## Query MapR-DB and HBase Tables
-
-The customers and products tables are MapR-DB tables. MapR-DB is an enterprise
-in-Hadoop NoSQL database. It exposes the HBase API to support application
-development. Every MapR-DB table has a row_key, in addition to one or more
-column families. Each column family contains one or more specific columns. The
-row_key value is a primary key that uniquely identifies each row.
-
-Drill allows direct queries on MapR-DB and HBase tables. Unlike other SQL on
-Hadoop options, Drill requires no overlay schema definitions in Hive to work
-with this data. Think about a MapR-DB or HBase table with thousands of
-columns, such as a time-series database, and the pain of having to manage
-duplicate schemas for it in Hive!
-
-### Products Table
-
-The products table has two column families.
-
-Column Family|Columns
--------------|-------
-  details    | name
-             | category
-  pricing    | price
-
-The products table contains 965 rows.
-
-### Customers Table
-
-The Customers table has three column families.
-
-Column Family|Columns  
--------------|-------  
-  address    | state  
-  loyalty    | agg_rev
-             | membership  
-  personal   | age
-             | gender  
-  
-The customers table contains 993 rows.
-
-### Set the workspace to maprdb:
-
-    0: jdbc:drill:> use maprdb;
-    +------------+------------+
-    | ok | summary |
-    +------------+------------+
-    | true | Default schema changed to 'maprdb' |
-    +------------+------------+
-
-### Describe the tables:
-
-    0: jdbc:drill:> describe customers;
-    +-------------+------------+-------------+
-    | COLUMN_NAME | DATA_TYPE  | IS_NULLABLE |
-    +-------------+------------+-------------+
-    | row_key     | ANY        | NO          |
-    | address     | (VARCHAR(1), ANY) MAP | NO          |
-    | loyalty     | (VARCHAR(1), ANY) MAP | NO          |
-    | personal    | (VARCHAR(1), ANY) MAP | NO          |
-    +-------------+------------+-------------+
- 
-    0: jdbc:drill:> describe products;
-    +-------------+------------+-------------+
-    | COLUMN_NAME | DATA_TYPE  | IS_NULLABLE |
-    +-------------+------------+-------------+
-    | row_key     | ANY        | NO          |
-    | details     | (VARCHAR(1), ANY) MAP | NO          |
-    | pricing     | (VARCHAR(1), ANY) MAP | NO          |
-    +-------------+------------+-------------+
-
-Unlike the Hive example, the DESCRIBE command does not return the full schema
-up to the column level. Wide-column NoSQL databases such as MapR-DB and HBase
-can be schema-less by design; every row has its own set of column name-value
-pairs in a given column family, and the column value can be of any data type,
-as determined by the application inserting the data.
-
-A “MAP” complex type in Drill represents this variable column name-value
-structure, and “ANY” represents the fact that the column value can be of any
-data type. Observe the row_key, which is also simply bytes and has the type
-ANY.
-
-### Select 5 rows from the products table:
-
-    0: jdbc:drill:> select * from products limit 5;
-    +------------+------------+------------+
-    | row_key | details | pricing |
-    +------------+------------+------------+
-    | [B@a1a3e25 | {"category":"bGFwdG9w","name":"IlNvbnkgbm90ZWJvb2si"} | {"price":"OTU5"} |
-    | [B@103a43af | {"category":"RW52ZWxvcGVz","name":"IzEwLTQgMS84IHggOSAxLzIgUHJlbWl1bSBEaWFnb25hbCBTZWFtIEVudmVsb3Blcw=="} | {"price":"MT |
-    | [B@61319e7b | {"category":"U3RvcmFnZSAmIE9yZ2FuaXphdGlvbg==","name":"MjQgQ2FwYWNpdHkgTWF4aSBEYXRhIEJpbmRlciBSYWNrc1BlYXJs"} | {"price" |
-    | [B@9bcf17 | {"category":"TGFiZWxz","name":"QXZlcnkgNDk4"} | {"price":"Mw=="} |
-    | [B@7538ef50 | {"category":"TGFiZWxz","name":"QXZlcnkgNDk="} | {"price":"Mw=="} |
-
-Given that Drill requires no up-front schema definitions indicating data
-types, the query returns the raw byte arrays for column values, just as they
-are stored in MapR-DB (or HBase). Observe that the column families (details
-and pricing) have the map data type and appear as JSON strings.
-
-In Lesson 2, you will use CAST functions to return typed data for each column.
-
-### Select 5 rows from the customers table:
-
-
-    0: jdbc:drill:> select * from customers limit 5;
-    +------------+------------+------------+------------+
-    | row_key | address | loyalty | personal |
-    +------------+------------+------------+------------+
-    | [B@284bae62 | {"state":"Imt5Ig=="} | {"agg_rev":"IjEwMDEtMzAwMCI=","membership":"ImJhc2ljIg=="} | {"age":"IjI2LTM1Ig==","gender":"Ik1B |
-    | [B@7ffa4523 | {"state":"ImNhIg=="} | {"agg_rev":"IjAtMTAwIg==","membership":"ImdvbGQi"} | {"age":"IjI2LTM1Ig==","gender":"IkZFTUFMRSI= |
-    | [B@7d13e79 | {"state":"Im9rIg=="} | {"agg_rev":"IjUwMS0xMDAwIg==","membership":"InNpbHZlciI="} | {"age":"IjI2LTM1Ig==","gender":"IkZFT |
-    | [B@3a5c7df1 | {"state":"ImtzIg=="} | {"agg_rev":"IjMwMDEtMTAwMDAwIg==","membership":"ImdvbGQi"} | {"age":"IjUxLTEwMCI=","gender":"IkZF |
-    | [B@e507726 | {"state":"Im5qIg=="} | {"agg_rev":"IjAtMTAwIg==","membership":"ImJhc2ljIg=="} | {"age":"IjIxLTI1Ig==","gender":"Ik1BTEUi" |
-    +------------+------------+------------+------------+
-
-Again the table returns byte data that needs to be cast to readable data
-types.
-
-## Query the File System
-
-Along with querying a data source with full schemas (such as Hive) and partial
-schemas (such as MapR-DB and HBase), Drill offers the unique capability to
-perform SQL queries directly on the file system. The file system could be a local
-file system, or a distributed file system such as MapR-FS, HDFS, or S3.
-
-In the context of Drill, a file or a directory is considered synonymous with a
-relational database “table.” Therefore, you can perform SQL operations
-directly on files and directories without the need for up-front schema
-definitions or schema management for any model changes. The schema is
-discovered on the fly based on the query. Drill supports queries on a variety
-of file formats including text, CSV, Parquet, and JSON in the 0.5 release.
-
-In this example, the clickstream data coming from the mobile/web applications
-is in JSON format. The JSON files have the following structure:
-
-    {"trans_id":31920,"date":"2014-04-26","time":"12:17:12","user_info":{"cust_id":22526,"device":"IOS5","state":"il"},"trans_info":{"prod_id":[174,2],"purch_flag":"false"}}
-    {"trans_id":31026,"date":"2014-04-20","time":"13:50:29","user_info":{"cust_id":16368,"device":"AOS4.2","state":"nc"},"trans_info":{"prod_id":[],"purch_flag":"false"}}
-    {"trans_id":33848,"date":"2014-04-10","time":"04:44:42","user_info":{"cust_id":21449,"device":"IOS6","state":"oh"},"trans_info":{"prod_id":[582],"purch_flag":"false"}}
-
-
-The clicks.json and clicks.campaign.json files contain metadata as part of the
-data itself (referred to as “self-describing” data). Also note that the data
-elements are complex, or nested. The initial queries below do not show how to
-unpack the nested data, but they show that easy access to the data requires no
-setup beyond the definition of a workspace.
-
-### Query nested clickstream data
-
-#### Set the workspace to dfs.clicks:
-
-     0: jdbc:drill:> use dfs.clicks;
-    +------------+------------+
-    | ok | summary |
-    +------------+------------+
-    | true | Default schema changed to 'dfs.clicks' |
-    +------------+------------+
-
-In this case, setting the workspace is a mechanism for making queries easier
-to write. When you specify a file system workspace, you can shorten references
-to files in the FROM clause of your queries. Instead of having to provide the
-complete path to a file, you can provide the path relative to a directory
-location specified in the workspace. For example:
-
-    "location": "/mapr/demo.mapr.com/data/nested"
-
-Any file or directory that you want to query in this path can be referenced
-relative to this path. The clicks directory referred to in the following query
-is directly below the nested directory.
-
-#### Select 2 rows from the clicks.json file:
-
-    0: jdbc:drill:> select * from `clicks/clicks.json` limit 2;
-    +------------+------------+------------+------------+------------+
-    |  trans_id  |    date    |    time    | user_info  | trans_info |
-    +------------+------------+------------+------------+------------+
-    | 31920      | 2014-04-26 | 12:17:12   | {"cust_id":22526,"device":"IOS5","state":"il"} | {"prod_id":[174,2],"purch_flag":"false"} |
-    | 31026      | 2014-04-20 | 13:50:29   | {"cust_id":16368,"device":"AOS4.2","state":"nc"} | {"prod_id":[],"purch_flag":"false"} |
-    +------------+------------+------------+------------+------------+
-    2 rows selected
-
-Note that the FROM clause reference points to a specific file. Drill expands
-the traditional concept of a “table reference” in a standard SQL FROM clause
-to refer to a file in a local or distributed file system.
-
-The only special requirement is the use of back ticks to enclose the file
-path. This is necessary whenever the file path contains Drill reserved words
-or characters.
-
-#### Select 2 rows from the campaign.json file:
-
-    0: jdbc:drill:> select * from `clicks/clicks.campaign.json` limit 2;
-    +------------+------------+------------+------------+------------+------------+
-    |  trans_id  |    date    |    time    | user_info  |  ad_info   | trans_info |
-    +------------+------------+------------+------------+------------+------------+
-    | 35232      | 2014-05-10 | 00:13:03   | {"cust_id":18520,"device":"AOS4.3","state":"tx"} | {"camp_id":"null"} | {"prod_id":[7,7],"purch_flag":"true"} |
-    | 31995      | 2014-05-22 | 16:06:38   | {"cust_id":17182,"device":"IOS6","state":"fl"} | {"camp_id":"null"} | {"prod_id":[],"purch_flag":"false"} |
-    +------------+------------+------------+------------+------------+------------+
-    2 rows selected
-
-Notice that with a select * query, any complex data types such as maps and
-arrays return as JSON strings. You will see how to unpack this data using
-various SQL functions and operators in the next lesson.
-
-## Query Logs Data
-
-Unlike the previous example where we performed queries against clicks data in
-one file, logs data is stored as partitioned directories on the file system.
-The logs directory has three subdirectories:
-
-  * 2012
-
-  * 2013
-
-  * 2014
-
-Each of these year directories fans out to a set of numbered month
-directories, and each month directory contains a JSON file with log records
-for that month. The total number of records in all log files is 48000.
-
-The files in the logs directory and its subdirectories are JSON files. There
-are many of these files, but you can use Drill to query them all as a single
-data source, or to query a subset of the files.
-
-#### Set the workspace to dfs.logs:
-
-     0: jdbc:drill:> use dfs.logs;
-    +------------+------------+
-    | ok | summary |
-    +------------+------------+
-    | true | Default schema changed to 'dfs.logs' |
-    +------------+------------+
-
-#### Select 2 rows from the logs directory:
-
-    0: jdbc:drill:> select * from logs limit 2;
-    +------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+----------+
-    | dir0 | dir1 | trans_id | date | time | cust_id | device | state | camp_id | keywords | prod_id | purch_fl |
-    +------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+----------+
-    | 2014 | 8 | 24181 | 08/02/2014 | 09:23:52 | 0 | IOS5 | il | 2 | wait | 128 | false |
-    | 2014 | 8 | 24195 | 08/02/2014 | 07:58:19 | 243 | IOS5 | mo | 6 | hmm | 107 | false |
-    +------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+------------+----------+
-
-Note that this is flat JSON data. The dfs.logs workspace location property
-points to a directory that contains the logs directory, making the FROM clause
-reference for this query very simple. You do not have to refer to the complete
-directory path on the file system.
-
-The column names dir0 and dir1 are special Drill variables that identify
-subdirectories below the logs directory. In Lesson 3, you will do more complex
-queries that leverage these dynamic variables.
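-
-For example (a sketch; output not shown), the directory variables can appear
-in a WHERE clause to restrict a query to specific subdirectories:
-
-    select count(*) from logs where dir0 = '2014' and dir1 = '8';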
-
-#### Find the total number of rows in the logs directory (all files):
-
-    0: jdbc:drill:> select count(*) from logs;
-    +------------+
-    | EXPR$0 |
-    +------------+
-    | 48000 |
-    +------------+
-
-This query traverses all of the files in the logs directory and its
-subdirectories to return the total number of rows in those files.
-
-# What's Next
-
-Go to [Lesson 2: Run Queries with ANSI
-SQL](/confluence/display/DRILL/Lesson+2%3A+Run+Queries+with+ANSI+SQL).
-
-
-

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/tutorial/004-lesson2.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/tutorial/004-lesson2.md b/_docs/drill-docs/tutorial/004-lesson2.md
deleted file mode 100644
index d9c68d5..0000000
--- a/_docs/drill-docs/tutorial/004-lesson2.md
+++ /dev/null
@@ -1,392 +0,0 @@
----
-title: "Lession 2: Run Queries with ANSI SQL"
-parent: "Apache Drill Tutorial"
----
-## Goal
-
-This lesson shows how to do some standard SQL analysis in Apache Drill: for
-example, summarizing data by using simple aggregate functions and connecting
-data sources by using joins. Note that Apache Drill provides ANSI SQL support,
-not a “SQL-like” interface.
-
-## Queries in This Lesson
-
-Now that you know what the data sources look like in their raw form, using
-select * queries, try running some simple but more useful queries on each data
-source. These queries demonstrate how Drill supports ANSI SQL constructs and
-also how you can combine data from different data sources in a single SELECT
-statement.
-
-  * Show an aggregate query on a single file or table. Use GROUP BY, WHERE, HAVING, and ORDER BY clauses.
-
-  * Perform joins between Hive, MapR-DB, and file system data sources.
-
-  * Use table and column aliases.
-
-  * Create a Drill view.
-
-## Aggregation
-
-
-### Set the schema to hive:
-
-    0: jdbc:drill:> use hive;
-    +------------+------------+
-    |     ok     |  summary   |
-    +------------+------------+
-    | true       | Default schema changed to 'hive' |
-    +------------+------------+
-    1 row selected
-
-### Return sales totals by month:
-
-    0: jdbc:drill:> select `month`, sum(order_total)
-    from orders group by `month` order by 2 desc;
-    +------------+------------+
-    | month | EXPR$1 |
-    +------------+------------+
-    | June | 950481 |
-    | May | 947796 |
-    | March | 836809 |
-    | April | 807291 |
-    | July | 757395 |
-    | October | 676236 |
-    | August | 572269 |
-    | February | 532901 |
-    | September | 373100 |
-    | January | 346536 |
-    +------------+------------+
-
-Drill supports SQL aggregate functions such as SUM, MAX, AVG, and MIN.
-Standard SQL clauses work in the same way in Drill queries as in relational
-database queries.
-
-Note that back ticks are required for the “month” column only because “month”
-is a reserved word in SQL.
-
-### Return the top 20 sales totals by month and state:
-
-    0: jdbc:drill:> select `month`, state, sum(order_total) as sales from orders group by `month`, state
-    order by 3 desc limit 20;
-    +------------+------------+------------+
-    |   month    |   state    |   sales    |
-    +------------+------------+------------+
-    | May        | ca         | 119586     |
-    | June       | ca         | 116322     |
-    | April      | ca         | 101363     |
-    | March      | ca         | 99540      |
-    | July       | ca         | 90285      |
-    | October    | ca         | 80090      |
-    | June       | tx         | 78363      |
-    | May        | tx         | 77247      |
-    | March      | tx         | 73815      |
-    | August     | ca         | 71255      |
-    | April      | tx         | 68385      |
-    | July       | tx         | 63858      |
-    | February   | ca         | 63527      |
-    | June       | fl         | 62199      |
-    | June       | ny         | 62052      |
-    | May        | fl         | 61651      |
-    | May        | ny         | 59369      |
-    | October    | tx         | 55076      |
-    | March      | fl         | 54867      |
-    | March      | ny         | 52101      |
-    +------------+------------+------------+
-    20 rows selected
-
-Note the alias for the result of the SUM function. Drill supports column
-aliases and table aliases.
-
-## HAVING Clause
-
-This query uses the HAVING clause to constrain an aggregate result.
-
-### Set the workspace to dfs.clicks
-
-    0: jdbc:drill:> use dfs.clicks;
-    +------------+------------+
-    |     ok     |  summary   |
-    +------------+------------+
-    | true       | Default schema changed to 'dfs.clicks' |
-    +------------+------------+
-    1 row selected
-
-### Return total number of clicks for devices that indicate high click-throughs:
-
-    0: jdbc:drill:> select t.user_info.device, count(*) from `clicks/clicks.json` t 
-    group by t.user_info.device
-    having count(*) > 1000;
-    +------------+------------+
-    |   EXPR$0   |   EXPR$1   |
-    +------------+------------+
-    | IOS5       | 11814      |
-    | AOS4.2     | 5986       |
-    | IOS6       | 4464       |
-    | IOS7       | 3135       |
-    | AOS4.4     | 1562       |
-    | AOS4.3     | 3039       |
-    +------------+------------+
-
-The aggregate is a count of the records for each different mobile device in
-the clickstream data. Only the activity for the devices that registered more
-than 1000 transactions qualify for the result set.
-
-## UNION Operator
-
-Use the same workspace as before (dfs.clicks).
-
-### Combine clicks activity from before and after the marketing campaign
-
-    0: jdbc:drill:> select t.trans_id transaction, t.user_info.cust_id customer from `clicks/clicks.campaign.json` t 
-    union all 
-    select u.trans_id, u.user_info.cust_id  from `clicks/clicks.json` u limit 5;
-    +-------------+------------+
-    | transaction |  customer  |
-    +-------------+------------+
-    | 35232       | 18520      |
-    | 31995       | 17182      |
-    | 35760       | 18228      |
-    | 37090       | 17015      |
-    | 37838       | 18737      |
-    +-------------+------------+
-
-This UNION ALL query returns rows that exist in two files (and includes any
-duplicate rows from those files): `clicks.campaign.json` and `clicks.json`.
-
-## Subqueries
-
-### Set the workspace to hive:
-
-    0: jdbc:drill:> use hive;
-    +------------+------------+
-    |     ok     |  summary   |
-    +------------+------------+
-    | true       | Default schema changed to 'hive' |
-    +------------+------------+
-    
-### Compare order totals across states:
-
-    0: jdbc:drill:> select o1.cust_id, sum(o1.order_total) as ny_sales,
-    (select sum(o2.order_total) from hive.orders o2
-    where o1.cust_id=o2.cust_id and state='ca') as ca_sales
-    from hive.orders o1 where o1.state='ny' group by o1.cust_id
-    order by cust_id limit 20;
-    +------------+------------+------------+
-    |  cust_id   |  ny_sales  |  ca_sales  |
-    +------------+------------+------------+
-    | 1001       | 72         | 47         |
-    | 1002       | 108        | 198        |
-    | 1003       | 83         | null       |
-    | 1004       | 86         | 210        |
-    | 1005       | 168        | 153        |
-    | 1006       | 29         | 326        |
-    | 1008       | 105        | 168        |
-    | 1009       | 443        | 127        |
-    | 1010       | 75         | 18         |
-    | 1012       | 110        | null       |
-    | 1013       | 19         | null       |
-    | 1014       | 106        | 162        |
-    | 1015       | 220        | 153        |
-    | 1016       | 85         | 159        |
-    | 1017       | 82         | 56         |
-    | 1019       | 37         | 196        |
-    | 1020       | 193        | 165        |
-    | 1022       | 124        | null       |
-    | 1023       | 166        | 149        |
-    | 1024       | 233        | null       |
-    +------------+------------+------------+
-
-This example demonstrates Drill support for correlated subqueries. This query
-uses a subquery in the select list and correlates the result of the subquery
-with the outer query, using the cust_id column reference. The subquery returns
-the sum of order totals for California, and the outer query returns the
-equivalent sum, for the same cust_id, for New York.
-
-The result set is sorted by the cust_id and presents the sales totals side by
-side for easy comparison. Null values indicate customer IDs that did not
-register any sales in that state.
-
-## CAST Function
-
-### Use the maprdb workspace:
-
-    0: jdbc:drill:> use maprdb;
-    +------------+------------+
-    |     ok     |  summary   |
-    +------------+------------+
-    | true       | Default schema changed to 'maprdb' |
-    +------------+------------+
-    1 row selected
-
-### Return customer data with appropriate data types
-
-    0: jdbc:drill:> select cast(row_key as int) as cust_id, cast(t.personal.name as varchar(20)) as name, 
-    cast(t.personal.gender as varchar(10)) as gender, cast(t.personal.age as varchar(10)) as age,
-    cast(t.address.state as varchar(4)) as state, cast(t.loyalty.agg_rev as dec(7,2)) as agg_rev, 
-    cast(t.loyalty.membership as varchar(20)) as membership
-    from customers t limit 5;
-    +------------+------------+------------+------------+------------+------------+------------+
-    |  cust_id   |    name    |   gender   |    age     |   state    |  agg_rev   | membership |
-    +------------+------------+------------+------------+------------+------------+------------+
-    | 10001      | "Corrine Mecham" | "FEMALE"   | "15-20"    | "va"       | 197.00     | "silver"   |
-    | 10005      | "Brittany Park" | "MALE"     | "26-35"    | "in"       | 230.00     | "silver"   |
-    | 10006      | "Rose Lokey" | "MALE"     | "26-35"    | "ca"       | 250.00     | "silver"   |
-    | 10007      | "James Fowler" | "FEMALE"   | "51-100"   | "me"       | 263.00     | "silver"   |
-    | 10010      | "Guillermo Koehler" | "OTHER"    | "51-100"   | "mn"       | 202.00     | "silver"   |
-    +------------+------------+------------+------------+------------+------------+------------+
-    5 rows selected
-
-Note the following features of this query:
-
-  * The CAST function is required for every column in the table. This function returns the MapR-DB/HBase binary data as readable integers and strings. Alternatively, you can use the CONVERT_TO/CONVERT_FROM functions to decode the columns; CONVERT_TO and CONVERT_FROM are more efficient than CAST in most cases (see the sketch after this list).
-  * The row_key column functions as the primary key of the table (a customer ID in this case).
-  * The table alias t is required; otherwise the column family names would be parsed as table names and the query would return an error.
-
-### Remove the quotes from the strings:
-
-You can use the regexp_replace function to remove the quotes around the
-strings in the query results. For example, to return a state name va instead
-of “va”:
-
-    0: jdbc:drill:> select cast(row_key as int), regexp_replace(cast(t.address.state as varchar(10)),'"','')
-    from customers t limit 1;
-    +------------+------------+
-    |   EXPR$0   |   EXPR$1   |
-    +------------+------------+
-    | 10001      | va         |
-    +------------+------------+
-    1 row selected
-
-## CREATE VIEW Command
-
-    0: jdbc:drill:> use dfs.views;
-    +------------+------------+
-    | ok | summary |
-    +------------+------------+
-    | true | Default schema changed to 'dfs.views' |
-    +------------+------------+
-
-### Use a mutable workspace:
-
-A mutable (or writable) workspace is a workspace that is enabled for “write”
-operations. This attribute is part of the storage plugin configuration. You
-can create Drill views and tables in mutable workspaces.
-
-### Create a view on a MapR-DB table
-
-    0: jdbc:drill:> create or replace view custview as select cast(row_key as int) as cust_id,
-    cast(t.personal.name as varchar(20)) as name, 
-    cast(t.personal.gender as varchar(10)) as gender, 
-    cast(t.personal.age as varchar(10)) as age, 
-    cast(t.address.state as varchar(4)) as state,
-    cast(t.loyalty.agg_rev as dec(7,2)) as agg_rev,
-    cast(t.loyalty.membership as varchar(20)) as membership
-    from maprdb.customers t;
-    +------------+------------+
-    |     ok     |  summary   |
-    +------------+------------+
-    | true       | View 'custview' replaced successfully in 'dfs.views' schema |
-    +------------+------------+
-    1 row selected
-
-Drill provides CREATE OR REPLACE VIEW syntax similar to relational databases
-to create views. Use the OR REPLACE option to make it easier to update the
-view later without having to remove it first. Note that the FROM clause in
-this example must refer to maprdb.customers. The MapR-DB tables are not
-directly visible to the dfs.views workspace.
-
-Unlike a traditional database where views typically are DBA/developer-driven
-operations, file system-based views in Drill are very lightweight. A view is
-simply a special file with a specific extension (.drill). You can store views
-even in your local file system or point to a specific workspace. You can
-specify any query against any Drill data source in the body of the CREATE VIEW
-statement.
-
-Drill provides a decentralized metadata model. Drill is able to query metadata
-defined in data sources such as Hive, HBase, and the file system. Drill also
-supports the creation of metadata in the file system.
-
-### Query data from the view:
-
-    0: jdbc:drill:> select * from custview limit 1;
-    +------------+------------+------------+------------+------------+------------+------------+
-    |  cust_id   |    name    |   gender   |    age     |   state    |  agg_rev   | membership |
-    +------------+------------+------------+------------+------------+------------+------------+
-    | 10001      | "Corrine Mecham" | "FEMALE"   | "15-20"    | "va"       | 197.00     | "silver"   |
-    +------------+------------+------------+------------+------------+------------+------------+
-
-Once users get an idea of what data is available by exploring it directly from
-the file system, views offer a way to expose that data to downstream tools
-such as Tableau and MicroStrategy for analysis and visualization. For these
-tools, a view appears simply as a “table” with selectable “columns” in it.
-
-## Query Across Data Sources
-
-Continue using dfs.views for this query.
-
-### Join the customers view and the orders table:
-
-    0: jdbc:drill:> select membership, sum(order_total) as sales from hive.orders, custview
-    where orders.cust_id=custview.cust_id
-    group by membership order by 2;
-    +------------+------------+
-    | membership |   sales    |
-    +------------+------------+
-    | "basic"    | 380665     |
-    | "silver"   | 708438     |
-    | "gold"     | 2787682    |
-    +------------+------------+
-    3 rows selected
-
-In this query, we are reading data from a MapR-DB table (represented by
-custview) and combining it with the order information in Hive. When running
-cross data source queries such as this, you need to use fully qualified
-table/view names. For example, the orders table is prefixed by “hive,” which
-is the storage plugin name registered with Drill. We are not using any prefix
-for “custview” because we explicitly switched to the dfs.views workspace where
-custview is stored.
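-
-For illustration, the same join with the view name fully qualified (so that
-it runs from any workspace) would look like this:
-
-    0: jdbc:drill:> select membership, sum(order_total) as sales from hive.orders, dfs.views.custview
-    where orders.cust_id=custview.cust_id
-    group by membership order by 2;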
-
-Note: If the results of any of your queries appear to be truncated because
-the rows are wide, set the maximum width of the display to 10000:
-
-    0: jdbc:drill:> !set maxwidth 10000
-
-Do not use a semicolon for this SET command; it is a sqlline command, not a
-SQL statement.
-
-### Join the customers, orders, and clickstream data:
-
-    0: jdbc:drill:> select custview.membership, sum(orders.order_total) as sales from hive.orders, custview,
-    dfs.`/mapr/demo.mapr.com/data/nested/clicks/clicks.json` c 
-    where orders.cust_id=custview.cust_id and orders.cust_id=c.user_info.cust_id 
-    group by custview.membership order by 2;
-    +------------+------------+
-    | membership |   sales    |
-    +------------+------------+
-    | "basic"    | 372866     |
-    | "silver"   | 728424     |
-    | "gold"     | 7050198    |
-    +------------+------------+
-    3 rows selected
-
-This three-way join selects from three different data sources in one query:
-
-  * hive.orders table
-  * custview (a view of the MapR-DB customers table)
-  * clicks.json file
-
-The join column for both sets of join conditions is the cust_id column. The
-views workspace is used for this query so that custview can be accessed. The
-hive.orders table is also visible to the query.
-
-However, note that the JSON file is not directly visible from the views
-workspace, so the query specifies the full path to the file:
-
-    dfs.`/mapr/demo.mapr.com/data/nested/clicks/clicks.json`
-
-
-# What's Next
-
-Go to [Lesson 3: Run Queries on Complex Data Types](/confluence/display/DRILL/
-Lesson+3%3A+Run+Queries+on+Complex+Data+Types). 
-
-
-

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/tutorial/005-lesson3.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/tutorial/005-lesson3.md b/_docs/drill-docs/tutorial/005-lesson3.md
deleted file mode 100644
index d9b362a..0000000
--- a/_docs/drill-docs/tutorial/005-lesson3.md
+++ /dev/null
@@ -1,379 +0,0 @@
----
-title: "Lession 3: Run Queries on Complex Data Types"
-parent: "Apache Drill Tutorial"
----
-## Goal
-
-This lesson focuses on queries that exercise functions and operators on self-
-describing data and complex data types. Drill offers intuitive SQL extensions
-to work with such data and offers high query performance with an architecture
-built from the ground up for complex data.
-
-## Queries in This Lesson
-
-Now that you have run ANSI SQL queries against different tables and files with
-relational data, you can try some examples including complex types.
-
-  * Access directories and subdirectories of files in a single SELECT statement.
-  * Demonstrate simple ways to access complex data in JSON files.
-  * Demonstrate the repeated_count function to aggregate values in an array.
-
-## Query Partitioned Directories
-
-You can use special variables in Drill to refer to subdirectories in your
-workspace path:
-
-  * dir0
-  * dir1
-  * …
-
-Note that these variables are dynamically determined based on the partitioning
-of the file system. No up-front definitions are required on what partitions
-exist. Here is a visual example of how this works:
-
-![example_query.png](../../img/example_query.png)
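-
-In plain text, the directory layout that these queries assume looks roughly
-like this (a sketch; each month directory holds that month's log files):
-
-    logs
-    ├── 2013
-    │   ├── 1
-    │   ├── 2
-    │   └── ...
-    └── 2014
-        ├── 1
-        └── ...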
-
-### Set workspace to dfs.logs:
-
-    0: jdbc:drill:> use dfs.logs;
-    +------------+---------------------------------------+
-    |     ok     |                summary                |
-    +------------+---------------------------------------+
-    | true       | Default schema changed to 'dfs.logs'  |
-    +------------+---------------------------------------+
-
-### Query logs data for a specific year:
-
-    0: jdbc:drill:> select * from logs where dir0='2013' limit 10;
-    +------+------+----------+------------+----------+---------+--------+-------+---------+----------+---------+------------+
-    | dir0 | dir1 | trans_id |    date    |   time   | cust_id | device | state | camp_id | keywords | prod_id | purch_flag |
-    +------+------+----------+------------+----------+---------+--------+-------+---------+----------+---------+------------+
-    | 2013 | 11   | 12119    | 11/09/2013 | 02:24:51 | 262     | IOS5   | ny    | 0       | chamber  | 198     | false      |
-    | 2013 | 11   | 12120    | 11/19/2013 | 09:37:43 | 0       | AOS4.4 | il    | 2       | outside  | 511     | false      |
-    | 2013 | 11   | 12134    | 11/10/2013 | 23:42:47 | 60343   | IOS5   | ma    | 4       | and      | 421     | false      |
-    | 2013 | 11   | 12135    | 11/16/2013 | 01:42:13 | 46762   | AOS4.3 | ca    | 4       | here's   | 349     | false      |
-    | 2013 | 11   | 12165    | 11/26/2013 | 21:58:09 | 41987   | AOS4.2 | mn    | 4       | he       | 271     | false      |
-    | 2013 | 11   | 12168    | 11/09/2013 | 23:41:48 | 8600    | IOS5   | in    | 6       | i        | 459     | false      |
-    | 2013 | 11   | 12196    | 11/20/2013 | 02:23:06 | 15603   | IOS5   | tn    | 1       | like     | 324     | false      |
-    | 2013 | 11   | 12203    | 11/25/2013 | 23:50:29 | 221     | IOS6   | tx    | 10      | if       | 323     | false      |
-    | 2013 | 11   | 12206    | 11/09/2013 | 23:53:01 | 2488    | AOS4.2 | tx    | 14      | unlike   | 296     | false      |
-    | 2013 | 11   | 12217    | 11/06/2013 | 23:51:56 | 0       | AOS4.2 | tx    | 9       | can't    | 54      | false      |
-    +------+------+----------+------------+----------+---------+--------+-------+---------+----------+---------+------------+
-
-
-This query constrains the scan to files inside the subdirectory named 2013.
-The variable dir0 refers to the first level down from logs, dir1 to the next
-level, and so on. So this query returned 10 of the rows for 2013; the rows
-shown happen to come from November (dir1 = 11).
-
-### Further constrain the results using multiple predicates in the query:
-
-This query returns a list of customer IDs for people who made a purchase via
-an IOS5 device in August 2013.
-
-    0: jdbc:drill:> select dir0 as yr, dir1 as mth, cust_id from logs
-    where dir0='2013' and dir1='8' and device='IOS5' and purch_flag='true'
-    order by `date`;
-    +------------+------------+------------+
-    |     yr     |    mth     |  cust_id   |
-    +------------+------------+------------+
-    | 2013       | 8          | 4          |
-    | 2013       | 8          | 521        |
-    | 2013       | 8          | 1          |
-    | 2013       | 8          | 2          |
-    | 2013       | 8          | 4          |
-    | 2013       | 8          | 549        |
-    | 2013       | 8          | 72827      |
-    | 2013       | 8          | 38127      |
-    ...
-
-### Return monthly counts per customer for a given year:
-
-    0: jdbc:drill:> select cust_id, dir1 month_no, count(*) month_count from logs
-    where dir0=2014 group by cust_id, dir1 order by cust_id, month_no limit 10;
-    +------------+------------+-------------+
-    |  cust_id   |  month_no  | month_count |
-    +------------+------------+-------------+
-    | 0          | 1          | 143         |
-    | 0          | 2          | 118         |
-    | 0          | 3          | 117         |
-    | 0          | 4          | 115         |
-    | 0          | 5          | 137         |
-    | 0          | 6          | 117         |
-    | 0          | 7          | 142         |
-    | 0          | 8          | 19          |
-    | 1          | 1          | 66          |
-    | 1          | 2          | 59          |
-    +------------+------------+-------------+
-    10 rows selected
-
-This query groups the aggregate function by customer ID and month for one
-year: 2014.
-
-## Query Complex Data
-
-Drill provides some specialized operators and functions that you can use to
-analyze nested data natively without transformation. If you are familiar with
-JavaScript notation, you will already know how some of these extensions work.
-
-### Set the workspace to dfs.clicks:
-
-    0: jdbc:drill:> use dfs.clicks;
-    +------------+-----------------------------------------+
-    |     ok     |                 summary                 |
-    +------------+-----------------------------------------+
-    | true       | Default schema changed to 'dfs.clicks'  |
-    +------------+-----------------------------------------+
-
-### Explore clickstream data:
-
-Note that the user_info and trans_info columns contain nested data: maps, and
-arrays nested within maps. The following queries show how to access this
-complex data.
-
-    0: jdbc:drill:> select * from `clicks/clicks.json` limit 5;
-    +------------+------------+------------+------------+------------+
-    | trans_id | date | time | user_info | trans_info |
-    +------------+------------+------------+------------+------------+
-    | 31920 | 2014-04-26 | 12:17:12 | {"cust_id":22526,"device":"IOS5","state":"il"} | {"prod_id":[174,2],"purch_flag":"false"} |
-    | 31026 | 2014-04-20 | 13:50:29 | {"cust_id":16368,"device":"AOS4.2","state":"nc"} | {"prod_id":[],"purch_flag":"false"} |
-    | 33848 | 2014-04-10 | 04:44:42 | {"cust_id":21449,"device":"IOS6","state":"oh"} | {"prod_id":[582],"purch_flag":"false"} |
-    | 32383 | 2014-04-18 | 06:27:47 | {"cust_id":20323,"device":"IOS5","state":"oh"} | {"prod_id":[710,47],"purch_flag":"false"} |
-    | 32359 | 2014-04-19 | 23:13:25 | {"cust_id":15360,"device":"IOS5","state":"ca"} | {"prod_id": [0,8,170,173,1,124,46,764,30,711,0,3,25],"purch_flag":"true"} |
-    +------------+------------+------------+------------+------------+
-
-
-### Unpack the user_info column:
-
-    0: jdbc:drill:> select t.user_info.cust_id as custid, t.user_info.device as device,
-    t.user_info.state as state
-    from `clicks/clicks.json` t limit 5;
-    +------------+------------+------------+
-    |   custid   |   device   |   state    |
-    +------------+------------+------------+
-    | 22526      | IOS5       | il         |
-    | 16368      | AOS4.2     | nc         |
-    | 21449      | IOS6       | oh         |
-    | 20323      | IOS5       | oh         |
-    | 15360      | IOS5       | ca         |
-    +------------+------------+------------+
-
-This query uses a simple table.column.column notation to extract nested column
-data. For example:
-
-    t.user_info.cust_id
-
-where `t` is the table alias provided in the query, `user_info` is a top-level
-column name, and `cust_id` is a nested column name.
-
-The table alias is required; otherwise column names such as `user_info` are
-parsed as table names by the SQL parser.
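-
-For example, the first query below fails because `user_info` is read as a
-table name, while the aliased form works:
-
-    -- fails: user_info is parsed as a table name
-    select user_info.cust_id from `clicks/clicks.json`;
-
-    -- works: t aliases the table, so user_info resolves as a column
-    select t.user_info.cust_id from `clicks/clicks.json` t;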
-
-### Unpack the trans_info column:
-
-    0: jdbc:drill:> select t.trans_info.prod_id as prodid, t.trans_info.purch_flag as
-    purchased
-    from `clicks/clicks.json` t limit 5;
-    +-------------------------------------------+------------+
-    |                  prodid                   | purchased  |
-    +-------------------------------------------+------------+
-    | [174,2]                                   | false      |
-    | []                                        | false      |
-    | [582]                                     | false      |
-    | [710,47]                                  | false      |
-    | [0,8,170,173,1,124,46,764,30,711,0,3,25]  | true       |
-    +-------------------------------------------+------------+
-    5 rows selected
-
-Note that this result reveals that the prod_id column contains an array of
-IDs (one or more product ID values per row, separated by commas). The next
-step shows how to access this kind of data.
-
-## Query Arrays
-
-Now use the [ n ] notation, where n is the position of the value in an array,
-starting from position 0 (not 1) for the first value. You can use this
-notation to write interesting queries against nested array data.
-
-For example:
-
-    trans_info.prod_id[0]
-
-refers to the first value in the nested prod_id column and
-
-    trans_info.prod_id[20]
-
-refers to the 21st value, assuming one exists.
-
-### Find the first product that is searched for in each transaction:
-
-    0: jdbc:drill:> select t.trans_id, t.trans_info.prod_id[0] from `clicks/clicks.json` t limit 5;
-    +------------+------------+
-    |  trans_id  |   EXPR$1   |
-    +------------+------------+
-    | 31920      | 174        |
-    | 31026      | null       |
-    | 33848      | 582        |
-    | 32383      | 710        |
-    | 32359      | 0          |
-    +------------+------------+
-    5 rows selected
-
-### For which transactions did customers search on at least 21 products?
-
-    0: jdbc:drill:> select t.trans_id, t.trans_info.prod_id[20]
-    from `clicks/clicks.json` t
-    where t.trans_info.prod_id[20] is not null
-    order by trans_id limit 5;
-    +------------+------------+
-    |  trans_id  |   EXPR$1   |
-    +------------+------------+
-    | 10328      | 0          |
-    | 10380      | 23         |
-    | 10701      | 1          |
-    | 11100      | 0          |
-    | 11219      | 46         |
-    +------------+------------+
-    5 rows selected
-
-This query returns transaction IDs and product IDs for records that contain a
-non-null product ID at the 21st position in the array.
-
-### Return clicks for a specific product range:
-
-    0: jdbc:drill:> select * from (select t.trans_id, t.trans_info.prod_id[0] as prodid,
-    t.trans_info.purch_flag as purchased
-    from `clicks/clicks.json` t) sq
-    where sq.prodid between 700 and 750 and sq.purchased='true'
-    order by sq.prodid;
-    +------------+------------+------------+
-    |  trans_id  |   prodid   | purchased  |
-    +------------+------------+------------+
-    | 21886      | 704        | true       |
-    | 20674      | 708        | true       |
-    | 22158      | 709        | true       |
-    | 34089      | 714        | true       |
-    | 22545      | 714        | true       |
-    | 37500      | 717        | true       |
-    | 36595      | 718        | true       |
-    ...
-
-This query assumes that there is some meaning to the array (that it is an
-ordered list of products purchased rather than a random list).
-
-## Perform Operations on Arrays
-
-### Rank successful click conversions and count product searches for each session:
-
-    0: jdbc:drill:> select t.trans_id, t.`date` as session_date, t.user_info.cust_id as
-    cust_id, t.user_info.device as device, repeated_count(t.trans_info.prod_id) as
-    prod_count, t.trans_info.purch_flag as purch_flag
-    from `clicks/clicks.json` t
-    where t.trans_info.purch_flag = 'true' order by prod_count desc;
-    +------------+--------------+------------+------------+------------+------------+
-    |  trans_id  | session_date |  cust_id   |   device   | prod_count | purch_flag |
-    +------------+--------------+------------+------------+------------+------------+
-    | 37426      | 2014-04-06   | 18709      | IOS5       | 34         | true       |
-    | 31589      | 2014-04-16   | 18576      | IOS6       | 31         | true       |
-    | 11600      | 2014-04-07   | 4260       | AOS4.2     | 28         | true       |
-    | 35074      | 2014-04-03   | 16697      | AOS4.3     | 27         | true       |
-    | 17192      | 2014-04-22   | 2501       | AOS4.2     | 26         | true       |
-    ...
-
-This query uses a Drill SQL extension, the repeated_count function, to get an
-aggregated count of the array values. The query returns the number of products
-searched for each session that converted into a purchase and ranks the counts
-in descending order. Only clicks that have resulted in a purchase are counted.
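-
-If you prefer standard aggregates, an equivalent way to get the per-session
-counts (a sketch that returns only the transaction ID and count) is to
-FLATTEN the array and count the flattened rows:
-
-    0: jdbc:drill:> select tt.trans_id, count(*) as prod_count
-    from (select t.trans_id, flatten(t.trans_info.prod_id) as prod
-    from `clicks/clicks.json` t
-    where t.trans_info.purch_flag = 'true') tt
-    group by tt.trans_id order by prod_count desc;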
-  
-## Store a Result Set in a Table for Reuse and Analysis
-
-Finally, run another correlated subquery that returns a fairly large result
-set. To facilitate additional analysis on this result set, you can easily and
-quickly create a Drill table from the results of the query.
-
-### Continue to use the dfs.clicks workspace
-
-    0: jdbc:drill:> use dfs.clicks;
-    +------------+-----------------------------------------+
-    |     ok     |                 summary                 |
-    +------------+-----------------------------------------+
-    | true       | Default schema changed to 'dfs.clicks'  |
-    +------------+-----------------------------------------+
-
-### Return product searches for high-value customers:
-
-    0: jdbc:drill:> select o.cust_id, o.order_total, t.trans_info.prod_id[0] as prod_id 
-    from hive.orders as o, `clicks/clicks.json` t 
-    where o.cust_id=t.user_info.cust_id 
-    and o.order_total > (select avg(inord.order_total) 
-    from hive.orders inord where inord.state = o.state);
-    +------------+-------------+------------+
-    |  cust_id   | order_total |  prod_id   |
-    +------------+-------------+------------+
-    ...
-    | 9650       | 69          | 16         |
-    | 9650       | 69          | 560        |
-    | 9650       | 69          | 959        |
-    | 9654       | 76          | 768        |
-    | 9656       | 76          | 32         |
-    | 9656       | 76          | 16         |
-    ...
-    +------------+-------------+------------+
-    106,281 rows selected
-
-This query returns a list of products that are being searched for by customers
-who have made transactions that are above the average in their states.
-
-### Materialize the result of the previous query:
-
-    0: jdbc:drill:> create table product_search as select o.cust_id, o.order_total, t.trans_info.prod_id[0] as prod_id
-    from hive.orders as o, `clicks/clicks.json` t 
-    where o.cust_id=t.user_info.cust_id and o.order_total > (select avg(inord.order_total) 
-    from hive.orders inord where inord.state = o.state);
-    +------------+---------------------------+
-    |  Fragment  | Number of records written |
-    +------------+---------------------------+
-    | 0_0        | 106281                    |
-    +------------+---------------------------+
-    1 row selected
-
-This example uses a CTAS statement to create a table based on a correlated
-subquery that you ran previously. This table contains all of the rows that the
-query returns (106,281) and stores them in the format specified by the storage
-plugin (Parquet format in this example). You can create tables that store data
-in CSV, Parquet, and JSON formats.
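-
-The output format for CTAS follows the store.format session option (Parquet
-by default in this environment). For example, to write JSON files instead,
-you could set:
-
-    0: jdbc:drill:> alter session set `store.format` = 'json';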
-
-### Query the new table to verify the row count:
-
-This example simply checks that the CTAS statement worked by verifying the
-number of rows in the table.
-
-    0: jdbc:drill:> select count(*) from product_search;
-    +------------+
-    |   EXPR$0   |
-    +------------+
-    | 106281     |
-    +------------+
-    1 row selected
-
-### Find the storage file for the table:
-
-    [root@maprdemo product_search]# cd /mapr/demo.mapr.com/data/nested/product_search
-    [root@maprdemo product_search]# ls -la
-    total 451
-    drwxr-xr-x. 2 mapr mapr      1 Sep 15 13:41 .
-    drwxr-xr-x. 4 root root      2 Sep 15 13:41 ..
-    -rwxr-xr-x. 1 mapr mapr 460715 Sep 15 13:41 0_0_0.parquet
-
-Note that the table is stored in a file called `0_0_0.parquet`. This file is
-stored in the location defined by the dfs.clicks workspace:
-
-    "location": "http://demo.mapr.com/data/nested)"
-
-with a subdirectory that has the same name as the table you created.
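-
-Because the table is just files under the workspace location, you can also
-query it by its full path (a quick sanity check using the sandbox path shown
-above):
-
-    0: jdbc:drill:> select count(*) from dfs.`/mapr/demo.mapr.com/data/nested/product_search`;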
-
-## What's Next
-
-Complete the tutorial with the [Summary](/confluence/display/DRILL/Summary).
-
-
-

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/tutorial/006-summary.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/tutorial/006-summary.md b/_docs/drill-docs/tutorial/006-summary.md
deleted file mode 100644
index f210766..0000000
--- a/_docs/drill-docs/tutorial/006-summary.md
+++ /dev/null
@@ -1,14 +0,0 @@
----
-title: "Summary"
-parent: "Apache Drill Tutorial"
----
-This tutorial introduced Apache Drill and its ability to run ANSI SQL queries
-against various data sources, including Hive tables, MapR-DB/HBase tables, and
-file system directories. The tutorial also showed how to work with and
-manipulate complex and multi-structured data commonly found in Hadoop/NoSQL
-systems.
-
-Now that you are familiar with different ways to access the sample data with
-Drill, you can try writing your own queries against your own data sources.
-Refer to the [Apache Drill documentation](https://cwiki.apache.org/confluence/
-display/DRILL/Apache+Drill+Wiki) for more information.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/tutorial/install-sandbox/001-install-mapr-vm.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/tutorial/install-sandbox/001-install-mapr-vm.md b/_docs/drill-docs/tutorial/install-sandbox/001-install-mapr-vm.md
deleted file mode 100644
index f3d8953..0000000
--- a/_docs/drill-docs/tutorial/install-sandbox/001-install-mapr-vm.md
+++ /dev/null
@@ -1,55 +0,0 @@
----
-title: "Installing the MapR Sandbox with Apache Drill on VMware Player/VMware Fusion"
-parent: "Installing the Apache Drill Sandbox"
----
-Complete the following steps to install the MapR Sandbox with Apache Drill on
-VMware Player or VMware Fusion:
-
-  1. Download the MapR Sandbox with Drill file to a directory on your machine:  
-<https://www.mapr.com/products/mapr-sandbox-hadoop/download-sandbox-drill>
-
-  2. Open the virtual machine player, and select the **Open a Virtual Machine** option.
-
-     **Tip for VMware Fusion:** If you are running VMware Fusion, select **Import**.
-
-![](../../../img/vmWelcome.png)
-
-  3. Navigate to the directory where you downloaded the MapR Sandbox with Apache Drill file, and select `MapR-Sandbox-For-Apache-Drill-4.0.1_VM.ova`.
-
-![](../../../img/vmShare.png)
-
-The Import Virtual Machine dialog appears.
-
-  4. Click **Import**. The virtual machine player imports the sandbox.
-
-![](../../../img/vmLibrary.png)
-
-  5. Select `MapR-Sandbox-For-Apache-Drill-4.0.1_VM`, and click **Play virtual machine**. It takes a few minutes for the MapR services to start.   
-After the MapR services start and installation completes, the following screen
-appears:
-
-![](../../../img/loginSandbox.png)
-
-Note the URL provided on the screen, which corresponds to the Web UI in
-Apache Drill.
-
-  6. Verify that a DNS entry was created on the host machine for the virtual machine. If not, create the entry.
-
-    * For Linux and Mac, create the entry in `/etc/hosts`.  
-
-    * For Windows, create the entry in the `%WINDIR%\system32\drivers\etc\hosts` file.  
-Example: `127.0.1.1 <vm_hostname>`
-
-  7. You can navigate to the URL provided to experience the Drill Web UI, or you can log in to the sandbox through the command line.
-
-    a. To navigate to the MapR Sandbox with Apache Drill, enter the provided URL in your browser's address bar.  
-
-    b. To log in to the virtual machine and access the command line, press Alt+F2 on Windows or Option+F5 on Mac. When prompted, enter `mapr` as the login and password.
-
-# What's Next
-
-After downloading and installing the sandbox, continue with the tutorial by
-[Getting to Know the Drill
-Setup](/confluence/display/DRILL/Getting+to+Know+the+Drill+Setup).
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/drill-docs/tutorial/install-sandbox/002-install-mapr-vb.md
----------------------------------------------------------------------
diff --git a/_docs/drill-docs/tutorial/install-sandbox/002-install-mapr-vb.md b/_docs/drill-docs/tutorial/install-sandbox/002-install-mapr-vb.md
deleted file mode 100644
index 9ff26d5..0000000
--- a/_docs/drill-docs/tutorial/install-sandbox/002-install-mapr-vb.md
+++ /dev/null
@@ -1,72 +0,0 @@
----
-title: "Installing the MapR Sandbox with Apache Drill on VirtualBox"
-parent: "Installing the Apache Drill Sandbox"
----
-The MapR Sandbox for Apache Drill on VirtualBox comes with NAT port forwarding
-enabled, which allows you to access the sandbox using localhost as hostname.
-
-Complete the following steps to install the MapR Sandbox with Apache Drill on
-VirtualBox:
-
-  1. Download the MapR Sandbox with Apache Drill file to a directory on your machine:   
-<https://www.mapr.com/products/mapr-sandbox-hadoop/download-sandbox-drill>
-
-  2. Open the virtual machine player.
-
-  3. Select **File > Import Appliance**. The Import Virtual Appliance dialog appears.
-  
-     ![](../../../img/vbImport.png)
-
-  4. Navigate to the directory where you downloaded the MapR Sandbox with Apache Drill and click **Next**. The Appliance Settings window appears.
-  
-     ![](../../../img/vbapplSettings.png)
-
-  5. Select the check box at the bottom of the screen: **Reinitialize the MAC address of all network cards**, then click **Import**. VirtualBox imports the sandbox.
-
-  6. When the import completes, select **File > Preferences**. The VirtualBox - Settings dialog appears.
-    
-     ![](../../../img/vbNetwork.png)
-
-  7. Select **Network**.
-
-     The correct setting depends on your network connectivity when you run the
-Sandbox. In general, if you are going to use a wired Ethernet connection,
-select **NAT Networks** and **vboxnet0**. If you are going to use a wireless
-network, select **Host-only Networks** and the **VirtualBox Host-Only Ethernet
-Adapter**. If no adapters appear, click the green **+** button to add the
-VirtualBox adapter.
-
-     ![](../../../img/vbMaprSetting.png)
-
-  8. Click **OK** to continue.
-
-  9. Click the **Settings** button. The MapR-Sandbox-For-Apache-Drill-0.6.0-r2-4.0.1 - Settings dialog appears.
-  
-     ![](../../../img/vbGenSettings.png)    
-
- 10. Click **OK** to continue.
-
- 11. Click **Start**. It takes a few minutes for the MapR services to start. After the MapR services start and installation completes, the following screen appears:
-
-      ![](../../../img/vbloginSandbox.png)
-
- 12. The client must be able to resolve the actual hostname of the Drill node(s) with the IP(s). Verify that a DNS entry was created on the client machine for the Drill node(s).  
-If a DNS entry does not exist, create the entry for the Drill node(s).
-
-    * For Windows, create the entry in the %WINDIR%\system32\drivers\etc\hosts file.
-
-    * For Linux and Mac, create the entry in /etc/hosts.  
-<drill-machine-IP> <drill-machine-hostname>  
-Example: `127.0.1.1 maprdemo`
-
- 13. You can navigate to the URL provided or to [localhost:8047](http://localhost:8047) to experience the Drill Web UI, or you can log in to the sandbox through the command line.
-
-    a. To navigate to the MapR Sandbox with Apache Drill, enter the provided URL in your browser's address bar.
-
-    b. To log in to the virtual machine and access the command line, press Alt+F2 on Windows or Option+F5 on Mac. When prompted, enter `mapr` as the login and password.
-
-# What's Next
-
-After downloading and installing the sandbox, continue with the tutorial by
-[Getting to Know the Drill
-Setup](/confluence/display/DRILL/Getting+to+Know+the+Drill+Setup).
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/img/58.png
----------------------------------------------------------------------
diff --git a/_docs/img/58.png b/_docs/img/58.png
new file mode 100644
index 0000000..b957927
Binary files /dev/null and b/_docs/img/58.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/img/BI_to_Drill_2.png
----------------------------------------------------------------------
diff --git a/_docs/img/BI_to_Drill_2.png b/_docs/img/BI_to_Drill_2.png
new file mode 100644
index 0000000..a7f32cd
Binary files /dev/null and b/_docs/img/BI_to_Drill_2.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/img/HbaseViewCreation0.png
----------------------------------------------------------------------
diff --git a/_docs/img/HbaseViewCreation0.png b/_docs/img/HbaseViewCreation0.png
new file mode 100644
index 0000000..0ae4465
Binary files /dev/null and b/_docs/img/HbaseViewCreation0.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/img/HbaseViewDSN.png
----------------------------------------------------------------------
diff --git a/_docs/img/HbaseViewDSN.png b/_docs/img/HbaseViewDSN.png
new file mode 100644
index 0000000..988e48b
Binary files /dev/null and b/_docs/img/HbaseViewDSN.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/img/Hbase_Browse.png
----------------------------------------------------------------------
diff --git a/_docs/img/Hbase_Browse.png b/_docs/img/Hbase_Browse.png
new file mode 100644
index 0000000..729e0f8
Binary files /dev/null and b/_docs/img/Hbase_Browse.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/img/Hive_DSN.png
----------------------------------------------------------------------
diff --git a/_docs/img/Hive_DSN.png b/_docs/img/Hive_DSN.png
new file mode 100644
index 0000000..be49d00
Binary files /dev/null and b/_docs/img/Hive_DSN.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/img/ODBC_CustomSQL.png
----------------------------------------------------------------------
diff --git a/_docs/img/ODBC_CustomSQL.png b/_docs/img/ODBC_CustomSQL.png
new file mode 100644
index 0000000..d2c7fb2
Binary files /dev/null and b/_docs/img/ODBC_CustomSQL.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/img/ODBC_HbasePreview2.png
----------------------------------------------------------------------
diff --git a/_docs/img/ODBC_HbasePreview2.png b/_docs/img/ODBC_HbasePreview2.png
new file mode 100644
index 0000000..948f268
Binary files /dev/null and b/_docs/img/ODBC_HbasePreview2.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/img/ODBC_HbaseView.png
----------------------------------------------------------------------
diff --git a/_docs/img/ODBC_HbaseView.png b/_docs/img/ODBC_HbaseView.png
new file mode 100644
index 0000000..be3bf4f
Binary files /dev/null and b/_docs/img/ODBC_HbaseView.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/img/ODBC_HiveConnection.png
----------------------------------------------------------------------
diff --git a/_docs/img/ODBC_HiveConnection.png b/_docs/img/ODBC_HiveConnection.png
new file mode 100644
index 0000000..a86d960
Binary files /dev/null and b/_docs/img/ODBC_HiveConnection.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/img/ODBC_to_Drillbit.png
----------------------------------------------------------------------
diff --git a/_docs/img/ODBC_to_Drillbit.png b/_docs/img/ODBC_to_Drillbit.png
new file mode 100644
index 0000000..7197d09
Binary files /dev/null and b/_docs/img/ODBC_to_Drillbit.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/img/ODBC_to_Quorum.png
----------------------------------------------------------------------
diff --git a/_docs/img/ODBC_to_Quorum.png b/_docs/img/ODBC_to_Quorum.png
new file mode 100644
index 0000000..bd77a28
Binary files /dev/null and b/_docs/img/ODBC_to_Quorum.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/img/Parquet_DSN.png
----------------------------------------------------------------------
diff --git a/_docs/img/Parquet_DSN.png b/_docs/img/Parquet_DSN.png
new file mode 100644
index 0000000..a76eb4e
Binary files /dev/null and b/_docs/img/Parquet_DSN.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/img/Parquet_Preview.png
----------------------------------------------------------------------
diff --git a/_docs/img/Parquet_Preview.png b/_docs/img/Parquet_Preview.png
new file mode 100644
index 0000000..121dff5
Binary files /dev/null and b/_docs/img/Parquet_Preview.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/img/RegionParquet_table.png
----------------------------------------------------------------------
diff --git a/_docs/img/RegionParquet_table.png b/_docs/img/RegionParquet_table.png
new file mode 100644
index 0000000..db914bb
Binary files /dev/null and b/_docs/img/RegionParquet_table.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/img/SelectHbaseView.png
----------------------------------------------------------------------
diff --git a/_docs/img/SelectHbaseView.png b/_docs/img/SelectHbaseView.png
new file mode 100644
index 0000000..a37b30e
Binary files /dev/null and b/_docs/img/SelectHbaseView.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/img/Untitled.png
----------------------------------------------------------------------
diff --git a/_docs/img/Untitled.png b/_docs/img/Untitled.png
new file mode 100644
index 0000000..7fea1e8
Binary files /dev/null and b/_docs/img/Untitled.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/img/VoterContributions_hbaseview.png
----------------------------------------------------------------------
diff --git a/_docs/img/VoterContributions_hbaseview.png b/_docs/img/VoterContributions_hbaseview.png
new file mode 100644
index 0000000..2c37df9
Binary files /dev/null and b/_docs/img/VoterContributions_hbaseview.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/img/ngram_plugin.png
----------------------------------------------------------------------
diff --git a/_docs/img/ngram_plugin.png b/_docs/img/ngram_plugin.png
new file mode 100644
index 0000000..c47148c
Binary files /dev/null and b/_docs/img/ngram_plugin.png differ