Posted to commits@drill.apache.org by br...@apache.org on 2015/02/26 01:31:08 UTC

[04/13] drill git commit: DRILL-2315: Confluence conversion plus fixes

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/interfaces/odbc-win/002-conf-odbc-win.md
----------------------------------------------------------------------
diff --git a/_docs/interfaces/odbc-win/002-conf-odbc-win.md b/_docs/interfaces/odbc-win/002-conf-odbc-win.md
new file mode 100644
index 0000000..636bd9f
--- /dev/null
+++ b/_docs/interfaces/odbc-win/002-conf-odbc-win.md
@@ -0,0 +1,143 @@
+---
+title: "Step 2. Configure ODBC Connections to Drill Data Sources"
+parent: "Using the MapR ODBC Driver on Windows"
+---
+Complete one of the following steps to create an ODBC connection to Drill data
+sources:
+
+  * Create a Data Source Name
+  * Create an ODBC Connection String
+
+**Prerequisite:** An Apache Drill installation must be available that is configured to access the data sources that you want to connect to.  For information about how to install Apache Drill, see [Install Drill](/drill/docs/install-drill). For information about configuring data sources, see the [Apache Drill documentation](/drill/docs).
+
+## Create a Data Source Name (DSN)
+
+Create a DSN that an application can use to connect to Drill data sources. If
+you want to create a DSN for a 32-bit application, you must use the 32-bit
+version of the ODBC Administrator to create the DSN.
+
+  1. To launch the ODBC Administrator, click **Start > All Programs > MapR Drill ODBC Driver 1.0 (32|64-bit) > (32|64-bit) ODBC Administrator**.  
+The ODBC Data Source Administrator window appears.
+
+     To launch the 32-bit version of the ODBC Administrator on a 64-bit machine, run:
+`C:\WINDOWS\SysWOW64\odbcad32.exe`.
+  2. Click the **System DSN** tab to create a system DSN or click the **User DSN** tab to create a user DSN. A system DSN is available for all users who log in to the machine. A user DSN is available to the user who creates the DSN.
+  3. Click **Add**.
+  4. Select **MapR Drill ODBC Driver** and click **Finish**.  
+     The _MapR Drill ODBC Driver DSN Setup_ window appears.
+  5. In the **Data Source Name** field, enter a name for the DSN.
+  6. Optionally, enter a description of the DSN in the **Description** field.
+  7. In the **Connection Type** section, select a connection type and enter the associated connection details:
+
+     <table style='table-layout:fixed;width:100%'><tbody><tr><th>Connection Type</th><th>Properties</th><th>Descriptions</th></tr><tr><td rowspan="2" valign="top" width="10%">ZooKeeper Quorum</td><td valign="top" style='width: 100px;'>Quorum</td><td valign="top" style='width: 400px;'>A comma-separated list of servers in a ZooKeeper cluster. For example, &lt;ip_zookeepernode1&gt;:5181,&lt;ip_zookeepernode2&gt;:5181,…</td></tr><tr><td valign="top">ClusterID</td><td valign="top">Name of the Drillbit cluster. The default is drillbits1. You may need to specify a different value if the cluster ID was changed in the drill-override.conf file.</td></tr><tr><td colspan="1" valign="top">Direct to Drillbit</td><td colspan="1" valign="top"> </td><td colspan="1" valign="top">Provide the IP address or host name of the Drill server and the port number that the Drill server is listening on. The port number defaults to 31010. You may need to specify a different value if the port number was changed in the drill-override.conf file.</td></tr></tbody></table>
+     For information on selecting the appropriate connection type, see [Connection
+Types](/drill/docs/step-2-configure-odbc-connections-to-drill-data-sources#connection-type).
+  8. In the **Default Schema** field, select the default schema that you want to connect to.
+     For more information about the schemas that appear in this list, see [Schema](/drill/docs/step-2-configure-odbc-connections-to-drill-data-sources#schema).
+  9. Optionally, perform one of the following operations:
+
+     <table ><tbody><tr><th >Option</th><th >Action</th></tr><tr><td valign="top">Update the configuration of the advanced properties.</td><td valign="top">Edit the default values in the <strong>Advanced Properties</strong> section. <br />For more information, see <a href="/drill/docs/advanced-properties/">Advanced Properties</a>.</td></tr><tr><td valign="top">Configure the types of events that you want the driver to log.</td><td valign="top">Click <strong>Logging Options</strong>. <br />For more information, see <a href="/drill/docs/step-2-configure-odbc-connections-to-drill-data-sources#logging-options">Logging Options</a>.</td></tr><tr><td valign="top">Create views or explore Drill sources.</td><td valign="top">Click <strong>Drill Explorer</strong>. <br />For more information, see <a href="/drill/docs/using-drill-explorer-to-browse-data-and-create-views">Using Drill Explorer to Browse Data and Create Views</a>.</td></tr></tbody></table>
+  10. Click **OK** to save the DSN.
+
+## Configuration Options
+
+### Connection Type
+
+ODBC can connect directly to a Drillbit or to a ZooKeeper Quorum. Select your
+connection type based on your environment and Drillbit configuration.
+
+The following table lists the appropriate connection type for each scenario:
+
+<table ><tbody><tr><th >Scenario</th><th >Connection Type</th></tr><tr><td valign="top">Drillbit is running in embedded mode.</td><td valign="top">Direct to Drillbit</td></tr><tr><td valign="top">Drillbit is registered with the ZooKeeper in a testing environment.</td><td valign="top">ZooKeeper Quorum or Direct to Drillbit</td></tr><tr><td valign="top">Drillbit is registered with the ZooKeeper in a production environment.</td><td valign="top">ZooKeeper Quorum</td></tr></tbody></table> 
+
+#### Connection to a ZooKeeper Quorum
+
+When you choose to connect to a ZooKeeper Quorum, the ODBC driver connects to
+the ZooKeeper Quorum to get a list of available Drillbits in the specified
+cluster. The ODBC driver then selects one of the Drillbits and submits the
+query to it. All Drillbits in the cluster process the query, and the Drillbit
+that received the query returns the query results.
+
+![ODBC to Quorum]({{ site.baseurl }}/docs/img/ODBC_to_Quorum.png)
+
+In a production environment, you should connect to a ZooKeeper Quorum for a
+more reliable connection. If one Drillbit is not available, another Drillbit
+that is registered with the ZooKeeper quorum can accept the query.
+
+#### Direct Connection to Drillbit
+
+When you choose to connect directly to a Drillbit, the ODBC driver connects to
+the Drillbit and submits a query. If you connect directly to a Drillbit that is
+not part of a cluster, the Drillbit that you connect to processes the query.
+If you connect directly to a Drillbit that is part of a cluster, all Drillbits
+in the cluster process the query. In either case, the Drillbit that the ODBC
+driver connected to returns the query results.
+
+![]({{ site.baseurl }}/docs/img/ODBC_to_Drillbit.png)
+
+### Catalog
+
+This value defaults to DRILL and cannot be changed.
+
+### Schema
+
+The Default Schema list contains the data sources that you have configured to
+use with Drill via the Drill Storage Plugin.
+
+Views that you create using the Drill Explorer do not appear under the schema
+associated with the data source type. Instead, the views can be accessed from
+the file-based schema that you selected when saving the view.
+
+### Advanced Properties
+
+The Advanced Properties field allows you to customize the DSN.  
+You can configure the values of the following advanced properties:
+
+<table><tbody><tr><th>Property Name</th><th>Default Value</th><th>Description</th></tr><tr><td valign="top">HandshakeTimeout</td><td valign="top">5</td><td valign="top">An integer value representing the number of seconds that the driver waits to establish a connection before aborting. When set to 0, the driver does not abort connection attempts.</td></tr><tr><td valign="top">QueryTimeout</td><td valign="top">180</td><td valign="top">An integer value representing the number of seconds for the driver to wait before automatically stopping a query. When set to 0, the driver does not stop queries automatically.</td></tr><tr><td valign="top">TimestampTZDisplayTimezone</td><td valign="top">local</td><td valign="top">A string value that defines how the timestamp with timezone is displayed:<ul><li><strong>local</strong>—Timestamps appear in the time zone of the user.</li><li><strong>utc</strong>—Timestamps appear in Coordinated Universal Time (UTC).</li></ul></td></tr><tr><td valign="top">ExcludedSchemas</td><td valign="top">sys, INFORMATION_SCHEMA</td><td valign="top">A list of schemas that should not appear in client applications such as Drill Explorer, Tableau, and Excel. Separate schemas in the list using a comma (,).</td></tr></tbody></table>
+Separate each advanced property using a semicolon.
+
+For example, the following Advanced Properties string excludes the schemas
+named test and abc, sets the handshake and query timeouts to 30 seconds, and
+sets the time zone to Coordinated Universal Time (UTC):  
+`HandshakeTimeout=30;QueryTimeout=30;TimestampTZDisplayTimezone=utc;ExcludedSchemas=test,abc`
+
+### Logging Options
+
+Configure logging to troubleshoot issues. To configure logging, click
+**Logging Options** in the DSN setup dialog, and then set a log level
+and a log path.
+
+If logging is enabled, the MapR Drill ODBC driver logs events in the following
+log files in the log path that you configure:
+
+<table ><tbody><tr><th >Log File</th><th >Description</th></tr><tr><td valign="top">driver.log</td><td valign="top">A log of driver events.</td></tr><tr><td valign="top">drillclient.log</td><td valign="top">A log of the Drill client events.</td></tr></tbody></table> 
+
+#### Logging Levels
+
+Each logging level provides a different level of detail in the log files. The
+following log levels are available:
+
+<table ><tbody><tr><th >Logging Level</th><th >Description</th></tr><tr><td valign="top">OFF</td><td valign="top">Disables all logging.</td></tr><tr><td valign="top">FATAL</td><td valign="top">Logs severe error events that may cause the driver to stop running.</td></tr><tr><td valign="top">ERROR</td><td valign="top">Logs error events that may allow the driver to continue running.</td></tr><tr><td valign="top">WARNING</td><td valign="top">Logs events about potentially harmful situations.</td></tr><tr><td valign="top">INFO</td><td valign="top">Logs high-level events about driver processes.</td></tr><tr><td valign="top">DEBUG</td><td valign="top">Logs detailed events that may help to debug issues.</td></tr><tr><td colspan="1" valign="top">TRACE</td><td colspan="1" valign="top">Logs finer-grained events than the DEBUG level.</td></tr></tbody></table>
+
+## Create an ODBC Connection String
+
+If you want to connect to a Drill data source from an application that does
+not require a DSN, you can use an ODBC connection string.  
+The following table describes the properties that you can use in the
+connection string:
+
+<table><tbody><tr><th>Property</th><th>Description</th></tr><tr><td valign="top">AdvancedProperties</td><td valign="top">Separate advanced properties using a semicolon (;), and then surround all advanced properties in a connection string using braces { and }. For more information, see <a href="/drill/docs/step-2-configure-odbc-connections-to-drill-data-sources#advanced-properties">Advanced Properties</a>.</td></tr><tr><td valign="top">Catalog</td><td valign="top">The name of the catalog, under which all of the schemas are organized. The catalog name is DRILL.</td></tr><tr><td valign="top">ConnectionType</td><td valign="top">One of the following values:<br />• Direct—Connect to a Drill server using the Host and Port properties in the connection string.<br />• ZooKeeper—Connect to a ZooKeeper cluster using the ZKQuorum and ZKClusterID properties in the connection string.<br />For more information, see <a href="/drill/docs/step-2-configure-odbc-connections-to-drill-data-sources#connection-type">Connection Type</a>.</td></tr><tr><td valign="top">DRIVER</td><td valign="top">The name of the installed driver: MapR Drill ODBC Driver<br />(Required)</td></tr><tr><td valign="top">Host</td><td valign="top">If the ConnectionType property is set to Direct, indicate the IP address or hostname of the Drillbit server.</td></tr><tr><td valign="top">Port</td><td valign="top">If the ConnectionType property is set to Direct, indicate the port on which the Drillbit server is listening.</td></tr><tr><td valign="top">Schema</td><td valign="top">The name of the database schema to use when a schema is not explicitly specified in a query.<br />Note: Queries on other schemas can still be issued by explicitly specifying the schema in the query.</td></tr><tr><td valign="top">ZKClusterID</td><td valign="top">If the ConnectionType property is set to ZooKeeper, then use ZKClusterID to indicate the name of the Drillbit cluster to use.</td></tr><tr><td valign="top">ZKQuorum</td><td valign="top">If the ConnectionType property is set to ZooKeeper, then use ZKQuorum to indicate the server(s) in your ZooKeeper cluster. Separate multiple servers using a comma (,).</td></tr></tbody></table>
+
+#### Connection String Examples
+
+The following is an example connection string for the Direct connection type:  
+
+        DRIVER=MapR Drill ODBC Driver;AdvancedProperties={HandshakeTimeout=0;QueryTimeout=0;TimestampTZDisplayTimezone=utc;ExcludedSchemas=sys,INFORMATION_SCHEMA;};Catalog=DRILL;Schema=hivestg;ConnectionType=Direct;Host=192.168.202.147;Port=31010
+
+The following is an example connection string for the ZooKeeper connection
+type:  
+
+        DRIVER=MapR Drill ODBC Driver;AdvancedProperties={HandshakeTimeout=0;QueryTimeout=0;TimestampTZDisplayTimezone=utc;ExcludedSchemas=sys,INFORMATION_SCHEMA;};Catalog=DRILL;Schema=;ConnectionType=ZooKeeper;ZKQuorum=192.168.39.43:5181;ZKClusterID=drillbits1
+
+#### What's Next? Go to [Step 3. Connect to Drill Data Sources from a BI Tool](/drill/docs/step-3-connect-to-drill-data-sources-from-a-bi-tool).
+

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/interfaces/odbc-win/003-connect-odbc-win.md
----------------------------------------------------------------------
diff --git a/_docs/interfaces/odbc-win/003-connect-odbc-win.md b/_docs/interfaces/odbc-win/003-connect-odbc-win.md
new file mode 100644
index 0000000..0d4cb8a
--- /dev/null
+++ b/_docs/interfaces/odbc-win/003-connect-odbc-win.md
@@ -0,0 +1,23 @@
+---
+title: "Step 3. Connect to Drill Data Sources from a BI Tool"
+parent: "Using the MapR ODBC Driver on Windows"
+---
+After you create the ODBC DSN, you can use ODBC to directly connect to data
+that is defined by a schema, such as Hive, and data that is self-describing.
+Examples of self-describing data include HBase, Parquet, JSON, CSV, and TSV.
+
+In some cases, you may want to use Drill Explorer to explore that data or to
+create a view before you connect to the data from a BI tool. For more
+information about Drill Explorer, see [Using Drill Explorer to Browse Data and
+Create Views](/drill/docs/using-drill-explorer-to-browse-data-and-create-views).
+
+In an ODBC-compliant BI tool, use the ODBC DSN to create an ODBC connection
+with one of the methods applicable to the data source type:
+
+<table><tbody><tr><th>Data Source Type</th><th>ODBC Connection Method</th></tr><tr><td valign="top">Hive</td><td valign="top">Connect to a table.<br />Connect to the table using custom SQL.<br />Use Drill Explorer to create a view. Then use ODBC to connect to the view as if it were a table.</td></tr><tr><td valign="top">HBase<br />Parquet<br />JSON<br />CSV<br />TSV</td><td valign="top">Use Drill Explorer to create a view. Then use ODBC to connect to the view as if it were a table.<br />Connect to the data using custom SQL.</td></tr></tbody></table>
+  
+For examples of how to connect to Drill data sources from a BI tool, see
+[Tableau Examples](/drill/docs/tableau-examples).
+
+**Note:** The default schema that you configure in the DSN may or may not carry over to an application’s data source connections. You may need to re-select the schema.
+

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/interfaces/odbc-win/004-tableau-examples.md
----------------------------------------------------------------------
diff --git a/_docs/interfaces/odbc-win/004-tableau-examples.md b/_docs/interfaces/odbc-win/004-tableau-examples.md
new file mode 100644
index 0000000..d45f3f3
--- /dev/null
+++ b/_docs/interfaces/odbc-win/004-tableau-examples.md
@@ -0,0 +1,245 @@
+---
+title: "Tableau Examples"
+parent: "Using the MapR ODBC Driver on Windows"
+---
+You can generate reports in Tableau using ODBC connections to Drill data
+sources. Each example in this section takes you through the steps to create a
+DSN to a Drill data source and then access the data in Tableau 8.1.
+
+This section includes the following examples:
+
+  * Connecting to a Hive table
+  * Using a view to connect to HBase table data
+  * Using custom SQL to connect to data in a Parquet file
+
+The steps and results of these examples assume pre-configured schemas and
+source data. You configure schemas as storage plugin instances on the Storage
+tab of the [Drill Web UI](/drill/docs/getting-to-know-the-drill-sandbox#storage-plugins-overview).
+
+## Example: Connect to a Hive Table in Tableau
+
+To access Hive tables in Tableau 8.1, connect to the Hive schema using a DSN
+and then visualize the data in Tableau.  
+**Note:** This example assumes that there is a schema named hive.default which contains a table named student_hive. 
+
+### Step 1. Create a DSN to a Hive Table
+
+In this step, we will create a DSN that accesses a Hive table.
+
+  1. To launch the ODBC Administrator, click **Start > All Programs > MapR Drill ODBC Driver 1.0 (32|64-bit) > (32|64-bit) ODBC Administrator**.
+     The _ODBC Data Source Administrator_ window appears.
+  2. On the **System DSN** tab, click **Add**.
+  3. Select **MapR Drill ODBC Driver** and click **Finish**.  
+     The _MapR Drill ODBC Driver DSN Setup_ window appears.
+  4. Enter a name for the data source.
+  5. Specify the connection type based on your requirements. The connection type provides the DSN access to Drill data sources.  
+     In this example, we are connecting to a ZooKeeper Quorum.
+  6. In the **Schema** field, select the Hive schema.
+     In this example, the Hive schema is named hive.default.
+     ![]({{ site.baseurl }}/docs/img/Hive_DSN.png)
+  7. Click **OK** to create the DSN and return to the ODBC Data Source Administrator window.
+  8. Click **OK** to close the ODBC Data Source Administrator.
+
+### Step 2. Connect to Hive Tables in Tableau
+
+Now, we can connect to Hive tables.
+
+  1. In Tableau, click **Data > Connect to Data**.
+  2. In the _On a server_ section, click **Other Databases (ODBC)**.  
+     The _Generic ODBC Connection_ dialog appears.
+  3. In the _Connect Using_ section, select the DSN that connects to the Hive table.   
+     -or-  
+     To create a connection without an existing DSN, select the **Driver** option,
+     select the **MapR Drill ODBC Driver** from the list and click **Connect**. Then,
+     configure the connection to the Hive table and click **OK**.
+  4. In the **Schema** field, select the Hive schema.  
+     In this example, the Hive schema is named hive.default.
+  5. In the _Table_ section, verify that **Single Table** is selected and then click the Search icon.  
+     A list of tables appears.
+  6. Select the table from the list and click **Select**.   
+     In this example, the table name is student_hive.
+  7. Click **OK** to complete the connection.  
+     ![]({{ site.baseurl }}/docs/img/ODBC_HiveConnection.png)
+  8. In the _Data Connection_ dialog, click **Connect Live**.
+
+### Step 3. Visualize the Data in Tableau
+
+Once you connect to the data, the columns appear in the Data window. To
+visualize the data, drag fields from the Data window to the workspace view.
+
+For example, you can visualize the data in this way:
+
+![]({{ site.baseurl }}/docs/img/student_hive.png)
+
+## Example: Connect to Self-Describing Data in Tableau
+
+You can connect to self-describing data in Tableau in the following ways:
+
+  1. Use Drill Explorer to explore the self-describing data sources, create a Drill view, and then use ODBC to access the view in Tableau as if it were a table. 
+  2. Use Tableau’s Custom SQL to query the self-describing data directly. 
+
+### Option 1. Using a View to Connect to Self-Describing Data
+
+The following example describes how to create a view of an HBase table and
+connect to that view in Tableau 8.1. You can also use these steps to access
+data for other sources such as Hive, Parquet, JSON, TSV, and CSV.
+
+**Note:** This example assumes that there is a schema named hbase that contains a table named s_voters. It also assumes that there is a schema named dfs.default that points to a writable location.
+
+#### Step 1. Create a View and a DSN
+
+In this step, we will use the ODBC Administrator to access the Drill Explorer
+where we can create a view of an HBase table. Then, we will use the ODBC
+Administrator to create a DSN that connects to the view.
+
+  1. To launch the ODBC Administrator, click **Start > All Programs > MapR Drill ODBC Driver 1.0 (32|64-bit) > (32|64-bit) ODBC Administrator**.  
+     The _ODBC Data Source Administrator_ window appears.
+  2. On the System DSN tab, click **Add**.
+  3. Select **MapR Drill ODBC Driver** and click **Finish**.
+     The _MapR Drill ODBC Driver DSN Setup_ window appears.
+  4. Specify the Connection Type based on your requirements.
+     The connection type provides the DSN access to a Drillbit. For more
+     information, see [Connection Type](/drill/docs/step-2-configure-odbc-connections-to-drill-data-sources#connection-type).
+  5. Click **Drill Explorer** to start exploring the data.
+     The Drill Explorer dialog appears. You can use the Browse tab to visually
+explore the metadata and data available from Drill data sources. Advanced
+users can use the SQL tab to type SQL manually to explore the data and save the
+SQL query as a view.
+  6. Select the schema that you want to create a view for.
+      ![]({{ site.baseurl }}/docs/img/Hbase_Browse.png)        
+     Drill Explorer displays the metadata and column families for the selected
+HBase table.
+  7. To create a view of the HBase table, click the **SQL** tab.  
+     By default, the View Definition SQL field contains: `SELECT * FROM <schema>.<table>`.
+  8. To create the view, enter SQL in the _View Definition SQL_ section and then click **Preview** to verify that the results are as expected.   
+      ![]({{ site.baseurl }}/docs/img/ODBC_HbasePreview2.png)
+     In this example, the following SQL was entered:
+       
+        SELECT cast(row_key as integer) voter_id,
+        convert_from(voter.onecf.name, 'UTF8') name,
+        cast(voter.twocf.age as integer) age,
+        cast(voter.twocf.registration as varchar(20)) registration,
+        cast(voter.threecf.contributions as decimal(6,2)) contributions,
+        cast(voter.threecf.voterzone as integer) voterzone,
+        cast(voter.fourcf.create_date as timestamp) create_time
+        FROM hbase.voter
+
+     HBase does not contain type information, so you need to cast the data in Drill
+Explorer. For information about SQL query support, see the [SQL
+Reference](/drill/docs/sql-reference).
+  9. To save the view, click **Create As**.
+  10. Specify the schema where you want to save the view, enter a name for the view, and click **Save**.  
+
+       ![]({{ site.baseurl }}/docs/img/HbaseViewCreation0.png)
+
+  11. Close the Drill Explorer to return to the _MapR Drill ODBC Driver DSN Setup_ window.  
+      Now that we have created the view, we can create a DSN that can access the
+view.
+  12. Enter a data source name and select the schema where you saved the view.  
+      In this example, we saved the view to dfs.default.        
+       ![]({{ site.baseurl }}/docs/img/HbaseViewDSN.png)
+  13. Click **OK** to create the DSN and return to the _ODBC Data Source Administrator_ window.
+  14. Click **OK** to close the ODBC Data Source Administrator.
+
+#### Step 2. Connect to the View from Tableau
+
+Now, we can connect to the view in Tableau.
+
+  1. In Tableau, click **Data > Connect to Data**.
+  2. In the _On a server_ section, click **Other Databases (ODBC)**.  
+     The _Generic ODBC Connection_ dialog appears.
+  3. In the _Connect Using_ section, select the DSN that connects to the schema that contains the view that you created.   
+     -or-  
+     To create a connection without an existing DSN, select the **Driver** option, select the **MapR Drill ODBC Driver** from the list and click **Connect**. Then, configure the connection using the steps in step 1 and click **OK**. In this example, we created SQLView-DrillDataSource to access the view.
+  4. In the **Schema** field, select the schema that contains the views that you created in Drill Explorer.  
+     In this example, we saved the view to the dfs_default schema.
+  5. In the _Table_ section, verify that **Single Table** is selected and then click the Search icon.  
+     A list of views appears.
+  6. Select the view from the list and click **Select**.   
+     In this example, we need to select hbase_s_voter.  
+      ![]({{ site.baseurl }}/docs/img/SelectHbaseView.png)
+  7. Click **OK** to complete the connection.   
+      ![]({{ site.baseurl }}/docs/img/ODBC_HbaseView.png)
+  8. In the _Data Connection_ dialog, click **Connect Live**.
+
+#### Step 3. Visualize the Data in Tableau
+
+Once you connect to the data in Tableau, the columns appear in the Data
+window. To visualize the data, drag fields from the Data window to the
+workspace view.
+
+For example, you can visualize the data in this way:
+
+![]({{ site.baseurl }}/docs/img/VoterContributions_hbaseview.png)
+
+### Option 2. Using Custom SQL to Access Self-Describing Data
+
+The following example describes how to use custom SQL to connect to a Parquet
+file and then visualize the data in Tableau 8.1. You can use the same steps to
+access data from other sources such as Hive, HBase, JSON, TSV, and CSV.
+
+**Note:** This example assumes that there is a schema named dfs.default, which contains a Parquet file named region.parquet. 
+
+#### Step 1. Create a DSN to the Parquet File and Preview the Data
+
+In this step, we will create a DSN that accesses files on the DFS. We will
+also use Drill Explorer to preview the SQL that we want to use to connect to
+the data in Tableau.
+
+  1. To launch the ODBC Administrator, click **Start > All Programs > MapR Drill ODBC Driver 1.0 (32|64-bit) > (32|64-bit) ODBC Administrator**.  
+     The _ODBC Data Source Administrator_ window appears.
+  2. On the **System DSN** tab, click **Add**.
+  3. Select **MapR Drill ODBC Driver** and click **Finish**.  
+     The _MapR Drill ODBC Driver DSN Setup_ window appears.
+  4. Enter a data source name.
+  5. Specify the connection type based on your requirements. See [Connection Type](/drill/docs/step-2-configure-odbc-connections-to-drill-data-sources#connection-type) for more information.  
+     The connection type provides the DSN access to a Drillbit.  
+     In this example, we will connect to a ZooKeeper Quorum.
+  6. In the _Schema_ section, select the schema associated with the data source that contains the Parquet file that you want to access. Then, click **OK**.  
+     In this example, the Parquet file is available in the dfs.default schema.  
+      ![]({{ site.baseurl }}/docs/img/Parquet_DSN.png)  
+     You can use this DSN to access multiple files from the same schema.  
+     In this example, we plan to use the Custom SQL option to connect to data in Tableau. You can use Drill Explorer to preview the results of custom SQL before you enter the SQL in Tableau.
+  7. If you want to preview the results of a query, click **Drill Explorer**.
+    1. On the **Browse** tab, navigate to the file that you want. 
+    2. Click the **SQL** tab.  
+       The SQL tab will include a default query to the file you selected on the Browse tab. You can use the SQL tab to preview the results of various queries until you achieve the expected result.
+    3. Enter the query that you want to preview and then click **Preview**.  
+       ![]({{ site.baseurl }}/docs/img/Parquet_Preview.png)  
+       You can copy this query to a file so that you can use it in Tableau.
+    4. Close the Drill Explorer window. 
+  8. Click **OK** to create the DSN and return to the _ODBC Data Source Administrator_ window.
+  9. Click **OK** to close the ODBC Data Source Administrator.
+
+#### Step 2. Connect to a Parquet File in Tableau Using Custom SQL
+
+Now, we can create a connection to the Parquet file using the custom SQL.
+
+  1. In Tableau, click **Data > Connect to Data**.
+  2. In the _On a server_ section, click **Other Databases (ODBC)**.  
+     The _Generic ODBC Connection_ dialog appears.
+  3. In the _Connect Using_ section, select the DSN that connects to the data source.  
+     In this example, Files-DrillDataSources was selected.
+  4. In the _Schema_ section, select the schema associated with the data source.  
+     In this example, dfs.default was selected.
+  5. In the _Table_ section, select **Custom SQL**.
+  6. Enter the SQL query.  
+     In this example, the following SQL query was entered: 
+     
+         SELECT CAST(R_NAME as varchar(20)) Country,
+         CAST(R_COMMENT as varchar(200)) Comments, R_RegionKey
+         FROM `dfs`.`default`.`./opt/mapr/drill/drill-1.0.0.BETA1/sample-data/region.parquet`  
+
+     Note: The path to the file depends on its location in your file system.
+
+  7. Click **OK** to complete the connection.  
+     ![]({{ site.baseurl }}/docs/img/ODBC_CustomSQL.png)
+  8. In the _Data Connection_ dialog, click **Connect Live**.
+
+#### Step 3. Visualize the Data in Tableau
+
+Once you connect to the data, the fields appear in the Data window. To
+visualize the data, drag fields from the Data window to the workspace view.
+
+For example, you can visualize the data in this way:
+![]({{ site.baseurl }}/docs/img/RegionParquet_table.png)
+

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/interfaces/odbc-win/005-browse-view.md
----------------------------------------------------------------------
diff --git a/_docs/interfaces/odbc-win/005-browse-view.md b/_docs/interfaces/odbc-win/005-browse-view.md
new file mode 100644
index 0000000..98bb511
--- /dev/null
+++ b/_docs/interfaces/odbc-win/005-browse-view.md
@@ -0,0 +1,49 @@
+---
+title: "Using Drill Explorer to Browse Data and Create Views"
+parent: "Using the MapR ODBC Driver on Windows"
+---
+Drill Explorer is a simple user interface embedded in the ODBC DSN setup
+dialog. Drill Explorer enables users to understand the metadata and data before
+visualizing the data in a BI tool. Use Drill Explorer to browse Drill data
+sources, preview the results of a SQL query, and create a view that you can
+query.
+
+The Browse tab of Drill Explorer allows you to view metadata for each schema
+that you can access with Drill. The SQL tab allows you to preview the results
+of custom queries and save the results as a view.
+
+**To Browse Data:**
+
+  1. To launch the ODBC Administrator, click **Start > All Programs > MapR Drill ODBC Driver 1.0 (32|64-bit) > (32|64-bit) ODBC Administrator**.
+  2. Click the **User DSN** tab or the **System DSN** tab and then select the DSN that corresponds to the Drill data source that you want to explore.
+  3. Click **Configure**.  
+     The _MapR Drill ODBC Driver DSN Setup_ dialog appears.
+  4. Click **Drill Explorer**.
+  5. In the **Schemas** section on the **Browse** tab, navigate to the data source that you want to explore.
+
+**To Create a View**:
+
+  1. To launch the ODBC Administrator, click **Start > All Programs > MapR Drill ODBC Driver 1.0 (32|64-bit) > (32|64-bit) ODBC Administrator**.
+  2. Click the **User DSN** tab or the **System DSN** tab and then select the DSN that corresponds to the Drill data source that you want to explore.
+  3. Click **Configure**.  
+     The _MapR Drill ODBC Driver DSN Setup_ dialog appears.
+  4. Click **Drill Explorer**.
+  5. In the **Schemas** section on the **Browse** tab, navigate to the data source that you want to create a view for.  
+     After you select a data source, the metadata and data display on the Browse tab, and the SQL that is used to access the data displays on the SQL tab.
+  6. Click the **SQL** tab.
+  7. In the **View Definition SQL** field, enter the SQL query that you want to create a view for.
+  8. Click **Preview**.   
+      If the results are not as expected, you can edit the SQL query and click
+**Preview** again.
+  9. Click **Create As**.  
+     The _Create As_ dialog displays.
+  10. In the **Schema** field, select the schema where you want to save the view.
+      As of 0.4.0, you can only save views to file-based schemas.
+  11. In the **View Name** field, enter a descriptive name for the view.
+      As of 0.4.0, do not include spaces in the view name.
+  12. Click **Save**.   
+      The status and any error message associated with the view creation display in
+the _Create As_ dialog. When a view saves successfully, the Save button changes
+to a Close button.
+  13. Click **Close**.
+

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/manage/001-conf.md
----------------------------------------------------------------------
diff --git a/_docs/manage/001-conf.md b/_docs/manage/001-conf.md
new file mode 100644
index 0000000..b67a340
--- /dev/null
+++ b/_docs/manage/001-conf.md
@@ -0,0 +1,14 @@
+---
+title: "Configuration Options"
+parent: "Manage Drill"
+---
+Drill provides several configuration options, described in the subsequent
+subsections, that you can enable, disable, or modify. Modifying certain
+configuration options can impact Drill’s performance. Many of Drill's
+configuration options reside in the `drill-env.sh` and `drill-override.conf`
+files. Drill stores these files in the `/conf` directory. Drill sources
+`/etc/drill/conf` if it exists; otherwise, Drill sources the local
+`<drill_installation_directory>/conf` directory.
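+
+For example, listing a typical installation’s `conf` directory shows these
+files together (an illustrative listing; exact contents vary by version):
+
+    $ ls <drill_installation_directory>/conf
+    drill-env.sh  drill-override.conf  logback.xml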
+
+
+  
+

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/manage/002-start-stop.md
----------------------------------------------------------------------
diff --git a/_docs/manage/002-start-stop.md b/_docs/manage/002-start-stop.md
new file mode 100644
index 0000000..76a76f4
--- /dev/null
+++ b/_docs/manage/002-start-stop.md
@@ -0,0 +1,45 @@
+---
+title: "Starting/Stopping Drill"
+parent: "Manage Drill"
+---
+How you start Drill depends on the installation method you followed. If you
+installed Drill in embedded mode, invoking SQLLine automatically starts a
+Drillbit locally. If you installed Drill in distributed mode on one or
+multiple nodes in a cluster, you must start the Drillbit service and then
+invoke SQLLine. Once SQLLine starts, you can issue queries to Drill.
+
+## Starting a Drillbit
+
+If you installed Drill in embedded mode, you do not need to start the
+Drillbit.
+
+To start a Drillbit, navigate to the Drill installation directory, and issue
+the following command:
+
+`bin/drillbit.sh start`
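+
+The `drillbit.sh` script also accepts other lifecycle commands. The following
+sketch assumes the standard `start|stop|restart|status` options; check the
+usage output of `bin/drillbit.sh` on your version:
+
+    bin/drillbit.sh stop      # stop the Drillbit service
+    bin/drillbit.sh restart   # stop, then start, the Drillbit service
+    bin/drillbit.sh status    # report whether a Drillbit is running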
+
+## Invoking SQLLine/Connecting to a Schema
+
+SQLLine is used as the Drill shell. SQLLine connects to relational databases
+and executes SQL commands. You invoke SQLLine for Drill in embedded or
+distributed mode. If you want to connect directly to a particular schema, you
+can indicate the schema name when you invoke SQLLine.
+
+To start SQLLine, issue the appropriate command for your Drill installation
+type:
+
+<table><tbody><tr><td valign="top"><strong>Drill Install Type</strong></td><td valign="top"><strong>Example</strong></td><td valign="top"><strong>Command</strong></td></tr><tr><td valign="top">Embedded</td><td valign="top">Drill installed locally (embedded mode); Hive with embedded metastore</td><td valign="top">To connect without specifying a schema, navigate to the Drill installation directory and issue the following command: <code>$ bin/sqlline -u jdbc:drill:zk=local -n admin -p admin</code><br />Once you are in the prompt, you can issue <code>USE &lt;schema&gt;</code> or you can use absolute notation: <code>schema.table.column</code>.<br />To connect to a schema directly, issue the command with the schema name: <code>$ bin/sqlline -u jdbc:drill:schema=&lt;database&gt;;zk=local -n admin -p admin</code></td></tr><tr><td valign="top">Distributed</td><td valign="top">Drill installed in distributed mode; Hive with remote metastore; HBase</td><td valign="top">To connect without specifying a schema, navigate to the Drill installation directory and issue the following command: <code>$ bin/sqlline -u jdbc:drill:zk=&lt;zk1host&gt;:&lt;port&gt;,&lt;zk2host&gt;:&lt;port&gt;,&lt;zk3host&gt;:&lt;port&gt; -n admin -p admin</code><br />Once you are in the prompt, you can issue <code>USE &lt;schema&gt;</code> or you can use absolute notation: <code>schema.table.column</code>.<br />To connect to a schema directly, issue the command with the schema name: <code>$ bin/sqlline -u jdbc:drill:schema=&lt;database&gt;;zk=&lt;zk1host&gt;:&lt;port&gt;,&lt;zk2host&gt;:&lt;port&gt;,&lt;zk3host&gt;:&lt;port&gt; -n admin -p admin</code></td></tr></tbody></table>
+  
+When SQLLine starts, the system displays the following prompt:
+
+`0: jdbc:drill:schema=<database>;zk=<zkhost>:<port>`
+
+At this point, you can use Drill to query your data source or you can discover
+metadata.
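+
+For example, a short embedded-mode session might look like the following
+(an illustrative sketch; `employee.json` is a sample file that ships on
+Drill's classpath and is reachable through the `cp` schema):
+
+    $ bin/sqlline -u jdbc:drill:zk=local -n admin -p admin
+    0: jdbc:drill:zk=local> SELECT full_name FROM cp.`employee.json` LIMIT 2;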
+
+## Exiting SQLLine
+
+To exit SQLLine, issue the following command:
+
+`!quit`  
+

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/manage/003-ports.md
----------------------------------------------------------------------
diff --git a/_docs/manage/003-ports.md b/_docs/manage/003-ports.md
new file mode 100644
index 0000000..df1d362
--- /dev/null
+++ b/_docs/manage/003-ports.md
@@ -0,0 +1,9 @@
+---
+title: "Ports Used by Drill"
+parent: "Manage Drill"
+---
+The following table provides a list of the ports that Drill uses, the port
+type, and a description of how Drill uses the port:
+
+<table><tbody><tr><th>Port</th><th colspan="1">Type</th><th>Description</th></tr><tr><td valign="top">8047</td><td valign="top" colspan="1">TCP</td><td valign="top">Needed for the Drill Web UI.</td></tr><tr><td valign="top">31010</td><td valign="top" colspan="1">TCP</td><td valign="top">User port address. Used between nodes in a Drill cluster. <br />Needed for an external client, such as Tableau, to connect into the<br />cluster nodes. Also needed for the Drill Web UI.</td></tr><tr><td valign="top">31011</td><td valign="top" colspan="1">TCP</td><td valign="top">Control port address. Used between nodes in a Drill cluster. <br />Needed for multi-node installation of Apache Drill.</td></tr><tr><td valign="top" colspan="1">31012</td><td valign="top" colspan="1">TCP</td><td valign="top" colspan="1">Data port address. Used between nodes in a Drill cluster. <br />Needed for multi-node installation of Apache Drill.</td></tr><tr><td valign="top" colspan="1">46655</td><td valign="top" colspan="1">UDP</td><td valign="top" colspan="1">Used for JGroups and Infinispan. Needed for multi-node installation of Apache Drill.</td></tr></tbody></table>
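+
+If a default port conflicts with another service on a node, you can override
+some of these ports in `drill-override.conf`. The following sketch assumes the
+`drill.exec.http.port` and `drill.exec.rpc.user.server.port` option names;
+verify them against the drill-module.conf files shipped with your version:
+
+    drill.exec: {
+      http.port: 8048,               # Web UI port (default 8047)
+      rpc.user.server.port: 31010    # user port for external clients
+    }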
+

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/manage/004-partition-prune.md
----------------------------------------------------------------------
diff --git a/_docs/manage/004-partition-prune.md b/_docs/manage/004-partition-prune.md
new file mode 100644
index 0000000..75f2edd
--- /dev/null
+++ b/_docs/manage/004-partition-prune.md
@@ -0,0 +1,75 @@
+---
+title: "Partition Pruning"
+parent: "Manage Drill"
+---
+Partition pruning is a performance optimization that limits the number of
+files and partitions that Drill reads when querying file systems and Hive
+tables. Drill only reads a subset of the files that reside in a file system or
+a subset of the partitions in a Hive table when a query matches certain filter
+criteria.
+
+For Drill to apply partition pruning to Hive tables, you must have created the
+tables in Hive using the `PARTITIONED BY` clause:
+
+`CREATE TABLE <table_name> (<column_name>) PARTITIONED BY (<column_name>);`
+
+When you create Hive tables using the `PARTITIONED BY` clause, each partition of
+data is automatically split out into different directories as data is written
+to disk. For more information about Hive partitioning, refer to the [Apache
+Hive wiki](https://cwiki.apache.org/confluence/display/Hive/LanguageManual+DDL/#LanguageManualDDL-PartitionedTables).
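+
+For example, a hypothetical Hive table partitioned by year might be created as
+follows (the table and column names are illustrative only):
+
+    -- Hypothetical DDL; partitioned data lands in one directory per log_year.
+    CREATE TABLE logs (cust_id INT, url STRING)
+    PARTITIONED BY (log_year INT);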
+
+Typically, table data in a file system is organized by directories and
+subdirectories. Queries on table data may contain `WHERE` clause filters on
+specific directories.
+
+Drill’s query planner evaluates the filters as part of a Filter operator. If
+no partition filters are present, the underlying Scan operator reads all files
+in all directories and then sends the data to operators downstream, such as
+Filter.
+
+When partition filters are present, the query planner determines if it can
+push the filters down to the Scan such that the Scan only reads the
+directories that match the partition filters, thus reducing disk I/O.
+
+## Partition Pruning Example
+
+The `/Users/max/data/logs` directory in a file system contains subdirectories
+that span a few years.
+
+The following image shows the hierarchical structure of the `…/logs` directory
+and (sub) directories:
+
+![drill query flow]({{ site.baseurl }}/docs/img/54.png)
+
+The following query requests log file data for 2013 from the `…/logs`
+directory in the file system:
+
+    SELECT * FROM dfs.`/Users/max/data/logs` WHERE cust_id < 10 and dir0 = 2013 limit 2;
+
+If you run the `EXPLAIN PLAN` command for the query, you can see that the
+`…/logs` directory is filtered by the scan operator.
+
+    EXPLAIN PLAN FOR SELECT * FROM dfs.`/Users/max/data/logs` WHERE cust_id < 10 and dir0 = 2013 limit 2;
+
+The following image shows a portion of the physical plan when partition
+pruning is applied:
+
+![drill query flow]({{ site.baseurl }}/docs/img/21.png)
+
+## Filter Examples
+
+The following queries include examples of the types of filters eligible for
+partition pruning optimization:
+
+**Example 1: Partition filters ANDed together**
+
+    SELECT * FROM dfs.`/Users/max/data/logs` WHERE dir0 = '2014' AND dir1 = '1'
+
+**Example 2: Partition filter ANDed with regular column filter**
+
+    SELECT * FROM dfs.`/Users/max/data/logs` WHERE cust_id < 10 AND dir0 = 2013 limit 2;
+
+**Example 3: Combination of AND, OR involving partition filters**
+
+    SELECT * FROM dfs.`/Users/max/data/logs` WHERE (dir0 = '2013' AND dir1 = '1') OR (dir0 = '2014' AND dir1 = '2')
+

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/manage/005-monitor-cancel.md
----------------------------------------------------------------------
diff --git a/_docs/manage/005-monitor-cancel.md b/_docs/manage/005-monitor-cancel.md
new file mode 100644
index 0000000..0033838
--- /dev/null
+++ b/_docs/manage/005-monitor-cancel.md
@@ -0,0 +1,30 @@
+---
+title: "Monitoring and Canceling Queries in the Drill Web UI"
+parent: "Manage Drill"
+---
+You can monitor and cancel queries from the Drill Web UI. To access the Drill
+Web UI, the Drillbit process must be running on the Drill node that you use to
+access the Drill Web UI.
+
+To monitor or cancel a query from the Drill Web UI, complete the following
+steps:
+
+  1. Navigate to the Drill Web UI at `<drill_node_ip_address>:8047`.  
+When you access the Drill Web UI, you see some general information about Drill
+running in your cluster, such as the nodes running the Drillbit process, the
+various ports Drill is using, and the amount of direct memory assigned to
+Drill.  
+![drill query flow]({{ site.baseurl }}/docs/img/7.png)
+
+  2. Select **Profiles** in the toolbar. A list of running and completed queries appears. Drill assigns a query ID to each query and lists the Foreman node. The Foreman is the Drillbit node that receives the query from the client or application. The Foreman drives the entire query.
+![drill query flow]({{ site.baseurl }}/docs/img/51.png)  
+
+  3. Click the **Query ID** for the query that you want to monitor or cancel. The Query and Planning window appears.  
+![drill query flow]({{ site.baseurl }}/docs/img/4.png)
+
+  4. Select **Edit Query**.
+  5. Click **Cancel query** to cancel the query. The following message appears:
+  ![drill query flow]({{ site.baseurl }}/docs/img/46.png)  
+
+  6. Optionally, you can re-run the query to see a query summary in this window.
+

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/manage/conf/001-mem-alloc.md
----------------------------------------------------------------------
diff --git a/_docs/manage/conf/001-mem-alloc.md b/_docs/manage/conf/001-mem-alloc.md
new file mode 100644
index 0000000..8f98cfc
--- /dev/null
+++ b/_docs/manage/conf/001-mem-alloc.md
@@ -0,0 +1,31 @@
+---
+title: "Memory Allocation"
+parent: "Configuration Options"
+---
+You can configure the amount of direct memory allocated to a Drillbit for
+query processing. The default limit is 8G, but Drill prefers 16G or more
+depending on the workload. The total amount of direct memory that a Drillbit
+allocates to query operations cannot exceed the limit set.
+
+Drill mainly uses Java direct memory and performs well when executing
+operations in memory instead of storing the operations on disk. Drill does not
+write to disk unless absolutely necessary, unlike MapReduce where everything
+is written to disk during each phase of a job.
+
+The JVM’s heap memory does not limit the amount of direct memory available in
+a Drillbit. The on-heap memory for Drill is only about 4-8G, which should
+suffice because Drill avoids having data sit in heap memory.
+
+## Modifying Drillbit Memory
+
+You can modify memory for each Drillbit node in your cluster. To modify the
+memory for a Drillbit, edit the `-XX:MaxDirectMemorySize` parameter in the
+Drillbit startup script located in
+`<drill_installation_directory>/conf/drill-env.sh`.
+
+**Note:** If this parameter is not set, the limit depends on the amount of available system memory.
+
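+A minimal sketch of the relevant lines in `drill-env.sh` follows; the
+`DRILL_MAX_DIRECT_MEMORY` and `DRILL_HEAP` variable names are assumptions
+based on the shipped script, so check your copy before editing:
+
+    # Illustrative values; size these for your workload.
+    DRILL_MAX_DIRECT_MEMORY="16G"
+    DRILL_HEAP="4G"
+    export DRILL_JAVA_OPTS="-Xms$DRILL_HEAP -Xmx$DRILL_HEAP -XX:MaxDirectMemorySize=$DRILL_MAX_DIRECT_MEMORY"
+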
+After you edit `<drill_installation_directory>/conf/drill-env.sh`, [restart
+the Drillbit](/drill/docs/starting-stopping-drill#starting-a-drillbit) on
+the node.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/manage/conf/002-startup-opt.md
----------------------------------------------------------------------
diff --git a/_docs/manage/conf/002-startup-opt.md b/_docs/manage/conf/002-startup-opt.md
new file mode 100644
index 0000000..3434401
--- /dev/null
+++ b/_docs/manage/conf/002-startup-opt.md
@@ -0,0 +1,50 @@
+---
+title: "Start-Up Options"
+parent: "Configuration Options"
+---
+Drill’s start-up options reside in a HOCON configuration file format, which is
+a hybrid between a properties file and a JSON file. Drill start-up options
+consist of a group of files with a nested relationship. At the core of the
+file hierarchy is `drill-default.conf`. This file is overridden by one or more
+`drill-module.conf` files, which are overridden by the `drill-override.conf`
+file that you define.
+
+You can see the following group of files throughout the source repository in
+Drill:
+
+	common/src/main/resources/drill-default.conf
+	common/src/main/resources/drill-module.conf
+	contrib/storage-hbase/src/main/resources/drill-module.conf
+	contrib/storage-hive/core/src/main/resources/drill-module.conf
+	contrib/storage-hive/hive-exec-shade/src/main/resources/drill-module.conf
+	exec/java-exec/src/main/resources/drill-module.conf
+	distribution/src/resources/drill-override.conf
+
+These files are listed inside the associated JAR files in the Drill
+distribution tarball.
+
+Each Drill module has a set of options that Drill incorporates. Drill’s
+modular design enables you to create new storage plugins, set new operators,
+or create UDFs. You can also include additional configuration options that you
+can override as necessary.
+
+When you add a JAR file to Drill, you must include a `drill-module.conf` file
+in the root directory of the JAR file that you add. The `drill-module.conf`
+file tells Drill to scan that JAR file or associated object and include it.
+
+## Viewing Startup Options
+
+You can run the following query to see a list of Drill’s startup options:
+
+    SELECT * FROM sys.options WHERE type='BOOT'
+
+## Configuring Start-Up Options
+
+You can configure start-up options for each Drillbit in the
+`drill-override.conf` file located in Drill’s `/conf` directory.
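+
+For example, a minimal `drill-override.conf` typically overrides just the
+cluster ID and the ZooKeeper quorum (the values shown are placeholders):
+
+    drill.exec: {
+      cluster-id: "drillbits1",
+      zk.connect: "zkhost1:2181,zkhost2:2181,zkhost3:2181"
+    }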
+
+You may want to configure the following start-up options that control certain
+behaviors in Drill:
+
+<table><tbody><tr><th>Option</th><th>Default Value</th><th>Description</th></tr><tr><td valign="top">drill.exec.sys.store.provider</td><td valign="top">ZooKeeper</td><td valign="top">Defines the persistent storage (PStore) provider. The PStore holds configuration and profile data. For more information about PStores, see <a href="/drill/docs/persistent-configuration-storage" rel="nofollow">Persistent Configuration Storage</a>.</td></tr><tr><td valign="top">drill.exec.buffer.size</td><td valign="top"> </td><td valign="top">Defines the amount of memory available, in terms of record batches, to hold data on the downstream side of an operation. Drill pushes data downstream as quickly as possible to make data immediately available. This requires Drill to use memory to hold the data pending operations. When data on a downstream operation is required, that data is immediately available so Drill does not have to go over the network to process it. Providing more memory to this option increases the speed at which Drill completes a query.</td></tr><tr><td valign="top">drill.exec.sort.external.directories<br />drill.exec.sort.external.fs</td><td valign="top"> </td><td valign="top">These options control spooling. The drill.exec.sort.external.directories option tells Drill which directory to use when spooling. The drill.exec.sort.external.fs option tells Drill which file system to use when spooling beyond memory files. Drill uses a spool and sort operation for beyond memory operations. The sorting operation is designed to spool to a Hadoop file system. The default Hadoop file system is a local file system in the /tmp directory. Spooling performance (both writing and reading back from it) is constrained by the file system. For MapR clusters, use MapR volumes or set up local volumes to use for spooling purposes. Volumes improve performance and stripe data across as many disks as possible.</td></tr><tr><td valign="top" colspan="1">drill.exec.debug.error_on_leak</td><td valign="top" colspan="1">True</td><td valign="top" colspan="1">Determines how Drill behaves when memory leaks occur during a query. By default, this option is enabled so that queries fail when memory leaks occur. If you disable the option, Drill issues a warning when a memory leak occurs and completes the query.</td></tr><tr><td valign="top" colspan="1">drill.exec.zk.connect</td><td valign="top" colspan="1">localhost:2181</td><td valign="top" colspan="1">Provides Drill with the ZooKeeper quorum to use to connect to data sources. Change this setting to point to the ZooKeeper quorum that you want Drill to use. You must configure this option on each Drillbit node.</td></tr><tr><td valign="top" colspan="1">drill.exec.cluster-id</td><td valign="top" colspan="1">my_drillbit_cluster</td><td valign="top" colspan="1">Identifies the cluster that corresponds with the ZooKeeper quorum indicated. It also provides Drill with the name of the cluster used during UDP multicast. You must change the default cluster-id if there are multiple clusters on the same subnet. If you do not change the ID, the clusters will try to connect to each other to create one cluster.</td></tr></tbody></table>
+

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/manage/conf/003-plan-exec.md
----------------------------------------------------------------------
diff --git a/_docs/manage/conf/003-plan-exec.md b/_docs/manage/conf/003-plan-exec.md
new file mode 100644
index 0000000..ea67e2d
--- /dev/null
+++ b/_docs/manage/conf/003-plan-exec.md
@@ -0,0 +1,37 @@
+---
+title: "Planning and Execution Options"
+parent: "Configuration Options"
+---
+You can set Drill query planning and execution options per cluster, at the
+system or session level. Options set at the session level only apply to
+queries that you run during the current Drill connection. Options set at the
+system level affect the entire system and persist between restarts. Session
+level settings override system level settings.
+
+#### Querying Planning and Execution Options
+
+You can run the following query to see a list of the system and session
+planning and execution options:
+
+    SELECT name FROM sys.options WHERE type in ('SYSTEM','SESSION');
+
+#### Configuring Planning and Execution Options
+
+Use the `ALTER SYSTEM` or `ALTER SESSION` commands to set options. Typically,
+you set the options at the session level unless you want the setting to
+persist across all sessions.
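+
+For example, the following statements set one option for the current session
+and another system-wide (the option names appear in the table below):
+
+    ALTER SESSION SET `planner.enable_hashjoin` = false;
+    ALTER SYSTEM SET `planner.width.max_per_query` = 500;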
+
+The following table contains planning and execution options that you can set
+at the system or session level:
+
+<table ><tbody><tr><th >Option name</th><th >Default value</th><th >Description</th></tr>
+<tr><td valign="top" colspan="1" >exec.errors.verbose</td><td valign="top" colspan="1" ><p>false</p></td><td valign="top" colspan="1" ><p>This option enables or disables the verbose message that Drill returns when a query fails. When enabled, Drill provides additional information about failed queries.</p></td></tr>
+<tr><td valign="top" colspan="1" >exec.max_hash_table_size</td><td valign="top" colspan="1" >1073741824</td><td valign="top" colspan="1" >The default maximum size for hash tables.</td></tr>
+<tr><td valign="top" colspan="1" >exec.min_hash_table_size</td><td valign="top" colspan="1" >65536</td><td valign="top" colspan="1" >The default starting size for hash tables. Increasing this size is useful for very large aggregations or joins when you have large amounts of memory for Drill to use. Drill can spend a lot of time resizing the hash table as it finds new data. If you have large data sets, you can increase this hash table size to increase performance.</td></tr>
+<tr><td valign="top" colspan="1" >planner.add_producer_consumer</td><td valign="top" colspan="1" ><p>false</p></td><td valign="top" colspan="1" ><p>This option enables or disables a secondary reading thread that works out of band of the rest of the scanning fragment to prefetch data from disk. If you interact with a certain type of storage medium that is slow or does not prefetch much data, this option tells Drill to add a producer consumer reading thread to the operation. Drill can then assign one thread that focuses on a single reading fragment.</p><p>If Drill is using memory, you can disable this option to get better performance. If Drill is using disk space, you should enable this option and set a reasonable queue size for the planner.producer_consumer_queue_size option.</p></td></tr>
+<tr><td valign="top" colspan="1" >planner.broadcast_threshold</td><td valign="top" colspan="1" >1000000</td><td valign="top" colspan="1" >Threshold, in terms of a number of rows, that determines whether a broadcast join is chosen for a query. Regardless of the setting of the broadcast_join option (enabled or disabled), a broadcast join is not chosen unless the right side of the join is estimated to contain fewer rows than this threshold. The intent of this option is to avoid broadcasting too many rows for join purposes. Broadcasting involves sending data across nodes and is a network-intensive operation. (The &quot;right side&quot; of the join, which may itself be a join or simply a table, is determined by cost-based optimizations and heuristics during physical planning.)</td></tr>
+<tr><td valign="top" colspan="1" ><p>planner.enable_broadcast_join<br />planner.enable_hashagg<br />planner.enable_hashjoin<br />planner.enable_mergejoin<br />planner.enable_multiphase_agg<br />planner.enable_streamagg</p></td><td valign="top" colspan="1" >true</td><td valign="top" colspan="1" ><p>These options enable or disable specific aggregation and join operators for queries. These operators are all enabled by default and in general should not be disabled.</p><p>Hash aggregation and hash join are hash-based operations. Streaming aggregation and merge join are sort-based operations. Both hash-based and sort-based operations consume memory; however, currently, hash-based operations do not spill to disk as needed, but the sort-based operations do. If large hash operations do not fit in memory on your system, you may need to disable these operations. Queries will continue to run, using alternative plans.</p></td></tr>
+<tr><td valign="top" colspan="1" >planner.producer_consumer_queue_size</td><td valign="top" colspan="1" >10</td><td valign="top" colspan="1" >Determines how much data to prefetch from disk (in record batches) out of band of query execution. The larger the queue size, the greater the amount of memory that the queue and overall query execution consumes.</td></tr>
+<tr><td valign="top" colspan="1" >planner.slice_target</td><td valign="top" colspan="1" >100000</td><td valign="top" colspan="1" >The number of records manipulated within a fragment before Drill parallelizes them.</td></tr>
+<tr><td valign="top" colspan="1" ><p>planner.width.max_per_node</p></td><td valign="top" colspan="1" ><p>The default depends on the number of cores on each node.</p></td><td valign="top" colspan="1" ><p>In this context &quot;width&quot; refers to fanout or distribution potential: the ability to run a query in parallel across the cores on a node and the nodes on a cluster.</p><p>A physical plan consists of intermediate operations, known as query &quot;fragments,&quot; that run concurrently, yielding opportunities for parallelism above and below each exchange operator in the plan. An exchange operator represents a breakpoint in the execution flow where processing can be distributed. For example, a single-process scan of a file may flow into an exchange operator, followed by a multi-process aggregation fragment.</p><p>The maximum width per node defines the maximum degree of parallelism for any fragment of a query, but the setting applies at the level of a single node in the cluster.</p><p>The <em>default</em> maximum degree of parallelism per node is calculated as follows, with the theoretical maximum automatically scaled back (and rounded down) so that only 70% of the actual available capacity is taken into account:</p>
+<pre>number of active drillbits (typically one per node)
+* number of cores per node
+* 0.7</pre>
+<p>For example, on a single-node test system with 2 cores and hyper-threading enabled:</p>
+<pre>1 * 4 * 0.7 = 3</pre>
+<p>When you modify the default setting, you can supply any meaningful number. The system does not automatically scale down your setting.</p></td></tr><tr><td valign="top" colspan="1" >planner.width.max_per_query</td><td valign="top" colspan="1" >1000</td><td valign="top" colspan="1" ><p>The max_per_query value also sets the maximum degree of parallelism for any given stage of a query, but the setting applies to the query as executed by the whole cluster (multiple nodes). In effect, the actual maximum width per query is the <em>minimum of two values</em>:</p>
+<pre>min((number of nodes * width.max_per_node), width.max_per_query)</pre>
+<p>For example, on a 4-node cluster where <code>width.max_per_node</code> is set to 6 and <code>width.max_per_query</code> is set to 30:</p>
+<pre>min((4 * 6), 30) = 24</pre>
+<p>In this case, the effective maximum width per query is 24, not 30.</p></td></tr>
+<tr><td valign="top" colspan="1" >store.format</td><td valign="top" colspan="1" > </td><td valign="top" colspan="1" >Output format for data that is written to tables with the CREATE TABLE AS (CTAS) command.</td></tr>
+<tr><td valign="top" colspan="1" >store.json.all_text_mode</td><td valign="top" colspan="1" ><p>false</p></td><td valign="top" colspan="1" ><p>This option enables or disables text mode. When enabled, Drill reads everything in JSON as a text object instead of trying to interpret data types. This allows complicated JSON to be read using CASE and CAST.</p></td></tr>
+<tr><td valign="top" >store.parquet.block-size</td><td valign="top" ><p>536870912</p></td><td valign="top" >Target size for a Parquet row group, which should be equal to or less than the configured HDFS block size.</td></tr></tbody></table>
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/manage/conf/004-persist-conf.md
----------------------------------------------------------------------
diff --git a/_docs/manage/conf/004-persist-conf.md b/_docs/manage/conf/004-persist-conf.md
new file mode 100644
index 0000000..b1deefa
--- /dev/null
+++ b/_docs/manage/conf/004-persist-conf.md
@@ -0,0 +1,93 @@
+---
+title: "Persistent Configuration Storage"
+parent: "Configuration Options"
+---
+Drill stores persistent configuration data in a persistent configuration store
+(PStore). This data is encoded in JSON or Protobuf format. Drill can use the
+local file system, ZooKeeper, HBase, or MapR-DB to store this data. The data
+stored in a PStore includes state information for storage plugins, query
+profiles, and ALTER SYSTEM settings. The default type of PStore configured
+depends on the Drill installation mode.
+
+The following table provides the persistent storage mode for each of the Drill
+modes:
+
+<table ><tbody><tr><th >Mode</th><th >Description</th></tr><tr><td valign="top" >Embedded</td><td valign="top" >Drill stores persistent data in the local file system. <br />You cannot modify the PStore location for Drill in embedded mode.</td></tr><tr><td valign="top" >Distributed</td><td valign="top" >Drill stores persistent data in ZooKeeper, by default. <br />You can modify where ZooKeeper offloads data, <br />or you can change the persistent storage mode to HBase or MapR-DB.</td></tr></tbody></table>
+  
+**Note:** Switching between storage modes does not migrate configuration data.
+
+## ZooKeeper for Persistent Configuration Storage
+
+To make Drill installation and configuration simple, Drill uses ZooKeeper to
+store persistent configuration data. The ZooKeeper PStore provider stores all
+of the persistent configuration data in ZooKeeper except for query profile
+data.
+
+The ZooKeeper PStore provider offloads query profile data to the
+${DRILL_LOG_DIR:-/var/log/drill} directory on Drill nodes. If you want the
+query profile data stored in a specific location, you can configure where
+ZooKeeper offloads the data.
+
+To modify where the ZooKeeper PStore provider offloads query profile data,
+configure the `sys.store.provider.zk.blobroot` property in the `drill.exec`
+block in `<drill_installation_directory>/conf/drill-override.conf` on each
+Drill node and then restart the Drillbit service.
+
+**Example**
+
+	drill.exec: {
+	 cluster-id: "my_cluster_com-drillbits",
+	 zk.connect: "<zkhostname>:<port>",
+	 sys.store.provider.zk.blobroot: "maprfs://<directory to store pstore data>/"
+	}
+
+Issue the following command to restart the Drillbit on all Drill nodes:
+
+    maprcli node services -name drill-bits -action restart -nodes <node IP addresses separated by a space>
+
+## HBase for Persistent Configuration Storage
+
+To change the persistent storage mode for Drill, add or modify the
+`sys.store.provider` block in
+`<drill_installation_directory>/conf/drill-override.conf`.
+
+**Example**
+
+	sys.store.provider: {
+	    class: "org.apache.drill.exec.store.hbase.config.HBasePStoreProvider",
+	    hbase: {
+	      table : "drill_store",
+	      config: {
+	      "hbase.zookeeper.quorum": "<ip_address>,<ip_address>,<ip_address >,<ip_address>",
+	      "hbase.zookeeper.property.clientPort": "2181"
+	      }
+	    }
+	  },
+
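+Once the Drillbits restart with this configuration, you can check from the
+HBase shell that the store table named in the example exists (assuming the
+provider creates the table when it is absent):
+
+    echo "list" | hbase shell
+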
+## MapR-DB for Persistent Configuration Storage
+
+The MapR-DB plugin will be released soon. You can [compile Drill from
+source](/drill/docs/compiling-drill-from-source) to try out this
+new feature.
+
+If you have MapR-DB in your cluster, you can use MapR-DB for persistent
+configuration storage. Using MapR-DB to store persistent configuration data
+can prevent memory strain on ZooKeeper in clusters running heavy workloads.
+
+To change the persistent storage mode to MapR-DB, add or modify the
+`sys.store.provider` block in
+`<drill_installation_directory>/conf/drill-override.conf` on each Drill node
+and then restart the Drillbit service.
+
+**Example**
+
+	sys.store.provider: {
+	  class: "org.apache.drill.exec.store.hbase.config.HBasePStoreProvider",
+	  hbase: {
+	    table : "/tables/drill_store"
+	  }
+	},
+
+Issue the following command to restart the Drillbit on all Drill nodes:
+
+    maprcli node services -name drill-bits -action restart -nodes <node IP addresses separated by a space>
+

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/progress/001-2014-q1.md
----------------------------------------------------------------------
diff --git a/_docs/progress/001-2014-q1.md b/_docs/progress/001-2014-q1.md
new file mode 100644
index 0000000..ebabaea
--- /dev/null
+++ b/_docs/progress/001-2014-q1.md
@@ -0,0 +1,174 @@
+---
+title: "2014 Q1 Drill Report"
+parent: "Progress Reports"
+---
+
+Apache: Project Drill
+
+Description:
+
+Apache Drill is a distributed system for interactive analysis of large-scale
+datasets that is based on Google's Dremel. Its goal is to efficiently process
+nested data, scale to 10,000 servers or more, and process petabytes of data
+and trillions of records in seconds.
+
+Drill has been incubating since 2012-08-11.
+
+Three Issues to Address in Move to Graduation:
+
+1\. Continue to attract new developers and early users with a variety of
+skills and viewpoints
+
+2\. Continue to develop deeper community skills and knowledge by building
+additional releases
+
+3\. Demonstrate community robustness by rotating project tasks among multiple
+project members
+
+Issues to Call to Attention of PMC or ASF Board:
+
+None
+
+How community has developed since last report:
+
+Community awareness and participation were strengthened through a meeting of
+the Bay Area Apache Drill User Group in San Jose sponsored by Yahoo! This
+event expanded participation to include many people new to Drill,
+particularly those interested as potential users (analysts rather than
+developers).
+
+Speakers included Drill project mentor Ted Dunning from MapR, Data Scientist
+Will Ford from Alpine Data Labs, new Drill committer Julian Hyde from
+HortonWorks, and Aman Sinha, MapR Drill engineer.
+
+Additional events include:
+
+• Two new Drill committers accepted appointment: Julian Hyde (HortonWorks) and Tim Chen (Microsoft).
+
+• Drill has a new project mentor, Sebastian Schelter.
+
+Mailing list discussions:
+
+Subscriptions to the Drill mailing lists have risen to 399 on the dev list
+and 308 on the user list, with 508 unique subscribers across both lists.
+
+There has been active and increasing participation in discussions on the
+developer mailing list, including new participants and developers.
+Participation on the user list is growing but still small; most activity
+takes place on the developer mailing list.
+
+Activity summary for the user mailing list:
+
+<http://mail-archives.apache.org/mod_mbox/incubator-drill-user/>
+
+  * February 2014 (through 02/26/2014): 25
+  * January 2014: 12
+  * December 2013: 62
+
+Topics in discussion on the user mailing list included but were not limited to:
+
+  * Feb 2014: Connecting Drill to HBase, Support for Distinct/Count
+  * Jan 2014: Loading Data into Drill, Data Locality
+  * December 2013: Loading Data into Drill, Setting Drill with HDFS and other Storage engines
+
+Activity summary for the dev mailing list:
+
+<http://mail-archives.apache.org/mod_mbox/incubator-drill-dev/>
+
+  * February 2014 (through 02/26/2014): 250 (jira, discussion, review requests)
+  * January 2014: 156 (jira, focused discussions)
+  * December 2013: 51 (jira, focused discussions)
+
+Topics in discussion on the dev mailing list included but were not limited to:
+
+• February (through 02/26/2014): how to contribute to Drill; review requests
+for Drill 357, 346, 366, and 364; status of Drill functions, including hash
+functions; support for the + and - operators in date and interval arithmetic
+
+• January: SQL options discussions, casting discussions, Multiplex Data
+Channel feedback
+
+• December: guide for newcomers' contributions, aggregate functions code
+generation feedback
+
+Code
+
+For details of code commits, see <http://bit.ly/14YPXN9>
+
+There has been continued activity in code commits.
+
+19 contributors have participated in GitHub code activity; there have been 116 forks.
+
+February code commits include but are not limited to: support for
+INFORMATION_SCHEMA, Hive storage and metastore integration, Optiq JDBC
+thinning and refactoring, math functions rework to use codegen, column
+pruning for Parquet/JSON, moving SQL parsing into the Drillbit server side,
+and TravisCI setup.
+
+January code commits include but are not limited to: implicit and explicit
+casting support, Broadcast Sender exchange, TPC-H test queries, and
+refactoring of memory allocation to use hierarchical allocation and freeing.
+
+Community Interactions
+
+The weekly Drill hangout continues, conducted remotely through Google
+Hangouts on Tuesday mornings at 9am Pacific Time, keeping core developers in
+contact in real time despite geographical separation.
+
+The community stays in touch through the @ApacheDrill Twitter ID, through
+postings on various blogs including Apache Drill User
+<http://drill-user.org/>, which has had several updates, and through
+international presentations at conferences.
+
+The viability of the community is also apparent in the active participation
+in the Bay Area Apache Drill User Group, which met in early November and has
+grown to 440 members.
+
+Sample presentations:
+
+• “How to Use Drill” by Ted Dunning and Will Ford, Bay Area Apache Drill
+Meet-up, 24 February
+
+• “How Drill Addresses Dynamic Typing” by Julian Hyde, Bay Area Apache Drill
+Meet-up, 24 February
+
+• “New Features and Infrastructure Improvements” by Aman Sinha, Bay Area
+Apache Drill Meet-up, 24 February
+
+Articles
+
+Examples of articles or reports on Apache Drill since last report include:
+
+• Drill blog post by Ellen Friedman at Apache Drill User updating the
+community on how people will use Drill and inviting comments/questions from
+remote participants as part of the Drill User Group <http://bit.ly/1p1Qvgn>
+
+• Drill blog post by Ellen Friedman at Apache Drill User reporting on the
+appointment of new Drill committers and a new mentor <http://bit.ly/JIcwQe>
+
+Social Networking
+
+The @ApacheDrill Twitter account is active and has grown substantially, up
+19% to 744 followers.
+
+How project has developed since last report:
+
+1\. Significant progress is being made on the execution engine and SQL front
+end to support more functionality, as well as more integrations with storage
+engines.
+
+2\. Work on the ODBC driver has begun with a new group led by George Chow in
+Vancouver.
+
+3\. Significant code drops have been checked in from a number of contributors
+and committers.
+
+4\. Work toward the second milestone is progressing substantially.
+
+Please check this [ ] when you have filled in the report for Drill.
+
+Signed-off-by:
+
+* Ted Dunning
+* Grant Ingersoll
+* Isabel Drost
+* Sebastian Schelter
+

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/query/001-query-fs.md
----------------------------------------------------------------------
diff --git a/_docs/query/001-query-fs.md b/_docs/query/001-query-fs.md
new file mode 100644
index 0000000..ca488fb
--- /dev/null
+++ b/_docs/query/001-query-fs.md
@@ -0,0 +1,35 @@
+---
+title: "Querying a File System"
+parent: "Query Data"
+---
+Files and directories are like standard SQL tables to Drill. You can specify a
+file system "database" as a prefix in queries when you refer to objects across
+databases. In Drill, a file system database consists of a storage plugin name
+followed by an optional workspace name, for example
+`<storage plugin>.<workspace>` or `hdfs.logs`.
+
+The following example shows a query on a file system database in a Hadoop
+distributed file system:
+
+       SELECT * FROM hdfs.logs.`AppServerLogs/2014/Jan/01/part0001.txt`;
+
+The default `dfs` storage plugin instance registered with Drill has a
+`default` workspace. If you query data in the `default` workspace, you do not
+need to include the workspace in the query. Refer to
+[Workspaces](/drill/docs/workspaces) for
+more information.
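+
+For example, with the `default` workspace you can query a file by its path
+alone (the path below is hypothetical):
+
+    SELECT * FROM dfs.`/tmp/sample.json` LIMIT 10;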
+
+Drill supports the following file types:
+
+  * Plain text files, including:
+    * Comma-separated values (CSV, type: text)
+    * Tab-separated values (TSV, type: text)
+    * Pipe-separated values (PSV, type: text)
+  * Structured data files:
+    * JSON (type: json)
+    * Parquet (type: parquet)
+
+The extensions for these file types must match the configuration settings for
+your registered storage plugins. For example, PSV files may be defined with a
+`.tbl` extension, while CSV files are defined with a `.csv` extension.
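+
+For example, a `formats` entry in the `dfs` storage plugin configuration
+might map the `.tbl` extension to the PSV text type. This is a sketch of the
+relevant fragment only, not a complete plugin configuration:
+
+    "formats": {
+      "psv": {
+        "type": "text",
+        "extensions": ["tbl"],
+        "delimiter": "|"
+      }
+    }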
+

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/query/002-query-hbase.md
----------------------------------------------------------------------
diff --git a/_docs/query/002-query-hbase.md b/_docs/query/002-query-hbase.md
new file mode 100644
index 0000000..d2a33d5
--- /dev/null
+++ b/_docs/query/002-query-hbase.md
@@ -0,0 +1,151 @@
+---
+title: "Querying HBase"
+parent: "Query Data"
+---
+This is a simple exercise that provides steps for creating a “students” table
+and a “clicks” table in HBase that you can query with Drill.
+
+To create the HBase tables and query them with Drill, complete the following
+steps:
+
+  1. Issue the following command to start the HBase shell:
+  
+        hbase shell
+  2. Issue the following commands to create a ‘students’ table and a ‘clicks’ table with column families in HBase:
+    
+        echo "create 'students','account','address'" | hbase shell
+    
+        echo "create 'clicks','clickinfo','iteminfo'" | hbase shell
+  3. Issue the following command with the provided data to create a `testdata.txt` file:
+
+        cat > testdata.txt
+
+     **Sample Data**
+
+        put 'students','student1','account:name','Alice'
+        put 'students','student1','address:street','123 Ballmer Av'
+        put 'students','student1','address:zipcode','12345'
+        put 'students','student1','address:state','CA'
+        put 'students','student2','account:name','Bob'
+        put 'students','student2','address:street','1 Infinite Loop'
+        put 'students','student2','address:zipcode','12345'
+        put 'students','student2','address:state','CA'
+        put 'students','student3','account:name','Frank'
+        put 'students','student3','address:street','435 Walker Ct'
+        put 'students','student3','address:zipcode','12345'
+        put 'students','student3','address:state','CA'
+        put 'students','student4','account:name','Mary'
+        put 'students','student4','address:street','56 Southern Pkwy'
+        put 'students','student4','address:zipcode','12345'
+        put 'students','student4','address:state','CA'
+        put 'clicks','click1','clickinfo:studentid','student1'
+        put 'clicks','click1','clickinfo:url','http://www.google.com'
+        put 'clicks','click1','clickinfo:time','2014-01-01 12:01:01.0001'
+        put 'clicks','click1','iteminfo:itemtype','image'
+        put 'clicks','click1','iteminfo:quantity','1'
+        put 'clicks','click2','clickinfo:studentid','student1'
+        put 'clicks','click2','clickinfo:url','http://www.amazon.com'
+        put 'clicks','click2','clickinfo:time','2014-01-01 01:01:01.0001'
+        put 'clicks','click2','iteminfo:itemtype','image'
+        put 'clicks','click2','iteminfo:quantity','1'
+        put 'clicks','click3','clickinfo:studentid','student2'
+        put 'clicks','click3','clickinfo:url','http://www.google.com'
+        put 'clicks','click3','clickinfo:time','2014-01-01 01:02:01.0001'
+        put 'clicks','click3','iteminfo:itemtype','text'
+        put 'clicks','click3','iteminfo:quantity','2'
+        put 'clicks','click4','clickinfo:studentid','student2'
+        put 'clicks','click4','clickinfo:url','http://www.ask.com'
+        put 'clicks','click4','clickinfo:time','2013-02-01 12:01:01.0001'
+        put 'clicks','click4','iteminfo:itemtype','text'
+        put 'clicks','click4','iteminfo:quantity','5'
+        put 'clicks','click5','clickinfo:studentid','student2'
+        put 'clicks','click5','clickinfo:url','http://www.reuters.com'
+        put 'clicks','click5','clickinfo:time','2013-02-01 12:01:01.0001'
+        put 'clicks','click5','iteminfo:itemtype','text'
+        put 'clicks','click5','iteminfo:quantity','100'
+        put 'clicks','click6','clickinfo:studentid','student3'
+        put 'clicks','click6','clickinfo:url','http://www.google.com'
+        put 'clicks','click6','clickinfo:time','2013-02-01 12:01:01.0001'
+        put 'clicks','click6','iteminfo:itemtype','image'
+        put 'clicks','click6','iteminfo:quantity','1'
+        put 'clicks','click7','clickinfo:studentid','student3'
+        put 'clicks','click7','clickinfo:url','http://www.ask.com'
+        put 'clicks','click7','clickinfo:time','2013-02-01 12:45:01.0001'
+        put 'clicks','click7','iteminfo:itemtype','image'
+        put 'clicks','click7','iteminfo:quantity','10'
+        put 'clicks','click8','clickinfo:studentid','student4'
+        put 'clicks','click8','clickinfo:url','http://www.amazon.com'
+        put 'clicks','click8','clickinfo:time','2013-02-01 22:01:01.0001'
+        put 'clicks','click8','iteminfo:itemtype','image'
+        put 'clicks','click8','iteminfo:quantity','1'
+        put 'clicks','click9','clickinfo:studentid','student4'
+        put 'clicks','click9','clickinfo:url','http://www.amazon.com'
+        put 'clicks','click9','clickinfo:time','2013-02-01 22:01:01.0001'
+        put 'clicks','click9','iteminfo:itemtype','image'
+        put 'clicks','click9','iteminfo:quantity','10'
+
+  4. Issue the following command to verify that the data is in the `testdata.txt` file:  
+    
+         cat testdata.txt | hbase shell
+  5. Issue `exit` to leave the `hbase shell`.
+  6. Start Drill. Refer to [Starting/Stopping Drill](/drill/docs/starting-stopping-drill) for instructions.
+  7. Use Drill to issue the following SQL queries on the “students” and “clicks” tables:  
+  
+     1. Issue the following query to see the data in the “students” table:  
+
+            SELECT * FROM hbase.`students`;
+        The query returns binary results:
+        
+            Query finished, fetching results ...
+            +-------------+-------------+-------------+-------------+-------------+
+            | id          | name        | state       | street      | zipcode     |
+            +-------------+-------------+-------------+-------------+-------------+
+            | [B@1ee37126 | [B@661985a1 | [B@15944165 | [B@385158f4 | [B@3e08d131 |
+            | [B@64a7180e | [B@161c72c2 | [B@25b229e5 | [B@53dc8cb8 | [B@1d11c878 |
+            | [B@349aaf0b | [B@175a1628 | [B@1b64a812 | [B@6d5643ca | [B@147db06f |
+            | [B@3a7cbada | [B@52cf5c35 | [B@2baec60c | [B@5f4c543b | [B@2ec515d6 |
+            +-------------+-------------+-------------+-------------+-------------+
+
+        Since Drill does not require metadata, you must use the SQL `CAST` function in
+        some queries to get readable query results.
+
+     2. Issue the following query, which includes the `CAST` function, to see the data in the “students” table:
+
+            SELECT CAST(students.row_key as VarChar(20)),
+            CAST(students.account.name as VarChar(20)),
+            CAST(students.address.state as VarChar(20)),
+            CAST(students.address.street as VarChar(20)),
+            CAST(students.address.zipcode as VarChar(20))
+            FROM hbase.students;
+
+        **Note:** Use the following format when you query a column in an HBase table:
+          
+             tablename.columnfamilyname.columnname
+            
+        For more information about column families, refer to [5.6. Column
+        Family](http://hbase.apache.org/book/columnfamily.html).
+
+        The query returns the data:
+
+            Query finished, fetching results ...
+            +-----------+-------+-------+------------------+---------+
+            | studentid | name  | state | street           | zipcode |
+            +-----------+-------+-------+------------------+---------+
+            | student1  | Alice | CA    | 123 Ballmer Av   | 12345   |
+            | student2  | Bob   | CA    | 1 Infinite Loop  | 12345   |
+            | student3  | Frank | CA    | 435 Walker Ct    | 12345   |
+            | student4  | Mary  | CA    | 56 Southern Pkwy | 12345   |
+            +-----------+-------+-------+------------------+---------+
+
+     3. Issue the following query on the “clicks” table to find out which students clicked on google.com:
+        
+              SELECT CAST(clicks.clickinfo.studentid as VarChar(200)), CAST(clicks.clickinfo.url as VarChar(200)) FROM hbase.`clicks` WHERE clicks.clickinfo.url LIKE '%google%';
+
+        The query returns the data:
+        
+            Query finished, fetching results ...
+        
+            +---------+-----------+-------------------------------+-----------------------+----------+----------+
+            | clickid | studentid | time                          | url                   | itemtype | quantity |
+            +---------+-----------+-------------------------------+-----------------------+----------+----------+
+            | click1  | student1  | 2014-01-01 12:01:01.000100000 | http://www.google.com | image    | 1        |
+            | click3  | student2  | 2014-01-01 01:02:01.000100000 | http://www.google.com | text     | 2        |
+            | click6  | student3  | 2013-02-01 12:01:01.000100000 | http://www.google.com | image    | 1        |
+            +---------+-----------+-------------------------------+-----------------------+----------+----------+
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/query/003-query-complex.md
----------------------------------------------------------------------
diff --git a/_docs/query/003-query-complex.md b/_docs/query/003-query-complex.md
new file mode 100644
index 0000000..537d7b4
--- /dev/null
+++ b/_docs/query/003-query-complex.md
@@ -0,0 +1,56 @@
+---
+title: "Querying Complex Data"
+parent: "Query Data"
+---
+Apache Drill queries do not require prior knowledge of the actual data you are
+trying to access, regardless of its source system or its schema and data
+types. The sweet spot for Apache Drill is a SQL query workload against
+"complex data": data made up of various types of records and fields, rather
+than data in a recognizable relational form (discrete rows and columns). Drill
+is capable of discovering the form of the data when you submit the query.
+Nested data formats such as JSON (JavaScript Object Notation) files and
+Parquet files are not only _accessible_: Drill provides special operators and
+functions that you can use to _drill down_ into these files and ask
+interesting analytic questions.
+
+These operators and functions include:
+
+  * References to nested data values
+  * Access to repeating values in arrays and arrays within arrays (array indexes)
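+
+A minimal illustration of both extensions in one statement (the file path and
+field names are hypothetical):
+
+    SELECT t.name, t.address.city, t.phones[0]
+    FROM dfs.`/tmp/people.json` t;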
+
+The SQL query developer needs to know the data well enough to write queries
+that identify values of interest in the target file. For example, the writer
+needs to know what a record consists of, and its data types, in order to
+reliably request the right "columns" in the select list. Although these data
+values do not manifest themselves as columns in the source file, Drill will
+return them in the result set as if they had the predictable form of columns
+in a table. Drill also optimizes queries by treating the data as "columnar"
+rather than reading and analyzing complete records. (Drill uses parallel
+execution and optimization capabilities similar to those of commercial
+columnar MPP databases.)
+
+Given a basic knowledge of the input file, the developer needs to know how to
+use the SQL extensions that Drill provides and how to use them to "reach into"
+the nested data. The following examples show how to write both simple queries
+against JSON files and interesting queries that unpack the nested data. The
+examples show how to use the Drill extensions in the context of standard SQL
+SELECT statements. For the most part, the extensions use standard JavaScript
+notation for referencing data elements in a hierarchy.
+
+### Before You Begin
+
+The examples in this section operate on JSON data files. In order to write
+your own queries, you need to be aware of the basic data types in these files:
+
+  * string (all data inside double quotes), such as `"0001"` or `"Cake"`
+  * numeric types: integers, decimals, and floats, such as `0.55` or `10`
+  * null values
+  * boolean values: true, false
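+
+A minimal JSON record (hypothetical) that touches each of these types:
+
+    { "id": "0001", "type": "Cake", "price": 0.55, "qty": 10, "filling": null, "glazed": true }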
+
+Check that you have the following configuration setting for JSON files in the
+Drill Web UI (`dfs` storage plugin configuration):
+
+    "json" : {
+      "type" : "json"
+    }
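+
+If the types in a JSON file are inconsistent, you can also read everything as
+text for the current session and apply CAST later; this uses the
+store.json.all_text_mode option covered in the configuration options
+documentation:
+
+    ALTER SESSION SET `store.json.all_text_mode` = true;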
+

http://git-wip-us.apache.org/repos/asf/drill/blob/d959a210/_docs/query/004-query-hive.md
----------------------------------------------------------------------
diff --git a/_docs/query/004-query-hive.md b/_docs/query/004-query-hive.md
new file mode 100644
index 0000000..903c7c6
--- /dev/null
+++ b/_docs/query/004-query-hive.md
@@ -0,0 +1,45 @@
+---
+title: "Querying Hive"
+parent: "Query Data"
+---
+This is a simple exercise that provides steps for creating a Hive table and
+inserting data that you can query using Drill. Before you perform the steps,
+download the
+[customers.csv](http://doc.mapr.com/download/attachments/22906623/customers.csv?api=v2)
+file.
+
+To create a Hive table and query it with Drill, complete the following steps:
+
+  1. Issue the following command to start the Hive shell:
+  
+        hive
+  2. Issue the following command from the Hive shell to create a table schema:
+  
+        hive> create table customers(FirstName string, LastName string, Company string, Address string, City string, County string, State string, Zip string, Phone string, Fax string, Email string, Web string) row format delimited fields terminated by ',' stored as textfile;
+  3. Issue the following command to load the customer data into the customers table:  
+
+        hive> load data local inpath '/<directory path>/customers.csv' overwrite into table customers;
+  4. Issue `quit` or `exit` to leave the Hive shell.
+  5. Start Drill. Refer to [Starting/Stopping Drill](/drill/docs/starting-stopping-drill) for instructions.
+  6. Issue the following query to Drill to get the first and last names of the first ten customers in the Hive table:  
+
+        0: jdbc:drill:schema=hiveremote> SELECT firstname,lastname FROM hiveremote.`customers` limit 10;
+
+     The query returns the following results:
+     
+        +------------+------------+
+        | firstname  |  lastname  |
+        +------------+------------+
+        | Essie      | Vaill      |
+        | Cruz       | Roudabush  |
+        | Billie     | Tinnes     |
+        | Zackary    | Mockus     |
+        | Rosemarie  | Fifield    |
+        | Bernard    | Laboy      |
+        | Sue        | Haakinson  |
+        | Valerie    | Pou        |
+        | Lashawn    | Hasty      |
+        | Marianne   | Earman     |
+        +------------+------------+
+        10 rows selected (1.5 seconds)
+        0: jdbc:drill:schema=hiveremote>
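+
+To confirm that Drill can see the Hive table, you can also list the tables
+the plugin exposes (assuming the same `hiveremote` plugin name used in the
+example):
+
+    0: jdbc:drill:schema=hiveremote> SHOW TABLES;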
+