Posted to commits@drill.apache.org by br...@apache.org on 2015/09/09 02:11:19 UTC

drill git commit: remove comment from Querying HBase

Repository: drill
Updated Branches:
  refs/heads/gh-pages be2f22234 -> 9bc38d563


remove comment from Querying HBase

remove extraneous punctuation

fix BE notation in table

another BE fix

minor edit

code punctuation

code spacing

Bridget's architecture edits

Daniel's changes

add new image


Project: http://git-wip-us.apache.org/repos/asf/drill/repo
Commit: http://git-wip-us.apache.org/repos/asf/drill/commit/9bc38d56
Tree: http://git-wip-us.apache.org/repos/asf/drill/tree/9bc38d56
Diff: http://git-wip-us.apache.org/repos/asf/drill/diff/9bc38d56

Branch: refs/heads/gh-pages
Commit: 9bc38d563822aba4daf2d3c8282d233aa45cc7e2
Parents: be2f222
Author: Kristine Hahn <kh...@maprtech.com>
Authored: Sat Sep 5 11:29:45 2015 -0700
Committer: Kristine Hahn <kh...@maprtech.com>
Committed: Tue Sep 8 17:02:36 2015 -0700

----------------------------------------------------------------------
 _data/docs.json                                 |   4 +-
 .../010-architecture-introduction.md            |  24 ++++++------
 _docs/architecture/015-drill-query-execution.md |  12 +++---
 _docs/architecture/020-core-modules.md          |  16 ++++----
 _docs/architecture/030-performance.md           |  31 ++++++++--------
 _docs/getting-started/020-why-drill.md          |  12 +++---
 _docs/img/query-flow-client.png                 | Bin 13734 -> 11366 bytes
 _docs/query-data/030-querying-hbase.md          |  37 +++++++++----------
 _docs/tutorials/010-tutorials-introduction.md   |  12 +++---
 _docs/tutorials/020-drill-in-10-minutes.md      |  34 ++++++++---------
 10 files changed, 88 insertions(+), 94 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/drill/blob/9bc38d56/_data/docs.json
----------------------------------------------------------------------
diff --git a/_data/docs.json b/_data/docs.json
index 5a3ca09..c6af217 100644
--- a/_data/docs.json
+++ b/_data/docs.json
@@ -1,4 +1,4 @@
-{  
+{
     "by_title": {
         "2014 Q1 Drill Report": {
             "breadcrumbs": [
@@ -15955,4 +15955,4 @@
             "url": "/docs/project-bylaws/"
         }
     ]
-}
+}
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/9bc38d56/_docs/architecture/010-architecture-introduction.md
----------------------------------------------------------------------
diff --git a/_docs/architecture/010-architecture-introduction.md b/_docs/architecture/010-architecture-introduction.md
index 9fb5318..b802c72 100755
--- a/_docs/architecture/010-architecture-introduction.md
+++ b/_docs/architecture/010-architecture-introduction.md
@@ -15,7 +15,7 @@ metadata repository.
 ## High-Level Architecture
 
 Drill includes a distributed execution environment, purpose built for large-
-scale data processing. At the core of Apache Drill is the ‘Drillbit’ service,
+scale data processing. At the core of Apache Drill is the "Drillbit" service,
 which is responsible for accepting requests from the client, processing the
 queries, and returning results to the client.
 
@@ -28,7 +28,7 @@ uses ZooKeeper to maintain cluster membership and health-check information.
 Though Drill works in a Hadoop cluster environment, Drill is not tied to
 Hadoop and can run in any distributed cluster environment. The only pre-requisite for Drill is Zookeeper.
 
-See Drill Query Execution.
+See [Drill Query Execution]({{ site.baseurl }}/docs/drill-query-execution/).
 
 ## Drill Clients
 
@@ -45,32 +45,30 @@ Drill does not require schema or type specification for data in order to start
 the query execution process. Drill starts data processing in record-batches
 and discovers the schema during processing. Self-describing data formats such
 as Parquet, JSON, AVRO, and NoSQL databases have schema specified as part of
-the data itself, which Drill leverages dynamically at query time. Because
-schema can change over the course of a Drill query, all Drill operators are
+the data itself, which Drill leverages dynamically at query time. Because the
+schema can change over the course of a Drill query, many Drill operators are
 designed to reconfigure themselves when schemas change.
 
 ### **_Flexible data model_**
 
-Drill allows access to nested data attributes, just like SQL columns, and
+Drill allows access to nested data attributes, as if they were SQL columns, and
 provides intuitive extensions to easily operate on them. From an architectural
 point of view, Drill provides a flexible hierarchical columnar data model that
-can represent complex, highly dynamic and evolving data models. Drill allows
-for efficient processing of these models without the need to flatten or
-materialize them at design time or at execution time. Relational data in Drill
+can represent complex, highly dynamic and evolving data models. Relational data in Drill
 is treated as a special or simplified case of complex/multi-structured data.
 
-### **_De-centralized metadata_**
+### **_No centralized metadata_**
 
 Drill does not have a centralized metadata requirement. You do not need to
 create and manage tables and views in a metadata repository, or rely on a
 database administrator group for such a function. Drill metadata is derived
-from the storage plugins that correspond to data sources. Storage plugins
+through the storage plugins that correspond to data sources. Storage plugins
 provide a spectrum of metadata ranging from full metadata (Hive), partial
 metadata (HBase), or no central metadata (files). De-centralized metadata
 means that Drill is NOT tied to a single Hive repository. You can query
 multiple Hive repositories at once and then combine the data with information
 from HBase tables or with a file in a distributed file system. You can also
-use SQL DDL syntax to create metadata within Drill, which gets organized just
+use SQL DDL statements to create metadata within Drill, which gets organized just
 like a traditional database. Drill metadata is accessible through the ANSI
 standard INFORMATION_SCHEMA database.
 
@@ -79,6 +77,6 @@ standard INFORMATION_SCHEMA database.
 Drill provides an extensible architecture at all layers, including the storage
 plugin, query, query optimization/execution, and client API layers. You can
 customize any layer for the specific needs of an organization or you can
-extend the layer to a broader array of use cases. Drill provides a built in
-classpath scanning and plugin concept to add additional storage plugins,
+extend the layer to a broader array of use cases. Drill uses
+classpath scanning to find and load plugins, letting you add storage plugins,
 functions, and operators with minimal configuration.
\ No newline at end of file
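
The INFORMATION_SCHEMA database mentioned above can be queried like any other schema. A minimal sketch (which metadata rows come back depends on the storage plugins you have configured):

    -- List the schemas Drill currently knows about
    SELECT SCHEMA_NAME FROM INFORMATION_SCHEMA.SCHEMATA;

    -- List tables; TABLES is a reserved word, so it is backquoted
    SELECT TABLE_SCHEMA, TABLE_NAME FROM INFORMATION_SCHEMA.`TABLES`;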

http://git-wip-us.apache.org/repos/asf/drill/blob/9bc38d56/_docs/architecture/015-drill-query-execution.md
----------------------------------------------------------------------
diff --git a/_docs/architecture/015-drill-query-execution.md b/_docs/architecture/015-drill-query-execution.md
index 8e04163..c88c476 100755
--- a/_docs/architecture/015-drill-query-execution.md
+++ b/_docs/architecture/015-drill-query-execution.md
@@ -9,7 +9,7 @@ The following image represents the communication between clients, applications,
 
 ![]({{ site.baseurl }}/docs/img/query-flow-client.png)
 
-The Drillbit that receives the query from a client or application becomes the Foreman for the query and drives the entire query. A parser in the Foreman parses the SQL, applying custom rules to convert specific SQL operators into a specific logical operator syntax that Drill understands. This collection of logical operators forms a logical plan. The logical plan describes the work required to generate the query results and defines what data sources and operations to apply.
+The Drillbit that receives the query from a client or application becomes the Foreman for the query and drives the entire query. A parser in the Foreman parses the SQL, applying custom rules to convert specific SQL operators into the logical operator syntax that Drill understands. This collection of logical operators forms a logical plan. The logical plan describes the work required to generate the query results and defines which data sources and operations to apply.
 
 The Foreman sends the logical plan into a cost-based optimizer to optimize the order of SQL operators in a statement and read the logical plan. The optimizer applies various types of rules to rearrange operators and functions into an optimal plan. The optimizer converts the logical plan into a physical plan that describes how to execute the query.
 
@@ -21,24 +21,24 @@ A parallelizer in the Foreman transforms the physical plan into multiple phases,
 
 
 ## Major Fragments
-A major fragment is an abstract concept that represents a phase of the query execution. A phase can consist of one or multiple operations that Drill must perform to execute the query. Drill assigns each major fragment a MajorFragmentID.
+A major fragment is a concept that represents a phase of the query execution. A phase can consist of one or multiple operations that Drill must perform to execute the query. Drill assigns each major fragment a MajorFragmentID.
 
 For example, to perform a hash aggregation of two files, Drill may create a plan with two major phases (major fragments) where the first phase is dedicated to scanning the two files and the second phase is dedicated to the aggregation of the data.  
 
 ![]({{ site.baseurl }}/docs/img/ex-operator.png)
 
-Drill separates major fragments by an exchange operator. An exchange is a change in data location and/or parallelization of the physical plan. An exchange is composed of a sender and a receiver to allow data to move between nodes. 
+Drill uses an exchange operator to separate major fragments. An exchange is a change in data location and/or parallelization of the physical plan. An exchange is composed of a sender and a receiver to allow data to move between nodes. 
 
 Major fragments do not actually perform any query tasks. Each major fragment is divided into one or multiple minor fragments (discussed in the next section) that actually execute the operations required to complete the query and return results back to the client.
 
-You can interact with major fragments within the physical plan by capturing a JSON representation of the plan in a file, manually modifying it, and then submitting it back to Drill using the SUBMIT PLAN command. You can also view major fragments in the query profile, which is visible in the Drill Web UI. See [EXPLAIN ]({{ site.baseurl }}/docs/explain/)and [Query Profiles]({{ site.baseurl }}/docs/query-profiles/) for more information.
+You can work with major fragments within the physical plan by capturing a JSON representation of the plan in a file, manually modifying it, and then submitting it back to Drill using the SUBMIT PLAN command. You can also view major fragments in the query profile, which is visible in the Drill Web UI. See [EXPLAIN]({{ site.baseurl }}/docs/explain/) and [Query Profiles]({{ site.baseurl }}/docs/query-profiles/) for more information.
 
 ## Minor Fragments
-Each major fragment is parallelized into minor fragments. A minor fragment is a logical unit of work that runs inside of a thread. A logical unit of work in Drill is also referred to as a slice. The execution plan that Drill creates is composed of minor fragments. Drill assigns each minor fragment a MinorFragmentID.  
+Each major fragment is parallelized into minor fragments. A minor fragment is a logical unit of work that runs inside a thread. A logical unit of work in Drill is also referred to as a slice. The execution plan that Drill creates is composed of minor fragments. Drill assigns each minor fragment a MinorFragmentID.  
 
 ![]({{ site.baseurl }}/docs/img/min-frag.png)
 
-The parallelizer in the Foreman creates one or more minor fragments from a major fragment at execution time, by breaking a major fragment into as many minor fragments as it can run simultaneously on the cluster.
+The parallelizer in the Foreman creates one or more minor fragments from a major fragment at execution time, by breaking a major fragment into as many minor fragments as it can usefully run at the same time on the cluster.
 
 Drill executes each minor fragment in its own thread as quickly as possible based on its upstream data requirements. Drill schedules the minor fragments on nodes with data locality. Otherwise, Drill schedules them in a round-robin fashion on the existing, available Drillbits.
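
One concrete way to see the fragment structure described above is to capture a query plan with EXPLAIN. A sketch, using the sample file bundled with Drill (the NN-NN operator numbering in the output reflects the fragment IDs the parallelizer assigns):

    -- Show the physical plan Drill generates for a simple query
    EXPLAIN PLAN FOR
    SELECT * FROM cp.`employee.json` LIMIT 3;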
 

http://git-wip-us.apache.org/repos/asf/drill/blob/9bc38d56/_docs/architecture/020-core-modules.md
----------------------------------------------------------------------
diff --git a/_docs/architecture/020-core-modules.md b/_docs/architecture/020-core-modules.md
old mode 100644
new mode 100755
index 4c5347f..96d7dfa
--- a/_docs/architecture/020-core-modules.md
+++ b/_docs/architecture/020-core-modules.md
@@ -8,20 +8,18 @@ The following image represents components within each Drillbit:
 
 The following list describes the key components of a Drillbit:
 
-  * **RPC end point**: Drill exposes a low overhead protobuf-based RPC protocol to communicate with the clients. Additionally, a C++ and Java API layers are also available for the client applications to interact with Drill. Clients can communicate to a specific Drillbit directly or go through a ZooKeeper quorum to discover the available Drillbits before submitting queries. It is recommended that the clients always go through ZooKeeper to shield clients from the intricacies of cluster management, such as the addition or removal of nodes. 
  * **RPC endpoint**: Drill exposes a low overhead protobuf-based RPC protocol to communicate with the clients. Additionally, C++ and Java API layers are available for client applications to interact with Drill. Clients can communicate with a specific Drillbit directly or go through a ZooKeeper quorum to discover the available Drillbits before submitting queries. It is recommended that clients always go through ZooKeeper to shield them from the intricacies of cluster management, such as the addition or removal of nodes.
 
-  * **SQL parser**: Drill uses [Calcite](https://calcite.incubator.apache.org/), the open source framework, to parse incoming queries. The output of the parser component is a language agnostic, computer-friendly logical plan that represents the query. 
-  * **Storage plugin interfaces**: Drill serves as a query layer on top of several data sources. Storage plugins in Drill represent the abstractions that Drill uses to interact with the data sources. Storage plugins provide Drill with the following information:
+  * **SQL parser**: Drill uses [Calcite](https://calcite.incubator.apache.org/), the open source SQL parser framework, to parse incoming queries. The output of the parser component is a language-agnostic, computer-friendly logical plan that represents the query. 
+  * **Storage plugin interface**: Drill serves as a query layer on top of several data sources. Storage plugins in Drill represent the abstractions that Drill uses to interact with the data sources. Storage plugins provide Drill with the following information:
     * Metadata available in the source
     * Interfaces for Drill to read from and write to data sources
-    * Location of data and a set of optimization rules to help with efficient and faster execution of Drill queries on a specific data source 
+    * Location of data and a set of optimization rules to help with efficient and fast execution of Drill queries on a specific data source 
 
-    In the context of Hadoop, Drill provides storage plugins for files and
-HBase. Drill also integrates with Hive as a storage plugin since Hive
-provides a metadata abstraction layer on top of files, HBase, and provides
-libraries to read data and operate on these sources (Serdes and UDFs).
+In the context of Hadoop, Drill provides storage plugins for distributed files and
+HBase. Drill also integrates with Hive using a storage plugin.
 
-    When users query files and HBase with Drill, they can do it directly or go
+When users query files and HBase with Drill, they can do it directly or go
 through Hive if they have metadata defined there. Drill integration with Hive
 is only for metadata. Drill does not invoke the Hive execution engine for any
 requests.
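
As a quick illustration of that last point, the same data can be reached either through Hive metadata or directly through a file-system plugin. A sketch only; the hive and dfs plugin names and the orders table/path are assumptions based on a typical configuration:

    -- Through Hive: reuse table metadata defined in the Hive metastore
    SELECT * FROM hive.`orders` LIMIT 10;

    -- Directly: point Drill at the underlying files, bypassing the metastore
    SELECT * FROM dfs.`/user/hive/warehouse/orders` LIMIT 10;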

http://git-wip-us.apache.org/repos/asf/drill/blob/9bc38d56/_docs/architecture/030-performance.md
----------------------------------------------------------------------
diff --git a/_docs/architecture/030-performance.md b/_docs/architecture/030-performance.md
old mode 100644
new mode 100755
index 4da897e..b771dba
--- a/_docs/architecture/030-performance.md
+++ b/_docs/architecture/030-performance.md
@@ -9,47 +9,46 @@ performance:
 **_Distributed engine_**
 
 Drill provides a powerful distributed execution engine for processing queries.
-Users can submit requests to any node in the cluster. You can simply add new
-nodes to the cluster to scale for larger volumes of data, support more users
-or to improve performance.
+Users can submit requests to any node in the cluster. You can add new
+nodes to the cluster to scale for larger volumes of data, support more users,
+or improve performance.
 
 **_Columnar execution_**
 
 Drill optimizes for both columnar storage and execution by using an in-memory
 data model that is hierarchical and columnar. When working with data stored in
 columnar formats such as Parquet, Drill avoids disk access for columns that
-are not involved in an analytic query. Drill also provides an execution layer
-that performs SQL processing directly on columnar data without row
+are not involved in a query. Drill's execution layer also 
+performs SQL processing directly on columnar data without row
 materialization. The combination of optimizations for columnar storage and
 direct columnar execution significantly lowers memory footprints and provides
-faster execution of BI/Analytic type of workloads.
+faster execution of BI and analytic workloads.
 
 **_Vectorization_**
 
 Rather than operating on single values from a single table record at one time,
 vectorization in Drill allows the CPU to operate on vectors, referred to as a
-Record Batches. Record Batches are arrays of values from many different
+record batch. A record batch contains arrays of values from many different
 records. The technical basis for efficiency of vectorized processing is modern
 chip technology with deep-pipelined CPU designs. Keeping all pipelines full to
-achieve efficiency near peak performance is something impossible to achieve in
+achieve efficiency near peak performance is impossible in
 traditional database engines, primarily due to code complexity.
 
 **_Runtime compilation_**
 
-Runtime compilation is faster compared to the interpreted execution. Drill
-generates highly efficient custom code for every single query for every single
-operator. Here is a quick overview of the Drill compilation/code generation
-process at a glance.
+Runtime compilation enables faster execution than interpreted execution. Drill
+generates highly efficient custom code for every single query. 
+The following image shows the Drill compilation/code generation
+process:
 
 ![drill compiler]({{ site.baseurl }}/docs/img/58.png)
 
 **_Optimistic and pipelined query execution_**
 
-Drill adopts an optimistic execution model to process queries. Drill assumes
-that failures are infrequent within the short span of a query and therefore
+Using an optimistic execution model to process queries, Drill assumes
+that failures are infrequent within the short span of a query. Drill 
 does not spend time creating boundaries or checkpoints to minimize recovery
-time. Failures at node level are handled gracefully. In the instance of a
-single query failure, the query is rerun. Drill execution uses a pipeline
+time. If a single query fails, Drill reruns it. Drill execution uses a pipeline
 model where all tasks are scheduled at once. The query execution happens in-
 memory as much as possible to move data through task pipelines, persisting to
 disk only if there is memory overflow.
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/drill/blob/9bc38d56/_docs/getting-started/020-why-drill.md
----------------------------------------------------------------------
diff --git a/_docs/getting-started/020-why-drill.md b/_docs/getting-started/020-why-drill.md
index b522844..f0dee66 100644
--- a/_docs/getting-started/020-why-drill.md
+++ b/_docs/getting-started/020-why-drill.md
@@ -5,9 +5,9 @@ parent: "Getting Started"
 
 ## Top 10 Reasons to Use Drill
 
-### 1. Get started in minutes
+## 1. Get started in minutes
 
-It takes just a few minutes to get started with Drill. Untar the Drill software on your Mac or Windows laptop and run a query on a local file. No need to set up any infrastructure or to define schemas. Just point to the data, such as data in a file, directory, HBase table, and drill.
+It takes just a few minutes to get started with Drill. Untar the Drill software on your Linux, Mac, or Windows laptop and run a query on a local file. No need to set up any infrastructure or to define schemas. Just point to the data in a file, directory, or HBase table, and drill.
 
     $ tar -xvf apache-drill-<version>.tar.gz
     $ <install directory>/bin/drill-embedded
@@ -37,14 +37,14 @@ Using Drill's schema-free JSON model, you can query complex, semi-structured dat
 
 
 ## 4. Real SQL -- not "SQL-like"
-Drill supports the standard SQL:2003 syntax. No need to learn a new "SQL-like" language or struggle with a semi-functional BI tool. Drill supports many data types including DATE, INTERVALDAY/INTERVALYEAR, TIMESTAMP, and VARCHAR, as well as complex query constructs such as correlated sub-queries and joins in WHERE clauses. Here is an example of a TPC-H standard query that runs in Drill "as is":
+Drill supports the standard SQL:2003 syntax. No need to learn a new "SQL-like" language or struggle with a semi-functional BI tool. Drill supports many data types including DATE, INTERVAL, TIMESTAMP, and VARCHAR, as well as complex query constructs such as correlated sub-queries and joins in WHERE clauses. Here is an example of a TPC-H standard query that runs in Drill:
 
 ### TPC-H query 4
 
-    SELECT  o.o_orderpriority, count(*) AS order_count
+    SELECT  o.o_orderpriority, COUNT(*) AS order_count
     FROM orders o
-    WHERE o.o_orderdate >= date '1996-10-01'
-          AND o.o_orderdate < date '1996-10-01' + interval '3' month
+    WHERE o.o_orderdate >= DATE '1996-10-01'
+          AND o.o_orderdate < DATE '1996-10-01' + INTERVAL '3' MONTH
           AND EXISTS(
                      SELECT * FROM lineitem l 
                      WHERE l.l_orderkey = o.o_orderkey

http://git-wip-us.apache.org/repos/asf/drill/blob/9bc38d56/_docs/img/query-flow-client.png
----------------------------------------------------------------------
diff --git a/_docs/img/query-flow-client.png b/_docs/img/query-flow-client.png
index 10fe24f..0ae87fc 100755
Binary files a/_docs/img/query-flow-client.png and b/_docs/img/query-flow-client.png differ

http://git-wip-us.apache.org/repos/asf/drill/blob/9bc38d56/_docs/query-data/030-querying-hbase.md
----------------------------------------------------------------------
diff --git a/_docs/query-data/030-querying-hbase.md b/_docs/query-data/030-querying-hbase.md
index 474d1fe..ecd2fe6 100644
--- a/_docs/query-data/030-querying-hbase.md
+++ b/_docs/query-data/030-querying-hbase.md
@@ -2,19 +2,19 @@
 title: "Querying HBase"
 parent: "Query Data"
 ---
-<!-- 
-To use Drill to query HBase data, you need to understand how to work with the HBase byte arrays. If you want Drill to interpret the underlying HBase row key as something other than a byte array, you need to know the encoding of the data in HBase. By default, HBase stores data in little endian and Drill assumes the data is little endian, which is unsorted. The following table shows the sorting of typical rowkey IDs in bytes, encoded in little endian and big endian, respectively:
+
+To use Drill to query HBase data, you need to understand how to work with the HBase byte arrays. If you want Drill to interpret the underlying HBase row key as something other than a byte array, you need to know the encoding of the data in HBase. By default, HBase stores data in little endian and Drill assumes the data is little endian, which does not preserve numeric order when row keys sort as bytes. The following table shows the sorting of typical row key IDs in bytes, encoded in little endian and big endian, respectively:
 
 | IDs in Byte Notation Little Endian Sorting | IDs in Decimal Notation | IDs in Byte Notation Big Endian Sorting | IDs in Decimal Notation |
 |--------------------------------------------|-------------------------|-----------------------------------------|-------------------------|
-| 0 x 010000 . . . 000                       | 1                       | 0 x 010000 . . . 000                    | 1                       |
-| 0 x 010100 . . . 000                       | 17                      | 0 x 020000 . . . 000                    | 2                       |
-| 0 x 020000 . . . 000                       | 2                       | 0 x 030000 . . . 000                    | 3                       |
-| . . .                                      |                         | 0 x 040000 . . . 000                    | 4                       |
-| 0x 050000 . . . 000                        | 5                       | 0 x 050000 . . . 000                    | 5                       |
+| 0 x 010000 . . . 000                       | 1                       | 0 x 00000001                            | 1                       |
+| 0 x 010100 . . . 000                       | 17                      | 0 x 00000002                            | 2                       |
+| 0 x 020000 . . . 000                       | 2                       | 0 x 00000003                            | 3                       |
+| . . .                                      |                         | 0 x 00000004                            | 4                       |
+| 0 x 050000 . . . 000                       | 5                       | 0 x 00000005                            | 5                       |
 | . . .                                      |                         | . . .                                   |                         |
-| 0 x 0A000000                               | 10                      | 0 x 0A0000 . . . 000                    | 10                      |
-|                                            |                         | 0 x 010100 . . . 000                    | 17                      |
+| 0 x 0A0000 . . . 000                       | 10                      | 0 x 0000000A                            | 10                      |
+|                                            |                         | 0 x 00000101                            | 17                      |
 
 ## Querying Big Endian-Encoded Data
 
@@ -30,16 +30,16 @@ For example, Drill returns results performantly when you use the following query
 
 ```
 SELECT
- CONVERT_FROM(BYTE_SUBSTR(row_key, 1, 8), 'DATE_EPOCH_BE') d
-, CONVERT_FROM(BYTE_SUBSTR(row_key, 9, 8), 'BIGINT_BE') id
-, CONVERT_FROM(tableName.f.c, 'UTF8') 
- FROM hbase.`TestTableCompositeDate` tableName
- WHERE
- CONVERT_FROM(BYTE_SUBSTR(row_key, 1, 8), 'DATE_EPOCH_BE') < DATE '2015-06-18' AND
- CONVERT_FROM(BYTE_SUBSTR(row_key, 1, 8), 'DATE_EPOCH_BE') > DATE '2015-06-13';
+  CONVERT_FROM(BYTE_SUBSTR(row_key, 1, 8), 'DATE_EPOCH_BE') d,
+  CONVERT_FROM(BYTE_SUBSTR(row_key, 9, 8), 'BIGINT_BE') id,
+  CONVERT_FROM(tableName.f.c, 'UTF8') 
+FROM hbase.`TestTableCompositeDate` tableName
+WHERE
+  CONVERT_FROM(BYTE_SUBSTR(row_key, 1, 8), 'DATE_EPOCH_BE') < DATE '2015-06-18' AND
+  CONVERT_FROM(BYTE_SUBSTR(row_key, 1, 8), 'DATE_EPOCH_BE') > DATE '2015-06-13';
 ```
 
-This query assumes that the row key of the table represents the DATE_EPOCH type encoded in big-endian format. The Drill HBase plugin will be able to prune the scan range since there is a condition on the big endian-encoded prefix of the row key. For more examples, see the [test code:](https://github.com/apache/drill/blob/95623912ebf348962fe8a8846c5f47c5fdcf2f78/contrib/storage-hbase/src/test/java/org/apache/drill/hbase/TestHBaseFilterPushDown.java).
+This query assumes that the row key of the table represents the DATE_EPOCH type encoded in big-endian format. The Drill HBase plugin will be able to prune the scan range since there is a condition on the big endian-encoded prefix of the row key. For more examples, see the [test code](https://github.com/apache/drill/blob/95623912ebf348962fe8a8846c5f47c5fdcf2f78/contrib/storage-hbase/src/test/java/org/apache/drill/hbase/TestHBaseFilterPushDown.java).
 
 To query HBase data:
 
@@ -71,12 +71,11 @@ WHERE
   CONVERT_FROM(row_key, 'INT_OB') < cast(59 as INT);
 ```
 
-For more examples, see the [test code:](https://github.com/apache/drill/blob/95623912ebf348962fe8a8846c5f47c5fdcf2f78/contrib/storage-hbase/src/test/java/org/apache/drill/hbase/TestHBaseFilterPushDown.java).
+For more examples, see the [test code](https://github.com/apache/drill/blob/95623912ebf348962fe8a8846c5f47c5fdcf2f78/contrib/storage-hbase/src/test/java/org/apache/drill/hbase/TestHBaseFilterPushDown.java).
 
 By taking advantage of ordered byte encoding, Drill 1.2 and later can performantly execute conditional queries without a secondary index on HBase big endian data. 
 
 ## Querying Little Endian-Encoded Data
- -->
 As mentioned earlier, HBase stores data in little endian by default and Drill assumes the data is encoded in little endian. This exercise involves working with data that is encoded in little endian. First, you create two tables in HBase, students and clicks, that you can query with Drill. You use the CONVERT_TO and CONVERT_FROM functions to convert binary text to/from typed data. You use the CAST function to convert the binary data to an INT in step 4 of [Query HBase Tables]({{site.baseurl}}/docs/querying-hbase/#query-hbase-tables). When converting an INT or BIGINT number, having a byte count in the destination/source that does not match the byte count of the number in the binary source/destination, use CAST.
 
 ### Create the HBase tables
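
The CONVERT_FROM/CONVERT_TO pattern the paragraph above describes looks like this in practice. A minimal sketch; the students table and its account.name column are assumptions based on the tutorial tables created in the steps that follow:

    -- Decode the binary row key and a column value as UTF-8 text
    SELECT CONVERT_FROM(row_key, 'UTF8') AS studentid,
           CONVERT_FROM(students.account.`name`, 'UTF8') AS name
    FROM hbase.`students`;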

http://git-wip-us.apache.org/repos/asf/drill/blob/9bc38d56/_docs/tutorials/010-tutorials-introduction.md
----------------------------------------------------------------------
diff --git a/_docs/tutorials/010-tutorials-introduction.md b/_docs/tutorials/010-tutorials-introduction.md
index a2f439b..cadc147 100644
--- a/_docs/tutorials/010-tutorials-introduction.md
+++ b/_docs/tutorials/010-tutorials-introduction.md
@@ -5,17 +5,17 @@ parent: "Tutorials"
 If you've never used Drill, use these tutorials to download, install, and start working with Drill. The tutorials include step-by-step procedures for the following tasks:
 
 * [Drill in 10 Minutes]({{site.baseurl}}/docs/drill-in-10-minutes)  
-  Download and install Drill in embedded mode, which means you use a single-node cluster.  
+  Download, install, and start Drill in embedded mode (single-node cluster mode).  
 * [Analyzing the Yelp Academic Dataset]({{site.baseurl}}/docs/analyzing-the-yelp-academic-dataset)  
   Download and install Drill in embedded mode and use SQL examples to analyze Yelp data.  
 * [Learn Drill with the MapR Sandbox]({{site.baseurl}}/docs/about-the-mapr-sandbox)  
   Explore data using a Hadoop environment pre-configured with Drill.  
 * [Analyzing Highly Dynamic Datasets]({{site.baseurl}}/docs/analyzing-highly-dynamic-datasets)  
-  Delve into changing data without creating a schema or going through an ETL phase.
+  Learn how to handle dynamic data without creating a schema or going through an ETL phase.
 * [Analyzing Social Media]({{site.baseurl}}/docs/analyzing-social-media)  
-  Analyze Twitter data in native JSON format using Apache Drill.  
+  Analyze Twitter data in its native JSON format using Drill.  
 * [Tableau Examples]({{site.baseurl}}/docs/tableau-examples)  
-  Access Hive tables in Tableau.  
+  Access Hive tables using Drill and Tableau.  
 * [Using MicroStrategy Analytics with Apache Drill]({{site.baseurl}}/docs/using-microstrategy-analytics-with-apache-drill/)  
   Use the Drill ODBC driver from MapR to analyze data and generate a report using Drill from the MicroStrategy UI.  
 * [Using Tibco Spotfire Desktop with Drill]({{site.baseurl}}/docs/using-tibco-spotfire-desktop-with-drill/)  
@@ -27,7 +27,7 @@ If you've never used Drill, use these tutorials to download, install, and start
 * [Using Apache Drill with Tableau 9 Server]({{site.baseurl}}/docs/using-apache-drill-with-tableau-9-server)  
   Connect Tableau 9 Server to Apache Drill, explore multiple data formats on Hadoop, access semi-structured data, and share Tableau visualizations with others.  
 * [Using Drill to Analyze Amazon Spot Prices](https://github.com/vicenteg/spot-price-history#drill-workshop---amazon-spot-prices)  
-  A Drill workshop on github that covers views of JSON and Parquet data.  
+  Use a Drill workshop on GitHub to create views of JSON and Parquet data.  
 * [Running Drill Queries on S3 Data](http://drill.apache.org/blog/2014/12/09/running-sql-queries-on-amazon-s3/)  
-  Nick Amato's blog that steps through querying files using Drill and Amazon Simple Storage Service (S3).  
+  Step through querying files using Drill and Amazon Simple Storage Service (S3).  
 

http://git-wip-us.apache.org/repos/asf/drill/blob/9bc38d56/_docs/tutorials/020-drill-in-10-minutes.md
----------------------------------------------------------------------
diff --git a/_docs/tutorials/020-drill-in-10-minutes.md b/_docs/tutorials/020-drill-in-10-minutes.md
index 3cfe4e4..ed2ca25 100755
--- a/_docs/tutorials/020-drill-in-10-minutes.md
+++ b/_docs/tutorials/020-drill-in-10-minutes.md
@@ -11,11 +11,11 @@ without having to perform any setup tasks.
 
 ## Installation Overview
 
-You can install Drill in embedded mode on a machine running Linux, Mac OS X, or Windows. For information about installing Drill in distributed mode, see [Installing Drill in Distributed Mode]({{ site.baseurl }}/docs/installing-drill-in-distributed-mode).
+You can install Drill to run in embedded mode on a machine running Linux, Mac OS X, or Windows. For information about installing Drill to run in distributed mode, see [Installing Drill in Distributed Mode]({{ site.baseurl }}/docs/installing-drill-in-distributed-mode).
 
-This installation procedure includes how to download the Apache Drill archive and extract the contents to a directory on your machine. The Apache Drill archive contains sample JSON and Parquet files that you can query immediately.
+This installation procedure includes how to download the Apache Drill archive file and extract the contents to a directory on your machine. The Apache Drill archive contains sample JSON and Parquet files that you can query immediately.
 
-After installing Drill, you start the Drill shell. The Drill shell is a pure-Java console-based utility for connecting to relational databases and executing SQL commands. Drill follows the ANSI SQL: 2011 standard with [extensions]({{site.baseurl}}/docs/sql-extensions/) for nested data formats and other capabilities.
+After installing Drill, you start the Drill shell. The Drill shell is a pure-Java console-based utility for connecting to relational databases and executing SQL commands. Drill follows the SQL:2011 standard with [extensions]({{site.baseurl}}/docs/sql-extensions/) for nested data formats and other capabilities.
 
 ## Embedded Mode Installation Prerequisites
 
@@ -23,15 +23,15 @@ You need to meet the following prerequisites to run Drill:
 
 * Linux, Mac OS X, and Windows: [Oracle Java SE Development (JDK) Kit 7](http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html) installation  
 * Windows only:  
-  * A JAVA_HOME environment variable set up that points to  to the JDK installation  
-  * A PATH environment variable that includes a pointer to the JDK installation  
-  * A third-party utility for unzipping a tar.gz file 
+  * A JAVA_HOME environment variable that points to the JDK installation  
+  * A PATH environment variable that includes the bin directory of the JDK installation  
+  * A third-party utility for unzipping a .tar.gz file 
   
 ### Java Installation Prerequisite Check
 
 Run the following command in a terminal (Linux and Mac OS X) or Command Prompt (Windows) to verify that Java 7 is the version in effect:
 
-    java -version
+`java -version`
 
 The output looks something like this:
 
@@ -43,7 +43,7 @@ The output looks something like this:
 
 Complete the following steps to install Drill:  
 
-1. In a terminal windows, change to the directory where you want to install Drill.
+1. In a terminal window, change to the directory where you want to install Drill.
 
 2. To download the latest version of Apache Drill, download Drill from the [Drill web site](http://getdrill.org/drill/download/apache-drill-1.1.0.tar.gz) or run one of the following commands, depending on which you have installed on your system:
 
@@ -52,11 +52,11 @@ Complete the following steps to install Drill:
 
 3. Copy the downloaded file to the directory where you want to install Drill. 
 
-4. Extract the contents of the Drill tar.gz file. Use sudo only if necessary:  
+4. Extract the contents of the Drill tar.gz file. Use sudo if necessary:  
 
-        tar -xvzf apache-drill-1.1.0.tar.gz  
+    `tar -xvzf apache-drill-1.1.0.tar.gz`  
 
-The extraction process creates the installation directory named apache-drill-1.1.0 containing the Drill software.
+The extraction process creates an installation directory containing the Drill software.
 
 At this point, you can start Drill.
 
@@ -65,11 +65,11 @@ Start Drill in embedded mode using the `drill-embedded` command:
 
 1. Navigate to the Drill installation directory. For example:  
 
-        cd apache-drill-1.1.0  
+    `cd apache-drill-1.1.0`  
 
 2. Issue the following command to launch Drill in embedded mode:
 
-        bin/drill-embedded  
+    `bin/drill-embedded`  
 
    The message of the day followed by the `0: jdbc:drill:zk=local>`  prompt appears.  
 
@@ -106,7 +106,7 @@ At this point, you can [run queries]({{ site.baseurl }}/docs/drill-in-10-minutes
 
 Issue the following command when you want to exit the Drill shell:
 
-    !quit
+`!quit`
 
 ## Query Sample Data
 
@@ -124,7 +124,7 @@ A sample JSON file, `employee.json`, contains fictitious employee data.
 To view the data in the `employee.json` file, submit the following SQL query
 to Drill, using the [cp (classpath) storage plugin]({{site.baseurl}}/docs/storage-plugin-registration/) to point to the file.
     
-    0: jdbc:drill:zk=local> SELECT * FROM cp.`employee.json` LIMIT 3;
+``0: jdbc:drill:zk=local> SELECT * FROM cp.`employee.json` LIMIT 3;``
 
 The query output is:
 
@@ -153,7 +153,7 @@ systems.
 To view the data in the `region.parquet` file, issue the query appropriate for
 your operating system:
 
-        SELECT * FROM dfs.`<path-to-installation>/apache-drill-<version>/sample-data/region.parquet`;
+``SELECT * FROM dfs.`<path-to-installation>/apache-drill-<version>/sample-data/region.parquet`;``
 
 The query returns the following results:
 
@@ -179,7 +179,7 @@ systems.
 To view the data in the `nation.parquet` file, issue the query appropriate for
 your operating system:
 
-          SELECT * FROM dfs.`<path-to-installation>/apache-drill-<version>/sample-data/nation.parquet`;
+``SELECT * FROM dfs.`<path-to-installation>/apache-drill-<version>/sample-data/nation.parquet`;``
 
 The query returns the following results: