You are viewing a plain text version of this content. The canonical link for it is here.
Posted to commits@drill.apache.org by br...@apache.org on 2015/05/22 01:24:20 UTC

drill git commit: add examples per Mehant

Repository: drill
Updated Branches:
  refs/heads/gh-pages 161108df3 -> 2070bbe78


add examples per Mehant


Project: http://git-wip-us.apache.org/repos/asf/drill/repo
Commit: http://git-wip-us.apache.org/repos/asf/drill/commit/2070bbe7
Tree: http://git-wip-us.apache.org/repos/asf/drill/tree/2070bbe7
Diff: http://git-wip-us.apache.org/repos/asf/drill/diff/2070bbe7

Branch: refs/heads/gh-pages
Commit: 2070bbe7846e60a99115a6bccb70c78f974ffd5b
Parents: 161108d
Author: Kristine Hahn <kh...@maprtech.com>
Authored: Thu May 21 16:21:48 2015 -0700
Committer: Kristine Hahn <kh...@maprtech.com>
Committed: Thu May 21 16:21:48 2015 -0700

----------------------------------------------------------------------
 _docs/connect-a-data-source/040-workspaces.md   |  2 +-
 .../010-embedded-mode-prerequisites.md          |  2 +-
 .../data-types/010-supported-data-types.md      | 11 ++-
 .../data-types/020-date-time-and-timestamp.md   | 41 ++++++++--
 .../sql-functions/020-data-type-conversion.md   | 30 ++++++--
 .../030-date-time-functions-and-arithmetic.md   | 78 ++++++++++++++------
 _docs/tutorials/010-tutorials-introduction.md   |  2 +-
 7 files changed, 124 insertions(+), 42 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/drill/blob/2070bbe7/_docs/connect-a-data-source/040-workspaces.md
----------------------------------------------------------------------
diff --git a/_docs/connect-a-data-source/040-workspaces.md b/_docs/connect-a-data-source/040-workspaces.md
index 9885b1a..864c7ce 100644
--- a/_docs/connect-a-data-source/040-workspaces.md
+++ b/_docs/connect-a-data-source/040-workspaces.md
@@ -57,7 +57,7 @@ files and tables in the `file` or `hive default` workspaces, you can omit the
 workspace name from the query.
 
 For example, you can issue a query on a Hive table in the `default workspace`
-using either of the following formats and get the the same results:
+using either of the following formats and get the same results:
 
 **Example**
 

http://git-wip-us.apache.org/repos/asf/drill/blob/2070bbe7/_docs/install/installing-drill-in-embedded-mode/010-embedded-mode-prerequisites.md
----------------------------------------------------------------------
diff --git a/_docs/install/installing-drill-in-embedded-mode/010-embedded-mode-prerequisites.md b/_docs/install/installing-drill-in-embedded-mode/010-embedded-mode-prerequisites.md
index f0cd33f..920dca3 100644
--- a/_docs/install/installing-drill-in-embedded-mode/010-embedded-mode-prerequisites.md
+++ b/_docs/install/installing-drill-in-embedded-mode/010-embedded-mode-prerequisites.md
@@ -14,7 +14,7 @@ You need to meet the following prerequisites to run Drill:
 
 * Linux, Mac OS X, and Windows: [Oracle Java SE Development (JDK) Kit 7](http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html) installation  
 * Windows only:  
-  * A JAVA_HOME environment variable set up that points to  to the JDK installation  
+  * A JAVA_HOME environment variable set up that points to the JDK installation  
   * A PATH environment variable that includes a pointer to the JDK installation  
   * A third-party utility for unzipping a tar.gz file 
 

http://git-wip-us.apache.org/repos/asf/drill/blob/2070bbe7/_docs/sql-reference/data-types/010-supported-data-types.md
----------------------------------------------------------------------
diff --git a/_docs/sql-reference/data-types/010-supported-data-types.md b/_docs/sql-reference/data-types/010-supported-data-types.md
index 8008f76..45a4366 100644
--- a/_docs/sql-reference/data-types/010-supported-data-types.md
+++ b/_docs/sql-reference/data-types/010-supported-data-types.md
@@ -43,14 +43,12 @@ To enable the DECIMAL type, set the `planner.enable_decimal_data_type` option to
 
 Drill supports the following composite types:
 
-* Array
 * Map
+* Array
 
 A map is a set of name/value pairs. A value in a map can be a scalar type, such as string or int, or a complex type, such as an array or another map. An array is a repeated list of values. A value in an array can be a scalar type, such as string or int, or an array can be a complex type, such as a map or another array.
 
-Drill uses map and array data types internally for reading complex and nested data structures from data sources. For more information, see examples of [handling JSON maps and arrays]({{ site.baseurl }}/docs/handling-different-data-types/#handling-json-and-parquet-data). 
-
-In this release of Drill, you cannot reference a composite type by name in a query, but Drill supports array values coming from data sources. For example, you can use the index syntax to query data and get the value of an array element:  
+Drill uses map and array data types internally for reading complex and nested data structures from data sources. In this release of Drill, you cannot reference a composite type by name in a query, but Drill supports array values coming from data sources. For example, you can use the index syntax to query data and get the value of an array element:  
 
 `a[1]`  
 
@@ -58,9 +56,10 @@ You can refer to the value for a key in a map using this syntax:
 
 `m['k']`
 
-The section [“Query Complex Data”]({{ site.baseurl }}/docs/querying-complex-data-introduction) show how to use [composite types]({{site.baseurl}}/docs/supported-data-types/#composite-types) to access nested arrays.
+The section [“Query Complex Data”]({{ site.baseurl }}/docs/querying-complex-data-introduction) shows how to use [composite types]({{site.baseurl}}/docs/supported-data-types/#composite-types) to access nested arrays. ["Handling Different Data Types"]({{ site.baseurl }}/docs/handling-different-data-types/#handling-json-and-parquet-data) includes examples of JSON maps and arrays. Drill provides functions for handling array and map types:
 
-For more information about using array and map types, see the sections, ["KVGEN"]({{site.baseurl}}/docs/kvgen/) and ["FLATTEN"]({{site.baseurl}}/docs/flatten/).
+* ["KVGEN"]({{site.baseurl}}/docs/kvgen/)
+* ["FLATTEN"]({{site.baseurl}}/docs/flatten/)
 
 ## Casting and Converting Data Types
 

http://git-wip-us.apache.org/repos/asf/drill/blob/2070bbe7/_docs/sql-reference/data-types/020-date-time-and-timestamp.md
----------------------------------------------------------------------
diff --git a/_docs/sql-reference/data-types/020-date-time-and-timestamp.md b/_docs/sql-reference/data-types/020-date-time-and-timestamp.md
index 428ec28..dc9d17c 100644
--- a/_docs/sql-reference/data-types/020-date-time-and-timestamp.md
+++ b/_docs/sql-reference/data-types/020-date-time-and-timestamp.md
@@ -2,13 +2,28 @@
 title: "Date, Time, and Timestamp"
 parent: "Data Types"
 ---
-Using familiar date and time formats, listed in the [SQL data types table]({{ site.baseurl }}/docs/data-types/data-types), you can construct query date and time data. You need to cast textual data to date and time data types. The format of date, time, and timestamp text in a textual data source needs to match the SQL query format for successful casting. 
+Using familiar date and time formats, listed in the [SQL data types table]({{ site.baseurl }}/docs/supported-data-types), you can construct query date and time data. You need to cast textual data to date and time data types. The format of date, time, and timestamp text in a textual data source needs to match the SQL query format for successful casting. Drill supports date, time, timestamp, and interval literals shown in the following example:
+
+    SELECT DATE '2008-2-23', 
+           TIME '12:23:34', 
+           TIMESTAMP '2008-2-23 12:23:34.456', 
+           INTERVAL '1' YEAR, INTERVAL '2' DAY, 
+           DATE_ADD(DATE '2008-2-23', INTERVAL '1 10:20:30' DAY TO SECOND) 
+           DATE_ADD(DATE '2010-2-23', 1)
+    FROM sys.version LIMIT 1;
+    +-------------+-----------+--------------------------+---------+---------+------------------------+-------------+
+    |   EXPR$0    |  EXPR$1   |          EXPR$2          | EXPR$3  | EXPR$4  |         EXPR$5         |   EXPR$6    |
+    +-------------+-----------+--------------------------+---------+---------+------------------------+-------------+
+    | 2008-02-23  | 12:23:34  | 2008-02-23 12:23:34.456  | P1Y     | P2D     | 2008-02-24 10:20:30.0  | 2010-02-24  |
+    +-------------+-----------+--------------------------+---------+---------+------------------------+-------------+
 
 ## INTERVALYEAR and INTERVALDAY
 
 The INTERVALYEAR AND INTERVALDAY types represent a period of time. The INTERVALYEAR type specifies values from a year to a month. The INTERVALDAY type specifies values from a day to seconds.
 
-If your interval data is in the data source, for example a JSON file, cast the JSON VARCHAR types to intervalyear and intervalday using the folliwng ISO 8601 syntax:
+### Interval in Data Source
+
+If your interval data is in the data source, for example a JSON file, cast the JSON VARCHAR types to INTERVALYEAR and INTERVALDAY using the following ISO 8601 syntax:
 
     P [qty] Y [qty] M [qty] D T [qty] H [qty] M [qty] S
 
@@ -26,8 +41,9 @@ where:
 * M follows a number of minutes.
 * S follows a number of seconds and optional milliseconds to the right of a decimal point
 
+### Interval Literal
 
-If your input is interval data, use the following SQL literals to restrict the set of stored interval fields:
+You can use INTERVAL as a keyword that introduces an interval literal that denotes a data type. With the input of interval data, use the following SQL literals to restrict the set of stored interval fields:
 
 * YEAR
 * MONTH
@@ -43,6 +59,22 @@ If your input is interval data, use the following SQL literals to restrict the s
 * HOUR TO SECOND
 * MINUTE TO SECOND
 
+### Interval in a Data Source Example
+
+To cast interval data to interval types you can query from a data source such as JSON, see the example in the section, ["Casting Intervals"]({{site.baseurl}}/docs/data-type-conversion/#casting-intervals).
+
+### Literal Interval Exampls
+
+In the following example, the INTERVAL keyword followed by 200 adds 200 years to the timestamp. The parentheticated 3 in `YEAR(3)` specifies the precision of the year interval, 3 digits in this case to support the hundreds interval.
+
+    SELECT CURRENT_TIMESTAMP + INTERVAL '200' YEAR(3) FROM sys.version;
+    +--------------------------+
+    |          EXPR$0          |
+    +--------------------------+
+    | 2215-05-20 14:04:25.129  |
+    +--------------------------+
+    1 row selected (0.148 seconds)
+
 The following examples show the input and output format of INTERVALYEAR (Year, Month) and INTERVALDAY (Day, Hours, Minutes, Seconds, Milliseconds). The following SELECT statements show how to format the query input. The output shows how to format the data in the data source.
 
     SELECT INTERVAL '1 10:20:30.123' day to second FROM sys.version;
@@ -77,11 +109,10 @@ The following examples show the input and output format of INTERVALYEAR (Year, M
     +------------+
     1 row selected (0.076 seconds)
 
-For information about casting interval data, see the ["CAST"]({{ site.baseurl }}/docs/data-type-conversion#cast) function.
 
 ## DATE, TIME, and TIMESTAMP
 
-DATE, TIME, and TIMESTAMP store values in Coordinated Universal Time (UTC). Drill supports time functions in the range 1971 to 2037.
+DATE, TIME, and TIMESTAMP literals. Drill stores values in Coordinated Universal Time (UTC). Drill supports time functions in the range 1971 to 2037.
 
 Drill does not support TIMESTAMP with time zone; however, if your data includes the time zone, use the [TO_TIMESTAMP function]({{ site.baseurl }}/docs/casting/converting-data-types#to_timestamp) and [Joda format specifiers]({{site.baseurl}}/docs/data-type-conversion/#format-specifiers-for-date/time-conversions) as shown the examples in section, ["Time Zone Limitation"]({{site.baseurl}}/docs/data-type-conversion/#time-zone-limitation).
 

http://git-wip-us.apache.org/repos/asf/drill/blob/2070bbe7/_docs/sql-reference/sql-functions/020-data-type-conversion.md
----------------------------------------------------------------------
diff --git a/_docs/sql-reference/sql-functions/020-data-type-conversion.md b/_docs/sql-reference/sql-functions/020-data-type-conversion.md
index fe478e1..0476902 100644
--- a/_docs/sql-reference/sql-functions/020-data-type-conversion.md
+++ b/_docs/sql-reference/sql-functions/020-data-type-conversion.md
@@ -86,17 +86,20 @@ Cast an integer to a decimal.
 
 ### Casting Intervals
 
-To cast interval data to the INTERVALDAY or INTERVALYEAR types use the following syntax:
+To cast interval data to interval types you can query from a data source such as JSON, for example, use the following syntax, respectively:
 
     CAST (column_name AS INTERVAL DAY)
     CAST (column_name AS INTERVAL YEAR)
+    CAST (column_name AS INTERVAL SECOND)
 
-For example, a JSON file named intervals.json contains the following objects:
+For example, a JSON file named `intervals.json` contains the following objects:
 
     { "INTERVALYEAR_col":"P1Y", "INTERVALDAY_col":"P1D", "INTERVAL_col":"P1Y1M1DT1H1M" }
     { "INTERVALYEAR_col":"P2Y", "INTERVALDAY_col":"P2D", "INTERVAL_col":"P2Y2M2DT2H2M" }
     { "INTERVALYEAR_col":"P3Y", "INTERVALDAY_col":"P3D", "INTERVAL_col":"P3Y3M3DT3H3M" }
 
+Create a table in Parquet from the interval data in the `intervals.json` file.
+
 1. Set the storage format to Parquet.
 
         ALTER SESSION SET `store.format` = 'parquet';
@@ -110,10 +113,25 @@ For example, a JSON file named intervals.json contains the following objects:
 
 2. Use a CTAS statement to cast text from a JSON file to year and day intervals and to write the data to a Parquet table:
 
-    CREATE TABLE dfs.tmp.parquet_intervals AS 
-    (SELECT CAST( INTERVALYEAR_col as interval year) INTERVALYEAR_col, 
-            CAST( INTERVALDAY_col as interval day) INTERVALDAY_col 
-    FROM dfs.`/Users/drill/intervals.json`);
+        CREATE TABLE dfs.tmp.parquet_intervals AS 
+        (SELECT CAST( INTERVALYEAR_col as INTERVAL YEAR) INTERVALYEAR_col, 
+                CAST( INTERVALDAY_col as INTERVAL DAY) INTERVALDAY_col,
+                CAST( INTERVAL_col as INTERVAL SECOND) INTERVAL_col 
+        FROM dfs.`/Users/drill/intervals.json`);
+
+3. Take a look at what Drill wrote to the Parquet file:
+
+        SELECT * FROM dfs.`tmp`.parquet_intervals;
+        +-------------------+------------------+---------------+
+        | INTERVALYEAR_col  | INTERVALDAY_col  | INTERVAL_col  |
+        +-------------------+------------------+---------------+
+        | P12M              | P1D              | P1DT3660S     |
+        | P24M              | P2D              | P2DT7320S     |
+        | P36M              | P3D              | P3DT10980S    |
+        +-------------------+------------------+---------------+
+        3 rows selected (0.082 seconds)
+
+Because you cast the INTERVAL_col to INTERVAL SECOND, Drill returns the interval data representing the year, month, day, hour, minute, and second. 
 
 ## CONVERT_TO and CONVERT_FROM
 

http://git-wip-us.apache.org/repos/asf/drill/blob/2070bbe7/_docs/sql-reference/sql-functions/030-date-time-functions-and-arithmetic.md
----------------------------------------------------------------------
diff --git a/_docs/sql-reference/sql-functions/030-date-time-functions-and-arithmetic.md b/_docs/sql-reference/sql-functions/030-date-time-functions-and-arithmetic.md
index a7956d8..23b0983 100644
--- a/_docs/sql-reference/sql-functions/030-date-time-functions-and-arithmetic.md
+++ b/_docs/sql-reference/sql-functions/030-date-time-functions-and-arithmetic.md
@@ -36,7 +36,7 @@ Cast string arguments to timestamp to include time data in the calculations of t
 
 ### AGE Examples
 
-Find the interval between midnight April 3, 2015 and June 13, 1957.
+Find the interval between midnight today, April 3, 2015, and June 13, 1957.
 
     SELECT AGE('1957-06-13') FROM sys.version;
     +------------+
@@ -46,6 +46,17 @@ Find the interval between midnight April 3, 2015 and June 13, 1957.
     +------------+
     1 row selected (0.064 seconds)
 
+Find the interval between midnight today, May 21, 2015, and hire dates of employees 578 and 761 in the employees.json file included with the Drill installation.
+
+    SELECT AGE(CAST(hire_date AS TIMESTAMP)) FROM cp.`employee.json` where employee_id IN( '578','761');
+    +------------------+
+    |      EXPR$0      |
+    +------------------+
+    | P236MT25200S     |
+    | P211M19DT25200S  |
+    +------------------+
+    2 rows selected (0.121 seconds)
+
 Find the interval between 11:10:10 PM on January 1, 2001 and 10:10:10 PM on January 1, 2001.
 
     SELECT AGE(CAST('2010-01-01 10:10:10' AS TIMESTAMP), CAST('2001-01-01 11:10:10' AS TIMESTAMP)) FROM sys.version;
@@ -82,10 +93,6 @@ Returns the sum of a date/time and a number of days/hours, or of a date/time and
 
 ### DATE_ADD Examples
 
-Add two days to the birthday column in the genealogy database.
-
-    SELECT DATE_ADD (CAST (birthdays AS date), 2) from genealogy.json;
-
 Add two days to today's date May 15, 2015.
 
     SELECT DATE_ADD(date '2015-05-15', 2) FROM sys.version;
@@ -96,35 +103,51 @@ Add two days to today's date May 15, 2015.
     +------------+
     1 row selected (0.07 seconds)
 
-Add two months to April 15, 2015.
+Using the example data from the ["Casting Intervals"]({{site.baseurl}}/docs/data-type-conversion/#casting-intervals) section, add intervals from the `intervals.json` file to a literal timestamp using an interval expression. Create an interval expression that casts the interval data in the intervals.json file to a timestamp.
 
-    SELECT DATE_ADD(date '2015-04-15', interval '2' month) FROM sys.version;
+    SELECT DATE_ADD(timestamp '2015-04-15 22:55:55', CAST(INTERVALDAY_col as interval second)) FROM dfs.`/Users/drilluser/apache-drill-1.0.0/intervals.json`;
     +------------------------+
     |         EXPR$0         |
     +------------------------+
-    | 2015-06-15 00:00:00.0  |
+    | 2015-04-16 22:55:55.0  |
+    | 2015-04-17 22:55:55.0  |
+    | 2015-04-18 22:55:55.0  |
     +------------------------+
-    1 row selected (0.107 seconds)
+    3 rows selected (0.105 seconds)
+
+The query returns the sum of the timestamp plus 1, 2, and 3 days becuase the INTERVALDAY_col contains P1D, P2D, and P3D, 
+
+The Drill installation includes the `employee.json` file that has records of employee hire dates:
 
-Add 10 hours to the timestamp 2015-04-15 22:55:55.
+    SELECT * FROM cp.`employee.json` LIMIT 1;
+    +--------------+---------------+-------------+------------+--------------+-----------------+-----------+----------------+-------------+------------------------+----------+----------------+------------------+-----------------+---------+--------------------+
+    | employee_id  |   full_name   | first_name  | last_name  | position_id  | position_title  | store_id  | department_id  | birth_date  |       hire_date        |  salary  | supervisor_id  | education_level  | marital_status  | gender  |  management_role   |
+    +--------------+---------------+-------------+------------+--------------+-----------------+-----------+----------------+-------------+------------------------+----------+----------------+------------------+-----------------+---------+--------------------+
+    | 1            | Sheri Nowmer  | Sheri       | Nowmer     | 1            | President       | 0         | 1              | 1961-08-26  | 1994-12-01 00:00:00.0  | 80000.0  | 0              | Graduate Degree  | S               | F       | Senior Management  |
+    +--------------+---------------+-------------+------------+--------------+-----------------+-----------+----------------+-------------+------------------------+----------+----------------+------------------+-----------------+---------+--------------------+
+    1 row selected (0.137 seconds)
 
-    SELECT DATE_ADD(timestamp '2015-04-15 22:55:55', interval '10' hour) FROM sys.version;
+Look at the hire_dates for the employee 578 and 761 in `employee.json`.
+
+    SELECT hire_date FROM cp.`employee.json` where employee_id IN( '578','761');
     +------------------------+
-    |         EXPR$0         |
+    |       hire_date        |
     +------------------------+
-    | 2015-04-16 08:55:55.0  |
+    | 1996-01-01 00:00:00.0  |
+    | 1998-01-01 00:00:00.0  |
     +------------------------+
-    1 row selected (0.199 seconds)
+    2 rows selected (0.135 seconds)
 
-Add 10 hours to the time 22 hours, 55 minutes, 55 seconds.
+Cast the hire_dates of the employees 578 and 761 to a timestamp, and add 10 hours to the hire_date timestamp. Because Drill reads data from JSON as VARCHAR, you need to cast the hire_date to the TIMESTAMP type. 
 
-    SELECT DATE_ADD(time '22:55:55', interval '10' hour) FROM sys.version;
-    +------------+
-    |   EXPR$0   |
-    +------------+
-    | 08:55:55   |
-    +------------+
-    1 row selected (0.085 seconds)
+    SELECT DATE_ADD(CAST(hire_date AS TIMESTAMP), interval '10' hour) FROM cp.`employee.json` where employee_id IN( '578','761');
+    +------------------------+
+    |         EXPR$0         |
+    +------------------------+
+    | 1996-01-01 10:00:00.0  |
+    | 1998-01-01 10:00:00.0  |
+    +------------------------+
+    2 rows selected (0.172 seconds)
 
 Add 1 year and 1 month to the timestamp 2015-04-15 22:55:55.
 
@@ -211,6 +234,17 @@ Returns the difference between a date/time and a number of days/hours, or betwee
 
 ### DATE_SUB Examples
 
+Cast the hire_dates of the employees 578 and 761 to a timestamp, and add 10 hours to the hire_date timestamp. Because Drill reads data from JSON as VARCHAR, you need to cast the hire_date to the TIMESTAMP type. 
+
+    SELECT DATE_SUB(CAST(hire_date AS TIMESTAMP), interval '10' hour) FROM cp.`employee.json` WHERE employee_id IN( '578','761');
+    +------------------------+
+    |         EXPR$0         |
+    +------------------------+
+    | 1995-12-31 14:00:00.0  |
+    | 1997-12-31 14:00:00.0  |
+    +------------------------+
+    2 rows selected (0.161 seconds)
+
 Subtract two days to today's date May 15, 2015.
 
     SELECT DATE_SUB(date '2015-05-15', 2) FROM sys.version;

http://git-wip-us.apache.org/repos/asf/drill/blob/2070bbe7/_docs/tutorials/010-tutorials-introduction.md
----------------------------------------------------------------------
diff --git a/_docs/tutorials/010-tutorials-introduction.md b/_docs/tutorials/010-tutorials-introduction.md
index 46c782c..87037b6 100644
--- a/_docs/tutorials/010-tutorials-introduction.md
+++ b/_docs/tutorials/010-tutorials-introduction.md
@@ -19,7 +19,7 @@ If you've never used Drill, use these tutorials to download, install, and start
 * [Using MicroStrategy Analytics with Apache Drill]({{site.baseurl}}/docs/using-microstrategy-analytics-with-apache-drill/)  
   Use the Drill ODBC driver from MapR to analyze data and generate a report using Drill from the MicroStrategy UI.  
 * [Using Tibco Spotfire Server with Drill]({{site.baseurl}}/docs/using-tibco-spotfire-with-drill/)  
-  Use the Apache Drill to query complex data structures from Tibco Spotfire Desktop.
+  Use Apache Drill to query complex data structures from Tibco Spotfire Desktop.
 * [Configuring Tibco Spotfire Server with Drill]({{site.baseurl}}/docs/configuring-tibco-spotfire-server-with-drill)  
   Integrate Tibco Spotfire Server with Apache Drill and explore multiple data formats on Hadoop.  
 * [Using Apache Drill with Tableau 9 Desktop]({{site.baseurl}}/docs/using-apache-drill-with-tableau-9-desktop)