Posted to commits@drill.apache.org by br...@apache.org on 2015/04/01 02:02:54 UTC

drill git commit: DRILL-2635

Repository: drill
Updated Branches:
  refs/heads/gh-pages d1dd57fe3 -> 81be4bc98


DRILL-2635


Project: http://git-wip-us.apache.org/repos/asf/drill/repo
Commit: http://git-wip-us.apache.org/repos/asf/drill/commit/81be4bc9
Tree: http://git-wip-us.apache.org/repos/asf/drill/tree/81be4bc9
Diff: http://git-wip-us.apache.org/repos/asf/drill/diff/81be4bc9

Branch: refs/heads/gh-pages
Commit: 81be4bc98458c4b5d7569af198ff97c26074098c
Parents: d1dd57f
Author: Kristine Hahn <kh...@maprtech.com>
Authored: Tue Mar 31 15:48:26 2015 -0700
Committer: Bridget Bevens <bb...@maprtech.com>
Committed: Tue Mar 31 17:01:59 2015 -0700

----------------------------------------------------------------------
 _docs/data-sources/004-json-ref.md              |  4 +-
 _docs/develop/contribute/002-ideas.md           | 21 ++++-
 _docs/install/001-drill-in-10.md                |  6 +-
 _docs/install/002-deploy.md                     |  2 +-
 _docs/install/004-install-distributed.md        |  2 +-
 .../install-embedded/001-install-linux.md       |  2 +-
 .../install/install-embedded/002-install-mac.md |  2 +-
 .../install/install-embedded/003-install-win.md |  2 +-
 _docs/sql-ref/functions/002-data-type-fmt.md    | 92 ++++++++++++++++++--
 9 files changed, 113 insertions(+), 20 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/drill/blob/81be4bc9/_docs/data-sources/004-json-ref.md
----------------------------------------------------------------------
diff --git a/_docs/data-sources/004-json-ref.md b/_docs/data-sources/004-json-ref.md
index 358fc61..c8d525a 100644
--- a/_docs/data-sources/004-json-ref.md
+++ b/_docs/data-sources/004-json-ref.md
@@ -452,7 +452,9 @@ For example, you cannot query the [City Lots San Francisco in .json](https://git
 After removing the extraneous square brackets in the coordinates array, you can drill down to query all the data for the lots.
 
 ### Lengthy JSON objects
-TBD statement about limits.
+Currently, Drill cannot manage lengthy JSON objects, such as a gigabyte-sized JSON file. Finding the beginning and end of records can be time-consuming and can require scanning the whole file.
+
+Workaround: Use a tool to split the JSON file into smaller chunks of 64-128MB or 64-256MB initially until you know the total data size and node configuration. Keep the JSON objects intact in each file. A distributed file system, such as MapR-FS, is recommended over trying to manage file partitions.
 
 ### Complex JSON objects
 Complex arrays and maps can be difficult or impossible to query.
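
Note on the hunk above: the workaround of splitting a large JSON file into smaller chunks while keeping each object intact could be sketched roughly as follows. This is only an illustration, not a tool shipped with Drill. It assumes the input holds one complete JSON object per line; the function name `split_json_lines`, the `chunk-NNNNN.json` file naming, and the 128 MB default are hypothetical choices for the example.

```python
import os


def split_json_lines(src_path, out_dir, max_bytes=128 * 1024 * 1024):
    """Split a newline-delimited JSON file into chunks of roughly max_bytes.

    Assumption: each line holds one complete JSON object, so cutting only at
    line boundaries keeps every object intact. A single line larger than
    max_bytes still gets a chunk of its own.
    """
    os.makedirs(out_dir, exist_ok=True)
    chunk_paths = []
    out = None
    written = 0
    with open(src_path, "rb") as src:
        for line in src:
            # Start a new chunk when adding this line would exceed the limit.
            if out is None or (written > 0 and written + len(line) > max_bytes):
                if out is not None:
                    out.close()
                # Hypothetical naming scheme: chunk-00000.json, chunk-00001.json, ...
                path = os.path.join(out_dir, "chunk-%05d.json" % len(chunk_paths))
                chunk_paths.append(path)
                out = open(path, "wb")
                written = 0
            out.write(line)
            written += len(line)
    if out is not None:
        out.close()
    return chunk_paths
```

A call such as `split_json_lines("big.json", "chunks")` (both paths are placeholders) returns the list of chunk files in order. Input that spreads one object across several lines would need a real JSON tokenizer instead of line-based cutting.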

http://git-wip-us.apache.org/repos/asf/drill/blob/81be4bc9/_docs/develop/contribute/002-ideas.md
----------------------------------------------------------------------
diff --git a/_docs/develop/contribute/002-ideas.md b/_docs/develop/contribute/002-ideas.md
index 2270112..052d3e7 100644
--- a/_docs/develop/contribute/002-ideas.md
+++ b/_docs/develop/contribute/002-ideas.md
@@ -58,7 +58,7 @@ s.java)** **
 
 Currently Drill supports text, JSON and Parquet file formats natively when
 interacting with file system. More readers/writers can be introduced by
-implementing custom storage plugins. Example formats include below.
+implementing custom storage plugins. Example formats include:
 
   * AVRO
   * Sequence
@@ -67,7 +67,24 @@ implementing custom storage plugins. Example formats include below.
   * Protobuf
   * XML
   * Thrift
-  * ....
+
+For implementation details, refer to the GitHub commits that added the MongoDB and HBase storage plugins:
+
+* [mongodb_storage_plugin](https://github.com/apache/drill/commit/2ca9c907bff639e08a561eac32e0acab3a0b3304)
+* [hbase_storage_plugin](https://github.com/apache/drill/commit/3651182141b963e24ee48db0530ec3d3b8b6841a)
+
+Initially, concentrate on basics:
+
+* AbstractGroupScan (MongoGroupScan, HbaseGroupScan)  
+* SubScan (MongoSubScan, HbaseSubScan)  
+* RecordReader (MongoRecordReader, HbaseRecordReader)  
+* BatchCreator (MongoScanBatchCreator, HbaseScanBatchCreator)  
+* AbstractStoragePlugin (MongoStoragePlugin, HbaseStoragePlugin)  
+* StoragePluginConfig (MongoStoragePluginConfig, HbaseStoragePluginConfig)
+
+Focus on implementing or extending the classes in this list, using the corresponding MongoDB and HBase implementations as references. Ignore the MongoDB plugin optimizer rules for pushing predicates into the scan.
+
+Writing a new file-based storage plugin, such as a JSON or text-based storage plugin, simply involves implementing a couple of interfaces. The JSON storage plugin is a good example. 
 
 ## Support for new data sources
 

http://git-wip-us.apache.org/repos/asf/drill/blob/81be4bc9/_docs/install/001-drill-in-10.md
----------------------------------------------------------------------
diff --git a/_docs/install/001-drill-in-10.md b/_docs/install/001-drill-in-10.md
old mode 100644
new mode 100755
index 37b8bd0..8bbc907
--- a/_docs/install/001-drill-in-10.md
+++ b/_docs/install/001-drill-in-10.md
@@ -108,7 +108,7 @@ Complete the following steps to install Drill:
 
   1. Issue the following command to download the latest, stable version of Apache Drill to a directory on your machine:
         
-        wget http://www.apache.org/dyn/closer.cgi/drill/drill-0.7.0/apache-drill-0.7.0.tar.gz
+        wget http://getdrill.org/drill/download/apache-drill-0.8.0.tar.gz
   2. Issue the following command to create a new directory to which you can extract the contents of the Drill `tar.gz` file:
   
         sudo mkdir -p /opt/drill
@@ -137,7 +137,7 @@ Complete the following steps to install Drill:
         $ pwd
         /Users/max/drill
   2. Click the following link to download the latest, stable version of Apache Drill:  
-      [http://www.apache.org/dyn/closer.cgi/drill/drill-0.7.0/apache-drill-0.7.0.tar.gz](http://www.apache.org/dyn/closer.cgi/drill/drill-0.7.0/apache-drill-0.7.0.tar.gz)
+      [http://getdrill.org/drill/download/apache-drill-0.8.0.tar.gz](http://getdrill.org/drill/download/apache-drill-0.8.0.tar.gz)
   3. Open the downloaded `TAR` file with the Mac Archive utility or a similar tool for unzipping files.
   4. Move the resulting `apache-drill-<version>` folder into the `drill` directory that you created.
   5. Issue the following command to navigate to the `apache-drill-<version>` directory:
@@ -180,7 +180,7 @@ Complete the following steps to install Drill:
      Do not include spaces in your directory path. If you include spaces in the
 directory path, Drill fails to run.
   2. Click the following link to download the latest, stable version of Apache Drill: 
-      [http://www.apache.org/dyn/closer.cgi/drill/drill-0.7.0/apache-drill-0.7.0.tar.gz](http://www.apache.org/dyn/closer.cgi/drill/drill-0.7.0/apache-drill-0.7.0.tar.gz)
+      [http://getdrill.org/drill/download/apache-drill-0.8.0.tar.gz](http://getdrill.org/drill/download/apache-drill-0.8.0.tar.gz)
   3. Move the `apache-drill-<version>.tar.gz` file to the `drill` directory that you created on your `C:\` drive.
   4. Unzip the `TAR.GZ` file and the resulting `TAR` file.
      1. Right-click `apache-drill-<version>.tar.gz,` and select `7-Zip>Extract Here`. The utility extracts the `apache-drill-<version>.tar` file.

http://git-wip-us.apache.org/repos/asf/drill/blob/81be4bc9/_docs/install/002-deploy.md
----------------------------------------------------------------------
diff --git a/_docs/install/002-deploy.md b/_docs/install/002-deploy.md
old mode 100644
new mode 100755
index 399414e..4e0a84a
--- a/_docs/install/002-deploy.md
+++ b/_docs/install/002-deploy.md
@@ -27,7 +27,7 @@ Complete the following steps to install Drill on designated nodes:
 
   1. Download the Drill tarball.
   
-        curl http://www.apache.org/dyn/closer.cgi/drill/drill-0.7.0/apache-drill-0.7.0.tar.gz
+        curl -O http://getdrill.org/drill/download/apache-drill-0.8.0.tar.gz
   2. Issue the following command to create a Drill installation directory and then explode the tarball to the directory:
   
         mkdir /opt/drill

http://git-wip-us.apache.org/repos/asf/drill/blob/81be4bc9/_docs/install/004-install-distributed.md
----------------------------------------------------------------------
diff --git a/_docs/install/004-install-distributed.md b/_docs/install/004-install-distributed.md
old mode 100644
new mode 100755
index a47176f..593b040
--- a/_docs/install/004-install-distributed.md
+++ b/_docs/install/004-install-distributed.md
@@ -28,7 +28,7 @@ Complete the following steps to install Drill on designated nodes:
 
   1. Download the Drill tarball.
   
-        curl http://www.apache.org/dyn/closer.cgi/drill/drill-0.7.0/apache-drill-0.7.0.tar.gz
+        curl -O http://getdrill.org/drill/download/apache-drill-0.8.0.tar.gz
   2. Issue the following command to create a Drill installation directory and then explode the tarball to the directory:
   
         mkdir /opt/drill

http://git-wip-us.apache.org/repos/asf/drill/blob/81be4bc9/_docs/install/install-embedded/001-install-linux.md
----------------------------------------------------------------------
diff --git a/_docs/install/install-embedded/001-install-linux.md b/_docs/install/install-embedded/001-install-linux.md
old mode 100644
new mode 100755
index 589fa0f..37ab3c7
--- a/_docs/install/install-embedded/001-install-linux.md
+++ b/_docs/install/install-embedded/001-install-linux.md
@@ -7,7 +7,7 @@ Linux:
 
   1. Issue the following command to download the latest, stable version of Apache Drill to a directory on your machine:
     
-        wget http://www.apache.org/dyn/closer.cgi/drill/drill-0.7.0/apache-drill-0.7.0.tar.gz
+        wget http://getdrill.org/drill/download/apache-drill-0.8.0.tar.gz
   2. Issue the following command to create a new directory to which you can extract the contents of the Drill `tar.gz` file:
   
         sudo mkdir -p /opt/drill

http://git-wip-us.apache.org/repos/asf/drill/blob/81be4bc9/_docs/install/install-embedded/002-install-mac.md
----------------------------------------------------------------------
diff --git a/_docs/install/install-embedded/002-install-mac.md b/_docs/install/install-embedded/002-install-mac.md
old mode 100644
new mode 100755
index 97ae775..b68a9e5
--- a/_docs/install/install-embedded/002-install-mac.md
+++ b/_docs/install/install-embedded/002-install-mac.md
@@ -16,7 +16,7 @@ OS X:
         $ pwd
         /Users/max/drill
   2. Click the following link to download the latest, stable version of Apache Drill:  
-     [http://www.apache.org/dyn/closer.cgi/drill/drill-0.7.0/apache-drill-0.7.0.tar.gz](http://www.apache.org/dyn/closer.cgi/drill/drill-0.7.0/apache-drill-0.7.0.tar.gz)
+     [http://getdrill.org/drill/download/apache-drill-0.8.0.tar.gz](http://getdrill.org/drill/download/apache-drill-0.8.0.tar.gz)
   3. Open the downloaded `TAR` file with the Mac Archive utility or a similar tool for unzipping files.
   4. Move the resulting `apache-drill-<version>` folder into the `drill` directory that you created.
   5. Issue the following command to navigate to the `apache-drill-<version>` directory:

http://git-wip-us.apache.org/repos/asf/drill/blob/81be4bc9/_docs/install/install-embedded/003-install-win.md
----------------------------------------------------------------------
diff --git a/_docs/install/install-embedded/003-install-win.md b/_docs/install/install-embedded/003-install-win.md
old mode 100644
new mode 100755
index 6c8272b..23264df
--- a/_docs/install/install-embedded/003-install-win.md
+++ b/_docs/install/install-embedded/003-install-win.md
@@ -35,7 +35,7 @@ Complete the following steps to install Drill:
 directory path, Drill fails to run.
   2. Click the following link to download the latest, stable version of Apache Drill:
   
-     [http://www.apache.org/dyn/closer.cgi/drill/drill-0.7.0/apache-drill-0.7.0.tar.gz](http://www.apache.org/dyn/closer.cgi/drill/drill-0.7.0/apache-drill-0.7.0.tar.gz)
+     [http://getdrill.org/drill/download/apache-drill-0.8.0.tar.gz](http://getdrill.org/drill/download/apache-drill-0.8.0.tar.gz)
   3. Move the `apache-drill-<version>.tar.gz` file to the `drill` directory that you created on your `C:\` drive.
   4. Unzip the `TAR.GZ` file and the resulting `TAR` file.  
      1. Right-click `apache-drill-<version>.tar.gz,` and select `7-Zip>Extract Here`. The utility extracts the `apache-drill-<version>.tar` file.  

http://git-wip-us.apache.org/repos/asf/drill/blob/81be4bc9/_docs/sql-ref/functions/002-data-type-fmt.md
----------------------------------------------------------------------
diff --git a/_docs/sql-ref/functions/002-data-type-fmt.md b/_docs/sql-ref/functions/002-data-type-fmt.md
index 59f58be..5ebfec2 100644
--- a/_docs/sql-ref/functions/002-data-type-fmt.md
+++ b/_docs/sql-ref/functions/002-data-type-fmt.md
@@ -152,9 +152,79 @@ In addition to the CAST, CONVERT_TO, and CONVERT_FROM functions, Drill supports
 * A character string to a timestamp with time zone
 * A decimal type to a timestamp with time zone
 
+## Usage Notes
+
+Use the following format specifiers for numerical conversions:
+<table >
+     <tr >
+          <th align=left>Symbol
+          <th align=left>Location
+          <th align=left>Meaning
+     <tr valign=top>
+          <td><code>0</code>
+          <td>Number
+          <td>Digit
+     <tr >
+          <td><code>#</code>
+          <td>Number
+          <td>Digit, zero shows as absent
+     <tr valign=top>
+          <td><code>.</code>
+          <td>Number
+          <td>Decimal separator or monetary decimal separator
+     <tr >
+          <td><code>-</code>
+          <td>Number
+          <td>Minus sign
+     <tr valign=top>
+          <td><code>,</code>
+          <td>Number
+          <td>Grouping separator
+     <tr >
+          <td><code>E</code>
+          <td>Number
+          <td>Separates mantissa and exponent in scientific notation.
+              <em>Need not be quoted in prefix or suffix.</em>
+     <tr valign=top>
+          <td><code>;</code>
+          <td>Subpattern boundary
+          <td>Separates positive and negative subpatterns
+     <tr >
+          <td><code>%</code>
+          <td>Prefix or suffix
+          <td>Multiply by 100 and show as percentage
+     <tr valign=top>
+          <td><code>&#92;u2030</code>
+          <td>Prefix or suffix
+          <td>Multiply by 1000 and show as per mille value
+     <tr >
+          <td><code>&#164;</code> (<code>&#92;u00A4</code>)
+          <td>Prefix or suffix
+          <td>Currency sign, replaced by currency symbol.  If
+              doubled, replaced by international currency symbol.
+              If present in a pattern, the monetary decimal separator
+              is used instead of the decimal separator.
+     <tr valign=top>
+          <td><code>'</code>
+          <td>Prefix or suffix
+          <td>Used to quote special characters in a prefix or suffix,
+              for example, <code>"'#'#"</code> formats 123 to
+              <code>"#123"</code>.  To create a single quote
+              itself, use two in a row: <code>"# o''clock"</code>.
+ </table>
+
+Use the following format specifiers for data type conversions:
+
+
+
+For more information about specifying a format, refer to one of the following format specifier documents:
+
+* [Java DecimalFormat class](http://docs.oracle.com/javase/7/docs/api/java/text/DecimalFormat.html) format specifiers 
+* [Joda-Time DateTimeFormat class](http://joda-time.sourceforge.net/apidocs/org/joda/time/format/DateTimeFormat.html) format specifiers
+
 # TO_CHAR
 
-TO_CHAR converts a date, time, timestamp, timestamp with timezone, or numerical expression to a character string.
+TO_CHAR converts a date, time, timestamp, or numerical expression to a character string.
 
 ## Syntax
 
@@ -164,12 +234,16 @@ TO_CHAR converts a date, time, timestamp, timestamp with timezone, or numerical
 
 * 'format'* is a format specifier enclosed in single quotation marks that sets a pattern for the output formatting. 
 
-## Usage Notes
-For information about specifying a format, refer to one of the following format specifier documents:
-
-* [Java DecimalFormat class](http://docs.oracle.com/javase/7/docs/api/java/text/DecimalFormat.html) format specifiers 
-* [Java DateTimeFormat class](http://joda-time.sourceforge.net/apidocs/org/joda/time/format/DateTimeFormat.html)
+### Usage Notes
+Currently, Drill does not support a timestamp with time zone data type. Drill stores timestamps and dates in [UTC](http://www.timeanddate.com/time/aboututc.html) and maintains no time zone information, so you cannot convert a date or timestamp to a specific time zone. However, if your input data contains time zone information, Drill can use it as if it were UTC time.
 
+    SELECT to_char(cast('2008-2-23 12:00:00 America/Los_Angeles' as timestamp), 'yyyy MMM dd HH:mm:ss z') FROM dfs.`/Users/drill/dummy.json`;
+    +--------------------------+
+    |          EXPR$0          |
+    +--------------------------+
+    | 2008 Feb 23 12:00:00 UTC |
+    +--------------------------+
+    1 row selected (0.108 seconds)
 
 ## Examples
 
@@ -227,11 +301,11 @@ Converts a character string or a UNIX epoch timestamp to a date.
 
 ## Syntax
 
-    TO_DATE (expression[, 'format']);
+    TO_DATE (expression [, 'format']);
 
-*expression* is a character string enclosed in single quotation marks or a UNIX epoch timestamp not enclosed in single quotation marks. 
+*expression* is a character string enclosed in single quotation marks or a UNIX epoch timestamp, not enclosed in single quotation marks. 
 
-* 'format'* is format specifier enclosed in single quotation marks that sets a pattern for the output formatting. Use this option only when the expression is a character string. 
+* 'format'* is a format specifier enclosed in single quotation marks that sets a pattern for the output formatting. Use this option only when the expression is a character string, not a UNIX epoch timestamp. 
 
 ## Usage 
 Specify a format using patterns defined in the [Joda-Time DateTimeFormat class](http://joda-time.sourceforge.net/apidocs/org/joda/time/format/DateTimeFormat.html).