Posted to commits@carbondata.apache.org by ch...@apache.org on 2018/03/06 03:07:53 UTC

[1/2] carbondata-site git commit: update md document from github

Repository: carbondata-site
Updated Branches:
  refs/heads/asf-site 4d05a0dc7 -> b0888c1b2


http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/b0888c1b/src/main/webapp/streaming-guide.html
----------------------------------------------------------------------
diff --git a/src/main/webapp/streaming-guide.html b/src/main/webapp/streaming-guide.html
index 8d3effe..43992dd 100644
--- a/src/main/webapp/streaming-guide.html
+++ b/src/main/webapp/streaming-guide.html
@@ -351,6 +351,82 @@ streaming table using following DDL.</p>
 </tbody>
 </table>
 <h2>
+<a id="stream-data-parser" class="anchor" href="#stream-data-parser" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Stream data parser</h2>
+<p>Configure the property "carbon.stream.parser" to define a stream parser that converts InternalRow to Object[] when writing stream data.</p>
+<table>
+<thead>
+<tr>
+<th>property name</th>
+<th>default</th>
+<th>description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>carbon.stream.parser</td>
+<td>org.apache.carbondata.streaming.parser.CSVStreamParserImp</td>
+<td>the class of the stream parser</td>
+</tr>
+</tbody>
+</table>
+<p>Currently CarbonData supports the following two parsers:</p>
+<p><strong>1. org.apache.carbondata.streaming.parser.CSVStreamParserImp</strong>: This is the default stream parser; it gets a line of data (String type) from the first index of InternalRow and converts this String to Object[].</p>
+<p><strong>2. org.apache.carbondata.streaming.parser.RowStreamParserImp</strong>: This stream parser automatically converts InternalRow to Object[] according to the schema of the <code>DataSet</code>, for example:</p>
+<div class="highlight highlight-source-scala"><pre> <span class="pl-k">case</span> <span class="pl-k">class</span> <span class="pl-en">FileElement</span>(<span class="pl-v">school</span>: <span class="pl-en">Array</span>[<span class="pl-k">String</span>], <span class="pl-v">age</span>: <span class="pl-k">Int</span>)
+ <span class="pl-k">case</span> <span class="pl-k">class</span> <span class="pl-en">StreamData</span>(<span class="pl-v">id</span>: <span class="pl-k">Int</span>, <span class="pl-v">name</span>: <span class="pl-k">String</span>, <span class="pl-v">city</span>: <span class="pl-k">String</span>, <span class="pl-v">salary</span>: <span class="pl-k">Float</span>, <span class="pl-v">file</span>: <span class="pl-en">FileElement</span>)
+ ...
+
+ <span class="pl-k">var</span> <span class="pl-en">qry</span><span class="pl-k">:</span> <span class="pl-en">StreamingQuery</span> <span class="pl-k">=</span> <span class="pl-c1">null</span>
+ <span class="pl-k">val</span> <span class="pl-en">readSocketDF</span> <span class="pl-k">=</span> spark.readStream
+   .format(<span class="pl-s"><span class="pl-pds">"</span>socket<span class="pl-pds">"</span></span>)
+   .option(<span class="pl-s"><span class="pl-pds">"</span>host<span class="pl-pds">"</span></span>, <span class="pl-s"><span class="pl-pds">"</span>localhost<span class="pl-pds">"</span></span>)
+   .option(<span class="pl-s"><span class="pl-pds">"</span>port<span class="pl-pds">"</span></span>, <span class="pl-c1">9099</span>)
+   .load()
+   .as[<span class="pl-k">String</span>]
+   .map(_.split(<span class="pl-s"><span class="pl-pds">"</span>,<span class="pl-pds">"</span></span>))
+   .map { fields <span class="pl-k">=&gt;</span> {
+     <span class="pl-k">val</span> <span class="pl-en">tmp</span> <span class="pl-k">=</span> fields(<span class="pl-c1">4</span>).split(<span class="pl-s"><span class="pl-pds">"</span><span class="pl-cce">\\</span>$<span class="pl-pds">"</span></span>)
+     <span class="pl-k">val</span> <span class="pl-en">file</span> <span class="pl-k">=</span> <span class="pl-en">FileElement</span>(tmp(<span class="pl-c1">0</span>).split(<span class="pl-s"><span class="pl-pds">"</span>:<span class="pl-pds">"</span></span>), tmp(<span class="pl-c1">1</span>).toInt)
+     <span class="pl-en">StreamData</span>(fields(<span class="pl-c1">0</span>).toInt, fields(<span class="pl-c1">1</span>), fields(<span class="pl-c1">2</span>), fields(<span class="pl-c1">3</span>).toFloat, file)
+   } }
+
+ <span class="pl-c"><span class="pl-c">//</span> Write data from socket stream to carbondata file</span>
+ qry <span class="pl-k">=</span> readSocketDF.writeStream
+   .format(<span class="pl-s"><span class="pl-pds">"</span>carbondata<span class="pl-pds">"</span></span>)
+   .trigger(<span class="pl-en">ProcessingTime</span>(<span class="pl-s"><span class="pl-pds">"</span>5 seconds<span class="pl-pds">"</span></span>))
+   .option(<span class="pl-s"><span class="pl-pds">"</span>checkpointLocation<span class="pl-pds">"</span></span>, tablePath.getStreamingCheckpointDir)
+   .option(<span class="pl-s"><span class="pl-pds">"</span>dbName<span class="pl-pds">"</span></span>, <span class="pl-s"><span class="pl-pds">"</span>default<span class="pl-pds">"</span></span>)
+   .option(<span class="pl-s"><span class="pl-pds">"</span>tableName<span class="pl-pds">"</span></span>, <span class="pl-s"><span class="pl-pds">"</span>carbon_table<span class="pl-pds">"</span></span>)
+   .option(<span class="pl-en">CarbonStreamParser</span>.<span class="pl-en">CARBON_STREAM_PARSER</span>,
+     <span class="pl-en">CarbonStreamParser</span>.<span class="pl-en">CARBON_STREAM_PARSER_ROW_PARSER</span>)
+   .start()
+
+ ...</pre></div>
+<h3>
+<a id="how-to-implement-a-customized-stream-parser" class="anchor" href="#how-to-implement-a-customized-stream-parser" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>How to implement a customized stream parser</h3>
+<p>To implement a customized stream parser that converts a specific InternalRow to Object[], implement the <code>initialize</code> and <code>parserRow</code> methods of the <code>CarbonStreamParser</code> interface, for example:</p>
+<div class="highlight highlight-source-scala"><pre> <span class="pl-k">package</span> <span class="pl-en">org.XXX.XXX.streaming.parser</span>
+ 
+ <span class="pl-k">import</span> <span class="pl-smi">org.apache.hadoop.conf.</span><span class="pl-smi">Configuration</span>
+ <span class="pl-k">import</span> <span class="pl-smi">org.apache.spark.sql.catalyst.</span><span class="pl-smi">InternalRow</span>
+ <span class="pl-k">import</span> <span class="pl-smi">org.apache.spark.sql.types.</span><span class="pl-smi">StructType</span>
+ 
+ <span class="pl-k">class</span> <span class="pl-en">XXXStreamParserImp</span> <span class="pl-k">extends</span> <span class="pl-e">CarbonStreamParser</span> {
+ 
+   <span class="pl-k">override</span> <span class="pl-k">def</span> <span class="pl-en">initialize</span>(<span class="pl-v">configuration</span>: <span class="pl-en">Configuration</span>, <span class="pl-v">structType</span>: <span class="pl-en">StructType</span>)<span class="pl-k">:</span> <span class="pl-k">Unit</span> <span class="pl-k">=</span> {
+     <span class="pl-c"><span class="pl-c">//</span> user can get the properties from "configuration"</span>
+   }
+   
+   <span class="pl-k">override</span> <span class="pl-k">def</span> <span class="pl-en">parserRow</span>(<span class="pl-v">value</span>: <span class="pl-en">InternalRow</span>)<span class="pl-k">:</span> <span class="pl-en">Array</span>[<span class="pl-en">Object</span>] <span class="pl-k">=</span> {
+     <span class="pl-c"><span class="pl-c">//</span> convert InternalRow to Object[](Array[Object] in Scala) </span>
+   }
+   
+   <span class="pl-k">override</span> <span class="pl-k">def</span> <span class="pl-en">close</span>()<span class="pl-k">:</span> <span class="pl-k">Unit</span> <span class="pl-k">=</span> {
+   }
+ }
+   </pre></div>
+<p>Then set the property "carbon.stream.parser" to "org.XXX.XXX.streaming.parser.XXXStreamParserImp".</p>
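+<p>For example, the customized parser can then be selected per streaming query through the writeStream option shown earlier. This is a minimal sketch reusing the hypothetical package/class name above; it assumes the option accepts a fully qualified class name, as the constants in the previous example do:</p>
+<pre><code> // Sketch: write the socket stream using the customized parser class
+ qry = readSocketDF.writeStream
+   .format("carbondata")
+   .trigger(ProcessingTime("5 seconds"))
+   .option("checkpointLocation", tablePath.getStreamingCheckpointDir)
+   .option("dbName", "default")
+   .option("tableName", "carbon_table")
+   .option(CarbonStreamParser.CARBON_STREAM_PARSER,
+     "org.XXX.XXX.streaming.parser.XXXStreamParserImp")
+   .start()
+</code></pre>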
+<h2>
 <a id="close-streaming-table" class="anchor" href="#close-streaming-table" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Close streaming table</h2>
 <p>Use below command to handoff all streaming segments to columnar format segments and modify the streaming property to false, this table becomes a normal table.</p>
 <div class="highlight highlight-source-sql"><pre><span class="pl-k">ALTER</span> <span class="pl-k">TABLE</span> streaming_table COMPACT <span class="pl-s"><span class="pl-pds">'</span>close_streaming<span class="pl-pds">'</span></span>

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/b0888c1b/src/main/webapp/troubleshooting.html
----------------------------------------------------------------------
diff --git a/src/main/webapp/troubleshooting.html b/src/main/webapp/troubleshooting.html
index 3a2e311..107fb23 100644
--- a/src/main/webapp/troubleshooting.html
+++ b/src/main/webapp/troubleshooting.html
@@ -288,7 +288,7 @@ For example, you can use scp to copy this file to all the nodes.</p>
 <a id="failed-to-load-data-on-the-cluster" class="anchor" href="#failed-to-load-data-on-the-cluster" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Failed to load data on the cluster</h2>
 <p><strong>Symptom</strong></p>
 <p>Data loading fails with the following exception :</p>
-<pre><code>Data Load failure exeception
+<pre><code>Data Load failure exception
 </code></pre>
 <p><strong>Possible Cause</strong></p>
 <p>The following issue can cause the failure :</p>
@@ -316,7 +316,7 @@ For example, you can use scp to copy this file to all the nodes.</p>
 <a id="failed-to-insert-data-on-the-cluster" class="anchor" href="#failed-to-insert-data-on-the-cluster" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Failed to insert data on the cluster</h2>
 <p><strong>Symptom</strong></p>
 <p>Insertion fails with the following exception :</p>
-<pre><code>Data Load failure exeception
+<pre><code>Data Load failure exception
 </code></pre>
 <p><strong>Possible Cause</strong></p>
 <p>The following issue can cause the failure :</p>

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/b0888c1b/src/main/webapp/useful-tips-on-carbondata.html
----------------------------------------------------------------------
diff --git a/src/main/webapp/useful-tips-on-carbondata.html b/src/main/webapp/useful-tips-on-carbondata.html
index cb19036..6df49a7 100644
--- a/src/main/webapp/useful-tips-on-carbondata.html
+++ b/src/main/webapp/useful-tips-on-carbondata.html
@@ -353,7 +353,7 @@ You can configure CarbonData by tuning following properties in carbon.properties
 <tr>
 <td>carbon.number.of.cores.block.sort</td>
 <td>Default: 7</td>
-<td>If you have huge memory and cpus, increase it as you will</td>
+<td>If you have huge memory and many CPUs, increase this value as needed</td>
 </tr>
 <tr>
 <td>carbon.merge.sort.reader.thread</td>

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/b0888c1b/src/site/markdown/data-management-on-carbondata.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/data-management-on-carbondata.md b/src/site/markdown/data-management-on-carbondata.md
index c846ffc..2aa4a49 100644
--- a/src/site/markdown/data-management-on-carbondata.md
+++ b/src/site/markdown/data-management-on-carbondata.md
@@ -26,7 +26,6 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
 * [UPDATE AND DELETE](#update-and-delete)
 * [COMPACTION](#compaction)
 * [PARTITION](#partition)
-* [PRE-AGGREGATE TABLES](#pre-aggregate-tables)
 * [BUCKETING](#bucketing)
 * [SEGMENT MANAGEMENT](#segment-management)
 
@@ -39,7 +38,7 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
   STORED BY 'carbondata'
   [TBLPROPERTIES (property_name=property_value, ...)]
   [LOCATION 'path']
-  ```  
+  ```
   
 ### Usage Guidelines
 
@@ -101,11 +100,11 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
      These properties are table level compaction configurations, if not specified, system level configurations in carbon.properties will be used.
      Following are 5 configurations:
      
-     * MAJOR_COMPACTION_SIZE: same meaning with carbon.major.compaction.size, size in MB.
-     * AUTO_LOAD_MERGE: same meaning with carbon.enable.auto.load.merge.
-     * COMPACTION_LEVEL_THRESHOLD: same meaning with carbon.compaction.level.threshold.
-     * COMPACTION_PRESERVE_SEGMENTS: same meaning with carbon.numberof.preserve.segments.
-     * ALLOWED_COMPACTION_DAYS: same meaning with carbon.allowed.compaction.days.     
+     * MAJOR_COMPACTION_SIZE: same meaning as carbon.major.compaction.size, size in MB.
+     * AUTO_LOAD_MERGE: same meaning as carbon.enable.auto.load.merge.
+     * COMPACTION_LEVEL_THRESHOLD: same meaning as carbon.compaction.level.threshold.
+     * COMPACTION_PRESERVE_SEGMENTS: same meaning as carbon.numberof.preserve.segments.
+     * ALLOWED_COMPACTION_DAYS: same meaning as carbon.allowed.compaction.days.     
 
      ```
      TBLPROPERTIES ('MAJOR_COMPACTION_SIZE'='2048',
@@ -127,26 +126,17 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
 
    ```
     CREATE TABLE IF NOT EXISTS productSchema.productSalesTable (
-                                   productNumber Int,
-                                   productName String,
-                                   storeCity String,
-                                   storeProvince String,
-                                   productCategory String,
-                                   productBatch String,
-                                   saleQuantity Int,
-                                   revenue Int)
+                                   productNumber INT,
+                                   productName STRING,
+                                   storeCity STRING,
+                                   storeProvince STRING,
+                                   productCategory STRING,
+                                   productBatch STRING,
+                                   saleQuantity INT,
+                                   revenue INT)
     STORED BY 'carbondata'
-    TBLPROPERTIES ('DICTIONARY_INCLUDE'='productNumber',
-                   'NO_INVERTED_INDEX'='productBatch',
-                   'SORT_COLUMNS'='productName,storeCity',
-                   'SORT_SCOPE'='NO_SORT',
-                   'TABLE_BLOCKSIZE'='512',
-                   'MAJOR_COMPACTION_SIZE'='2048',
-                   'AUTO_LOAD_MERGE'='true',
-                   'COMPACTION_LEVEL_THRESHOLD'='5,6',
-                   'COMPACTION_PRESERVE_SEGMENTS'='10',
-				   'streaming'='true',
-                   'ALLOWED_COMPACTION_DAYS'='5')
+    TBLPROPERTIES ('SORT_COLUMNS'='productName,storeCity',
+                   'SORT_SCOPE'='NO_SORT')
    ```
 
 ## CREATE DATABASE 
@@ -187,7 +177,7 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
   SHOW TABLES IN defaultdb
   ```
 
-### ALTER TALBE
+### ALTER TABLE
 
   The following section introduce the commands to modify the physical or logical state of the existing table(s).
 
@@ -200,9 +190,9 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
 
      Examples:
      ```
-     ALTER TABLE carbon RENAME TO carbondata
+     ALTER TABLE carbon RENAME TO carbonTable
      OR
-     ALTER TABLE test_db.carbon RENAME TO test_db.carbondata
+     ALTER TABLE test_db.carbon RENAME TO test_db.carbonTable
      ```
 
    - **ADD COLUMNS**
@@ -294,15 +284,48 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
   * Before executing this command the old table schema and data should be copied into the new database location.
   * If the table is aggregate table, then all the aggregate tables should be copied to the new database location.
   * For old store, the time zone of the source and destination cluster should be same.
-  * If old cluster uses HIVE meta store, refresh will not work as schema file does not exist in file system.
+  * If the old cluster used the HIVE metastore to store the schema, refresh will not work because the schema file does not exist in the file system.
+
+### Table and Column Comment
+
+  You can provide more information about a table by using a table comment. Similarly, you can provide more information about a particular column using a column comment.
+  You can see the column comment of an existing table using the DESCRIBE FORMATTED command.
+  
+  ```
+  CREATE TABLE [IF NOT EXISTS] [db_name.]table_name[(col_name data_type [COMMENT col_comment], ...)]
+    [COMMENT table_comment]
+  STORED BY 'carbondata'
+  [TBLPROPERTIES (property_name=property_value, ...)]
+  ```
   
+  Example:
+  ```
+  CREATE TABLE IF NOT EXISTS productSchema.productSalesTable (
+                                productNumber Int COMMENT 'unique serial number for product')
+  COMMENT "This is table comment"
+  STORED BY 'carbondata'
+  TBLPROPERTIES ('DICTIONARY_INCLUDE'='productNumber')
+  ```
+  You can also SET and UNSET table comment using ALTER command.
+
+  Example to SET table comment:
+  
+  ```
+  ALTER TABLE carbon SET TBLPROPERTIES ('comment'='this table comment is modified');
+  ```
+  
+  Example to UNSET table comment:
+  
+  ```
+  ALTER TABLE carbon UNSET TBLPROPERTIES ('comment');
+  ```
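+
+  The column comment can then be inspected with the DESCRIBE FORMATTED command mentioned above. Below is a minimal sketch, assuming a Spark/Carbon session named `spark` (the session name is an assumption, not part of this document):
+
+  ```scala
+  // Sketch: view the table comment and column comments of the table created above.
+  // `spark` is assumed to be an existing SparkSession/CarbonSession.
+  spark.sql("DESCRIBE FORMATTED productSchema.productSalesTable").show(100, false)
+  ```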
 
 ## LOAD DATA
 
 ### LOAD FILES TO CARBONDATA TABLE
   
   This command is used to load csv files to carbondata, OPTIONS are not mandatory for data loading process. 
-  Inside OPTIONS user can provide either of any options like DELIMITER, QUOTECHAR, FILEHEADER, ESCAPECHAR, MULTILINE as per requirement.
+  Inside OPTIONS, the user can provide options such as DELIMITER, QUOTECHAR, FILEHEADER, ESCAPECHAR and MULTILINE as per requirement.
   
   ```
   LOAD DATA [LOCAL] INPATH 'folder_path' 
@@ -330,6 +353,16 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
     OPTIONS('COMMENTCHAR'='#')
     ```
 
+  - **HEADER:** When the CSV file is loaded without a file header and the file header is the same as the table schema, add 'HEADER'='false' to the load data SQL, as the user need not provide the file header. By default the value is 'true'.
+  false: CSV file is without a file header.
+  true: CSV file is with a file header.
+  
+    ```
+    OPTIONS('HEADER'='false') 
+    ```
+
+    NOTE: If the HEADER option exists and is set to 'true', then the FILEHEADER option is not required.
+	
   - **FILEHEADER:** Headers can be provided in the LOAD DATA command if headers are missing in the source files.
 
     ```
@@ -342,7 +375,7 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
     OPTIONS('MULTILINE'='true') 
     ```
 
-  - **ESCAPECHAR:** Escape char can be provided if user want strict validation of escape character on CSV.
+  - **ESCAPECHAR:** The escape character can be provided if the user wants strict validation of the escape character in CSV files.
 
     ```
     OPTIONS('ESCAPECHAR'='\') 
@@ -402,6 +435,7 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
    ```
    LOAD DATA local inpath '/opt/rawdata/data.csv' INTO table carbontable
    options('DELIMITER'=',', 'QUOTECHAR'='"','COMMENTCHAR'='#',
+   'HEADER'='false',
    'FILEHEADER'='empno,empname,designation,doj,workgroupcategory,
    workgroupcategoryname,deptno,deptname,projectcode,
    projectjoindate,projectenddate,attendance,utilization,salary',
@@ -424,10 +458,10 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
   * BAD_RECORDS_ACTION property can have four type of actions for bad records FORCE, REDIRECT, IGNORE and FAIL.
   * FAIL option is its Default value. If the FAIL option is used, then data loading fails if any bad records are found.
   * If the REDIRECT option is used, CarbonData will add all bad records in to a separate CSV file. However, this file must not be used for subsequent data loading because the content may not exactly match the source record. You are advised to cleanse the original source record for further data ingestion. This option is used to remind you which records are bad records.
-  * If the FORCE option is used, then it auto-corrects the data by storing the bad records as NULL before Loading data.
+  * If the FORCE option is used, then it auto-converts the data by storing the bad records as NULL before loading data.
   * If the IGNORE option is used, then bad records are neither loaded nor written to the separate CSV file.
   * In loaded data, if all records are bad records, the BAD_RECORDS_ACTION is invalid and the load operation fails.
-  * The maximum number of characters per column is 100000. If there are more than 100000 characters in a column, data loading will fail.
+  * The maximum number of characters per column is 32000. If there are more than 32000 characters in a column, data loading will fail.
 
   Example:
 
@@ -492,7 +526,7 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
   [ WHERE { <filter_condition> } ]
   ```
   
-  alternatively the following the command can also be used for updating the CarbonData Table :
+  Alternatively, the following command can also be used for updating the CarbonData Table:
   
   ```
   UPDATE <table_name>
@@ -552,7 +586,6 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
 ## COMPACTION
 
   Compaction improves the query performance significantly. 
-  During the load data, several CarbonData files are generated, this is because data is sorted only within each load (per load segment and one B+ tree index).
   
   There are two types of compaction, Minor and Major compaction.
   
@@ -576,6 +609,8 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
   
   In Major compaction, multiple segments can be merged into one large segment. 
   User will specify the compaction size until which segments can be merged, Major compaction is usually done during the off-peak time.
+  Configure the property carbon.major.compaction.size with an appropriate value in MB.
+  
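+  For example, this threshold can also be set at runtime through the CarbonProperties utility before triggering compaction. This is only a sketch; the exact API usage is an assumption, and carbon.properties remains the place for permanent configuration:
+
+  ```scala
+  import org.apache.carbondata.core.util.CarbonProperties
+
+  // Sketch (assumption): set the major compaction size threshold to 1024 MB
+  CarbonProperties.getInstance().addProperty("carbon.major.compaction.size", "1024")
+  ```
+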
   This command merges the specified number of segments into one segment: 
      
   ```
@@ -611,13 +646,13 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
   Example:
   ```
    CREATE TABLE IF NOT EXISTS productSchema.productSalesTable (
-                                productNumber Int,
-                                productName String,
-                                storeCity String,
-                                storeProvince String,
-                                saleQuantity Int,
-                                revenue Int)
-  PARTITIONED BY (productCategory String, productBatch String)
+                                productNumber INT,
+                                productName STRING,
+                                storeCity STRING,
+                                storeProvince STRING,
+                                saleQuantity INT,
+                                revenue INT)
+  PARTITIONED BY (productCategory STRING, productBatch STRING)
   STORED BY 'carbondata'
   ```
 		
@@ -628,8 +663,7 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
   ```
   LOAD DATA [LOCAL] INPATH 'folder_path' 
   INTO TABLE [db_name.]table_name PARTITION (partition_spec) 
-  OPTIONS(property_name=property_value, ...)
-    
+  OPTIONS(property_name=property_value, ...)    
   INSERT INTO INTO TABLE [db_name.]table_name PARTITION (partition_spec) <SELECT STATMENT>
   ```
   
@@ -637,8 +671,7 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
   ```
   LOAD DATA LOCAL INPATH '${env:HOME}/staticinput.csv'
   INTO TABLE locationTable
-  PARTITION (country = 'US', state = 'CA')
-    
+  PARTITION (country = 'US', state = 'CA')  
   INSERT INTO TABLE locationTable
   PARTITION (country = 'US', state = 'AL')
   SELECT <columns list excluding partition columns> FROM another_user
@@ -651,8 +684,7 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
   Example:
   ```
   LOAD DATA LOCAL INPATH '${env:HOME}/staticinput.csv'
-  INTO TABLE locationTable
-          
+  INTO TABLE locationTable          
   INSERT INTO TABLE locationTable
   SELECT <columns list excluding partition columns> FROM another_user
   ```
@@ -674,7 +706,7 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
 
 #### Insert OVERWRITE
   
-  This command allows you to insert or load overwrite on a spcific partition.
+  This command allows you to insert or load overwrite on a specific partition.
   
   ```
    INSERT OVERWRITE TABLE table_name
@@ -712,12 +744,12 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
   Example:
   ```
   CREATE TABLE IF NOT EXISTS hash_partition_table(
-      col_A String,
-      col_B Int,
-      col_C Long,
-      col_D Decimal(10,2),
-      col_F Timestamp
-  ) PARTITIONED BY (col_E Long)
+      col_A STRING,
+      col_B INT,
+      col_C LONG,
+      col_D DECIMAL(10,2),
+      col_F TIMESTAMP
+  ) PARTITIONED BY (col_E LONG)
   STORED BY 'carbondata' TBLPROPERTIES('PARTITION_TYPE'='HASH','NUM_PARTITIONS'='9')
   ```
 
@@ -740,11 +772,11 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
   Example:
   ```
   CREATE TABLE IF NOT EXISTS range_partition_table(
-      col_A String,
-      col_B Int,
-      col_C Long,
-      col_D Decimal(10,2),
-      col_E Long
+      col_A STRING,
+      col_B INT,
+      col_C LONG,
+      col_D DECIMAL(10,2),
+      col_E LONG
    ) partitioned by (col_F Timestamp)
    PARTITIONED BY 'carbondata'
    TBLPROPERTIES('PARTITION_TYPE'='RANGE',
@@ -767,12 +799,12 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
   Example:
   ```
   CREATE TABLE IF NOT EXISTS list_partition_table(
-      col_B Int,
-      col_C Long,
-      col_D Decimal(10,2),
-      col_E Long,
-      col_F Timestamp
-   ) PARTITIONED BY (col_A String)
+      col_B INT,
+      col_C LONG,
+      col_D DECIMAL(10,2),
+      col_E LONG,
+      col_F TIMESTAMP
+   ) PARTITIONED BY (col_A STRING)
    STORED BY 'carbondata'
    TBLPROPERTIES('PARTITION_TYPE'='LIST',
    'LIST_INFO'='aaaa, bbbb, (cccc, dddd), eeee')
@@ -826,234 +858,6 @@ This tutorial is going to introduce all commands and data operations on CarbonDa
   * The partitioned column can be excluded from SORT_COLUMNS, this will let other columns to do the efficient sorting.
   * When writing SQL on a partition table, try to use filters on the partition column.
 
-
-## PRE-AGGREGATE TABLES
-  Carbondata supports pre aggregating of data so that OLAP kind of queries can fetch data 
-  much faster.Aggregate tables are created as datamaps so that the handling is as efficient as 
-  other indexing support.Users can create as many aggregate tables they require as datamaps to 
-  improve their query performance,provided the storage requirements and loading speeds are 
-  acceptable.
-  
-  For main table called **sales** which is defined as 
-  
-  ```
-  CREATE TABLE sales (
-  order_time timestamp,
-  user_id string,
-  sex string,
-  country string,
-  quantity int,
-  price bigint)
-  STORED BY 'carbondata'
-  ```
-  
-  user can create pre-aggregate tables using the DDL
-  
-  ```
-  CREATE DATAMAP agg_sales
-  ON TABLE sales
-  USING "preaggregate"
-  AS
-  SELECT country, sex, sum(quantity), avg(price)
-  FROM sales
-  GROUP BY country, sex
-  ```
-  
-<b><p align="left">Functions supported in pre-aggregate tables</p></b>
-
-| Function | Rollup supported |
-|-----------|----------------|
-| SUM | Yes |
-| AVG | Yes |
-| MAX | Yes |
-| MIN | Yes |
-| COUNT | Yes |
-
-
-##### How pre-aggregate tables are selected
-For the main table **sales** and pre-aggregate table **agg_sales** created above, queries of the 
-kind
-```
-SELECT country, sex, sum(quantity), avg(price) from sales GROUP BY country, sex
-
-SELECT sex, sum(quantity) from sales GROUP BY sex
-
-SELECT sum(price), country from sales GROUP BY country
-``` 
-
-will be transformed by Query Planner to fetch data from pre-aggregate table **agg_sales**
-
-But queries of kind
-```
-SELECT user_id, country, sex, sum(quantity), avg(price) from sales GROUP BY user_id, country, sex
-
-SELECT sex, avg(quantity) from sales GROUP BY sex
-
-SELECT country, max(price) from sales GROUP BY country
-```
-
-will fetch the data from the main table **sales**
-
-##### Loading data to pre-aggregate tables
-For existing table with loaded data, data load to pre-aggregate table will be triggered by the 
-CREATE DATAMAP statement when user creates the pre-aggregate table.
-For incremental loads after aggregates tables are created, loading data to main table triggers 
-the load to pre-aggregate tables once main table loading is complete.These loads are automic 
-meaning that data on main table and aggregate tables are only visible to the user after all tables 
-are loaded
-
-##### Querying data from pre-aggregate tables
-Pre-aggregate tables cannot be queries directly.Queries are to be made on main table.Internally 
-carbondata will check associated pre-aggregate tables with the main table and if the 
-pre-aggregate tables satisfy the query condition, the plan is transformed automatically to use 
-pre-aggregate table to fetch the data
-
-##### Compacting pre-aggregate tables
-Compaction command (ALTER TABLE COMPACT) need to be run separately on each pre-aggregate table.
-Running Compaction command on main table will **not automatically** compact the pre-aggregate 
-tables.Compaction is an optional operation for pre-aggregate table. If compaction is performed on
-main table but not performed on pre-aggregate table, all queries still can benefit from 
-pre-aggregate tables.To further improve performance on pre-aggregate tables, compaction can be 
-triggered on pre-aggregate tables directly, it will merge the segments inside pre-aggregate table. 
-
-##### Update/Delete Operations on pre-aggregate tables
-This functionality is not supported.
-
-  NOTE (<b>RESTRICTION</b>):
-  * Update/Delete operations are <b>not supported</b> on main table which has pre-aggregate tables 
-  created on it.All the pre-aggregate tables <b>will have to be dropped</b> before update/delete 
-  operations can be performed on the main table.Pre-aggregate tables can be rebuilt manually 
-  after update/delete operations are completed
- 
-##### Delete Segment Operations on pre-aggregate tables
-This functionality is not supported.
-
-  NOTE (<b>RESTRICTION</b>):
-  * Delete Segment operations are <b>not supported</b> on main table which has pre-aggregate tables 
-  created on it.All the pre-aggregate tables <b>will have to be dropped</b> before update/delete 
-  operations can be performed on the main table.Pre-aggregate tables can be rebuilt manually 
-  after delete segment operations are completed
-  
-##### Alter Table Operations on pre-aggregate tables
-This functionality is not supported.
-
-  NOTE (<b>RESTRICTION</b>):
-  * Adding new column in new table does not have any affect on pre-aggregate tables. However if 
-  dropping or renaming a column has impact in pre-aggregate table, such operations will be 
-  rejected and error will be thrown.All the pre-aggregate tables <b>will have to be dropped</b> 
-  before Alter Operations can be performed on the main table.Pre-aggregate tables can be rebuilt 
-  manually after Alter Table operations are completed
-  
-### Supporting timeseries data (Alpha feature in 1.3.0)
-Carbondata has built-in understanding of time hierarchy and levels: year, month, day, hour, minute.
-Multiple pre-aggregate tables can be created for the hierarchy and Carbondata can do automatic 
-roll-up for the queries on these hierarchies.
-
-  ```
-  CREATE DATAMAP agg_year
-  ON TABLE sales
-  USING "timeseries"
-  DMPROPERTIES (
-  'event_time’=’order_time’,
-  'year_granualrity’=’1’,
-  ) AS
-  SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price),
-   avg(price) FROM sales GROUP BY order_time, country, sex
-    
-  CREATE DATAMAP agg_month
-  ON TABLE sales
-  USING "timeseries"
-  DMPROPERTIES (
-  'event_time’=’order_time’,
-  'month_granualrity’=’1’,
-  ) AS
-  SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price),
-   avg(price) FROM sales GROUP BY order_time, country, sex
-    
-  CREATE DATAMAP agg_day
-  ON TABLE sales
-  USING "timeseries"
-  DMPROPERTIES (
-  'event_time’=’order_time’,
-  'day_granualrity’=’1’,
-  ) AS
-  SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price),
-   avg(price) FROM sales GROUP BY order_time, country, sex
-        
-  CREATE DATAMAP agg_sales_hour
-  ON TABLE sales
-  USING "timeseries"
-  DMPROPERTIES (
-  'event_time’=’order_time’,
-  'hour_granualrity’=’1’,
-  ) AS
-  SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price),
-   avg(price) FROM sales GROUP BY order_time, country, sex
-  
-  CREATE DATAMAP agg_minute
-  ON TABLE sales
-  USING "timeseries"
-  DMPROPERTIES (
-  'event_time’=’order_time’,
-  'minute_granualrity’=’1’,
-  ) AS
-  SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price),
-   avg(price) FROM sales GROUP BY order_time, country, sex
-  ```
-  
-  For Querying data and automatically roll-up to the desired aggregation level,Carbondata supports 
-  UDF as
-  ```
-  timeseries(timeseries column name, ‘aggregation level’)
-  ```
-  ```
-  Select timeseries(order_time, ‘hour’), sum(quantity) from sales group by timeseries(order_time,
-  ’hour’)
-  ```
-  
-  It is **not necessary** to create pre-aggregate tables for each granularity unless required for 
-  query.Carbondata can roll-up the data and fetch it.
-   
-  For Example: For main table **sales** , If pre-aggregate tables were created as  
-  
-  ```
-  CREATE DATAMAP agg_day
-    ON TABLE sales
-    USING "timeseries"
-    DMPROPERTIES (
-    'event_time’=’order_time’,
-    'day_granualrity’=’1’,
-    ) AS
-    SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price),
-     avg(price) FROM sales GROUP BY order_time, country, sex
-          
-    CREATE DATAMAP agg_sales_hour
-    ON TABLE sales
-    USING "timeseries"
-    DMPROPERTIES (
-    'event_time’=’order_time’,
-    'hour_granualrity’=’1’,
-    ) AS
-    SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price),
-     avg(price) FROM sales GROUP BY order_time, country, sex
-  ```
-  
-  Queries like below will be rolled-up and fetched from pre-aggregate tables
-  ```
-  Select timeseries(order_time, ‘month’), sum(quantity) from sales group by timeseries(order_time,
-    ’month’)
-    
-  Select timeseries(order_time, ‘year’), sum(quantity) from sales group by timeseries(order_time,
-    ’year’)
-  ```
-  
-  NOTE (<b>RESTRICTION</b>):
-  * Only value of 1 is supported for hierarchy levels. Other hierarchy levels are not supported. 
-  Other hierarchy levels are not supported
-  * pre-aggregate tables for the desired levels needs to be created one after the other
-  * pre-aggregate tables created for each level needs to be dropped separately 
-    
-
 ## BUCKETING
 
   Bucketing feature can be used to distribute/organize the table/partition data into multiple files such
@@ -1070,20 +874,20 @@ roll-up for the queries on these hierarchies.
   ```
 
   NOTE:
-  * Bucketing can not be performed for columns of Complex Data Types.
-  * Columns in the BUCKETCOLUMN parameter must be only dimension. The BUCKETCOLUMN parameter can not be a measure or a combination of measures and dimensions.
+  * Bucketing cannot be performed for columns of Complex Data Types.
+  * Columns in the BUCKETCOLUMN parameter must be dimensions. The BUCKETCOLUMN parameter cannot be a measure or a combination of measures and dimensions.
 
   Example:
   ```
   CREATE TABLE IF NOT EXISTS productSchema.productSalesTable (
-                                productNumber Int,
-                                saleQuantity Int,
-                                productName String,
-                                storeCity String,
-                                storeProvince String,
-                                productCategory String,
-                                productBatch String,
-                                revenue Int)
+                                productNumber INT,
+                                saleQuantity INT,
+                                productName STRING,
+                                storeCity STRING,
+                                storeProvince STRING,
+                                productCategory STRING,
+                                productBatch STRING,
+                                revenue INT)
   STORED BY 'carbondata'
   TBLPROPERTIES ('BUCKETNUMBER'='4', 'BUCKETCOLUMNS'='productName')
   ```
@@ -1092,7 +896,7 @@ roll-up for the queries on these hierarchies.
 
 ### SHOW SEGMENT
 
-  This command is used to get the segments of CarbonData table.
+  This command is used to list the segments of a CarbonData table.
 
   ```
   SHOW SEGMENTS FOR TABLE [db_name.]table_name LIMIT number_of_segments
@@ -1158,7 +962,7 @@ roll-up for the queries on these hierarchies.
   NOTE:
   carbon.input.segments: Specifies the segment IDs to be queried. This property allows you to query specified segments of the specified table. The CarbonScan will read data from specified segments only.
   
-  If user wants to query with segments reading in multi threading mode, then CarbonSession.threadSet can be used instead of SET query.
+  If the user wants to query with segments reading in multi-threading mode, then CarbonSession.threadSet can be used instead of the SET query.
   ```
   CarbonSession.threadSet ("carbon.input.segments.<database_name>.<table_name>","<list of segment IDs>");
   ```
@@ -1168,7 +972,7 @@ roll-up for the queries on these hierarchies.
   SET carbon.input.segments.<database_name>.<table_name> = *;
   ```
   
-  If user wants to query with segments reading in multi threading mode, then CarbonSession.threadSet can be used instead of SET query. 
+  If the user wants to query with segments reading in multi-threading mode, then CarbonSession.threadSet can be used instead of the SET query.
   ```
   CarbonSession.threadSet ("carbon.input.segments.<database_name>.<table_name>","*");
   ```

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/b0888c1b/src/site/markdown/faq.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/faq.md b/src/site/markdown/faq.md
index baa46cc..8f04e4f 100644
--- a/src/site/markdown/faq.md
+++ b/src/site/markdown/faq.md
@@ -80,7 +80,7 @@ In order to build CarbonData project it is necessary to specify the spark profil
 
 ## How Carbon will behave when execute insert operation in abnormal scenarios?
 Carbon support insert operation, you can refer to the syntax mentioned in [DML Operations on CarbonData](dml-operation-on-carbondata.md).
-First, create a soucre table in spark-sql and load data into this created table.
+First, create a source table in spark-sql and load data into it.
 
 ```
 CREATE TABLE source_table(
@@ -124,7 +124,7 @@ id  city    name
 
 As result shows, the second column is city in carbon table, but what inside is name, such as jack. This phenomenon is same with insert data into hive table.
 
-If you want to insert data into corresponding column in carbon table, you have to specify the column order same in insert statment. 
+If you want to insert data into the corresponding columns in the carbon table, you have to specify the same column order in the insert statement.
 
 ```
 INSERT INTO TABLE carbon_table SELECT id, city, name FROM source_table;

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/b0888c1b/src/site/markdown/installation-guide.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/installation-guide.md b/src/site/markdown/installation-guide.md
index 1ba5dd1..0c8790b 100644
--- a/src/site/markdown/installation-guide.md
+++ b/src/site/markdown/installation-guide.md
@@ -141,7 +141,6 @@ mv carbondata.tar.gz carbonlib/
 
 ```
 ./bin/spark-submit
---conf spark.sql.hive.thriftServer.singleSession=true
 --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer
 $SPARK_HOME/carbonlib/$CARBON_ASSEMBLY_JAR <carbon_store_path>
 ```
@@ -151,13 +150,23 @@ $SPARK_HOME/carbonlib/$CARBON_ASSEMBLY_JAR <carbon_store_path>
 | CARBON_ASSEMBLY_JAR | CarbonData assembly jar name present in the `$SPARK_HOME/carbonlib/` folder. | carbondata_2.xx-x.x.x-SNAPSHOT-shade-hadoop2.7.2.jar |
 | carbon_store_path | This is a parameter to the CarbonThriftServer class. This a HDFS path where CarbonData files will be kept. Strongly Recommended to put same as carbon.storelocation parameter of carbon.properties. | `hdfs://<host_name>:port/user/hive/warehouse/carbon.store` |
 
+**NOTE**: From Spark 1.6, by default the Thrift server runs in multi-session mode, which means each JDBC/ODBC connection owns a copy of its own SQL configuration and temporary function registry. Cached tables are still shared, though. If you prefer to run the Thrift server in single-session mode and share all SQL configuration and the temporary function registry, set the option `spark.sql.hive.thriftServer.singleSession` to `true`. You may either add this option to `spark-defaults.conf`, or pass it to `spark-submit` via `--conf`:
+
+```
+./bin/spark-submit
+--conf spark.sql.hive.thriftServer.singleSession=true
+--class org.apache.carbondata.spark.thriftserver.CarbonThriftServer
+$SPARK_HOME/carbonlib/$CARBON_ASSEMBLY_JAR <carbon_store_path>
+```
+
+**But** in single-session mode, if one user changes the database from one connection, the database of the other connections will be changed too.
+
 **Examples**
    
    * Start with default memory and executors.
 
 ```
 ./bin/spark-submit
---conf spark.sql.hive.thriftServer.singleSession=true
 --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer 
 $SPARK_HOME/carbonlib
 /carbondata_2.xx-x.x.x-SNAPSHOT-shade-hadoop2.7.2.jar
@@ -167,7 +176,7 @@ hdfs://<host_name>:port/user/hive/warehouse/carbon.store
    * Start with Fixed executors and resources.
 
 ```
-./bin/spark-submit --conf spark.sql.hive.thriftServer.singleSession=true 
+./bin/spark-submit
 --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer 
 --num-executors 3 --driver-memory 20g --executor-memory 250g 
 --executor-cores 32 

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/b0888c1b/src/site/markdown/streaming-guide.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/streaming-guide.md b/src/site/markdown/streaming-guide.md
index 201f8e0..aa9eaef 100644
--- a/src/site/markdown/streaming-guide.md
+++ b/src/site/markdown/streaming-guide.md
@@ -152,6 +152,80 @@ property name | default | description
 --- | --- | ---
 carbon.streaming.auto.handoff.enabled | true | whether to auto trigger handoff operation
 
+## Stream data parser
+Configure the property "carbon.stream.parser" to define a stream parser that converts InternalRow to Object[] when writing stream data.
+
+property name | default | description
+--- | --- | ---
+carbon.stream.parser | org.apache.carbondata.streaming.parser.CSVStreamParserImp | the class of the stream parser
+
+Currently CarbonData supports the following two parsers:
+
+**1. org.apache.carbondata.streaming.parser.CSVStreamParserImp**: This is the default stream parser; it gets a line of data (String type) from the first index of InternalRow and converts this String to Object[].
+
+**2. org.apache.carbondata.streaming.parser.RowStreamParserImp**: This stream parser automatically converts InternalRow to Object[] according to the schema of the `DataSet`, for example:
+
+```scala
+ case class FileElement(school: Array[String], age: Int)
+ case class StreamData(id: Int, name: String, city: String, salary: Float, file: FileElement)
+ ...
+
+ var qry: StreamingQuery = null
+ val readSocketDF = spark.readStream
+   .format("socket")
+   .option("host", "localhost")
+   .option("port", 9099)
+   .load()
+   .as[String]
+   .map(_.split(","))
+   .map { fields => {
+     val tmp = fields(4).split("\\$")
+     val file = FileElement(tmp(0).split(":"), tmp(1).toInt)
+     StreamData(fields(0).toInt, fields(1), fields(2), fields(3).toFloat, file)
+   } }
+
+ // Write data from socket stream to carbondata file
+ qry = readSocketDF.writeStream
+   .format("carbondata")
+   .trigger(ProcessingTime("5 seconds"))
+   .option("checkpointLocation", tablePath.getStreamingCheckpointDir)
+   .option("dbName", "default")
+   .option("tableName", "carbon_table")
+   .option(CarbonStreamParser.CARBON_STREAM_PARSER,
+     CarbonStreamParser.CARBON_STREAM_PARSER_ROW_PARSER)
+   .start()
+
+ ...
+```
+
+### How to implement a customized stream parser
+To implement a customized stream parser that converts a specific InternalRow to Object[], implement the `initialize` and `parserRow` methods of the `CarbonStreamParser` interface, for example:
+
+```scala
+ package org.XXX.XXX.streaming.parser
+ 
+ import org.apache.hadoop.conf.Configuration
+ import org.apache.spark.sql.catalyst.InternalRow
+ import org.apache.spark.sql.types.StructType
+ 
+ class XXXStreamParserImp extends CarbonStreamParser {
+ 
+   override def initialize(configuration: Configuration, structType: StructType): Unit = {
+     // user can get the properties from "configuration"
+   }
+   
+   override def parserRow(value: InternalRow): Array[Object] = {
+     // convert InternalRow to Object[](Array[Object] in Scala) 
+   }
+   
+   override def close(): Unit = {
+   }
+ }
+   
+```
+
+Then set the property "carbon.stream.parser" to "org.XXX.XXX.streaming.parser.XXXStreamParserImp".
+
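+For example, the customized parser can then be selected per streaming query through the writeStream option shown earlier. This is a minimal sketch reusing the hypothetical package/class name above; it assumes the option accepts a fully qualified class name, as the constants in the previous example do:
+
+```scala
+ // Sketch: write the socket stream using the customized parser class
+ qry = readSocketDF.writeStream
+   .format("carbondata")
+   .trigger(ProcessingTime("5 seconds"))
+   .option("checkpointLocation", tablePath.getStreamingCheckpointDir)
+   .option("dbName", "default")
+   .option("tableName", "carbon_table")
+   .option(CarbonStreamParser.CARBON_STREAM_PARSER,
+     "org.XXX.XXX.streaming.parser.XXXStreamParserImp")
+   .start()
+```
+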
 ## Close streaming table
 Use below command to handoff all streaming segments to columnar format segments and modify the streaming property to false, this table becomes a normal table.
 ```sql

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/b0888c1b/src/site/markdown/troubleshooting.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/troubleshooting.md b/src/site/markdown/troubleshooting.md
index 68dd538..0156121 100644
--- a/src/site/markdown/troubleshooting.md
+++ b/src/site/markdown/troubleshooting.md
@@ -177,7 +177,7 @@ Note :  Refrain from using "mvn clean package" without specifying the profile.
   Data loading fails with the following exception :
 
    ```
-   Data Load failure exeception
+   Data Load failure exception
    ```
 
   **Possible Cause**
@@ -208,7 +208,7 @@ Note :  Refrain from using "mvn clean package" without specifying the profile.
   Insertion fails with the following exception :
 
    ```
-   Data Load failure exeception
+   Data Load failure exception
    ```
 
   **Possible Cause**

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/b0888c1b/src/site/markdown/useful-tips-on-carbondata.md
----------------------------------------------------------------------
diff --git a/src/site/markdown/useful-tips-on-carbondata.md b/src/site/markdown/useful-tips-on-carbondata.md
index aaf6460..4d43003 100644
--- a/src/site/markdown/useful-tips-on-carbondata.md
+++ b/src/site/markdown/useful-tips-on-carbondata.md
@@ -138,7 +138,7 @@
   |carbon.number.of.cores.while.loading|Default: 2.This value should be >= 2|Specifies the number of cores used for data processing during data loading in CarbonData. |
   |carbon.sort.size|Default: 100000. The value should be >= 100.|Threshold to write local file in sort step when loading data|
   |carbon.sort.file.write.buffer.size|Default:  50000.|DataOutputStream buffer. |
-  |carbon.number.of.cores.block.sort|Default: 7 | If you have huge memory and cpus, increase it as you will|
+  |carbon.number.of.cores.block.sort|Default: 7 | If you have huge memory and many CPUs, increase this value as needed|
   |carbon.merge.sort.reader.thread|Default: 3 |Specifies the number of cores used for temp file merging during data loading in CarbonData.|
   |carbon.merge.sort.prefetch|Default: true | You may want set this value to false if you have not enough memory|
 


[2/2] carbondata-site git commit: update md document from github

Posted by ch...@apache.org.
update md document from github


Project: http://git-wip-us.apache.org/repos/asf/carbondata-site/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata-site/commit/b0888c1b
Tree: http://git-wip-us.apache.org/repos/asf/carbondata-site/tree/b0888c1b
Diff: http://git-wip-us.apache.org/repos/asf/carbondata-site/diff/b0888c1b

Branch: refs/heads/asf-site
Commit: b0888c1b2043d831fd1650d14242f47ff25817ac
Parents: 4d05a0d
Author: chenliang613 <ch...@huawei.com>
Authored: Tue Mar 6 11:07:34 2018 +0800
Committer: chenliang613 <ch...@huawei.com>
Committed: Tue Mar 6 11:07:34 2018 +0800

----------------------------------------------------------------------
 content/data-management-on-carbondata.html      | 409 +++++-------------
 content/faq.html                                |   4 +-
 content/installation-guide.html                 |  11 +-
 content/pdf/maven-pdf-plugin.pdf                | Bin 0 -> 216771 bytes
 content/streaming-guide.html                    |  76 ++++
 content/troubleshooting.html                    |   4 +-
 content/useful-tips-on-carbondata.html          |   2 +-
 .../webapp/data-management-on-carbondata.html   | 409 +++++-------------
 src/main/webapp/faq.html                        |   4 +-
 src/main/webapp/installation-guide.html         |  11 +-
 src/main/webapp/pdf/maven-pdf-plugin.pdf        | Bin 0 -> 155540 bytes
 src/main/webapp/streaming-guide.html            |  76 ++++
 src/main/webapp/troubleshooting.html            |   4 +-
 src/main/webapp/useful-tips-on-carbondata.html  |   2 +-
 .../markdown/data-management-on-carbondata.md   | 420 +++++--------------
 src/site/markdown/faq.md                        |   4 +-
 src/site/markdown/installation-guide.md         |  15 +-
 src/site/markdown/streaming-guide.md            |  74 ++++
 src/site/markdown/troubleshooting.md            |   4 +-
 src/site/markdown/useful-tips-on-carbondata.md  |   2 +-
 20 files changed, 581 insertions(+), 950 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/b0888c1b/content/data-management-on-carbondata.html
----------------------------------------------------------------------
diff --git a/content/data-management-on-carbondata.html b/content/data-management-on-carbondata.html
index 846f11e..05c006b 100644
--- a/content/data-management-on-carbondata.html
+++ b/content/data-management-on-carbondata.html
@@ -182,7 +182,6 @@
 <li><a href="#update-and-delete">UPDATE AND DELETE</a></li>
 <li><a href="#compaction">COMPACTION</a></li>
 <li><a href="#partition">PARTITION</a></li>
-<li><a href="#pre-aggregate-tables">PRE-AGGREGATE TABLES</a></li>
 <li><a href="#bucketing">BUCKETING</a></li>
 <li><a href="#segment-management">SEGMENT MANAGEMENT</a></li>
 </ul>
@@ -249,11 +248,11 @@ And if you care about loading resources isolation strictly, because the system u
 <p>These properties are table level compaction configurations, if not specified, system level configurations in carbon.properties will be used.
 Following are 5 configurations:</p>
 <ul>
-<li>MAJOR_COMPACTION_SIZE: same meaning with carbon.major.compaction.size, size in MB.</li>
-<li>AUTO_LOAD_MERGE: same meaning with carbon.enable.auto.load.merge.</li>
-<li>COMPACTION_LEVEL_THRESHOLD: same meaning with carbon.compaction.level.threshold.</li>
-<li>COMPACTION_PRESERVE_SEGMENTS: same meaning with carbon.numberof.preserve.segments.</li>
-<li>ALLOWED_COMPACTION_DAYS: same meaning with carbon.allowed.compaction.days.</li>
+<li>MAJOR_COMPACTION_SIZE: same meaning as carbon.major.compaction.size, size in MB.</li>
+<li>AUTO_LOAD_MERGE: same meaning as carbon.enable.auto.load.merge.</li>
+<li>COMPACTION_LEVEL_THRESHOLD: same meaning as carbon.compaction.level.threshold.</li>
+<li>COMPACTION_PRESERVE_SEGMENTS: same meaning as carbon.numberof.preserve.segments.</li>
+<li>ALLOWED_COMPACTION_DAYS: same meaning as carbon.allowed.compaction.days.</li>
 </ul>
 <pre><code>TBLPROPERTIES ('MAJOR_COMPACTION_SIZE'='2048',
                'AUTO_LOAD_MERGE'='true',
@@ -272,26 +271,17 @@ Following are 5 configurations:</p>
 <h3>
 <a id="example" class="anchor" href="#example" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Example:</h3>
 <pre><code> CREATE TABLE IF NOT EXISTS productSchema.productSalesTable (
-                                productNumber Int,
-                                productName String,
-                                storeCity String,
-                                storeProvince String,
-                                productCategory String,
-                                productBatch String,
-                                saleQuantity Int,
-                                revenue Int)
+                                productNumber INT,
+                                productName STRING,
+                                storeCity STRING,
+                                storeProvince STRING,
+                                productCategory STRING,
+                                productBatch STRING,
+                                saleQuantity INT,
+                                revenue INT)
  STORED BY 'carbondata'
- TBLPROPERTIES ('DICTIONARY_INCLUDE'='productNumber',
-                'NO_INVERTED_INDEX'='productBatch',
-                'SORT_COLUMNS'='productName,storeCity',
-                'SORT_SCOPE'='NO_SORT',
-                'TABLE_BLOCKSIZE'='512',
-                'MAJOR_COMPACTION_SIZE'='2048',
-                'AUTO_LOAD_MERGE'='true',
-                'COMPACTION_LEVEL_THRESHOLD'='5,6',
-                'COMPACTION_PRESERVE_SEGMENTS'='10',
- 			   'streaming'='true',
-                'ALLOWED_COMPACTION_DAYS'='5')
+ TBLPROPERTIES ('SORT_COLUMNS'='productName,storeCity',
+                'SORT_SCOPE'='NO_SORT')
 </code></pre>
 <h2>
 <a id="create-database" class="anchor" href="#create-database" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>CREATE DATABASE</h2>
@@ -324,7 +314,7 @@ OR
 SHOW TABLES IN defaultdb
 </code></pre>
 <h3>
-<a id="alter-talbe" class="anchor" href="#alter-talbe" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>ALTER TALBE</h3>
+<a id="alter-table" class="anchor" href="#alter-table" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>ALTER TABLE</h3>
 <p>The following section introduce the commands to modify the physical or logical state of the existing table(s).</p>
 <ul>
 <li>
@@ -333,9 +323,9 @@ SHOW TABLES IN defaultdb
 <pre><code>ALTER TABLE [db_name.]table_name RENAME TO new_table_name
 </code></pre>
 <p>Examples:</p>
-<pre><code>ALTER TABLE carbon RENAME TO carbondata
+<pre><code>ALTER TABLE carbon RENAME TO carbonTable
 OR
-ALTER TABLE test_db.carbon RENAME TO test_db.carbondata
+ALTER TABLE test_db.carbon RENAME TO test_db.carbonTable
 </code></pre>
 </li>
 <li>
@@ -408,14 +398,37 @@ Change of decimal data type from lower precision to higher precision will only b
 <li>Before executing this command the old table schema and data should be copied into the new database location.</li>
 <li>If the table is aggregate table, then all the aggregate tables should be copied to the new database location.</li>
 <li>For old store, the time zone of the source and destination cluster should be same.</li>
-<li>If old cluster uses HIVE meta store, refresh will not work as schema file does not exist in file system.</li>
+<li>If the old cluster used the HIVE metastore to store the schema, refresh will not work because the schema file does not exist in the file system.</li>
 </ul>
+<h3>
+<a id="table-and-column-comment" class="anchor" href="#table-and-column-comment" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Table and Column Comment</h3>
+<p>You can provide more information about a table by using a table comment. Similarly, you can provide more information about a particular column using a column comment.
+You can see the column comment of an existing table using the DESCRIBE FORMATTED command.</p>
+<pre><code>CREATE TABLE [IF NOT EXISTS] [db_name.]table_name[(col_name data_type [COMMENT col_comment], ...)]
+  [COMMENT table_comment]
+STORED BY 'carbondata'
+[TBLPROPERTIES (property_name=property_value, ...)]
+</code></pre>
+<p>Example:</p>
+<pre><code>CREATE TABLE IF NOT EXISTS productSchema.productSalesTable (
+                              productNumber Int COMMENT 'unique serial number for product')
+COMMENT 'This is table comment'
+ STORED BY 'carbondata'
+ TBLPROPERTIES ('DICTIONARY_INCLUDE'='productNumber')
+</code></pre>
+<p>You can also SET and UNSET the table comment using the ALTER TABLE command.</p>
+<p>Example to SET table comment:</p>
+<pre><code>ALTER TABLE carbon SET TBLPROPERTIES ('comment'='this table comment is modified');
+</code></pre>
+<p>Example to UNSET table comment:</p>
+<pre><code>ALTER TABLE carbon UNSET TBLPROPERTIES ('comment');
+</code></pre>
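+<p>To check the comments, for example on the table created above, the describe formatted command mentioned earlier can be used (shown here only as an illustrative sketch):</p>
+<pre><code>DESCRIBE FORMATTED productSchema.productSalesTable
+</code></pre>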
 <h2>
 <a id="load-data" class="anchor" href="#load-data" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>LOAD DATA</h2>
 <h3>
 <a id="load-files-to-carbondata-table" class="anchor" href="#load-files-to-carbondata-table" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>LOAD FILES TO CARBONDATA TABLE</h3>
 <p>This command is used to load csv files to carbondata, OPTIONS are not mandatory for data loading process.
-Inside OPTIONS user can provide either of any options like DELIMITER, QUOTECHAR, FILEHEADER, ESCAPECHAR, MULTILINE as per requirement.</p>
+Inside OPTIONS the user can provide any of the options like DELIMITER, QUOTECHAR, FILEHEADER, ESCAPECHAR, MULTILINE as per requirement.</p>
 <pre><code>LOAD DATA [LOCAL] INPATH 'folder_path' 
 INTO TABLE [db_name.]table_name 
 OPTIONS(property_name=property_value, ...)
@@ -438,6 +451,14 @@ OPTIONS(property_name=property_value, ...)
 </code></pre>
 </li>
 <li>
+<p><strong>HEADER:</strong> When you load a CSV file that has no file header and its columns are in the same order as the table schema, add 'HEADER'='false' to the LOAD DATA SQL, as the user need not provide the file header. By default the value is 'true'.
+false: the CSV file is without a file header.
+true: the CSV file is with a file header.</p>
+<pre><code>OPTIONS('HEADER'='false') 
+</code></pre>
+<p>NOTE: If the HEADER option exists and is set to 'true', then the FILEHEADER option is not required.</p>
+</li>
+<li>
 <p><strong>FILEHEADER:</strong> Headers can be provided in the LOAD DATA command if headers are missing in the source files.</p>
 <pre><code>OPTIONS('FILEHEADER'='column1,column2') 
 </code></pre>
@@ -448,7 +469,7 @@ OPTIONS(property_name=property_value, ...)
 </code></pre>
 </li>
 <li>
-<p><strong>ESCAPECHAR:</strong> Escape char can be provided if user want strict validation of escape character on CSV.</p>
+<p><strong>ESCAPECHAR:</strong> Escape char can be provided if the user wants strict validation of the escape character in CSV files.</p>
 <pre><code>OPTIONS('ESCAPECHAR'='\') 
 </code></pre>
 </li>
@@ -499,6 +520,7 @@ OPTIONS(property_name=property_value, ...)
 <p>Example:</p>
 <pre><code>LOAD DATA local inpath '/opt/rawdata/data.csv' INTO table carbontable
 options('DELIMITER'=',', 'QUOTECHAR'='"','COMMENTCHAR'='#',
+'HEADER'='false',
 'FILEHEADER'='empno,empname,designation,doj,workgroupcategory,
 workgroupcategoryname,deptno,deptname,projectcode,
 projectjoindate,projectenddate,attendance,utilization,salary',
@@ -523,10 +545,10 @@ projectjoindate,projectenddate,attendance,utilization,salary',
 <li>BAD_RECORDS_ACTION property can have four type of actions for bad records FORCE, REDIRECT, IGNORE and FAIL.</li>
 <li>FAIL option is its Default value. If the FAIL option is used, then data loading fails if any bad records are found.</li>
 <li>If the REDIRECT option is used, CarbonData will add all bad records in to a separate CSV file. However, this file must not be used for subsequent data loading because the content may not exactly match the source record. You are advised to cleanse the original source record for further data ingestion. This option is used to remind you which records are bad records.</li>
-<li>If the FORCE option is used, then it auto-corrects the data by storing the bad records as NULL before Loading data.</li>
+<li>If the FORCE option is used, then it auto-converts the data by storing the bad records as NULL before loading the data.</li>
 <li>If the IGNORE option is used, then bad records are neither loaded nor written to the separate CSV file.</li>
 <li>In loaded data, if all records are bad records, the BAD_RECORDS_ACTION is invalid and the load operation fails.</li>
-<li>The maximum number of characters per column is 100000. If there are more than 100000 characters in a column, data loading will fail.</li>
+<li>The maximum number of characters per column is 32000. If there are more than 32000 characters in a column, data loading will fail.</li>
 </ul>
 <p>Example:</p>
 <pre><code>LOAD DATA INPATH 'filepath.csv' INTO TABLE tablename
@@ -572,7 +594,7 @@ It comes with the functionality to aggregate the records of a table by performin
 SET (column_name1, column_name2, ... column_name n) = (column1_expression , column2_expression, ... column n_expression )
 [ WHERE { &lt;filter_condition&gt; } ]
 </code></pre>
-<p>alternatively the following the command can also be used for updating the CarbonData Table :</p>
+<p>Alternatively, the following command can also be used for updating the CarbonData Table:</p>
 <pre><code>UPDATE &lt;table_name&gt;
 SET (column_name1, column_name2) =(select sourceColumn1, sourceColumn2 from sourceTable [ WHERE { &lt;filter_condition&gt; } ] )
 [ WHERE { &lt;filter_condition&gt; } ]
@@ -605,8 +627,7 @@ SET (column_name1, column_name2) =(select sourceColumn1, sourceColumn2 from sour
 </code></pre>
 <h2>
 <a id="compaction" class="anchor" href="#compaction" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>COMPACTION</h2>
-<p>Compaction improves the query performance significantly.
-During the load data, several CarbonData files are generated, this is because data is sorted only within each load (per load segment and one B+ tree index).</p>
+<p>Compaction improves the query performance significantly.</p>
 <p>There are two types of compaction, Minor and Major compaction.</p>
 <pre><code>ALTER TABLE [db_name.]table_name COMPACT 'MINOR/MAJOR'
 </code></pre>
@@ -627,7 +648,8 @@ If any segments are available to be merged, then compaction will run parallel wi
 </ul>
 <p>In Major compaction, multiple segments can be merged into one large segment.
 User will specify the compaction size until which segments can be merged, Major compaction is usually done during the off-peak time.
-This command merges the specified number of segments into one segment:</p>
+Configure the property carbon.major.compaction.size with an appropriate value in MB.</p>
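+<p>For example, a minimal sketch of the corresponding entry in carbon.properties (the value 1024 below is only an illustrative size in MB):</p>
+<pre><code>carbon.major.compaction.size = 1024
+</code></pre>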
+<p>This command merges the specified number of segments into one segment:</p>
 <pre><code>ALTER TABLE table_name COMPACT 'MAJOR'
 </code></pre>
 <ul>
@@ -653,13 +675,13 @@ This command merges the specified number of segments into one segment:</p>
 </code></pre>
 <p>Example:</p>
 <pre><code> CREATE TABLE IF NOT EXISTS productSchema.productSalesTable (
-                              productNumber Int,
-                              productName String,
-                              storeCity String,
-                              storeProvince String,
-                              saleQuantity Int,
-                              revenue Int)
-PARTITIONED BY (productCategory String, productBatch String)
+                              productNumber INT,
+                              productName STRING,
+                              storeCity STRING,
+                              storeProvince STRING,
+                              saleQuantity INT,
+                              revenue INT)
+PARTITIONED BY (productCategory STRING, productBatch STRING)
 STORED BY 'carbondata'
 </code></pre>
 <h4>
@@ -667,15 +689,13 @@ STORED BY 'carbondata'
 <p>This command allows you to load data using static partition.</p>
 <pre><code>LOAD DATA [LOCAL] INPATH 'folder_path' 
 INTO TABLE [db_name.]table_name PARTITION (partition_spec) 
-OPTIONS(property_name=property_value, ...)
-  
+OPTIONS(property_name=property_value, ...)    
 INSERT INTO INTO TABLE [db_name.]table_name PARTITION (partition_spec) &lt;SELECT STATMENT&gt;
 </code></pre>
 <p>Example:</p>
 <pre><code>LOAD DATA LOCAL INPATH '${env:HOME}/staticinput.csv'
 INTO TABLE locationTable
-PARTITION (country = 'US', state = 'CA')
-  
+PARTITION (country = 'US', state = 'CA')  
 INSERT INTO TABLE locationTable
 PARTITION (country = 'US', state = 'AL')
 SELECT &lt;columns list excluding partition columns&gt; FROM another_user
@@ -685,8 +705,7 @@ SELECT &lt;columns list excluding partition columns&gt; FROM another_user
 <p>This command allows you to load data using dynamic partition. If partition spec is not specified, then the partition is considered as dynamic.</p>
 <p>Example:</p>
 <pre><code>LOAD DATA LOCAL INPATH '${env:HOME}/staticinput.csv'
-INTO TABLE locationTable
-        
+INTO TABLE locationTable          
 INSERT INTO TABLE locationTable
 SELECT &lt;columns list excluding partition columns&gt; FROM another_user
 </code></pre>
@@ -702,7 +721,7 @@ SELECT &lt;columns list excluding partition columns&gt; FROM another_user
 </code></pre>
 <h4>
 <a id="insert-overwrite" class="anchor" href="#insert-overwrite" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Insert OVERWRITE</h4>
-<p>This command allows you to insert or load overwrite on a spcific partition.</p>
+<p>This command allows you to insert or load overwrite on a specific partition.</p>
 <pre><code> INSERT OVERWRITE TABLE table_name
  PARTITION (column = 'partition_name')
  select_statement
@@ -729,12 +748,12 @@ STORED BY 'carbondata'
 <p>NOTE: N is the number of hash partitions</p>
 <p>Example:</p>
 <pre><code>CREATE TABLE IF NOT EXISTS hash_partition_table(
-    col_A String,
-    col_B Int,
-    col_C Long,
-    col_D Decimal(10,2),
-    col_F Timestamp
-) PARTITIONED BY (col_E Long)
+    col_A STRING,
+    col_B INT,
+    col_C LONG,
+    col_D DECIMAL(10,2),
+    col_F TIMESTAMP
+) PARTITIONED BY (col_E LONG)
 STORED BY 'carbondata' TBLPROPERTIES('PARTITION_TYPE'='HASH','NUM_PARTITIONS'='9')
 </code></pre>
 <h3>
@@ -754,11 +773,11 @@ STORED BY 'carbondata'
 </ul>
 <p>Example:</p>
 <pre><code>CREATE TABLE IF NOT EXISTS range_partition_table(
-    col_A String,
-    col_B Int,
-    col_C Long,
-    col_D Decimal(10,2),
-    col_E Long
+    col_A STRING,
+    col_B INT,
+    col_C LONG,
+    col_D DECIMAL(10,2),
+    col_E LONG
  ) partitioned by (col_F Timestamp)
  PARTITIONED BY 'carbondata'
  TBLPROPERTIES('PARTITION_TYPE'='RANGE',
@@ -777,12 +796,12 @@ STORED BY 'carbondata'
 <p>NOTE: List partition supports list info in one level group.</p>
 <p>Example:</p>
 <pre><code>CREATE TABLE IF NOT EXISTS list_partition_table(
-    col_B Int,
-    col_C Long,
-    col_D Decimal(10,2),
-    col_E Long,
-    col_F Timestamp
- ) PARTITIONED BY (col_A String)
+    col_B INT,
+    col_C LONG,
+    col_D DECIMAL(10,2),
+    col_E LONG,
+    col_F TIMESTAMP
+ ) PARTITIONED BY (col_A STRING)
  STORED BY 'carbondata'
  TBLPROPERTIES('PARTITION_TYPE'='LIST',
  'LIST_INFO'='aaaa, bbbb, (cccc, dddd), eeee')
@@ -824,234 +843,6 @@ SegmentDir/part-0-0_batchno0-0-1502703086921.carbondata
 <li>When writing SQL on a partition table, try to use filters on the partition column.</li>
 </ul>
 <h2>
-<a id="pre-aggregate-tables" class="anchor" href="#pre-aggregate-tables" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>PRE-AGGREGATE TABLES</h2>
-<p>Carbondata supports pre aggregating of data so that OLAP kind of queries can fetch data
-much faster.Aggregate tables are created as datamaps so that the handling is as efficient as
-other indexing support.Users can create as many aggregate tables they require as datamaps to
-improve their query performance,provided the storage requirements and loading speeds are
-acceptable.</p>
-<p>For main table called <strong>sales</strong> which is defined as</p>
-<pre><code>CREATE TABLE sales (
-order_time timestamp,
-user_id string,
-sex string,
-country string,
-quantity int,
-price bigint)
-STORED BY 'carbondata'
-</code></pre>
-<p>user can create pre-aggregate tables using the DDL</p>
-<pre><code>CREATE DATAMAP agg_sales
-ON TABLE sales
-USING "preaggregate"
-AS
-SELECT country, sex, sum(quantity), avg(price)
-FROM sales
-GROUP BY country, sex
-</code></pre>
-<p><b></b></p><p align="left">Functions supported in pre-aggregate tables</p>
-<table>
-<thead>
-<tr>
-<th>Function</th>
-<th>Rollup supported</th>
-</tr>
-</thead>
-<tbody>
-<tr>
-<td>SUM</td>
-<td>Yes</td>
-</tr>
-<tr>
-<td>AVG</td>
-<td>Yes</td>
-</tr>
-<tr>
-<td>MAX</td>
-<td>Yes</td>
-</tr>
-<tr>
-<td>MIN</td>
-<td>Yes</td>
-</tr>
-<tr>
-<td>COUNT</td>
-<td>Yes</td>
-</tr>
-</tbody>
-</table>
-<h5>
-<a id="how-pre-aggregate-tables-are-selected" class="anchor" href="#how-pre-aggregate-tables-are-selected" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>How pre-aggregate tables are selected</h5>
-<p>For the main table <strong>sales</strong> and pre-aggregate table <strong>agg_sales</strong> created above, queries of the
-kind</p>
-<pre><code>SELECT country, sex, sum(quantity), avg(price) from sales GROUP BY country, sex
-
-SELECT sex, sum(quantity) from sales GROUP BY sex
-
-SELECT sum(price), country from sales GROUP BY country
-</code></pre>
-<p>will be transformed by Query Planner to fetch data from pre-aggregate table <strong>agg_sales</strong></p>
-<p>But queries of kind</p>
-<pre><code>SELECT user_id, country, sex, sum(quantity), avg(price) from sales GROUP BY user_id, country, sex
-
-SELECT sex, avg(quantity) from sales GROUP BY sex
-
-SELECT country, max(price) from sales GROUP BY country
-</code></pre>
-<p>will fetch the data from the main table <strong>sales</strong></p>
-<h5>
-<a id="loading-data-to-pre-aggregate-tables" class="anchor" href="#loading-data-to-pre-aggregate-tables" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Loading data to pre-aggregate tables</h5>
-<p>For existing table with loaded data, data load to pre-aggregate table will be triggered by the
-CREATE DATAMAP statement when user creates the pre-aggregate table.
-For incremental loads after aggregates tables are created, loading data to main table triggers
-the load to pre-aggregate tables once main table loading is complete.These loads are automic
-meaning that data on main table and aggregate tables are only visible to the user after all tables
-are loaded</p>
-<h5>
-<a id="querying-data-from-pre-aggregate-tables" class="anchor" href="#querying-data-from-pre-aggregate-tables" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Querying data from pre-aggregate tables</h5>
-<p>Pre-aggregate tables cannot be queries directly.Queries are to be made on main table.Internally
-carbondata will check associated pre-aggregate tables with the main table and if the
-pre-aggregate tables satisfy the query condition, the plan is transformed automatically to use
-pre-aggregate table to fetch the data</p>
-<h5>
-<a id="compacting-pre-aggregate-tables" class="anchor" href="#compacting-pre-aggregate-tables" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Compacting pre-aggregate tables</h5>
-<p>Compaction command (ALTER TABLE COMPACT) need to be run separately on each pre-aggregate table.
-Running Compaction command on main table will <strong>not automatically</strong> compact the pre-aggregate
-tables.Compaction is an optional operation for pre-aggregate table. If compaction is performed on
-main table but not performed on pre-aggregate table, all queries still can benefit from
-pre-aggregate tables.To further improve performance on pre-aggregate tables, compaction can be
-triggered on pre-aggregate tables directly, it will merge the segments inside pre-aggregate table.</p>
-<h5>
-<a id="updatedelete-operations-on-pre-aggregate-tables" class="anchor" href="#updatedelete-operations-on-pre-aggregate-tables" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Update/Delete Operations on pre-aggregate tables</h5>
-<p>This functionality is not supported.</p>
-<p>NOTE (<b>RESTRICTION</b>):</p>
-<ul>
-<li>Update/Delete operations are <b>not supported</b> on main table which has pre-aggregate tables
-created on it.All the pre-aggregate tables <b>will have to be dropped</b> before update/delete
-operations can be performed on the main table.Pre-aggregate tables can be rebuilt manually
-after update/delete operations are completed</li>
-</ul>
-<h5>
-<a id="delete-segment-operations-on-pre-aggregate-tables" class="anchor" href="#delete-segment-operations-on-pre-aggregate-tables" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Delete Segment Operations on pre-aggregate tables</h5>
-<p>This functionality is not supported.</p>
-<p>NOTE (<b>RESTRICTION</b>):</p>
-<ul>
-<li>Delete Segment operations are <b>not supported</b> on main table which has pre-aggregate tables
-created on it.All the pre-aggregate tables <b>will have to be dropped</b> before update/delete
-operations can be performed on the main table.Pre-aggregate tables can be rebuilt manually
-after delete segment operations are completed</li>
-</ul>
-<h5>
-<a id="alter-table-operations-on-pre-aggregate-tables" class="anchor" href="#alter-table-operations-on-pre-aggregate-tables" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Alter Table Operations on pre-aggregate tables</h5>
-<p>This functionality is not supported.</p>
-<p>NOTE (<b>RESTRICTION</b>):</p>
-<ul>
-<li>Adding new column in new table does not have any affect on pre-aggregate tables. However if
-dropping or renaming a column has impact in pre-aggregate table, such operations will be
-rejected and error will be thrown.All the pre-aggregate tables <b>will have to be dropped</b>
-before Alter Operations can be performed on the main table.Pre-aggregate tables can be rebuilt
-manually after Alter Table operations are completed</li>
-</ul>
-<h3>
-<a id="supporting-timeseries-data-alpha-feature-in-130" class="anchor" href="#supporting-timeseries-data-alpha-feature-in-130" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Supporting timeseries data (Alpha feature in 1.3.0)</h3>
-<p>Carbondata has built-in understanding of time hierarchy and levels: year, month, day, hour, minute.
-Multiple pre-aggregate tables can be created for the hierarchy and Carbondata can do automatic
-roll-up for the queries on these hierarchies.</p>
-<pre><code>CREATE DATAMAP agg_year
-ON TABLE sales
-USING "timeseries"
-DMPROPERTIES (
-'event_time?=?order_time?,
-'year_granualrity?=?1?,
-) AS
-SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price),
- avg(price) FROM sales GROUP BY order_time, country, sex
-  
-CREATE DATAMAP agg_month
-ON TABLE sales
-USING "timeseries"
-DMPROPERTIES (
-'event_time?=?order_time?,
-'month_granualrity?=?1?,
-) AS
-SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price),
- avg(price) FROM sales GROUP BY order_time, country, sex
-  
-CREATE DATAMAP agg_day
-ON TABLE sales
-USING "timeseries"
-DMPROPERTIES (
-'event_time?=?order_time?,
-'day_granualrity?=?1?,
-) AS
-SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price),
- avg(price) FROM sales GROUP BY order_time, country, sex
-      
-CREATE DATAMAP agg_sales_hour
-ON TABLE sales
-USING "timeseries"
-DMPROPERTIES (
-'event_time?=?order_time?,
-'hour_granualrity?=?1?,
-) AS
-SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price),
- avg(price) FROM sales GROUP BY order_time, country, sex
-
-CREATE DATAMAP agg_minute
-ON TABLE sales
-USING "timeseries"
-DMPROPERTIES (
-'event_time?=?order_time?,
-'minute_granualrity?=?1?,
-) AS
-SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price),
- avg(price) FROM sales GROUP BY order_time, country, sex
-</code></pre>
-<p>For Querying data and automatically roll-up to the desired aggregation level,Carbondata supports
-UDF as</p>
-<pre><code>timeseries(timeseries column name, ?aggregation level?)
-</code></pre>
-<pre><code>Select timeseries(order_time, ?hour?), sum(quantity) from sales group by timeseries(order_time,
-?hour?)
-</code></pre>
-<p>It is <strong>not necessary</strong> to create pre-aggregate tables for each granularity unless required for
-query.Carbondata can roll-up the data and fetch it.</p>
-<p>For Example: For main table <strong>sales</strong> , If pre-aggregate tables were created as</p>
-<pre><code>CREATE DATAMAP agg_day
-  ON TABLE sales
-  USING "timeseries"
-  DMPROPERTIES (
-  'event_time?=?order_time?,
-  'day_granualrity?=?1?,
-  ) AS
-  SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price),
-   avg(price) FROM sales GROUP BY order_time, country, sex
-        
-  CREATE DATAMAP agg_sales_hour
-  ON TABLE sales
-  USING "timeseries"
-  DMPROPERTIES (
-  'event_time?=?order_time?,
-  'hour_granualrity?=?1?,
-  ) AS
-  SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price),
-   avg(price) FROM sales GROUP BY order_time, country, sex
-</code></pre>
-<p>Queries like below will be rolled-up and fetched from pre-aggregate tables</p>
-<pre><code>Select timeseries(order_time, ?month?), sum(quantity) from sales group by timeseries(order_time,
-  ?month?)
-  
-Select timeseries(order_time, ?year?), sum(quantity) from sales group by timeseries(order_time,
-  ?year?)
-</code></pre>
-<p>NOTE (<b>RESTRICTION</b>):</p>
-<ul>
-<li>Only value of 1 is supported for hierarchy levels. Other hierarchy levels are not supported.
-Other hierarchy levels are not supported</li>
-<li>pre-aggregate tables for the desired levels needs to be created one after the other</li>
-<li>pre-aggregate tables created for each level needs to be dropped separately</li>
-</ul>
-<h2>
 <a id="bucketing" class="anchor" href="#bucketing" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>BUCKETING</h2>
 <p>Bucketing feature can be used to distribute/organize the table/partition data into multiple files such
 that similar records are present in the same file. While creating a table, user needs to specify the
@@ -1065,19 +856,19 @@ TBLPROPERTIES('BUCKETNUMBER'='noOfBuckets',
 </code></pre>
 <p>NOTE:</p>
 <ul>
-<li>Bucketing can not be performed for columns of Complex Data Types.</li>
-<li>Columns in the BUCKETCOLUMN parameter must be only dimension. The BUCKETCOLUMN parameter can not be a measure or a combination of measures and dimensions.</li>
+<li>Bucketing cannot be performed for columns of Complex Data Types.</li>
+<li>Columns in the BUCKETCOLUMN parameter must be dimensions. The BUCKETCOLUMN parameter cannot be a measure or a combination of measures and dimensions.</li>
 </ul>
 <p>Example:</p>
 <pre><code>CREATE TABLE IF NOT EXISTS productSchema.productSalesTable (
-                              productNumber Int,
-                              saleQuantity Int,
-                              productName String,
-                              storeCity String,
-                              storeProvince String,
-                              productCategory String,
-                              productBatch String,
-                              revenue Int)
+                              productNumber INT,
+                              saleQuantity INT,
+                              productName STRING,
+                              storeCity STRING,
+                              storeProvince STRING,
+                              productCategory STRING,
+                              productBatch STRING,
+                              revenue INT)
 STORED BY 'carbondata'
 TBLPROPERTIES ('BUCKETNUMBER'='4', 'BUCKETCOLUMNS'='productName')
 </code></pre>
@@ -1085,7 +876,7 @@ TBLPROPERTIES ('BUCKETNUMBER'='4', 'BUCKETCOLUMNS'='productName')
 <a id="segment-management" class="anchor" href="#segment-management" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>SEGMENT MANAGEMENT</h2>
 <h3>
 <a id="show-segment" class="anchor" href="#show-segment" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>SHOW SEGMENT</h3>
-<p>This command is used to get the segments of CarbonData table.</p>
+<p>This command is used to list the segments of a CarbonData table.</p>
 <pre><code>SHOW SEGMENTS FOR TABLE [db_name.]table_name LIMIT number_of_segments
 </code></pre>
 <p>Example:</p>
@@ -1125,13 +916,13 @@ The segment created before the particular date will be removed from the specific
 </code></pre>
 <p>NOTE:
 carbon.input.segments: Specifies the segment IDs to be queried. This property allows you to query specified segments of the specified table. The CarbonScan will read data from specified segments only.</p>
-<p>If user wants to query with segments reading in multi threading mode, then CarbonSession.threadSet can be used instead of SET query.</p>
+<p>If the user wants to query with segments read in multi-threading mode, then CarbonSession.threadSet can be used instead of the SET query.</p>
 <pre><code>CarbonSession.threadSet ("carbon.input.segments.&lt;database_name&gt;.&lt;table_name&gt;","&lt;list of segment IDs&gt;");
 </code></pre>
 <p>Reset the segment IDs</p>
 <pre><code>SET carbon.input.segments.&lt;database_name&gt;.&lt;table_name&gt; = *;
 </code></pre>
-<p>If user wants to query with segments reading in multi threading mode, then CarbonSession.threadSet can be used instead of SET query.</p>
+<p>If the user wants to query with segments read in multi-threading mode, then CarbonSession.threadSet can be used instead of the SET query.</p>
 <pre><code>CarbonSession.threadSet ("carbon.input.segments.&lt;database_name&gt;.&lt;table_name&gt;","*");
 </code></pre>
 <p><strong>Examples:</strong></p>

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/b0888c1b/content/faq.html
----------------------------------------------------------------------
diff --git a/content/faq.html b/content/faq.html
index b42f8bd..b51b071 100644
--- a/content/faq.html
+++ b/content/faq.html
@@ -236,7 +236,7 @@ The property carbon.lock.type configuration specifies the type of lock to be acq
 <h2>
 <a id="how-carbon-will-behave-when-execute-insert-operation-in-abnormal-scenarios" class="anchor" href="#how-carbon-will-behave-when-execute-insert-operation-in-abnormal-scenarios" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>How Carbon will behave when execute insert operation in abnormal scenarios?</h2>
 <p>Carbon support insert operation, you can refer to the syntax mentioned in <a href="dml-operation-on-carbondata.html">DML Operations on CarbonData</a>.
-First, create a soucre table in spark-sql and load data into this created table.</p>
+First, create a source table in spark-sql and load data into this table.</p>
 <pre><code>CREATE TABLE source_table(
 id String,
 name String,
@@ -266,7 +266,7 @@ id  city    name
 3   davi    shenzhen
 </code></pre>
 <p>As result shows, the second column is city in carbon table, but what inside is name, such as jack. This phenomenon is same with insert data into hive table.</p>
-<p>If you want to insert data into corresponding column in carbon table, you have to specify the column order same in insert statment.</p>
+<p>If you want to insert data into the corresponding columns in the carbon table, you have to specify the same column order in the insert statement.</p>
 <pre><code>INSERT INTO TABLE carbon_table SELECT id, city, name FROM source_table;
 </code></pre>
 <p><strong>Scenario 2</strong> :</p>

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/b0888c1b/content/installation-guide.html
----------------------------------------------------------------------
diff --git a/content/installation-guide.html b/content/installation-guide.html
index 7da254e..5f1df57 100644
--- a/content/installation-guide.html
+++ b/content/installation-guide.html
@@ -388,7 +388,6 @@ mv carbondata.tar.gz carbonlib/
 <p>a. cd <code>$SPARK_HOME</code></p>
 <p>b. Run the following command to start the CarbonData thrift server.</p>
 <pre><code>./bin/spark-submit
---conf spark.sql.hive.thriftServer.singleSession=true
 --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer
 $SPARK_HOME/carbonlib/$CARBON_ASSEMBLY_JAR &lt;carbon_store_path&gt;
 </code></pre>
@@ -413,12 +412,18 @@ $SPARK_HOME/carbonlib/$CARBON_ASSEMBLY_JAR &lt;carbon_store_path&gt;
 </tr>
 </tbody>
 </table>
+<p><strong>NOTE</strong>: From Spark 1.6, by default the Thrift server runs in multi-session mode, which means each JDBC/ODBC connection owns a copy of its own SQL configuration and temporary function registry. Cached tables are still shared though. If you prefer to run the Thrift server in single-session mode and share all SQL configuration and temporary function registry, please set the option <code>spark.sql.hive.thriftServer.singleSession</code> to <code>true</code>. You may either add this option to <code>spark-defaults.conf</code>, or pass it to <code>spark-submit.sh</code> via <code>--conf</code>:</p>
+<pre><code>./bin/spark-submit
+--conf spark.sql.hive.thriftServer.singleSession=true
+--class org.apache.carbondata.spark.thriftserver.CarbonThriftServer
+$SPARK_HOME/carbonlib/$CARBON_ASSEMBLY_JAR &lt;carbon_store_path&gt;
+</code></pre>
+<p><strong>But</strong> in single-session mode, if one user changes the database from one connection, the database of the other connections will be changed too.</p>
 <p><strong>Examples</strong></p>
 <ul>
 <li>Start with default memory and executors.</li>
 </ul>
 <pre><code>./bin/spark-submit
---conf spark.sql.hive.thriftServer.singleSession=true
 --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer 
 $SPARK_HOME/carbonlib
 /carbondata_2.xx-x.x.x-SNAPSHOT-shade-hadoop2.7.2.jar
@@ -427,7 +432,7 @@ hdfs://&lt;host_name&gt;:port/user/hive/warehouse/carbon.store
 <ul>
 <li>Start with Fixed executors and resources.</li>
 </ul>
-<pre><code>./bin/spark-submit --conf spark.sql.hive.thriftServer.singleSession=true 
+<pre><code>./bin/spark-submit
 --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer 
 --num-executors 3 --driver-memory 20g --executor-memory 250g 
 --executor-cores 32 

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/b0888c1b/content/pdf/maven-pdf-plugin.pdf
----------------------------------------------------------------------
diff --git a/content/pdf/maven-pdf-plugin.pdf b/content/pdf/maven-pdf-plugin.pdf
new file mode 100644
index 0000000..72c0425
Binary files /dev/null and b/content/pdf/maven-pdf-plugin.pdf differ

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/b0888c1b/content/streaming-guide.html
----------------------------------------------------------------------
diff --git a/content/streaming-guide.html b/content/streaming-guide.html
index 8d3effe..43992dd 100644
--- a/content/streaming-guide.html
+++ b/content/streaming-guide.html
@@ -351,6 +351,82 @@ streaming table using following DDL.</p>
 </tbody>
 </table>
 <h2>
+<a id="stream-data-parser" class="anchor" href="#stream-data-parser" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Stream data parser</h2>
+<p>Configure the property "carbon.stream.parser" to define a stream parser that converts InternalRow to Object[] when writing stream data.</p>
+<table>
+<thead>
+<tr>
+<th>property name</th>
+<th>default</th>
+<th>description</th>
+</tr>
+</thead>
+<tbody>
+<tr>
+<td>carbon.stream.parser</td>
+<td>org.apache.carbondata.streaming.parser.CSVStreamParserImp</td>
+<td>the class of the stream parser</td>
+</tr>
+</tbody>
+</table>
+<p>Currently CarbonData supports two parsers, as follows:</p>
+<p><strong>1. org.apache.carbondata.streaming.parser.CSVStreamParserImp</strong>: This is the default stream parser. It gets one line of data (String type) from the first index of InternalRow and converts this String to Object[].</p>
+<p><strong>2. org.apache.carbondata.streaming.parser.RowStreamParserImp</strong>: This stream parser automatically converts InternalRow to Object[] according to the schema of the <code>DataSet</code>, for example:</p>
+<div class="highlight highlight-source-scala"><pre> <span class="pl-k">case</span> <span class="pl-k">class</span> <span class="pl-en">FileElement</span>(<span class="pl-v">school</span>: <span class="pl-en">Array</span>[<span class="pl-k">String</span>], <span class="pl-v">age</span>: <span class="pl-k">Int</span>)
+ <span class="pl-k">case</span> <span class="pl-k">class</span> <span class="pl-en">StreamData</span>(<span class="pl-v">id</span>: <span class="pl-k">Int</span>, <span class="pl-v">name</span>: <span class="pl-k">String</span>, <span class="pl-v">city</span>: <span class="pl-k">String</span>, <span class="pl-v">salary</span>: <span class="pl-k">Float</span>, <span class="pl-v">file</span>: <span class="pl-en">FileElement</span>)
+ ...
+
+ <span class="pl-k">var</span> <span class="pl-en">qry</span><span class="pl-k">:</span> <span class="pl-en">StreamingQuery</span> <span class="pl-k">=</span> <span class="pl-c1">null</span>
+ <span class="pl-k">val</span> <span class="pl-en">readSocketDF</span> <span class="pl-k">=</span> spark.readStream
+   .format(<span class="pl-s"><span class="pl-pds">"</span>socket<span class="pl-pds">"</span></span>)
+   .option(<span class="pl-s"><span class="pl-pds">"</span>host<span class="pl-pds">"</span></span>, <span class="pl-s"><span class="pl-pds">"</span>localhost<span class="pl-pds">"</span></span>)
+   .option(<span class="pl-s"><span class="pl-pds">"</span>port<span class="pl-pds">"</span></span>, <span class="pl-c1">9099</span>)
+   .load()
+   .as[<span class="pl-k">String</span>]
+   .map(_.split(<span class="pl-s"><span class="pl-pds">"</span>,<span class="pl-pds">"</span></span>))
+   .map { fields <span class="pl-k">=&gt;</span> {
+     <span class="pl-k">val</span> <span class="pl-en">tmp</span> <span class="pl-k">=</span> fields(<span class="pl-c1">4</span>).split(<span class="pl-s"><span class="pl-pds">"</span><span class="pl-cce">\\</span>$<span class="pl-pds">"</span></span>)
+     <span class="pl-k">val</span> <span class="pl-en">file</span> <span class="pl-k">=</span> <span class="pl-en">FileElement</span>(tmp(<span class="pl-c1">0</span>).split(<span class="pl-s"><span class="pl-pds">"</span>:<span class="pl-pds">"</span></span>), tmp(<span class="pl-c1">1</span>).toInt)
+     <span class="pl-en">StreamData</span>(fields(<span class="pl-c1">0</span>).toInt, fields(<span class="pl-c1">1</span>), fields(<span class="pl-c1">2</span>), fields(<span class="pl-c1">3</span>).toFloat, file)
+   } }
+
+ <span class="pl-c"><span class="pl-c">//</span> Write data from socket stream to carbondata file</span>
+ qry <span class="pl-k">=</span> readSocketDF.writeStream
+   .format(<span class="pl-s"><span class="pl-pds">"</span>carbondata<span class="pl-pds">"</span></span>)
+   .trigger(<span class="pl-en">ProcessingTime</span>(<span class="pl-s"><span class="pl-pds">"</span>5 seconds<span class="pl-pds">"</span></span>))
+   .option(<span class="pl-s"><span class="pl-pds">"</span>checkpointLocation<span class="pl-pds">"</span></span>, tablePath.getStreamingCheckpointDir)
+   .option(<span class="pl-s"><span class="pl-pds">"</span>dbName<span class="pl-pds">"</span></span>, <span class="pl-s"><span class="pl-pds">"</span>default<span class="pl-pds">"</span></span>)
+   .option(<span class="pl-s"><span class="pl-pds">"</span>tableName<span class="pl-pds">"</span></span>, <span class="pl-s"><span class="pl-pds">"</span>carbon_table<span class="pl-pds">"</span></span>)
+   .option(<span class="pl-en">CarbonStreamParser</span>.<span class="pl-en">CARBON_STREAM_PARSER</span>,
+     <span class="pl-en">CarbonStreamParser</span>.<span class="pl-en">CARBON_STREAM_PARSER_ROW_PARSER</span>)
+   .start()
+
+ ...</pre></div>
+<h3>
+<a id="how-to-implement-a-customized-stream-parser" class="anchor" href="#how-to-implement-a-customized-stream-parser" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>How to implement a customized stream parser</h3>
+<p>If the user needs to implement a customized stream parser to convert a specific InternalRow to Object[], the parser needs to implement the <code>initialize</code> method and the <code>parserRow</code> method of the interface <code>CarbonStreamParser</code>, for example:</p>
+<div class="highlight highlight-source-scala"><pre> <span class="pl-k">package</span> <span class="pl-en">org.XXX.XXX.streaming.parser</span>
+ 
+ <span class="pl-k">import</span> <span class="pl-smi">org.apache.hadoop.conf.</span><span class="pl-smi">Configuration</span>
+ <span class="pl-k">import</span> <span class="pl-smi">org.apache.spark.sql.catalyst.</span><span class="pl-smi">InternalRow</span>
+ <span class="pl-k">import</span> <span class="pl-smi">org.apache.spark.sql.types.</span><span class="pl-smi">StructType</span>
+ 
+ <span class="pl-k">class</span> <span class="pl-en">XXXStreamParserImp</span> <span class="pl-k">extends</span> <span class="pl-e">CarbonStreamParser</span> {
+ 
+   <span class="pl-k">override</span> <span class="pl-k">def</span> <span class="pl-en">initialize</span>(<span class="pl-v">configuration</span>: <span class="pl-en">Configuration</span>, <span class="pl-v">structType</span>: <span class="pl-en">StructType</span>)<span class="pl-k">:</span> <span class="pl-k">Unit</span> <span class="pl-k">=</span> {
+     <span class="pl-c"><span class="pl-c">//</span> user can get the properties from "configuration"</span>
+   }
+   
+   <span class="pl-k">override</span> <span class="pl-k">def</span> <span class="pl-en">parserRow</span>(<span class="pl-v">value</span>: <span class="pl-en">InternalRow</span>)<span class="pl-k">:</span> <span class="pl-en">Array</span>[<span class="pl-en">Object</span>] <span class="pl-k">=</span> {
+     <span class="pl-c"><span class="pl-c">//</span> convert InternalRow to Object[](Array[Object] in Scala) </span>
+   }
+   
+   <span class="pl-k">override</span> <span class="pl-k">def</span> <span class="pl-en">close</span>()<span class="pl-k">:</span> <span class="pl-k">Unit</span> <span class="pl-k">=</span> {
+   }
+ }
+   </pre></div>
+<p>and then set the property "carbon.stream.parser" to "org.XXX.XXX.streaming.parser.XXXStreamParserImp".</p>
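+<p>Alternatively, as a minimal sketch (reusing the placeholder class name above and the writeStream options from the earlier example), the customized parser can be passed directly as an option on the streaming query:</p>
+<pre><code> qry = readSocketDF.writeStream
+   .format("carbondata")
+   .option("checkpointLocation", tablePath.getStreamingCheckpointDir)
+   .option("dbName", "default")
+   .option("tableName", "carbon_table")
+   .option(CarbonStreamParser.CARBON_STREAM_PARSER,
+     "org.XXX.XXX.streaming.parser.XXXStreamParserImp")
+   .start()
+</code></pre>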
+<h2>
 <a id="close-streaming-table" class="anchor" href="#close-streaming-table" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Close streaming table</h2>
 <p>Use below command to handoff all streaming segments to columnar format segments and modify the streaming property to false, this table becomes a normal table.</p>
 <div class="highlight highlight-source-sql"><pre><span class="pl-k">ALTER</span> <span class="pl-k">TABLE</span> streaming_table COMPACT <span class="pl-s"><span class="pl-pds">'</span>close_streaming<span class="pl-pds">'</span></span>

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/b0888c1b/content/troubleshooting.html
----------------------------------------------------------------------
diff --git a/content/troubleshooting.html b/content/troubleshooting.html
index 3a2e311..107fb23 100644
--- a/content/troubleshooting.html
+++ b/content/troubleshooting.html
@@ -288,7 +288,7 @@ For example, you can use scp to copy this file to all the nodes.</p>
 <a id="failed-to-load-data-on-the-cluster" class="anchor" href="#failed-to-load-data-on-the-cluster" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Failed to load data on the cluster</h2>
 <p><strong>Symptom</strong></p>
 <p>Data loading fails with the following exception :</p>
-<pre><code>Data Load failure exeception
+<pre><code>Data Load failure exception
 </code></pre>
 <p><strong>Possible Cause</strong></p>
 <p>The following issue can cause the failure :</p>
@@ -316,7 +316,7 @@ For example, you can use scp to copy this file to all the nodes.</p>
 <a id="failed-to-insert-data-on-the-cluster" class="anchor" href="#failed-to-insert-data-on-the-cluster" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Failed to insert data on the cluster</h2>
 <p><strong>Symptom</strong></p>
 <p>Insertion fails with the following exception :</p>
-<pre><code>Data Load failure exeception
+<pre><code>Data Load failure exception
 </code></pre>
 <p><strong>Possible Cause</strong></p>
 <p>The following issue can cause the failure :</p>

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/b0888c1b/content/useful-tips-on-carbondata.html
----------------------------------------------------------------------
diff --git a/content/useful-tips-on-carbondata.html b/content/useful-tips-on-carbondata.html
index cb19036..6df49a7 100644
--- a/content/useful-tips-on-carbondata.html
+++ b/content/useful-tips-on-carbondata.html
@@ -353,7 +353,7 @@ You can configure CarbonData by tuning following properties in carbon.properties
 <tr>
 <td>carbon.number.of.cores.block.sort</td>
 <td>Default: 7</td>
-<td>If you have huge memory and cpus, increase it as you will</td>
+<td>If you have huge memory and CPUs, increase it as needed</td>
 </tr>
 <tr>
 <td>carbon.merge.sort.reader.thread</td>

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/b0888c1b/src/main/webapp/data-management-on-carbondata.html
----------------------------------------------------------------------
diff --git a/src/main/webapp/data-management-on-carbondata.html b/src/main/webapp/data-management-on-carbondata.html
index 846f11e..05c006b 100644
--- a/src/main/webapp/data-management-on-carbondata.html
+++ b/src/main/webapp/data-management-on-carbondata.html
@@ -182,7 +182,6 @@
 <li><a href="#update-and-delete">UPDATE AND DELETE</a></li>
 <li><a href="#compaction">COMPACTION</a></li>
 <li><a href="#partition">PARTITION</a></li>
-<li><a href="#pre-aggregate-tables">PRE-AGGREGATE TABLES</a></li>
 <li><a href="#bucketing">BUCKETING</a></li>
 <li><a href="#segment-management">SEGMENT MANAGEMENT</a></li>
 </ul>
@@ -249,11 +248,11 @@ And if you care about loading resources isolation strictly, because the system u
 <p>These properties are table level compaction configurations, if not specified, system level configurations in carbon.properties will be used.
 Following are 5 configurations:</p>
 <ul>
-<li>MAJOR_COMPACTION_SIZE: same meaning with carbon.major.compaction.size, size in MB.</li>
-<li>AUTO_LOAD_MERGE: same meaning with carbon.enable.auto.load.merge.</li>
-<li>COMPACTION_LEVEL_THRESHOLD: same meaning with carbon.compaction.level.threshold.</li>
-<li>COMPACTION_PRESERVE_SEGMENTS: same meaning with carbon.numberof.preserve.segments.</li>
-<li>ALLOWED_COMPACTION_DAYS: same meaning with carbon.allowed.compaction.days.</li>
+<li>MAJOR_COMPACTION_SIZE: same meaning as carbon.major.compaction.size, size in MB.</li>
+<li>AUTO_LOAD_MERGE: same meaning as carbon.enable.auto.load.merge.</li>
+<li>COMPACTION_LEVEL_THRESHOLD: same meaning as carbon.compaction.level.threshold.</li>
+<li>COMPACTION_PRESERVE_SEGMENTS: same meaning as carbon.numberof.preserve.segments.</li>
+<li>ALLOWED_COMPACTION_DAYS: same meaning as carbon.allowed.compaction.days.</li>
 </ul>
 <pre><code>TBLPROPERTIES ('MAJOR_COMPACTION_SIZE'='2048',
                'AUTO_LOAD_MERGE'='true',
@@ -272,26 +271,17 @@ Following are 5 configurations:</p>
 <h3>
 <a id="example" class="anchor" href="#example" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Example:</h3>
 <pre><code> CREATE TABLE IF NOT EXISTS productSchema.productSalesTable (
-                                productNumber Int,
-                                productName String,
-                                storeCity String,
-                                storeProvince String,
-                                productCategory String,
-                                productBatch String,
-                                saleQuantity Int,
-                                revenue Int)
+                                productNumber INT,
+                                productName STRING,
+                                storeCity STRING,
+                                storeProvince STRING,
+                                productCategory STRING,
+                                productBatch STRING,
+                                saleQuantity INT,
+                                revenue INT)
  STORED BY 'carbondata'
- TBLPROPERTIES ('DICTIONARY_INCLUDE'='productNumber',
-                'NO_INVERTED_INDEX'='productBatch',
-                'SORT_COLUMNS'='productName,storeCity',
-                'SORT_SCOPE'='NO_SORT',
-                'TABLE_BLOCKSIZE'='512',
-                'MAJOR_COMPACTION_SIZE'='2048',
-                'AUTO_LOAD_MERGE'='true',
-                'COMPACTION_LEVEL_THRESHOLD'='5,6',
-                'COMPACTION_PRESERVE_SEGMENTS'='10',
- 			   'streaming'='true',
-                'ALLOWED_COMPACTION_DAYS'='5')
+ TBLPROPERTIES ('SORT_COLUMNS'='productName,storeCity',
+                'SORT_SCOPE'='NO_SORT')
 </code></pre>
 <h2>
 <a id="create-database" class="anchor" href="#create-database" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>CREATE DATABASE</h2>
@@ -324,7 +314,7 @@ OR
 SHOW TABLES IN defaultdb
 </code></pre>
 <h3>
-<a id="alter-talbe" class="anchor" href="#alter-talbe" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>ALTER TALBE</h3>
+<a id="alter-table" class="anchor" href="#alter-table" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>ALTER TABLE</h3>
 <p>The following section introduce the commands to modify the physical or logical state of the existing table(s).</p>
 <ul>
 <li>
@@ -333,9 +323,9 @@ SHOW TABLES IN defaultdb
 <pre><code>ALTER TABLE [db_name.]table_name RENAME TO new_table_name
 </code></pre>
 <p>Examples:</p>
-<pre><code>ALTER TABLE carbon RENAME TO carbondata
+<pre><code>ALTER TABLE carbon RENAME TO carbonTable
 OR
-ALTER TABLE test_db.carbon RENAME TO test_db.carbondata
+ALTER TABLE test_db.carbon RENAME TO test_db.carbonTable
 </code></pre>
 </li>
 <li>
@@ -408,14 +398,37 @@ Change of decimal data type from lower precision to higher precision will only b
 <li>Before executing this command the old table schema and data should be copied into the new database location.</li>
 <li>If the table is aggregate table, then all the aggregate tables should be copied to the new database location.</li>
 <li>For old store, the time zone of the source and destination cluster should be same.</li>
-<li>If old cluster uses HIVE meta store, refresh will not work as schema file does not exist in file system.</li>
+<li>If the old cluster used HIVE meta store to store the schema, refresh will not work as the schema file does not exist in the file system.</li>
 </ul>
+<h3>
+<a id="table-and-column-comment" class="anchor" href="#table-and-column-comment" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Table and Column Comment</h3>
+<p>You can provide more information about a table by using a table comment. Similarly, you can provide more information about a particular column using a column comment.
+You can see the column comment of an existing table using the describe formatted command.</p>
+<pre><code>CREATE TABLE [IF NOT EXISTS] [db_name.]table_name[(col_name data_type [COMMENT col_comment], ...)]
+  [COMMENT table_comment]
+STORED BY 'carbondata'
+[TBLPROPERTIES (property_name=property_value, ...)]
+</code></pre>
+<p>Example:</p>
+<pre><code>CREATE TABLE IF NOT EXISTS productSchema.productSalesTable (
+                              productNumber Int COMMENT 'unique serial number for product')
+COMMENT 'This is table comment'
+ STORED BY 'carbondata'
+ TBLPROPERTIES ('DICTIONARY_INCLUDE'='productNumber')
+</code></pre>
+<p>You can also SET and UNSET the table comment using the ALTER TABLE command.</p>
+<p>Example to SET table comment:</p>
+<pre><code>ALTER TABLE carbon SET TBLPROPERTIES ('comment'='this table comment is modified');
+</code></pre>
+<p>Example to UNSET table comment:</p>
+<pre><code>ALTER TABLE carbon UNSET TBLPROPERTIES ('comment');
+</code></pre>
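+<p>To check the comments, for example on the table created above, the describe formatted command mentioned earlier can be used (shown here only as an illustrative sketch):</p>
+<pre><code>DESCRIBE FORMATTED productSchema.productSalesTable
+</code></pre>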
 <h2>
 <a id="load-data" class="anchor" href="#load-data" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>LOAD DATA</h2>
 <h3>
 <a id="load-files-to-carbondata-table" class="anchor" href="#load-files-to-carbondata-table" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>LOAD FILES TO CARBONDATA TABLE</h3>
 <p>This command is used to load csv files to carbondata, OPTIONS are not mandatory for data loading process.
-Inside OPTIONS user can provide either of any options like DELIMITER, QUOTECHAR, FILEHEADER, ESCAPECHAR, MULTILINE as per requirement.</p>
+Inside OPTIONS the user can provide any of the options like DELIMITER, QUOTECHAR, FILEHEADER, ESCAPECHAR, MULTILINE as per requirement.</p>
 <pre><code>LOAD DATA [LOCAL] INPATH 'folder_path' 
 INTO TABLE [db_name.]table_name 
 OPTIONS(property_name=property_value, ...)
@@ -438,6 +451,14 @@ OPTIONS(property_name=property_value, ...)
 </code></pre>
 </li>
 <li>
+<p><strong>HEADER:</strong> When you load a CSV file that has no file header and its columns are in the same order as the table schema, add 'HEADER'='false' to the LOAD DATA SQL, as the user need not provide the file header. By default the value is 'true'.
+false: the CSV file is without a file header.
+true: the CSV file is with a file header.</p>
+<pre><code>OPTIONS('HEADER'='false') 
+</code></pre>
+<p>NOTE: If the HEADER option exists and is set to 'true', then the FILEHEADER option is not required.</p>
+</li>
+<li>
 <p><strong>FILEHEADER:</strong> Headers can be provided in the LOAD DATA command if headers are missing in the source files.</p>
 <pre><code>OPTIONS('FILEHEADER'='column1,column2') 
 </code></pre>
@@ -448,7 +469,7 @@ OPTIONS(property_name=property_value, ...)
 </code></pre>
 </li>
 <li>
-<p><strong>ESCAPECHAR:</strong> Escape char can be provided if user want strict validation of escape character on CSV.</p>
+<p><strong>ESCAPECHAR:</strong> Escape char can be provided if the user wants strict validation of the escape character in CSV files.</p>
 <pre><code>OPTIONS('ESCAPECHAR'='\') 
 </code></pre>
 </li>
@@ -499,6 +520,7 @@ OPTIONS(property_name=property_value, ...)
 <p>Example:</p>
 <pre><code>LOAD DATA local inpath '/opt/rawdata/data.csv' INTO table carbontable
 options('DELIMITER'=',', 'QUOTECHAR'='"','COMMENTCHAR'='#',
+'HEADER'='false',
 'FILEHEADER'='empno,empname,designation,doj,workgroupcategory,
 workgroupcategoryname,deptno,deptname,projectcode,
 projectjoindate,projectenddate,attendance,utilization,salary',
@@ -523,10 +545,10 @@ projectjoindate,projectenddate,attendance,utilization,salary',
 <li>BAD_RECORDS_ACTION property can have four type of actions for bad records FORCE, REDIRECT, IGNORE and FAIL.</li>
 <li>FAIL option is its Default value. If the FAIL option is used, then data loading fails if any bad records are found.</li>
 <li>If the REDIRECT option is used, CarbonData will add all bad records in to a separate CSV file. However, this file must not be used for subsequent data loading because the content may not exactly match the source record. You are advised to cleanse the original source record for further data ingestion. This option is used to remind you which records are bad records.</li>
-<li>If the FORCE option is used, then it auto-corrects the data by storing the bad records as NULL before Loading data.</li>
+<li>If the FORCE option is used, then it auto-converts the data by storing the bad records as NULL before loading the data.</li>
 <li>If the IGNORE option is used, then bad records are neither loaded nor written to the separate CSV file.</li>
 <li>In loaded data, if all records are bad records, the BAD_RECORDS_ACTION is invalid and the load operation fails.</li>
-<li>The maximum number of characters per column is 100000. If there are more than 100000 characters in a column, data loading will fail.</li>
+<li>The maximum number of characters per column is 32000. If there are more than 32000 characters in a column, data loading will fail.</li>
 </ul>
 <p>Example:</p>
 <pre><code>LOAD DATA INPATH 'filepath.csv' INTO TABLE tablename
@@ -572,7 +594,7 @@ It comes with the functionality to aggregate the records of a table by performin
 SET (column_name1, column_name2, ... column_name n) = (column1_expression , column2_expression, ... column n_expression )
 [ WHERE { &lt;filter_condition&gt; } ]
 </code></pre>
-<p>alternatively the following the command can also be used for updating the CarbonData Table :</p>
+<p>Alternatively, the following command can also be used for updating the CarbonData Table:</p>
 <pre><code>UPDATE &lt;table_name&gt;
 SET (column_name1, column_name2) =(select sourceColumn1, sourceColumn2 from sourceTable [ WHERE { &lt;filter_condition&gt; } ] )
 [ WHERE { &lt;filter_condition&gt; } ]
@@ -605,8 +627,7 @@ SET (column_name1, column_name2) =(select sourceColumn1, sourceColumn2 from sour
 </code></pre>
 <h2>
 <a id="compaction" class="anchor" href="#compaction" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>COMPACTION</h2>
-<p>Compaction improves the query performance significantly.
-During the load data, several CarbonData files are generated, this is because data is sorted only within each load (per load segment and one B+ tree index).</p>
+<p>Compaction improves the query performance significantly.</p>
 <p>There are two types of compaction, Minor and Major compaction.</p>
 <pre><code>ALTER TABLE [db_name.]table_name COMPACT 'MINOR/MAJOR'
 </code></pre>
@@ -627,7 +648,8 @@ If any segments are available to be merged, then compaction will run parallel wi
 </ul>
 <p>In Major compaction, multiple segments can be merged into one large segment.
 User will specify the compaction size until which segments can be merged, Major compaction is usually done during the off-peak time.
-This command merges the specified number of segments into one segment:</p>
+Configure the property carbon.major.compaction.size with an appropriate value in MB.</p>
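+<p>For example, a minimal sketch of the corresponding entry in carbon.properties (the value 1024 below is only an illustrative size in MB):</p>
+<pre><code>carbon.major.compaction.size = 1024
+</code></pre>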
+<p>This command merges the specified number of segments into one segment:</p>
 <pre><code>ALTER TABLE table_name COMPACT 'MAJOR'
 </code></pre>
 <ul>
@@ -653,13 +675,13 @@ This command merges the specified number of segments into one segment:</p>
 </code></pre>
 <p>Example:</p>
 <pre><code> CREATE TABLE IF NOT EXISTS productSchema.productSalesTable (
-                              productNumber Int,
-                              productName String,
-                              storeCity String,
-                              storeProvince String,
-                              saleQuantity Int,
-                              revenue Int)
-PARTITIONED BY (productCategory String, productBatch String)
+                              productNumber INT,
+                              productName STRING,
+                              storeCity STRING,
+                              storeProvince STRING,
+                              saleQuantity INT,
+                              revenue INT)
+PARTITIONED BY (productCategory STRING, productBatch STRING)
 STORED BY 'carbondata'
 </code></pre>
 <h4>
@@ -667,15 +689,13 @@ STORED BY 'carbondata'
 <p>This command allows you to load data using static partition.</p>
 <pre><code>LOAD DATA [LOCAL] INPATH 'folder_path' 
 INTO TABLE [db_name.]table_name PARTITION (partition_spec) 
-OPTIONS(property_name=property_value, ...)
-  
+OPTIONS(property_name=property_value, ...)    
 INSERT INTO INTO TABLE [db_name.]table_name PARTITION (partition_spec) &lt;SELECT STATMENT&gt;
 </code></pre>
 <p>Example:</p>
 <pre><code>LOAD DATA LOCAL INPATH '${env:HOME}/staticinput.csv'
 INTO TABLE locationTable
-PARTITION (country = 'US', state = 'CA')
-  
+PARTITION (country = 'US', state = 'CA')  
 INSERT INTO TABLE locationTable
 PARTITION (country = 'US', state = 'AL')
 SELECT &lt;columns list excluding partition columns&gt; FROM another_user
@@ -685,8 +705,7 @@ SELECT &lt;columns list excluding partition columns&gt; FROM another_user
 <p>This command allows you to load data using dynamic partition. If partition spec is not specified, then the partition is considered as dynamic.</p>
 <p>Example:</p>
 <pre><code>LOAD DATA LOCAL INPATH '${env:HOME}/staticinput.csv'
-INTO TABLE locationTable
-        
+INTO TABLE locationTable          
 INSERT INTO TABLE locationTable
 SELECT &lt;columns list excluding partition columns&gt; FROM another_user
 </code></pre>
@@ -702,7 +721,7 @@ SELECT &lt;columns list excluding partition columns&gt; FROM another_user
 </code></pre>
 <h4>
 <a id="insert-overwrite" class="anchor" href="#insert-overwrite" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Insert OVERWRITE</h4>
-<p>This command allows you to insert or load overwrite on a spcific partition.</p>
+<p>This command allows you to insert or load overwrite on a specific partition.</p>
 <pre><code> INSERT OVERWRITE TABLE table_name
  PARTITION (column = 'partition_name')
  select_statement
@@ -729,12 +748,12 @@ STORED BY 'carbondata'
 <p>NOTE: N is the number of hash partitions</p>
 <p>Example:</p>
 <pre><code>CREATE TABLE IF NOT EXISTS hash_partition_table(
-    col_A String,
-    col_B Int,
-    col_C Long,
-    col_D Decimal(10,2),
-    col_F Timestamp
-) PARTITIONED BY (col_E Long)
+    col_A STRING,
+    col_B INT,
+    col_C LONG,
+    col_D DECIMAL(10,2),
+    col_F TIMESTAMP
+) PARTITIONED BY (col_E LONG)
 STORED BY 'carbondata' TBLPROPERTIES('PARTITION_TYPE'='HASH','NUM_PARTITIONS'='9')
 </code></pre>
 <h3>
@@ -754,11 +773,11 @@ STORED BY 'carbondata'
 </ul>
 <p>Example:</p>
 <pre><code>CREATE TABLE IF NOT EXISTS range_partition_table(
-    col_A String,
-    col_B Int,
-    col_C Long,
-    col_D Decimal(10,2),
-    col_E Long
+    col_A STRING,
+    col_B INT,
+    col_C LONG,
+    col_D DECIMAL(10,2),
+    col_E LONG
 ) PARTITIONED BY (col_F TIMESTAMP)
 STORED BY 'carbondata'
  TBLPROPERTIES('PARTITION_TYPE'='RANGE',
@@ -777,12 +796,12 @@ STORED BY 'carbondata'
 <p>NOTE: List partition supports list info in one level group.</p>
 <p>Example:</p>
 <pre><code>CREATE TABLE IF NOT EXISTS list_partition_table(
-    col_B Int,
-    col_C Long,
-    col_D Decimal(10,2),
-    col_E Long,
-    col_F Timestamp
- ) PARTITIONED BY (col_A String)
+    col_B INT,
+    col_C LONG,
+    col_D DECIMAL(10,2),
+    col_E LONG,
+    col_F TIMESTAMP
+ ) PARTITIONED BY (col_A STRING)
  STORED BY 'carbondata'
  TBLPROPERTIES('PARTITION_TYPE'='LIST',
  'LIST_INFO'='aaaa, bbbb, (cccc, dddd), eeee')
@@ -824,234 +843,6 @@ SegmentDir/part-0-0_batchno0-0-1502703086921.carbondata
 <li>When writing SQL on a partition table, try to use filters on the partition column, as in the example below.</li>
 </ul>
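+<p>For example, a sketch of a query that lets CarbonData prune partitions of the locationTable defined earlier (the filter values are illustrative):</p>
+<pre><code>SELECT * FROM locationTable
+WHERE country = 'US' AND state = 'CA'
+</code></pre>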
 <h2>
-<a id="pre-aggregate-tables" class="anchor" href="#pre-aggregate-tables" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>PRE-AGGREGATE TABLES</h2>
-<p>Carbondata supports pre aggregating of data so that OLAP kind of queries can fetch data
-much faster.Aggregate tables are created as datamaps so that the handling is as efficient as
-other indexing support.Users can create as many aggregate tables they require as datamaps to
-improve their query performance,provided the storage requirements and loading speeds are
-acceptable.</p>
-<p>For main table called <strong>sales</strong> which is defined as</p>
-<pre><code>CREATE TABLE sales (
-order_time timestamp,
-user_id string,
-sex string,
-country string,
-quantity int,
-price bigint)
-STORED BY 'carbondata'
-</code></pre>
-<p>user can create pre-aggregate tables using the DDL</p>
-<pre><code>CREATE DATAMAP agg_sales
-ON TABLE sales
-USING "preaggregate"
-AS
-SELECT country, sex, sum(quantity), avg(price)
-FROM sales
-GROUP BY country, sex
-</code></pre>
-<p><b></b></p><p align="left">Functions supported in pre-aggregate tables</p>
-<table>
-<thead>
-<tr>
-<th>Function</th>
-<th>Rollup supported</th>
-</tr>
-</thead>
-<tbody>
-<tr>
-<td>SUM</td>
-<td>Yes</td>
-</tr>
-<tr>
-<td>AVG</td>
-<td>Yes</td>
-</tr>
-<tr>
-<td>MAX</td>
-<td>Yes</td>
-</tr>
-<tr>
-<td>MIN</td>
-<td>Yes</td>
-</tr>
-<tr>
-<td>COUNT</td>
-<td>Yes</td>
-</tr>
-</tbody>
-</table>
-<h5>
-<a id="how-pre-aggregate-tables-are-selected" class="anchor" href="#how-pre-aggregate-tables-are-selected" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>How pre-aggregate tables are selected</h5>
-<p>For the main table <strong>sales</strong> and pre-aggregate table <strong>agg_sales</strong> created above, queries of the
-kind</p>
-<pre><code>SELECT country, sex, sum(quantity), avg(price) from sales GROUP BY country, sex
-
-SELECT sex, sum(quantity) from sales GROUP BY sex
-
-SELECT sum(price), country from sales GROUP BY country
-</code></pre>
-<p>will be transformed by Query Planner to fetch data from pre-aggregate table <strong>agg_sales</strong></p>
-<p>But queries of kind</p>
-<pre><code>SELECT user_id, country, sex, sum(quantity), avg(price) from sales GROUP BY user_id, country, sex
-
-SELECT sex, avg(quantity) from sales GROUP BY sex
-
-SELECT country, max(price) from sales GROUP BY country
-</code></pre>
-<p>will fetch the data from the main table <strong>sales</strong></p>
-<h5>
-<a id="loading-data-to-pre-aggregate-tables" class="anchor" href="#loading-data-to-pre-aggregate-tables" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Loading data to pre-aggregate tables</h5>
-<p>For existing table with loaded data, data load to pre-aggregate table will be triggered by the
-CREATE DATAMAP statement when user creates the pre-aggregate table.
-For incremental loads after aggregates tables are created, loading data to main table triggers
-the load to pre-aggregate tables once main table loading is complete.These loads are automic
-meaning that data on main table and aggregate tables are only visible to the user after all tables
-are loaded</p>
-<h5>
-<a id="querying-data-from-pre-aggregate-tables" class="anchor" href="#querying-data-from-pre-aggregate-tables" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Querying data from pre-aggregate tables</h5>
-<p>Pre-aggregate tables cannot be queries directly.Queries are to be made on main table.Internally
-carbondata will check associated pre-aggregate tables with the main table and if the
-pre-aggregate tables satisfy the query condition, the plan is transformed automatically to use
-pre-aggregate table to fetch the data</p>
-<h5>
-<a id="compacting-pre-aggregate-tables" class="anchor" href="#compacting-pre-aggregate-tables" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Compacting pre-aggregate tables</h5>
-<p>Compaction command (ALTER TABLE COMPACT) need to be run separately on each pre-aggregate table.
-Running Compaction command on main table will <strong>not automatically</strong> compact the pre-aggregate
-tables.Compaction is an optional operation for pre-aggregate table. If compaction is performed on
-main table but not performed on pre-aggregate table, all queries still can benefit from
-pre-aggregate tables.To further improve performance on pre-aggregate tables, compaction can be
-triggered on pre-aggregate tables directly, it will merge the segments inside pre-aggregate table.</p>
-<h5>
-<a id="updatedelete-operations-on-pre-aggregate-tables" class="anchor" href="#updatedelete-operations-on-pre-aggregate-tables" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Update/Delete Operations on pre-aggregate tables</h5>
-<p>This functionality is not supported.</p>
-<p>NOTE (<b>RESTRICTION</b>):</p>
-<ul>
-<li>Update/Delete operations are <b>not supported</b> on main table which has pre-aggregate tables
-created on it.All the pre-aggregate tables <b>will have to be dropped</b> before update/delete
-operations can be performed on the main table.Pre-aggregate tables can be rebuilt manually
-after update/delete operations are completed</li>
-</ul>
-<h5>
-<a id="delete-segment-operations-on-pre-aggregate-tables" class="anchor" href="#delete-segment-operations-on-pre-aggregate-tables" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Delete Segment Operations on pre-aggregate tables</h5>
-<p>This functionality is not supported.</p>
-<p>NOTE (<b>RESTRICTION</b>):</p>
-<ul>
-<li>Delete Segment operations are <b>not supported</b> on main table which has pre-aggregate tables
-created on it.All the pre-aggregate tables <b>will have to be dropped</b> before update/delete
-operations can be performed on the main table.Pre-aggregate tables can be rebuilt manually
-after delete segment operations are completed</li>
-</ul>
-<h5>
-<a id="alter-table-operations-on-pre-aggregate-tables" class="anchor" href="#alter-table-operations-on-pre-aggregate-tables" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Alter Table Operations on pre-aggregate tables</h5>
-<p>This functionality is not supported.</p>
-<p>NOTE (<b>RESTRICTION</b>):</p>
-<ul>
-<li>Adding new column in new table does not have any affect on pre-aggregate tables. However if
-dropping or renaming a column has impact in pre-aggregate table, such operations will be
-rejected and error will be thrown.All the pre-aggregate tables <b>will have to be dropped</b>
-before Alter Operations can be performed on the main table.Pre-aggregate tables can be rebuilt
-manually after Alter Table operations are completed</li>
-</ul>
-<h3>
-<a id="supporting-timeseries-data-alpha-feature-in-130" class="anchor" href="#supporting-timeseries-data-alpha-feature-in-130" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>Supporting timeseries data (Alpha feature in 1.3.0)</h3>
-<p>Carbondata has built-in understanding of time hierarchy and levels: year, month, day, hour, minute.
-Multiple pre-aggregate tables can be created for the hierarchy and Carbondata can do automatic
-roll-up for the queries on these hierarchies.</p>
-<pre><code>CREATE DATAMAP agg_year
-ON TABLE sales
-USING "timeseries"
-DMPROPERTIES (
-'event_time?=?order_time?,
-'year_granualrity?=?1?,
-) AS
-SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price),
- avg(price) FROM sales GROUP BY order_time, country, sex
-  
-CREATE DATAMAP agg_month
-ON TABLE sales
-USING "timeseries"
-DMPROPERTIES (
-'event_time?=?order_time?,
-'month_granualrity?=?1?,
-) AS
-SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price),
- avg(price) FROM sales GROUP BY order_time, country, sex
-  
-CREATE DATAMAP agg_day
-ON TABLE sales
-USING "timeseries"
-DMPROPERTIES (
-'event_time?=?order_time?,
-'day_granualrity?=?1?,
-) AS
-SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price),
- avg(price) FROM sales GROUP BY order_time, country, sex
-      
-CREATE DATAMAP agg_sales_hour
-ON TABLE sales
-USING "timeseries"
-DMPROPERTIES (
-'event_time?=?order_time?,
-'hour_granualrity?=?1?,
-) AS
-SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price),
- avg(price) FROM sales GROUP BY order_time, country, sex
-
-CREATE DATAMAP agg_minute
-ON TABLE sales
-USING "timeseries"
-DMPROPERTIES (
-'event_time?=?order_time?,
-'minute_granualrity?=?1?,
-) AS
-SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price),
- avg(price) FROM sales GROUP BY order_time, country, sex
-</code></pre>
-<p>For Querying data and automatically roll-up to the desired aggregation level,Carbondata supports
-UDF as</p>
-<pre><code>timeseries(timeseries column name, ?aggregation level?)
-</code></pre>
-<pre><code>Select timeseries(order_time, ?hour?), sum(quantity) from sales group by timeseries(order_time,
-?hour?)
-</code></pre>
-<p>It is <strong>not necessary</strong> to create pre-aggregate tables for each granularity unless required for
-query.Carbondata can roll-up the data and fetch it.</p>
-<p>For Example: For main table <strong>sales</strong> , If pre-aggregate tables were created as</p>
-<pre><code>CREATE DATAMAP agg_day
-  ON TABLE sales
-  USING "timeseries"
-  DMPROPERTIES (
-  'event_time?=?order_time?,
-  'day_granualrity?=?1?,
-  ) AS
-  SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price),
-   avg(price) FROM sales GROUP BY order_time, country, sex
-        
-  CREATE DATAMAP agg_sales_hour
-  ON TABLE sales
-  USING "timeseries"
-  DMPROPERTIES (
-  'event_time?=?order_time?,
-  'hour_granualrity?=?1?,
-  ) AS
-  SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price),
-   avg(price) FROM sales GROUP BY order_time, country, sex
-</code></pre>
-<p>Queries like below will be rolled-up and fetched from pre-aggregate tables</p>
-<pre><code>Select timeseries(order_time, ?month?), sum(quantity) from sales group by timeseries(order_time,
-  ?month?)
-  
-Select timeseries(order_time, ?year?), sum(quantity) from sales group by timeseries(order_time,
-  ?year?)
-</code></pre>
-<p>NOTE (<b>RESTRICTION</b>):</p>
-<ul>
-<li>Only value of 1 is supported for hierarchy levels. Other hierarchy levels are not supported.
-Other hierarchy levels are not supported</li>
-<li>pre-aggregate tables for the desired levels needs to be created one after the other</li>
-<li>pre-aggregate tables created for each level needs to be dropped separately</li>
-</ul>
-<h2>
 <a id="bucketing" class="anchor" href="#bucketing" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>BUCKETING</h2>
 <p>Bucketing feature can be used to distribute/organize the table/partition data into multiple files such
 that similar records are present in the same file. While creating a table, user needs to specify the
@@ -1065,19 +856,19 @@ TBLPROPERTIES('BUCKETNUMBER'='noOfBuckets',
 </code></pre>
 <p>NOTE:</p>
 <ul>
-<li>Bucketing can not be performed for columns of Complex Data Types.</li>
-<li>Columns in the BUCKETCOLUMN parameter must be only dimension. The BUCKETCOLUMN parameter can not be a measure or a combination of measures and dimensions.</li>
+<li>Bucketing cannot be performed for columns of Complex Data Types.</li>
+<li>Columns in the BUCKETCOLUMN parameter must be dimensions. The BUCKETCOLUMN parameter cannot be a measure or a combination of measures and dimensions.</li>
 </ul>
 <p>Example:</p>
 <pre><code>CREATE TABLE IF NOT EXISTS productSchema.productSalesTable (
-                              productNumber Int,
-                              saleQuantity Int,
-                              productName String,
-                              storeCity String,
-                              storeProvince String,
-                              productCategory String,
-                              productBatch String,
-                              revenue Int)
+                              productNumber INT,
+                              saleQuantity INT,
+                              productName STRING,
+                              storeCity STRING,
+                              storeProvince STRING,
+                              productCategory STRING,
+                              productBatch STRING,
+                              revenue INT)
 STORED BY 'carbondata'
 TBLPROPERTIES ('BUCKETNUMBER'='4', 'BUCKETCOLUMNS'='productName')
 </code></pre>
@@ -1085,7 +876,7 @@ TBLPROPERTIES ('BUCKETNUMBER'='4', 'BUCKETCOLUMNS'='productName')
 <a id="segment-management" class="anchor" href="#segment-management" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>SEGMENT MANAGEMENT</h2>
 <h3>
 <a id="show-segment" class="anchor" href="#show-segment" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>SHOW SEGMENT</h3>
-<p>This command is used to get the segments of CarbonData table.</p>
+<p>This command is used to list the segments of a CarbonData table.</p>
 <pre><code>SHOW SEGMENTS FOR TABLE [db_name.]table_name LIMIT number_of_segments
 </code></pre>
 <p>Example:</p>
@@ -1125,13 +916,13 @@ The segment created before the particular date will be removed from the specific
 </code></pre>
 <p>NOTE:
 carbon.input.segments: Specifies the segment IDs to be queried. This property allows you to query specified segments of the specified table. The CarbonScan will read data from specified segments only.</p>
-<p>If user wants to query with segments reading in multi threading mode, then CarbonSession.threadSet can be used instead of SET query.</p>
+<p>If the user wants to query with segments reading in multi-threading mode, then CarbonSession.threadSet can be used instead of the SET query.</p>
 <pre><code>CarbonSession.threadSet ("carbon.input.segments.&lt;database_name&gt;.&lt;table_name&gt;","&lt;list of segment IDs&gt;");
 </code></pre>
 <p>Reset the segment IDs</p>
 <pre><code>SET carbon.input.segments.&lt;database_name&gt;.&lt;table_name&gt; = *;
 </code></pre>
-<p>If user wants to query with segments reading in multi threading mode, then CarbonSession.threadSet can be used instead of SET query.</p>
+<p>If the user wants to query with segments reading in multi-threading mode, then CarbonSession.threadSet can be used instead of the SET query.</p>
 <pre><code>CarbonSession.threadSet ("carbon.input.segments.&lt;database_name&gt;.&lt;table_name&gt;","*");
 </code></pre>
 <p><strong>Examples:</strong></p>

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/b0888c1b/src/main/webapp/faq.html
----------------------------------------------------------------------
diff --git a/src/main/webapp/faq.html b/src/main/webapp/faq.html
index b42f8bd..b51b071 100644
--- a/src/main/webapp/faq.html
+++ b/src/main/webapp/faq.html
@@ -236,7 +236,7 @@ The property carbon.lock.type configuration specifies the type of lock to be acq
 <h2>
 <a id="how-carbon-will-behave-when-execute-insert-operation-in-abnormal-scenarios" class="anchor" href="#how-carbon-will-behave-when-execute-insert-operation-in-abnormal-scenarios" aria-hidden="true"><span aria-hidden="true" class="octicon octicon-link"></span></a>How Carbon will behave when execute insert operation in abnormal scenarios?</h2>
 <p>Carbon support insert operation, you can refer to the syntax mentioned in <a href="dml-operation-on-carbondata.html">DML Operations on CarbonData</a>.
-First, create a soucre table in spark-sql and load data into this created table.</p>
+First, create a source table in spark-sql and load data into it.</p>
 <pre><code>CREATE TABLE source_table(
 id String,
 name String,
@@ -266,7 +266,7 @@ id  city    name
 3   davi    shenzhen
 </code></pre>
 <p>As the result shows, the second column in the carbon table is city, but it actually contains name values such as jack. This behavior is the same as when inserting data into a hive table.</p>
-<p>If you want to insert data into corresponding column in carbon table, you have to specify the column order same in insert statment.</p>
+<p>If you want to insert data into the corresponding columns in the carbon table, you have to specify the same column order in the insert statement.</p>
 <pre><code>INSERT INTO TABLE carbon_table SELECT id, city, name FROM source_table;
 </code></pre>
 <p><strong>Scenario 2</strong> :</p>

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/b0888c1b/src/main/webapp/installation-guide.html
----------------------------------------------------------------------
diff --git a/src/main/webapp/installation-guide.html b/src/main/webapp/installation-guide.html
index 7da254e..5f1df57 100644
--- a/src/main/webapp/installation-guide.html
+++ b/src/main/webapp/installation-guide.html
@@ -388,7 +388,6 @@ mv carbondata.tar.gz carbonlib/
 <p>a. cd <code>$SPARK_HOME</code></p>
 <p>b. Run the following command to start the CarbonData thrift server.</p>
 <pre><code>./bin/spark-submit
---conf spark.sql.hive.thriftServer.singleSession=true
 --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer
 $SPARK_HOME/carbonlib/$CARBON_ASSEMBLY_JAR &lt;carbon_store_path&gt;
 </code></pre>
@@ -413,12 +412,18 @@ $SPARK_HOME/carbonlib/$CARBON_ASSEMBLY_JAR &lt;carbon_store_path&gt;
 </tr>
 </tbody>
 </table>
+<p><strong>NOTE</strong>: From Spark 1.6, the Thrift server runs in multi-session mode by default, which means each JDBC/ODBC connection owns its own copy of the SQL configuration and temporary function registry. Cached tables are still shared, though. If you prefer to run the Thrift server in single-session mode and share all SQL configuration and the temporary function registry, set the option <code>spark.sql.hive.thriftServer.singleSession</code> to <code>true</code>. You may either add this option to <code>spark-defaults.conf</code>, or pass it to <code>spark-submit</code> via <code>--conf</code>:</p>
+<pre><code>./bin/spark-submit
+--conf spark.sql.hive.thriftServer.singleSession=true
+--class org.apache.carbondata.spark.thriftserver.CarbonThriftServer
+$SPARK_HOME/carbonlib/$CARBON_ASSEMBLY_JAR &lt;carbon_store_path&gt;
+</code></pre>
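+<p>Alternatively, a minimal sketch of the equivalent entry in <code>spark-defaults.conf</code>:</p>
+<pre><code># $SPARK_HOME/conf/spark-defaults.conf
+spark.sql.hive.thriftServer.singleSession true
+</code></pre>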
+<p><strong>But</strong> in single-session mode, if one user changes the database from one connection, the database of the other connections will be changed too.</p>
 <p><strong>Examples</strong></p>
 <ul>
 <li>Start with default memory and executors.</li>
 </ul>
 <pre><code>./bin/spark-submit
---conf spark.sql.hive.thriftServer.singleSession=true
 --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer 
 $SPARK_HOME/carbonlib
 /carbondata_2.xx-x.x.x-SNAPSHOT-shade-hadoop2.7.2.jar
@@ -427,7 +432,7 @@ hdfs://&lt;host_name&gt;:port/user/hive/warehouse/carbon.store
 <ul>
 <li>Start with Fixed executors and resources.</li>
 </ul>
-<pre><code>./bin/spark-submit --conf spark.sql.hive.thriftServer.singleSession=true 
+<pre><code>./bin/spark-submit
 --class org.apache.carbondata.spark.thriftserver.CarbonThriftServer 
 --num-executors 3 --driver-memory 20g --executor-memory 250g 
 --executor-cores 32 

http://git-wip-us.apache.org/repos/asf/carbondata-site/blob/b0888c1b/src/main/webapp/pdf/maven-pdf-plugin.pdf
----------------------------------------------------------------------
diff --git a/src/main/webapp/pdf/maven-pdf-plugin.pdf b/src/main/webapp/pdf/maven-pdf-plugin.pdf
new file mode 100644
index 0000000..cb6de01
Binary files /dev/null and b/src/main/webapp/pdf/maven-pdf-plugin.pdf differ