Posted to commits@carbondata.apache.org by ch...@apache.org on 2016/07/20 14:15:58 UTC

[1/2] incubator-carbondata git commit: Updated docs as per latest changes

Repository: incubator-carbondata
Updated Branches:
  refs/heads/master 1730082ff -> 56a1e402f


Updated docs as per latest changes

Updated doc


Project: http://git-wip-us.apache.org/repos/asf/incubator-carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-carbondata/commit/d13f4700
Tree: http://git-wip-us.apache.org/repos/asf/incubator-carbondata/tree/d13f4700
Diff: http://git-wip-us.apache.org/repos/asf/incubator-carbondata/diff/d13f4700

Branch: refs/heads/master
Commit: d13f4700ce54afd0e6e6bbafd5d1252a710f2d84
Parents: 1730082
Author: ravipesala <ra...@gmail.com>
Authored: Tue Jul 19 12:41:30 2016 +0530
Committer: chenliang613 <ch...@apache.org>
Committed: Wed Jul 20 22:13:28 2016 +0800

----------------------------------------------------------------------
 README.md                                       |  19 +--
 docs/Carbon-Interfaces.md                       |  72 ----------
 docs/Carbon-Packaging-and-Interfaces.md         |  72 ++++++++++
 docs/Carbondata-Management.md                   | 144 -------------------
 docs/DDL-Operations-on-Carbon.md                | 131 ++++++-----------
 docs/DML-Operations-on-Carbon.md                | 138 ++++++------------
 docs/Data-Management.md                         | 141 ++++++++++++++++++
 ...stalling-CarbonData-And-IDE-Configuartion.md |  46 ++----
 docs/Quick-Start.md                             |  89 +++++-------
 9 files changed, 364 insertions(+), 488 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/incubator-carbondata/blob/d13f4700/README.md
----------------------------------------------------------------------
diff --git a/README.md b/README.md
index c26367c..0b6535b 100644
--- a/README.md
+++ b/README.md
@@ -30,33 +30,36 @@ CarbonData file format is a columnar store in HDFS, it has many features that a
 * Column group: Allows multiple columns to form a column group that is stored in row format. This reduces the row reconstruction cost at query time.
 * Supports various use cases with one single data format: interactive OLAP-style query, sequential access (big scan), random access (narrow scan). 
 
+### Documentation
+Please visit [CarbonData cwiki](https://cwiki.apache.org/confluence/display/CARBONDATA)
+
 ### Building CarbonData and using Development tools
-Please refer [Building CarbonData and configuring IDE](docs/Installing-CarbonData-And-IDE-Configuartion.md)
+Please refer [Building CarbonData and configuring IDE](https://cwiki.apache.org/confluence/display/CARBONDATA/Building+CarbonData+And+IDE+Configuration)
 
 ### Getting Started
-Read the [quick start](docs/Quick-Start.md)
+Read the [quick start](https://cwiki.apache.org/confluence/display/CARBONDATA/Quick+Start)
 
 ### Usage of CarbonData
- [DDL Operations on CarbonData](docs/DDL-Operations-on-Carbon.md) 
+ [DDL Operations on CarbonData](https://cwiki.apache.org/confluence/display/CARBONDATA/DDL+operations+on+CarbonData) 
  
- [DML Operations on CarbonData](docs/DML-Operations-on-Carbon.md)  
+ [DML Operations on CarbonData](https://cwiki.apache.org/confluence/display/CARBONDATA/DML+operations+on+CarbonData)  
  
- [CarbonData data management](docs/Carbondata-Management.md)  
+ [CarbonData data management](https://cwiki.apache.org/confluence/display/CARBONDATA/Data+Management)  
 
 ### CarbonData File Structure and interfaces
-Please refer [CarbonData File Format](docs/Carbondata-File-Structure-and-Format.md) and [CarbonData Interfaces](docs/Carbon-Interfaces.md)
+Please refer [CarbonData File Format](https://cwiki.apache.org/confluence/display/CARBONDATA/CarbonData+File+Structure+and+Format) and [CarbonData Interfaces](https://cwiki.apache.org/confluence/display/CARBONDATA/Carbon+Packaging+and+Interfaces)
 
 ### Other Technical Material
 [Apache CarbonData meetup material](docs/Apache-CarbonData-meetup-material.pdf)
 
 ### Fork and Contribute
 This is an active open source project for everyone, and we are always open to people who want to use this system or contribute to it. 
-This guide document introduce [how to contribute to CarbonData](docs/How-to-contribute-to-Apache-CarbonData.md).
+This guide introduces [how to contribute to CarbonData](https://cwiki.apache.org/confluence/display/CARBONDATA/Contributing+to+CarbonData).
 
 ### Contact us
 To get involved in CarbonData:
 
-* [Subscribe:dev@carbondata.incubator.apache.org](mailto:dev-subscribe@carbondata.incubator.apache.org) then [mail](mailto:dev@carbondata.incubator.apache.org) to us
+* We discuss design and implementation issues on [dev@carbondata.incubator.apache.org](mailto:dev@carbondata.incubator.apache.org). Join by emailing [dev-subscribe@carbondata.incubator.apache.org](mailto:dev-subscribe@carbondata.incubator.apache.org).
 * Report issues on [Jira](https://issues.apache.org/jira/browse/CARBONDATA).
 
 ## About

http://git-wip-us.apache.org/repos/asf/incubator-carbondata/blob/d13f4700/docs/Carbon-Interfaces.md
----------------------------------------------------------------------
diff --git a/docs/Carbon-Interfaces.md b/docs/Carbon-Interfaces.md
deleted file mode 100644
index dcd3c4a..0000000
--- a/docs/Carbon-Interfaces.md
+++ /dev/null
@@ -1,72 +0,0 @@
-## Packaging
-Carbon provides following JAR packages:
-
-![carbon modules2](https://cloud.githubusercontent.com/assets/6500698/14255195/831c6e90-fac5-11e5-87ab-3b16d84918fb.png)
-
-- carbon-store.jar or carbondata-assembly.jar: This is the main Jar for carbon project, the target user of it are both user and developer. 
-      - For MapReduce application users, this jar provides API to read and write carbon files through CarbonInput/OutputFormat in carbon-hadoop module.
-      - For developer, this jar can be used to integrate carbon with processing engine like spark and hive, by leveraging API in carbon-processing module.
-
-- carbon-spark.jar(Currently it is part of assembly jar): provides support for spark user, spark user can manipulate carbon data files by using native spark DataFrame/SQL interface. Apart from this, in order to leverage carbon's builtin lifecycle management function, higher level concept like Managed Carbon Table, Database and corresponding DDL are introduced.
-
-- carbon-hive.jar(not yet provided): similar to carbon-spark, which provides integration to carbon and hive.
-
-## API
-Carbon can be used in following scenarios:
-### 1. For MapReduce application user
-This User API is provided by carbon-hadoop. In this scenario, user can process carbon files in his MapReduce application by choosing CarbonInput/OutputFormat, and is responsible using it correctly.Currently only CarbonInputFormat is provided and OutputFormat will be provided soon.
-
-
-### 2. For Spark user 
-This User API is provided by the Spark itself. There are also two levels of APIs
--  **Carbon File**
-
-Similar to parquet, json, or other data source in Spark, carbon can be used with data source API. For example(please refer to DataFrameAPIExample for the more detail):
-```
-// User can create a DataFrame from any data source or transformation.
-val df = ...
-
-// Write data
-// User can write a DataFrame to a carbon file
- df.write
-   .format("org.apache.spark.sql.CarbonSource")
-   .option("tableName", "carbontable")
-   .mode(SaveMode.Overwrite)
-   .save()
-
-
-// read carbon data by data source API
-df = carbonContext.read
-  .format("org.apache.spark.sql.CarbonSource")
-  .option("tableName", "carbontable")
-  .load("/path")
-
-// User can then use DataFrame for analysis
-df.count
-SVMWithSGD.train(df, numIterations)
-
-// User can also register the DataFrame with a table name, and use SQL for analysis
-df.registerTempTable("t1")  // register temporary table in SparkSQL catalog
-df.registerHiveTable("t2")  // Or, use a implicit funtion to register to Hive metastore
-sqlContext.sql("select count(*) from t1").show
-```
-
-- **Managed Carbon Table**
-
-Since carbon has builtin support for high level concept like Table, Database, and supports full data lifecycle management, instead of dealing with just files, user can use carbon specific DDL to manipulate data in Table and Database level. Please refer [DDL](https://github.com/HuaweiBigData/carbondata/wiki/Language-Manual:-DDL) and [DML] (https://github.com/HuaweiBigData/carbondata/wiki/Language-Manual:-DML)
-
-For example:
-```
-// Use SQL to manage table and query data
-carbonContext.sql("create database db1")
-carbonContext.sql("use database db1")
-carbonContext.sql("show databases")
-carbonContext.sql("create table tbl1 using org.carbondata.spark")
-carbonContext.sql("load data into table tlb1 path 'some_files'")
-carbonContext.sql("select count(*) from tbl1")
-```
-
-### 3. For developer
-For developer who want to integrate carbon into a processing engine like spark/hive/flink, use API provided by carbon-hadoop and carbon-processing:
-  - Query: integrate carbon-hadoop with engine specific API, like spark data source API 
-  - Data life cycle management: carbon provides utility functions in carbon-processing to manage data life cycle, like data loading, compact, retention, schema evolution. Developer can implement DDLs of their choice and leverage these utility function to do data life cycle management.

http://git-wip-us.apache.org/repos/asf/incubator-carbondata/blob/d13f4700/docs/Carbon-Packaging-and-Interfaces.md
----------------------------------------------------------------------
diff --git a/docs/Carbon-Packaging-and-Interfaces.md b/docs/Carbon-Packaging-and-Interfaces.md
new file mode 100644
index 0000000..9897480
--- /dev/null
+++ b/docs/Carbon-Packaging-and-Interfaces.md
@@ -0,0 +1,72 @@
+## Packaging
+Carbon provides following JAR packages:
+
+![carbon modules2](https://cloud.githubusercontent.com/assets/6500698/14255195/831c6e90-fac5-11e5-87ab-3b16d84918fb.png)
+
+- **carbon-store.jar or carbondata-assembly.jar:** This is the main jar for the carbon project; its target users are both end users and developers. 
+      - For MapReduce application users, this jar provides an API to read and write carbon files through CarbonInput/OutputFormat in the carbon-hadoop module.
+      - For developers, this jar can be used to integrate carbon with processing engines like spark and hive, by leveraging the API in the carbon-processing module.
+
+- **carbon-spark.jar (currently part of the assembly jar):** provides support for spark users, who can manipulate carbon data files using the native spark DataFrame/SQL interface. Apart from this, in order to leverage carbon's built-in lifecycle management functions, higher level concepts like Managed Carbon Table, Database and the corresponding DDL are introduced.
+
+- **carbon-hive.jar (not yet provided):** similar to carbon-spark, providing integration between carbon and hive.
+
+## API
+Carbon can be used in the following scenarios:
+### 1. For MapReduce application users
+This user API is provided by carbon-hadoop. In this scenario, users can process carbon files in their MapReduce applications by choosing CarbonInput/OutputFormat, and are responsible for using it correctly. Currently only CarbonInputFormat is provided; OutputFormat will be provided soon.
+
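+A minimal sketch of the idea (not taken from the CarbonData docs): wiring CarbonInputFormat into a standard Hadoop job. The package name of CarbonInputFormat and its type parameter are assumptions here; only the standard Hadoop Job API calls are taken as given.
+```
+import org.apache.hadoop.conf.Configuration
+import org.apache.hadoop.fs.Path
+import org.apache.hadoop.mapreduce.Job
+import org.apache.hadoop.mapreduce.lib.input.FileInputFormat
+// Hypothetical import: the class ships in the carbon-hadoop module, exact package may differ
+import org.carbondata.hadoop.CarbonInputFormat
+
+val job = Job.getInstance(new Configuration(), "read carbon files")
+// Read splits through Carbon's InputFormat instead of the default TextInputFormat
+job.setInputFormatClass(classOf[CarbonInputFormat[_]])
+// Point the job at a carbon table location inside the store path (placeholder path)
+FileInputFormat.addInputPath(job, new Path("/path/to/carbonstore/db/table"))
+// Mapper, Reducer and output configuration follow as in any other MapReduce application
+```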
+
+### 2. For Spark user 
+This user API is provided by Spark itself. There are two levels of APIs:
+-  **Carbon File**
+
+Similar to parquet, json, or other data sources in Spark, carbon can be used with the data source API. For example (please refer to DataFrameAPIExample for more detail):
+```
+// User can create a DataFrame from any data source or transformation.
+val df = ...
+
+// Write data
+// User can write a DataFrame to a carbon file
+ df.write
+   .format("carbondata")
+   .option("tableName", "carbontable")
+   .mode(SaveMode.Overwrite)
+   .save()
+
+
+// read carbon data by data source API
+val df = carbonContext.read
+  .format("carbondata")
+  .option("tableName", "carbontable")
+  .load("/path")
+
+// User can then use DataFrame for analysis
+df.count
+SVMWithSGD.train(df, numIterations)
+
+// User can also register the DataFrame with a table name, and use SQL for analysis
+df.registerTempTable("t1")  // register temporary table in SparkSQL catalog
+df.registerHiveTable("t2")  // Or, use an implicit function to register to the Hive metastore
+sqlContext.sql("select count(*) from t1").show
+```
+
+- **Managed Carbon Table**
+
+Since carbon has built-in support for high level concepts like Table and Database, and supports full data lifecycle management, instead of dealing with just files users can use carbon specific DDL to manipulate data at the Table and Database level. Please refer to [DDL](https://github.com/HuaweiBigData/carbondata/wiki/Language-Manual:-DDL) and [DML](https://github.com/HuaweiBigData/carbondata/wiki/Language-Manual:-DML).
+
+For example:
+```
+// Use SQL to manage table and query data
+create database db1;
+use database db1;
+show databases;
+create table tbl1 using org.carbondata.spark;
+load data into table tbl1 path 'some_files';
+select count(*) from tbl1;
+```
+
+### 3. For developers
+For developers who want to integrate carbon into a processing engine like spark/hive/flink, use the APIs provided by carbon-hadoop and carbon-processing:
+  - Query: integrate carbon-hadoop with an engine specific API, like the spark data source API 
+  - Data life cycle management: carbon provides utility functions in carbon-processing to manage the data life cycle, like data loading, compaction, retention, and schema evolution. Developers can implement DDLs of their choice and leverage these utility functions to do data life cycle management.

http://git-wip-us.apache.org/repos/asf/incubator-carbondata/blob/d13f4700/docs/Carbondata-Management.md
----------------------------------------------------------------------
diff --git a/docs/Carbondata-Management.md b/docs/Carbondata-Management.md
deleted file mode 100644
index 06b7cc7..0000000
--- a/docs/Carbondata-Management.md
+++ /dev/null
@@ -1,144 +0,0 @@
-
-* [Load Data](#Load Data)
-* [Deleting Data](#Deleting Data)
-* [Compacting Data](#Compacting Data)
-
-
-***
-
-
-# Load Data
-### Scenario
-Once the table is created, data can be loaded into table using LOAD DATA command and will be available for query. When data load is triggered, the data is encoded in Carbon format and copied into HDFS Carbon store path(mentioned in carbon.properties file) in compressed, multi dimentional columnar format for quick analysis queries.
-The same command can be used for loading the new data or to update the existing data.
-Only one data load can be triggered for one table. The high cardinality columns of the dictionary encoding are automatically recognized and these columns will not be used as dictionary encoding.
-
-### Prerequisite
-
- The Table must be created.
-
-### Procedure
-
-Data loading is a process that involves execution of various steps to read, sort, and encode the date in Carbon store format. Each step will be executed in different threads.
-After data loading process is complete, the status (success/partial success) will be updated to Carbon store metadata. Following are the data load status:
-
-1. Success: All the data is loaded into table and no bad records found.
-2. Partial Success: Data is loaded into table and bad records are found. Bad records are stored at carbon.badrecords.location.
-
-In case of failure, the error will be logged in error log.
-Details of loads can be seen with SHOW SEGMENTS command.
-* Sequence Id
-* Status of data load
-* Load Start time
-* Load End time
-Following steps needs to be performed for invoking data load.
-Run the following command for historical data load:
-Command:
-```ruby
-LOAD DATA [LOCAL] INPATH 'folder_path' [OVERWRITE] INTO TABLE [db_name.]table_name
-OPTIONS(property_name=property_value, ...)
-```
-OPTIONS are also mandatory for data loading process. Inside OPTIONS user can provide either of any options like DELIMITER,QUOTECHAR, ESCAPERCHAR,MULTILINE as per need.
-
-Note: The path shall be canonical path.
-
-***
-
-# Deleting Data
-### Scenario
-If you have loaded wrong data into the table, or too many bad records and wanted to modify and reload the data, you can delete required loads data. The load can be deleted using the load ID or if the table contains date field then the data can be deleted using the date field.
-
-### Delete by Segment ID
-
-Each segment has a unique segment ID associated with it. Using this segment ID, you can remove the segment.
-Run the following command to get the segmentID.
-Command:
-```ruby
-SHOW SEGMENTS FOR Table dbname.tablename LIMIT number_of_segments
-```
-Example:
-```ruby
-SHOW SEGMENTS FOR TABLE carbonTable
-```
-The above command will show all the segments of the table carbonTable.
-```ruby
-SHOW SEGMENTS FOR TABLE carbonTable LIMIT 3
-```
-The above DDL will show only limited number of segments specified by number_of_segments.
-
-output: 
-
-| SegmentSequenceId | Status | Load Start Time | Load End Time | 
-|--------------|-----------------|--------------------|--------------------| 
-| 2| Success | 2015-11-19 20:25:... | 2015-11-19 20:49:... | 
-| 1| Marked for Delete | 2015-11-19 19:54:... | 2015-11-19 20:08:... | 
-| 0| Marked for Update | 2015-11-19 19:14:... | 2015-11-19 19:14:... | 
- 
-The show segment command output consists of SegmentSequenceID, START_TIME OF LOAD, END_TIME OF LOAD, and LOAD STATUS. The latest load will be displayed first in the output.
-After you get the segment ID of the segment that you want to delete, execute the following command to delete the selected segment.
-Command:
-```ruby
-DELETE SEGMENT segment_sequence_id1, segments_sequence_id2, .... FROM TABLE tableName
-```
-Example:
-```ruby
-DELETE SEGMENT l,2,3 FROM TABLE carbonTable
-```
-
-### Delete by Date Field
-
-If the table contains date field, you can delete the data based on a specific date.
-Command:
-```ruby
-DELETE FROM TABLE [schema_name.]table_name WHERE[DATE_FIELD]BEFORE [DATE_VALUE]
-```
-Example:
-```ruby
-DELETE FROM TABLE table_name WHERE productionDate BEFORE '2017-07-01'
-```
-Here productionDate is the column of type time stamp.
-The above DDL will delete all the data before the date '2017-07-01'.
-
-
-Note: 
-* When the delete segment DML is called, segment will not be deleted physically from the file system. Instead the segment status will be marked as "Marked for Delete". For the query execution, this deleted segment will be excluded.
-* The deleted segment will be deleted physically during the next load operation and only after the maximum query execution time configured using "max.query.execution.time". By default it is 60 minutes.
-* If the user wants to force delete the segment physically then he can use CLEAN FILES DML.
-Example:
-```ruby
-CLEAN FILES FOR TABLE table1
-```
-This DML will physically delete the segment which are "Marked for delete" immediately.
-
-
-
-***
-
-# Compacting Data
-### Scenario
-Frequent data ingestion results in several fragmented carbon files in the store directory. Since data is sorted only within each load, the indices perform only within each load. This mean that there will be one index for each load and as number of data load increases, the number of indices also increases. As each index works only on one load, the performance of indices is reduced. Carbon provides provision for compacting the loads. Compaction process combines several segments into one large segment by merge sorting the data from across the segments.
-
-### Prerequisite
-
- The data should be loaded multiple times.
-
-### Procedure
-
-There are two types of compaction Minor and Major compaction.
-Minor Compaction:
-In minor compaction the user can specify how many loads to be merged. Minor compaction triggers for every data load if the parameter carbon.enable.auto.load.merge is set. If any segments are available to be merged, then compaction will run parallel with data load.
-There are 2 levels in minor compaction.
-* Level 1: Merging of the segments which are not yet compacted.
-* Level 2: Merging of the compacted segments again to form a bigger segment.
-Major Compaction:
-In Major compaction, many segments can be merged into one big segment. User will specify the compaction size until which segments can be merged. Major compaction is usually done during the off-peak time.
-
-### Parameters of Compaction
-| Parameter | Default | Applicable | Description | 
-| --------- | --------| -----------|-------------|
-| carbon.compaction.level.threshold | 4,3 | Minor | This property is for minor compaction which decides how many segments to be merged.**Example**: if it is set as 2,3 then minor compaction will be triggered for every 2 segments. 3 is the number of level 1 compacted segment which is further compacted to new segment.
-Valid values are from 0-100. |
-| carbon.major.compaction.size | 1024 mb | Major | Major compaction size can be configured using this parameter. Sum of the segments which is below this threshold will be merged. |
-| carbon.numberof.preserve.segments | 0 | Minor/Major| If the user wants to preserve some number of segments from being compacted then he can set this property.**Example**:carbon.numberof.preserve.segments=2 then 2 latest segments will always be excluded from the compaction.No segments will be preserved by default. |
-| carbon.allowed.compaction.days | 0 | Minor/Major| Compaction will merge the segments which are loaded with in the specific number of days configured.**Example**: if the configuration is 2, then the segments which are loaded in the time frame of 2 days only will get merged. Segments which are loaded 2 days apart will not be merged.This is disabled by default. |
-| carbon.number.of.cores.while.compacting | 2 | Minor/Major| Number of cores which is used to write data during compaction. |

http://git-wip-us.apache.org/repos/asf/incubator-carbondata/blob/d13f4700/docs/DDL-Operations-on-Carbon.md
----------------------------------------------------------------------
diff --git a/docs/DDL-Operations-on-Carbon.md b/docs/DDL-Operations-on-Carbon.md
index 1cd9f0d..b4dcaa7 100644
--- a/docs/DDL-Operations-on-Carbon.md
+++ b/docs/DDL-Operations-on-Carbon.md
@@ -8,47 +8,26 @@
 
 
 # CREATE TABLE
-### Function
 This command can be used to create carbon table by specifying the list of fields along with the table properties.
 
-### Syntax
-
-  ```ruby
+  ```
   CREATE TABLE [IF NOT EXISTS] [db_name.]table_name 
                [(col_name data_type , ...)]               
-         STORED BY 'org.apache.carbondata.format'
+         STORED BY 'carbondata'
                [TBLPROPERTIES (property_name=property_value, ...)]
                // All Carbon's additional table options will go into properties
   ```
      
-**Example:**
-
-  ```ruby
-  CREATE TABLE IF NOT EXISTS productSchema.productSalesTable (
-                  productNumber Int,
-                  productName String, 
-                  storeCity String, 
-                  storeProvince String, 
-                  productCategory String, 
-                  productBatch String,
-                  saleQuantity Int,
-                  revenue Int)       
-       STORED BY 'org.apache.carbondata.format' 
-       TBLPROPERTIES ('COLUMN_GROUPS'='(productName,productCategory)',
-                     'DICTIONARY_EXCLUDE'='productName',
-                     'DICTIONARY_INCLUDE'='productNumber',
-                     'NO_INVERTED_INDEX'='productBatch')
-  ```
 
 ### Parameter Description
 
-| Parameter | Description |
-| ------------- | -----|
-| db_name | Name of the Database. Database name should consist of Alphanumeric characters and underscore(_) special character. |
-| field_list | Comma separated List of fields with data type. The field names should consist of Alphanumeric characters and underscore(_) special character.|
-|table_name | The name of the table in Database. Table Name should consist of Alphanumeric characters and underscore(_) special character. |
-| STORED BY | "org.apache.carbondata.format", identifies and creates carbon table. |
-| TBLPROPERTIES | List of carbon table properties. |
+| Parameter | Description | Optional |
+| ------------- | -----| ---------- |
+| db_name | Name of the database. The database name should consist of alphanumeric characters and the underscore (_) special character. | YES |
+| field_list | Comma separated list of fields with data type. The field names should consist of alphanumeric characters and the underscore (_) special character. | NO |
+| table_name | The name of the table in the database. The table name should consist of alphanumeric characters and the underscore (_) special character. | NO |
+| STORED BY | "org.apache.carbondata.format" or "carbondata"; identifies and creates a carbon table. | NO |
+| TBLPROPERTIES | List of carbon table properties. | YES |
 
 ### Usage Guideline
 Following are the table properties usage.
@@ -79,88 +58,86 @@ Here, DICTIONARY_EXCLUDE will exclude dictionary creation. This is applicable fo
   TBLPROPERTIES ("NO_INVERTED_INDEX"="column1,column3")
   ```
 Here, NO_INVERTED_INDEX will not use an inverted index for the specified columns. This is applicable for high-cardinality columns and is an optional parameter.
-### Scenarios
-#### Create table by specifying schema
 
- The create table command is same as the Hive DDL. The Carbon's extra configurations are given as table properties.
+*Note: By default all columns except numeric datatype columns are treated as dimensions and all numeric datatype columns are treated as measures. All dimensions except complex datatype columns are part of the multi dimensional key (MDK). This behavior can be overridden using TBLPROPERTIES: if the user wants to keep any column (except complex datatypes) in the multi dimensional key, the column can be listed in either DICTIONARY_EXCLUDE or DICTIONARY_INCLUDE.*
 
-  ```ruby
-  CREATE TABLE [IF NOT EXISTS] [db_name.]table_name
-               [(col_name data_type , ...)]
-         STORED BY \u2018org.carbondata.hive.CarbonHanlder\u2019
-               [TBLPROPERTIES (property_name=property_value ,...)]             
+
+**Example:**
+
+  ```
+  CREATE TABLE IF NOT EXISTS productSchema.productSalesTable (
+                  productNumber Int,
+                  productName String, 
+                  storeCity String, 
+                  storeProvince String, 
+                  productCategory String, 
+                  productBatch String,
+                  saleQuantity Int,
+                  revenue Int)       
+       STORED BY 'carbondata' 
+       TBLPROPERTIES ('COLUMN_GROUPS'='(productName,productCategory)',
+                     'DICTIONARY_EXCLUDE'='productName',
+                     'DICTIONARY_INCLUDE'='productNumber',
+                     'NO_INVERTED_INDEX'='productBatch')
   ```
 ***
 
 # SHOW TABLE
-### Function
 This command can be used to list all the tables in current database or all the tables of a specific database.
 
-### Syntax
-
   ```ruby
   SHOW TABLES [IN db_Name];
   ```
 
+### Parameter Description
+| Parameter | Description | Optional |
+|-----------|-------------| -------- |
+| IN db_Name | Name of the database. Required only if tables of this specific database are to be listed. | YES |
+
 **Example:**
 
   ```ruby
   SHOW TABLES IN ProductSchema;
   ```
 
-### Parameter Description
-| Parameter | Description |
-|-----------|-------------|
-| IN db_Name | Name of the database. Required only if tables of this specific database are to be listed. |
-
-### Usage Guideline
-IN db_Name is optional.
-
-### Scenarios
-NA
-
 ***
 
 # DROP TABLE
-### Function
 This command can be used to delete the existing table.
 
-### Syntax
-
   ```ruby
   DROP TABLE [IF EXISTS] [db_name.]table_name;
   ```
 
+### Parameter Description
+| Parameter | Description | Optional |
+|-----------|-------------| -------- |
+| db_Name | Name of the database. If not specified, current database will be selected. | YES |
+| table_name | Name of the table to be deleted. | NO |
+
 **Example:**
 
   ```ruby
   DROP TABLE IF EXISTS productSchema.productSalesTable;
   ```
 
-### Parameter Description
-| Parameter | Description |
-|-----------|-------------|
-| db_Name | Name of the database. If not specified, current database will be selected. |
-| table_name | Name of the table to be deleted. |
-
-### Usage Guideline
-In this command IF EXISTS and db_name are optional.
-
-### Scenarios
-NA
-
 ***
 
 # COMPACTION
-### Function
  This command will merge the specified number of segments into one segment. This will enhance the query performance of the table.
 
-### Syntax
-
   ```ruby
   ALTER TABLE [db_name.]table_name COMPACT 'MINOR/MAJOR'
   ```
 
+### Parameter Description
+
+| Parameter | Description | Optional |
+| ------------- | -----| ----------- |
+| db_name | Database name, if it is not specified then it uses current database. | YES |
+| table_name | The name of the table in provided database.| NO |
+ 
+
 **Example:**
 
   ```ruby
@@ -168,18 +145,4 @@ NA
   ALTER TABLE carbontable COMPACT MAJOR
   ```
 
-### Parameter Description
-
-| Parameter | Description |
-| ------------- | -----|
-| db_name | Database name, if it is not specified then it uses current database. |
-| table_name | The name of the table in provided database.|
- 
-
-### Usage Guideline
-NA
-
-### Scenarios
-NA
-
 ***

http://git-wip-us.apache.org/repos/asf/incubator-carbondata/blob/d13f4700/docs/DML-Operations-on-Carbon.md
----------------------------------------------------------------------
diff --git a/docs/DML-Operations-on-Carbon.md b/docs/DML-Operations-on-Carbon.md
index 6ce1bea..e5ede7d 100644
--- a/docs/DML-Operations-on-Carbon.md
+++ b/docs/DML-Operations-on-Carbon.md
@@ -6,8 +6,7 @@
 ***
 
 # LOAD DATA
-### Function
- This command loads the user data in raw format to the Carbon specific data format store, this way Carbon provides good performance while querying the data.
+ This command loads the user data in raw format into the Carbon specific data format store; this way Carbon provides good performance while querying the data. Please visit [Data Management](Data-Management.md) for more details on LOAD.
 
 ### Syntax
 
@@ -16,29 +15,14 @@
               OPTIONS(property_name=property_value, ...)
   ```
 
-**Example:**
-
-  ```ruby
-  LOAD DATA local inpath '/opt/rawdata/data.csv' INTO table carbontable
-                         options('DELIMITER'=',', 'QUOTECHAR'='"',
-                                 'FILEHEADER'='empno,empname,
-                                  designation,doj,workgroupcategory,
-                                  workgroupcategoryname,deptno,deptname,projectcode,
-                                  projectjoindate,projectenddate,attendance,utilization,salary',
-                                 'MULTILINE'='true', 'ESCAPECHAR'='\', 
-                                 'COMPLEX_DELIMITER_LEVEL_1'='$', 
-                                 'COMPLEX_DELIMITER_LEVEL_2'=':',
-                                 'LOCAL_DICTIONARY_PATH'='/opt/localdictionary/',
-                                 'DICTIONARY_FILE_EXTENSION'='.dictionary') 
-  ```
-
 ### Parameter Description
 
-| Parameter | Description |
-| ------------- | -----|
-| folder_path | Path of raw csv data folder or file. |
-| db_name | Database name, if it is not specified then it uses current database. |
-| table_name | The name of the table in provided database.|
+| Parameter | Description | Optional |
+| ------------- | -----| -------- |
+| folder_path | Path of raw csv data folder or file. | NO |
+| db_name | Database name, if it is not specified then it uses current database. | YES |
+| table_name | The name of the table in provided database.| NO |
+| OPTIONS | Extra options provided to Load | YES |
  
 
 ### Usage Guideline
@@ -89,75 +73,63 @@ Following are the options that can be used in load data:
     OPTIONS('DICTIONARY_FILE_EXTENSION'='.dictionary') 
     ```
 
-### Scenarios
-
-#### Load from CSV files
-
-To load carbon table from CSV file, use the following syntax.
-
-  ```ruby
-  LOAD DATA [LOCAL] INPATH 'folder path' INTO TABLE tablename OPTIONS(property_name=property_value, ...)
-  ```
+**Example:**
 
- **Example:**
-  
   ```ruby
-  LOAD DATA local inpath './src/test/resources/data.csv' INTO table carbontable 
-                      options('DELIMITER'=',', 'QUOTECHAR'='"', 
-                              'FILEHEADER'='empno,empname,designation,doj,
-                               workgroupcategory,workgroupcategoryname,
-                               deptno,deptname,projectcode,projectjoindate,
-                               projectenddate,attendance,utilization,salary', 
-                              'MULTILINE'='true', 'ESCAPECHAR'='\', 
-                              'COMPLEX_DELIMITER_LEVEL_1'='$', 'COMPLEX_DELIMITER_LEVEL_2'=':', 
-                              'LOCAL_DICTIONARY_PATH'='/opt/localdictionary/','DICTIONARY_FILE_EXTENSION'='.dictionary')
+  LOAD DATA local inpath '/opt/rawdata/data.csv' INTO table carbontable
+                         options('DELIMITER'=',', 'QUOTECHAR'='"',
+                                 'FILEHEADER'='empno,empname,
+                                  designation,doj,workgroupcategory,
+                                  workgroupcategoryname,deptno,deptname,projectcode,
+                                  projectjoindate,projectenddate,attendance,utilization,salary',
+                                 'MULTILINE'='true', 'ESCAPECHAR'='\', 
+                                 'COMPLEX_DELIMITER_LEVEL_1'='$', 
+                                 'COMPLEX_DELIMITER_LEVEL_2'=':',
+                                 'LOCAL_DICTIONARY_PATH'='/opt/localdictionary/',
+                                 'DICTIONARY_FILE_EXTENSION'='.dictionary') 
   ```
 
 ***
 
 # SHOW SEGMENTS
-### Function
 This command is to show the segments of carbon table to the user.
 
-### Syntax
-
   ```ruby
   SHOW SEGMENTS FOR TABLE [db_name.]table_name LIMIT number_of_segments;
   ```
 
+### Parameter Description
+
+| Parameter | Description | Optional |
+| ------------- | -----| --------- |
+| db_name | Database name, if it is not specified then it uses current database. | YES |
+| table_name | The name of the table in provided database.| NO |
+| number_of_segments | Limit the output to this number. | YES |
+
 **Example:**
 
   ```ruby
   SHOW SEGMENTS FOR TABLE CarbonDatabase.CarbonTable LIMIT 2;
   ```
 
-### Parameter Description
-
-| Parameter | Description |
-| ------------- | -----|
-| db_name | Database name, if it is not specified then it uses current database. |
-| table_name | The name of the table in provided database.|
-| number_of_loads | limit the output to this number. |
-
-### Usage Guideline
-NA
-
-### Scenarios
-NA
-
 ***
 
 # DELETE SEGMENT BY ID
-### Function
 
 This command is to delete segment by using the segment ID.
 
-### Syntax
-
   ```ruby
   DELETE SEGMENT segment_id1,segment_id2 FROM TABLE [db_name.]table_name;
   ```
 
+### Parameter Description
+
+| Parameter | Description | Optional |
+| ------------- | -----| --------- |
+| segment_id | Segment Id of the load. | NO |
+| db_name | Database name, if it is not specified then it uses current database. | YES |
+| table_name | The name of the table in provided database.| NO |
+
 **Example:**
 
   ```ruby
@@ -166,51 +138,27 @@ This command is to delete segment by using the segment ID.
   Note: Here 0.1 is compacted segment sequence id.  
   ```
 
-### Parameter Description
-
-| Parameter | Description |
-| ------------- | -----|
-| segment_id | Segment Id of the load. |
-| db_name | Database name, if it is not specified then it uses current database. |
-| table_name | The name of the table in provided database.|
-
-### Usage Guideline
-NA
-
-### Scenarios
-NA
-
 ***
 
 # DELETE SEGMENT BY DATE
-### Function
-
 This command deletes the Carbon segment(s) from the store based on the date provided by the user in the DML command. The segments created before the particular date will be removed from the specified stores.
 
-### Syntax
-
   ```ruby
   DELETE SEGMENTS FROM TABLE [db_name.]table_name WHERE STARTTIME BEFORE [DATE_VALUE];
   ```
 
+### Parameter Description
+
+| Parameter | Description | Optional |
+| ------------- | -----| ------ |
+| DATE_VALUE | Valid segment load start time value. All the segments before this specified date will be deleted. | NO |
+| db_name | Database name, if it is not specified then it uses current database. | YES |
+| table_name | The name of the table in provided database.| NO |
+
 **Example:**
 
   ```ruby
   DELETE SEGMENTS FROM TABLE CarbonDatabase.CarbonTable WHERE STARTTIME BEFORE '2017-06-01 12:05:06';  
   ```
 
-### Parameter Description
-
-| Parameter | Description |
-| ------------- | -----|
-| DATE_VALUE | Valid segement load start time value. All the segments before this specified date will be deleted. |
-| db_name | Database name, if it is not specified then it uses current database. |
-| table_name | The name of the table in provided database.|
-
-### Usage Guideline
-NA
-
-### Scenarios
-NA
-
 ***
\ No newline at end of file

http://git-wip-us.apache.org/repos/asf/incubator-carbondata/blob/d13f4700/docs/Data-Management.md
----------------------------------------------------------------------
diff --git a/docs/Data-Management.md b/docs/Data-Management.md
new file mode 100644
index 0000000..b63f692
--- /dev/null
+++ b/docs/Data-Management.md
@@ -0,0 +1,141 @@
+
+* [Load Data](#load-data)
+* [Deleting Data](#deleting-data)
+* [Compacting Data](#compacting-data)
+
+
+***
+
+
+# Load Data
+### Scenario
+Once the table is created, data can be loaded into the table using the LOAD DATA command and will be available for query. When a data load is triggered, the data is encoded in Carbon format and copied into the HDFS Carbon store path (mentioned in the carbon.properties file) in a compressed, multi dimensional columnar format for quick analysis queries.
+The same command can be used for loading new data or updating existing data.
+Only one data load can be triggered for a table at a time. High cardinality columns are automatically recognized and will not use dictionary encoding.
+
+### Procedure
+
+Data loading is a process that involves execution of various steps to read, sort, and encode the data in the Carbon store format. Each step is executed in different threads.
+After the data loading process is complete, the status (success/partial success) is updated in the Carbon store metadata. The possible data load statuses are:
+
+1. Success: All the data is loaded into the table and no bad records are found.
+2. Partial Success: Data is loaded into the table but bad records are found. Bad records are stored at carbon.badrecords.location.
+
+In case of failure, the error will be logged in the error log.
+Details of loads can be seen with the SHOW SEGMENTS command, which displays:
+* Sequence Id
+* Status of data load
+* Load Start time
+* Load End time
+To invoke a data load (for example, a historical data load), run the following command:
+```ruby
+LOAD DATA [LOCAL] INPATH 'folder_path' [OVERWRITE] INTO TABLE [db_name.]table_name
+OPTIONS(property_name=property_value, ...)
+```
+OPTIONS are not mandatory for the data loading process. Inside OPTIONS the user can provide any of the options like DELIMITER, QUOTECHAR, ESCAPECHAR, MULTILINE as per need.
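+
+For example, a CSV load with a few of these options (the table name and path below are just placeholders):
+```ruby
+LOAD DATA LOCAL INPATH '/opt/rawdata/data.csv' INTO TABLE carbontable
+OPTIONS('DELIMITER'=',', 'QUOTECHAR'='"', 'MULTILINE'='true')
+```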
+
+Note: The path shall be canonical path.
+
+***
+
+# Deleting Data
+### Scenario
+If you have loaded wrong data into the table, or there are too many bad records and you want to modify and reload the data, you can delete the required loads. A load can be deleted using the load ID, or, if the table contains a date field, the data can be deleted using that date field.
+
+### Delete by Segment ID
+
+Each segment has a unique segment ID associated with it. Using this segment ID, you can remove the segment.
+Run the following command to get the segmentID.
+```ruby
+SHOW SEGMENTS FOR Table dbname.tablename LIMIT number_of_segments
+```
+Example:
+```ruby
+SHOW SEGMENTS FOR TABLE carbonTable
+```
+The above command will show all the segments of the table carbonTable.
+```ruby
+SHOW SEGMENTS FOR TABLE carbonTable LIMIT 3
+```
+The above DDL will show only limited number of segments specified by number_of_segments.
+
+output: 
+
+| SegmentSequenceId | Status | Load Start Time | Load End Time | 
+|--------------|-----------------|--------------------|--------------------| 
+| 2| Success | 2015-11-19 20:25:... | 2015-11-19 20:49:... | 
+| 1| Marked for Delete | 2015-11-19 19:54:... | 2015-11-19 20:08:... | 
+| 0| Marked for Update | 2015-11-19 19:14:... | 2015-11-19 19:14:... | 
+ 
+The show segment command output consists of SegmentSequenceID, START_TIME OF LOAD, END_TIME OF LOAD, and LOAD STATUS. The latest load will be displayed first in the output.
+After you get the segment ID of the segment that you want to delete, execute the following command to delete the selected segment.
+Command:
+```ruby
+DELETE SEGMENT segment_sequence_id1, segment_sequence_id2, .... FROM TABLE tableName
+```
+Example:
+```ruby
+DELETE SEGMENT 1,2,3 FROM TABLE carbonTable
+```
+
+### Delete by Date Field
+
+If the table contains a date field, you can delete the data based on a specific date.
+Command:
+```ruby
+DELETE FROM TABLE [schema_name.]table_name WHERE [DATE_FIELD] BEFORE [DATE_VALUE]
+```
+Example:
+```ruby
+DELETE FROM TABLE table_name WHERE productionDate BEFORE '2017-07-01'
+```
+Here productionDate is a column of type timestamp.
+The above DDL will delete all the data before the date '2017-07-01'.
+
+
+Note: 
+* When the delete segment DML is called, the segment will not be deleted physically from the file system. Instead the segment status will be marked as "Marked for Delete". During query execution, this deleted segment will be excluded.
+* The deleted segment will be deleted physically during the next load operation, and only after the maximum query execution time configured using "max.query.execution.time". By default it is 60 minutes.
+* If the user wants to force delete the segment physically, the CLEAN FILES DML can be used.
+Example:
+```ruby
+CLEAN FILES FOR TABLE table1
+```
+This DML will immediately physically delete the segments which are "Marked for Delete".
+
+
+
+***
+
+# Compacting Data
+### Scenario
+Frequent data ingestion results in several fragmented carbon files in the store directory. Since data is sorted only within each load, the indices perform only within each load. This means that there will be one index for each load, and as the number of data loads increases, the number of indices also increases. As each index works only on one load, the performance of the indices is reduced. Carbon provides a compaction facility for the loads. The compaction process combines several segments into one large segment by merge sorting the data from across the segments.
+
+### Prerequisite
+
+ The data should be loaded multiple times.
+
+### Procedure
+
+There are two types of compaction: Minor and Major compaction.
+
+#### Minor Compaction:
+In minor compaction the user can specify how many loads are to be merged. Minor compaction is triggered for every data load if the parameter carbon.enable.auto.load.merge is set. If any segments are available to be merged, then compaction will run in parallel with the data load. 
+
+There are 2 levels in minor compaction.
+* Level 1: Merging of the segments which are not yet compacted.
+* Level 2: Merging of the compacted segments again to form a bigger segment.
+    
+#### Major Compaction:
+In Major compaction, many segments can be merged into one big segment. The user specifies the compaction size up to which segments can be merged. Major compaction is usually done during off-peak time.
+
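+Both kinds of compaction are invoked with the ALTER TABLE ... COMPACT command described in the DDL documentation, for example:
+```ruby
+ALTER TABLE carbontable COMPACT 'MINOR'
+ALTER TABLE carbontable COMPACT 'MAJOR'
+```
+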
+### Parameters of Compaction
+| Parameter | Default | Applicable | Description | 
+| --------- | --------| -----------|-------------|
+| carbon.compaction.level.threshold | 4,3 | Minor | This property is for minor compaction and decides how many segments are to be merged. **Example**: if it is set to 2,3 then minor compaction will be triggered for every 2 segments; 3 is the number of level 1 compacted segments which are further compacted into a new segment. Valid values are from 0-100. |
+| carbon.major.compaction.size | 1024 MB | Major | Major compaction size can be configured using this parameter. Segments whose total size is below this threshold will be merged. |
+| carbon.numberof.preserve.segments | 0 | Minor/Major | If the user wants to preserve some number of segments from being compacted, this property can be set. **Example**: carbon.numberof.preserve.segments=2 means the 2 latest segments will always be excluded from compaction. No segments are preserved by default. |
+| carbon.allowed.compaction.days | 0 | Minor/Major | Compaction will merge only the segments which were loaded within the configured number of days. **Example**: if the configuration is 2, then only segments loaded within a time frame of 2 days will be merged. Segments loaded 2 days apart will not be merged. This is disabled by default. |
+| carbon.number.of.cores.while.compacting | 2 | Minor/Major | Number of cores used to write data during compaction. |
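+
+These parameters are typically set in the carbon.properties file mentioned above. The snippet below is only an illustration, echoing the defaults and examples from the table; the value formats (e.g. carbon.major.compaction.size in MB) are assumptions:
+```
+# illustrative carbon.properties entries for compaction
+carbon.enable.auto.load.merge=true
+carbon.compaction.level.threshold=4,3
+carbon.major.compaction.size=1024
+carbon.numberof.preserve.segments=2
+carbon.allowed.compaction.days=2
+carbon.number.of.cores.while.compacting=2
+```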

http://git-wip-us.apache.org/repos/asf/incubator-carbondata/blob/d13f4700/docs/Installing-CarbonData-And-IDE-Configuartion.md
----------------------------------------------------------------------
diff --git a/docs/Installing-CarbonData-And-IDE-Configuartion.md b/docs/Installing-CarbonData-And-IDE-Configuartion.md
index 7ada9cc..5015a48 100644
--- a/docs/Installing-CarbonData-And-IDE-Configuartion.md
+++ b/docs/Installing-CarbonData-And-IDE-Configuartion.md
@@ -1,34 +1,34 @@
 ### Building CarbonData
 Prerequisites for building CarbonData:
 * Unix-like environment (Linux, Mac OS X)
-* git
-* Apache Maven (we recommend version 3.3 or later)
-* Java 7 or 8
+* [Git](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git)
+* [Apache Maven (we recommend version 3.3 or later)](https://maven.apache.org/download.cgi)
+* [Java 7 or 8](http://www.oracle.com/technetwork/java/javase/downloads/index.html)
 * Scala 2.10
-* Apache Thrift 0.9.3
+* [Apache Thrift 0.9.3](https://thrift.apache.org/download)
 
 I. Clone CarbonData
 ```
 $ git clone https://github.com/apache/incubator-carbondata.git
 ```
 II. Build the project 
-* Build without test:
+* Build without tests. By default carbon builds the project with Spark 1.5.2
 ```
 $ mvn -DskipTests clean package 
 ```
-* Build along with test:
+* Build with different Spark versions
 ```
-$ mvn clean package
-```
-* Build with different spark versions (Default it takes Spark 1.5.2 version)
-```
-$ mvn -Pspark-1.5.2 clean package
-            or
-$ mvn -Pspark-1.6.1 clean install
+$ mvn -DskipTests -Pspark-1.5 -Dspark.version=1.5.0 clean package
+$ mvn -DskipTests -Pspark-1.5 -Dspark.version=1.5.1 clean package
+$ mvn -DskipTests -Pspark-1.5 -Dspark.version=1.5.2 clean package
+ 
+$ mvn -DskipTests -Pspark-1.6 -Dspark.version=1.6.0 clean package
+$ mvn -DskipTests -Pspark-1.6 -Dspark.version=1.6.1 clean package
+$ mvn -DskipTests -Pspark-1.6 -Dspark.version=1.6.2 clean package
 ```
-* Build along with integration test cases: (Note : It takes more time to build)
+* Build with tests
 ```
-$ mvn -Pintegration-test clean package
+$ mvn clean package
 ```
 
 ### Developing CarbonData
@@ -48,19 +48,3 @@ You can also make those setting to be the default by setting to the "Defaults ->
 #### Eclipse
 * Download the Scala IDE (preferred) or install the scala plugin to Eclipse.
 * Import the CarbonData Maven projects ("File" -> "Import" -> "Maven" -> "Existing Maven Projects" -> locate the CarbonData source directory).
-
-### Getting Started
-Read the [quick start](https://github.com/HuaweiBigData/carbondata/wiki/Quick-Start).
-
-### Fork and Contribute
-This is an open source project for everyone, and we are always open to people who want to use this system or contribute to it. 
-This guide document introduce [how to contribute to CarbonData](https://github.com/HuaweiBigData/carbondata/wiki/How-to-contribute-and-Code-Style).
-
-### Contact us
-To get involved in CarbonData:
-
-* [Subscribe](mailto:dev-subscribe@carbondata.incubator.apache.org) then [mail](mailto:dev@carbondata.incubator.apache.org) to us
-* Report issues on [Jira](https://issues.apache.org/jira/browse/CARBONDATA).
-
-### About
-CarbonData project original contributed from the [Huawei](http://www.huawei.com)

http://git-wip-us.apache.org/repos/asf/incubator-carbondata/blob/d13f4700/docs/Quick-Start.md
----------------------------------------------------------------------
diff --git a/docs/Quick-Start.md b/docs/Quick-Start.md
index dbc985d..c50e17f 100644
--- a/docs/Quick-Start.md
+++ b/docs/Quick-Start.md
@@ -21,84 +21,65 @@
 
 This tutorial provides a quick introduction to using CarbonData.
 
-## Examples
 
-Firstly suggest you go through
-all [examples](https://github.com/apache/incubator-carbondata/tree/master/examples), to understand
-how to create table, how to load data, how to make query.
+## Install
 
-## Interactive Query with the Spark Shell
+* Download released package of [Spark 1.5.0 or later](http://spark.apache.org/downloads.html)
+* Download and install Apache Thrift 0.9.3, and make sure thrift is added to the system path.
+* Download [Apache CarbonData code](https://github.com/apache/incubator-carbondata) and build it as shown below. Please visit [Building CarbonData And IDE Configuration](Installing-CarbonData-And-IDE-Configuartion.md) for more information.
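+
+A minimal clone-and-build sequence (commands as in the build documentation, cloning into a directory named carbondata to match the steps below):
+```
+$ git clone https://github.com/apache/incubator-carbondata.git carbondata
+$ cd carbondata
+$ mvn -DskipTests clean package
+```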
 
-### 1.Install
+## Interactive Data Query
 
-* Download a packaged release of  [Spark 1.5.0 or later](http://spark.apache.org/downloads.html)
-* Configure the Hive Metastore using Mysql (you can use this key words to search:mysql hive metastore)
-and move mysql-connector-java jar to ${SPARK_HOME}/lib
-* Download [thrift](https://thrift.apache.org/), rename to thrift and add to path.
-* Download [Apache CarbonData code](https://github.com/apache/incubator-carbondata) and build it
+### Prerequisite
+Create a sample.csv file in the carbondata directory:
 ```
-$ git clone https://github.com/apache/incubator-carbondata.git carbondata
 $ cd carbondata
-$ mvn clean install -DskipTests
-$ cp assembly/target/scala-2.10/carbondata_*.jar ${SPARK_HOME}/lib
-$ mkdir ${SPARK_HOME}/carbondata
-$ cp -r processing/carbonplugins ${SPARK_HOME}/carbondata
+$ cat > sample.csv << EOF
+  id,name,city,age
+  1,david,shenzhen,31
+  2,eason,shenzhen,27
+  3,jarry,wuhan,35
+  EOF
 ```
 
-### 2 Interactive Data Query
-
-* Run spark shell
+### Carbon Spark Shell
+Carbon Spark shell is a wrapper around the Apache Spark shell; it provides a simple way to learn the API, as well as a powerful tool to analyze data interactively. Please visit the Apache Spark documentation for more details on the Spark shell.
+Start the Spark shell by running the following in the Carbon directory:
 ```
-$ cd ${SPARK_HOME}
-$ carbondata_jar=./lib/$(ls -1 lib |grep "^carbondata_.*\.jar$")
-$ mysql_jar=./lib/$(ls -1 lib |grep "^mysql.*\.jar$")
-$ ./bin/spark-shell --master local --jars ${carbondata_jar},${mysql_jar}
+./bin/carbon-spark-shell
 ```
+*Note*: In this shell SparkContext is readily available as sc and CarbonContext is available as cc.
+
+**Create table**
 
-* Create CarbonContext instance
 ```
-import org.apache.spark.sql.CarbonContext
-import java.io.File
-import org.apache.hadoop.hive.conf.HiveConf
-val storePath = "hdfs://hacluster/Opt/CarbonStore"
-val cc = new CarbonContext(sc, storePath)
-cc.setConf("carbon.kettle.home","./carbondata/carbonplugins")
-val metadata = new File("").getCanonicalPath + "/carbondata/metadata"
-cc.setConf("hive.metastore.warehouse.dir", metadata)
-cc.setConf(HiveConf.ConfVars.HIVECHECKFILEFORMAT.varname, "false")
+scala>cc.sql("create table if not exists test_table (id string, name string, city string, age Int) STORED BY 'carbondata'")
 ```
-*Note*: `storePath` can be a hdfs path or a local path , the path is used to store table data.
-
-* Create table
 
+**Load data to table**
 ```
-cc.sql("create table if not exists table1 (id string, name string, city string, age Int) STORED BY 'org.apache.carbondata.format'")
+scala>import java.io.File
+scala>val dataFilePath = new File("../carbondata/sample.csv").getCanonicalPath
+scala>cc.sql(s"load data inpath '$dataFilePath' into table test_table")
 ```
 
-* Create sample.csv file in ${SPARK_HOME}/carbondata directory
+**Query data from table**
 
 ```
-cd ${SPARK_HOME}/carbondata
-cat > sample.csv << EOF
-id,name,city,age
-1,david,shenzhen,31
-2,eason,shenzhen,27
-3,jarry,wuhan,35
-EOF
+scala>cc.sql("select * from test_table").show
+scala>cc.sql("select city, avg(age), sum(age) from test_table group by city").show
 ```
 
-* Load data to table1 in spark shell
+### Carbon SQL CLI
+The Carbon Spark SQL CLI is a wrapper around the Apache Spark SQL CLI. It is a convenient tool to execute queries from the command line. Please visit the Apache Spark documentation for more information on the Spark SQL CLI.
+To start the Carbon Spark SQL CLI, run the following in the Carbon directory:
 
 ```
-val dataFilePath = new File("").getCanonicalPath + "/carbondata/sample.csv"
-cc.sql(s"load data inpath '$dataFilePath' into table table1")
+./bin/carbon-spark-sql
 ```
 
-Note: Carbondata also support `LOAD DATA LOCAL INPATH 'folder_path' INTO TABLE [db_name.]table_name OPTIONS(property_name=property_value, ...)` syntax, but right now there is no significant meaning to local in carbondata.We just keep it to align with hive syntax. `dataFilePath` can be hdfs path as well like `val dataFilePath = hdfs://hacluster//carbondata/sample.csv`  
-
-* Query data from table1
-
+**Execute Queries in CLI**
 ```
-cc.sql("select * from table1").show
-cc.sql("select city, avg(age), sum(age) from table1 group by city").show
+spark-sql> create table if not exists test_table (id string, name string, city string, age Int) STORED BY 'carbondata'
+spark-sql> load data inpath '../sample.csv' into table test_table
+spark-sql> select city, avg(age), sum(age) from test_table group by city
 ```


[2/2] incubator-carbondata git commit: [CARBONDATA-78] Updated ReadMe with cwiki links and updated docs with latest changes This closes #43

Posted by ch...@apache.org.
[CARBONDATA-78] Updated ReadMe with cwiki links and updated docs with latest changes This closes #43


Project: http://git-wip-us.apache.org/repos/asf/incubator-carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-carbondata/commit/56a1e402
Tree: http://git-wip-us.apache.org/repos/asf/incubator-carbondata/tree/56a1e402
Diff: http://git-wip-us.apache.org/repos/asf/incubator-carbondata/diff/56a1e402

Branch: refs/heads/master
Commit: 56a1e402f682ed43e1f2a31bbd13ff8c5b92ac7b
Parents: 1730082 d13f470
Author: chenliang613 <ch...@apache.org>
Authored: Wed Jul 20 22:15:06 2016 +0800
Committer: chenliang613 <ch...@apache.org>
Committed: Wed Jul 20 22:15:06 2016 +0800

----------------------------------------------------------------------
 README.md                                       |  19 +--
 docs/Carbon-Interfaces.md                       |  72 ----------
 docs/Carbon-Packaging-and-Interfaces.md         |  72 ++++++++++
 docs/Carbondata-Management.md                   | 144 -------------------
 docs/DDL-Operations-on-Carbon.md                | 131 ++++++-----------
 docs/DML-Operations-on-Carbon.md                | 138 ++++++------------
 docs/Data-Management.md                         | 141 ++++++++++++++++++
 ...stalling-CarbonData-And-IDE-Configuartion.md |  46 ++----
 docs/Quick-Start.md                             |  89 +++++-------
 9 files changed, 364 insertions(+), 488 deletions(-)
----------------------------------------------------------------------