Posted to commits@carbondata.apache.org by xu...@apache.org on 2018/12/27 10:20:10 UTC

carbondata git commit: [CARBONDATA-3176] Optimize quick-start-guide documentation

Repository: carbondata
Updated Branches:
  refs/heads/master 04b52568d -> ca32374a4


[CARBONDATA-3176] Optimize quick-start-guide documentation

Optimize multi-line SQL statements and fix some markdown style issues in the docs.
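
For reference, the multi-line SQL blocks are normalized to the stripMargin style already used by other examples in the repo; a minimal sketch of the target form, taken from the quick-start test_table example:

    // target style: one statement per s"""...""".stripMargin block
    carbon.sql(
      s"""
         | CREATE TABLE IF NOT EXISTS test_table(
         |   id string,
         |   name string,
         |   city string,
         |   age Int)
         | STORED AS carbondata
      """.stripMargin)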

This closes #2992


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/ca32374a
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/ca32374a
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/ca32374a

Branch: refs/heads/master
Commit: ca32374a4ea6dd0bdf2088eb35c8af6562037e31
Parents: 04b5256
Author: lamber-ken <22...@qq.com>
Authored: Mon Dec 17 01:17:55 2018 +0800
Committer: xubo245 <xu...@huawei.com>
Committed: Thu Dec 27 18:18:48 2018 +0800

----------------------------------------------------------------------
 docs/datamap/datamap-management.md       |  14 +-
 docs/datamap/timeseries-datamap-guide.md |  43 +++--
 docs/ddl-of-carbondata.md                | 164 ++++++++++---------
 docs/faq.md                              |  33 ++--
 docs/hive-guide.md                       |   4 +-
 docs/quick-start-guide.md                | 222 ++++++++++++++------------
 docs/s3-guide.md                         |   9 +-
 docs/streaming-guide.md                  |  14 +-
 8 files changed, 264 insertions(+), 239 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/carbondata/blob/ca32374a/docs/datamap/datamap-management.md
----------------------------------------------------------------------
diff --git a/docs/datamap/datamap-management.md b/docs/datamap/datamap-management.md
index ad8718a..0dc4718 100644
--- a/docs/datamap/datamap-management.md
+++ b/docs/datamap/datamap-management.md
@@ -34,13 +34,13 @@
 DataMap can be created using following DDL
 
 ```
-  CREATE DATAMAP [IF NOT EXISTS] datamap_name
-  [ON TABLE main_table]
-  USING "datamap_provider"
-  [WITH DEFERRED REBUILD]
-  DMPROPERTIES ('key'='value', ...)
-  AS
-    SELECT statement
+CREATE DATAMAP [IF NOT EXISTS] datamap_name
+[ON TABLE main_table]
+USING "datamap_provider"
+[WITH DEFERRED REBUILD]
+DMPROPERTIES ('key'='value', ...)
+AS
+  SELECT statement
 ```
 
 Currently, there are 5 DataMap implementations in CarbonData.

http://git-wip-us.apache.org/repos/asf/carbondata/blob/ca32374a/docs/datamap/timeseries-datamap-guide.md
----------------------------------------------------------------------
diff --git a/docs/datamap/timeseries-datamap-guide.md b/docs/datamap/timeseries-datamap-guide.md
index 3f849c4..d3357fa 100644
--- a/docs/datamap/timeseries-datamap-guide.md
+++ b/docs/datamap/timeseries-datamap-guide.md
@@ -97,8 +97,7 @@ For querying timeseries data, Carbondata has builtin support for following time
 timeseries(timeseries column name, 'aggregation level')
 ```
 ```
-SELECT timeseries(order_time, 'hour'), sum(quantity) FROM sales GROUP BY timeseries(order_time,
-'hour')
+SELECT timeseries(order_time, 'hour'), sum(quantity) FROM sales GROUP BY timeseries(order_time,'hour')
 ```
   
 It is **not necessary** to create pre-aggregate tables for each granularity unless required for 
@@ -108,25 +107,25 @@ For Example: For main table **sales** , if following timeseries datamaps were cr
 level and hour level pre-aggregate
   
 ```
-  CREATE DATAMAP agg_day
-  ON TABLE sales
-  USING "timeseries"
-  DMPROPERTIES (
-    'event_time'='order_time',
-    'day_granularity'='1',
-  ) AS
-  SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price),
-   avg(price) FROM sales GROUP BY order_time, country, sex
-        
-  CREATE DATAMAP agg_sales_hour
-  ON TABLE sales
-  USING "timeseries"
-  DMPROPERTIES (
-    'event_time'='order_time',
-    'hour_granularity'='1',
-  ) AS
-  SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price),
-   avg(price) FROM sales GROUP BY order_time, country, sex
+CREATE DATAMAP agg_day
+ON TABLE sales
+USING "timeseries"
+DMPROPERTIES (
+  'event_time'='order_time',
+  'day_granularity'='1',
+) AS
+SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price),
+ avg(price) FROM sales GROUP BY order_time, country, sex
+      
+CREATE DATAMAP agg_sales_hour
+ON TABLE sales
+USING "timeseries"
+DMPROPERTIES (
+  'event_time'='order_time',
+  'hour_granularity'='1',
+) AS
+SELECT order_time, country, sex, sum(quantity), max(quantity), count(user_id), sum(price),
+ avg(price) FROM sales GROUP BY order_time, country, sex
 ```
 
 Queries like below will not be rolled-up and hit the main table
@@ -138,7 +137,7 @@ Select timeseries(order_time, 'year'), sum(quantity) from sales group by timeser
   'year')
 ```
 
-NOTE (<b>RESTRICTION</b>):
+NOTE (**RESTRICTION**):
 * Only value of 1 is supported for hierarchy levels. Other hierarchy levels will be supported in
 the future CarbonData release. 
 * timeseries datamap for the desired levels needs to be created one after the other

http://git-wip-us.apache.org/repos/asf/carbondata/blob/ca32374a/docs/ddl-of-carbondata.md
----------------------------------------------------------------------
diff --git a/docs/ddl-of-carbondata.md b/docs/ddl-of-carbondata.md
index 90153b7..3d3db1e 100644
--- a/docs/ddl-of-carbondata.md
+++ b/docs/ddl-of-carbondata.md
@@ -155,22 +155,22 @@ CarbonData DDL statements are documented here,which includes:
      * GLOBAL_SORT: It increases the query performance, especially high concurrent point query.
        And if you care about loading resources isolation strictly, because the system uses the spark GroupBy to sort data, the resource can be controlled by spark. 
 
-    ### Example:
-
-    ```
-    CREATE TABLE IF NOT EXISTS productSchema.productSalesTable (
-      productNumber INT,
-      productName STRING,
-      storeCity STRING,
-      storeProvince STRING,
-      productCategory STRING,
-      productBatch STRING,
-      saleQuantity INT,
-      revenue INT)
-    STORED AS carbondata
-    TBLPROPERTIES ('SORT_COLUMNS'='productName,storeCity',
-                   'SORT_SCOPE'='NO_SORT')
-    ```
+ ### Example:
+
+   ```
+   CREATE TABLE IF NOT EXISTS productSchema.productSalesTable (
+     productNumber INT,
+     productName STRING,
+     storeCity STRING,
+     storeProvince STRING,
+     productCategory STRING,
+     productBatch STRING,
+     saleQuantity INT,
+     revenue INT)
+   STORED AS carbondata
+   TBLPROPERTIES ('SORT_COLUMNS'='productName,storeCity',
+                  'SORT_SCOPE'='NO_SORT')
+   ```
 
    **NOTE:** CarbonData also supports "using carbondata". Find example code at [SparkSessionExample](https://github.com/apache/carbondata/blob/master/examples/spark2/src/main/scala/org/apache/carbondata/examples/SparkSessionExample.scala) in the CarbonData repo.
 
@@ -286,17 +286,13 @@ CarbonData DDL statements are documented here,which includes:
 ### Example:
 
    ```
-   CREATE TABLE carbontable(
-             
-               column1 string,
-             
-               column2 string,
-             
-               column3 LONG )
-             
-     STORED AS carbondata
-     TBLPROPERTIES('LOCAL_DICTIONARY_ENABLE'='true','LOCAL_DICTIONARY_THRESHOLD'='1000',
-     'LOCAL_DICTIONARY_INCLUDE'='column1','LOCAL_DICTIONARY_EXCLUDE'='column2')
+   CREATE TABLE carbontable(             
+     column1 string,             
+     column2 string,             
+     column3 LONG)
+   STORED AS carbondata
+   TBLPROPERTIES('LOCAL_DICTIONARY_ENABLE'='true','LOCAL_DICTIONARY_THRESHOLD'='1000',
+   'LOCAL_DICTIONARY_INCLUDE'='column1','LOCAL_DICTIONARY_EXCLUDE'='column2')
    ```
 
    **NOTE:** 
@@ -410,7 +406,7 @@ CarbonData DDL statements are documented here,which includes:
 
        Following table property enables this feature and default value is false.
        ```
-        'flat_folder'='true'
+       'flat_folder'='true'
        ```
 
        Example:
@@ -477,7 +473,7 @@ CarbonData DDL statements are documented here,which includes:
      be later viewed in table description for reference.
 
      ```
-       TBLPROPERTIES('BAD_RECORD_PATH'='/opt/badrecords')
+     TBLPROPERTIES('BAD_RECORD_PATH'='/opt/badrecords')
      ```
      
    - ##### Load minimum data size
@@ -489,7 +485,7 @@ CarbonData DDL statements are documented here,which includes:
      Notice that once you enable this feature, for load balance, carbondata will ignore the data locality while assigning input data to nodes, this will cause more network traffic.
 
      ```
-       TBLPROPERTIES('LOAD_MIN_SIZE_INMB'='256')
+     TBLPROPERTIES('LOAD_MIN_SIZE_INMB'='256')
      ```
 
 ## CREATE TABLE AS SELECT
@@ -504,26 +500,37 @@ CarbonData DDL statements are documented here,which includes:
 
 ### Examples
   ```
-  carbon.sql("CREATE TABLE source_table(
-                             id INT,
-                             name STRING,
-                             city STRING,
-                             age INT)
-              STORED AS parquet")
+  carbon.sql(
+             s"""
+                | CREATE TABLE source_table(
+                |   id INT,
+                |   name STRING,
+                |   city STRING,
+                |   age INT)
+                | STORED AS parquet
+             """.stripMargin)
+                
   carbon.sql("INSERT INTO source_table SELECT 1,'bob','shenzhen',27")
+  
   carbon.sql("INSERT INTO source_table SELECT 2,'david','shenzhen',31")
   
-  carbon.sql("CREATE TABLE target_table
-              STORED AS carbondata
-              AS SELECT city,avg(age) FROM source_table GROUP BY city")
+  carbon.sql(
+             s"""
+                | CREATE TABLE target_table
+                | STORED AS carbondata
+                | AS SELECT city, avg(age) 
+                |    FROM source_table 
+                |    GROUP BY city
+             """.stripMargin)
               
   carbon.sql("SELECT * FROM target_table").show
-    // results:
-    //    +--------+--------+
-    //    |    city|avg(age)|
-    //    +--------+--------+
-    //    |shenzhen|    29.0|
-    //    +--------+--------+
+  
+  // results:
+  //    +--------+--------+
+  //    |    city|avg(age)|
+  //    +--------+--------+
+  //    |shenzhen|    29.0|
+  //    +--------+--------+
 
   ```
 
@@ -545,11 +552,12 @@ CarbonData DDL statements are documented here,which includes:
   sql("INSERT INTO origin select 200,'hive'")
   // creates a table in $storeLocation/origin
   
-  sql(s"""
-  |CREATE EXTERNAL TABLE source
-  |STORED AS carbondata
-  |LOCATION '$storeLocation/origin'
-  """.stripMargin)
+  sql(
+      s"""
+         | CREATE EXTERNAL TABLE source
+         | STORED AS carbondata
+         | LOCATION '$storeLocation/origin'
+      """.stripMargin)
   checkAnswer(sql("SELECT count(*) from source"), sql("SELECT count(*) from origin"))
   ```
 
@@ -560,8 +568,10 @@ CarbonData DDL statements are documented here,which includes:
   **Example:**
   ```
   sql(
-  s"""CREATE EXTERNAL TABLE sdkOutputTable STORED AS carbondata LOCATION
-  |'$writerPath' """.stripMargin)
+      s"""
+         | CREATE EXTERNAL TABLE sdkOutputTable STORED AS carbondata LOCATION
+         |'$writerPath'
+      """.stripMargin)
   ```
 
   Here writer path will have carbondata and index files.
@@ -700,33 +710,33 @@ Users can specify which columns to include and exclude for local dictionary gene
      This command is used to merge all the CarbonData index files (.carbonindex) inside a segment to a single CarbonData index merge file (.carbonindexmerge). This enhances the first query performance.
 
      ```
-      ALTER TABLE [db_name.]table_name COMPACT 'SEGMENT_INDEX'
+     ALTER TABLE [db_name.]table_name COMPACT 'SEGMENT_INDEX'
      ```
 
-      Examples:
+     Examples:
 
      ```
-      ALTER TABLE test_db.carbon COMPACT 'SEGMENT_INDEX'
-      ```
+     ALTER TABLE test_db.carbon COMPACT 'SEGMENT_INDEX'
+     ```
 
-      **NOTE:**
+     **NOTE:**
 
-      * Merge index is not supported on streaming table.
+     * Merge index is not supported on streaming table.
 
 - ##### SET and UNSET for Local Dictionary Properties
 
    When set command is used, all the newly set properties will override the corresponding old properties if exists.
   
    Example to SET Local Dictionary Properties:
-    ```
+   ```
    ALTER TABLE tablename SET TBLPROPERTIES('LOCAL_DICTIONARY_ENABLE'='false','LOCAL_DICTIONARY_THRESHOLD'='1000','LOCAL_DICTIONARY_INCLUDE'='column1','LOCAL_DICTIONARY_EXCLUDE'='column2')
-    ```
+   ```
    When Local Dictionary properties are unset, corresponding default values will be used for these properties.
    
    Example to UNSET Local Dictionary Properties:
-    ```
+   ```
    ALTER TABLE tablename UNSET TBLPROPERTIES('LOCAL_DICTIONARY_ENABLE','LOCAL_DICTIONARY_THRESHOLD','LOCAL_DICTIONARY_INCLUDE','LOCAL_DICTIONARY_EXCLUDE')
-    ```
+   ```
    
    **NOTE:** For old tables, by default, local dictionary is disabled. If user wants local dictionary for these tables, user can enable/disable local dictionary for new data at their discretion. 
    This can be achieved by using the alter table set command.
@@ -779,8 +789,8 @@ Users can specify which columns to include and exclude for local dictionary gene
   CREATE TABLE IF NOT EXISTS productSchema.productSalesTable (
                                 productNumber Int COMMENT 'unique serial number for product')
   COMMENT "This is table comment"
-   STORED AS carbondata
-   TBLPROPERTIES ('DICTIONARY_INCLUDE'='productNumber')
+  STORED AS carbondata
+  TBLPROPERTIES ('DICTIONARY_INCLUDE'='productNumber')
   ```
 
   You can also SET and UNSET table comment using ALTER command.
@@ -818,7 +828,7 @@ Users can specify which columns to include and exclude for local dictionary gene
 
   Example:
   ```
-   CREATE TABLE IF NOT EXISTS productSchema.productSalesTable (
+  CREATE TABLE IF NOT EXISTS productSchema.productSalesTable (
                                 productNumber INT,
                                 productName STRING,
                                 storeCity STRING,
@@ -856,9 +866,9 @@ Users can specify which columns to include and exclude for local dictionary gene
   This command allows you to insert or load overwrite on a specific partition.
 
   ```
-   INSERT OVERWRITE TABLE table_name
-   PARTITION (column = 'partition_name')
-   select_statement
+  INSERT OVERWRITE TABLE table_name
+  PARTITION (column = 'partition_name')
+  select_statement
   ```
 
   Example:
@@ -925,10 +935,10 @@ Users can specify which columns to include and exclude for local dictionary gene
       col_C LONG,
       col_D DECIMAL(10,2),
       col_E LONG
-   ) partitioned by (col_F Timestamp)
-   PARTITIONED BY 'carbondata'
-   TBLPROPERTIES('PARTITION_TYPE'='RANGE',
-   'RANGE_INFO'='2015-01-01, 2016-01-01, 2017-01-01, 2017-02-01')
+  ) partitioned by (col_F Timestamp)
+  PARTITIONED BY 'carbondata'
+  TBLPROPERTIES('PARTITION_TYPE'='RANGE',
+  'RANGE_INFO'='2015-01-01, 2016-01-01, 2017-01-01, 2017-02-01')
   ```
 
 ### Create List Partition Table
@@ -953,9 +963,9 @@ Users can specify which columns to include and exclude for local dictionary gene
       col_E LONG,
       col_F TIMESTAMP
    ) PARTITIONED BY (col_A STRING)
-   STORED AS carbondata
-   TBLPROPERTIES('PARTITION_TYPE'='LIST',
-   'LIST_INFO'='aaaa, bbbb, (cccc, dddd), eeee')
+  STORED AS carbondata
+  TBLPROPERTIES('PARTITION_TYPE'='LIST',
+  'LIST_INFO'='aaaa, bbbb, (cccc, dddd), eeee')
   ```
 
 
@@ -981,9 +991,9 @@ Users can specify which columns to include and exclude for local dictionary gene
 
 ### Drop a partition
 
-   Only drop partition definition, but keep data
+  Only drop partition definition, but keep data
   ```
-    ALTER TABLE [db_name].table_name DROP PARTITION(partition_id)
+  ALTER TABLE [db_name].table_name DROP PARTITION(partition_id)
   ```
 
   Drop both partition definition and data

http://git-wip-us.apache.org/repos/asf/carbondata/blob/ca32374a/docs/faq.md
----------------------------------------------------------------------
diff --git a/docs/faq.md b/docs/faq.md
index dbcda4f..7317d1c 100644
--- a/docs/faq.md
+++ b/docs/faq.md
@@ -79,14 +79,12 @@ The store location specified while creating carbon session is used by the Carbon
 Try creating ``carbonsession`` with ``storepath`` specified in the following manner :
 
 ```
-val carbon = SparkSession.builder().config(sc.getConf)
-             .getOrCreateCarbonSession(<store_path>)
+val carbon = SparkSession.builder().config(sc.getConf).getOrCreateCarbonSession(<carbon_store_path>)
 ```
 Example:
 
 ```
-val carbon = SparkSession.builder().config(sc.getConf)
-             .getOrCreateCarbonSession("hdfs://localhost:9000/carbon/store")
+val carbon = SparkSession.builder().config(sc.getConf).getOrCreateCarbonSession("hdfs://localhost:9000/carbon/store")
 ```
 
 ## What is Carbon Lock Type?
@@ -292,10 +290,11 @@ java.io.FileNotFoundException: hdfs:/localhost:9000/carbon/store/default/hdfstab
 
   2. Use the following command :
 
-```
-"mvn -Pspark-2.1 -Dspark.version {yourSparkVersion} clean package"
-```
-Note :  Refrain from using "mvn clean package" without specifying the profile.
+  ```
+  mvn -Pspark-2.1 -Dspark.version {yourSparkVersion} clean package
+  ```
+  
+Note : Refrain from using "mvn clean package" without specifying the profile.
 
 ## Failed to execute load query on cluster
 
@@ -416,9 +415,9 @@ Note :  Refrain from using "mvn clean package" without specifying the profile.
 
   Insertion fails with the following exception :
 
-   ```
-   Data Load failure exception
-   ```
+  ```
+  Data Load failure exception
+  ```
 
   **Possible Cause**
 
@@ -445,9 +444,9 @@ Note :  Refrain from using "mvn clean package" without specifying the profile.
 
   Execution fails with the following exception :
 
-   ```
-   Table is locked for updation.
-   ```
+  ```
+  Table is locked for updation.
+  ```
 
   **Possible Cause**
 
@@ -463,9 +462,9 @@ Note :  Refrain from using "mvn clean package" without specifying the profile.
 
   Execution fails with the following exception :
 
-   ```
-   Table creation fails.
-   ```
+  ```
+  Table creation fails.
+  ```
 
   **Possible Cause**
 

http://git-wip-us.apache.org/repos/asf/carbondata/blob/ca32374a/docs/hive-guide.md
----------------------------------------------------------------------
diff --git a/docs/hive-guide.md b/docs/hive-guide.md
index c38a539..e675057 100644
--- a/docs/hive-guide.md
+++ b/docs/hive-guide.md
@@ -55,7 +55,7 @@ $HADOOP_HOME/bin/hadoop fs -put sample.csv <hdfs store path>/sample.csv
 ```
 import org.apache.spark.sql.SparkSession
 import org.apache.spark.sql.CarbonSession._
-val rootPath = "hdfs:////user/hadoop/carbon"
+val rootPath = "hdfs:///user/hadoop/carbon"
 val storeLocation = s"$rootPath/store"
 val warehouse = s"$rootPath/warehouse"
 val metastoredb = s"$rootPath/metastore_db"
@@ -84,7 +84,9 @@ export HADOOP_OPTS="-Dorg.xerial.snappy.lib.path=/Library/Java/Extensions -Dorg.
 ```
 
 ### Start hive client
+```
 $HIVE_HOME/bin/hive
+```
 
 ### Query data from hive table
 

http://git-wip-us.apache.org/repos/asf/carbondata/blob/ca32374a/docs/quick-start-guide.md
----------------------------------------------------------------------
diff --git a/docs/quick-start-guide.md b/docs/quick-start-guide.md
index a14b1cd..4afa515 100644
--- a/docs/quick-start-guide.md
+++ b/docs/quick-start-guide.md
@@ -80,43 +80,49 @@ import org.apache.spark.sql.CarbonSession._
 * Create a CarbonSession :
 
 ```
-val carbon = SparkSession.builder().config(sc.getConf)
-             .getOrCreateCarbonSession("<hdfs store path>")
+val carbon = SparkSession.builder().config(sc.getConf).getOrCreateCarbonSession("<carbon_store_path>")
 ```
-**NOTE**: By default metastore location points to `../carbon.metastore`, user can provide own metastore location to CarbonSession like `SparkSession.builder().config(sc.getConf)
-.getOrCreateCarbonSession("<hdfs store path>", "<local metastore path>")`
+**NOTE** 
+ - By default metastore location points to `../carbon.metastore`, user can provide own metastore location to CarbonSession like 
+   `SparkSession.builder().config(sc.getConf).getOrCreateCarbonSession("<carbon_store_path>", "<local metastore path>")`.
+ - Data storage location can be specified by `<carbon_store_path>`, like `/carbon/data/store`, `hdfs://localhost:9000/carbon/data/store` or `s3a://carbon/data/store`.
 
 #### Executing Queries
 
 ###### Creating a Table
 
 ```
-scala>carbon.sql("CREATE TABLE
-                    IF NOT EXISTS test_table(
-                    id string,
-                    name string,
-                    city string,
-                    age Int)
-                  STORED AS carbondata")
+carbon.sql(
+           s"""
+              | CREATE TABLE IF NOT EXISTS test_table(
+              |   id string,
+              |   name string,
+              |   city string,
+              |   age Int)
+              | STORED AS carbondata
+           """.stripMargin)
 ```
 
 ###### Loading Data to a Table
 
 ```
-scala>carbon.sql("LOAD DATA INPATH '/path/to/sample.csv'
-                  INTO TABLE test_table")
+carbon.sql("LOAD DATA INPATH '/path/to/sample.csv' INTO TABLE test_table")
 ```
+
 **NOTE**: Please provide the real file path of `sample.csv` for the above script. 
 If you get "tablestatus.lock" issue, please refer to [FAQ](faq.md)
 
 ###### Query Data from a Table
 
 ```
-scala>carbon.sql("SELECT * FROM test_table").show()
+carbon.sql("SELECT * FROM test_table").show()
 
-scala>carbon.sql("SELECT city, avg(age), sum(age)
-                  FROM test_table
-                  GROUP BY city").show()
+carbon.sql(
+           s"""
+              | SELECT city, avg(age), sum(age)
+              | FROM test_table
+              | GROUP BY city
+           """.stripMargin).show()
 ```
 
 
@@ -150,23 +156,23 @@ scala>carbon.sql("SELECT city, avg(age), sum(age)
 | spark.driver.extraJavaOptions   | `-Dcarbon.properties.filepath = $SPARK_HOME/conf/carbon.properties` | A string of extra JVM options to pass to the driver. For instance, GC settings or other logging. |
 | spark.executor.extraJavaOptions | `-Dcarbon.properties.filepath = $SPARK_HOME/conf/carbon.properties` | A string of extra JVM options to pass to executors. For instance, GC settings or other logging. **NOTE**: You can enter multiple values separated by space. |
 
-1. Add the following properties in `$SPARK_HOME/conf/carbon.properties` file:
+7. Add the following properties in `$SPARK_HOME/conf/carbon.properties` file:
 
 | Property             | Required | Description                                                  | Example                              | Remark                        |
 | -------------------- | -------- | ------------------------------------------------------------ | ------------------------------------ | ----------------------------- |
 | carbon.storelocation | NO       | Location where data CarbonData will create the store and write the data in its own format. If not specified then it takes spark.sql.warehouse.dir path. | hdfs://HOSTNAME:PORT/Opt/CarbonStore | Propose to set HDFS directory |
 
-1. Verify the installation. For example:
+8. Verify the installation. For example:
 
 ```
-./spark-shell --master spark://HOSTNAME:PORT --total-executor-cores 2
+./bin/spark-shell \
+--master spark://HOSTNAME:PORT \
+--total-executor-cores 2 \
 --executor-memory 2G
 ```
 
 **NOTE**: Make sure you have permissions for CarbonData JARs and files through which driver and executor will start.
 
-
-
 ## Installing and Configuring CarbonData on Spark on YARN Cluster
 
    This section provides the procedure to install CarbonData on "Spark on YARN" cluster.
@@ -195,7 +201,7 @@ tar -zcvf carbondata.tar.gz carbonlib/
 mv carbondata.tar.gz carbonlib/
 ```
 
-1. Configure the properties mentioned in the following table in `$SPARK_HOME/conf/spark-defaults.conf` file.
+4. Configure the properties mentioned in the following table in `$SPARK_HOME/conf/spark-defaults.conf` file.
 
 | Property                        | Description                                                  | Value                                                        |
 | ------------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
@@ -207,20 +213,23 @@ mv carbondata.tar.gz carbonlib/
 | spark.driver.extraClassPath     | Extra classpath entries to prepend to the classpath of the driver. **NOTE**: If SPARK_CLASSPATH is defined in spark-env.sh, then comment it and append the value in below parameter spark.driver.extraClassPath. | `$SPARK_HOME/carbonlib/*`                                    |
 | spark.driver.extraJavaOptions   | A string of extra JVM options to pass to the driver. For instance, GC settings or other logging. | `-Dcarbon.properties.filepath = $SPARK_HOME/conf/carbon.properties` |
 
-1. Add the following properties in `$SPARK_HOME/conf/carbon.properties`:
+5. Add the following properties in `$SPARK_HOME/conf/carbon.properties`:
 
 | Property             | Required | Description                                                  | Example                              | Default Value                 |
 | -------------------- | -------- | ------------------------------------------------------------ | ------------------------------------ | ----------------------------- |
 | carbon.storelocation | NO       | Location where CarbonData will create the store and write the data in its own format. If not specified then it takes spark.sql.warehouse.dir path. | hdfs://HOSTNAME:PORT/Opt/CarbonStore | Propose to set HDFS directory |
 
-1. Verify the installation.
+6. Verify the installation.
 
 ```
- ./bin/spark-shell --master yarn-client --driver-memory 1g
- --executor-cores 2 --executor-memory 2G
+./bin/spark-shell \
+--master yarn-client \
+--driver-memory 1G \
+--executor-memory 2G \
+--executor-cores 2
 ```
 
-  **NOTE**: Make sure you have permissions for CarbonData JARs and files through which driver and executor will start.
+**NOTE**: Make sure you have permissions for CarbonData JARs and files through which driver and executor will start.
 
 
 
@@ -228,13 +237,13 @@ mv carbondata.tar.gz carbonlib/
 
 ### Starting CarbonData Thrift Server.
 
-   a. cd `$SPARK_HOME`
+a. cd `$SPARK_HOME`
 
-   b. Run the following command to start the CarbonData thrift server.
+b. Run the following command to start the CarbonData thrift server.
 
 ```
-./bin/spark-submit
---class org.apache.carbondata.spark.thriftserver.CarbonThriftServer
+./bin/spark-submit \
+--class org.apache.carbondata.spark.thriftserver.CarbonThriftServer \
 $SPARK_HOME/carbonlib/$CARBON_ASSEMBLY_JAR <carbon_store_path>
 ```
 
@@ -246,9 +255,9 @@ $SPARK_HOME/carbonlib/$CARBON_ASSEMBLY_JAR <carbon_store_path>
 **NOTE**: From Spark 1.6, by default the Thrift server runs in multi-session mode. Which means each JDBC/ODBC connection owns a copy of their own SQL configuration and temporary function registry. Cached tables are still shared though. If you prefer to run the Thrift server in single-session mode and share all SQL configuration and temporary function registry, please set option `spark.sql.hive.thriftServer.singleSession` to `true`. You may either add this option to `spark-defaults.conf`, or pass it to `spark-submit.sh` via `--conf`:
 
 ```
-./bin/spark-submit
---conf spark.sql.hive.thriftServer.singleSession=true
---class org.apache.carbondata.spark.thriftserver.CarbonThriftServer
+./bin/spark-submit \
+--conf spark.sql.hive.thriftServer.singleSession=true \
+--class org.apache.carbondata.spark.thriftserver.CarbonThriftServer \
 $SPARK_HOME/carbonlib/$CARBON_ASSEMBLY_JAR <carbon_store_path>
 ```
 
@@ -259,34 +268,34 @@ $SPARK_HOME/carbonlib/$CARBON_ASSEMBLY_JAR <carbon_store_path>
 - Start with default memory and executors.
 
 ```
-./bin/spark-submit
---class org.apache.carbondata.spark.thriftserver.CarbonThriftServer 
-$SPARK_HOME/carbonlib
-/carbondata_2.xx-x.x.x-SNAPSHOT-shade-hadoop2.7.2.jar
+./bin/spark-submit \
+--class org.apache.carbondata.spark.thriftserver.CarbonThriftServer \
+$SPARK_HOME/carbonlib/carbondata_2.xx-x.x.x-SNAPSHOT-shade-hadoop2.7.2.jar \
 hdfs://<host_name>:port/user/hive/warehouse/carbon.store
 ```
 
 - Start with Fixed executors and resources.
 
 ```
-./bin/spark-submit
---class org.apache.carbondata.spark.thriftserver.CarbonThriftServer 
---num-executors 3 --driver-memory 20g --executor-memory 250g 
---executor-cores 32 
-/srv/OSCON/BigData/HACluster/install/spark/sparkJdbc/lib
-/carbondata_2.xx-x.x.x-SNAPSHOT-shade-hadoop2.7.2.jar
+./bin/spark-submit \
+--class org.apache.carbondata.spark.thriftserver.CarbonThriftServer \
+--num-executors 3 \
+--driver-memory 20G \
+--executor-memory 250G \
+--executor-cores 32 \
+$SPARK_HOME/carbonlib/carbondata_2.xx-x.x.x-SNAPSHOT-shade-hadoop2.7.2.jar \
 hdfs://<host_name>:port/user/hive/warehouse/carbon.store
 ```
 
 ### Connecting to CarbonData Thrift Server Using Beeline.
 
 ```
-     cd $SPARK_HOME
-     ./sbin/start-thriftserver.sh
-     ./bin/beeline -u jdbc:hive2://<thriftserver_host>:port
+cd $SPARK_HOME
+./sbin/start-thriftserver.sh
+./bin/beeline -u jdbc:hive2://<thriftserver_host>:port
 
-     Example
-     ./bin/beeline -u jdbc:hive2://10.10.10.10:10000
+Example
+./bin/beeline -u jdbc:hive2://10.10.10.10:10000
 ```
 
 
@@ -300,79 +309,81 @@ Once the table is created,it can be queried from Presto.**
 
 ### Installing Presto
 
- 1. Download the 0.210 version of Presto using:
-    `wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.210/presto-server-0.210.tar.gz`
+1. Download the 0.210 version of Presto using:
+`wget https://repo1.maven.org/maven2/com/facebook/presto/presto-server/0.210/presto-server-0.210.tar.gz`
 
- 2. Extract Presto tar file: `tar zxvf presto-server-0.210.tar.gz`.
+2. Extract Presto tar file: `tar zxvf presto-server-0.210.tar.gz`.
 
- 3. Download the Presto CLI for the coordinator and name it presto.
+3. Download the Presto CLI for the coordinator and name it presto.
 
-  ```
-    wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.210/presto-cli-0.210-executable.jar
+```
+wget https://repo1.maven.org/maven2/com/facebook/presto/presto-cli/0.210/presto-cli-0.210-executable.jar
 
-    mv presto-cli-0.210-executable.jar presto
+mv presto-cli-0.210-executable.jar presto
 
-    chmod +x presto
-  ```
+chmod +x presto
+```
 
 ### Create Configuration Files
 
-  1. Create `etc` folder in presto-server-0.210 directory.
-  2. Create `config.properties`, `jvm.config`, `log.properties`, and `node.properties` files.
-  3. Install uuid to generate a node.id.
+1. Create `etc` folder in presto-server-0.210 directory.
+2. Create `config.properties`, `jvm.config`, `log.properties`, and `node.properties` files.
+3. Install uuid to generate a node.id.
 
-      ```
-      sudo apt-get install uuid
+  ```
+  sudo apt-get install uuid
 
-      uuid
-      ```
+  uuid
+  ```
 
 
 ##### Contents of your node.properties file
 
-  ```
-  node.environment=production
-  node.id=<generated uuid>
-  node.data-dir=/home/ubuntu/data
-  ```
+```
+node.environment=production
+node.id=<generated uuid>
+node.data-dir=/home/ubuntu/data
+```
 
 ##### Contents of your jvm.config file
 
-  ```
-  -server
-  -Xmx16G
-  -XX:+UseG1GC
-  -XX:G1HeapRegionSize=32M
-  -XX:+UseGCOverheadLimit
-  -XX:+ExplicitGCInvokesConcurrent
-  -XX:+HeapDumpOnOutOfMemoryError
-  -XX:OnOutOfMemoryError=kill -9 %p
-  ```
+```
+-server
+-Xmx16G
+-XX:+UseG1GC
+-XX:G1HeapRegionSize=32M
+-XX:+UseGCOverheadLimit
+-XX:+ExplicitGCInvokesConcurrent
+-XX:+HeapDumpOnOutOfMemoryError
+-XX:OnOutOfMemoryError=kill -9 %p
+```
 
 ##### Contents of your log.properties file
-  ```
-  com.facebook.presto=INFO
-  ```
+
+```
+com.facebook.presto=INFO
+```
 
  The default minimum level is `INFO`. There are four levels: `DEBUG`, `INFO`, `WARN` and `ERROR`.
 
 ### Coordinator Configurations
 
 ##### Contents of your config.properties
-  ```
-  coordinator=true
-  node-scheduler.include-coordinator=false
-  http-server.http.port=8086
-  query.max-memory=5GB
-  query.max-total-memory-per-node=5GB
-  query.max-memory-per-node=3GB
-  memory.heap-headroom-per-node=1GB
-  discovery-server.enabled=true
-  discovery.uri=http://localhost:8086
-  task.max-worker-threads=4
-  optimizer.dictionary-aggregation=true
-  optimizer.optimize-hash-generation = false
-  ```
+
+```
+coordinator=true
+node-scheduler.include-coordinator=false
+http-server.http.port=8086
+query.max-memory=5GB
+query.max-total-memory-per-node=5GB
+query.max-memory-per-node=3GB
+memory.heap-headroom-per-node=1GB
+discovery-server.enabled=true
+discovery.uri=http://localhost:8086
+task.max-worker-threads=4
+optimizer.dictionary-aggregation=true
+optimizer.optimize-hash-generation = false
+```
 The options `node-scheduler.include-coordinator=false` and `coordinator=true` indicate that the node is the coordinator and tells the coordinator not to do any of the computation work itself and to use the workers.
 
 **Note**: It is recommended to set `query.max-memory-per-node` to half of the JVM config max memory, though the workload is highly concurrent, lower value for `query.max-memory-per-node` is to be used.
@@ -385,13 +396,13 @@ Then, `query.max-memory=<30GB * number of nodes>`.
 
 ##### Contents of your config.properties
 
-  ```
-  coordinator=false
-  http-server.http.port=8086
-  query.max-memory=5GB
-  query.max-memory-per-node=2GB
-  discovery.uri=<coordinator_ip>:8086
-  ```
+```
+coordinator=false
+http-server.http.port=8086
+query.max-memory=5GB
+query.max-memory-per-node=2GB
+discovery.uri=<coordinator_ip>:8086
+```
 
 **Note**: `jvm.config` and `node.properties` files are same for all the nodes (worker + coordinator). All the nodes should have different `node.id`.(generated by uuid command).
 
@@ -420,6 +431,7 @@ To run it as a background process.
 To run it in foreground.
 
 ### Start Presto CLI
+
 ```
 ./presto
 ```

http://git-wip-us.apache.org/repos/asf/carbondata/blob/ca32374a/docs/s3-guide.md
----------------------------------------------------------------------
diff --git a/docs/s3-guide.md b/docs/s3-guide.md
index 1121164..94aebae 100644
--- a/docs/s3-guide.md
+++ b/docs/s3-guide.md
@@ -33,7 +33,7 @@ to be configured with Object Store path in CarbonProperties file.
 
 For example:
 ```
-carbon.storelocation=s3a://mybucket/carbonstore.
+carbon.storelocation=s3a://mybucket/carbonstore
 ```
 
 If the existing store location cannot be changed or only specific tables need to be stored 
@@ -68,8 +68,11 @@ spark.hadoop.fs.s3a.access.key=456
 
 Example:
 ```
-./bin/spark-submit --master yarn --conf spark.hadoop.fs.s3a.secret.key=123 --conf spark.hadoop.fs
-.s3a.access.key=456 --class=
+./bin/spark-submit \
+--master yarn \
+--conf spark.hadoop.fs.s3a.secret.key=123 \
+--conf spark.hadoop.fs.s3a.access.key=456 \
+--class=xxx
 ```  
 
 4. Set authentication properties to hadoop configuration object in sparkContext.

http://git-wip-us.apache.org/repos/asf/carbondata/blob/ca32374a/docs/streaming-guide.md
----------------------------------------------------------------------
diff --git a/docs/streaming-guide.md b/docs/streaming-guide.md
index 0987ed2..b993f82 100644
--- a/docs/streaming-guide.md
+++ b/docs/streaming-guide.md
@@ -46,7 +46,7 @@ mvn clean package -DskipTests -Pspark-2.2
 
 Start a socket data server in a terminal
 ```shell
- nc -lk 9099
+nc -lk 9099
 ```
  type some CSV rows as following
 ```csv
@@ -131,12 +131,12 @@ Continue to type some rows into data server, and spark-shell will show the new d
 Streaming table is just a normal carbon table with "streaming" table property, user can create
 streaming table using following DDL.
 ```sql
- CREATE TABLE streaming_table (
-  col1 INT,
-  col2 STRING
- )
- STORED AS carbondata
- TBLPROPERTIES('streaming'='true')
+CREATE TABLE streaming_table (
+ col1 INT,
+ col2 STRING
+)
+STORED AS carbondata
+TBLPROPERTIES('streaming'='true')
 ```
 
  property name | default | description