Posted to commits@carbondata.apache.org by ak...@apache.org on 2020/05/06 10:29:56 UTC

[carbondata] branch master updated: [CARBONDATA-3791] Correct spelling, link and ddl in SI and MV Documentation

This is an automated email from the ASF dual-hosted git repository.

akashrn5 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
     new 3ea6b18  [CARBONDATA-3791] Correct spelling, link and ddl in SI and MV Documentation
3ea6b18 is described below

commit 3ea6b181b41b0f9a6de348574d166df8ff7019f6
Author: Indhumathi27 <in...@gmail.com>
AuthorDate: Sun May 3 17:17:06 2020 +0530

    [CARBONDATA-3791] Correct spelling, link and ddl in SI and MV Documentation
    
    Why is this PR needed?
    Correct spelling, link and ddl in SI and MV Documentation
    
    What changes were proposed in this PR?
    Fixed spelling, link and ddl in SI and MV Documentation
    
    This closes #3735
---
 docs/configuration-parameters.md                   |  2 +-
 docs/index/bloomfilter-index-guide.md              | 15 +++--
 docs/index/index-management.md                     | 37 +++++------
 docs/index/lucene-index-guide.md                   | 30 ++++-----
 docs/index/secondary-index-guide.md                | 76 +++++++++++-----------
 docs/mv-guide.md                                   | 58 ++++++++---------
 .../CarbonDataFileMergeTestCaseOnSI.scala          |  2 +-
 7 files changed, 109 insertions(+), 111 deletions(-)

diff --git a/docs/configuration-parameters.md b/docs/configuration-parameters.md
index 486b133..4627cac 100644
--- a/docs/configuration-parameters.md
+++ b/docs/configuration-parameters.md
@@ -116,7 +116,7 @@ This section provides the details of all the configurations required for the Car
 | carbon.compaction.prefetch.enable | false | Compaction operation is similar to Query + data load where in data from qualifying segments are queried and data loading performed to generate a new single segment. This configuration determines whether to query ahead data from segments and feed it for data loading. **NOTE: **This configuration is disabled by default as it needs extra resources for querying extra data. Based on the memory availability on the cluster, user can enable it to imp [...]
 | carbon.merge.index.in.segment | true | Each CarbonData file has a companion CarbonIndex file which maintains the metadata about the data. These CarbonIndex files are read and loaded into driver and is used subsequently for pruning of data during queries. These CarbonIndex files are very small in size(few KB) and are many. Reading many small files from HDFS is not efficient and leads to slow IO performance. Hence these CarbonIndex files belonging to a segment can be combined into  a sin [...]
 | carbon.enable.range.compaction | true | To configure whether Range-based Compaction is to be used for RANGE_COLUMN. If true, the data will still be present in ranges after compaction. |
-| carbon.si.segment.merge | false | Making this true degrade the LOAD performance. When the number of small files increase for SI segments(it can happen as number of columns will be less and we store position id and reference columns), user an either set to true which will merge the data files for upcoming loads or run SI rebuild command which does this job for all segments. (REBUILD INDEX <index_table>) |
+| carbon.si.segment.merge | false | Making this true degrades the LOAD performance. When the number of small files increases for SI segments (this can happen as the number of columns will be less and we store position id and reference columns), the user can either set this to true, which will merge the data files for upcoming loads, or run the SI refresh command, which does this job for all segments. (REFRESH INDEX <index_table>) |
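
For illustration, a minimal sketch of the two options described above, assuming a hypothetical SI table `sales_index` (as in the secondary-index guide); setting the property via session-level `SET` is an assumption, it can also be configured in carbon.properties:

```
-- Opt in to merging SI data files for upcoming loads
-- (session-level SET is an assumption; carbon.properties also works)
SET carbon.si.segment.merge=true;

-- Or merge the small files of all existing SI segments in one shot
REFRESH INDEX sales_index;
```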
 
 ## Query Configuration
 
diff --git a/docs/index/bloomfilter-index-guide.md b/docs/index/bloomfilter-index-guide.md
index 85f284a..03085f1 100644
--- a/docs/index/bloomfilter-index-guide.md
+++ b/docs/index/bloomfilter-index-guide.md
@@ -36,14 +36,15 @@ Creating BloomFilter Index
 Dropping Specified Index
   ```
   DROP INDEX [IF EXISTS] index_name
-  ON TABLE main_table
+  ON [TABLE] main_table
   ```
 
 Showing all Indexes on this table
   ```
   SHOW INDEXES
-  ON TABLE main_table
+  ON [TABLE] main_table
   ```
+> NOTE: Keywords given inside `[]` are optional.
 
 Disable Index
 > The index is enabled by default. To support query tuning, we can disable a specific index during a query to observe whether we can gain a performance enhancement from it. This is effective only for the current session.
@@ -59,7 +60,7 @@ Disable Index
 ## BloomFilter Index Introduction
 A Bloom filter is a space-efficient probabilistic data structure that is used to test whether an element is a member of a set.
 Carbondata introduced BloomFilter as an index to enhance the performance of querying with precise value.
-It is well suitable for queries that do precise match on high cardinality columns(such as Name/ID).
+It is well suited for queries that do precise matching on high cardinality columns (such as Name/ID).
 Internally, CarbonData maintains a BloomFilter per blocklet for each index column to indicate whether a value of the column is in this blocklet.
 Just like the other indexes, BloomFilter index is managed along with main tables by CarbonData.
 User can create BloomFilter index on specified columns with specified BloomFilter configurations such as size and probability.
@@ -79,7 +80,7 @@ For instance, main table called **index_test** which is defined as:
 
 In the above example, `id` and `name` are high cardinality columns
 and we always query on `id` and `name` with precise value.
-since `id` is in the sort_columns and it is orderd,
+Since `id` is in the sort_columns and it is ordered,
 query on it will be fast because CarbonData can skip all the irrelevant blocklets.
 But queries on `name` may be slow since the blocklet minmax may not help,
 because in each blocklet the range of the value of `name` may be the same -- all from A* to z*.
@@ -96,7 +97,7 @@ User can create BloomFilter Index using the Create Index DDL:
   PROPERTIES ('BLOOM_SIZE'='640000', 'BLOOM_FPP'='0.00001', 'BLOOM_COMPRESS'='true')
   ```
 
-Here, (name,id) are INDEX_COLUMNS. Carbondata will generate BloomFilter index on these columns. Queries on these columns are usually like 'COL = VAL'.
+Here, (name,id) are INDEX_COLUMNS. Carbondata will generate BloomFilter index on these columns. Queries on these columns are usually like `'COL = VAL'`.
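
For instance, a point lookup that such an index can prune, written against the `index_test` table above (the literal values are hypothetical):

```
-- Precise-match filters on the bloom-indexed columns let CarbonData
-- skip blocklets whose BloomFilter reports no possible match
SELECT * FROM index_test WHERE id = '1' AND name = 'n1';
```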
 
 **Properties for BloomFilter Index**
 
@@ -131,7 +132,7 @@ You can refer to the corresponding section in [CarbonData Lucene Index](https://
 + We can create multiple BloomFilter Indexes on one table,
 but we do recommend creating one BloomFilter Index that contains multiple index columns,
  because the data loading and query performance will be better.
-+ `BLOOM_FPP` is only the expected number from user, the actually FPP may be worse.
++ `BLOOM_FPP` is only the expected value from the user; the actual FPP may be worse.
  If the BloomFilter Index does not work well,
  you can try to increase `BLOOM_SIZE` and decrease `BLOOM_FPP` at the same time.
  Notice that bigger `BLOOM_SIZE` will increase the size of index file
@@ -145,5 +146,5 @@ You can refer to the corresponding section in [CarbonData Lucene Index](https://
 + In some scenarios, the BloomFilter Index may not enhance the query performance significantly
 but if it can reduce the number of spark tasks,
  there is still a chance that BloomFilter Index can enhance the performance for concurrent query.
-+ Note that BloomFilter Index will decrease the data loading performance and may cause slightly storage expansion (for index file).
++ Note that BloomFilter Index will decrease the data loading performance and may cause slight storage expansion (for index file).
 
diff --git a/docs/index/index-management.md b/docs/index/index-management.md
index 6b4b6ec..7bd9c75 100644
--- a/docs/index/index-management.md
+++ b/docs/index/index-management.md
@@ -51,54 +51,51 @@ Currently, there are 3 Index implementations in CarbonData.
 
 There are two kinds of management semantic for Index.
 
-1. Automatic Refresh: Create index without `WITH DEFERRED REBUILD` in the statement, which is by default.
-2. Manual Refresh: Create index with `WITH DEFERRED REBUILD` in the statement
+1. Automatic Refresh
+2. Manual Refresh
 
 ### Automatic Refresh
 
-When user creates a index on the main table without using `WITH DEFERRED REFRESH` syntax, the index will be managed by system automatically.
-For every data load to the main table, system will immediately trigger a load to the index automatically. These two data loading (to main table and index) is executed in a transactional manner, meaning that it will be either both success or neither success. 
+When a user creates an index on the main table without using `WITH DEFERRED REFRESH` syntax, the index will be managed by the system automatically.
+For every data load to the main table, the system will immediately trigger a load to the index automatically. These two data loads (to the main table and the index) are executed in a transactional manner, meaning that either both succeed or neither does. 
 
-The data loading to index is incremental based on Segment concept, avoiding a expensive total rebuild.
+The data loading to the index is incremental, based on the Segment concept, avoiding an expensive total refresh.
 
-If user perform following command on the main table, system will return failure. (reject the operation)
+If a user performs any of the following commands on the main table, the system will return failure (reject the operation):
 
 1. Data management command: `UPDATE/DELETE/DELETE SEGMENT`.
 2. Schema management command: `ALTER TABLE DROP COLUMN`, `ALTER TABLE CHANGE DATATYPE`,
    `ALTER TABLE RENAME`. Note that adding a new column is supported, and for dropping columns and
    change datatype command, CarbonData will check whether it will impact the index table, if
-    not, the operation is allowed, otherwise operation will be rejected by throwing exception.
+    not, the operation is allowed; otherwise the operation will be rejected by throwing an exception.
 3. Partition management command: `ALTER TABLE ADD/DROP PARTITION`.
 
-If user do want to perform above operations on the main table, user can first drop the index, perform the operation, and re-create the index again.
+If a user does want to perform the above operations on the main table, the user can first drop the index, perform the operation, and re-create the index.
 
-If user drop the main table, the index will be dropped immediately too.
+If a user drops the main table, the index will be dropped immediately too.
 
-We do recommend you to use this management for index.
+We recommend using this management approach for indexes.
 
 ### Manual Refresh
 
-When user creates a index specifying manual refresh semantic, the index is created with status *disabled* and query will NOT use this index until user can issue REFRESH INDEX command to build the index. For every REFRESH INDEX command, system will trigger a full rebuild of the index. After rebuild is done, system will change index status to *enabled*, so that it can be used in query rewrite.
+When a user creates an index on the main table using `WITH DEFERRED REFRESH` syntax, the index will be created with status *disabled* and queries will NOT use this index until the user issues the `REFRESH INDEX` command to build the index. For every `REFRESH INDEX` command, the system will trigger a full refresh of the index. Once the refresh operation is finished, the system will change the index status to *enabled*, so that it can be used in query rewrite.
 
 For every new data load, update, or delete, the related index will be made *disabled*,
 which means that the following queries will not benefit from the index before it becomes *enabled* again.
 
-If the main table is dropped by user, the related index will be dropped immediately.
+If the main table is dropped by the user, the related index will be dropped immediately.
 
 **Note**:
-+ If you are creating a index on external table, you need to do manual management of the index.
-+ For index such as BloomFilter index, there is no need to do manual refresh.
- By default it is automatic refresh,
- which means its data will get refreshed immediately after the index is created or the main table is loaded.
- Manual refresh on this index will has no impact.
++ If you are creating an index on an external table, you need to do manual management of the index.
++ Currently, all types of indexes supported by carbon will be automatically refreshed by default, which means their data will get refreshed immediately after the index is created or the main table is loaded. Manual refresh on these indexes is not supported.
 
 ## Index Related Commands
 
 ### Explain
 
-How can user know whether index is used in the query?
+How can users know whether an index is used in the query?
 
-User can set enable.query.statistics = true and use EXPLAIN command to know, it will print out something like
+Users can set `enable.query.statistics = true` and use the `EXPLAIN` command to find out; it will print out something like
 
 ```text
 == CarbonData Profiler ==
@@ -113,7 +110,7 @@ Table Scan on default.main
 
 ### Show Index
 
-There is a SHOW INDEXES command, when this is issued, system will read all index from the carbon table and print all information on screen. The current information includes:
+There is a SHOW INDEXES command; when this is issued, the system will read all indexes from the carbon table and print all information on screen. The current information includes:
 
 - Name
 - Provider like lucene
diff --git a/docs/index/lucene-index-guide.md b/docs/index/lucene-index-guide.md
index c811ec3..87f840a 100644
--- a/docs/index/lucene-index-guide.md
+++ b/docs/index/lucene-index-guide.md
@@ -36,14 +36,15 @@ index_columns is the list of string columns on which lucene creates indexes.
 Index can be dropped using following DDL:
   ```
   DROP INDEX [IF EXISTS] index_name
-  ON TABLE main_table
+  ON [TABLE] main_table
   ```
 To show all Indexes created, use:
   ```
   SHOW INDEXES
-  ON TABLE main_table
+  ON [TABLE] main_table
   ```
-It will show all Indexes created on main table.
+It will show all Indexes created on the main table.
+> NOTE: Keywords given inside `[]` are optional.
 
 
 ## Lucene Index Introduction
@@ -83,28 +84,28 @@ It will show all Indexes created on main table.
 When loading data to main table, lucene index files will be generated for all the
 index_columns(String Columns) given in CREATE statement which contains information about the data
 location of index_columns. These index files will be written inside a folder named with index name
-inside each segment folders.
+inside each segment folder.
 
-A system level configuration carbon.lucene.compression.mode can be added for best compression of
+A system level configuration `carbon.lucene.compression.mode` can be added for best compression of
 lucene index files. The default value is speed, where the index writing speed will be higher. If the
 value is compression, the index file size will be compressed.
 
 ## Querying data
 As a technique for query acceleration, Lucene indexes cannot be queried directly.
-Queries are to be made on main table. when a query with TEXT_MATCH('name:c10') or 
+Queries are to be made on the main table. When a query with TEXT_MATCH('name:c10') or 
 TEXT_MATCH_WITH_LIMIT('name:n10',10) [the second parameter represents the number of results to be 
 returned; if the user does not specify this value, all results will be returned without any limit] is 
-fired, two jobs are fired. The first job writes the temporary files in folder created at table level 
-which contains lucene's seach results and these files will be read in second job to give faster 
+fired, two jobs will be launched. The first job writes the temporary files in a folder created at table level 
+which contains lucene's search results, and these files will be read in the second job to give faster 
 results. These temporary files will be cleared once the query finishes.
 
-User can verify whether a query can leverage Lucene index or not by executing `EXPLAIN`
+Users can verify whether a query can leverage the Lucene index or not by executing the `EXPLAIN`
 command, which will show the transformed logical plan, and thus user can check whether TEXT_MATCH()
 filter is applied on query or not.
 
 **Note:**
- 1. The filter columns in TEXT_MATCH or TEXT_MATCH_WITH_LIMIT must be always in lower case and 
-filter condition like 'AND','OR' must be in upper case.
+ 1. The filter columns in TEXT_MATCH or TEXT_MATCH_WITH_LIMIT must always be in lowercase and 
+filter conditions like 'AND','OR' must be in uppercase.
 
       Ex: 
       ```
@@ -124,7 +125,7 @@ filter condition like 'AND','OR' must be in upper case.
    ```
        
           
-Below like queries can be converted to text_match queries as following:
+The below `like` queries can be converted to text_match queries as follows:
 ```
 select * from index_test where name='n10'
 
@@ -151,9 +152,8 @@ select * from index_test where TEXT_MATCH('name:*10 -name:*n*')
 **Note:** For lucene queries and syntax, refer to [lucene-syntax](http://www.lucenetutorial.com/lucene-query-syntax.html)
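
For instance, a sketch of the limit variant described above, reusing the `index_test` examples (the limit of 10 is arbitrary):

```
-- Return at most 10 rows whose lucene-indexed 'name' column matches n10
SELECT * FROM index_test WHERE TEXT_MATCH_WITH_LIMIT('name:n10', 10);
```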
 
 ## Data Management with lucene index
-Once there is lucene index is created on the main table, following command on the main
-table
-is not supported:
+Once there is a lucene index created on the main table, the following commands on the main
+table are not supported:
 1. Data management command: `UPDATE/DELETE`.
 2. Schema management command: `ALTER TABLE DROP COLUMN`, `ALTER TABLE CHANGE DATATYPE`, 
 `ALTER TABLE RENAME`.
diff --git a/docs/index/secondary-index-guide.md b/docs/index/secondary-index-guide.md
index e588ed9..1d86b82 100644
--- a/docs/index/secondary-index-guide.md
+++ b/docs/index/secondary-index-guide.md
@@ -30,34 +30,36 @@ Start spark-sql in terminal and run the following queries,
 ```
 CREATE TABLE maintable(a int, b string, c string) stored as carbondata;
 insert into maintable select 1, 'ab', 'cd';
-CREATE index inex1 on table maintable(c) AS 'carbondata';
+CREATE index index1 on table maintable(c) AS 'carbondata';
 SELECT a from maintable where c = 'cd';
 // NOTE: run explain query and check if query hits the SI table from the plan
 EXPLAIN SELECT a from maintable where c = 'cd';
 ```
 
 ## Secondary Index Introduction
-  Sencondary index tables are created as a indexes and managed as child tables internally by
-  Carbondata. Users can create secondary index based on the column position in main table(Recommended
+  Secondary index tables are created as indexes and managed as child tables internally by
+  Carbondata. Users can create a secondary index based on the column position in the main table (Recommended
   for right columns) and the queries should have filter on that column to improve the filter query
   performance.
   
-  SI tables will always be loaded non-lazy way. Once SI table is created, Carbondata's 
+  Data refresh to the secondary index is always automatic. Once the SI table is created, Carbondata's 
   CarbonOptimizer with the help of `CarbonSITransformationRule`, transforms the query plan to hit the
   SI table based on the filter condition or set of filter conditions present in the query.
-  So first level of pruning will be done on SI table as it stores blocklets and main table/parent
+  So the first level of pruning will be done on the SI table as it stores blocklets and main table/parent
   table pruning will be based on the SI output, which helps in giving the faster query results with
   better pruning.
 
-  Secondary Index table can be create with below syntax
+  Secondary Index table can be created with the below syntax
 
    ```
    CREATE INDEX [IF NOT EXISTS] index_name
    ON TABLE maintable(index_column)
    AS
    'carbondata'
-   [TBLPROPERTIES('table_blocksize'='1')]
+   [PROPERTIES('table_blocksize'='1')]
    ```
+> NOTE: Keywords given inside `[]` are optional.
+
   For instance, main table called **sales** which is defined as
 
   ```
@@ -78,16 +80,16 @@ EXPLAIN SELECT a from maintable where c = 'cd';
   ON TABLE sales(user_id)
   AS
   'carbondata'
-  TBLPROPERTIES('table_blocksize'='1')
+  PROPERTIES('table_blocksize'='1')
   ```
  
  
 #### How SI tables are selected
 
-When a user executes a filter query, during query planning phase, CarbonData with help of
+When a user executes a filter query, during the query planning phase, CarbonData with the help of
 `CarbonSITransformationRule`, checks if there are any index tables present on the filter column of
-query. If there are any, then filter query plan will be transformed such a way that, execution will
-first hit the corresponding SI table and give input to main table for further pruning.
+query. If there are any, then the filter query plan will be transformed in such a way that execution will
+first hit the corresponding SI table and give input to the main table for further pruning.
 
 
 For the main table **sales** and SI table  **index_sales** created above, following queries
@@ -105,27 +107,27 @@ will be transformed by CarbonData's `CarbonSITransformationRule` to query agains
 
 ### Loading data to Secondary Index table(s).
 
-*case1:* When SI table is created and the main table does not have any data. In this case every
-consecutive load will load to SI table once main table data load is finished.
+*case1:* When the SI table is created and the main table does not have any data. In this case every
+consecutive load to the main table will load data to the SI table once the main table data load is finished.
 
-*case2:* When SI table is created and main table already contains some data, then SI creation will
-also load to SI table with same number of segments as main table. There after, consecutive load to
-main table will load to SI table also.
+*case2:* When the SI table is created and the main table already contains some data, then SI creation will
+also load data to the SI table with the same number of segments as the main table. Thereafter, consecutive load to
+the main table will also load data to the SI table.
 
  **NOTE**:
- * In case of data load failure to SI table, then we make the SI table disable by setting a hive serde
+ * In case of a data load failure to the SI table, we disable the SI table by setting a hive serde
 property. The subsequent main table load will load the old failed loads along with the current load and
 make the SI table enabled and available for query.
 
 ## Querying data
-Direct query can be made on SI tables to see the data present in position reference columns.
-When a filter query is fired, if the filter column is a secondary index column, then plan is
-transformed accordingly to hit SI table first to make better pruning with main table and in turn
+Direct queries can be made on SI tables to check the data present in position reference columns.
+When a filter query is fired, and if the filter column is a secondary index column, then the plan is
+transformed accordingly to hit the SI table first to enable better pruning with the main table, which in turn
 helps deliver faster query results.
 
-User can verify whether a query can leverage SI table or not by executing `EXPLAIN`
-command, which will show the transformed logical plan, and thus user can check whether SI table
-table is selected.
+Users can verify whether a query can leverage the SI table or not by executing the `EXPLAIN`
+command, which will show the transformed logical plan, and thus users can check whether the SI table
+is selected.
 
 
 ## Compacting SI table
@@ -133,33 +135,33 @@ table is selected.
 ### Compacting SI table through Main Table compaction
 Running the Compaction command (`ALTER TABLE COMPACT`) [COMPACTION TYPE -> MINOR/MAJOR] on the main table will
 automatically delete all the old segments of SI and create a new segment with the same name as the main
-table compacted segmet and loads data to it.
+table's compacted segment and load data to it.
 
-### Compacting SI table's individual segment(s) through REBUILD command
-Where there are so many small files present in the SI table, then we can use REBUILD command to
+### Compacting SI table's individual segment(s) through REFRESH INDEX command
+When there are many small files present in the SI table, we can use the REFRESH INDEX command to
 compact the files within an SI segment to avoid many small files.
 
   ```
-  REBUILD INDEX sales_index
+  REFRESH INDEX sales_index
   ```
-This command merges data files in  each segment of SI table.
+This command merges data files in each segment of the SI table.
 
   ```
-  REBUILD INDEX sales_index WHERE SEGMENT.ID IN(1)
+  REFRESH INDEX sales_index WHERE SEGMENT.ID IN(1)
   ```
-This command merges data files within specified segment of SI table.
+This command merges data files within a specified segment of the SI table.
 
 ## How to skip Secondary Index?
-When Secondary indexes are created on a table(s), always data fetching happens from secondary
+When Secondary indexes are created on a table, data fetching happens from the secondary
 indexes created on the main tables for better performance. But sometimes, data fetching from the
-secondary index might degrade query performance in case where the data is sparse and most of the
+secondary index might degrade query performance in cases where the data is sparse and most of the
 blocklets need to be scanned. So to avoid such secondary indexes, we use NI as a function on filters
-with in WHERE clause.
+within the WHERE clause.
 
   ```
   SELECT country, sex from sales where NI(user_id = 'xxx')
   ```
-The above query ignores column user_id from secondary index and fetch data from main table.
+The above query ignores column `user_id` from the secondary index and fetches data from the main table.
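
Whether the secondary index was really skipped can be checked with the `EXPLAIN` command, for example:

```
-- The transformed plan should no longer route through the secondary index table
EXPLAIN SELECT country, sex FROM sales WHERE NI(user_id = 'xxx');
```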
 
 ## DDLs on Secondary Index
 
@@ -168,7 +170,7 @@ This command is used to get information about all the secondary indexes on a tab
 
 Syntax
   ```
-  SHOW INDEXES  on [db_name.]table_name
+  SHOW INDEXES ON [TABLE] [db_name.]table_name
   ```
 
 ### Drop index Command
@@ -176,7 +178,7 @@ This command is used to drop an existing secondary index on a table
 
 Syntax
   ```
-  DROP INDEX [IF EXISTS] index_name on [db_name.]table_name
+  DROP INDEX [IF EXISTS] index_name ON [TABLE] [db_name.]table_name
   ```
 
 ### Register index Command
@@ -185,5 +187,5 @@ where we have old stores.
 
 Syntax
   ```
-  REGISTER INDEX TABLE index_name ON [db_name.]table_name
+  REGISTER INDEX TABLE index_name ON [TABLE] [db_name.]table_name
   ```
\ No newline at end of file
diff --git a/docs/mv-guide.md b/docs/mv-guide.md
index 9902e1c..24e38b1 100644
--- a/docs/mv-guide.md
+++ b/docs/mv-guide.md
@@ -35,17 +35,17 @@
      INSERT INTO maintable SELECT 1, 'ab', 2;
      CREATE MATERIALIZED VIEW view1 AS SELECT a, sum(b) FROM maintable GROUP BY a;
      SELECT a, sum(b) FROM maintable GROUP BY a;
-     // NOTE: run explain query and check if query hits the Index table from the plan
+     // NOTE: run explain query and check if query hits the mv table from the plan
      EXPLAIN SELECT a, sum(b) FROM maintable GROUP BY a;
    ```
 
-## Introductions
+## Introduction
 
- Materialized views are created as tables from queries. User can create limitless materialized view 
+ Materialized views are created as tables from queries. Users can create limitless materialized views 
 to improve query performance provided the storage requirements and loading time are acceptable.
  
  Materialized view can be refreshed on commit or on manual. Once materialized views are created, 
- CarbonData's MVRewriteRule helps to select the most efficient materialized view based on 
+ CarbonData's `MVRewriteRule` helps to select the most efficient materialized view based on 
  the user query and rewrite the SQL to select the data from materialized view instead of 
  fact tables. Since the data size of materialized view is smaller and data is pre-processed, 
  user queries are much faster.
@@ -63,7 +63,7 @@
      STORED AS carbondata
    ```
 
- User can create materialized view using the CREATE MATERIALIZED VIEW statement.
+ Users can create a materialized view using the CREATE MATERIALIZED VIEW statement.
  
    ```
      CREATE MATERIALIZED VIEW agg_sales
@@ -75,7 +75,7 @@
    ```
 
  **NOTE**:
-   * Group by and Order by columns has to be provided in projection list while creating materialized view.
+   * Group by and Order by columns have to be provided in the projection list while creating a materialized view.
    * If only single fact table is involved in materialized view creation, then TableProperties of 
     fact table (if not present in an aggregate function like sum(col)) listed below will be 
      inherited to materialized view.
@@ -93,7 +93,7 @@
    * Creating materialized view with select query containing only project of all columns of fact 
      table is unsupported.
      **Example:**
-       If table 'x' contains columns 'a,b,c', then creating MV Index with below queries is not supported.
+       If table 'x' contains columns 'a,b,c', then creating an MV with the below queries is not supported.
          1. ```SELECT a,b,c FROM x```
          2. ```SELECT * FROM x```
    * TableProperties can be provided in Properties excluding LOCAL_DICTIONARY_INCLUDE,
@@ -107,9 +107,9 @@
 
 #### How materialized views are selected
 
- When a user query is submitted, during query planning phase, CarbonData will collect modular plan
- candidates and process the the ModularPlan based on registered summary data sets. Then,
- materialized view for this query will be selected among the candidates.
+ When a user query is submitted, during the query planning phase, CarbonData will collect modular plan
+ candidates and process the ModularPlan based on registered summary data sets. Then,
+ a materialized view for this query will be selected among the candidates.
 
  For the fact table **sales** and materialized view **agg_sales** created above, following queries
    ```
@@ -140,7 +140,7 @@
  view will be triggered by the CREATE MATERIALIZED VIEW statement when user creates the materialized 
  view.
 
- For incremental loads to fact table, data to materialized view will be loaded once the 
+ For incremental loads to the fact table, data to the materialized view will be loaded once the 
  corresponding fact table load is completed.
 
 ### Loading data on manual
@@ -148,7 +148,7 @@
  In case of WITH DEFERRED REFRESH, data load to materialized view will be triggered by the refresh 
 command. Materialized view will be in DISABLED state in the below scenarios.
 
-   * when materialized view is created.
+   * when a materialized view is created.
    * when data of fact table and materialized view are not in sync.
   
  User should fire REFRESH MATERIALIZED VIEW command to sync all segments of fact table with 
@@ -163,27 +163,27 @@
 
 During load to the fact table, if any one of the loads to a materialized view fails, then that 
  corresponding materialized view will be DISABLED and load to other materialized views mapped 
- to fact table will continue. 
+ to the fact table will continue. 
 
  User can fire REFRESH MATERIALIZED VIEW command to sync or else the subsequent table load 
  will load the old failed loads along with current load and enable the disabled materialized view.
 
  **NOTE**:
    * In case of InsertOverwrite/Update operation on fact table, all segments of materialized view 
-     will be MARKED_FOR_DELETE and reload to Index table will happen by REFRESH MATERIALIZED VIEW, 
+     will be MARKED_FOR_DELETE and reload to the mv table will happen by REFRESH MATERIALIZED VIEW, 
      in case of materialized view which refresh on manual and once the InsertOverwrite/Update 
      operation on fact table is finished, in case of materialized view which refresh on commit.
    * In case of full scan query, Data Size and Index Size of fact table and materialized view 
-     will not the same, as fact table and materialized view has different column names.
+     will not be the same, as fact table and materialized view have different column names.
 
 ## Querying data
 
- Queries are to be made on fact table. While doing query planning, internally CarbonData will check
+ Queries are to be made on the fact table. While doing query planning, internally CarbonData will check
  for the materialized views which are associated with the fact table, and do query plan 
  transformation accordingly.
  
- User can verify whether a query can leverage materialized view or not by executing `EXPLAIN` command, 
- which will show the transformed logical plan, and thus user can check whether materialized view 
+ Users can verify whether a query can leverage materialized view or not by executing the `EXPLAIN` command, 
+ which will show the transformed logical plan, and thus the user can check whether a materialized view 
  is selected.
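
For example, a sketch with hypothetical column names (the exact plan output depends on the registered views):

```
-- If the rewrite succeeds, the transformed plan reads from the
-- materialized view instead of the fact table
EXPLAIN SELECT country, sum(price) FROM sales GROUP BY country;
```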
 
 ## Compacting
@@ -207,7 +207,7 @@
       materialized view, if not, the operation is allowed, otherwise operation will be rejected by
       throwing exception.
    3. Partition management command: `ALTER TABLE ADD/DROP PARTITION`. Note that dropping a partition
-      will be allowed only if partition is participating in all indexes associated with fact table.
+      will be allowed only if the partition column of the fact table is participating in all of the table's materialized views.
       Drop Partition is not allowed, if any materialized view is associated with more than one 
       fact table. Drop Partition directly on materialized view is not allowed.
    4. Complex Datatype's for materialized view is not supported.
@@ -215,7 +215,7 @@
  However, there is still way to support these operations on fact table, in current CarbonData
  release, user can do as following:
  
-   1. Remove the materialized by `DROP MATERIALIZED VIEW` command.
+   1. Remove the materialized view by the `DROP MATERIALIZED VIEW` command.
    2. Carry out the data management operation on fact table.
    3. Create the materialized view again by `CREATE MATERIALIZED VIEW` command.
    
@@ -273,14 +273,14 @@
      GROUP BY timeseries(order_time, 'minute')
    ```
  And execute the below query to check time series data. In this example, a materialized view of 
- aggregated table on price column will be created, which will be aggregated on every one minute.
+ the aggregated table on the price column will be created, which will be aggregated every minute.
   
    ```
      SELECT timeseries(order_time,'minute'), avg(price)
      FROM sales
      GROUP BY timeseries(order_time,'minute')
    ```
- Find below the result of above query aggregated over minute.
+ Find below the result of the above query aggregated over a minute.
  
    ```
      +---------------------------------------+----------------+
@@ -300,19 +300,17 @@
  granularity provided during creation and stored on each segment.
  
  **NOTE**:
-   1. Single select statement cannot contain time series udf(s) neither with different granularity
-      nor with different timestamp/date columns.
-   2. Retention policies for time series is not supported yet.
+   1. Retention policies for time series are not supported yet.
  
 ## Time Series RollUp Support
 
- Time series queries can be rolled up from existing materialized view.
+ Time series queries can be rolled up from an existing materialized view.
  
 ### Query RollUp
 
  Consider an example where the query is on hour level granularity, but the materialized view
  with hour level granularity is not present but materialized view with minute level granularity is 
- present, then we can get the data from minute level and the aggregate the hour level data and 
+ present, then we can get the data from the minute level, aggregate it to the hour level, and 
  give output. This is called query rollup.
  
 Consider if a user creates the below time series materialized view,
@@ -334,10 +332,10 @@
    ```
 
  Then, the above query can be rolled up from materialized view 'agg_sales', by adding hour
- level time series aggregation on minute level aggregation. Users can fire explain command
- to check if query is rolled up from existing materialized view.
+ level time series aggregation on minute level aggregation. Users can fire the `EXPLAIN` command
+ to check if a query is rolled up from an existing materialized view.
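
For instance, a sketch reusing the `sales`/`order_time` columns from the time series example above (whether the minute-level view is chosen depends on the registered views):

```
-- Hour-level query expected to roll up from the minute-level view 'agg_sales'
EXPLAIN SELECT timeseries(order_time, 'hour'), avg(price)
FROM sales
GROUP BY timeseries(order_time, 'hour');
```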
  
   **NOTE**:
-    1. Queries cannot be rolled up, if filter contains time series function.
+    1. Queries cannot be rolled up if the filter contains a time series function.
     2. Roll up is not yet supported for queries having join clause or order by functions.
   
\ No newline at end of file
diff --git a/index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/mergedata/CarbonDataFileMergeTestCaseOnSI.scala b/index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/mergedata/CarbonDataFileMergeTestCaseOnSI.scala
index 9eced78..00c7d4a 100644
--- a/index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/mergedata/CarbonDataFileMergeTestCaseOnSI.scala
+++ b/index/secondary-index/src/test/scala/org/apache/carbondata/spark/testsuite/mergedata/CarbonDataFileMergeTestCaseOnSI.scala
@@ -142,7 +142,7 @@ class CarbonDataFileMergeTestCaseOnSI
     checkAnswer(sql("""Select count(*) from nonindexmerge where name='n164419'"""), rows)
   }
 
-  test("Verify command of REBUILD INDEX command with invalid segments") {
+  test("Verify command of REFRESH INDEX command with invalid segments") {
     CarbonProperties.getInstance()
       .addProperty(CarbonCommonConstants.CARBON_SI_SEGMENT_MERGE, "false")
     sql("DROP TABLE IF EXISTS nonindexmerge")