Posted to commits@carbondata.apache.org by ra...@apache.org on 2018/10/09 15:50:44 UTC

[42/45] carbondata git commit: [Documentation] Readme updated with latest topics and new TOC

[Documentation] Readme updated with latest topics and new TOC

Readme updated with the new structure
Formatting issue fixed
Review comments fixed

This closes #2788


Project: http://git-wip-us.apache.org/repos/asf/carbondata/repo
Commit: http://git-wip-us.apache.org/repos/asf/carbondata/commit/ca30ad97
Tree: http://git-wip-us.apache.org/repos/asf/carbondata/tree/ca30ad97
Diff: http://git-wip-us.apache.org/repos/asf/carbondata/diff/ca30ad97

Branch: refs/heads/branch-1.5
Commit: ca30ad97da020daceb49669fba454a4346241935
Parents: d392717
Author: sgururajshetty <sg...@gmail.com>
Authored: Fri Sep 28 19:13:08 2018 +0530
Committer: kunal642 <ku...@gmail.com>
Committed: Fri Oct 5 15:00:13 2018 +0530

----------------------------------------------------------------------
 README.md                                      |  33 ++--
 docs/carbon-as-spark-datasource-guide.md       |  29 ++--
 docs/configuration-parameters.md               | 158 ++++++++++----------
 docs/datamap-developer-guide.md                |   4 +-
 docs/datamap/bloomfilter-datamap-guide.md      |   6 +-
 docs/datamap/datamap-management.md             |   6 +-
 docs/datamap/lucene-datamap-guide.md           |   4 +-
 docs/datamap/preaggregate-datamap-guide.md     |   2 +-
 docs/ddl-of-carbondata.md                      |  97 +++++++-----
 docs/dml-of-carbondata.md                      |   6 +-
 docs/documentation.md                          |   2 +-
 docs/faq.md                                    |   6 +-
 docs/file-structure-of-carbondata.md           |   2 +-
 docs/how-to-contribute-to-apache-carbondata.md |   4 +-
 docs/introduction.md                           |  20 +--
 docs/language-manual.md                        |   2 +
 docs/performance-tuning.md                     |  10 +-
 docs/quick-start-guide.md                      |   6 +-
 docs/s3-guide.md                               |   2 +-
 docs/streaming-guide.md                        |   6 +-
 docs/usecases.md                               |  32 ++--
 21 files changed, 229 insertions(+), 208 deletions(-)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/carbondata/blob/ca30ad97/README.md
----------------------------------------------------------------------
diff --git a/README.md b/README.md
index ba2cbf7..87bb71e 100644
--- a/README.md
+++ b/README.md
@@ -45,23 +45,26 @@ CarbonData file format is a columnar store in HDFS, it has many features that a
 CarbonData is built using Apache Maven, to [build CarbonData](https://github.com/apache/carbondata/blob/master/build)
 
 ## Online Documentation
+* [What is CarbonData](https://github.com/apache/carbondata/blob/master/docs/introduction.md)
 * [Quick Start](https://github.com/apache/carbondata/blob/master/docs/quick-start-guide.md)
-* [CarbonData File Structure](https://github.com/apache/carbondata/blob/master/docs/file-structure-of-carbondata.md)
-* [Data Types](https://github.com/apache/carbondata/blob/master/docs/supported-data-types-in-carbondata.md)
-* [Data Management on CarbonData](https://github.com/apache/carbondata/blob/master/docs/language-manual.md)
-* [Configuring Carbondata](https://github.com/apache/carbondata/blob/master/docs/configuration-parameters.md)
-* [Streaming Ingestion](https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md)
-* [SDK Guide](https://github.com/apache/carbondata/blob/master/docs/sdk-guide.md)
-* [S3 Guide](https://github.com/apache/carbondata/blob/master/docs/s3-guide.md)
-* [DataMap Developer Guide](https://github.com/apache/carbondata/blob/master/docs/datamap-developer-guide.md)
-* [CarbonData DataMap Management](https://github.com/apache/carbondata/blob/master/docs/datamap/datamap-management.md)
-* [CarbonData BloomFilter DataMap](https://github.com/apache/carbondata/blob/master/docs/datamap/bloomfilter-datamap-guide.md)
-* [CarbonData Lucene DataMap](https://github.com/apache/carbondata/blob/master/docs/datamap/lucene-datamap-guide.md)
-* [CarbonData Pre-aggregate DataMap](https://github.com/apache/carbondata/blob/master/docs/datamap/preaggregate-datamap-guide.md)
-* [CarbonData Timeseries DataMap](https://github.com/apache/carbondata/blob/master/docs/datamap/timeseries-datamap-guide.md)
-* [Performance Tuning](https://github.com/apache/carbondata/blob/master/docs/performance-tuning.md)
-* [FAQ](https://github.com/apache/carbondata/blob/master/docs/faq.md)
 * [Use Cases](https://github.com/apache/carbondata/blob/master/docs/usecases.md)
+* [Language Reference](https://github.com/apache/carbondata/blob/master/docs/language-manual.md)
+ * [CarbonData Data Definition Language](https://github.com/apache/carbondata/blob/master/docs/ddl-of-carbondata.md) 
+ * [CarbonData Data Manipulation Language](https://github.com/apache/carbondata/blob/master/docs/dml-of-carbondata.md) 
+ * [CarbonData Streaming Ingestion](https://github.com/apache/carbondata/blob/master/docs/streaming-guide.md) 
+ * [Configuring CarbonData](https://github.com/apache/carbondata/blob/master/docs/configuration-parameters.md) 
+ * [DataMap Developer Guide](https://github.com/apache/carbondata/blob/master/docs/datamap-developer-guide.md) 
+ * [Data Types](https://github.com/apache/carbondata/blob/master/docs/supported-data-types-in-carbondata.md) 
+* [CarbonData DataMap Management](https://github.com/apache/carbondata/blob/master/docs/datamap/datamap-management.md)
+ * [CarbonData BloomFilter DataMap](https://github.com/apache/carbondata/blob/master/docs/datamap/bloomfilter-datamap-guide.md)
+ * [CarbonData Lucene DataMap](https://github.com/apache/carbondata/blob/master/docs/datamap/lucene-datamap-guide.md)
+ * [CarbonData Pre-aggregate DataMap](https://github.com/apache/carbondata/blob/master/docs/datamap/preaggregate-datamap-guide.md)
+ * [CarbonData Timeseries DataMap](https://github.com/apache/carbondata/blob/master/docs/datamap/timeseries-datamap-guide.md)
+* [SDK Guide](https://github.com/apache/carbondata/blob/master/docs/sdk-guide.md) 
+* [Performance Tuning](https://github.com/apache/carbondata/blob/master/docs/performance-tuning.md) 
+* [S3 Storage](https://github.com/apache/carbondata/blob/master/docs/s3-guide.md) 
+* [Carbon as Spark's Datasource](https://github.com/apache/carbondata/blob/master/docs/carbon-as-spark-datasource-guide.md) 
+* [FAQs](https://github.com/apache/carbondata/blob/master/docs/faq.md) 
 
 ## Other Technical Material
 * [Apache CarbonData meetup material](https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=66850609)
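
The datasource guide changed in this commit (docs/carbon-as-spark-datasource-guide.md) documents a `CREATE TABLE ... USING CARBON` statement with an optional `OPTIONS (key=val, ...)` clause. As a rough illustration of how such a statement is assembled, here is a minimal Python sketch. The helper `build_create_table` is hypothetical and is not part of CarbonData or Spark; it only formats a string. The option key is spelled `table_blocksize` as in the guide's property table, which is an assumption about the spelling the datasource accepts.

```python
# Hypothetical helper, not a CarbonData or Spark API: it only builds the
# DDL text described in the datasource guide.


def build_create_table(table, columns, options=None, if_not_exists=False):
    """Return a CREATE TABLE ... USING CARBON statement.

    columns: list of (name, type) pairs, e.g. [("name", "STRING")]
    options: optional dict of table properties, rendered as 'key'='value'
    """
    if not columns:
        raise ValueError("at least one column is required")

    column_list = ", ".join(f"{name} {col_type}" for name, col_type in columns)

    parts = ["CREATE TABLE"]
    if if_not_exists:
        parts.append("IF NOT EXISTS")
    parts.append(f"{table} ({column_list}) USING CARBON")

    if options:
        # Keys and values are quoted with plain ASCII single quotes.
        option_list = ", ".join(f"'{k}'='{v}'" for k, v in options.items())
        parts.append(f"OPTIONS({option_list})")

    return " ".join(parts)


# Key spelling taken from the guide's property table (an assumption).
ddl = build_create_table(
    "carbon_table",
    [("name", "STRING")],
    options={"table_blocksize": "256"},
)
print(ddl)
```

On a Spark session that has the CarbonData datasource on its classpath, the resulting string could presumably be passed to `spark.sql(ddl)`; that step is not exercised here.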

http://git-wip-us.apache.org/repos/asf/carbondata/blob/ca30ad97/docs/carbon-as-spark-datasource-guide.md
----------------------------------------------------------------------
diff --git a/docs/carbon-as-spark-datasource-guide.md b/docs/carbon-as-spark-datasource-guide.md
index 1d286cf..bc56a54 100644
--- a/docs/carbon-as-spark-datasource-guide.md
+++ b/docs/carbon-as-spark-datasource-guide.md
@@ -15,19 +15,20 @@
     limitations under the License.
 -->
 
-# Carbon as Spark's datasource guide
+# CarbonData as Spark's Datasource
 
-Carbon fileformat can be integrated to Spark using datasource to read and write data without using CarbonSession.
+The CarbonData file format is now integrated as a Spark datasource for read and write operations without using CarbonSession. This is useful for users who want to use CarbonData as Spark's data source.
 
+**Note:** You can only apply the functions/features supported by the Spark datasource APIs; the supported functionality is similar to Parquet. CarbonSession features are not supported.
 
 # Create Table with DDL
 
-Carbon table can be created with spark's datasource DDL syntax as follows.
+You can now create a Carbon table using Spark's datasource DDL syntax.
 
 ```
  CREATE [TEMPORARY] TABLE [IF NOT EXISTS] [db_name.]table_name
      [(col_name1 col_type1 [COMMENT col_comment1], ...)]
-     USING carbon
+     USING CARBON
      [OPTIONS (key1=val1, key2=val2, ...)]
      [PARTITIONED BY (col_name1, col_name2, ...)]
      [CLUSTERED BY (col_name3, col_name4, ...) INTO num_buckets BUCKETS]
@@ -41,25 +42,23 @@ Carbon table can be created with spark's datasource DDL syntax as follows.
 
 | Property | Default Value | Description |
 |-----------|--------------|------------|
-| table_blocksize | 1024 | Size of blocks to write onto hdfs |
-| table_blocklet_size | 64 | Size of blocklet to write |
-| local_dictionary_threshold | 10000 | Cardinality upto which the local dictionary can be generated  |
-| local_dictionary_enable | false | Enable local dictionary generation  |
-| sort_columns | all dimensions are sorted | comma separated string columns which to include in sort and its order of sort |
-| sort_scope | local_sort | Sort scope of the load.Options include no sort, local sort ,batch sort and global sort |
-| long_string_columns | null | comma separated string columns which are more than 32k length |
+| table_blocksize | 1024 | Size of blocks to write onto HDFS. For more details, see [Table Block Size Configuration](./ddl-of-carbondata.md#table-block-size-configuration). |
+| table_blocklet_size | 64 | Size of blocklet to write. |
+| local_dictionary_threshold | 10000 | Cardinality up to which the local dictionary can be generated. For more details, see [Local Dictionary Configuration](./ddl-of-carbondata.md#local-dictionary-configuration). |
+| local_dictionary_enable | false | Enable local dictionary generation. For more details, see [Local Dictionary Configuration](./ddl-of-carbondata.md#local-dictionary-configuration). |
+| sort_columns | all dimensions are sorted | Columns to include in the sort, in sort order. For more details, see [Sort Columns Configuration](./ddl-of-carbondata.md#sort-columns-configuration). |
+| sort_scope | local_sort | Sort scope of the load. Options include no sort, local sort, batch sort, and global sort. For more details, see [Sort Scope Configuration](./ddl-of-carbondata.md#sort-scope-configuration). |
+| long_string_columns | null | Comma-separated string/char/varchar columns that are longer than 32k characters. For more details, see [String longer than 32000 characters](./ddl-of-carbondata.md#string-longer-than-32000-characters). |
 
 ## Example 
 
 ```
- CREATE TABLE CARBON_TABLE (NAME  STRING) USING CARBON OPTIONS(‘table_block_size’=’256’)
+ CREATE TABLE CARBON_TABLE (NAME  STRING) USING CARBON OPTIONS('table_block_size'='256')
 ```
 
-Note: User can only apply the features of what spark datasource like parquet supports. It cannot support the features of carbon session like IUD, compaction etc. 
-
 # Using DataFrame
 
-Carbon format can be used in dataframe also using the following way.
+The Carbon format can also be used with DataFrames. The following are the ways to use it.
 
 Write carbon using dataframe 
 ```