Posted to commits@carbondata.apache.org by ch...@apache.org on 2021/08/08 12:00:34 UTC

[carbondata] branch master updated: Update quick-start-guide.md

This is an automated email from the ASF dual-hosted git repository.

chenliang613 pushed a commit to branch master
in repository https://gitbox.apache.org/repos/asf/carbondata.git


The following commit(s) were added to refs/heads/master by this push:
     new 926b67b  Update quick-start-guide.md
     new fac48be  [CARBONDATA-4267][Doc][summer-2021]Update and modify some content in quick-start-guide.md This closes #4197
926b67b is described below

commit 926b67b906c8df2987b42a2f31a2659913695fa3
Author: CHEN XIN <74...@users.noreply.github.com>
AuthorDate: Thu Aug 5 19:57:12 2021 +0800

    Update quick-start-guide.md
    
    Modify minor errors and correct some misunderstandings in the document
    
    Create quick-start-guide.md
---
 docs/quick-start-guide.md | 32 ++++++++++++++++++++++++++------
 1 file changed, 26 insertions(+), 6 deletions(-)

diff --git a/docs/quick-start-guide.md b/docs/quick-start-guide.md
index 62f5f42..4782917 100644
--- a/docs/quick-start-guide.md
+++ b/docs/quick-start-guide.md
@@ -161,12 +161,23 @@ Start Spark shell by running the following command in the Spark directory:
 ```
 ./bin/spark-shell --conf spark.sql.extensions=org.apache.spark.sql.CarbonExtensions --jars <carbondata assembly jar path>
 ```
+
+In this shell, the SparkSession is readily available as `spark` and the Spark context as `sc`.
+
+To create a SparkSession explicitly, configure it in the following manner:
+
+* Import the following:
+
+```
+import org.apache.spark.sql.SparkSession
+```
+
 **NOTE** 
 - In this flow, we can use the built-in SparkSession `spark` instead of `carbon`.
   We can also create a new SparkSession instead of the built-in SparkSession `spark` if needed.
   This requires adding "org.apache.spark.sql.CarbonExtensions" to the Spark configuration "spark.sql.extensions".
    ```
-   SparkSession newSpark = SparkSession
+   val spark = SparkSession
      .builder()
      .config(sc.getConf)
      .enableHiveSupport
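
The builder snippet in the NOTE above is cut off at the hunk boundary. For context, a complete sketch of creating such a session might look like the following; this is an illustrative reconstruction based on the surrounding guide, not part of the commit:

```
// Sketch only: build a new SparkSession with CarbonExtensions enabled.
// Assumes a spark-shell context where `sc` (the SparkContext) already exists.
import org.apache.spark.sql.SparkSession

val spark = SparkSession
  .builder()
  .config(sc.getConf)
  .config("spark.sql.extensions", "org.apache.spark.sql.CarbonExtensions")
  .enableHiveSupport()
  .getOrCreate()
```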
@@ -178,6 +189,8 @@ Start Spark shell by running the following command in the Spark directory:
 #### Executing Queries
 
 ###### Creating a Table
+**NOTE**:
+We use the built-in SparkSession `spark` in the following examples.
 
 ```
 carbon.sql(
@@ -205,7 +218,9 @@ We suggest to use CarbonExtensions instead of CarbonSession.
 ###### Loading Data to a Table
 
 ```
-carbon.sql("LOAD DATA INPATH '/path/to/sample.csv' INTO TABLE test_table")
+carbon.sql("LOAD DATA INPATH '/local-path/sample.csv' INTO TABLE test_table")
+
+carbon.sql("LOAD DATA INPATH 'hdfs://hdfs-path/sample.csv' INTO TABLE test_table")
 ```
 
 **NOTE**: Please provide the real file path of `sample.csv` for the above script. 
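
After loading, a quick sanity check is to query the table. This sketch assumes the `carbon` session and the `test_table` created earlier in the guide:

```
// Verify the load succeeded by printing the table contents.
carbon.sql("SELECT * FROM test_table").show()
```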
@@ -250,11 +265,14 @@ carbon.sql(
 
 6. In Spark node[master], configure the properties mentioned in the following table in `$SPARK_HOME/conf/spark-defaults.conf` file.
 
-| Property                        | Value                                                        | Description                                                  |
-| ------------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
+| Property                   | Value                                                        | Description                                                  |
+| -------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
 | spark.driver.extraJavaOptions   | `-Dcarbon.properties.filepath = $SPARK_HOME/conf/carbon.properties` | A string of extra JVM options to pass to the driver. For instance, GC settings or other logging. |
 | spark.executor.extraJavaOptions | `-Dcarbon.properties.filepath = $SPARK_HOME/conf/carbon.properties` | A string of extra JVM options to pass to executors. For instance, GC settings or other logging. **NOTE**: You can enter multiple values separated by space. |
 
+
+**NOTE**: Please replace `$SPARK_HOME` with the real SPARK_HOME directory path in the values above, and do not put spaces around the `=` in the 'Value' column.
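
To illustrate the note above, a hypothetical `spark-defaults.conf` fragment with a concrete path (`/opt/spark` is an assumed install location, not from the commit):

```
# Assumed SPARK_HOME of /opt/spark; note there are no spaces around '='
spark.driver.extraJavaOptions   -Dcarbon.properties.filepath=/opt/spark/conf/carbon.properties
spark.executor.extraJavaOptions -Dcarbon.properties.filepath=/opt/spark/conf/carbon.properties
```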
+
 7. Verify the installation. For example:
 
 ```
@@ -298,8 +316,8 @@ mv carbondata.tar.gz carbonlib/
 
 4. Configure the properties mentioned in the following table in `$SPARK_HOME/conf/spark-defaults.conf` file.
 
-| Property                        | Description                                                  | Value                                                        |
-| ------------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------ |
+| Property                        | Description                                                                                      | Value                                                        |
+| ------------------------------- | ------------------------------------------------------------------------------------------------ | ------------------------------------------------------------ |
 | spark.master                    | Set this value to run the Spark in yarn cluster mode.        | Set yarn-client to run the Spark in yarn cluster mode.       |
 | spark.yarn.dist.files           | Comma-separated list of files to be placed in the working directory of each executor. | `$SPARK_HOME/conf/carbon.properties`                         |
 | spark.yarn.dist.archives        | Comma-separated list of archives to be extracted into the working directory of each executor. | `$SPARK_HOME/carbonlib/carbondata.tar.gz`                    |
@@ -308,6 +326,8 @@ mv carbondata.tar.gz carbonlib/
 | spark.driver.extraClassPath     | Extra classpath entries to prepend to the classpath of the driver. **NOTE**: If SPARK_CLASSPATH is defined in spark-env.sh, then comment it and append the value in below parameter spark.driver.extraClassPath. | `$SPARK_HOME/carbonlib/*`                                    |
 | spark.driver.extraJavaOptions   | A string of extra JVM options to pass to the driver. For instance, GC settings or other logging. | `-Dcarbon.properties.filepath = $SPARK_HOME/conf/carbon.properties` |
 
+**NOTE**: Please replace `$SPARK_HOME` with the real SPARK_HOME directory path in the values above, and do not put spaces around the `=` in the 'Value' column.
+
 5. Verify the installation.
 
 ```