Posted to commits@zeppelin.apache.org by zj...@apache.org on 2021/02/11 08:00:07 UTC

[zeppelin] branch branch-0.9 updated: [ZEPPELIN-5241] Typos in spark tutorial

This is an automated email from the ASF dual-hosted git repository.

zjffdu pushed a commit to branch branch-0.9
in repository https://gitbox.apache.org/repos/asf/zeppelin.git


The following commit(s) were added to refs/heads/branch-0.9 by this push:
     new e7ef666  [ZEPPELIN-5241] Typos in spark tutorial
e7ef666 is described below

commit e7ef666fef0bd595543c62787450ee7e734bf509
Author: OmriK <om...@dynamicyield.com>
AuthorDate: Sun Feb 7 18:12:06 2021 +0200

    [ZEPPELIN-5241] Typos in spark tutorial
    
    ### What is this PR for?
    Fixing some typos in the Spark tutorial notebooks
    
    ### What type of PR is it?
    Documentation
    
    ### Todos
    * [x] - Task
    
    ### What is the Jira issue?
    [ZEPPELIN-5241](https://issues.apache.org/jira/browse/ZEPPELIN-5241)
    
    ### How should this be tested?
    * Standard CI tests
    
    ### Screenshots (if appropriate)
    
    ### Questions:
    * Do the license files need updating? - no
    * Are there breaking changes for older versions? - no
    * Does this need documentation? - no
    
    Author: OmriK <om...@dynamicyield.com>
    
    Closes #4048 from omrisk/typos_in_spark_tutorial and squashes the following commits:
    
    d85861463 [OmriK] Checked part 1
    
    (cherry picked from commit f3bdd4a1fa0cf19bc1015955d8ade4bc79a8e16f)
    Signed-off-by: Jeff Zhang <zj...@apache.org>
---
 .... Spark Interpreter Introduction_2F8KN6TKK.zpln | 26 +++++++++++-----------
 .../3. Spark SQL (PySpark)_2EWM84JXA.zpln          | 10 ++++-----
 .../3. Spark SQL (Scala)_2EYUV26VR.zpln            | 14 ++++++------
 .../Spark Tutorial/4. Spark MlLib_2EZFM3GJA.zpln   |  2 +-
 4 files changed, 26 insertions(+), 26 deletions(-)

diff --git a/notebook/Spark Tutorial/1. Spark Interpreter Introduction_2F8KN6TKK.zpln b/notebook/Spark Tutorial/1. Spark Interpreter Introduction_2F8KN6TKK.zpln
index d085d9f..0f3cb7a 100644
--- a/notebook/Spark Tutorial/1. Spark Interpreter Introduction_2F8KN6TKK.zpln	
+++ b/notebook/Spark Tutorial/1. Spark Interpreter Introduction_2F8KN6TKK.zpln	
@@ -2,7 +2,7 @@
   "paragraphs": [
     {
       "title": "",
-      "text": "%md\n\n# Introduction\n\nThis tutorial is for how to use Spark Interpreter in Zeppelin.\n\n1. Specify `SPARK_HOME` in interpreter setting. If you don\u0027t specify `SPARK_HOME`, Zeppelin will use the embedded spark which can only run in local mode. And some advanced features may not work in this embedded spark.\n2. Specify `spark.master` for spark execution mode.\n    * `local[*]`  - Driver and Executor would both run in the same host of zeppelin server. It is only for te [...]
+      "text": "%md\n\n# Introduction\n\nThis tutorial is for how to use Spark Interpreter in Zeppelin.\n\n1. Specify `SPARK_HOME` in interpreter setting. If you don\u0027t specify `SPARK_HOME`, Zeppelin will use the embedded spark which can only run in local mode. And some advanced features may not work in this embedded spark.\n2. Specify `spark.master` for spark execution mode.\n    * `local[*]`  - Driver and Executor would both run in the same host of zeppelin server. It is only for te [...]
       "user": "anonymous",
       "dateUpdated": "2020-05-04 13:44:39.482",
       "config": {
@@ -29,7 +29,7 @@
         "msg": [
           {
             "type": "HTML",
-            "data": "\u003cdiv class\u003d\"markdown-body\"\u003e\n\u003ch1\u003eIntroduction\u003c/h1\u003e\n\u003cp\u003eThis tutorial is for how to use Spark Interpreter in Zeppelin.\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003eSpecify \u003ccode\u003eSPARK_HOME\u003c/code\u003e in interpreter setting. If you don\u0026rsquo;t specify \u003ccode\u003eSPARK_HOME\u003c/code\u003e, Zeppelin will use the embedded spark which can only run in local mode. And some advanced features may not wo [...]
+            "data": "\u003cdiv class\u003d\"markdown-body\"\u003e\n\u003ch1\u003eIntroduction\u003c/h1\u003e\n\u003cp\u003eThis tutorial is for how to use Spark Interpreter in Zeppelin.\u003c/p\u003e\n\u003col\u003e\n\u003cli\u003eSpecify \u003ccode\u003eSPARK_HOME\u003c/code\u003e in interpreter setting. If you don\u0026rsquo;t specify \u003ccode\u003eSPARK_HOME\u003c/code\u003e, Zeppelin will use the embedded spark which can only run in local mode. And some advanced features may not wo [...]
           }
         ]
       },
@@ -44,7 +44,7 @@
     },
     {
       "title": "Use Generic Inline Configuration instead of Interpreter Setting",
-      "text": "%md\n\nCustomize your spark interpreter is indispensible for Zeppelin Notebook. E.g. You want to add third party jars, change the execution mode, change the number of exceutor or its memory and etc. You can check this link for all the available [spark configuration](http://spark.apache.org/docs/latest/configuration.html)\nAlthough you can customize these in interpreter setting, it is recommended to do via the generic inline configuration. Because interpreter setting is sha [...]
+      "text": "%md\n\nCustomize your spark interpreter is indispensable for Zeppelin Notebook. E.g. You want to add third party jars, change the execution mode, change the number of executor or its memory and etc. You can check this link for all the available [spark configuration](http://spark.apache.org/docs/latest/configuration.html)\nAlthough you can customize these in interpreter setting, it is recommended to do via the generic inline configuration. Because interpreter setting is sha [...]
       "user": "anonymous",
       "dateUpdated": "2020-05-04 13:45:44.204",
       "config": {
@@ -72,7 +72,7 @@
         "msg": [
           {
             "type": "HTML",
-            "data": "\u003cdiv class\u003d\"markdown-body\"\u003e\n\u003cp\u003eCustomize your spark interpreter is indispensible for Zeppelin Notebook. E.g. You want to add third party jars, change the execution mode, change the number of exceutor or its memory and etc. You can check this link for all the available \u003ca href\u003d\"http://spark.apache.org/docs/latest/configuration.html\"\u003espark configuration\u003c/a\u003e\u003cbr /\u003e\nAlthough you can customize these in inter [...]
+            "data": "\u003cdiv class\u003d\"markdown-body\"\u003e\n\u003cp\u003eCustomize your spark interpreter is indispensable for Zeppelin Notebook. E.g. You want to add third party jars, change the execution mode, change the number of executor or its memory and etc. You can check this link for all the available \u003ca href\u003d\"http://spark.apache.org/docs/latest/configuration.html\"\u003espark configuration\u003c/a\u003e\u003cbr /\u003e\nAlthough you can customize these in inter [...]
           }
         ]
       },
@@ -87,7 +87,7 @@
     },
     {
       "title": "Generic Inline Configuration",
-      "text": "%spark.conf\n\nSPARK_HOME  \u003cPATH_TO_SPAKR_HOME\u003e\n\n# set driver memrory to 8g\nspark.driver.memory 8g\n\n# set executor number to be 6\nspark.executor.instances  6\n\n# set executor memrory 4g\nspark.executor.memory  4g\n\n# Any other spark properties can be set here. Here\u0027s avaliable spark configruation you can set. (http://spark.apache.org/docs/latest/configuration.html)\n",
+      "text": "%spark.conf\n\nSPARK_HOME  \u003cPATH_TO_SPARK_HOME\u003e\n\n# set driver memory to 8g\nspark.driver.memory 8g\n\n# set executor number to be 6\nspark.executor.instances  6\n\n# set executor memory 4g\nspark.executor.memory  4g\n\n# Any other spark properties can be set here. Here\u0027s avaliable spark configruation you can set. (http://spark.apache.org/docs/latest/configuration.html)\n",
       "user": "anonymous",
       "dateUpdated": "2020-04-30 10:56:30.840",
       "config": {
@@ -145,7 +145,7 @@
         "msg": [
           {
             "type": "HTML",
-            "data": "\u003cdiv class\u003d\"markdown-body\"\u003e\n\u003cp\u003eThere\u0026rsquo;re 2 ways to add third party libraries.\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003e\u003ccode\u003eGeneric Inline Configuration\u003c/code\u003e   It is the recommended way to add third party jars/packages. Use \u003ccode\u003espark.jars\u003c/code\u003e for adding local jar file and \u003ccode\u003espark.jars.packages\u003c/code\u003e for adding packages\u003c/li\u003e\n\u003cli\u003e\u003 [...]
+            "data": "\u003cdiv class\u003d\"markdown-body\"\u003e\n\u003cp\u003eThere\u0026rsquo;re 2 ways to add third party libraries.\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003e\u003ccode\u003eGeneric Inline Configuration\u003c/code\u003e   It is the recommended way to add third party jars/packages. Use \u003ccode\u003espark.jars\u003c/code\u003e for adding local jar file and \u003ccode\u003espark.jars.packages\u003c/code\u003e for adding packages\u003c/li\u003e\n\u003cli\u003e\u003 [...]
           }
         ]
       },
@@ -160,7 +160,7 @@
     },
     {
       "title": "",
-      "text": "%spark.conf\n\n# Must set SPARK_HOME for this example, because it won\u0027t work for Zeppelin\u0027s embedded spark mode. The embedded spark mode doesn\u0027t \n# use spark-submit to launch spark interpreter, so spark.jars and spark.jars.packages won\u0027t take affect. \nSPARK_HOME \u003cPATH_TO_SPAKR_HOME\u003e\n\n# set execution mode\nmaster yarn-client\n\n# spark.jars can be used for adding any local jar files into spark interpreter\n# spark.jars  \u003cpath_to_local_ [...]
+      "text": "%spark.conf\n\n# Must set SPARK_HOME for this example, because it won\u0027t work for Zeppelin\u0027s embedded spark mode. The embedded spark mode doesn\u0027t \n# use spark-submit to launch spark interpreter, so spark.jars and spark.jars.packages won\u0027t take affect. \nSPARK_HOME \u003cPATH_TO_SPARK_HOME\u003e\n\n# set execution mode\nmaster yarn-client\n\n# spark.jars can be used for adding any local jar files into spark interpreter\n# spark.jars  \u003cpath_to_local_ [...]
       "user": "anonymous",
       "dateUpdated": "2020-04-30 11:01:36.681",
       "config": {
@@ -272,7 +272,7 @@
     },
     {
       "title": "Code Completion in Scala",
-      "text": "%md\n\nSpark interpreter provide code completion feature. As long as you type `tab`, code completion will start to work and provide you with a list of candiates. Here\u0027s one screenshot of how it works. \n\n**To be noticed**, code completion only works after spark interpreter is launched. So it will not work when you type code in the first paragraph as the spark interpreter is not launched yet. For me, usually I will run one simple code such as `sc.version` to launch sp [...]
+      "text": "%md\n\nSpark interpreter provide code completion feature. As long as you type `tab`, code completion will start to work and provide you with a list of candidates. Here\u0027s one screenshot of how it works. \n\n**To be noticed**, code completion only works after spark interpreter is launched. So it will not work when you type code in the first paragraph as the spark interpreter is not launched yet. For me, usually I will run one simple code such as `sc.version` to launch s [...]
       "user": "anonymous",
       "dateUpdated": "2020-04-30 11:03:03.127",
       "config": {
@@ -300,7 +300,7 @@
         "msg": [
           {
             "type": "HTML",
-            "data": "\u003cdiv class\u003d\"markdown-body\"\u003e\n\u003cp\u003eSpark interpreter provide code completion feature. As long as you type \u003ccode\u003etab\u003c/code\u003e, code completion will start to work and provide you with a list of candiates. Here\u0026rsquo;s one screenshot of how it works.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTo be noticed\u003c/strong\u003e, code completion only works after spark interpreter is launched. So it will not work when you typ [...]
+            "data": "\u003cdiv class\u003d\"markdown-body\"\u003e\n\u003cp\u003eSpark interpreter provide code completion feature. As long as you type \u003ccode\u003etab\u003c/code\u003e, code completion will start to work and provide you with a list of candidates. Here\u0026rsquo;s one screenshot of how it works.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eTo be noticed\u003c/strong\u003e, code completion only works after spark interpreter is launched. So it will not work when you ty [...]
           }
         ]
       },
@@ -315,7 +315,7 @@
     },
     {
       "title": "PySpark",
-      "text": "%md\n\nFor using PySpark, you need to do some other pyspark configration besides the above spark configuration we mentioned before. The most important property you need to set is python path for both driver and executor. If you hit the following error, it means your python on driver is mismatched with that of executor. In this case you need to check the 2 properties: `PYSPARK_PYTHON` and `PYSPARK_DRIVER_PYTHON`. (You can use `spark.pyspark.python` and `spark.pyspark.driver [...]
+      "text": "%md\n\nFor using PySpark, you need to do some other pyspark configuration besides the above spark configuration we mentioned before. The most important property you need to set is python path for both driver and executor. If you hit the following error, it means your python on driver is mismatched with that of executor. In this case you need to check the 2 properties: `PYSPARK_PYTHON` and `PYSPARK_DRIVER_PYTHON`. (You can use `spark.pyspark.python` and `spark.pyspark.drive [...]
       "user": "anonymous",
       "dateUpdated": "2020-04-30 11:04:18.086",
       "config": {
@@ -343,7 +343,7 @@
         "msg": [
           {
             "type": "HTML",
-            "data": "\u003cdiv class\u003d\"markdown-body\"\u003e\n\u003cp\u003eFor using PySpark, you need to do some other pyspark configration besides the above spark configuration we mentioned before. The most important property you need to set is python path for both driver and executor. If you hit the following error, it means your python on driver is mismatched with that of executor. In this case you need to check the 2 properties: \u003ccode\u003ePYSPARK_PYTHON\u003c/code\u003e a [...]
+            "data": "\u003cdiv class\u003d\"markdown-body\"\u003e\n\u003cp\u003eFor using PySpark, you need to do some other pyspark configuration besides the above spark configuration we mentioned before. The most important property you need to set is python path for both driver and executor. If you hit the following error, it means your python on driver is mismatched with that of executor. In this case you need to check the 2 properties: \u003ccode\u003ePYSPARK_PYTHON\u003c/code\u003e  [...]
           }
         ]
       },
@@ -392,7 +392,7 @@
     },
     {
       "title": "Use IPython",
-      "text": "%md\n\nStarting from Zeppelin 0.8.0, `ipython` is integrated into Zeppelin. And `PySparkInterpreter`(`%spark.pyspark`) would use `ipython` if it is avalible. It is recommended to use `ipython` interpreter as it provides more powerful feature than the old PythonInterpreter. Spark create a new interpreter called `IPySparkInterpreter` (`%spark.ipyspark`) which use IPython underneath. You can use all the `ipython` features in this IPySparkInterpreter. There\u0027s one ipython  [...]
+      "text": "%md\n\nStarting from Zeppelin 0.8.0, `ipython` is integrated into Zeppelin. And `PySparkInterpreter`(`%spark.pyspark`) would use `ipython` if it is available. It is recommended to use `ipython` interpreter as it provides more powerful feature than the old PythonInterpreter. Spark create a new interpreter called `IPySparkInterpreter` (`%spark.ipyspark`) which use IPython underneath. You can use all the `ipython` features in this IPySparkInterpreter. There\u0027s one ipython [...]
       "user": "anonymous",
       "dateUpdated": "2020-04-30 11:10:07.426",
       "config": {
@@ -420,7 +420,7 @@
         "msg": [
           {
             "type": "HTML",
-            "data": "\u003cdiv class\u003d\"markdown-body\"\u003e\n\u003cp\u003eStarting from Zeppelin 0.8.0, \u003ccode\u003eipython\u003c/code\u003e is integrated into Zeppelin. And \u003ccode\u003ePySparkInterpreter\u003c/code\u003e(\u003ccode\u003e%spark.pyspark\u003c/code\u003e) would use \u003ccode\u003eipython\u003c/code\u003e if it is avalible. It is recommended to use \u003ccode\u003eipython\u003c/code\u003e interpreter as it provides more powerful feature than the old PythonInt [...]
+            "data": "\u003cdiv class\u003d\"markdown-body\"\u003e\n\u003cp\u003eStarting from Zeppelin 0.8.0, \u003ccode\u003eipython\u003c/code\u003e is integrated into Zeppelin. And \u003ccode\u003ePySparkInterpreter\u003c/code\u003e(\u003ccode\u003e%spark.pyspark\u003c/code\u003e) would use \u003ccode\u003eipython\u003c/code\u003e if it is available. It is recommended to use \u003ccode\u003eipython\u003c/code\u003e interpreter as it provides more powerful feature than the old PythonIn [...]
           }
         ]
       },
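[Editor's note: the paragraphs patched above describe Zeppelin's generic inline configuration (%spark.conf), adding third-party jars via spark.jars / spark.jars.packages, and keeping the driver and executor Python consistent for PySpark. As an illustration only (not part of this commit), a combined %spark.conf paragraph using the property names from those paragraphs could look like the sketch below; every path and package coordinate is a placeholder to be replaced.]

    %spark.conf

    # use a real Spark distribution instead of Zeppelin's embedded Spark (placeholder path)
    SPARK_HOME /opt/spark

    # execution mode and basic resources, as in the "Generic Inline Configuration" paragraph
    master yarn-client
    spark.driver.memory 8g
    spark.executor.instances 6
    spark.executor.memory 4g

    # third-party dependencies: spark.jars for local jar files, spark.jars.packages for packages
    # spark.jars /path/to/local.jar            (placeholder)
    spark.jars.packages org.apache.commons:commons-lang3:3.9

    # keep driver and executor Python consistent for PySpark (placeholder paths)
    spark.pyspark.python /usr/bin/python3
    spark.pyspark.driver.python /usr/bin/python3
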
diff --git a/notebook/Spark Tutorial/3. Spark SQL (PySpark)_2EWM84JXA.zpln b/notebook/Spark Tutorial/3. Spark SQL (PySpark)_2EWM84JXA.zpln
index 53c5ca3..7802e98 100644
--- a/notebook/Spark Tutorial/3. Spark SQL (PySpark)_2EWM84JXA.zpln	
+++ b/notebook/Spark Tutorial/3. Spark SQL (PySpark)_2EWM84JXA.zpln	
@@ -2,7 +2,7 @@
   "paragraphs": [
     {
       "title": "Introduction",
-      "text": "%md\n\nThis is a tutorial for Spark SQL in PySpark (based on Spark 2.x).  First we need to clarifiy serveral concetps of Spark SQL\n\n* **SparkSession**   - This is the entry point of Spark SQL, you need use `SparkSession` to create DataFrame/Dataset, register UDF, query table and etc.\n* **DataFrame**      - There\u0027s no Dataset in PySpark, but only DataFrame. The DataFrame of PySpark is very similar with DataFrame concept of Pandas, but is distributed. \n",
+      "text": "%md\n\nThis is a tutorial for Spark SQL in PySpark (based on Spark 2.x).  First we need to clarify several concepts of Spark SQL\n\n* **SparkSession**   - This is the entry point of Spark SQL, you need use `SparkSession` to create DataFrame/Dataset, register UDF, query table and etc.\n* **DataFrame**      - There\u0027s no Dataset in PySpark, but only DataFrame. The DataFrame of PySpark is very similar with DataFrame concept of Pandas, but is distributed. \n",
       "user": "anonymous",
       "dateUpdated": "2020-03-11 11:16:37.393",
       "config": {
@@ -32,7 +32,7 @@
         "msg": [
           {
             "type": "HTML",
-            "data": "\u003cdiv class\u003d\"markdown-body\"\u003e\n\u003cp\u003eThis is a tutorial for Spark SQL in PySpark (based on Spark 2.x).  First we need to clarifiy serveral concetps of Spark SQL\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eSparkSession\u003c/strong\u003e   - This is the entry point of Spark SQL, you need use \u003ccode\u003eSparkSession\u003c/code\u003e to create DataFrame/Dataset, register UDF, query table and etc.\u003c/li\u003e\n\u003cli\u00 [...]
+            "data": "\u003cdiv class\u003d\"markdown-body\"\u003e\n\u003cp\u003eThis is a tutorial for Spark SQL in PySpark (based on Spark 2.x).  First we need to clarify several concepts of Spark SQL\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eSparkSession\u003c/strong\u003e   - This is the entry point of Spark SQL, you need use \u003ccode\u003eSparkSession\u003c/code\u003e to create DataFrame/Dataset, register UDF, query table and etc.\u003c/li\u003e\n\u003cli\u003e [...]
           }
         ]
       },
@@ -137,7 +137,7 @@
     },
     {
       "title": "Spark Configuration",
-      "text": "%spark.conf\n\n# It is strongly recommended to set SPARK_HOME explictly instead of using the embedded spark of Zeppelin. As the function of embedded spark of Zeppelin is limited and can only run in local mode.\n# SPARK_HOME \u003cyour_spark_dist_path\u003e\n\n# Uncomment the following line if you want to use yarn-cluster mode (It is recommended to use yarn-cluster mode from Zeppelin 0.8, as the driver will run on the remote host of yarn cluster which can mitigate memory pr [...]
+      "text": "%spark.conf\n\n# It is strongly recommended to set SPARK_HOME explicitly instead of using the embedded spark of Zeppelin. As the function of embedded spark of Zeppelin is limited and can only run in local mode.\n# SPARK_HOME \u003cyour_spark_dist_path\u003e\n\n# Uncomment the following line if you want to use yarn-cluster mode (It is recommended to use yarn-cluster mode from Zeppelin 0.8, as the driver will run on the remote host of yarn cluster which can mitigate memory p [...]
       "user": "anonymous",
       "dateUpdated": "2020-03-11 13:26:01.470",
       "config": {
@@ -713,7 +713,7 @@
     },
     {
       "title": "Visualize DataFrame/Dataset",
-      "text": "%md\n\nThere\u0027s 2 approaches to visuliaze DataFrame/Dataset in Zeppelin\n\n* Use SparkSQLInterpreter via `%spark.sql`\n* Use ZeppelinContext via `z.show`\n\n",
+      "text": "%md\n\nThere\u0027s 2 approaches to visualize DataFrame/Dataset in Zeppelin\n\n* Use SparkSQLInterpreter via `%spark.sql`\n* Use ZeppelinContext via `z.show`\n\n",
       "user": "anonymous",
       "dateUpdated": "2020-01-21 15:47:18.301",
       "config": {
@@ -743,7 +743,7 @@
         "msg": [
           {
             "type": "HTML",
-            "data": "\u003cdiv class\u003d\"markdown-body\"\u003e\n\u003cp\u003eThere\u0026rsquo;s 2 approaches to visuliaze DataFrame/Dataset in Zeppelin\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003eUse SparkSQLInterpreter via \u003ccode\u003e%spark.sql\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003eUse ZeppelinContext via \u003ccode\u003ez.show\u003c/code\u003e\u003c/li\u003e\n\u003c/ul\u003e\n\n\u003c/div\u003e"
+            "data": "\u003cdiv class\u003d\"markdown-body\"\u003e\n\u003cp\u003eThere\u0026rsquo;s 2 approaches to visualize DataFrame/Dataset in Zeppelin\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003eUse SparkSQLInterpreter via \u003ccode\u003e%spark.sql\u003c/code\u003e\u003c/li\u003e\n\u003cli\u003eUse ZeppelinContext via \u003ccode\u003ez.show\u003c/code\u003e\u003c/li\u003e\n\u003c/ul\u003e\n\n\u003c/div\u003e"
           }
         ]
       },
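[Editor's note: the "Visualize DataFrame/Dataset" paragraph corrected above names two ways to render a DataFrame in Zeppelin: the %spark.sql interpreter and ZeppelinContext's z.show. A minimal sketch of both, illustrative only and not part of the commit; the view name "people" and the sample rows are placeholders.]

    %spark

    // register a small DataFrame as a temp view so it can also be queried from %spark.sql
    val people = spark.createDataFrame(Seq((1, "andy", 20), (2, "jeff", 23))).toDF("id", "name", "age")
    people.createOrReplaceTempView("people")

    // ZeppelinContext renders the DataFrame as an interactive table/chart
    z.show(people)

    %spark.sql
    -- the SQL interpreter renders its result the same way
    select name, age from people order by age
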
diff --git a/notebook/Spark Tutorial/3. Spark SQL (Scala)_2EYUV26VR.zpln b/notebook/Spark Tutorial/3. Spark SQL (Scala)_2EYUV26VR.zpln
index 7ad5809..da4002f 100644
--- a/notebook/Spark Tutorial/3. Spark SQL (Scala)_2EYUV26VR.zpln	
+++ b/notebook/Spark Tutorial/3. Spark SQL (Scala)_2EYUV26VR.zpln	
@@ -2,7 +2,7 @@
   "paragraphs": [
     {
       "title": "Introduction",
-      "text": "%md\n\nThis is a tutorial for Spark SQL in scala (based on Spark 2.x).  First we need to clarifiy serveral basic concepts of Spark SQL\n\n* **SparkSession**   - This is the entry point of Spark SQL, you need use `SparkSession` to create DataFrame/Dataset, register UDF, query table and etc.\n* **Dataset**        - Dataset is the core abstraction of Spark SQL. Underneath Dataset is RDD, but Dataset know more about your data, specifically its structure, so that Dataset could  [...]
+      "text": "%md\n\nThis is a tutorial for Spark SQL in scala (based on Spark 2.x).  First we need to clarify several basic concepts of Spark SQL\n\n* **SparkSession**   - This is the entry point of Spark SQL, you need use `SparkSession` to create DataFrame/Dataset, register UDF, query table and etc.\n* **Dataset**        - Dataset is the core abstraction of Spark SQL. Underneath Dataset is RDD, but Dataset know more about your data, specifically its structure, so that Dataset could do [...]
       "user": "anonymous",
       "dateUpdated": "2020-03-11 13:26:59.236",
       "config": {
@@ -32,7 +32,7 @@
         "msg": [
           {
             "type": "HTML",
-            "data": "\u003cdiv class\u003d\"markdown-body\"\u003e\n\u003cp\u003eThis is a tutorial for Spark SQL in scala (based on Spark 2.x).  First we need to clarifiy serveral basic concepts of Spark SQL\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eSparkSession\u003c/strong\u003e   - This is the entry point of Spark SQL, you need use \u003ccode\u003eSparkSession\u003c/code\u003e to create DataFrame/Dataset, register UDF, query table and etc.\u003c/li\u003e\n\u003cli [...]
+            "data": "\u003cdiv class\u003d\"markdown-body\"\u003e\n\u003cp\u003eThis is a tutorial for Spark SQL in scala (based on Spark 2.x).  First we need to clarify several basic concepts of Spark SQL\u003c/p\u003e\n\u003cul\u003e\n\u003cli\u003e\u003cstrong\u003eSparkSession\u003c/strong\u003e   - This is the entry point of Spark SQL, you need use \u003ccode\u003eSparkSession\u003c/code\u003e to create DataFrame/Dataset, register UDF, query table and etc.\u003c/li\u003e\n\u003cli\u [...]
           }
         ]
       },
@@ -140,7 +140,7 @@
     },
     {
       "title": "",
-      "text": "%spark.conf\n\n# It is strongly recommended to set SPARK_HOME explictly instead of using the embedded spark of Zeppelin. As the function of embedded spark of Zeppelin is limited and can only run in local mode.\n# SPARK_HOME \u003cyour_spark_dist_path\u003e\n\n# Uncomment the following line if you want to use yarn-cluster mode (It is recommended to use yarn-cluster mode after Zeppelin 0.8, as the driver will run on the remote host of yarn cluster which can mitigate memory p [...]
+      "text": "%spark.conf\n\n# It is strongly recommended to set SPARK_HOME explicitly instead of using the embedded spark of Zeppelin. As the function of embedded spark of Zeppelin is limited and can only run in local mode.\n# SPARK_HOME \u003cyour_spark_dist_path\u003e\n\n# Uncomment the following line if you want to use yarn-cluster mode (It is recommended to use yarn-cluster mode after Zeppelin 0.8, as the driver will run on the remote host of yarn cluster which can mitigate memory  [...]
       "user": "anonymous",
       "dateUpdated": "2020-03-11 13:28:13.784",
       "config": {
@@ -556,7 +556,7 @@
     },
     {
       "title": "Join on Single Field",
-      "text": "%spark\n\nval df1 \u003d spark.createDataFrame(Seq((1, \"andy\", 20, 1), (2, \"jeff\", 23, 2), (3, \"james\", 18, 3))).toDF(\"id\", \"name\", \"age\", \"c_id\")\ndf1.show()\n\nval df2 \u003d spark.createDataFrame(Seq((1, \"USA\"), (2, \"China\"))).toDF(\"c_id\", \"c_name\")\ndf2.show()\n\n// You can just specify the key name if join on the same key\nval df3 \u003d df1.join(df2, \"c_id\")\ndf3.show()\n\n// Or you can specify the join condition expclitly in case the key is d [...]
+      "text": "%spark\n\nval df1 \u003d spark.createDataFrame(Seq((1, \"andy\", 20, 1), (2, \"jeff\", 23, 2), (3, \"james\", 18, 3))).toDF(\"id\", \"name\", \"age\", \"c_id\")\ndf1.show()\n\nval df2 \u003d spark.createDataFrame(Seq((1, \"USA\"), (2, \"China\"))).toDF(\"c_id\", \"c_name\")\ndf2.show()\n\n// You can just specify the key name if join on the same key\nval df3 \u003d df1.join(df2, \"c_id\")\ndf3.show()\n\n// Or you can specify the join condition explicitly in case the key is  [...]
       "user": "anonymous",
       "dateUpdated": "2020-03-11 13:34:11.058",
       "config": {
@@ -600,7 +600,7 @@
     },
     {
       "title": "Join on Multiple Fields",
-      "text": "%spark\n\nval df1 \u003d spark.createDataFrame(Seq((\"andy\", 20, 1, 1), (\"jeff\", 23, 1, 2), (\"james\", 12, 2, 2))).toDF(\"name\", \"age\", \"key_1\", \"key_2\")\ndf1.show()\n\nval df2 \u003d spark.createDataFrame(Seq((1, 1, \"USA\"), (2, 2, \"China\"))).toDF(\"key_1\", \"key_2\", \"country\")\ndf2.show()\n\n// Join on 2 fields: key_1, key_2\n\n// You can pass a list of field name if the join field names are the same in both tables\nval df3 \u003d df1.join(df2, Seq(\"ke [...]
+      "text": "%spark\n\nval df1 \u003d spark.createDataFrame(Seq((\"andy\", 20, 1, 1), (\"jeff\", 23, 1, 2), (\"james\", 12, 2, 2))).toDF(\"name\", \"age\", \"key_1\", \"key_2\")\ndf1.show()\n\nval df2 \u003d spark.createDataFrame(Seq((1, 1, \"USA\"), (2, 2, \"China\"))).toDF(\"key_1\", \"key_2\", \"country\")\ndf2.show()\n\n// Join on 2 fields: key_1, key_2\n\n// You can pass a list of field name if the join field names are the same in both tables\nval df3 \u003d df1.join(df2, Seq(\"ke [...]
       "user": "anonymous",
       "dateUpdated": "2020-03-11 13:34:12.577",
       "config": {
@@ -688,7 +688,7 @@
     },
     {
       "title": "Visualize DataFrame/Dataset",
-      "text": "%md\n\nThere\u0027s 2 approaches to visuliaze DataFrame/Dataset in Zeppelin\n\n* Use SparkSQLInterpreter via `%spark.sql`\n* Use ZeppelinContext via `z.show`\n\n",
+      "text": "%md\n\nThere\u0027s 2 approaches to visualize DataFrame/Dataset in Zeppelin\n\n* Use SparkSQLInterpreter via `%spark.sql`\n* Use ZeppelinContext via `z.show`\n\n",
       "user": "anonymous",
       "dateUpdated": "2020-01-21 15:55:08.071",
       "config": {
@@ -716,7 +716,7 @@
         "msg": [
           {
             "type": "HTML",
-            "data": "\u003cdiv class\u003d\"markdown-body\"\u003e\n\u003cp\u003eThere\u0026rsquo;s 2 approaches to visuliaze DataFrame/Dataset in Zeppelin\u003c/p\u003e\n\u003cul\u003e\n  \u003cli\u003eUse SparkSQLInterpreter via \u003ccode\u003e%spark.sql\u003c/code\u003e\u003c/li\u003e\n  \u003cli\u003eUse ZeppelinContext via \u003ccode\u003ez.show\u003c/code\u003e\u003c/li\u003e\n\u003c/ul\u003e\n\u003c/div\u003e"
+            "data": "\u003cdiv class\u003d\"markdown-body\"\u003e\n\u003cp\u003eThere\u0026rsquo;s 2 approaches to visualize DataFrame/Dataset in Zeppelin\u003c/p\u003e\n\u003cul\u003e\n  \u003cli\u003eUse SparkSQLInterpreter via \u003ccode\u003e%spark.sql\u003c/code\u003e\u003c/li\u003e\n  \u003cli\u003eUse ZeppelinContext via \u003ccode\u003ez.show\u003c/code\u003e\u003c/li\u003e\n\u003c/ul\u003e\n\u003c/div\u003e"
           }
         ]
       },
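[Editor's note: the "Join on Single Field" paragraph above mentions joining either by key name or by an explicit join condition; its "+" line is truncated in this archive view. Purely as an illustration (not the committed text), both forms in Spark's Scala API, reusing the df1/df2 data from the hunk, look like this.]

    %spark

    val df1 = spark.createDataFrame(Seq((1, "andy", 20, 1), (2, "jeff", 23, 2), (3, "james", 18, 3))).toDF("id", "name", "age", "c_id")
    val df2 = spark.createDataFrame(Seq((1, "USA"), (2, "China"))).toDF("c_id", "c_name")

    // join by key name when both sides use the same column name
    val byName = df1.join(df2, "c_id")
    byName.show()

    // or spell out the join condition explicitly, e.g. when the key columns are named differently
    val byCondition = df1.join(df2, df1("c_id") === df2("c_id"))
    byCondition.show()
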
diff --git a/notebook/Spark Tutorial/4. Spark MlLib_2EZFM3GJA.zpln b/notebook/Spark Tutorial/4. Spark MlLib_2EZFM3GJA.zpln
index fa5998b..53540ed 100644
--- a/notebook/Spark Tutorial/4. Spark MlLib_2EZFM3GJA.zpln	
+++ b/notebook/Spark Tutorial/4. Spark MlLib_2EZFM3GJA.zpln	
@@ -2,7 +2,7 @@
   "paragraphs": [
     {
       "title": "Introduction",
-      "text": "%md\n\nThis is a tutorial of how to use Spark MLlib in Zeppelin, we have 2 examples in this note:\n\n* Linear regression, we generate some random data and use a linear regression to fit this data. We use bokeh here to visualize the data and the fitted model.  Besides training, we also visualize the loss value over iteration.\n* Logstic regression, we use the offical `sample_binary_classification_data` of spark as the training data. Besides training, we also visualize the l [...]
+      "text": "%md\n\nThis is a tutorial of how to use Spark MLlib in Zeppelin, we have 2 examples in this note:\n\n* Linear regression, we generate some random data and use a linear regression to fit this data. We use bokeh here to visualize the data and the fitted model.  Besides training, we also visualize the loss value over iteration.\n* Logstic regression, we use the official `sample_binary_classification_data` of spark as the training data. Besides training, we also visualize the  [...]
       "user": "anonymous",
       "dateUpdated": "2020-03-11 14:08:34.165",
       "config": {